Hardware-aware NAS
==================

.. This file should be rewritten as a tutorial

End-to-end Multi-trial SPOS Demo
--------------------------------

To make affordable DNNs feasible on edge and mobile devices, hardware-aware NAS searches for models with both high accuracy and low latency. Specifically, the search algorithm only considers models within the target latency constraint during the search process.

To run this demo, first install nn-Meter by running:

.. code-block:: bash

  pip install nn-meter

Then run the multi-trial SPOS demo:

.. code-block:: bash

  cd ${NNI_ROOT}/examples/nas/oneshot/spos/
  python search.py --latency-filter cortexA76cpu_tflite21


How the demo works
^^^^^^^^^^^^^^^^^^

To support hardware-aware NAS, you first need a ``Strategy`` that supports filtering models by latency. NNI provides such a filter, ``LatencyFilter``, and we initialize a ``RegularizedEvolution`` strategy with it:

.. code-block:: python

  evolution_strategy = strategy.RegularizedEvolution(
      model_filter=latency_filter,
      sample_size=args.evolution_sample_size,
      population_size=args.evolution_population_size,
      cycles=args.evolution_cycles,
  )

``LatencyFilter`` uses nn-Meter to predict each model's latency and filters out models whose latency is larger than the threshold (``100`` in this example).
You can also build your own strategies and filters to support more flexible NAS, such as sorting models by latency.
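As a rough illustration of what such a custom filter looks like, the sketch below defines a callable that keeps only models within a latency budget. The class name, the toy predictor, and the ``threshold_ms`` parameter are all hypothetical; NNI's built-in ``LatencyFilter`` uses nn-Meter as the predictor instead.

.. code-block:: python

  # Hypothetical sketch of a latency-based model filter (not NNI's actual API).
  class SimpleLatencyFilter:
      def __init__(self, predict_latency, threshold_ms=100.0):
          self.predict_latency = predict_latency  # callable: model -> latency in ms
          self.threshold_ms = threshold_ms

      def __call__(self, model):
          # Keep only models whose predicted latency is within the budget.
          return self.predict_latency(model) <= self.threshold_ms


  # Toy predictor: pretend latency is proportional to a model's "depth" field.
  toy_filter = SimpleLatencyFilter(lambda model: model["depth"] * 12.5, threshold_ms=100.0)
  print(toy_filter({"depth": 4}))   # 50.0 ms  -> True
  print(toy_filter({"depth": 10}))  # 125.0 ms -> False

A strategy can call such a filter on every sampled model and discard the ones that return ``False``.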

Then, pass this strategy to ``RetiariiExperiment``:

.. code-block:: python

  exp = RetiariiExperiment(base_model, evaluator, strategy=evolution_strategy)

  exp_config = RetiariiExeConfig('local')
  ...
  exp_config.dummy_input = [1, 3, 224, 224]

  exp.run(exp_config, args.port)

In ``exp_config``, ``dummy_input`` is required for tracing shape information in the latency predictor.


End-to-end ProxylessNAS with Latency Constraints
------------------------------------------------

`ProxylessNAS <https://arxiv.org/abs/1812.00332>`__ is a hardware-aware one-shot NAS algorithm. It uses the expected latency of the model to build a differentiable metric for designing efficient neural network architectures for target hardware: the latency loss is added as a regularization term for architecture parameter optimization. In this example, nn-Meter provides a latency estimator that predicts the expected latency of the mixed operation on other types of mobile and edge hardware.

To run the one-shot ProxylessNAS demo, first install nn-Meter by running:

.. code-block:: bash

  pip install nn-meter

Then run one-shot ProxylessNAS demo:

.. code-block:: bash

   python ${NNI_ROOT}/examples/nas/oneshot/proxylessnas/main.py --applied_hardware HARDWARE --reference_latency REFERENCE_LATENCY_MS

How the demo works
^^^^^^^^^^^^^^^^^^

In the implementation of the ProxylessNAS ``trainer``, we provide a ``HardwareLatencyEstimator``, which currently builds a lookup table that stores the measured latency of each candidate building block in the search space; the latency predictions are obtained from ``nn-Meter``. The sum of the latencies of all building blocks in a candidate model is treated as the model's inference latency. ``HardwareLatencyEstimator`` predicts the expected latency of each mixed operation based on the path weights of ``ProxylessLayerChoice``. By leveraging ``nn-Meter`` in NNI, users can apply ProxylessNAS to search for efficient DNN models on more types of edge devices.
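The expected latency of a mixed operation can be sketched as a weighted sum of the candidate operations' latencies, with weights given by a softmax over the architecture parameters. This is a simplified standalone illustration, not the actual ``HardwareLatencyEstimator`` code.

.. code-block:: python

  import math

  def expected_latency(arch_params, op_latencies_ms):
      """Expected latency of a mixed op: softmax(arch_params) . op_latencies."""
      exps = [math.exp(a) for a in arch_params]
      total = sum(exps)
      weights = [e / total for e in exps]  # path probabilities
      return sum(w * lat for w, lat in zip(weights, op_latencies_ms))

  # Equal architecture parameters -> the plain average of candidate latencies.
  print(expected_latency([0.0, 0.0], [10.0, 30.0]))  # 20.0

Because the weights are differentiable in the architecture parameters, this expected latency can be used directly inside the regularized loss.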

Besides ``applied_hardware`` and ``reference_latency``, there are some other parameters related to hardware-aware ProxylessNAS training in this :githublink:`example <examples/nas/oneshot/proxylessnas/main.py>`:

* ``grad_reg_loss_type``: Regularization type for adding the hardware-related loss. Allowed types are ``"mul#log"`` and ``"add#linear"``. The ``"mul#log"`` term is calculated as ``(torch.log(expected_latency) / math.log(reference_latency)) ** beta``; the ``"add#linear"`` term is calculated as ``reg_lambda * (expected_latency - reference_latency) / reference_latency``.
* ``grad_reg_loss_lambda``: Regularization parameter, set to ``0.1`` by default.
* ``grad_reg_loss_alpha``: Regularization parameter, set to ``0.2`` by default.
* ``grad_reg_loss_beta``: Regularization parameter, set to ``0.3`` by default.
* ``dummy_input``: The dummy input shape when applied to the target hardware, set to ``(1, 3, 224, 224)`` by default.
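The two regularization types above can be sketched as plain functions; this is a simplified illustration of the formulas, not the example's actual training code, and the function names are hypothetical.

.. code-block:: python

  import math

  def latency_reg_mul_log(ce_loss, expected_latency, reference_latency, beta=0.3):
      # "mul#log": scale the task loss by (log E[lat] / log ref_lat) ** beta.
      return ce_loss * (math.log(expected_latency) / math.log(reference_latency)) ** beta

  def latency_reg_add_linear(ce_loss, expected_latency, reference_latency, reg_lambda=0.1):
      # "add#linear": add a penalty proportional to the relative latency excess.
      return ce_loss + reg_lambda * (expected_latency - reference_latency) / reference_latency

  # If the expected latency equals the reference, neither term changes the loss.
  print(latency_reg_mul_log(1.0, 50.0, 50.0))     # 1.0
  print(latency_reg_add_linear(1.0, 50.0, 50.0))  # 1.0

Models faster than the reference latency are rewarded (the loss shrinks), while slower models are penalized, steering architecture parameters toward the latency budget.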