Unverified Commit 0e0ee86d authored by Jiahang Xu, committed by GitHub

[Docs] Refine proxylessNAS doc (#4321)

parent 443ba8c1
@@ -3,16 +3,16 @@ Hardware-aware NAS
.. contents::
End-to-end Multi-trial SPOS Demo
--------------------------------
To empower affordable DNNs on edge and mobile devices, hardware-aware NAS searches for models with both high accuracy and low latency. In particular, the search algorithm only considers models within the target latency constraint during the search process.
To run this demo, first install nn-Meter by running:
.. code-block:: bash

   pip install nn-meter
Then run the multi-trial SPOS demo:
@@ -20,10 +20,11 @@ Then run multi-trail SPOS demo:
.. code-block:: bash

   python ${NNI_ROOT}/examples/nas/oneshot/spos/multi_trial.py
How the demo works
^^^^^^^^^^^^^^^^^^
To support hardware-aware NAS, you first need a ``Strategy`` that supports filtering the models by latency. We provide such a filter named ``LatencyFilter`` in NNI and initialize a ``Random`` strategy with the filter:
.. code-block:: python
@@ -45,3 +46,34 @@ Then, pass this strategy to ``RetiariiExperiment``:
   exp.run(exp_config, port)
In ``exp_config``, ``dummy_input`` is required for tracing shape info.
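The latency-filtering idea behind ``LatencyFilter`` can be sketched with plain Python. Everything below is illustrative, not the actual NNI API: ``ToyLatencyFilter``, ``predict_latency_ms``, and the 100 ms threshold are assumptions used only to show how a filter keeps feasible candidates.

```python
# Illustrative sketch of latency-based model filtering (NOT the actual NNI API).
# A filter keeps only candidate models whose predicted latency is under a threshold.

class ToyLatencyFilter:
    def __init__(self, threshold_ms, predictor):
        self.threshold_ms = threshold_ms
        self.predictor = predictor  # callable: model -> predicted latency in ms

    def __call__(self, model):
        # Return True if the model satisfies the latency constraint.
        return self.predictor(model) < self.threshold_ms

# Hypothetical predictor: latency grows with the number of blocks.
def predict_latency_ms(model):
    return 8.0 * model["num_blocks"]

latency_filter = ToyLatencyFilter(threshold_ms=100, predictor=predict_latency_ms)
candidates = [{"num_blocks": n} for n in (4, 10, 16)]
feasible = [m for m in candidates if latency_filter(m)]
print(feasible)  # only the candidates under the 100 ms threshold survive
```

In the real demo, the strategy queries such a filter for every sampled model, so infeasible architectures are never submitted for training.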
End-to-end ProxylessNAS with Latency Constraints
------------------------------------------------
`ProxylessNAS <https://arxiv.org/pdf/1812.00332.pdf>`__ is a hardware-aware one-shot NAS algorithm. It uses the expected latency of the model to build a differentiable metric, so that efficient neural network architectures can be designed for target hardware. The latency loss is added as a regularization term for architecture-parameter optimization. In this example, nn-Meter provides a latency estimator to predict the expected latency of the mixed operation on various types of mobile and edge hardware.
To run the one-shot ProxylessNAS demo, first install nn-Meter by running:
.. code-block:: bash

   pip install nn-meter
Then run the one-shot ProxylessNAS demo:

.. code-block:: bash

   python ${NNI_ROOT}/examples/nas/oneshot/proxylessnas/main.py --applied_hardware <hardware> --reference_latency <reference latency (ms)>
How the demo works
^^^^^^^^^^^^^^^^^^
In the implementation of the ProxylessNAS ``trainer``, we provide a ``HardwareLatencyEstimator``, which currently builds a lookup table storing the predicted latency of each candidate building block in the search space. The sum of the latencies of all building blocks in a candidate model is treated as the model's inference latency. The latency predictions are obtained from ``nn-Meter``. ``HardwareLatencyEstimator`` predicts the expected latency of the mixed operation based on the path weights of ``ProxylessLayerChoice``. By leveraging ``nn-Meter`` in NNI, users can apply ProxylessNAS to search for efficient DNN models on more types of edge devices.
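The lookup-table scheme above can be sketched as follows. The block names, latency numbers, and function names are illustrative assumptions, not the actual ``HardwareLatencyEstimator`` API; the sketch only shows the two operations the text describes: summing block latencies for a whole model, and taking a weighted sum over path weights for a mixed operation.

```python
# Toy sketch of a lookup-table latency estimator (hypothetical names/values).

# Hypothetical lookup table: predicted latency (ms) per candidate block.
block_latency_ms = {
    "mbconv3_k3": 2.1,
    "mbconv3_k5": 2.8,
    "mbconv6_k3": 3.9,
    "skip": 0.0,
}

def model_latency(blocks):
    # A candidate model's latency is the sum of its building blocks' latencies.
    return sum(block_latency_ms[b] for b in blocks)

def mixed_op_expected_latency(path_weights):
    # For a mixed operation, the expected latency is the weighted sum of the
    # candidate latencies, weighted by the architecture path weights.
    return sum(w * block_latency_ms[b] for b, w in path_weights.items())

print(model_latency(["mbconv3_k3", "mbconv6_k3", "skip"]))
print(mixed_op_expected_latency({"mbconv3_k3": 0.5, "mbconv3_k5": 0.5}))
```

Because the expected latency is a weighted sum over path weights, it is differentiable with respect to the architecture parameters, which is what lets ProxylessNAS optimize it by gradient descent.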
Besides ``applied_hardware`` and ``reference_latency``, there are some other parameters related to hardware-aware ProxylessNAS training in this :githublink:`example <examples/nas/oneshot/proxylessnas/main.py>`:

* ``grad_reg_loss_type``: the type of regularization used to add the hardware-related loss. Allowed types are ``"mul#log"`` and ``"add#linear"``. The ``"mul#log"`` term is computed as ``(torch.log(expected_latency) / math.log(reference_latency)) ** beta``; the ``"add#linear"`` term is computed as ``reg_lambda * (expected_latency - reference_latency) / reference_latency``.
* ``grad_reg_loss_lambda``: regularization parameter, set to ``0.1`` by default.
* ``grad_reg_loss_alpha``: regularization parameter, set to ``0.2`` by default.
* ``grad_reg_loss_beta``: regularization parameter, set to ``0.3`` by default.
* ``dummy_input``: the shape of the dummy input used when applying the model to the target hardware, set to ``(1, 3, 224, 224)`` by default.
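The two regularization terms can be checked numerically. The sketch below uses plain ``math.log`` in place of the ``torch`` call and the default hyper-parameters listed above; the latency values (120 ms expected, 100 ms reference) are illustrative assumptions.

```python
import math

# Illustrative values: a hypothetical predicted latency and a target latency.
expected_latency = 120.0   # ms, assumed prediction for the current model
reference_latency = 100.0  # ms, target latency constraint
beta = 0.3                 # default grad_reg_loss_beta
reg_lambda = 0.1           # default grad_reg_loss_lambda

# "mul#log": a multiplicative factor that exceeds 1 when the model is too slow.
mul_log_term = (math.log(expected_latency) / math.log(reference_latency)) ** beta

# "add#linear": an additive penalty proportional to the relative latency overshoot.
add_linear_term = reg_lambda * (expected_latency - reference_latency) / reference_latency

print(mul_log_term, add_linear_term)
```

With these numbers the ``"add#linear"`` penalty is 0.02 (a 20 % overshoot scaled by ``reg_lambda``), while the ``"mul#log"`` factor is slightly above 1; both shrink toward their neutral values as the expected latency approaches the reference latency.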
@@ -55,7 +55,7 @@ Implementation
The implementation in NNI is based on the `official implementation <https://github.com/mit-han-lab/ProxylessNAS>`__. The official implementation supports two training approaches: gradient descent and RL-based. Our current implementation in NNI supports the gradient descent approach; complete support of ProxylessNAS is ongoing.
The official implementation supports different targeted hardware, including 'mobile', 'cpu', 'gpu8', and 'flops'. In the NNI repo, hardware latency prediction is supported by `Microsoft nn-Meter <https://github.com/microsoft/nn-Meter>`__. nn-Meter is an accurate inference latency predictor for DNN models on diverse edge devices. nn-Meter currently supports four kinds of hardware: *'cortexA76cpu_tflite21'*, *'adreno640gpu_tflite21'*, *'adreno630gpu_tflite21'*, and *'myriadvpu_openvino2019r2'*. Users can find more information about nn-Meter on its website; more hardware will be supported in the future. More details about applying ``nn-Meter`` can be found `here <./HardwareAwareNAS.rst>`__.
Below we will describe implementation details. Like other one-shot NAS algorithms on NNI, ProxylessNAS is composed of two parts: *search space* and *training approach*. For users to flexibly define their own search space and use built-in ProxylessNAS training approach, we put the specified search space in :githublink:`example code <examples/nas/oneshot/proxylessnas>` using :githublink:`NNI NAS interface <nni/retiarii/oneshot/pytorch/proxyless>`.
@@ -11,4 +11,3 @@ In multi-trial NAS, users need model evaluator to evaluate the performance of ea
Exploration Strategies <ExplorationStrategies>
Customize Exploration Strategies <WriteStrategy>
Execution Engines <ExecutionEngines>
@@ -13,4 +13,4 @@ One-shot NAS algorithms leverage weight sharing among models in neural architect
SPOS <SPOS>
ProxylessNAS <Proxylessnas>
FBNet <FBNet>
Customize One-shot NAS <WriteOneshot>
@@ -29,5 +29,6 @@ Follow the instructions below to start your journey with Retiarii.
Construct Model Space <NAS/construct_space>
Multi-trial NAS <NAS/multi_trial_nas>
One-shot NAS <NAS/one_shot_nas>
Hardware-aware NAS <NAS/HardwareAwareNAS>
NAS Benchmarks <NAS/Benchmarks>
NAS API References <NAS/ApiReference>