"...composable_kernel_rocm.git" did not exist on "a24c5694ed30be912df38735d9b842bb4749e2aa"
Unverified commit 50e425f2 authored by QuanluZhang, committed by GitHub

update doc (#2047)

* update readme

* move doc of model speedup

* add quick start for nas & compression
parent 1958adb0
......@@ -177,9 +177,9 @@ Within the following table, we summarized the current NNI capabilities, we are g
</td>
<td style="border-top:#FF0000 solid 0px;">
<ul>
<li><a href="docs/en_US/sdk_reference.rst">Python API</a></li>
<li><a href="https://nni.readthedocs.io/en/latest/autotune_ref.html#trial">Python API</a></li>
<li><a href="docs/en_US/Tutorial/AnnotationSpec.md">NNI Annotation</a></li>
<li><a href="docs/en_US/Tutorial/Installation.md">Supported OS</a></li>
<li><a href="https://nni.readthedocs.io/en/latest/installation.html">Supported OS</a></li>
</ul>
</td>
<td style="border-top:#FF0000 solid 0px;">
......@@ -216,9 +216,9 @@ Windows
```
python -m pip install --upgrade nni
```
If you want to try the latest code, please [install NNI](https://nni.readthedocs.io/en/latest/installation.html) from source code.
For detailed system requirements of NNI, please refer to [here](https://nni.readthedocs.io/en/latest/Tutorial/InstallationLinux.html#system-requirements) for Linux & macOS, and [here](https://nni.readthedocs.io/en/latest/Tutorial/InstallationWin.html#system-requirements) for Windows.
Note:
......
# Quick Start to Compress a Model
NNI provides simple APIs for compressing a model. Compression covers both pruning algorithms and quantization algorithms, and they share the same usage, so here we use the slim pruner as an example. The complete code of this example can be found [here](https://github.com/microsoft/nni/blob/master/examples/model_compress/slim_torch_cifar10.py).
## Write configuration
Write a configuration to specify the layers that you want to prune. The following configuration means pruning all the `BatchNorm2d`s to sparsity 0.7 while keeping other layers unpruned.
```python
configure_list = [{
    'sparsity': 0.7,
    'op_types': ['BatchNorm2d'],
}]
```
The specification of the configuration can be found [here](Overview.md#user-configuration-for-a-compression-algorithm). Note that different pruners may define their own configuration fields, for example `start_epoch` in the AGP pruner. Please refer to each pruner's [usage](Overview.md#supported-algorithms) for details and adjust the configuration accordingly.
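For instance, here is a sketch of what an AGP pruner configuration might look like; the field names follow the AGP pruner documentation for this NNI release and should be double-checked against the version you use.
```python
# Sketch of an AGP pruner configuration (verify field names against your NNI version).
agp_configure_list = [{
    'initial_sparsity': 0.0,   # sparsity at start_epoch
    'final_sparsity': 0.7,     # target sparsity reached by end_epoch
    'start_epoch': 0,
    'end_epoch': 10,
    'frequency': 1,            # how often (in epochs) the masks are updated
    'op_types': ['Conv2d'],
}]
```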
## Choose a compression algorithm
Choose a pruner to prune your model. First instantiate the chosen pruner with your model and configuration as arguments, then invoke `compress()` to compress your model.
```python
from nni.compression.torch import SlimPruner

pruner = SlimPruner(model, configure_list)
model = pruner.compress()
```
Then, you can train your model with a traditional training approach (e.g., SGD); pruning is applied transparently during training. Some pruners prune once at the beginning, so the subsequent training can be seen as fine-tuning. Other pruners prune your model iteratively, adjusting the masks epoch by epoch during training.
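As a sketch, a plain PyTorch training loop works unchanged. Below, `criterion` and `train_loader` are assumed to be defined by you; the `update_epoch` call at the end of each epoch is used by iterative pruners (e.g., AGP) to adjust their masks, so consult your pruner's documentation on whether it is needed.
```python
import torch.optim as optim

# Assumed to exist already: model (returned by pruner.compress()), pruner,
# criterion (e.g., nn.CrossEntropyLoss()), and a train_loader DataLoader.
optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
for epoch in range(10):
    model.train()
    for data, target in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(data), target)
        loss.backward()
        optimizer.step()
    pruner.update_epoch(epoch)  # lets iterative pruners adjust masks per epoch
```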
## Export compression result
After training, you get the accuracy of the pruned model. You can export the model weights to a file and the generated masks to another file. Exporting an ONNX model is also supported.
```python
pruner.export_model(model_path='pruned_vgg19_cifar10.pth', mask_path='mask_vgg19_cifar10.pth')
```
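If you also want an ONNX file, `export_model` can additionally take `onnx_path` and `input_shape` arguments; the sketch below assumes the signature documented for this NNI release, with `input_shape` being the dummy input shape used to trace the model.
```python
# Sketch: export weights, masks, and an ONNX model in one call.
pruner.export_model(model_path='pruned_vgg19_cifar10.pth',
                    mask_path='mask_vgg19_cifar10.pth',
                    onnx_path='pruned_vgg19_cifar10.onnx',
                    input_shape=[1, 3, 32, 32])  # CIFAR-10-sized dummy input
```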
## Speed up the model
Masks alone do not provide a real speedup of your model. The model should be sped up based on the exported masks, so we provide an API for this, shown below. After invoking `apply_compression_results` on your model, it becomes a smaller model with lower inference latency.
```python
from nni.compression.torch import apply_compression_results
apply_compression_results(model, 'mask_vgg19_cifar10.pth')
```
Please refer to [here](ModelSpeedup.md) for a detailed description.
# Customize a NAS Algorithm
## Extend the Ability of One-Shot Trainers
Users might want to do many things when using trainers on real tasks, for example, distributed training, half-precision training, periodic logging, writing to TensorBoard, dumping checkpoints, and so on. As mentioned previously, some trainers support some of these items; others might not. Generally, there are two recommended ways to add whatever you want to an existing trainer: inherit an existing trainer and override it, or copy an existing trainer and modify it.
Either way, you are stepping into the territory of implementing a new trainer. Basically, implementing a one-shot trainer is no different from implementing any traditional deep learning trainer, except that a new concept, the mutator, comes into play, so the implementation differs in at least two places:
* Initialization
```python
model = Model()
mutator = MyMutator(model)
```
* Training
```python
for _ in range(epochs):
    for x, y in data_loader:
        mutator.reset()  # reset all the choices in model
        out = model(x)   # like traditional model
        loss = criterion(out, y)
        loss.backward()
        # no difference below
```
To understand what mutators are for, we need to know how one-shot NAS normally works: it "co-optimizes model weights and architecture weights". It repeatedly samples an architecture (or a combination of several architectures) from the supernet, trains the chosen architectures like a traditional deep learning model, updates the trained parameters to the supernet, and uses the metrics or loss as a signal to guide the architecture sampler. The mutator is this architecture sampler, often defined as another deep learning model. Therefore, you can treat it like any other model: define parameters in it and optimize it with optimizers. One mutator is initialized with exactly one model; once a mutator is bound to a model, it cannot be rebound to another model.
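As an illustration only (a rough sketch, not NNI's actual trainer code), the co-optimization loop of a DARTS-style one-shot trainer could look roughly like this. Here `model_optim`, `mutator_optim`, `criterion`, `train_loader`, and `valid_loader` are assumed to be set up by the user.
```python
# Illustrative sketch of one-shot co-optimization; not the actual NNI trainer.
for epoch in range(num_epochs):
    for (x_train, y_train), (x_valid, y_valid) in zip(train_loader, valid_loader):
        # 1. Update architecture (mutator) weights on validation data.
        mutator_optim.zero_grad()
        mutator.reset()                       # sample an architecture
        arch_loss = criterion(model(x_valid), y_valid)
        arch_loss.backward()
        mutator_optim.step()

        # 2. Update model (supernet) weights on training data.
        model_optim.zero_grad()
        mutator.reset()                       # sample again
        loss = criterion(model(x_train), y_train)
        loss.backward()
        model_optim.step()
```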
`mutator.reset()` is the core step: that is where all the choices in the model are finalized. The result of a reset remains in effect until the next reset flushes it. After the reset, the model can be treated as a traditional model for the forward and backward passes.
Finally, mutators provide a method called `mutator.export()` that exports a dict describing the architecture chosen for the model. Note that currently this dict is a mapping from mutable keys to selection tensors, so in order to dump it to JSON, users need to explicitly convert the tensors into Python lists.
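For example, a minimal sketch of dumping the exported architecture to JSON, assuming the exported values are torch tensors as described above:
```python
import json

exported = mutator.export()                                    # {mutable_key: tensor}
serializable = {key: value.tolist() for key, value in exported.items()}  # tensors -> Python lists
with open('architecture.json', 'w') as f:
    json.dump(serializable, f, indent=2)
```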
Meanwhile, NNI provides some useful tools so that users can implement trainers more easily. See [Trainers](./NasReference.md#trainers) for details.
## Implement New Mutators
To start with, here is the pseudo-code that demonstrates what happens on `mutator.reset()` and `mutator.export()`.
```python
def reset(self):
    self.apply_on_model(self.sample_search())
```
```python
def export(self):
    return self.sample_final()
```
On reset, a new architecture is sampled with `sample_search()` and applied to the model; the model is then trained for one or more steps of the search phase. On export, an architecture is sampled with `sample_final()`, but **nothing is done to the model**; this is for checkpointing or for exporting the final architecture.
The return values of `sample_search()` and `sample_final()` have the same requirements: a mapping from mutable keys to tensors. A tensor can be either a BoolTensor (true for selected, false for not selected) or a FloatTensor that assigns a weight to each candidate. The selected branches are then computed (in `LayerChoice` the modules are called; in `InputChoice` it is just the tensors themselves) and reduced with the reduction operation specified in the choices. Since most algorithms only need the first (boolean) case, here is an example mutator implementation.
```python
import numpy as np
import torch
# Import paths for the NNI v1.x NAS APIs; check your installed version.
from nni.nas.pytorch.mutables import LayerChoice, InputChoice
from nni.nas.pytorch.mutator import Mutator

class RandomMutator(Mutator):
    def __init__(self, model):
        super().__init__(model)  # don't forget to call super
        # do something else

    def sample_search(self):
        result = dict()
        for mutable in self.mutables:  # iterates over all mutable modules in the user model
            # mutables sharing the same key are de-duplicated
            if isinstance(mutable, LayerChoice):
                # decide that this mutable should choose `gen_index`
                gen_index = np.random.randint(mutable.length)
                result[mutable.key] = torch.tensor([i == gen_index for i in range(mutable.length)],
                                                   dtype=torch.bool)
            elif isinstance(mutable, InputChoice):
                if mutable.n_chosen is None:  # if n_chosen is None, choose any number
                    result[mutable.key] = torch.randint(high=2, size=(mutable.n_candidates,)).view(-1).bool()
                # else do something else
        return result

    def sample_final(self):
        return self.sample_search()  # the same logic is reused here; you can do something different
```
The complete example of random mutator can be found [here](https://github.com/microsoft/nni/blob/master/src/sdk/pynni/nni/nas/pytorch/random/mutator.py).
For advanced usages, e.g., when users want to manipulate the way modules in `LayerChoice` are executed, they can inherit `BaseMutator` and override `on_forward_layer_choice` and `on_forward_input_choice`, which are the forward callbacks for `LayerChoice` and `InputChoice` respectively. Users can still use the `mutables` property to get all `LayerChoice` and `InputChoice` instances in the model code. For details, please refer to the [reference](https://github.com/microsoft/nni/tree/master/src/sdk/pynni/nni/nas/pytorch).
```eval_rst
.. tip::
    A useful application of the random mutator is debugging. Running

    .. code-block:: python

        mutator = RandomMutator(model)
        mutator.reset()

    will immediately set one possible candidate in the search space as the active one.
```
## Implement a Distributed NAS Tuner
Before learning how to write a distributed NAS tuner, users should first learn how to write a general tuner; read [Customize Tuner](../Tuner/CustomizeTuner.md) for a tutorial.
When users call "[nnictl ss_gen](../Tutorial/Nnictl.md)" to generate a search space file, a file like the following will be generated:
```json
{
    "key_name": {
        "_type": "layer_choice",
        "_value": ["op1_repr", "op2_repr", "op3_repr"]
    },
    "key_name": {
        "_type": "input_choice",
        "_value": {
            "candidates": ["in1_key", "in2_key", "in3_key"],
            "n_chosen": 1
        }
    }
}
```
This is exactly the search space that tuners receive in `update_search_space`. It is then the tuner's responsibility to interpret the search space and generate new candidates in `generate_parameters`. A valid "parameters" object has the following format:
```json
{
    "key_name": {
        "_value": "op1_repr",
        "_idx": 0
    },
    "key_name": {
        "_value": ["in2_key"],
        "_idx": [1]
    }
}
```
Return it from `generate_parameters`, and the tuner then behaves just like any HPO tuner. Refer to the [SPOS](./SPOS.md) example code for a concrete example.
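As a sketch built on the standard `nni.tuner.Tuner` interface and the two formats shown above (the class below is illustrative, not part of NNI), a minimal random tuner could interpret the search space like this:
```python
import random
from nni.tuner import Tuner

class RandomNASTuner(Tuner):
    """Minimal sketch: randomly samples one candidate per mutable key."""

    def update_search_space(self, search_space):
        self.search_space = search_space

    def generate_parameters(self, parameter_id, **kwargs):
        params = {}
        for key, spec in self.search_space.items():
            if spec['_type'] == 'layer_choice':
                idx = random.randrange(len(spec['_value']))
                params[key] = {'_value': spec['_value'][idx], '_idx': idx}
            elif spec['_type'] == 'input_choice':
                candidates = spec['_value']['candidates']
                n_chosen = spec['_value']['n_chosen']  # assumed to be an integer, as above
                idxs = sorted(random.sample(range(len(candidates)), n_chosen))
                params[key] = {'_value': [candidates[i] for i in idxs], '_idx': idxs}
        return params

    def receive_trial_result(self, parameter_id, parameters, value, **kwargs):
        pass  # a real tuner would use the result to guide future sampling
```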
......@@ -167,139 +167,6 @@ After applying, the model is then fixed and ready for a final training. The mode
Also refer to [DARTS](./DARTS.md) for example code of retraining.
......
# NAS Quick Start
The NAS feature provided by NNI has two key components: APIs for expressing a search space, and NAS training approaches. The former lets users easily specify a class of models (i.e., the candidate models defined by the search space) that may perform well. The latter lets users easily apply state-of-the-art NAS training approaches to their own model.
Here we use a simple example to demonstrate how to tune your model architecture with NNI NAS APIs step by step. The complete code of this example can be found [here](https://github.com/microsoft/nni/tree/master/examples/nas/naive).
## Write your model with NAS APIs
Instead of writing a concrete neural model, you can write a class of neural models using the two NAS APIs `LayerChoice` and `InputChoice`. For example, if you think either of two operations might work in the first convolution layer, you can let the model pick one of them using `LayerChoice`, as shown by `self.conv1` in the code. Similarly, the second convolution layer `self.conv2` also chooses one of two operations. Up to this point, four candidate neural networks have been specified. `self.skipconnect` uses `InputChoice` to specify two choices: adding a skip connection or not.
```python
import torch.nn as nn
from nni.nas.pytorch.mutables import LayerChoice, InputChoice

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = LayerChoice([nn.Conv2d(3, 6, 3, padding=1), nn.Conv2d(3, 6, 5, padding=2)])
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = LayerChoice([nn.Conv2d(6, 16, 3, padding=1), nn.Conv2d(6, 16, 5, padding=2)])
        self.conv3 = nn.Conv2d(16, 16, 1)
        self.skipconnect = InputChoice(n_candidates=1)
        self.bn = nn.BatchNorm2d(16)
        self.gap = nn.AdaptiveAvgPool2d(4)
        self.fc1 = nn.Linear(16 * 4 * 4, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)
```
For a detailed description of `LayerChoice` and `InputChoice`, please refer to [the guidance](NasGuide.md).
## Choose a NAS trainer
After the model is instantiated, it is time to train it with a NAS trainer. Different trainers use different approaches to search for the best model in the class of neural models you specified. NNI provides popular NAS training approaches such as DARTS and ENAS. Here we use `DartsTrainer` as an example. After the trainer is instantiated, invoke `trainer.train()` to do the search.
```python
from nni.nas.pytorch.darts import DartsTrainer  # DARTS trainer shipped with NNI

trainer = DartsTrainer(net,
                       loss=criterion,
                       metrics=accuracy,
                       optimizer=optimizer,
                       num_epochs=2,
                       dataset_train=dataset_train,
                       dataset_valid=dataset_valid,
                       batch_size=64,
                       log_frequency=10)
trainer.train()
```
## Export the best model
After the search (i.e., `trainer.train()`) is done, to get the best-performing model simply call `trainer.export("final_arch.json")`, which exports the found neural architecture to a file.
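For example (the inspection step below is just an optional sanity check, not required by NNI):
```python
import json

trainer.export("final_arch.json")       # dump the chosen architecture to a JSON file

with open("final_arch.json") as f:      # optional: inspect what was exported
    print(json.dumps(json.load(f), indent=2))
```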
## NAS visualization
We are working on NAS visualization and will release it soon.
## Retrain the exported best model
It is simple to retrain the found (exported) neural architecture. Step one, instantiate the model you defined above. Step two, invoke `apply_fixed_architecture` on the model. The model then becomes the found (exported) one, and you can use traditional model training to train it.
```python
from nni.nas.pytorch.fixed import apply_fixed_architecture  # assumed import path; check your NNI version

model = Net()
apply_fixed_architecture(model, "final_arch.json")
```
......@@ -16,6 +16,8 @@ For details, please refer to the following tutorials:
:maxdepth: 2
Overview <Compressor/Overview>
Quick Start <Compressor/QuickStart>
Pruners <pruners>
Quantizers <quantizers>
Model Speedup <Compressor/ModelSpeedup>
Automatic Model Compression <Compressor/AutoCompression>
......@@ -18,6 +18,7 @@ For details, please refer to the following tutorials:
:maxdepth: 2
Overview <NAS/Overview>
Quick Start <NAS/QuickStart>
Tutorial <NAS/NasGuide>
ENAS <NAS/ENAS>
DARTS <NAS/DARTS>
......@@ -25,4 +26,5 @@ For details, please refer to the following tutorials:
SPOS <NAS/SPOS>
CDARTS <NAS/CDARTS>
ProxylessNAS <NAS/Proxylessnas>
Customize a NAS Algorithm <NAS/Advanced>
API Reference <NAS/NasReference>