Commit 2576e55c authored by Yuge Zhang, committed by chicm-ms

add repro results for ENAS and DARTS (#1810)

parent df5b6a6f
# DARTS on NNI
## Introduction
The paper [DARTS: Differentiable Architecture Search](https://arxiv.org/abs/1806.09055) addresses the scalability challenge of architecture search by formulating the task in a differentiable manner. Their method is based on the continuous relaxation of the architecture representation, allowing efficient search of the architecture using gradient descent.
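For reference, the continuous relaxation replaces the categorical choice of operation on each edge with a softmax over all candidate operations (notation follows the paper; `O` is the candidate operation set and `alpha^(i,j)` are the architecture weights on edge `(i, j)`):

```latex
\bar{o}^{(i,j)}(x) =
  \sum_{o \in \mathcal{O}}
  \frac{\exp\bigl(\alpha_{o}^{(i,j)}\bigr)}
       {\sum_{o' \in \mathcal{O}} \exp\bigl(\alpha_{o'}^{(i,j)}\bigr)}
  \; o(x)
```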
In the implementation, the authors optimize the network weights and the architecture weights alternately on mini-batches. They further explore using second-order optimization (unrolling) instead of the first-order approximation to improve performance.
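As a rough illustration of the first-order alternation (not NNI's actual trainer API; `w_optim` and `alpha_optim` are assumed to be ordinary PyTorch optimizers built over the network weights and the architecture weights respectively):

```python
def darts_first_order_step(model, criterion, w_optim, alpha_optim,
                           train_batch, valid_batch):
    """One first-order DARTS step: update architecture weights on a
    validation mini-batch, then network weights on a training mini-batch."""
    x_train, y_train = train_batch
    x_valid, y_valid = valid_batch

    # Phase 1: architecture (alpha) step on validation data.
    alpha_optim.zero_grad()
    criterion(model(x_valid), y_valid).backward()
    alpha_optim.step()

    # Phase 2: network weight step on training data.
    w_optim.zero_grad()
    criterion(model(x_train), y_train).backward()
    w_optim.step()
```

The second-order (unrolled) variant instead differentiates the validation loss through one virtual weight-update step, which is more accurate but roughly doubles the cost per step.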
The implementation on NNI is based on the [official implementation](https://github.com/quark0/darts) and a [popular third-party repo](https://github.com/khanrc/pt.darts). So far, first- and second-order optimization and training from scratch on CIFAR10 have been implemented.
## Reproduce Results
To reproduce the results in the paper, we run experiments with both first- and second-order optimization. Due to time limits, we retrain *only the best architecture* derived from the search phase and repeat the experiment *only once*. Our results are currently on par with the results reported in the paper. We will add more results later when they are ready.
|                         | Test error in paper (%) | Test error in reproduction (%) |
| ----------------------- | ----------------------- | ------------------------------ |
| First order (CIFAR10)   | 3.00 +/- 0.14           | 2.78                           |
| Second order (CIFAR10)  | 2.76 +/- 0.09           | 2.89                           |
# ENAS on NNI
## Introduction
The paper [Efficient Neural Architecture Search via Parameter Sharing](https://arxiv.org/abs/1802.03268) uses parameter sharing between child models to accelerate the NAS process. In ENAS, a controller learns to discover neural network architectures by searching for an optimal subgraph within a large computational graph. The controller is trained with policy gradient to select a subgraph that maximizes the expected reward on the validation set. Meanwhile, the model corresponding to the selected subgraph is trained to minimize a canonical cross-entropy loss.
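A rough sketch of that policy-gradient step under parameter sharing follows; `controller.sample()`, the shared model's interface, and the moving-average baseline are hypothetical stand-ins, not the API of NNI or the official repo:

```python
import torch

def enas_controller_step(controller, ctrl_optim, shared_model,
                         valid_batch, baseline, baseline_decay=0.99):
    """One REINFORCE step: sample a subgraph, score it on validation data,
    and nudge the controller toward higher-reward architectures."""
    x_valid, y_valid = valid_batch

    # Sample an architecture (subgraph) together with its log-probability.
    arch, log_prob = controller.sample()

    # Reward = validation accuracy of the shared weights under this subgraph.
    with torch.no_grad():
        logits = shared_model(x_valid, arch)
        reward = (logits.argmax(dim=1) == y_valid).float().mean().item()

    # Moving-average baseline reduces the variance of the gradient estimate.
    baseline = baseline_decay * baseline + (1 - baseline_decay) * reward

    # REINFORCE: ascend the expected reward, i.e. descend its negation.
    loss = -(reward - baseline) * log_prob
    ctrl_optim.zero_grad()
    loss.backward()
    ctrl_optim.step()
    return baseline
```

The shared child-model weights are updated separately by ordinary gradient descent on training mini-batches of sampled subgraphs, which is what makes the search cheap.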
The implementation on NNI is based on the [official implementation in Tensorflow](https://github.com/melodyguan/enas), with both the macro and micro search spaces on CIFAR10 included. Since the code to train from scratch on NNI is not ready yet, reproduction results are currently unavailable.
@@ -37,7 +37,7 @@ Note, these algorithms run **standalone without nnictl**, and support PyTorch only.
#### Usage
ENAS in NNI is still under development, and we only support the search phase for the macro/micro search space on CIFAR10. Training from scratch and the search space on PTB have not been finished yet. [Detailed Description](ENAS.md)
```bash
# In case NNI code is not cloned. If the code is cloned already, ignore this line and enter code folder.
git clone https://github.com/Microsoft/nni.git
```
@@ -58,7 +58,7 @@ python3 search.py -h
### DARTS
The main algorithmic contribution of [DARTS: Differentiable Architecture Search][3] is a novel method for differentiable network architecture search based on bilevel optimization. [Detailed Description](DARTS.md)
#### Usage