Unverified commit efa479b0, authored by Chi Song, committed by GitHub

Doc fix: formats and typo. (#582)

Fix documentation formatting and typos
parent 0405a426
@@ -35,7 +35,7 @@ class CustomizedAssessor(Assessor):
```python
import argparse
import CustomizedAssessor

def main():
    parser = argparse.ArgumentParser(description='parse command line parameters.')
@@ -49,9 +49,9 @@ def main():
    main()
```
Note in 2) that the object `trial_history` is exactly the object that the Trial sends to the Assessor via the SDK function `report_intermediate_result`.
You can also override the `run` function in the Assessor to control the processing logic.
For a more detailed example, see:
> * [Base-Assessor](https://msrasrg.visualstudio.com/NeuralNetworkIntelligenceOpenSource/_git/Default?_a=contents&path=%2Fsrc%2Fsdk%2Fpynni%2Fnni%2Fassessor.py&version=GBadd_readme)
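As a hedged illustration of the flow described above, a minimal customized assessor could look like the sketch below. The `assess_trial` signature, the import path, and the `threshold` argument are assumptions about the NNI SDK, not code copied from it.

```python
# A minimal sketch of a customized assessor; signatures are assumed.
from nni.assessor import Assessor, AssessResult


class CustomizedAssessor(Assessor):
    def __init__(self, threshold=0.9):
        # threshold is a hypothetical classArgs parameter.
        self.threshold = threshold

    def assess_trial(self, trial_job_id, trial_history):
        # trial_history holds every value the trial has reported so far
        # via nni.report_intermediate_result, in order.
        if trial_history and trial_history[-1] < self.threshold:
            return AssessResult.Bad   # stop the trial early
        return AssessResult.Good      # let the trial continue
```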
@@ -6,7 +6,7 @@ So when user want to write a Trial running on NNI, she/he should:
**1) Have an original Trial that can run**
The Trial's code can be any machine learning code that can run locally. Here we use `mnist-keras.py` as an example:
```python
import argparse
@@ -86,7 +86,7 @@ if __name__ == '__main__':
```
**2) Get a configuration from the Tuner**
Import `nni` and use `nni.get_next_parameter()` to receive the configuration. Note lines **10**, **24**, and **25** in the following code.
```python
@@ -121,7 +121,7 @@ if __name__ == '__main__':
```
**3) Send intermediate results**
Use `nni.report_intermediate_result` to send intermediate results to the Assessor. Note line **5** in the following code.
```python
@@ -144,7 +144,7 @@ def train(args, params):
```
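As a hedged sketch (the loop and the metric are illustrative stand-ins, not the real `mnist-keras.py` code), reporting an intermediate result once per epoch looks like this:

```python
import random

import nni


def train(args, params):
    """Illustrative stand-in for the real training loop."""
    for epoch in range(10):
        # Stand-in for one epoch of training plus evaluation.
        accuracy = min(1.0, 0.5 + 0.05 * epoch + random.uniform(0.0, 0.02))
        # Each reported value becomes part of the trial_history that the
        # Assessor uses for early-stopping decisions.
        nni.report_intermediate_result(accuracy)
```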
**4) Send the final result**
Use `nni.report_final_result` to send the final result to the Tuner. Note line **15** in the following code.
```python
...
@@ -281,4 +281,4 @@ if __name__ == '__main__':
        LOG.exception(e)
        raise
```
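Putting steps 2) through 4) together, a minimal trial skeleton could look like the following sketch; the hyperparameter handling and the final metric are illustrative placeholders:

```python
import logging

import nni

LOG = logging.getLogger('mnist_keras')


def main(params):
    # ... train with the received params, calling
    # nni.report_intermediate_result once per epoch ...
    final_accuracy = 0.95  # placeholder for the real final metric
    # The final result goes back to the Tuner to guide future trials.
    nni.report_final_result(final_accuracy)


if __name__ == '__main__':
    try:
        # Step 2): receive the next hyperparameter configuration.
        params = nni.get_next_parameter()
        main(params)
    except Exception as e:
        LOG.exception(e)
        raise
```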
@@ -36,17 +36,17 @@ chmod +x ./download.sh
1. Download "dev-v1.1.json" and "train-v1.1.json" from https://rajpurkar.github.io/SQuAD-explorer/
   ```bash
   wget https://rajpurkar.github.io/SQuAD-explorer/dataset/train-v1.1.json
   wget https://rajpurkar.github.io/SQuAD-explorer/dataset/dev-v1.1.json
   ```
2. Download "glove.840B.300d.txt" from https://nlp.stanford.edu/projects/glove/
   ```bash
   wget http://nlp.stanford.edu/data/glove.840B.300d.zip
   unzip glove.840B.300d.zip
   ```
### Update configuration
Modify `nni/examples/trials/ga_squad/config.yml`; here is the default configuration:
...
@@ -105,4 +105,4 @@ There are two examples, [FashionMNIST-keras.py](./FashionMNIST/FashionMNIST_kera
The `CIFAR-10` dataset ([Canadian Institute For Advanced Research](https://www.cifar.ca/)) is a collection of images commonly used to train machine learning and computer vision algorithms. It is one of the most widely used datasets for machine learning research. The CIFAR-10 dataset contains 60,000 32x32 color images in 10 different classes.
There are two examples, [cifar10-keras.py](./cifar10/cifar10_keras.py) and [cifar10-pytorch.py](./cifar10/cifar10_pytorch.py). The value `input_width` is 32 and the value `input_channel` is 3 in `config.yaml` for this dataset.
@@ -3,4 +3,4 @@
Now we have an ENAS example, [enas-nni](https://github.com/countif/enas_nni), that runs on NNI, from our contributors.
Thanks to our lovely contributors.
And welcome more and more people to join us!
@@ -12,4 +12,4 @@ tuner:
  className: CustomerTuner
  classArgs:
    optimize_mode: maximize
```
@@ -12,4 +12,4 @@ tuner:
  className: CustomerTuner
  classArgs:
    optimize_mode: maximize
```
@@ -23,8 +23,11 @@ Assuming additive a Gaussian noise and the noise parameter is initialized to its
We determine the maximum-probability value of the new combined parameter vector by learning from the historical data, use that value to predict future trial performance, and stop inadequate experiments early to save computing resources.
Concretely, this algorithm goes through three stages: learning, predicting, and assessing.
* Step 1: Learning. We learn from the trial history of the current trial and determine \xi from a Bayesian angle. First of all, we fit each curve using the least-squares method (implemented by `fit_theta`) to save time. After we obtain the parameters, we filter the curves and remove outliers (implemented by `filter_curve`). Finally, we use the MCMC sampling method (implemented by `mcmc_sampling`) to adjust the weight of each curve. At this point, we have determined all the parameters in \xi.
* Step 2: Predicting. Calculate the expected final accuracy (implemented by `f_comb`) at the target position (i.e., the total number of epochs) using \xi and the formula of the combined model.
* Step 3: Assessing. If the fitting result doesn't converge, the predicted value is `None`; in this case we return `AssessResult.Good` to ask for more accuracy information and predict again. Otherwise, `predict()` returns a positive value: if this value is strictly greater than the best final performance in history times `THRESHOLD` (default 0.95), we return `AssessResult.Good`; otherwise, we return `AssessResult.Bad`.
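A hedged sketch of the Step 3 decision logic follows; `predicted` stands for the output of the real `predict()` function, and the local `AssessResult` enum is a stand-in for the one in the NNI SDK:

```python
from enum import Enum


class AssessResult(Enum):  # local stand-in for NNI's AssessResult
    Good = 0
    Bad = 1


THRESHOLD = 0.95  # default ratio from the description above


def assess(predicted, best_final_performance):
    if predicted is None:
        # The curve fit has not converged: ask for more results.
        return AssessResult.Good
    if predicted > best_final_performance * THRESHOLD:
        return AssessResult.Good
    return AssessResult.Bad
```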
The figure below shows the result of our algorithm on MNIST trial history data: green points represent data obtained by the Assessor, blue points represent future but unknown data, and the red line is the curve predicted by the Curve Fitting Assessor.
...
@@ -2,12 +2,12 @@ Hyperband on nni
===
## 1. Introduction
[Hyperband][1] is a popular automl algorithm. The basic idea of Hyperband is that it creates several buckets; each bucket has `n` randomly generated hyperparameter configurations, and each configuration uses `r` resources (e.g., epoch number, batch number). After the `n` configurations are finished, it chooses the top `n/eta` configurations and runs them using increased `r*eta` resources. At last, it chooses the best configuration it has found so far.
## 2. Implementation with full parallelism
First, this is an example of how to write an automl algorithm based on MsgDispatcherBase, rather than Tuner and Assessor. Hyperband is implemented in this way because it integrates the functions of both Tuner and Assessor; thus, we call it an advisor.
Second, this implementation fully leverages Hyperband's internal parallelism. More specifically, the next bucket is not started strictly after the current bucket; instead, it starts when there is available resource.
## 3. Usage
To use Hyperband, you should add the following spec in your experiment's yaml config file:
@@ -43,7 +43,7 @@ Here is a concrete example of `R=81` and `eta=3`:
|3 |3 27 |1 81 | | | |
|4 |1 81 | | | | |
`s` means bucket, `n` means the number of configurations that are generated, and the corresponding `r` means how many STEPS these configurations run. `i` means round; for example, bucket 4 has 5 rounds and bucket 3 has 4 rounds.
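As a hedged sketch of where the numbers in the table come from (this follows the standard Hyperband schedule from the paper, not necessarily NNI's exact implementation):

```python
import math


def hyperband_schedule(R=81, eta=3):
    """Derive each bucket's (n, r) rounds for a given R and eta."""
    s_max = math.floor(math.log(R, eta) + 1e-9)  # 4 when R=81, eta=3
    for s in range(s_max, -1, -1):
        # Initial number of configurations and resource per configuration.
        n = math.ceil((s_max + 1) * eta ** s / (s + 1))
        r = R / eta ** s
        rounds = [(math.floor(n / eta ** i), round(r * eta ** i))
                  for i in range(s + 1)]
        print(f"bucket s={s}: {rounds}")


hyperband_schedule()
# bucket s=4: [(81, 1), (27, 3), (9, 9), (3, 27), (1, 81)]
# bucket s=3: [(34, 3), (11, 9), (3, 27), (1, 81)]
# ...
```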
For how to write the trial code, please refer to the instructions under `examples/trials/mnist-hyperband/`.
...
# Network Morphism Tuner on NNI
## 1. Introduction
[Autokeras](https://arxiv.org/abs/1806.10282) is a popular automl tool that uses Network Morphism. The basic idea of Autokeras is to use Bayesian regression to estimate the metric of a neural network architecture. Each time, it generates several child networks from the father network. Then it uses naïve Bayesian regression to estimate each child's metric value from the history of trained (network, metric) pairs. Next, it chooses the child with the best estimated performance and adds it to the training queue. Inspired by this work and referring to its [code](https://github.com/jhfjhfj1/autokeras), we implemented our Network Morphism method on the NNI platform.
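The generate-estimate-select loop described above can be sketched as follows; this is purely illustrative, and `morph`, `estimate`, and the string-based architectures are made-up stand-ins, not Autokeras or NNI APIs:

```python
import random

# History of (architecture, metric) pairs used to fit the surrogate.
history = [("net-0", 0.90)]


def morph(parent):
    """Stand-in for network morphism: derive child architectures."""
    return [f"{parent}-child{i}" for i in range(3)]


def estimate(child):
    """Stand-in for the Bayesian-regression estimate of a child's metric."""
    return max(metric for _, metric in history) + random.uniform(-0.02, 0.02)


def next_to_train(parent):
    # Choose the child with the best estimated performance for training.
    return max(morph(parent), key=estimate)


print(next_to_train("net-0"))
```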
...
@@ -10,7 +10,7 @@ Click the tab "Overview".
## View job accuracy
Click the tab "Default Metric" to see the point graph of all trials. Hover over each point to see its specific accuracy.
## View hyper parameter
@@ -19,19 +19,14 @@ Click the tab "Hyper Parameter" to see the parallel graph.
* You can select the percentage to see the top trials.
* Choose two axes to swap their positions.
## View trial status
Click the tab "Trials Detail" to see the status of all trials. Specifically:
* Trial duration: the trial's duration, shown in the bar graph.
* Trial detail: the trial's id, duration, start time, end time, status, accuracy, and search space file.
* Kill: you can kill a job whose status is running.
## Feedback
[Known Issues](https://github.com/Microsoft/nni/issues).
@@ -9,44 +9,49 @@ python >= 3.5
## Installation
1. Enter the tools directory.
1. Use pip to install the packages.
   * Install for the current user:
     ```bash
     python3 -m pip install --user -e .
     ```
   * Install for all users:
     ```bash
     python3 -m pip install -e .
     ```
1. Change the mode of the nnictl file.
   ```bash
   chmod +x ./nnictl
   ```
1. Add nnictl to your PATH system environment variable.
   * You could use the `export` command to set the PATH variable temporarily:
     ```bash
     export PATH={your nnictl path}:$PATH
     ```
   * Or you could edit your `/etc/profile` file:
     ```txt
     1. sudo vim /etc/profile
     2. At the end of the file, add
        export PATH={your nnictl path}:$PATH
        save and exit.
     3. source /etc/profile
     ```
## To start using NNI CTL
Please refer to the [NNI CTL document].

[NNI CTL document]: ../docs/NNICTLDOC.md
@@ -20,7 +20,7 @@ If users use NNI system, they only need to:
'''@nni.function_choice(max_pool(h_conv1, self.pool_size),avg_pool(h_conv1, self.pool_size),name=max_pool)'''
In this way, they can easily implement automatic tuning on NNI.
For `@nni.variable`, `nni.choice` is the type of search space, and there are 10 types to express your search space, as follows:
@@ -51,5 +51,5 @@ For `@nni.variable`, `nni.choice` is the type of search space and there are 10 t
9. `@nni.variable(nni.lognormal(label, mu, sigma),name=variable)`
   Which means the variable value is a value drawn according to exp(normal(mu, sigma))
10. `@nni.variable(nni.qlognormal(label, mu, sigma, q),name=variable)`
    Which means the variable value is a value like round(exp(normal(mu, sigma)) / q) * q
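A hedged usage sketch of the annotation syntax above; the variable names and candidate values are illustrative. NNI parses the triple-quoted annotation and, when annotation is enabled, substitutes a tuned value into the assignment that follows it:

```python
# Illustrative only: learning_rate and dropout_rate are made-up names.
'''@nni.variable(nni.choice(0.1, 0.01, 0.001), name=learning_rate)'''
learning_rate = 0.01

'''@nni.variable(nni.uniform(0.5, 0.9), name=dropout_rate)'''
dropout_rate = 0.7
```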