"docs/git@developer.sourcefind.cn:change/sglang.git" did not exist on "48473684cc3e3d080fca85b089375700788f2d7a"
Unverified commit 4dfd9d14, authored by liuzhe-lz, committed by GitHub

Merge pull request #2254 from microsoft/v1.5

Merge v1.5 branch back to master
parents d2c57770 7d586d3f
@@ -108,6 +108,7 @@ Within the following table, we summarized the current NNI capabilities, we are g
<li><a href="docs/en_US/Tuner/BuiltinTuner.md#Evolution">Naïve Evolution</a></li>
<li><a href="docs/en_US/Tuner/BuiltinTuner.md#Anneal">Anneal</a></li>
<li><a href="docs/en_US/Tuner/BuiltinTuner.md#Hyperband">Hyperband</a></li>
<li><a href="docs/en_US/Tuner/BuiltinTuner.md#PBTTuner">PBT</a></li>
</ul>
<b>Bayesian optimization</b>
<ul>
@@ -131,7 +132,8 @@ Within the following table, we summarized the current NNI capabilities, we are g
<li><a href="docs/en_US/NAS/CDARTS.md">CDARTS</a></li>
<li><a href="docs/en_US/NAS/SPOS.md">SPOS</a></li>
<li><a href="docs/en_US/NAS/Proxylessnas.md">ProxylessNAS</a></li>
<li><a href="docs/en_US/Tuner/BuiltinTuner.md#NetworkMorphism">Network Morphism</a></li>
<li><a href="docs/en_US/NAS/TextNAS.md">TextNAS</a></li>
</ul>
</ul>
<a href="docs/en_US/Compressor/Overview.md">Model Compression</a>
...
@@ -112,7 +112,7 @@ jobs:
dependsOn: version_number_validation
condition: succeeded()
pool:
  vmImage: 'macOS-10.15'
strategy:
  matrix:
    Python36:
...
@@ -41,7 +41,8 @@ build:
cp -r $(CWD)../../src/nni_manager/dist $(CWD)nni
cp -r $(CWD)../../src/nni_manager/config $(CWD)nni
cp -r $(CWD)../../src/webui/build $(CWD)nni/static
mkdir -p $(CWD)nni/nasui/build
cp -r $(CWD)../../src/nasui/build/. $(CWD)nni/nasui/build
cp $(CWD)../../src/nasui/server.js $(CWD)nni/nasui
cp $(CWD)../../src/nni_manager/package.json $(CWD)nni
sed -ie 's/$(NNI_VERSION_TEMPLATE)/$(NNI_VERSION_VALUE)/' $(CWD)nni/package.json
...
@@ -119,7 +119,9 @@ trainer.export(file="model_dir/final_architecture.json") # export the final arc
Users can directly run their training file through `python3 train.py` without `nnictl`. After training, users can export the best one of the found models through `trainer.export()`.
Normally, the trainer exposes a few arguments that you can customize, for example, the loss function, the metrics function, the optimizer, and the datasets. These should satisfy most usage needs, and we do our best to make sure our built-in trainers work on as many models, tasks, and datasets as possible, but there is no guarantee. For example, some trainers assume the task is a classification task; some trainers might have a different definition of "epoch" (e.g., an ENAS epoch = some child steps + some controller steps); and most trainers do not support distributed training: they won't wrap your model with `DataParallel` or `DistributedDataParallel`. So after a few tryouts, if you want to use the trainers for your very customized applications, you might need to [customize your trainer](./Advanced.md#extend-the-ability-of-one-shot-trainers).
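For instance, a minimal sketch of plugging your own loss, metrics, optimizer, and datasets into a one-shot trainer might look like this (assuming the DARTS trainer interface; `MyModel`, `accuracy`, `dataset_train`, and `dataset_valid` are hypothetical placeholders for your own task):

```python
import torch
import torch.nn as nn
from nni.nas.pytorch.darts import DartsTrainer

# `MyModel`, `accuracy`, `dataset_train`, and `dataset_valid` are hypothetical
# placeholders that you would define for your own task.
model = MyModel()  # a model containing LayerChoice / InputChoice mutables
criterion = nn.CrossEntropyLoss()  # the customizable loss function
optimizer = torch.optim.SGD(model.parameters(), lr=0.025, momentum=0.9)

trainer = DartsTrainer(
    model,
    loss=criterion,                 # your loss function
    metrics=lambda output, target: accuracy(output, target),  # your metrics
    optimizer=optimizer,            # your optimizer
    num_epochs=50,
    dataset_train=dataset_train,    # your datasets
    dataset_valid=dataset_valid)
trainer.train()
trainer.export(file="final_architecture.json")
```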
Furthermore, one-shot NAS can be visualized with our NAS UI. [See more details.](./Visualization.md)
### Distributed NAS
...
@@ -19,17 +19,19 @@ NNI currently supports the NAS algorithms listed below and is adding more. Users
| [P-DARTS](PDARTS.md) | [Progressive Differentiable Architecture Search: Bridging the Depth Gap between Search and Evaluation](https://arxiv.org/abs/1904.12760) is based on DARTS. It introduces an efficient algorithm which allows the depth of searched architectures to grow gradually during the training procedure. |
| [SPOS](SPOS.md) | [Single Path One-Shot Neural Architecture Search with Uniform Sampling](https://arxiv.org/abs/1904.00420) constructs a simplified supernet trained with a uniform path sampling method and applies an evolutionary algorithm to efficiently search for the best-performing architectures. |
| [CDARTS](CDARTS.md) | [Cyclic Differentiable Architecture Search](https://arxiv.org/abs/****) builds a cyclic feedback mechanism between the search and evaluation networks. It introduces a cyclic differentiable architecture search framework which integrates the two networks into a unified architecture. |
| [ProxylessNAS](Proxylessnas.md) | [ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware](https://arxiv.org/abs/1812.00332) removes the proxy and directly learns architectures for large-scale target tasks and target hardware platforms. |
| [TextNAS](TextNAS.md) | [TextNAS: A Neural Architecture Search Space tailored for Text Representation](https://arxiv.org/pdf/1912.10729.pdf). It is a neural architecture search algorithm tailored for text representation. |
One-shot algorithms run **standalone without nnictl**. Only the PyTorch version has been implemented; TensorFlow 2.x will be supported in a future release.

Here are some common dependencies needed to run the examples. PyTorch needs to be above 1.2 to use ``BoolTensor``.
* NNI 1.2+
* tensorboard
* PyTorch 1.2+
* git
One-shot NAS can be visualized with our visualization tool. Learn more details [here](./Visualization.md).
## Supported Distributed NAS Algorithms

|Name|Brief Introduction of Algorithm|
@@ -49,6 +51,10 @@ The programming interface of designing and searching a model is often demanded i
[Here](./NasGuide.md) is the user guide to get started with using NAS on NNI.
## NAS Visualization
To help users track the progress and status of the model search under a specified search space, we developed a visualization tool. It visualizes the search space as a super-net and shows the importance of subnets and layers/operations, as well as how that importance changes along with the search process. Please refer to [the NAS visualization documentation](./Visualization.md) for how to use it.
## Reference and Feedback

[1]: https://arxiv.org/abs/1802.03268
...
# NAS Visualization (Experimental)
## Built-in Trainers Support
Currently, only ENAS and DARTS support visualization. The [ENAS](./ENAS.md) and [DARTS](./DARTS.md) examples demonstrate how to enable visualization in your code, namely, by adding the following before `trainer.train()`:
```python
trainer.enable_visualization()
```
This will create a directory `logs/<current_time_stamp>` in your working folder, in which you will find two files: `graph.json` and `log`.

You don't have to wait until your program finishes to launch the NAS UI, but it's important that these two files already exist. Launch the NAS UI with
```bash
nnictl webui nas --logdir logs/<current_time_stamp> --port <port>
```
## Visualize a Customized Trainer
If you are interested in how to customize a trainer, please read this [doc](./Advanced.md#extend-the-ability-of-one-shot-trainers).
You should do two modifications to an existing trainer to enable visualization:
1. Export your graph before training, with
```python
vis_graph = self.mutator.graph(inputs)
# `inputs` is a dummy input to your model. For example, torch.randn((1, 3, 32, 32)).cuda()
# If your model has multiple inputs, it should be a tuple.
with open("/path/to/your/logdir/graph.json", "w") as f:
    json.dump(vis_graph, f)
```
2. Log the choices you've made. You can do it once per epoch, once per mini-batch, or at whatever frequency you'd like.
```python
import json  # needed for serializing the status records

# inside your trainer class:
def __init__(self):
    # ...
    self.status_writer = open("/path/to/your/logdir/log", "w")  # create a writer

def train(self):
    # ...
    # dump a record of the mutator status
    print(json.dumps(self.mutator.status()), file=self.status_writer, flush=True)
```
If you are implementing a customized trainer that inherits `Trainer`, we have provided `enable_visualization()` and `_write_graph_status()` for ease of use. All you need to do is call `trainer.enable_visualization()` before training starts, and `trainer._write_graph_status()` each time you want to log. But remember that both of these APIs are experimental and subject to change in the future.
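As a rough illustration, a customized trainer might wire in these hooks as follows (a sketch only; the `train_one_epoch`/`validate_one_epoch` method names are assumed from the `Trainer` base class, and the constructor arguments are elided):

```python
from nni.nas.pytorch.trainer import Trainer

class MyTrainer(Trainer):
    # A hypothetical customized trainer; only the visualization hooks are shown.
    def train_one_epoch(self, epoch):
        # ... your training logic for one epoch ...
        self._write_graph_status()  # log the current mutator status for NAS UI

    def validate_one_epoch(self, epoch):
        pass  # ... your validation logic ...

trainer = MyTrainer(...)          # constructor arguments elided
trainer.enable_visualization()    # writes graph.json and opens the log writer
trainer.train()
```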
Last but not least, invoke the NAS UI with
```bash
nnictl webui nas --logdir /path/to/your/logdir
```
## NAS UI Preview
![](../../img/nasui-1.png)
![](../../img/nasui-2.png)
## Limitations
* NAS visualization only works with PyTorch >= 1.4. We've tested it on PyTorch 1.3.1 and it doesn't work.
* We rely on PyTorch's tensorboard support for graph export, which in turn relies on `torch.jit`. It will not work if your model doesn't support `jit`; a quick way to check this is sketched after this list.
* There are known performance issues when loading a moderate-size graph with many op choices (e.g., the DARTS search space).
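To check the `jit` limitation above, you can try tracing your model before enabling visualization (a sketch; `MyModel` and the input shape are hypothetical placeholders):

```python
import torch

model = MyModel()  # hypothetical: the model you plan to search over
dummy_input = torch.randn(1, 3, 32, 32)  # match your model's expected input
try:
    torch.jit.trace(model, dummy_input)
    print("Model is jit-traceable; graph export should work.")
except Exception as exc:
    print(f"torch.jit cannot trace this model: {exc}")
```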
## Feedback
NAS UI is currently experimental. We welcome your feedback. [Here](https://github.com/microsoft/nni/pull/2085) we have listed all the to-do items of NAS UI. Feel free to comment (or [submit a new issue](https://github.com/microsoft/nni/issues/new?template=enhancement.md)) if you have other suggestions.
@@ -463,13 +463,13 @@ tuner:

**Suggested scenario**
Population Based Training (PBT) bridges and extends parallel search methods and sequential optimization methods. It requires relatively few computational resources, periodically inheriting weights from well-performing trials in order to explore better ones. With PBTTuner, users finally get a trained model, rather than a configuration that could reproduce the trained model by training from scratch; this is because model weights are inherited periodically throughout the search process. PBT can thus also be seen as a training approach. If you don't need a specific configuration but just expect a good model, PBTTuner is a good choice. [See details](./PBTTuner.md)
**classArgs requirements:**

* **optimize_mode** (*'maximize' or 'minimize'*) - If 'maximize', the tuner will try to maximize metrics. If 'minimize', the tuner will try to minimize metrics.
* **all_checkpoint_dir** (*str, optional, default = None*) - Directory for trials to load and save checkpoints. If not specified, the directory will be "~/nni/checkpoint/<exp-id>". Note that if the experiment is not in local mode, users should provide a path in shared storage which can be accessed by all the trials.
* **population_size** (*int, optional, default = 10*) - Number of trials in a population; each step runs this number of trials. In our implementation, one step is running each trial for a specific number of training epochs set by users.
* **factors** (*tuple, optional, default = (1.2, 0.8)*) - Factors for perturbation of hyperparameters.
* **fraction** (*float, optional, default = 0.2*) - Fraction for selecting bottom and top trials.
@@ -482,6 +482,10 @@ tuner:
  classArgs:
    optimize_mode: maximize
```
Note that to use this tuner, your trial code needs to be modified accordingly; please refer to [the document of PBTTuner](./PBTTuner.md) for details.
## **Reference and Feedback**

* To [report a bug](https://github.com/microsoft/nni/issues/new?template=bug-report.md) for this feature in GitHub;
* To [file a feature or improvement request](https://github.com/microsoft/nni/issues/new?template=enhancement.md) for this feature in GitHub;
...
@@ -5,8 +5,48 @@ PBT Tuner on NNI
Population Based Training (PBT) comes from [Population Based Training of Neural Networks](https://arxiv.org/abs/1711.09846v1). It's a simple asynchronous optimization algorithm which effectively utilizes a fixed computational budget to jointly optimize a population of models and their hyperparameters to maximize performance. Importantly, PBT discovers a schedule of hyperparameter settings rather than following the generally sub-optimal strategy of trying to find a single fixed set to use for the whole course of training.
![](../../img/pbt.jpg)
PBTTuner initializes a population with several trials (i.e., `population_size`). There are four steps in the figure above, and each trial runs for only one step. How long one step is, is controlled by the trial code, e.g., one epoch. When a trial starts, it loads a checkpoint specified by PBTTuner and continues to run one step, then saves a checkpoint to a directory specified by PBTTuner and exits. The trials in a population run steps synchronously; that is, after all the trials finish the `i`-th step, the `(i+1)`-th step can start. Exploitation and exploration of PBT are executed between two consecutive steps.
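Conceptually, the exploitation and exploration between two steps work roughly like the sketch below (an illustration of the idea rather than NNI's actual implementation; the trial records and field names are hypothetical):

```python
import random

def exploit_and_explore(population, fraction=0.2, factors=(1.2, 0.8)):
    """One PBT exploitation/exploration pass between two steps.

    `population` is a hypothetical list of dicts with 'score',
    'hyperparameters', and 'checkpoint_dir' keys.
    """
    population.sort(key=lambda t: t['score'], reverse=True)
    cutoff = max(1, int(len(population) * fraction))
    tops, bottoms = population[:cutoff], population[-cutoff:]
    for trial in bottoms:
        source = random.choice(tops)
        # Exploit: inherit the checkpoint (weights) and hyperparameters
        # of a well-performing trial.
        trial['checkpoint_dir'] = source['checkpoint_dir']
        trial['hyperparameters'] = dict(source['hyperparameters'])
        # Explore: perturb each numeric hyperparameter by a random factor.
        for name, value in trial['hyperparameters'].items():
            if isinstance(value, (int, float)):
                trial['hyperparameters'][name] = value * random.choice(factors)
    return population
```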
### Provide checkpoint directory
Since some trials need to load other trials' checkpoints, users should provide a directory (i.e., `all_checkpoint_dir`) which is accessible by every trial. This is easy in local mode: users can directly use the default directory or specify any directory on the local machine. For other training services, users should follow [the document of those training services](../TrainingService/SupportTrainingService.md) to provide a directory in shared storage, such as NFS or Azure storage.
### Modify your trial code
Before running a step, a trial needs to load a checkpoint; the checkpoint directory is specified in the hyper-parameter configuration generated by PBTTuner, i.e., `params['load_checkpoint_dir']`. Similarly, the directory for saving a checkpoint is also included in the configuration, i.e., `params['save_checkpoint_dir']`. Here, `all_checkpoint_dir` is the base folder of `load_checkpoint_dir` and `save_checkpoint_dir`, whose format is `all_checkpoint_dir/<population-id>/<step>`.
```python
import os

import nni

params = nni.get_next_parameter()
# the path of the checkpoint to load
load_path = os.path.join(params['load_checkpoint_dir'], 'model.pth')
# load checkpoint from `load_path`
...
# run one step
...
# the path for saving a checkpoint
save_path = os.path.join(params['save_checkpoint_dir'], 'model.pth')
# save checkpoint to `save_path`
...
```
The complete example code can be found [here](https://github.com/microsoft/nni/tree/master/examples/trials/mnist-pbt-tuner-pytorch).
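For a more concrete picture, here is a PyTorch-flavored sketch of the full load/run/save cycle (the model and the `train_one_epoch`/`evaluate` helpers are hypothetical placeholders, and `lr` is assumed to be in the search space):

```python
import os

import nni
import torch

params = nni.get_next_parameter()
model = MyModel()  # hypothetical model
optimizer = torch.optim.SGD(model.parameters(), lr=params['lr'])

# Resume from the checkpoint assigned by PBTTuner, if one exists
# (there is none on the very first step).
load_path = os.path.join(params['load_checkpoint_dir'], 'model.pth')
if os.path.isfile(load_path):
    model.load_state_dict(torch.load(load_path))

train_one_epoch(model, optimizer)   # hypothetical: run one PBT step
accuracy = evaluate(model)          # hypothetical evaluation
nni.report_final_result(accuracy)

# Save the checkpoint where PBTTuner expects it for the next step.
os.makedirs(params['save_checkpoint_dir'], exist_ok=True)
save_path = os.path.join(params['save_checkpoint_dir'], 'model.pth')
torch.save(model.state_dict(), save_path)
```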
### Experiment config
Below is an example of PBTTuner configuration in the experiment config file. **Note that an Assessor is not allowed if PBTTuner is used.**
```yaml
# config.yml
tuner:
  builtinTunerName: PBTTuner
  classArgs:
    optimize_mode: maximize
    all_checkpoint_dir: /the/path/to/store/checkpoints
    population_size: 10
```
### Limitation
Importing data is not supported yet.
@@ -5,7 +5,8 @@ PPO Tuner on NNI
This is a tuner geared for NNI's Neural Architecture Search (NAS) interface. It uses the [PPO algorithm](https://arxiv.org/abs/1707.06347). The implementation inherits the main logic of OpenAI's ppo2 implementation [here](https://github.com/openai/baselines/tree/master/baselines/ppo2) and is adapted for the NAS scenario.
We successfully tuned the mnist-nas example, with the following result:
**NOTE: we are refactoring this example to the latest NAS interface and will publish the example code after the refactoring.**
![](../../img/ppo_mnist.png)
...
@@ -525,7 +525,7 @@ Used to specify designated GPU devices for NNI, if it is set, only the specified

#### maxTrialNumPerGpu

Optional. Integer. Default: 1.

Used to specify the maximum number of concurrent trials on a GPU device.
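For example, to allow up to two concurrent trials per GPU in local mode, the relevant fragment of the config file might look like the sketch below (assuming the local training service, where this field sits under `localConfig`):

```yaml
trainingServicePlatform: local
localConfig:
  gpuIndices: 0,1
  maxTrialNumPerGpu: 2
  useActiveGpu: false
```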
...
@@ -45,7 +45,7 @@ Probably it's a problem with your network config. Here is a checklist.

### NNI on Windows problems

Please refer to [NNI on Windows](InstallationWin.md)

### More FAQ issues
...
@@ -28,4 +28,5 @@ For details, please refer to the following tutorials:
ProxylessNAS <NAS/Proxylessnas>
TextNAS <NAS/TextNAS>
Customize a NAS Algorithm <NAS/Advanced>
NAS Visualization <NAS/Visualization>
API Reference <NAS/NasReference>
@@ -27,7 +27,7 @@ prune_config = {
'model_name': 'naive',
'pruner_class': AGP_Pruner,
'config_list': [{
    'initial_sparsity': 0.,
    'final_sparsity': 0.8,
    'start_epoch': 0,
    'end_epoch': 10,
...
@@ -24,6 +24,7 @@ if __name__ == "__main__":
parser.add_argument("--epochs", default=50, type=int)
parser.add_argument("--channels", default=16, type=int)
parser.add_argument("--unrolled", default=False, action="store_true")
parser.add_argument("--visualization", default=False, action="store_true")
args = parser.parse_args()

dataset_train, dataset_valid = datasets.get_dataset("cifar10")
@@ -45,4 +46,6 @@ if __name__ == "__main__":
log_frequency=args.log_frequency,
unrolled=args.unrolled,
callbacks=[LRSchedulerCallback(lr_scheduler), ArchitectureCheckpoint("./checkpoints")])
if args.visualization:
trainer.enable_visualization()
trainer.train()
@@ -25,6 +25,7 @@ if __name__ == "__main__":
parser.add_argument("--log-frequency", default=10, type=int)
parser.add_argument("--search-for", choices=["macro", "micro"], default="macro")
parser.add_argument("--epochs", default=None, type=int, help="Number of epochs (default: macro 310, micro 150)")
parser.add_argument("--visualization", default=False, action="store_true")
args = parser.parse_args()

dataset_train, dataset_valid = datasets.get_dataset("cifar10")
@@ -55,4 +56,6 @@ if __name__ == "__main__":
dataset_valid=dataset_valid,
log_frequency=args.log_frequency,
mutator=mutator)
if args.visualization:
trainer.enable_visualization()
trainer.train()
@@ -68,5 +68,6 @@ if __name__ == "__main__":
dataset_valid=dataset_valid,
batch_size=64,
log_frequency=10)
trainer.enable_visualization()
trainer.train()
trainer.export("checkpoint.json")
authorName: default
experimentName: example_mnist
trialConcurrency: 1
maxExecDuration: 1h
maxTrialNum: 10
#choice: local, remote, pai
trainingServicePlatform: local
#choice: true, false
useAnnotation: true
tuner:
  builtinTunerName: TPE
trial:
  command: python3 mnist.py --batch_num 200
  codeDir: .
  gpuNum: 0
  nasMode: classic_mode