Commit 23530bb6 authored by ShufanHuang, committed by xuehui

update ga_squad example (#461)

* update ga_squad experiment example on pai

* Update config_pai.yml

* Update README.md

* Update config_pai.yml

* Update README.md

* Update README.md

* Update README.md
parent a6621cef
......@@ -20,7 +20,9 @@ Also we have another version which time cost is less and performance is better.
# How to run this example?
## Use downloading script to download data
## Run this example on local or remote
### Use downloading script to download data
Execute the following command to download needed files
using the downloading script:
......@@ -30,7 +32,7 @@ chmod +x ./download.sh
./download.sh
```
## Download manually
### Download manually
1. download "dev-v1.1.json" and "train-v1.1.json" in https://rajpurkar.github.io/SQuAD-explorer/
......@@ -46,8 +48,8 @@ wget http://nlp.stanford.edu/data/glove.840B.300d.zip
unzip glove.840B.300d.zip
```
## Update configuration
Modify `nni/examples/trials/ga_squad/config.yaml`, here is the default configuration:
### Update configuration
Modify `nni/examples/trials/ga_squad/config.yml`, here is the default configuration:
```
authorName: default
......@@ -75,12 +77,70 @@ In the "trial" part, if you want to use GPU to perform the architecture search,
`trialConcurrency` is the number of trials running concurrently, which is the number of GPUs you want to use, if you are setting `gpuNum` to 1.
## submit this job
### submit this job
```
nnictl create --config ~/nni/examples/trials/ga_squad/config.yml
```
## Run this example on OpenPAI
Because of the upload size limit, only the source code is uploaded; the data download and the training both run on OpenPAI. This experiment needs a large amount of memory (`memoryMB >= 32G`), and the training may last for several hours.
### Update configuration
Modify `nni/examples/trials/ga_squad/config_pai.yml`, here is the default configuration:
```
authorName: default
experimentName: example_ga_squad
trialConcurrency: 1
maxExecDuration: 1h
maxTrialNum: 10
#choice: local, remote, pai
trainingServicePlatform: pai
#choice: true, false
useAnnotation: false
#Your nni_manager ip
nniManagerIp: 10.10.10.10
tuner:
codeDir: ../../tuners/ga_customer_tuner
classFileName: customer_tuner.py
className: CustomerTuner
classArgs:
optimize_mode: maximize
trial:
command: chmod +x ./download.sh && ./download.sh && python3 trial.py
codeDir: .
gpuNum: 0
cpuNum: 1
memoryMB: 32869
#The docker image to run nni job on pai
image: msranni/nni:latest
#The hdfs directory to store data on pai, format 'hdfs://host:port/directory'
dataDir: hdfs://10.10.10.10:9000/username/nni
#The hdfs directory to store output data generated by nni, format 'hdfs://host:port/directory'
outputDir: hdfs://10.10.10.10:9000/username/nni
paiConfig:
#The username to login pai
userName: username
#The password to login pai
passWord: password
#The host of restful server of pai
host: 10.10.10.10
```
Please replace the default values with your own account and machine information, including `nniManagerIp`, `dataDir`, `outputDir`, `userName`, `passWord` and `host`.
In the "trial" part, if you want to use GPU to perform the architecture search, change `gpuNum` from `0` to `1`. You need to increase the `maxTrialNum` and `maxExecDuration`, according to how long you want to wait for the search result.
`trialConcurrency` is the number of trials running concurrently, which is the number of GPUs you want to use, if you are setting `gpuNum` to 1.
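Before submitting, it can help to sanity-check the edited values. The following is a minimal sketch, not part of NNI: `check_hdfs_uri`, `find_problems` and `MIN_MEMORY_MB` are hypothetical names, the config is assumed to be already loaded into a plain Python dict, and "32G" is assumed to mean 32 * 1024 MB.

```python
from urllib.parse import urlparse

# Assumption: "32G" in the README is read as 32 * 1024 MB.
MIN_MEMORY_MB = 32 * 1024


def check_hdfs_uri(uri: str) -> bool:
    """Return True if uri looks like 'hdfs://host:port/directory'."""
    parsed = urlparse(uri)
    try:
        port = parsed.port  # raises ValueError when the port is not numeric
    except ValueError:
        return False
    return (
        parsed.scheme == "hdfs"
        and bool(parsed.hostname)
        and port is not None
        and len(parsed.path) > 1  # more than just "/"
    )


def find_problems(config: dict) -> list:
    """Collect human-readable issues found in a config_pai.yml-style dict."""
    problems = []
    trial = config.get("trial", {})
    pai = config.get("paiConfig", {})

    if trial.get("memoryMB", 0) < MIN_MEMORY_MB:
        problems.append("trial.memoryMB is below 32 * 1024 MB")

    for key in ("dataDir", "outputDir"):
        if not check_hdfs_uri(str(trial.get(key, ""))):
            problems.append(f"trial.{key} is not of the form hdfs://host:port/directory")

    # Values copied from the default configuration that should be replaced.
    untouched = {
        "nniManagerIp": config.get("nniManagerIp") == "10.10.10.10",
        "paiConfig.userName": pai.get("userName") == "username",
        "paiConfig.passWord": pai.get("passWord") == "password",
        "paiConfig.host": pai.get("host") == "10.10.10.10",
    }
    problems += [f"{name} still has its default value"
                 for name, same in untouched.items() if same]
    return problems
```

Calling `find_problems` on the unmodified default configuration reports the four placeholder fields; an empty list means none of these particular checks failed, not that the job is guaranteed to run.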
### submit this job
```
nnictl create --config ~/nni/examples/trials/ga_squad/config_pai.yml
```
# Technical details about the trial
## How does it work
......
......@@ -7,18 +7,20 @@ maxTrialNum: 10
trainingServicePlatform: pai
#choice: true, false
useAnnotation: false
#Your nni_manager ip
nniManagerIp: 10.10.10.10
tuner:
codeDir: ../tuners/ga_customer_tuner
codeDir: ../../tuners/ga_customer_tuner
classFileName: customer_tuner.py
className: CustomerTuner
classArgs:
optimize_mode: maximize
trial:
command: python3 trial.py
command: chmod +x ./download.sh && ./download.sh && python3 trial.py
codeDir: .
gpuNum: 0
cpuNum: 1
memoryMB: 8196
memoryMB: 32869
#The docker image to run nni job on pai
image: msranni/nni:latest
#The hdfs directory to store data on pai, format 'hdfs://host:port/directory'
......