update ga_squad example (#461)

* update ga_squad experiment example on pai
* Update config_pai.yml
* Update README.md
* Update config_pai.yml
* Update README.md
* Update README.md
* Update README.md

Commit 23530bb6 authored Dec 07, 2018 by ShufanHuang Committed by xuehui Dec 07, 2018

Showing with 71 additions and 9 deletions

examples/trials/ga_squad/README.md +65 -5

examples/trials/ga_squad/config_pai.yml +6 -4

--- a/examples/trials/ga_squad/README.md
+++ b/examples/trials/ga_squad/README.md
@@ -20,7 +20,9 @@ Also we have another version which time cost is less and performance is better.

 # How to run this example?

-## Use downloading script to download data
+## Run this example on local or remote
+
+### Use downloading script to download data

 Execute the following command to download needed files
 using the downloading script:
@@ -30,7 +32,7 @@ chmod +x ./download.sh
 ./download.sh
 ```

-## Download manually
+### Download manually

 1. download "dev-v1.1.json" and "train-v1.1.json" in https://rajpurkar.github.io/SQuAD-explorer/

@@ -46,8 +48,8 @@ wget http://nlp.stanford.edu/data/glove.840B.300d.zip
 unzip glove.840B.300d.zip
 ```

-## Update configuration
-Modify `nni/examples/trials/ga_squad/config.yaml`, here is the default configuration:
+### Update configuration
+Modify `nni/examples/trials/ga_squad/config.yml`, here is the default configuration:

 ```
 authorName: default
@@ -75,12 +77,70 @@ In the "trial" part, if you want to use GPU to perform the architecture search,

 `trialConcurrency` is the number of trials running concurrently, which is the number of GPUs you want to use, if you are setting `gpuNum` to 1.

-## submit this job
+### submit this job

 ```
 nnictl create --config ~/nni/examples/trials/ga_squad/config.yml
 ```

+## Run this example on OpenPAI
+
+Due to the memory limitation of upload, we only upload the source code and complete the data download and training on OpenPAI. This experiment requires sufficient memory that `memoryMB >= 32G`, and the training may last for several hours.
+
+### Update configuration
+Modify `nni/examples/trials/ga_squad/config_pai.yaml`, here is the default configuration:
+
+```
+authorName: default
+experimentName: example_ga_squad
+trialConcurrency: 1
+maxExecDuration: 1h
+maxTrialNum: 10
+#choice: local, remote, pai
+trainingServicePlatform: pai
+#choice: true, false
+useAnnotation: false
+#Your nni_manager ip
+nniManagerIp: 10.10.10.10
+tuner:
+  codeDir: ../../tuners/ga_customer_tuner
+  classFileName: customer_tuner.py
+  className: CustomerTuner
+  classArgs:
+    optimize_mode: maximize
+trial:
+  command: chmod +x ./download.sh && ./download.sh && python3 trial.py
+  codeDir: .
+  gpuNum: 0
+  cpuNum: 1
+  memoryMB: 32869
+  #The docker image to run nni job on pai
+  image: msranni/nni:latest
+  #The hdfs directory to store data on pai, format 'hdfs://host:port/directory'
+  dataDir: hdfs://10.10.10.10:9000/username/nni
+  #The hdfs directory to store output data generated by nni, format 'hdfs://host:port/directory'
+  outputDir: hdfs://10.10.10.10:9000/username/nni
+paiConfig:
+  #The username to login pai
+  userName: username
+  #The password to login pai
+  passWord: password
+  #The host of restful server of pai
+  host: 10.10.10.10
+```
+
+Please change the default value to your personal account and machine information. Including `nniManagerIp`, `dataDir`, `outputDir`, `userName`, `passWord` and `host`.
+
+In the "trial" part, if you want to use GPU to perform the architecture search, change `gpuNum` from `0` to `1`. You need to increase the `maxTrialNum` and `maxExecDuration`, according to how long you want to wait for the search result.
+
+`trialConcurrency` is the number of trials running concurrently, which is the number of GPUs you want to use, if you are setting `gpuNum` to 1.
+
+### submit this job
+
+```
+nnictl create --config ~/nni/examples/trials/ga_squad/config_pai.yml
+```
+
 # Techinal details about the trial

 ## How does it works

--- a/examples/trials/ga_squad/config_pai.yml
+++ b/examples/trials/ga_squad/config_pai.yml
@@ -7,18 +7,20 @@ maxTrialNum: 10
 trainingServicePlatform: pai
 #choice: true, false
 useAnnotation: false
+#Your nni_manager ip
+nniManagerIp: 10.10.10.10
 tuner:
-  codeDir: ../tuners/ga_customer_tuner
+  codeDir: ../../tuners/ga_customer_tuner
  classFileName: customer_tuner.py
  className: CustomerTuner
  classArgs:
    optimize_mode: maximize
 trial:
-  command: python3 trial.py
+  command: chmod +x ./download.sh && ./download.sh && python3 trial.py
  codeDir: .
  gpuNum: 0
  cpuNum: 1
-  memoryMB: 8196
+  memoryMB: 32869
  #The docker image to run nni job on pai
  image: msranni/nni:latest
  #The hdfs directory to store data on pai, format 'hdfs://host:port/directory'
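The README change asks users to replace the placeholder values for `nniManagerIp`, `dataDir`, `outputDir`, `userName`, `passWord` and `host`, and states that the experiment needs `memoryMB >= 32G`. A minimal sketch of a pre-submit check for those two points might look like the following. It is not part of this commit or of NNI: the names `parse_flat`, `check_config`, `PLACEHOLDERS` and `MIN_MEMORY_MB` are hypothetical, the 32 * 1024 threshold is one assumed reading of "32G", and the line-based scan is a deliberate simplification rather than a real YAML parser.

```python
import re

# Fields the README says to replace, with the placeholder defaults from the diff.
PLACEHOLDERS = {
    "nniManagerIp": "10.10.10.10",
    "dataDir": "10.10.10.10",
    "outputDir": "10.10.10.10",
    "userName": "username",
    "passWord": "password",
    "host": "10.10.10.10",
}
# Assumption: "32G" is read as 32 * 1024 MB.
MIN_MEMORY_MB = 32 * 1024


def parse_flat(text):
    """Collect `key: value` pairs, ignoring comments and nesting (not a YAML parser)."""
    pairs = {}
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        match = re.match(r"([A-Za-z_]+):\s*(.*)$", stripped)
        if match and match.group(2):
            pairs[match.group(1)] = match.group(2)
    return pairs


def check_config(text):
    """Return a list of human-readable problems found in the config text."""
    pairs = parse_flat(text)
    problems = []
    for key, default in PLACEHOLDERS.items():
        value = pairs.get(key)
        if value is None:
            problems.append(f"{key}: missing")
        elif default in value:
            # Substring match, so a real value that merely contains the
            # placeholder text is also flagged.
            problems.append(f"{key}: still contains placeholder {default!r}")
    memory = pairs.get("memoryMB", "")
    if not memory.isdigit() or int(memory) < MIN_MEMORY_MB:
        problems.append(f"memoryMB: expected at least {MIN_MEMORY_MB}")
    return problems


# A trimmed-down config using values that appear in the diff.
SAMPLE = """\
nniManagerIp: 10.10.10.10
trial:
  memoryMB: 32869
  dataDir: hdfs://10.10.10.10:9000/username/nni
  outputDir: hdfs://10.10.10.10:9000/username/nni
paiConfig:
  userName: username
  passWord: password
  host: 10.10.10.10
"""

if __name__ == "__main__":
    for problem in check_config(SAMPLE):
        print(problem)
```

To check a real file, read it and pass the text in, for example `check_config(open("config_pai.yml").read())`. The flat scan is enough here only because the checked keys are unique within the file; a config with repeated key names would need a proper YAML parser.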