Unverified commit 8a60d624, authored by SparkSnail, committed by GitHub

Support paiStorageConfigName (#2536)

parent 52f71f54
...@@ -16,7 +16,7 @@ Step 3. Mount NFS storage to local machine. ...@@ -16,7 +16,7 @@ Step 3. Mount NFS storage to local machine.
![](../../img/pai_job_submission_page.jpg) ![](../../img/pai_job_submission_page.jpg)
Find the data management region in the job submission page. Find the data management region in the job submission page.
![](../../img/pai_data_management_page.jpg) ![](../../img/pai_data_management_page.jpg)
The `DEFAULT_STORAGE`field is the path to be mounted in PAI's container when a job is started. The `Preview container paths` is the NFS host and path that PAI provided, you need to mount the corresponding host and path to your local machine first, then NNI could use the PAI's NFS storage. The `Preview container paths` is the NFS host and path that PAI provided, you need to mount the corresponding host and path to your local machine first, then NNI could use the PAI's NFS storage.
For example, use the following command: For example, use the following command:
``` ```
sudo mount -t nfs4 gcr-openpai-infra02:/pai/data /local/mnt sudo mount -t nfs4 gcr-openpai-infra02:/pai/data /local/mnt
...@@ -25,13 +25,14 @@ Then the `/data` folder in container will be mounted to `/local/mnt` folder in y ...@@ -25,13 +25,14 @@ Then the `/data` folder in container will be mounted to `/local/mnt` folder in y
You could use the following configuration in your NNI's config file: You could use the following configuration in your NNI's config file:
``` ```
nniManagerNFSMountPath: /local/mnt nniManagerNFSMountPath: /local/mnt
containerNFSMountPath: /data
``` ```
Step 4. Get PAI's storage plugin name. Step 4. Get PAI's storage config name and nniManagerMountPath
Contact PAI's admin, and get the PAI's storage plugin name for NFS storage. The default storage name is `teamwise_storage`, the configuration in NNI's config file is in following value: The `Team share storage` field is storage configuration used to specify storage value in PAI. You can get `paiStorageConfigName` and `containerNFSMountPath` field in `Team share storage`, for example:
``` ```
paiStoragePlugin: teamwise_storage paiStorageConfigName: confignfs-data
containerNFSMountPath: /mnt/confignfs-data
``` ```
## Run an experiment ## Run an experiment
...@@ -66,7 +67,7 @@ trial: ...@@ -66,7 +67,7 @@ trial:
virtualCluster: default virtualCluster: default
nniManagerNFSMountPath: /home/user/mnt nniManagerNFSMountPath: /home/user/mnt
containerNFSMountPath: /mnt/data/user containerNFSMountPath: /mnt/data/user
paiStoragePlugin: teamwise_storage paiStorageConfigName: confignfs-data
# Configuration to access OpenPAI Cluster # Configuration to access OpenPAI Cluster
paiConfig: paiConfig:
userName: your_pai_nni_user userName: your_pai_nni_user
...@@ -90,13 +91,13 @@ Compared with [LocalMode](LocalMode.md) and [RemoteMachineMode](RemoteMachineMod ...@@ -90,13 +91,13 @@ Compared with [LocalMode](LocalMode.md) and [RemoteMachineMode](RemoteMachineMod
* Required key. Set the mount path in your nniManager machine. * Required key. Set the mount path in your nniManager machine.
* containerNFSMountPath * containerNFSMountPath
* Required key. Set the mount path in your container used in PAI. * Required key. Set the mount path in your container used in PAI.
* paiStoragePlugin * paiStorageConfigName:
* Optional key. Set the storage plugin name used in PAI. If it is not set in trial configuration, it should be set in the config file specified in `paiConfigPath` field. * Optional key. Set the storage name used in PAI. If it is not set in trial configuration, it should be set in the config file specified in `paiConfigPath` field.
* command * command
* Optional key. Set the commands used in PAI container. * Optional key. Set the commands used in PAI container.
* paiConfigPath * paiConfigPath
* Optional key. Set the file path of pai job configuration, the file is in yaml format. * Optional key. Set the file path of pai job configuration, the file is in yaml format.
If users set `paiConfigPath` in NNI's configuration file, no need to specify the fields `command`, `paiStoragePlugin`, `virtualCluster`, `image`, `memoryMB`, `cpuNum`, `gpuNum` in `trial` configuration. These fields will use the values from the config file specified by `paiConfigPath`. If users set `paiConfigPath` in NNI's configuration file, no need to specify the fields `command`, `paiStorageConfigName`, `virtualCluster`, `image`, `memoryMB`, `cpuNum`, `gpuNum` in `trial` configuration. These fields will use the values from the config file specified by `paiConfigPath`.
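The `paiConfigPath` rule above can be read as a simple check. The sketch below is a hypothetical helper, not NNI's actual validation: the `TrialSection` interface and the `missingTrialFields` function are names invented for illustration, and only the field names come from the documentation. It assumes every listed field is needed when `paiConfigPath` is absent, which is stricter than NNI itself (for example, `virtualCluster` is declared optional in `NNIPAIK8STrialConfig`).

```typescript
// Hypothetical shape of the `trial` section; only the field names come from the docs.
interface TrialSection {
    command?: string;
    paiStorageConfigName?: string;
    virtualCluster?: string;
    image?: string;
    memoryMB?: number;
    cpuNum?: number;
    gpuNum?: number;
    paiConfigPath?: string;
}

// Fields the documentation says may be omitted when `paiConfigPath` is set.
const FIELDS_COVERED_BY_PAI_CONFIG: (keyof TrialSection)[] = [
    'command', 'paiStorageConfigName', 'virtualCluster',
    'image', 'memoryMB', 'cpuNum', 'gpuNum'
];

// Returns the names of the fields that would still have to be given in `trial`.
// Assumption: without `paiConfigPath`, every listed field is treated as needed.
function missingTrialFields(trial: TrialSection): string[] {
    if (trial.paiConfigPath !== undefined && trial.paiConfigPath.length > 0) {
        // Values are taken from the PAI job config file instead.
        return [];
    }
    return FIELDS_COVERED_BY_PAI_CONFIG.filter((key) => trial[key] === undefined);
}
```

For instance, a `trial` section that only sets `paiConfigPath` yields an empty list, while one that sets only `command` and `image` reports the remaining five fields.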
``` ```
Note: Note:
1. The job name in PAI's configuration file will be replaced by a new job name, the new job name is created by NNI, the name format is nni_exp_${this.experimentId}_trial_${trialJobId}. 1. The job name in PAI's configuration file will be replaced by a new job name, the new job name is created by NNI, the name format is nni_exp_${this.experimentId}_trial_${trialJobId}.
...@@ -127,7 +128,7 @@ And you will be redirected to HDFS web portal to browse the output files of that ...@@ -127,7 +128,7 @@ And you will be redirected to HDFS web portal to browse the output files of that
You can see there are three files in the output folder: stderr, stdout, and trial.log You can see there are three files in the output folder: stderr, stdout, and trial.log
## data management ## data management
Befour using NNI to start your experiment, users should set the corresponding mount data path in your nniManager machine. PAI has their own storage(NFS, AzureBlob ...), and the storage will used in PAI will be mounted to the container when it start a job. Users should set the PAI storage type by `paiStoragePlugin` field to choose a storage in PAI. Then users should mount the storage to their nniManager machine, and set the `nniManagerNFSMountPath` field in configuration file, NNI will generate bash files and copy data in `codeDir` to the `nniManagerNFSMountPath` folder, then NNI will start a trial job. The data in `nniManagerNFSMountPath` will be sync to PAI storage, and will be mounted to PAI's container. The data path in container is set in `containerNFSMountPath`, NNI will enter this folder first, and then run scripts to start a trial job. Before using NNI to start your experiment, users should set the corresponding mount data path in your nniManager machine. PAI has their own storage(NFS, AzureBlob ...), and the storage will used in PAI will be mounted to the container when it start a job. Users should set the PAI storage type by `paiStorageConfigName` field to choose a storage in PAI. Then users should mount the storage to their nniManager machine, and set the `nniManagerNFSMountPath` field in configuration file, NNI will generate bash files and copy data in `codeDir` to the `nniManagerNFSMountPath` folder, then NNI will start a trial job. The data in `nniManagerNFSMountPath` will be sync to PAI storage, and will be mounted to PAI's container. The data path in container is set in `containerNFSMountPath`, NNI will enter this folder first, and then run scripts to start a trial job.
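The relationship between `nniManagerNFSMountPath` and `containerNFSMountPath` described above amounts to a prefix swap, assuming both mount points expose the same storage root. The helper below is a hypothetical illustration (the function name `toContainerPath` is not part of NNI) and assumes POSIX-style paths.

```typescript
// Hypothetical helper: maps a path under the nniManager mount point to the
// corresponding path inside the PAI container.
// Assumption: both mount points refer to the same storage root.
function toContainerPath(
    localPath: string,
    nniManagerNFSMountPath: string,
    containerNFSMountPath: string
): string {
    // Drop trailing slashes so '/home/user/mnt/' and '/home/user/mnt' behave the same.
    const stripSlash = (p: string): string => p.replace(/\/+$/, '');
    const localRoot = stripSlash(nniManagerNFSMountPath);
    const containerRoot = stripSlash(containerNFSMountPath);

    // Accept the mount point itself or anything strictly below it.
    const isUnderRoot = localPath === localRoot || localPath.indexOf(localRoot + '/') === 0;
    if (!isUnderRoot) {
        throw new Error(`${localPath} is not under ${localRoot}`);
    }
    return containerRoot + localPath.slice(localRoot.length);
}
```

With the values used in the example configs, `toContainerPath('/home/user/mnt/exp1/run.sh', '/home/user/mnt', '/mnt/data/user')` gives `/mnt/data/user/exp1/run.sh`.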
## version check ## version check
NNI has supported version checking since version 0.6. It is a policy to ensure that the version of NNIManager is consistent with trialKeeper, and to avoid errors caused by version incompatibility. NNI has supported version checking since version 0.6. It is a policy to ensure that the version of NNIManager is consistent with trialKeeper, and to avoid errors caused by version incompatibility.
......
docs/img/pai_data_management_page.jpg (image replaced: 221 KB before, 96.2 KB after)
...@@ -25,7 +25,7 @@ trial: ...@@ -25,7 +25,7 @@ trial:
image: msranni/nni:latest image: msranni/nni:latest
nniManagerNFSMountPath: /home/user/mnt nniManagerNFSMountPath: /home/user/mnt
containerNFSMountPath: /mnt/data/user containerNFSMountPath: /mnt/data/user
paiStoragePlugin: team_wise paiStorageConfigName: confignfs-data
paiConfig: paiConfig:
#The username to login pai #The username to login pai
userName: username userName: username
......
...@@ -25,7 +25,7 @@ trial: ...@@ -25,7 +25,7 @@ trial:
image: msranni/nni:latest image: msranni/nni:latest
nniManagerNFSMountPath: /home/user/mnt nniManagerNFSMountPath: /home/user/mnt
containerNFSMountPath: /mnt/data/user containerNFSMountPath: /mnt/data/user
paiStoragePlugin: team_wise paiStorageConfigName: confignfs-data
paiConfig: paiConfig:
#The username to login pai #The username to login pai
userName: username userName: username
......
...@@ -23,7 +23,7 @@ trial: ...@@ -23,7 +23,7 @@ trial:
image: msranni/nni:latest image: msranni/nni:latest
nniManagerNFSMountPath: /home/user/mnt nniManagerNFSMountPath: /home/user/mnt
containerNFSMountPath: /mnt/data/user containerNFSMountPath: /mnt/data/user
paiStoragePlugin: team_wise paiStorageConfigName: confignfs-data
nniManagerIp: <nni_manager_ip> nniManagerIp: <nni_manager_ip>
paiConfig: paiConfig:
userName: <username> userName: <username>
......
...@@ -25,7 +25,7 @@ trial: ...@@ -25,7 +25,7 @@ trial:
image: msranni/nni:latest image: msranni/nni:latest
nniManagerNFSMountPath: /home/user/mnt nniManagerNFSMountPath: /home/user/mnt
containerNFSMountPath: /mnt/data/user containerNFSMountPath: /mnt/data/user
paiStoragePlugin: team_wise paiStorageConfigName: confignfs-data
paiConfig: paiConfig:
#The username to login pai #The username to login pai
userName: username userName: username
......
...@@ -29,7 +29,7 @@ trial: ...@@ -29,7 +29,7 @@ trial:
image: msranni/nni:latest image: msranni/nni:latest
nniManagerNFSMountPath: /home/user/mnt nniManagerNFSMountPath: /home/user/mnt
containerNFSMountPath: /mnt/data/user containerNFSMountPath: /mnt/data/user
paiStoragePlugin: team_wise paiStorageConfigName: confignfs-data
paiConfig: paiConfig:
#The username to login pai #The username to login pai
userName: username userName: username
......
...@@ -24,7 +24,7 @@ trial: ...@@ -24,7 +24,7 @@ trial:
image: msranni/nni:latest image: msranni/nni:latest
nniManagerNFSMountPath: /home/user/mnt nniManagerNFSMountPath: /home/user/mnt
containerNFSMountPath: /mnt/data/user containerNFSMountPath: /mnt/data/user
paiStoragePlugin: team_wise paiStorageConfigName: confignfs-data
paiConfig: paiConfig:
#The username to login pai #The username to login pai
userName: username userName: username
......
...@@ -22,7 +22,7 @@ trial: ...@@ -22,7 +22,7 @@ trial:
image: msranni/nni:latest image: msranni/nni:latest
nniManagerNFSMountPath: /home/user/mnt nniManagerNFSMountPath: /home/user/mnt
containerNFSMountPath: /mnt/data/user containerNFSMountPath: /mnt/data/user
paiStoragePlugin: team_wise paiStorageConfigName: confignfs-data
paiConfig: paiConfig:
#The username to login pai #The username to login pai
userName: username userName: username
......
...@@ -25,7 +25,7 @@ trial: ...@@ -25,7 +25,7 @@ trial:
image: msranni/nni:latest image: msranni/nni:latest
nniManagerNFSMountPath: /home/user/mnt nniManagerNFSMountPath: /home/user/mnt
containerNFSMountPath: /mnt/data/user containerNFSMountPath: /mnt/data/user
paiStoragePlugin: team_wise paiStorageConfigName: confignfs-data
paiConfig: paiConfig:
#The username to login pai #The username to login pai
userName: username userName: username
......
...@@ -25,7 +25,7 @@ trial: ...@@ -25,7 +25,7 @@ trial:
image: msranni/nni:latest image: msranni/nni:latest
nniManagerNFSMountPath: /home/user/mnt nniManagerNFSMountPath: /home/user/mnt
containerNFSMountPath: /mnt/data/user containerNFSMountPath: /mnt/data/user
paiStoragePlugin: team_wise paiStorageConfigName: confignfs-data
paiConfig: paiConfig:
#The username to login pai #The username to login pai
userName: username userName: username
......
...@@ -25,7 +25,7 @@ trial: ...@@ -25,7 +25,7 @@ trial:
image: msranni/nni:latest image: msranni/nni:latest
nniManagerNFSMountPath: /home/user/mnt nniManagerNFSMountPath: /home/user/mnt
containerNFSMountPath: /mnt/data/user containerNFSMountPath: /mnt/data/user
paiStoragePlugin: team_wise paiStorageConfigName: confignfs-data
paiConfig: paiConfig:
#The username to login pai #The username to login pai
userName: username userName: username
......
...@@ -32,7 +32,7 @@ trial: ...@@ -32,7 +32,7 @@ trial:
image: msranni/nni:latest image: msranni/nni:latest
nniManagerNFSMountPath: /home/user/mnt nniManagerNFSMountPath: /home/user/mnt
containerNFSMountPath: /mnt/data/user containerNFSMountPath: /mnt/data/user
paiStoragePlugin: team_wise paiStorageConfigName: confignfs-data
paiConfig: paiConfig:
#The username to login pai #The username to login pai
userName: username userName: username
......
...@@ -32,7 +32,7 @@ trial: ...@@ -32,7 +32,7 @@ trial:
image: msranni/nni:latest image: msranni/nni:latest
nniManagerNFSMountPath: /home/user/mnt nniManagerNFSMountPath: /home/user/mnt
containerNFSMountPath: /mnt/data/user containerNFSMountPath: /mnt/data/user
paiStoragePlugin: team_wise paiStorageConfigName: confignfs-data
paiConfig: paiConfig:
#The username to login pai #The username to login pai
userName: username userName: username
......
...@@ -25,7 +25,7 @@ trial: ...@@ -25,7 +25,7 @@ trial:
image: msranni/nni:latest image: msranni/nni:latest
nniManagerNFSMountPath: /home/user/mnt nniManagerNFSMountPath: /home/user/mnt
containerNFSMountPath: /mnt/data/user containerNFSMountPath: /mnt/data/user
paiStoragePlugin: team_wise paiStorageConfigName: confignfs-data
paiConfig: paiConfig:
#The username to login pai #The username to login pai
userName: username userName: username
......
...@@ -25,7 +25,7 @@ trial: ...@@ -25,7 +25,7 @@ trial:
image: msranni/nni:latest image: msranni/nni:latest
nniManagerNFSMountPath: /home/user/mnt nniManagerNFSMountPath: /home/user/mnt
containerNFSMountPath: /mnt/data/user containerNFSMountPath: /mnt/data/user
paiStoragePlugin: team_wise paiStorageConfigName: confignfs-data
paiConfig: paiConfig:
#The username to login pai #The username to login pai
userName: username userName: username
......
...@@ -39,7 +39,7 @@ export namespace ValidationSchemas { ...@@ -39,7 +39,7 @@ export namespace ValidationSchemas {
nniManagerNFSMountPath: joi.string().min(1), nniManagerNFSMountPath: joi.string().min(1),
containerNFSMountPath: joi.string().min(1), containerNFSMountPath: joi.string().min(1),
paiConfigPath: joi.string(), paiConfigPath: joi.string(),
paiStoragePlugin: joi.string().min(1), paiStorageConfigName: joi.string().min(1),
nasMode: joi.string().valid('classic_mode', 'enas_mode', 'oneshot_mode', 'darts_mode'), nasMode: joi.string().valid('classic_mode', 'enas_mode', 'oneshot_mode', 'darts_mode'),
portList: joi.array().items(joi.object({ portList: joi.array().items(joi.object({
label: joi.string().required(), label: joi.string().required(),
......
...@@ -30,12 +30,12 @@ export class NNIPAIK8STrialConfig extends TrialConfig { ...@@ -30,12 +30,12 @@ export class NNIPAIK8STrialConfig extends TrialConfig {
public virtualCluster?: string; public virtualCluster?: string;
public readonly nniManagerNFSMountPath: string; public readonly nniManagerNFSMountPath: string;
public readonly containerNFSMountPath: string; public readonly containerNFSMountPath: string;
public readonly paiStoragePlugin: string; public readonly paiStorageConfigName: string;
public readonly paiConfigPath?: string; public readonly paiConfigPath?: string;
constructor(command: string, codeDir: string, gpuNum: number, cpuNum: number, memoryMB: number, constructor(command: string, codeDir: string, gpuNum: number, cpuNum: number, memoryMB: number,
image: string, nniManagerNFSMountPath: string, containerNFSMountPath: string, image: string, nniManagerNFSMountPath: string, containerNFSMountPath: string,
paiStoragePlugin: string, virtualCluster?: string, paiConfigPath?: string) { paiStorageConfigName: string, virtualCluster?: string, paiConfigPath?: string) {
super(command, codeDir, gpuNum); super(command, codeDir, gpuNum);
this.cpuNum = cpuNum; this.cpuNum = cpuNum;
this.memoryMB = memoryMB; this.memoryMB = memoryMB;
...@@ -43,7 +43,7 @@ export class NNIPAIK8STrialConfig extends TrialConfig { ...@@ -43,7 +43,7 @@ export class NNIPAIK8STrialConfig extends TrialConfig {
this.virtualCluster = virtualCluster; this.virtualCluster = virtualCluster;
this.nniManagerNFSMountPath = nniManagerNFSMountPath; this.nniManagerNFSMountPath = nniManagerNFSMountPath;
this.containerNFSMountPath = containerNFSMountPath; this.containerNFSMountPath = containerNFSMountPath;
this.paiStoragePlugin = paiStoragePlugin; this.paiStorageConfigName = paiStorageConfigName;
this.paiConfigPath = paiConfigPath; this.paiConfigPath = paiConfigPath;
} }
} }
...@@ -233,9 +233,9 @@ class PAIK8STrainingService extends PAITrainingService { ...@@ -233,9 +233,9 @@ class PAIK8STrainingService extends PAITrainingService {
} }
}, },
extras: { extras: {
'com.microsoft.pai.runtimeplugin': [ 'storages': [
{ {
plugin: this.paiTrialConfig.paiStoragePlugin name: this.paiTrialConfig.paiStorageConfigName
} }
], ],
submitFrom: 'submit-job-v2' submitFrom: 'submit-job-v2'
......
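The `PAIK8STrainingService` hunk above replaces the `com.microsoft.pai.runtimeplugin` entry with a `storages` list keyed by name. A minimal standalone sketch of the new `extras` shape follows. The `StorageRef` and `JobExtras` interfaces and the `buildExtras` function are illustrative names, not taken from the NNI source; only the `storages`, `name`, and `submitFrom` keys and the `'submit-job-v2'` value appear in the diff.

```typescript
// Illustrative types; the real job config in NNI is a larger object.
interface StorageRef {
    name: string;
}

interface JobExtras {
    storages: StorageRef[];
    submitFrom: string;
}

// Builds the `extras` section as it looks after this commit:
// a `storages` list referencing the PAI storage config by name.
function buildExtras(paiStorageConfigName: string): JobExtras {
    return {
        storages: [{ name: paiStorageConfigName }],
        submitFrom: 'submit-job-v2'
    };
}
```

Calling `buildExtras('confignfs-data')` produces an object whose `storages` array holds a single entry named `confignfs-data`, matching the value used in the example configs.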
...@@ -92,7 +92,7 @@ pai: ...@@ -92,7 +92,7 @@ pai:
memoryMB: 8192 memoryMB: 8192
nniManagerNFSMountPath: nniManagerNFSMountPath:
containerNFSMountPath: containerNFSMountPath:
paiStoragePlugin: paiStorageConfigName:
remote: remote:
machineList: machineList:
- ip: - ip:
......