Commit 88ef6c04 authored by SparkSnail, committed by GitHub

Merge pull request #197 from microsoft/master

merge master
parents 5f3c5ffd 555334de
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Python wrapper for nni RESTful APIs\n",
"\n",
"nni provides the nnicli module as a Python wrapper for its RESTful APIs, which can be used to retrieve nni experiment and trial job information from your Python code. This notebook shows how to use the nnicli module.\n",
"\n",
"The following functions are available in the nnicli module:\n",
"\n",
"#### start_nni(config_file)\n",
"Starts an nni experiment with the specified configuration file.\n",
"\n",
"#### stop_nni()\n",
"Stops the nni experiment.\n",
"\n",
"#### set_endpoint(endpoint)\n",
"Sets the nni endpoint for nnicli. The endpoint is shown when an nni experiment is started successfully using the nnictl command or the start_nni function.\n",
"\n",
"#### version()\n",
"Returns the nni version.\n",
"\n",
"#### get_experiment_profile()\n",
"Returns the experiment profile.\n",
"\n",
"#### get_experiment_status()\n",
"Returns the nni experiment status.\n",
"\n",
"#### get_job_metrics(trial_job_id)\n",
"Returns the metrics of the specified trial job, including final and intermediate results.\n",
"\n",
"#### get_job_statistics()\n",
"Returns trial job statistics.\n",
"\n",
"#### get_trial_job(trial_job_id)\n",
"Returns information about the specified trial job.\n",
"\n",
"#### list_trial_jobs()\n",
"Returns information about all trial jobs of the current experiment."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Start an nni experiment using a specified configuration file\n",
"Let's use a configuration file from the nni examples directory to start an experiment."
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"authorName: default\r\n",
"experimentName: example_mnist\r\n",
"trialConcurrency: 1\r\n",
"maxExecDuration: 1h\r\n",
"maxTrialNum: 10\r\n",
"#choice: local, remote, pai\r\n",
"trainingServicePlatform: local\r\n",
"searchSpacePath: search_space.json\r\n",
"#choice: true, false\r\n",
"useAnnotation: false\r\n",
"tuner:\r\n",
" #choice: TPE, Random, Anneal, Evolution, BatchTuner, MetisTuner, GPTuner\r\n",
" #SMAC (SMAC should be installed through nnictl)\r\n",
" builtinTunerName: TPE\r\n",
" classArgs:\r\n",
" #choice: maximize, minimize\r\n",
" optimize_mode: maximize\r\n",
"trial:\r\n",
" command: python3 mnist.py\r\n",
" codeDir: .\r\n",
" gpuNum: 0\r\n"
]
}
],
"source": [
"! cat ../trials/mnist/config.yml"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"INFO: expand searchSpacePath: search_space.json to /mnt/d/Repos/nni/examples/trials/mnist/search_space.json\n",
"INFO: expand codeDir: . to /mnt/d/Repos/nni/examples/trials/mnist/.\n",
"INFO: Starting restful server...\n",
"INFO: Successfully started Restful server!\n",
"INFO: Setting local config...\n",
"INFO: Successfully set local config!\n",
"INFO: Starting experiment...\n",
"INFO: Successfully started experiment!\n",
"-----------------------------------------------------------------------\n",
"The experiment id is PlUIfDTR\n",
"The Web UI urls are: http://172.18.17.1:8080 http://10.172.121.40:8080 http://10.0.75.1:8080 http://127.0.0.1:8080\n",
"-----------------------------------------------------------------------\n",
"\n",
"You can use these commands to get more information about the experiment\n",
"-----------------------------------------------------------------------\n",
"commands description\n",
"1. nnictl experiment show show the information of experiments\n",
"2. nnictl trial ls list all of trial jobs\n",
"3. nnictl top monitor the status of running experiments\n",
"4. nnictl log stderr show stderr log content\n",
"5. nnictl log stdout show stdout log content\n",
"6. nnictl stop stop an experiment\n",
"7. nnictl trial kill kill a trial job by id\n",
"8. nnictl --help get help information about nnictl\n",
"-----------------------------------------------------------------------\n",
"\n"
]
}
],
"source": [
"import nnicli as nc\n",
"nc.start_nni(config_file='../trials/mnist/config.yml')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Connect the nnicli module to the started nni experiment\n",
"Call set_endpoint to connect the nnicli module to the REST server of the started nni experiment. The local training service is used in this notebook, but the nnicli module can connect to any started nni experiment. The endpoint can be found in the output of the start_nni function."
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [],
"source": [
"nc.set_endpoint('http://127.0.0.1:8080')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Retrieve nni experiment and trial job information"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'errors': [], 'status': 'RUNNING'}"
]
},
"execution_count": 15,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"nc.get_experiment_status()"
]
},
{
"cell_type": "code",
"execution_count": 28,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[{'trialJobNumber': 4, 'trialJobStatus': 'SUCCEEDED'},\n",
" {'trialJobNumber': 1, 'trialJobStatus': 'RUNNING'}]"
]
},
"execution_count": 28,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"nc.get_job_statistics()"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'execDuration': 1117,\n",
" 'id': 'PlUIfDTR',\n",
" 'logDir': '/home/chicm/nni/experiments/PlUIfDTR',\n",
" 'maxSequenceId': 3,\n",
" 'params': {'authorName': 'default',\n",
" 'clusterMetaData': [{'key': 'codeDir',\n",
" 'value': '/mnt/d/Repos/nni/examples/trials/mnist/.'},\n",
" {'key': 'command', 'value': 'python3 mnist.py'}],\n",
" 'experimentName': 'example_mnist',\n",
" 'maxExecDuration': 3600,\n",
" 'maxTrialNum': 10,\n",
" 'searchSpace': '{\"hidden_size\": {\"_value\": [124, 512, 1024], \"_type\": \"choice\"}, \"batch_size\": {\"_value\": [1, 4, 8, 16, 32], \"_type\": \"choice\"}, \"conv_size\": {\"_value\": [2, 3, 5, 7], \"_type\": \"choice\"}, \"dropout_rate\": {\"_value\": [0.5, 0.9], \"_type\": \"uniform\"}, \"learning_rate\": {\"_value\": [0.0001, 0.001, 0.01, 0.1], \"_type\": \"choice\"}}',\n",
" 'trainingServicePlatform': 'local',\n",
" 'trialConcurrency': 1,\n",
" 'tuner': {'builtinTunerName': 'TPE',\n",
" 'checkpointDir': '/home/chicm/nni/experiments/PlUIfDTR/checkpoint',\n",
" 'classArgs': {'optimize_mode': 'maximize'},\n",
" 'className': 'TPE'},\n",
" 'versionCheck': True},\n",
" 'revision': 116,\n",
" 'startTime': 1564484985839}"
]
},
"execution_count": 21,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"nc.get_experiment_profile()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's define a utility function that pretty-prints the results returned by the nnicli module as JSON."
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {},
"outputs": [],
"source": [
"import json\n",
"\n",
"def show_json(res):\n",
"    # Serialize the result with 4-space indentation for readability.\n",
"    print(json.dumps(res, indent=4))"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {
"scrolled": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{\n",
" \"params\": {\n",
" \"searchSpace\": \"{\\\"hidden_size\\\": {\\\"_value\\\": [124, 512, 1024], \\\"_type\\\": \\\"choice\\\"}, \\\"batch_size\\\": {\\\"_value\\\": [1, 4, 8, 16, 32], \\\"_type\\\": \\\"choice\\\"}, \\\"conv_size\\\": {\\\"_value\\\": [2, 3, 5, 7], \\\"_type\\\": \\\"choice\\\"}, \\\"dropout_rate\\\": {\\\"_value\\\": [0.5, 0.9], \\\"_type\\\": \\\"uniform\\\"}, \\\"learning_rate\\\": {\\\"_value\\\": [0.0001, 0.001, 0.01, 0.1], \\\"_type\\\": \\\"choice\\\"}}\",\n",
" \"clusterMetaData\": [\n",
" {\n",
" \"key\": \"codeDir\",\n",
" \"value\": \"/mnt/d/Repos/nni/examples/trials/mnist/.\"\n",
" },\n",
" {\n",
" \"key\": \"command\",\n",
" \"value\": \"python3 mnist.py\"\n",
" }\n",
" ],\n",
" \"tuner\": {\n",
" \"classArgs\": {\n",
" \"optimize_mode\": \"maximize\"\n",
" },\n",
" \"builtinTunerName\": \"TPE\",\n",
" \"checkpointDir\": \"/home/chicm/nni/experiments/PlUIfDTR/checkpoint\",\n",
" \"className\": \"TPE\"\n",
" },\n",
" \"maxTrialNum\": 10,\n",
" \"maxExecDuration\": 3600,\n",
" \"experimentName\": \"example_mnist\",\n",
" \"authorName\": \"default\",\n",
" \"trialConcurrency\": 1,\n",
" \"trainingServicePlatform\": \"local\",\n",
" \"versionCheck\": true\n",
" },\n",
" \"execDuration\": 1192,\n",
" \"revision\": 124,\n",
" \"logDir\": \"/home/chicm/nni/experiments/PlUIfDTR\",\n",
" \"maxSequenceId\": 3,\n",
" \"id\": \"PlUIfDTR\",\n",
" \"startTime\": 1564484985839\n",
"}\n"
]
}
],
"source": [
"show_json(nc.get_experiment_profile())"
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {
"scrolled": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[\n",
" {\n",
" \"startTime\": 1564484995992,\n",
" \"hyperParameters\": [\n",
" \"{\\\"parameter_source\\\":\\\"algorithm\\\",\\\"parameter_id\\\":0,\\\"parameter_index\\\":0,\\\"parameters\\\":{\\\"batch_size\\\":8,\\\"conv_size\\\":3,\\\"hidden_size\\\":1024,\\\"learning_rate\\\":0.0001,\\\"dropout_rate\\\":0.8055724367106529}}\"\n",
" ],\n",
" \"id\": \"BW0NR\",\n",
" \"endTime\": 1564485259753,\n",
" \"status\": \"SUCCEEDED\",\n",
" \"sequenceId\": 0,\n",
" \"finalMetricData\": [\n",
" {\n",
" \"parameterId\": \"0\",\n",
" \"type\": \"FINAL\",\n",
" \"trialJobId\": \"BW0NR\",\n",
" \"timestamp\": 1564485258774,\n",
" \"data\": \"0.9078999757766724\",\n",
" \"sequence\": 0\n",
" }\n",
" ],\n",
" \"logPath\": \"file://localhost:/home/chicm/nni/experiments/PlUIfDTR/trials/BW0NR\"\n",
" },\n",
" {\n",
" \"startTime\": 1564485271947,\n",
" \"hyperParameters\": [\n",
" \"{\\\"parameter_source\\\":\\\"algorithm\\\",\\\"parameter_id\\\":1,\\\"parameter_index\\\":0,\\\"parameters\\\":{\\\"batch_size\\\":4,\\\"conv_size\\\":5,\\\"hidden_size\\\":512,\\\"learning_rate\\\":0.01,\\\"dropout_rate\\\":0.5547528540531742}}\"\n",
" ],\n",
" \"id\": \"x0P5w\",\n",
" \"endTime\": 1564485642784,\n",
" \"status\": \"SUCCEEDED\",\n",
" \"sequenceId\": 1,\n",
" \"finalMetricData\": [\n",
" {\n",
" \"parameterId\": \"1\",\n",
" \"type\": \"FINAL\",\n",
" \"trialJobId\": \"x0P5w\",\n",
" \"timestamp\": 1564485642072,\n",
" \"data\": \"0.10100000351667404\",\n",
" \"sequence\": 0\n",
" }\n",
" ],\n",
" \"logPath\": \"file://localhost:/home/chicm/nni/experiments/PlUIfDTR/trials/x0P5w\"\n",
" },\n",
" {\n",
" \"startTime\": 1564485652151,\n",
" \"hyperParameters\": [\n",
" \"{\\\"parameter_source\\\":\\\"algorithm\\\",\\\"parameter_id\\\":2,\\\"parameter_index\\\":0,\\\"parameters\\\":{\\\"batch_size\\\":8,\\\"conv_size\\\":3,\\\"hidden_size\\\":512,\\\"learning_rate\\\":0.0001,\\\"dropout_rate\\\":0.5584485925416655}}\"\n",
" ],\n",
" \"id\": \"V9jSG\",\n",
" \"endTime\": 1564485917057,\n",
" \"status\": \"SUCCEEDED\",\n",
" \"sequenceId\": 2,\n",
" \"finalMetricData\": [\n",
" {\n",
" \"parameterId\": \"2\",\n",
" \"type\": \"FINAL\",\n",
" \"trialJobId\": \"V9jSG\",\n",
" \"timestamp\": 1564485916403,\n",
" \"data\": \"0.928600013256073\",\n",
" \"sequence\": 0\n",
" }\n",
" ],\n",
" \"logPath\": \"file://localhost:/home/chicm/nni/experiments/PlUIfDTR/trials/V9jSG\"\n",
" },\n",
" {\n",
" \"startTime\": 1564485927295,\n",
" \"hyperParameters\": [\n",
" \"{\\\"parameter_source\\\":\\\"algorithm\\\",\\\"parameter_id\\\":3,\\\"parameter_index\\\":0,\\\"parameters\\\":{\\\"batch_size\\\":8,\\\"conv_size\\\":7,\\\"hidden_size\\\":124,\\\"learning_rate\\\":0.001,\\\"dropout_rate\\\":0.6281630602835235}}\"\n",
" ],\n",
" \"id\": \"CDlRX\",\n",
" \"status\": \"RUNNING\",\n",
" \"sequenceId\": 3,\n",
" \"logPath\": \"file://localhost:/home/chicm/nni/experiments/PlUIfDTR/trials/CDlRX\"\n",
" }\n",
"]\n"
]
}
],
"source": [
"show_json(nc.list_trial_jobs())"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Visualize nni experiment results\n",
"\n",
"With the retrieved trial job information, we can analyze the metric data by visualizing it. Below is a simple example."
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {
"scrolled": false
},
"outputs": [
{
"data": {
"image/png": "\n",
"text/plain": [
"<Figure size 1080x432 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"import seaborn as sns\n",
"import matplotlib.pyplot as plt\n",
"sns.set(style=\"whitegrid\")\n",
"\n",
"# Trials that are still running have no 'finalMetricData' entry, so skip them.\n",
"jobs = [x for x in nc.list_trial_jobs() if 'finalMetricData' in x]\n",
"job_ids = [x['id'] for x in jobs]\n",
"final_metrics = [float(x['finalMetricData'][0]['data']) for x in jobs]\n",
"\n",
"data = {'job id': job_ids, 'final metrics': final_metrics}\n",
"sns.set(rc={'figure.figsize':(15, 6)})\n",
"\n",
"plt.title('Trial job final results')\n",
"ax = sns.barplot(x='job id', y='final metrics', data=data)\n",
"\n",
"# Label each bar with its metric value.\n",
"for p in ax.patches:\n",
"    ax.annotate('{:.4f}'.format(p.get_height()), (p.get_x() + p.get_width() / 2., p.get_height()),\n",
"                ha='center', va='center', fontsize=11, color='black', rotation=0, xytext=(0, 5),\n",
"                textcoords='offset points')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Stop nni experiment"
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"INFO: Stoping experiment PlUIfDTR\n",
"INFO: Stop experiment success.\n"
]
}
],
"source": [
"nc.stop_nni()"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.5.2"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
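The trial records printed by `list_trial_jobs()` in the notebook above store each trial's hyperparameters as a JSON-encoded string, and a trial that is still running has no `finalMetricData` entry. The following is a minimal sketch of how such records might be summarized, assuming the record layout shown in the notebook output. The helpers `summarize_trials` and `best_trial` and the trimmed `sample_jobs` list are illustrative names, not part of nnicli.

```python
import json

# Records trimmed from the list_trial_jobs() output shown in the notebook.
sample_jobs = [
    {"id": "BW0NR", "status": "SUCCEEDED",
     "hyperParameters": ['{"parameter_id":0,"parameters":{"batch_size":8,"conv_size":3,'
                         '"hidden_size":1024,"learning_rate":0.0001,'
                         '"dropout_rate":0.8055724367106529}}'],
     "finalMetricData": [{"type": "FINAL", "data": "0.9078999757766724"}]},
    {"id": "x0P5w", "status": "SUCCEEDED",
     "hyperParameters": ['{"parameter_id":1,"parameters":{"batch_size":4,"conv_size":5,'
                         '"hidden_size":512,"learning_rate":0.01,'
                         '"dropout_rate":0.5547528540531742}}'],
     "finalMetricData": [{"type": "FINAL", "data": "0.10100000351667404"}]},
    {"id": "V9jSG", "status": "SUCCEEDED",
     "hyperParameters": ['{"parameter_id":2,"parameters":{"batch_size":8,"conv_size":3,'
                         '"hidden_size":512,"learning_rate":0.0001,'
                         '"dropout_rate":0.5584485925416655}}'],
     "finalMetricData": [{"type": "FINAL", "data": "0.928600013256073"}]},
    {"id": "CDlRX", "status": "RUNNING",
     "hyperParameters": ['{"parameter_id":3,"parameters":{"batch_size":8,"conv_size":7,'
                         '"hidden_size":124,"learning_rate":0.001,'
                         '"dropout_rate":0.6281630602835235}}']},
]


def summarize_trials(jobs):
    """Return (id, final_metric, parameters) for each trial that reported a final result."""
    rows = []
    for job in jobs:
        final = job.get("finalMetricData")
        if not final:
            continue  # trial has not finished yet
        # hyperParameters is a list of JSON strings; decode the first entry.
        params = json.loads(job["hyperParameters"][0])["parameters"]
        rows.append((job["id"], float(final[0]["data"]), params))
    return rows


def best_trial(jobs):
    """Return the summary row with the highest final metric, or None if there is none."""
    rows = summarize_trials(jobs)
    return max(rows, key=lambda row: row[1]) if rows else None


best_id, best_metric, best_params = best_trial(sample_jobs)
print(best_id, best_params["hidden_size"])  # prints: V9jSG 512
```

With a live experiment, the same helpers could presumably be applied to the list returned by `nc.list_trial_jobs()` instead of `sample_jobs`. Taking the maximum matches the `optimize_mode: maximize` setting in the example configuration.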
**Automatic Feature Engineering in nni**
===
We now have an [example](https://github.com/SpongebBob/tabular_automl_NNI) that performs automatic feature engineering in nni.
The code was contributed by our community. Thanks to our lovely contributors!
More contributors are always welcome to join us!
......@@ -7,7 +7,9 @@ def random_archi_generator(nas_ss, random_state):
'''
chosen_archi = {}
print("zql: nas search space: ", nas_ss)
for block_name, block in nas_ss.items():
for block_name, block_value in nas_ss.items():
assert block_value['_type'] == "mutable_layer", "Random NAS Tuner only receives NAS search space whose _type is 'mutable_layer'"
block = block_value['_value']
tmp_block = {}
for layer_name, layer in block.items():
tmp_layer = {}
......
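The hunk above changes the tuner to expect each block of the NAS search space wrapped as `{'_type': 'mutable_layer', '_value': {...}}`. A minimal sketch of that unwrapping step in isolation, assuming only the `_type` and `_value` keys visible in the diff. The function name `unwrap_mutable_layers` and the contents of the sample layer are hypothetical.

```python
def unwrap_mutable_layers(nas_ss):
    """Map each block name to its layers, rejecting blocks that are not 'mutable_layer'."""
    blocks = {}
    for block_name, block_value in nas_ss.items():
        block_type = block_value.get("_type")
        if block_type != "mutable_layer":
            raise ValueError(
                "block %r has _type %r, expected 'mutable_layer'" % (block_name, block_type)
            )
        blocks[block_name] = block_value["_value"]
    return blocks


# Hypothetical search space: the layer contents are placeholders for illustration.
sample_ss = {
    "block_a": {
        "_type": "mutable_layer",
        "_value": {"layer_1": {"layer_choice": ["conv", "pool"]}},
    }
}

layers = unwrap_mutable_layers(sample_ss)
print(sorted(layers["block_a"]))  # prints: ['layer_1']
```

Raising `ValueError` instead of using `assert` is a choice made for this sketch, so that the check still runs when Python is started with `-O`.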
......@@ -35,9 +35,10 @@ setup(
license = 'MIT',
url = 'https://github.com/Microsoft/nni',
packages = find_packages('src/sdk/pynni', exclude=['tests']) + find_packages('tools'),
packages = find_packages('src/sdk/pynni', exclude=['tests']) + find_packages('src/sdk/pycli') + find_packages('tools'),
package_dir = {
'nni': 'src/sdk/pynni/nni',
'nnicli': 'src/sdk/pycli/nnicli',
'nni_annotation': 'tools/nni_annotation',
'nni_cmd': 'tools/nni_cmd',
'nni_trial_tool':'tools/nni_trial_tool',
......
......@@ -51,10 +51,12 @@ export namespace ValidationSchemas {
command: joi.string().min(1),
virtualCluster: joi.string(),
shmMB: joi.number(),
authFile: joi.string(),
nasMode: joi.string().valid('classic_mode', 'enas_mode', 'oneshot_mode'),
worker: joi.object({
replicas: joi.number().min(1).required(),
image: joi.string().min(1),
privateRegistryAuthPath: joi.string().min(1),
outputDir: joi.string(),
cpuNum: joi.number().min(1),
memoryMB: joi.number().min(100),
......@@ -64,6 +66,7 @@ export namespace ValidationSchemas {
ps: joi.object({
replicas: joi.number().min(1).required(),
image: joi.string().min(1),
privateRegistryAuthPath: joi.string().min(1),
outputDir: joi.string(),
cpuNum: joi.number().min(1),
memoryMB: joi.number().min(100),
......@@ -73,6 +76,7 @@ export namespace ValidationSchemas {
master: joi.object({
replicas: joi.number().min(1).required(),
image: joi.string().min(1),
privateRegistryAuthPath: joi.string().min(1),
outputDir: joi.string(),
cpuNum: joi.number().min(1),
memoryMB: joi.number().min(100),
......@@ -83,6 +87,7 @@ export namespace ValidationSchemas {
name: joi.string().min(1),
taskNum: joi.number().min(1).required(),
image: joi.string().min(1),
privateRegistryAuthPath: joi.string().min(1),
outputDir: joi.string(),
cpuNum: joi.number().min(1),
memoryMB: joi.number().min(100),
......
......@@ -43,8 +43,8 @@ export class FrameworkControllerTrialConfigTemplate extends KubernetesTrialConfi
public readonly taskNum: number;
constructor(taskNum: number, command : string, gpuNum : number,
cpuNum: number, memoryMB: number, image: string,
frameworkAttemptCompletionPolicy: FrameworkAttemptCompletionPolicy) {
super(command, gpuNum, cpuNum, memoryMB, image);
frameworkAttemptCompletionPolicy: FrameworkAttemptCompletionPolicy, privateRegistryFilePath?: string | undefined) {
super(command, gpuNum, cpuNum, memoryMB, image, privateRegistryFilePath);
this.frameworkAttemptCompletionPolicy = frameworkAttemptCompletionPolicy;
this.name = name;
this.taskNum = taskNum;
......
......@@ -305,7 +305,7 @@ class FrameworkControllerTrainingService extends KubernetesTrainingService imple
}
// Generate frameworkcontroller job resource config object
const frameworkcontrollerJobConfig: any =
this.generateFrameworkControllerJobConfig(trialJobId, trialWorkingFolder, frameworkcontrollerJobName, podResources);
await this.generateFrameworkControllerJobConfig(trialJobId, trialWorkingFolder, frameworkcontrollerJobName, podResources);
return Promise.resolve(frameworkcontrollerJobConfig);
}
......@@ -329,8 +329,8 @@ class FrameworkControllerTrainingService extends KubernetesTrainingService imple
* @param frameworkcontrollerJobName job name
* @param podResources pod template
*/
private generateFrameworkControllerJobConfig(trialJobId: string, trialWorkingFolder: string,
frameworkcontrollerJobName : string, podResources : any) : any {
private async generateFrameworkControllerJobConfig(trialJobId: string, trialWorkingFolder: string,
frameworkcontrollerJobName : string, podResources : any) : Promise<any> {
if (this.fcClusterConfig === undefined) {
throw new Error('frameworkcontroller Cluster config is not initialized');
}
......@@ -345,12 +345,14 @@ class FrameworkControllerTrainingService extends KubernetesTrainingService imple
if (containerPort === undefined) {
throw new Error('Container port is not initialized');
}
const taskRole: any = this.generateTaskRoleConfig(
trialWorkingFolder,
this.fcTrialConfig.taskRoles[index].image,
`run_${this.fcTrialConfig.taskRoles[index].name}.sh`,
podResources[index],
containerPort
containerPort,
await this.createRegistrySecret(this.fcTrialConfig.taskRoles[index].privateRegistryAuthPath)
);
taskRoles.push({
name: this.fcTrialConfig.taskRoles[index].name,
......@@ -363,7 +365,7 @@ class FrameworkControllerTrainingService extends KubernetesTrainingService imple
});
}
return {
return Promise.resolve({
apiVersion: `frameworkcontroller.microsoft.com/v1`,
kind: 'Framework',
metadata: {
......@@ -379,11 +381,11 @@ class FrameworkControllerTrainingService extends KubernetesTrainingService imple
executionType: 'Start',
taskRoles: taskRoles
}
};
});
}
private generateTaskRoleConfig(trialWorkingFolder: string, replicaImage: string, runScriptFile: string,
podResources: any, containerPort: number): any {
private generateTaskRoleConfig(trialWorkingFolder: string, replicaImage: string, runScriptFile: string,
podResources: any, containerPort: number, privateRegistrySecretName: string | undefined): any {
if (this.fcClusterConfig === undefined) {
throw new Error('frameworkcontroller Cluster config is not initialized');
}
......@@ -451,13 +453,22 @@ class FrameworkControllerTrainingService extends KubernetesTrainingService imple
mountPath: '/mnt/frameworkbarrier'
}]
}];
const spec: any = {
containers: containers,
initContainers: initContainers,
restartPolicy: 'OnFailure',
volumes: volumeSpecMap.get('nniVolumes'),
hostNetwork: false
let spec: any = {
containers: containers,
initContainers: initContainers,
restartPolicy: 'OnFailure',
volumes: volumeSpecMap.get('nniVolumes'),
hostNetwork: false
};
if(privateRegistrySecretName) {
spec.imagePullSecrets = [
{
name: privateRegistrySecretName
}
]
}
if (this.fcClusterConfig.serviceAccountName !== undefined) {
spec.serviceAccountName = this.fcClusterConfig.serviceAccountName;
}
......
......@@ -135,8 +135,8 @@ export class KubeflowTrialConfig extends KubernetesTrialConfig {
export class KubeflowTrialConfigTemplate extends KubernetesTrialConfigTemplate {
public readonly replicas: number;
constructor(replicas: number, command : string, gpuNum : number,
cpuNum: number, memoryMB: number, image: string) {
super(command, gpuNum, cpuNum, memoryMB, image);
cpuNum: number, memoryMB: number, image: string, privateRegistryAuthPath?: string) {
super(command, gpuNum, cpuNum, memoryMB, image, privateRegistryAuthPath);
this.replicas = replicas;
}
}
......
......@@ -347,7 +347,7 @@ class KubeflowTrainingService extends KubernetesTrainingService implements Kuber
}
// Generate kubeflow job resource config object
const kubeflowJobConfig: any = this.generateKubeflowJobConfig(trialJobId, trialWorkingFolder, kubeflowJobName, workerPodResources,
const kubeflowJobConfig: any = await this.generateKubeflowJobConfig(trialJobId, trialWorkingFolder, kubeflowJobName, workerPodResources,
nonWorkerResources);
return Promise.resolve(kubeflowJobConfig);
......@@ -361,8 +361,8 @@ class KubeflowTrainingService extends KubernetesTrainingService implements Kuber
* @param workerPodResources worker pod template
* @param nonWorkerPodResources non-worker pod template, like ps or master
*/
private generateKubeflowJobConfig(trialJobId: string, trialWorkingFolder: string, kubeflowJobName : string, workerPodResources : any,
nonWorkerPodResources?: any) : any {
private async generateKubeflowJobConfig(trialJobId: string, trialWorkingFolder: string, kubeflowJobName : string, workerPodResources : any,
nonWorkerPodResources?: any) : Promise<any> {
if (this.kubeflowClusterConfig === undefined) {
throw new Error('Kubeflow Cluster config is not initialized');
}
......@@ -377,29 +377,32 @@ class KubeflowTrainingService extends KubernetesTrainingService implements Kuber
const replicaSpecsObj: any = {};
const replicaSpecsObjMap: Map<string, object> = new Map<string, object>();
if (this.kubeflowTrialConfig.operatorType === 'tf-operator') {
const tensorflowTrialConfig: KubeflowTrialConfigTensorflow = <KubeflowTrialConfigTensorflow>this.kubeflowTrialConfig;
let privateRegistrySecretName = await this.createRegistrySecret(tensorflowTrialConfig.worker.privateRegistryAuthPath);
replicaSpecsObj.Worker = this.generateReplicaConfig(trialWorkingFolder, tensorflowTrialConfig.worker.replicas,
tensorflowTrialConfig.worker.image, 'run_worker.sh', workerPodResources);
tensorflowTrialConfig.worker.image, 'run_worker.sh', workerPodResources, privateRegistrySecretName);
if (tensorflowTrialConfig.ps !== undefined) {
let privateRegistrySecretName: string | undefined = await this.createRegistrySecret(tensorflowTrialConfig.ps.privateRegistryAuthPath);
replicaSpecsObj.Ps = this.generateReplicaConfig(trialWorkingFolder, tensorflowTrialConfig.ps.replicas,
tensorflowTrialConfig.ps.image, 'run_ps.sh', nonWorkerPodResources);
tensorflowTrialConfig.ps.image, 'run_ps.sh', nonWorkerPodResources, privateRegistrySecretName);
}
replicaSpecsObjMap.set(this.kubernetesCRDClient.jobKind, {tfReplicaSpecs: replicaSpecsObj});
} else if (this.kubeflowTrialConfig.operatorType === 'pytorch-operator') {
const pytorchTrialConfig: KubeflowTrialConfigPytorch = <KubeflowTrialConfigPytorch>this.kubeflowTrialConfig;
if (pytorchTrialConfig.worker !== undefined) {
let privateRegistrySecretName: string | undefined = await this.createRegistrySecret(pytorchTrialConfig.worker.privateRegistryAuthPath);
replicaSpecsObj.Worker = this.generateReplicaConfig(trialWorkingFolder, pytorchTrialConfig.worker.replicas,
pytorchTrialConfig.worker.image, 'run_worker.sh', workerPodResources);
pytorchTrialConfig.worker.image, 'run_worker.sh', workerPodResources, privateRegistrySecretName);
}
let privateRegistrySecretName: string | undefined = await this.createRegistrySecret(pytorchTrialConfig.master.privateRegistryAuthPath);
replicaSpecsObj.Master = this.generateReplicaConfig(trialWorkingFolder, pytorchTrialConfig.master.replicas,
pytorchTrialConfig.master.image, 'run_master.sh', nonWorkerPodResources);
pytorchTrialConfig.master.image, 'run_master.sh', nonWorkerPodResources, privateRegistrySecretName);
replicaSpecsObjMap.set(this.kubernetesCRDClient.jobKind, {pytorchReplicaSpecs: replicaSpecsObj});
}
return {
return Promise.resolve({
apiVersion: `kubeflow.org/${this.kubernetesCRDClient.apiVersion}`,
kind: this.kubernetesCRDClient.jobKind,
metadata: {
......@@ -412,7 +415,7 @@ class KubeflowTrainingService extends KubernetesTrainingService implements Kuber
}
},
spec: replicaSpecsObjMap.get(this.kubernetesCRDClient.jobKind)
};
});
}
/**
......@@ -424,7 +427,7 @@ class KubeflowTrainingService extends KubernetesTrainingService implements Kuber
* @param podResources pod resource config section
*/
private generateReplicaConfig(trialWorkingFolder: string, replicaNumber: number, replicaImage: string, runScriptFile: string,
podResources: any): any {
podResources: any, privateRegistrySecretName: string | undefined): any {
if (this.kubeflowClusterConfig === undefined) {
throw new Error('Kubeflow Cluster config is not initialized');
}
......@@ -436,7 +439,7 @@ class KubeflowTrainingService extends KubernetesTrainingService implements Kuber
if (this.kubernetesCRDClient === undefined) {
throw new Error('Kubeflow operator client is not initialized');
}
// The config spec for volume field
const volumeSpecMap: Map<string, object> = new Map<string, object>();
if (this.kubeflowClusterConfig.storageType === 'azureStorage') {
volumeSpecMap.set('nniVolumes', [
......@@ -459,7 +462,34 @@ class KubeflowTrainingService extends KubernetesTrainingService implements Kuber
}
}]);
}
// The config spec for container field
const containersSpecMap: Map<string, object> = new Map<string, object>();
containersSpecMap.set('containers', [
{
// Kubeflow tensorflow operator requires that containers' name must be tensorflow
// TODO: change the name based on operator's type
name: this.kubernetesCRDClient.containerName,
image: replicaImage,
args: ['sh', `${path.join(trialWorkingFolder, runScriptFile)}`],
volumeMounts: [
{
name: 'nni-vol',
mountPath: this.CONTAINER_MOUNT_PATH
}],
resources: podResources
}
]);
let spec: any = {
containers: containersSpecMap.get('containers'),
restartPolicy: 'ExitCode',
volumes: volumeSpecMap.get('nniVolumes')
}
if (privateRegistrySecretName) {
spec.imagePullSecrets = [
{
name: privateRegistrySecretName
}]
}
return {
replicas: replicaNumber,
template: {
......@@ -467,26 +497,9 @@ class KubeflowTrainingService extends KubernetesTrainingService implements Kuber
// tslint:disable-next-line:no-null-keyword
creationTimestamp: null
},
spec: {
containers: [
{
// Kubeflow tensorflow operator requires that containers' name must be tensorflow
// TODO: change the name based on operator's type
name: this.kubernetesCRDClient.containerName,
image: replicaImage,
args: ['sh', `${path.join(trialWorkingFolder, runScriptFile)}`],
volumeMounts: [
{
name: 'nni-vol',
mountPath: this.CONTAINER_MOUNT_PATH
}],
resources: podResources
}],
restartPolicy: 'ExitCode',
volumes: volumeSpecMap.get('nniVolumes')
}
spec: spec
}
};
}
}
}
// tslint:enable: no-unsafe-any no-any
......
......@@ -179,6 +179,9 @@ export class KubernetesTrialConfigTemplate {
// Docker image
public readonly image: string;
// Private registry config file path to download docker image
public readonly privateRegistryAuthPath?: string;
// Trial command
public readonly command : string;
......@@ -186,12 +189,13 @@ export class KubernetesTrialConfigTemplate {
public readonly gpuNum : number;
constructor(command : string, gpuNum : number,
cpuNum: number, memoryMB: number, image: string) {
cpuNum: number, memoryMB: number, image: string, privateRegistryAuthPath?: string) {
this.command = command;
this.gpuNum = gpuNum;
this.cpuNum = cpuNum;
this.memoryMB = memoryMB;
this.image = image;
this.privateRegistryAuthPath = privateRegistryAuthPath;
}
}
......
......@@ -38,6 +38,8 @@ import { KubernetesClusterConfig } from './kubernetesConfig';
import { kubernetesScriptFormat, KubernetesTrialJobDetail } from './kubernetesData';
import { KubernetesJobRestServer } from './kubernetesJobRestServer';
var fs = require('fs');
/**
* Training Service implementation for Kubernetes
*/
......@@ -327,5 +329,34 @@ abstract class KubernetesTrainingService {
return Promise.resolve();
}
protected async createRegistrySecret(filePath: string | undefined): Promise<string | undefined> {
if(filePath === undefined || filePath === '') {
return undefined;
}
let body = fs.readFileSync(filePath).toString('base64');
let registrySecretName = String.Format('nni-secret-{0}', uniqueString(8)
.toLowerCase());
await this.genericK8sClient.createSecret(
{
apiVersion: 'v1',
kind: 'Secret',
metadata: {
name: registrySecretName,
namespace: 'default',
labels: {
app: this.NNI_KUBERNETES_TRIAL_LABEL,
expId: getExperimentId()
}
},
type: 'kubernetes.io/dockerconfigjson',
data: {
'.dockerconfigjson': body
}
}
);
return registrySecretName;
}
}
export { KubernetesTrainingService };
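The `createRegistrySecret` method above base64-encodes a Docker auth file into a `kubernetes.io/dockerconfigjson` Secret, and the job config generators reference that Secret through `imagePullSecrets`. A minimal Python sketch of the same two manifest shapes, built as plain dictionaries with no Kubernetes client call. The function names, the `app_label` argument, and the sample values are assumptions made for illustration and do not come from the commit.

```python
import base64
import secrets
import string


def build_registry_secret(auth_bytes, experiment_id, app_label, namespace="default"):
    """Build an image-pull Secret manifest from the raw bytes of a Docker config file."""
    alphabet = string.ascii_lowercase + string.digits
    suffix = "".join(secrets.choice(alphabet) for _ in range(8))
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {
            "name": "nni-secret-" + suffix,
            "namespace": namespace,
            "labels": {"app": app_label, "expId": experiment_id},
        },
        "type": "kubernetes.io/dockerconfigjson",
        # Secret data values must be base64-encoded strings.
        "data": {".dockerconfigjson": base64.b64encode(auth_bytes).decode("ascii")},
    }


def attach_pull_secret(pod_spec, secret_name):
    """Return a copy of pod_spec that references the Secret, if one was created."""
    spec = dict(pod_spec)
    if secret_name:
        spec["imagePullSecrets"] = [{"name": secret_name}]
    return spec


# Placeholder inputs: an empty auth document and a made-up label value.
secret = build_registry_secret(b'{"auths": {}}', "PlUIfDTR", "example-trial-label")
pod_spec = attach_pull_secret({"containers": [], "restartPolicy": "OnFailure"},
                              secret["metadata"]["name"])
print(pod_spec["imagePullSecrets"][0]["name"].startswith("nni-secret-"))  # prints: True
```

As in the TypeScript code, the pod spec is left untouched when no secret name is supplied, so trials that use public images are unaffected.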
......@@ -71,6 +71,8 @@ export class PAIJobConfig {
public readonly image: string;
// Code directory on HDFS
public readonly codeDir: string;
//authentication file used for private Docker registry
public readonly authFile?: string;
// List of taskRole, one task role at least
public taskRoles: PAITaskRole[];
......@@ -87,12 +89,13 @@ export class PAIJobConfig {
* @param taskRoles List of taskRole, one task role at least
*/
constructor(jobName: string, image : string, codeDir : string,
taskRoles : PAITaskRole[], virtualCluster: string) {
taskRoles : PAITaskRole[], virtualCluster: string, authFile?: string) {
this.jobName = jobName;
this.image = image;
this.codeDir = codeDir;
this.taskRoles = taskRoles;
this.virtualCluster = virtualCluster;
this.authFile = authFile;
}
}
......@@ -129,14 +132,17 @@ export class NNIPAITrialConfig extends TrialConfig {
public virtualCluster?: string;
//Shared memory for one task in the task role
public shmMB?: number;
//authentication file used for private Docker registry
public authFile?: string;
constructor(command : string, codeDir : string, gpuNum : number, cpuNum: number, memoryMB: number,
image: string, virtualCluster?: string, shmMB?: number) {
image: string, virtualCluster?: string, shmMB?: number, authFile?: string) {
super(command, codeDir, gpuNum);
this.cpuNum = cpuNum;
this.memoryMB = memoryMB;
this.image = image;
this.virtualCluster = virtualCluster;
this.shmMB = shmMB;
this.authFile = authFile;
}
}
......@@ -442,7 +442,7 @@ class PAITrainingService implements TrainingService {
// Task command
nniPaiTrialCommand,
// Task shared memory
this.paiTrialConfig.shmMB
this.paiTrialConfig.shmMB,
)
];
......@@ -456,7 +456,9 @@ class PAITrainingService implements TrainingService {
// PAI Task roles
paiTaskRoles,
// Add Virtual Cluster
this.paiTrialConfig.virtualCluster === undefined ? 'default' : this.paiTrialConfig.virtualCluster.toString()
this.paiTrialConfig.virtualCluster === undefined ? 'default' : this.paiTrialConfig.virtualCluster.toString(),
//Task auth File
this.paiTrialConfig.authFile
);
// Step 2. Upload code files in codeDir onto HDFS
......
# Copyright (c) Microsoft Corporation. All rights reserved.
#
# MIT License
#
# Permission is hereby granted, free of charge, to any person obtaining a copy of this software and
# associated documentation files (the "Software"), to deal in the Software without restriction,
# including without limitation the rights to use, copy, modify, merge, publish, distribute,
# sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in all copies or
# substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED *AS IS*, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT
# NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
# NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
# DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT
# OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
# ==================================================================================================
from .nni_client import *
# Copyright (c) Microsoft Corporation. All rights reserved.
#
# MIT License
#
# Permission is hereby granted, free of charge, to any person obtaining a copy of this software and
# associated documentation files (the "Software"), to deal in the Software without restriction,
# including without limitation the rights to use, copy, modify, merge, publish, distribute,
# sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in all copies or
# substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED *AS IS*, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT
# NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
# NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
# DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT
# OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
# ==================================================================================================
""" A python wrapper for nni rest api

Example:

    import nnicli as nc

    nc.start_nni('../../../../examples/trials/mnist/config.yml')

    nc.set_endpoint('http://localhost:8080')

    print(nc.version())
    print(nc.get_experiment_status())

    print(nc.get_job_statistics())
    print(nc.list_trial_jobs())

    nc.stop_nni()
"""

import sys
import os
import subprocess
import requests

__all__ = [
    'start_nni',
    'stop_nni',
    'set_endpoint',
    'version',
    'get_experiment_status',
    'get_experiment_profile',
    'get_trial_job',
    'list_trial_jobs',
    'get_job_statistics',
    'get_job_metrics',
    'export_data'
]

EXPERIMENT_PATH = 'experiment'
VERSION_PATH = 'version'
STATUS_PATH = 'check-status'
JOB_STATISTICS_PATH = 'job-statistics'
TRIAL_JOBS_PATH = 'trial-jobs'
METRICS_PATH = 'metric-data'
EXPORT_DATA_PATH = 'export-data'

API_ROOT_PATH = 'api/v1/nni'

_api_endpoint = None
def set_endpoint(endpoint):
    """set endpoint of nni rest server for nnicli, for example:
    http://localhost:8080
    """
    global _api_endpoint
    _api_endpoint = endpoint

def _check_endpoint():
    if _api_endpoint is None:
        raise AssertionError("Please call set_endpoint to specify nni endpoint")

def _nni_rest_get(api_path, response_type='json'):
    _check_endpoint()
    uri = '{}/{}/{}'.format(_api_endpoint, API_ROOT_PATH, api_path)
    res = requests.get(uri)
    if _http_succeed(res.status_code):
        if response_type == 'json':
            return res.json()
        elif response_type == 'text':
            return res.text
        else:
            raise AssertionError('Incorrect response_type')
    else:
        return None

def _http_succeed(status_code):
    return status_code // 100 == 2
def _create_process(cmd):
    if sys.platform == 'win32':
        process = subprocess.Popen(cmd, stdout=subprocess.PIPE, creationflags=subprocess.CREATE_NEW_PROCESS_GROUP)
    else:
        process = subprocess.Popen(cmd, stdout=subprocess.PIPE)

    while process.poll() is None:
        output = process.stdout.readline()
        if output:
            print(output.decode('utf-8').strip())
    return process.returncode

def start_nni(config_file):
    """start nni experiment with specified configuration file"""
    cmd = 'nnictl create --config {}'.format(config_file).split(' ')
    if _create_process(cmd) != 0:
        raise RuntimeError('Failed to start nni.')

def stop_nni():
    """stop nni experiment"""
    cmd = 'nnictl stop'.split(' ')
    if _create_process(cmd) != 0:
        raise RuntimeError('Failed to stop nni.')
def version():
    """return version of nni"""
    return _nni_rest_get(VERSION_PATH, 'text')

def get_experiment_status():
    """return experiment status as a dict"""
    return _nni_rest_get(STATUS_PATH)

def get_experiment_profile():
    """return experiment profile as a dict"""
    return _nni_rest_get(EXPERIMENT_PATH)

def get_trial_job(trial_job_id):
    """return trial job information as a dict"""
    assert trial_job_id is not None
    # REST paths are URL segments, so join with '/' rather than os.path.join
    return _nni_rest_get('{}/{}'.format(TRIAL_JOBS_PATH, trial_job_id))

def list_trial_jobs():
    """return information for all trial jobs as a list"""
    return _nni_rest_get(TRIAL_JOBS_PATH)

def get_job_statistics():
    """return trial job statistics information as a dict"""
    return _nni_rest_get(JOB_STATISTICS_PATH)

def get_job_metrics(trial_job_id=None):
    """return trial job metrics"""
    # REST paths are URL segments, so join with '/' rather than os.path.join
    api_path = METRICS_PATH if trial_job_id is None else '{}/{}'.format(METRICS_PATH, trial_job_id)
    return _nni_rest_get(api_path)

def export_data():
    """return exported information for all trial jobs"""
    return _nni_rest_get(EXPORT_DATA_PATH)
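The endpoint paths used by nni_client.py are URL segments rather than filesystem paths, so they are best joined with a forward slash on every platform. The following is a minimal sketch of that idea. `build_uri` is a hypothetical helper, not part of nnicli; it only reuses the `API_ROOT_PATH` value shown above.

```python
API_ROOT_PATH = 'api/v1/nni'  # same value as the constant in nni_client.py


def build_uri(endpoint, *segments):
    """Join an endpoint and URL path segments with '/' on every platform.

    Hypothetical helper for illustration; not part of nnicli.
    """
    # Drop a trailing slash on the endpoint so the result has no '//' in the path.
    parts = [endpoint.rstrip('/'), API_ROOT_PATH]
    # Strip stray slashes from each segment before joining.
    parts.extend(str(segment).strip('/') for segment in segments)
    return '/'.join(parts)


print(build_uri('http://localhost:8080', 'trial-jobs', 'abc123'))
```

Unlike `os.path.join`, this never emits a backslash on Windows, which matters when a trial job id is appended to `trial-jobs` or `metric-data`.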
import setuptools

setuptools.setup(
    name = 'nnicli',
    version = '999.0.0-developing',
    packages = setuptools.find_packages(),

    python_requires = '>=3.5',
    install_requires = [
        'requests'
    ],

    author = 'Microsoft NNI Team',
    author_email = 'nni@microsoft.com',
    description = 'nnicli for Neural Network Intelligence project',
    license = 'MIT',
    url = 'https://github.com/Microsoft/nni',
)
......@@ -190,13 +190,19 @@ class HyperoptTuner(Tuner):
HyperoptTuner is a tuner that uses the hyperopt algorithm.
"""
def __init__(self, algorithm_name, optimize_mode='minimize'):
def __init__(self, algorithm_name, optimize_mode='minimize',
parallel_optimize=False, constant_liar_type='min'):
"""
Parameters
----------
algorithm_name : str
algorithm_name includes "tpe", "random_search" and "anneal".
optimize_mode : str
parallel_optimize : bool
For details, see docs/en_US/Tuner/HyperoptTuner.md
constant_liar_type : str
constant_liar_type includes "min", "max" and "mean".
For details, see docs/en_US/Tuner/HyperoptTuner.md
"""
self.algorithm_name = algorithm_name
self.optimize_mode = OptimizeMode(optimize_mode)
......@@ -205,6 +211,13 @@ class HyperoptTuner(Tuner):
self.rval = None
self.supplement_data_num = 0
self.parallel = parallel_optimize
if self.parallel:
self.CL_rval = None
self.constant_liar_type = constant_liar_type
self.running_data = []
self.optimal_y = None
def _choose_tuner(self, algorithm_name):
"""
Parameters
......@@ -266,6 +279,10 @@ class HyperoptTuner(Tuner):
# but it can rarely cause duplicate parameters
total_params = self.get_suggestion(random_search=True)
self.total_data[parameter_id] = total_params
if self.parallel:
self.running_data.append(parameter_id)
params = split_index(total_params)
return params
......@@ -287,10 +304,39 @@ class HyperoptTuner(Tuner):
raise RuntimeError('Received parameter_id not in total_data.')
params = self.total_data[parameter_id]
# code for parallel
if self.parallel:
constant_liar = kwargs.get('constant_liar', False)
if constant_liar:
rval = self.CL_rval
else:
rval = self.rval
self.running_data.remove(parameter_id)
# update the reward of optimal_y
if self.optimal_y is None:
if self.constant_liar_type == 'mean':
self.optimal_y = [reward, 1]
else:
self.optimal_y = reward
else:
if self.constant_liar_type == 'mean':
_sum = self.optimal_y[0] + reward
_number = self.optimal_y[1] + 1
self.optimal_y = [_sum, _number]
elif self.constant_liar_type == 'min':
self.optimal_y = min(self.optimal_y, reward)
elif self.constant_liar_type == 'max':
self.optimal_y = max(self.optimal_y, reward)
logger.debug("Update optimal_y with reward, optimal_y = %s", self.optimal_y)
else:
rval = self.rval
if self.optimize_mode is OptimizeMode.Maximize:
reward = -reward
rval = self.rval
domain = rval.domain
trials = rval.trials
......@@ -375,13 +421,26 @@ class HyperoptTuner(Tuner):
total_params : dict
parameter suggestion
"""
if self.parallel and len(self.total_data)>20 and len(self.running_data) and self.optimal_y is not None:
self.CL_rval = copy.deepcopy(self.rval)
if self.constant_liar_type == 'mean':
_constant_liar_y = self.optimal_y[0] / self.optimal_y[1]
else:
_constant_liar_y = self.optimal_y
for _parameter_id in self.running_data:
self.receive_trial_result(parameter_id=_parameter_id, parameters=None, value=_constant_liar_y, constant_liar=True)
rval = self.CL_rval
rval = self.rval
random_state = np.random.randint(2**31 - 1)
else:
rval = self.rval
random_state = rval.rstate.randint(2**31 - 1)
trials = rval.trials
algorithm = rval.algo
new_ids = rval.trials.new_trial_ids(1)
rval.trials.refresh()
random_state = rval.rstate.randint(2**31 - 1)
if random_search:
new_trials = hp.rand.suggest(new_ids, rval.domain, trials,
random_state)
......
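The parallel-optimization hunks above keep a running summary of observed rewards (`optimal_y`) and feed it back as a placeholder result for trials that are still running. The sketch below isolates just that bookkeeping, assuming the three `constant_liar_type` values named in the diff. `ConstantLiar` and its method names are hypothetical and do not exist in NNI.

```python
class ConstantLiar:
    """Track the placeholder ("lie") value reported for still-running trials.

    Hypothetical class for illustration; it mirrors the optimal_y updates
    in the diff but is not NNI code.
    """

    def __init__(self, liar_type='min'):
        if liar_type not in ('min', 'max', 'mean'):
            raise ValueError('liar_type must be "min", "max" or "mean"')
        self.liar_type = liar_type
        self._total = 0.0     # running sum of rewards, used by 'mean'
        self._count = 0       # number of rewards seen so far
        self._extreme = None  # running minimum or maximum

    def update(self, reward):
        """Record the reward of a finished trial."""
        self._total += reward
        self._count += 1
        if self._extreme is None:
            self._extreme = reward
        elif self.liar_type == 'min':
            self._extreme = min(self._extreme, reward)
        elif self.liar_type == 'max':
            self._extreme = max(self._extreme, reward)

    def lie(self):
        """Return the value to assume for running trials, or None with no data."""
        if self._count == 0:
            return None
        if self.liar_type == 'mean':
            return self._total / self._count
        return self._extreme


liar = ConstantLiar('mean')
for reward in (0.5, 0.75, 1.0):
    liar.update(reward)
print(liar.lie())
```

In the diff, the equivalent value is passed to `receive_trial_result` with `constant_liar=True` for every id in `running_data`, against a deep copy of the optimizer state, so the real state is not polluted by the placeholder results.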
# Copyright (c) Microsoft Corporation
# All rights reserved.
#
# MIT License
#
# Permission is hereby granted, free of charge,
# to any person obtaining a copy of this software and associated
# documentation files (the "Software"), to deal in the Software without restriction,
# including without limitation the rights to use, copy, modify, merge, publish,
# distribute, sublicense, and/or sell copies of the Software, and
# to permit persons to whom the Software is furnished to do so, subject to the following conditions:
# The above copyright notice and this permission notice shall be included
# in all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED *AS IS*, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING
# BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
# NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
# DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
import sys
import time
import traceback

from utils import GREEN, RED, CLEAR, setup_experiment

def test_nni_cli():
    import nnicli as nc

    config_file = 'config_test/examples/mnist.test.yml'

    try:
        # Sleep here to make sure previous stopped exp has enough time to exit to avoid port conflict
        time.sleep(6)
        print(GREEN + 'Testing nnicli:' + config_file + CLEAR)

        nc.start_nni(config_file)

        time.sleep(3)
        nc.set_endpoint('http://localhost:8080')
        print(nc.version())
        print(nc.get_job_statistics())
        print(nc.get_experiment_status())
        nc.list_trial_jobs()

        print(GREEN + 'Test nnicli {}: TEST PASS'.format(config_file) + CLEAR)
    except Exception as error:
        print(RED + 'Test nnicli {}: TEST FAIL'.format(config_file) + CLEAR)
        print('%r' % error)
        traceback.print_exc()
        raise error
    finally:
        nc.stop_nni()

if __name__ == '__main__':
    installed = (sys.argv[-1] != '--preinstall')
    setup_experiment(installed)

    test_nni_cli()
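The test above relies on fixed `time.sleep` calls to wait for the previous experiment to exit and for the new REST server to come up. A possible alternative is to poll until a condition holds. The sketch below shows the idea with a hypothetical `wait_until` helper (not part of nnicli or the test utilities); the predicate would typically wrap a call such as `nc.get_experiment_status()`.

```python
import time


def wait_until(predicate, timeout=30.0, interval=0.5):
    """Call predicate until it returns a truthy value or the timeout expires.

    Hypothetical helper for illustration. Returns True on success and
    False on timeout. Exceptions from predicate count as "not ready yet".
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            if predicate():
                return True
        except Exception:
            pass  # e.g. connection refused while the server is still starting
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Stand-in predicate that becomes ready on the third call.
attempts = {'count': 0}

def ready():
    attempts['count'] += 1
    return attempts['count'] >= 3

print(wait_until(ready, timeout=5.0, interval=0.01))
```

With such a helper, the fixed delays could be replaced by a bounded wait that returns as soon as the server responds, at the cost of a few extra requests.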