new feature : NAS-NLP-benchmark (#3140)

Unverified Commit 697c60ec authored Dec 12, 2020 by 98may Committed by GitHub Dec 12, 2020
11 changed files
--- a/docs/archive_en_US/NAS/Benchmarks.md
+++ b/docs/archive_en_US/NAS/Benchmarks.md
 # NAS Benchmarks

+[TOC]
+
 ```eval_rst
 ..  toctree::
    :hidden:
@@ -9,7 +11,7 @@

 ## Introduction

-To improve the reproducibility of NAS algorithms and reduce computing resource requirements, researchers have proposed a series of NAS benchmarks such as [NAS-Bench-101](https://arxiv.org/abs/1902.09635), [NAS-Bench-201](https://arxiv.org/abs/2001.00326), [NDS](https://arxiv.org/abs/1905.13214), etc. NNI provides a query interface for users to access these benchmarks. With just a few lines of code, researchers can evaluate their NAS algorithms easily and fairly using these benchmarks.
+To improve the reproducibility of NAS algorithms and reduce computing resource requirements, researchers have proposed a series of NAS benchmarks such as [NAS-Bench-101](https://arxiv.org/abs/1902.09635), [NAS-Bench-201](https://arxiv.org/abs/2001.00326), [NDS](https://arxiv.org/abs/1905.13214), [NLP](https://arxiv.org/abs/2006.07116), etc. NNI provides a query interface for users to access these benchmarks. With just a few lines of code, researchers can evaluate their NAS algorithms easily and fairly using these benchmarks.

 ## Prerequisites

@@ -27,7 +29,7 @@ cd nni/examples/nas/benchmarks
 ```
 Replace `${NNI_VERSION}` with a released version name or branch name, e.g., `v1.9`.

-2. Install dependencies via `pip3 install -r xxx.requirements.txt`. `xxx` can be `nasbench101`, `nasbench201` or `nds`.
+2. Install dependencies via `pip3 install -r xxx.requirements.txt`. `xxx` can be `nasbench101`, `nasbench201`, `nds` or `nlp`.
 3. Generate the database via `./xxx.sh`. The directory that stores the benchmark file can be configured with the `NASBENCHMARK_DIR` environment variable, which defaults to `~/.nni/nasbenchmark`. Note that the NAS-Bench-201 dataset will be downloaded from Google Drive.

 Please make sure there is at least 10GB of free disk space, and note that the conversion process can take several hours to complete.
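
The directory lookup and disk-space requirement described in step 3 can be checked before running the conversion scripts. The following is a minimal sketch, not NNI's own implementation: `benchmark_dir` and `has_enough_space` are hypothetical helper names, treating an empty `NASBENCHMARK_DIR` as unset is an assumption, and the 10GB figure is read as 10 GiB.

```python
import os
import shutil
from pathlib import Path

# Hypothetical helpers, not part of NNI. The default mirrors the one documented above.
DEFAULT_DIR = "~/.nni/nasbenchmark"
MIN_FREE_BYTES = 10 * 1024 ** 3  # the documented 10GB, interpreted here as 10 GiB


def benchmark_dir(env=None) -> Path:
    """Return the directory named by NASBENCHMARK_DIR, or the documented default."""
    env = os.environ if env is None else env
    # Assumption: an empty value is treated the same as an unset variable.
    return Path(env.get("NASBENCHMARK_DIR") or DEFAULT_DIR).expanduser()


def has_enough_space(path: Path, required: int = MIN_FREE_BYTES) -> bool:
    """Check free space on the filesystem that would hold `path`."""
    probe = path
    # The target directory may not exist yet, so walk up to the nearest existing parent.
    while not probe.exists():
        probe = probe.parent
    return shutil.disk_usage(probe).free >= required


if __name__ == "__main__":
    target = benchmark_dir()
    print(f"benchmark directory: {target}")
    print(f"enough free space: {has_enough_space(target)}")
```

Running the script prints the resolved directory and whether the filesystem behind it currently has the required free space.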
@@ -109,7 +111,7 @@ _On Network Design Spaces for Visual Recognition_ released trial statistics of o

 Instead of storing results obtained with different configurations in separate files, we dump them into one single database to enable comparison in multiple dimensions. Specifically, we use `model_family` to distinguish model types, `model_spec` for all hyper-parameters needed to build this model, `cell_spec` for detailed information on operators and connections if it is a NAS cell, `generator` to denote the sampling policy through which this configuration is generated. Refer to API documentation for details.

-## Available Operators
+### Available Operators

 Here is a list of available operators used in NDS.

@@ -158,3 +160,22 @@ Here is a list of available operators used in NDS.

 .. autoclass:: nni.nas.benchmarks.nds.NdsIntermediateStats
 ```
+
+## NLP
+
+[Paper link](https://arxiv.org/abs/2006.07116) &nbsp; &nbsp; [Open-source](https://github.com/fmsnew/nas-bench-nlp-release)
+
+The paper "NAS-Bench-NLP: Neural Architecture Search Benchmark for Natural Language Processing" provides a search space of recurrent neural networks for text datasets and trains 14k architectures within it. The authors conduct both intrinsic and extrinsic evaluation of the trained models using datasets for semantic relatedness and language understanding. Two datasets are covered: PTB and WikiText-2. The precomputed results (ptb_single_run + ptb_multi_run + wikitext-2) are available for querying.
+
+### API Documentation
+
+```eval_rst
+.. autofunction:: nni.nas.benchmarks.nlp.query_nlp_trial_stats
+
+.. autoclass:: nni.nas.benchmarks.nlp.NlpTrialConfig
+
+.. autoclass:: nni.nas.benchmarks.nlp.NlpTrialStats
+
+.. autoclass:: nni.nas.benchmarks.nlp.NlpIntermediateStats
+```
+
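
Because an architecture can be trained more than once, a single query may yield several result dicts for the same configuration (the notebook in this commit notes this for NAS-Bench-101). A minimal sketch of averaging one metric across such results follows. It assumes only that each result is a dict; `mean_metric` is a hypothetical helper and not part of NNI, and the metric key is supplied by the caller rather than taken from any documented schema.

```python
from statistics import mean
from typing import Any, Dict, Iterable


def mean_metric(trials: Iterable[Dict[str, Any]], key: str) -> float:
    """Average `key` over every result dict yielded by a benchmark query.

    Results that lack the key, or hold None for it, are skipped.
    """
    values = [t[key] for t in trials if t.get(key) is not None]
    if not values:
        raise ValueError(f"no trial carries a value for {key!r}")
    return mean(values)
```

The helper accepts any iterable, so the generator returned by a query function could be passed in directly, for example `mean_metric(query_nb101_trial_stats(arch, 108), 'test_acc')` with the `arch` dict from the notebook.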
--- a/docs/en_US/NAS/BenchmarksExample.ipynb
+++ b/docs/en_US/NAS/BenchmarksExample.ipynb
 {
- "cells": [
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "# Example Usages of NAS Benchmarks"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 1,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "import pprint\n",
-    "import time\n",
-    "\n",
-    "from nni.nas.benchmarks.nasbench101 import query_nb101_trial_stats\n",
-    "from nni.nas.benchmarks.nasbench201 import query_nb201_trial_stats\n",
-    "from nni.nas.benchmarks.nds import query_nds_trial_stats\n",
-    "\n",
-    "ti = time.time()"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "## NAS-Bench-101"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "Use the following architecture as an example:\n",
-    "\n",
-    "![nas-101](../../img/nas-bench-101-example.png)"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 2,
-   "metadata": {
-    "tags": []
-   },
-   "outputs": [
+  "cells": [
    {
-     "output_type": "stream",
-     "name": "stdout",
-     "text": "{'config': {'arch': {'input1': [0],\n                     'input2': [1],\n                     'input3': [2],\n                     'input4': [0],\n                     'input5': [0, 3, 4],\n                     'input6': [2, 5],\n                     'op1': 'conv3x3-bn-relu',\n                     'op2': 'maxpool3x3',\n                     'op3': 'conv3x3-bn-relu',\n                     'op4': 'conv3x3-bn-relu',\n                     'op5': 'conv1x1-bn-relu'},\n            'hash': '00005c142e6f48ac74fdcf73e3439874',\n            'id': 4,\n            'num_epochs': 108,\n            'num_vertices': 7},\n 'id': 10,\n 'intermediates': [{'current_epoch': 54,\n                    'id': 19,\n                    'test_acc': 77.40384340286255,\n                    'train_acc': 82.82251358032227,\n                    'training_time': 883.4580078125,\n                    'valid_acc': 77.76442170143127},\n                   {'current_epoch': 108,\n                    'id': 20,\n                    'test_acc': 92.11738705635071,\n                    'train_acc': 100.0,\n                    'training_time': 1769.1279296875,\n                    'valid_acc': 92.41786599159241}],\n 'parameters': 8.55553,\n 'test_acc': 92.11738705635071,\n 'train_acc': 100.0,\n 'training_time': 106147.67578125,\n 'valid_acc': 92.41786599159241}\n{'config': {'arch': {'input1': [0],\n                     'input2': [1],\n                     'input3': [2],\n                     'input4': [0],\n                     'input5': [0, 3, 4],\n                     'input6': [2, 5],\n                     'op1': 'conv3x3-bn-relu',\n                     'op2': 'maxpool3x3',\n                     'op3': 'conv3x3-bn-relu',\n                     'op4': 'conv3x3-bn-relu',\n                     'op5': 'conv1x1-bn-relu'},\n            'hash': '00005c142e6f48ac74fdcf73e3439874',\n            'id': 4,\n            'num_epochs': 108,\n            'num_vertices': 7},\n 'id': 11,\n 'intermediates': 
[{'current_epoch': 54,\n                    'id': 21,\n                    'test_acc': 82.04126358032227,\n                    'train_acc': 87.96073794364929,\n                    'training_time': 883.6810302734375,\n                    'valid_acc': 82.91265964508057},\n                   {'current_epoch': 108,\n                    'id': 22,\n                    'test_acc': 91.90705418586731,\n                    'train_acc': 100.0,\n                    'training_time': 1768.2509765625,\n                    'valid_acc': 92.45793223381042}],\n 'parameters': 8.55553,\n 'test_acc': 91.90705418586731,\n 'train_acc': 100.0,\n 'training_time': 106095.05859375,\n 'valid_acc': 92.45793223381042}\n{'config': {'arch': {'input1': [0],\n                     'input2': [1],\n                     'input3': [2],\n                     'input4': [0],\n                     'input5': [0, 3, 4],\n                     'input6': [2, 5],\n                     'op1': 'conv3x3-bn-relu',\n                     'op2': 'maxpool3x3',\n                     'op3': 'conv3x3-bn-relu',\n                     'op4': 'conv3x3-bn-relu',\n                     'op5': 'conv1x1-bn-relu'},\n            'hash': '00005c142e6f48ac74fdcf73e3439874',\n            'id': 4,\n            'num_epochs': 108,\n            'num_vertices': 7},\n 'id': 12,\n 'intermediates': [{'current_epoch': 54,\n                    'id': 23,\n                    'test_acc': 80.58894276618958,\n                    'train_acc': 86.34815812110901,\n                    'training_time': 883.4569702148438,\n                    'valid_acc': 81.1598539352417},\n                   {'current_epoch': 108,\n                    'id': 24,\n                    'test_acc': 92.15745329856873,\n                    'train_acc': 100.0,\n                    'training_time': 1768.9759521484375,\n                    'valid_acc': 93.04887652397156}],\n 'parameters': 8.55553,\n 'test_acc': 92.15745329856873,\n 'train_acc': 100.0,\n 'training_time': 
106138.55712890625,\n 'valid_acc': 93.04887652397156}\n"
-    }
-   ],
-   "source": [
-    "arch = {\n",
-    "    'op1': 'conv3x3-bn-relu',\n",
-    "    'op2': 'maxpool3x3',\n",
-    "    'op3': 'conv3x3-bn-relu',\n",
-    "    'op4': 'conv3x3-bn-relu',\n",
-    "    'op5': 'conv1x1-bn-relu',\n",
-    "    'input1': [0],\n",
-    "    'input2': [1],\n",
-    "    'input3': [2],\n",
-    "    'input4': [0],\n",
-    "    'input5': [0, 3, 4],\n",
-    "    'input6': [2, 5]\n",
-    "}\n",
-    "for t in query_nb101_trial_stats(arch, 108, include_intermediates=True):\n",
-    "    pprint.pprint(t)"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "An architecture of NAS-Bench-101 could be trained more than once. Each element of the returned generator is a dict which contains one of the training results of this trial config (architecture + hyper-parameters) including train/valid/test accuracy, training time, number of epochs, etc. The results of NAS-Bench-201 and NDS follow similar formats."
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "## NAS-Bench-201"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "Use the following architecture as an example:\n",
-    "\n",
-    "![nas-201](../../img/nas-bench-201-example.png)"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 3,
-   "metadata": {
-    "tags": []
-   },
-   "outputs": [
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "# Example Usages of NAS Benchmarks"
+      ]
+    },
    {
-     "output_type": "stream",
-     "name": "stdout",
-     "text": "{'config': {'arch': {'0_1': 'avg_pool_3x3',\n                     '0_2': 'conv_1x1',\n                     '0_3': 'conv_1x1',\n                     '1_2': 'skip_connect',\n                     '1_3': 'skip_connect',\n                     '2_3': 'skip_connect'},\n            'dataset': 'cifar100',\n            'id': 7,\n            'num_cells': 5,\n            'num_channels': 16,\n            'num_epochs': 200},\n 'flops': 15.65322,\n 'id': 3,\n 'latency': 0.013182918230692545,\n 'ori_test_acc': 53.11,\n 'ori_test_evaluation_time': 1.0195916947864352,\n 'ori_test_loss': 1.7307863704681397,\n 'parameters': 0.135156,\n 'seed': 999,\n 'test_acc': 53.07999995727539,\n 'test_evaluation_time': 0.5097958473932176,\n 'test_loss': 1.731276072692871,\n 'train_acc': 57.82,\n 'train_loss': 1.5116578379058838,\n 'training_time': 2888.4371995925903,\n 'valid_acc': 53.14000000610351,\n 'valid_evaluation_time': 0.5097958473932176,\n 'valid_loss': 1.7302966793060304}\n{'config': {'arch': {'0_1': 'avg_pool_3x3',\n                     '0_2': 'conv_1x1',\n                     '0_3': 'conv_1x1',\n                     '1_2': 'skip_connect',\n                     '1_3': 'skip_connect',\n                     '2_3': 'skip_connect'},\n            'dataset': 'cifar100',\n            'id': 7,\n            'num_cells': 5,\n            'num_channels': 16,\n            'num_epochs': 200},\n 'flops': 15.65322,\n 'id': 7,\n 'latency': 0.013182918230692545,\n 'ori_test_acc': 51.93,\n 'ori_test_evaluation_time': 1.0195916947864352,\n 'ori_test_loss': 1.7572312774658203,\n 'parameters': 0.135156,\n 'seed': 777,\n 'test_acc': 51.979999938964845,\n 'test_evaluation_time': 0.5097958473932176,\n 'test_loss': 1.7429540189743042,\n 'train_acc': 57.578,\n 'train_loss': 1.5114233912658692,\n 'training_time': 2888.4371995925903,\n 'valid_acc': 51.88,\n 'valid_evaluation_time': 0.5097958473932176,\n 'valid_loss': 1.7715086591720581}\n{'config': {'arch': {'0_1': 'avg_pool_3x3',\n                   
  '0_2': 'conv_1x1',\n                     '0_3': 'conv_1x1',\n                     '1_2': 'skip_connect',\n                     '1_3': 'skip_connect',\n                     '2_3': 'skip_connect'},\n            'dataset': 'cifar100',\n            'id': 7,\n            'num_cells': 5,\n            'num_channels': 16,\n            'num_epochs': 200},\n 'flops': 15.65322,\n 'id': 11,\n 'latency': 0.013182918230692545,\n 'ori_test_acc': 53.38,\n 'ori_test_evaluation_time': 1.0195916947864352,\n 'ori_test_loss': 1.7281623031616211,\n 'parameters': 0.135156,\n 'seed': 888,\n 'test_acc': 53.67999998779297,\n 'test_evaluation_time': 0.5097958473932176,\n 'test_loss': 1.7327697801589965,\n 'train_acc': 57.792,\n 'train_loss': 1.5091403088760376,\n 'training_time': 2888.4371995925903,\n 'valid_acc': 53.08000000610352,\n 'valid_evaluation_time': 0.5097958473932176,\n 'valid_loss': 1.7235548280715942}\n"
-    }
-   ],
-   "source": [
-    "arch = {\n",
-    "    '0_1': 'avg_pool_3x3',\n",
-    "    '0_2': 'conv_1x1',\n",
-    "    '1_2': 'skip_connect',\n",
-    "    '0_3': 'conv_1x1',\n",
-    "    '1_3': 'skip_connect',\n",
-    "    '2_3': 'skip_connect'\n",
-    "}\n",
-    "for t in query_nb201_trial_stats(arch, 200, 'cifar100'):\n",
-    "    pprint.pprint(t)"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "Intermediate results are also available."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 4,
-   "metadata": {
-    "tags": []
-   },
-   "outputs": [
+      "cell_type": "code",
+      "execution_count": 3,
+      "metadata": {},
+      "outputs": [],
+      "source": [
+        "import pprint\n",
+        "import time\n",
+        "\n",
+        "from nni.nas.benchmarks.nasbench101 import query_nb101_trial_stats\n",
+        "from nni.nas.benchmarks.nasbench201 import query_nb201_trial_stats\n",
+        "from nni.nas.benchmarks.nds import query_nds_trial_stats\n",
+        "\n",
+        "ti = time.time()"
+      ]
+    },
    {
-     "output_type": "stream",
-     "name": "stdout",
-     "text": "{'id': 4, 'arch': {'0_1': 'avg_pool_3x3', '0_2': 'conv_1x1', '0_3': 'conv_1x1', '1_2': 'skip_connect', '1_3': 'skip_connect', '2_3': 'skip_connect'}, 'num_epochs': 12, 'num_channels': 16, 'num_cells': 5, 'dataset': 'imagenet16-120'}\nIntermediates: 12\n{'id': 8, 'arch': {'0_1': 'avg_pool_3x3', '0_2': 'conv_1x1', '0_3': 'conv_1x1', '1_2': 'skip_connect', '1_3': 'skip_connect', '2_3': 'skip_connect'}, 'num_epochs': 200, 'num_channels': 16, 'num_cells': 5, 'dataset': 'imagenet16-120'}\nIntermediates: 200\n{'id': 8, 'arch': {'0_1': 'avg_pool_3x3', '0_2': 'conv_1x1', '0_3': 'conv_1x1', '1_2': 'skip_connect', '1_3': 'skip_connect', '2_3': 'skip_connect'}, 'num_epochs': 200, 'num_channels': 16, 'num_cells': 5, 'dataset': 'imagenet16-120'}\nIntermediates: 200\n{'id': 8, 'arch': {'0_1': 'avg_pool_3x3', '0_2': 'conv_1x1', '0_3': 'conv_1x1', '1_2': 'skip_connect', '1_3': 'skip_connect', '2_3': 'skip_connect'}, 'num_epochs': 200, 'num_channels': 16, 'num_cells': 5, 'dataset': 'imagenet16-120'}\nIntermediates: 200\n"
-    }
-   ],
-   "source": [
-    "for t in query_nb201_trial_stats(arch, None, 'imagenet16-120', include_intermediates=True):\n",
-    "    print(t['config'])\n",
-    "    print('Intermediates:', len(t['intermediates']))"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "## NDS"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "Use the following architecture as an example:<br>\n",
-    "![nds](../../img/nas-bench-nds-example.png)\n",
-    "\n",
-    "Here, `bot_muls`, `ds`, `num_gs`, `ss` and `ws` stand for \"bottleneck multipliers\", \"depths\", \"number of groups\", \"strides\" and \"widths\" respectively."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 5,
-   "metadata": {
-    "tags": []
-   },
-   "outputs": [
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "## NAS-Bench-101"
+      ]
+    },
    {
-     "output_type": "stream",
-     "name": "stdout",
-     "text": "{'best_test_acc': 90.48,\n 'best_train_acc': 96.356,\n 'best_train_loss': 0.116,\n 'config': {'base_lr': 0.1,\n            'cell_spec': {},\n            'dataset': 'cifar10',\n            'generator': 'random',\n            'id': 45505,\n            'model_family': 'residual_bottleneck',\n            'model_spec': {'bot_muls': [0.0, 0.25, 0.25, 0.25],\n                           'ds': [1, 16, 1, 4],\n                           'num_gs': [1, 2, 1, 2],\n                           'ss': [1, 1, 2, 2],\n                           'ws': [16, 64, 128, 16]},\n            'num_epochs': 100,\n            'proposer': 'resnext-a',\n            'weight_decay': 0.0005},\n 'final_test_acc': 90.39,\n 'final_train_acc': 96.298,\n 'final_train_loss': 0.116,\n 'flops': 69.890986,\n 'id': 45505,\n 'iter_time': 0.065,\n 'parameters': 0.083002,\n 'seed': 1}\n"
-    }
-   ],
-   "source": [
-    "model_spec = {\n",
-    "    'bot_muls': [0.0, 0.25, 0.25, 0.25],\n",
-    "    'ds': [1, 16, 1, 4],\n",
-    "    'num_gs': [1, 2, 1, 2],\n",
-    "    'ss': [1, 1, 2, 2],\n",
-    "    'ws': [16, 64, 128, 16]\n",
-    "}\n",
-    "# Use none as a wildcard\n",
-    "for t in query_nds_trial_stats('residual_bottleneck', None, None, model_spec, None, 'cifar10'):\n",
-    "    pprint.pprint(t)"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 6,
-   "metadata": {
-    "tags": []
-   },
-   "outputs": [
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "Use the following architecture as an example:\n",
+        "\n",
+        "![nas-101](../../img/nas-bench-101-example.png)"
+      ]
+    },
    {
-     "output_type": "stream",
-     "name": "stdout",
-     "text": "[{'current_epoch': 1,\n  'id': 4494501,\n  'test_acc': 41.76,\n  'train_acc': 30.421000000000006,\n  'train_loss': 1.793},\n {'current_epoch': 2,\n  'id': 4494502,\n  'test_acc': 54.66,\n  'train_acc': 47.24,\n  'train_loss': 1.415},\n {'current_epoch': 3,\n  'id': 4494503,\n  'test_acc': 59.97,\n  'train_acc': 56.983,\n  'train_loss': 1.179},\n {'current_epoch': 4,\n  'id': 4494504,\n  'test_acc': 62.91,\n  'train_acc': 61.955,\n  'train_loss': 1.048},\n {'current_epoch': 5,\n  'id': 4494505,\n  'test_acc': 66.16,\n  'train_acc': 64.493,\n  'train_loss': 0.983},\n {'current_epoch': 6,\n  'id': 4494506,\n  'test_acc': 66.5,\n  'train_acc': 66.274,\n  'train_loss': 0.937},\n {'current_epoch': 7,\n  'id': 4494507,\n  'test_acc': 67.55,\n  'train_acc': 67.426,\n  'train_loss': 0.907},\n {'current_epoch': 8,\n  'id': 4494508,\n  'test_acc': 69.45,\n  'train_acc': 68.45400000000001,\n  'train_loss': 0.878},\n {'current_epoch': 9,\n  'id': 4494509,\n  'test_acc': 70.14,\n  'train_acc': 69.295,\n  'train_loss': 0.857},\n {'current_epoch': 10,\n  'id': 4494510,\n  'test_acc': 69.47,\n  'train_acc': 70.304,\n  'train_loss': 0.832}]\n"
-    }
-   ],
-   "source": [
-    "model_spec = {\n",
-    "    'bot_muls': [0.0, 0.25, 0.25, 0.25],\n",
-    "    'ds': [1, 16, 1, 4],\n",
-    "    'num_gs': [1, 2, 1, 2],\n",
-    "    'ss': [1, 1, 2, 2],\n",
-    "    'ws': [16, 64, 128, 16]\n",
-    "}\n",
-    "for t in query_nds_trial_stats('residual_bottleneck', None, None, model_spec, None, 'cifar10', include_intermediates=True):\n",
-    "    pprint.pprint(t['intermediates'][:10])"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 7,
-   "metadata": {
-    "tags": []
-   },
-   "outputs": [
+      "cell_type": "code",
+      "execution_count": 2,
+      "metadata": {
+        "tags": []
+      },
+      "outputs": [],
+      "source": [
+        "arch = {\n",
+        "    'op1': 'conv3x3-bn-relu',\n",
+        "    'op2': 'maxpool3x3',\n",
+        "    'op3': 'conv3x3-bn-relu',\n",
+        "    'op4': 'conv3x3-bn-relu',\n",
+        "    'op5': 'conv1x1-bn-relu',\n",
+        "    'input1': [0],\n",
+        "    'input2': [1],\n",
+        "    'input3': [2],\n",
+        "    'input4': [0],\n",
+        "    'input5': [0, 3, 4],\n",
+        "    'input6': [2, 5]\n",
+        "}\n",
+        "for t in query_nb101_trial_stats(arch, 108, include_intermediates=True):\n",
+        "    pprint.pprint(t)"
+      ]
+    },
    {
-     "output_type": "stream",
-     "name": "stdout",
-     "text": "{'best_test_acc': 93.58,\n 'best_train_acc': 99.772,\n 'best_train_loss': 0.011,\n 'config': {'base_lr': 0.1,\n            'cell_spec': {},\n            'dataset': 'cifar10',\n            'generator': 'random',\n            'id': 108998,\n            'model_family': 'residual_basic',\n            'model_spec': {'ds': [1, 12, 12, 12],\n                           'ss': [1, 1, 2, 2],\n                           'ws': [16, 24, 24, 40]},\n            'num_epochs': 100,\n            'proposer': 'resnet',\n            'weight_decay': 0.0005},\n 'final_test_acc': 93.49,\n 'final_train_acc': 99.772,\n 'final_train_loss': 0.011,\n 'flops': 184.519578,\n 'id': 108998,\n 'iter_time': 0.059,\n 'parameters': 0.594138,\n 'seed': 1}\n"
-    }
-   ],
-   "source": [
-    "model_spec = {'ds': [1, 12, 12, 12], 'ss': [1, 1, 2, 2], 'ws': [16, 24, 24, 40]}\n",
-    "for t in query_nds_trial_stats('residual_basic', 'resnet', 'random', model_spec, {}, 'cifar10'):\n",
-    "    pprint.pprint(t)"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 8,
-   "metadata": {
-    "tags": []
-   },
-   "outputs": [
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "An architecture of NAS-Bench-101 could be trained more than once. Each element of the returned generator is a dict which contains one of the training results of this trial config (architecture + hyper-parameters) including train/valid/test accuracy, training time, number of epochs, etc. The results of NAS-Bench-201 and NDS follow similar formats."
+      ]
+    },
    {
-     "output_type": "stream",
-     "name": "stdout",
-     "text": "{'best_test_acc': 84.5,\n 'best_train_acc': 89.66499999999999,\n 'best_train_loss': 0.302,\n 'config': {'base_lr': 0.1,\n            'cell_spec': {},\n            'dataset': 'cifar10',\n            'generator': 'random',\n            'id': 139492,\n            'model_family': 'vanilla',\n            'model_spec': {'ds': [1, 12, 12, 12],\n                           'ss': [1, 1, 2, 2],\n                           'ws': [16, 24, 32, 40]},\n            'num_epochs': 100,\n            'proposer': 'vanilla',\n            'weight_decay': 0.0005},\n 'final_test_acc': 84.35,\n 'final_train_acc': 89.633,\n 'final_train_loss': 0.303,\n 'flops': 208.36393,\n 'id': 154692,\n 'iter_time': 0.058,\n 'parameters': 0.68977,\n 'seed': 1}\n"
-    }
-   ],
-   "source": [
-    "# get the first one\n",
-    "pprint.pprint(next(query_nds_trial_stats('vanilla', None, None, None, None, None)))"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 9,
-   "metadata": {
-    "tags": []
-   },
-   "outputs": [
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "## NAS-Bench-201"
+      ]
+    },
    {
-     "output_type": "stream",
-     "name": "stdout",
-     "text": "{'best_test_acc': 93.37,\n 'best_train_acc': 99.91,\n 'best_train_loss': 0.006,\n 'config': {'base_lr': 0.1,\n            'cell_spec': {'normal_0_input_x': 0,\n                          'normal_0_input_y': 1,\n                          'normal_0_op_x': 'avg_pool_3x3',\n                          'normal_0_op_y': 'conv_7x1_1x7',\n                          'normal_1_input_x': 2,\n                          'normal_1_input_y': 0,\n                          'normal_1_op_x': 'sep_conv_3x3',\n                          'normal_1_op_y': 'sep_conv_5x5',\n                          'normal_2_input_x': 2,\n                          'normal_2_input_y': 2,\n                          'normal_2_op_x': 'dil_sep_conv_3x3',\n                          'normal_2_op_y': 'dil_sep_conv_3x3',\n                          'normal_3_input_x': 4,\n                          'normal_3_input_y': 4,\n                          'normal_3_op_x': 'skip_connect',\n                          'normal_3_op_y': 'dil_sep_conv_3x3',\n                          'normal_4_input_x': 2,\n                          'normal_4_input_y': 4,\n                          'normal_4_op_x': 'conv_7x1_1x7',\n                          'normal_4_op_y': 'sep_conv_3x3',\n                          'normal_concat': [3, 5, 6],\n                          'reduce_0_input_x': 0,\n                          'reduce_0_input_y': 1,\n                          'reduce_0_op_x': 'avg_pool_3x3',\n                          'reduce_0_op_y': 'dil_sep_conv_3x3',\n                          'reduce_1_input_x': 0,\n                          'reduce_1_input_y': 0,\n                          'reduce_1_op_x': 'sep_conv_3x3',\n                          'reduce_1_op_y': 'sep_conv_3x3',\n                          'reduce_2_input_x': 2,\n                          'reduce_2_input_y': 0,\n                          'reduce_2_op_x': 'skip_connect',\n                          'reduce_2_op_y': 'sep_conv_7x7',\n                          
'reduce_3_input_x': 4,\n                          'reduce_3_input_y': 4,\n                          'reduce_3_op_x': 'conv_7x1_1x7',\n                          'reduce_3_op_y': 'skip_connect',\n                          'reduce_4_input_x': 0,\n                          'reduce_4_input_y': 5,\n                          'reduce_4_op_x': 'conv_7x1_1x7',\n                          'reduce_4_op_y': 'conv_7x1_1x7',\n                          'reduce_concat': [3, 6]},\n            'dataset': 'cifar10',\n            'generator': 'random',\n            'id': 1,\n            'model_family': 'nas_cell',\n            'model_spec': {'aux': False,\n                           'depth': 12,\n                           'drop_prob': 0.0,\n                           'num_nodes_normal': 5,\n                           'num_nodes_reduce': 5,\n                           'width': 32},\n            'num_epochs': 100,\n            'proposer': 'amoeba',\n            'weight_decay': 0.0005},\n 'final_test_acc': 93.27,\n 'final_train_acc': 99.91,\n 'final_train_loss': 0.006,\n 'flops': 664.400586,\n 'id': 1,\n 'iter_time': 0.281,\n 'parameters': 4.190314,\n 'seed': 1}\n"
-    }
-   ],
-   "source": [
-    "# count number\n",
-    "model_spec = {'num_nodes_normal': 5, 'num_nodes_reduce': 5, 'depth': 12, 'width': 32, 'aux': False, 'drop_prob': 0.0}\n",
-    "cell_spec = {\n",
-    "    'normal_0_op_x': 'avg_pool_3x3',\n",
-    "    'normal_0_input_x': 0,\n",
-    "    'normal_0_op_y': 'conv_7x1_1x7',\n",
-    "    'normal_0_input_y': 1,\n",
-    "    'normal_1_op_x': 'sep_conv_3x3',\n",
-    "    'normal_1_input_x': 2,\n",
-    "    'normal_1_op_y': 'sep_conv_5x5',\n",
-    "    'normal_1_input_y': 0,\n",
-    "    'normal_2_op_x': 'dil_sep_conv_3x3',\n",
-    "    'normal_2_input_x': 2,\n",
-    "    'normal_2_op_y': 'dil_sep_conv_3x3',\n",
-    "    'normal_2_input_y': 2,\n",
-    "    'normal_3_op_x': 'skip_connect',\n",
-    "    'normal_3_input_x': 4,\n",
-    "    'normal_3_op_y': 'dil_sep_conv_3x3',\n",
-    "    'normal_3_input_y': 4,\n",
-    "    'normal_4_op_x': 'conv_7x1_1x7',\n",
-    "    'normal_4_input_x': 2,\n",
-    "    'normal_4_op_y': 'sep_conv_3x3',\n",
-    "    'normal_4_input_y': 4,\n",
-    "    'normal_concat': [3, 5, 6],\n",
-    "    'reduce_0_op_x': 'avg_pool_3x3',\n",
-    "    'reduce_0_input_x': 0,\n",
-    "    'reduce_0_op_y': 'dil_sep_conv_3x3',\n",
-    "    'reduce_0_input_y': 1,\n",
-    "    'reduce_1_op_x': 'sep_conv_3x3',\n",
-    "    'reduce_1_input_x': 0,\n",
-    "    'reduce_1_op_y': 'sep_conv_3x3',\n",
-    "    'reduce_1_input_y': 0,\n",
-    "    'reduce_2_op_x': 'skip_connect',\n",
-    "    'reduce_2_input_x': 2,\n",
-    "    'reduce_2_op_y': 'sep_conv_7x7',\n",
-    "    'reduce_2_input_y': 0,\n",
-    "    'reduce_3_op_x': 'conv_7x1_1x7',\n",
-    "    'reduce_3_input_x': 4,\n",
-    "    'reduce_3_op_y': 'skip_connect',\n",
-    "    'reduce_3_input_y': 4,\n",
-    "    'reduce_4_op_x': 'conv_7x1_1x7',\n",
-    "    'reduce_4_input_x': 0,\n",
-    "    'reduce_4_op_y': 'conv_7x1_1x7',\n",
-    "    'reduce_4_input_y': 5,\n",
-    "    'reduce_concat': [3, 6]\n",
-    "}\n",
-    "\n",
-    "for t in query_nds_trial_stats('nas_cell', None, None, model_spec, cell_spec, 'cifar10'):\n",
-    "    assert t['config']['model_spec'] == model_spec\n",
-    "    assert t['config']['cell_spec'] == cell_spec\n",
-    "    pprint.pprint(t)"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 10,
-   "metadata": {
-    "tags": []
-   },
-   "outputs": [
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "Use the following architecture as an example:\n",
+        "\n",
+        "![nas-201](../../img/nas-bench-201-example.png)"
+      ]
+    },
    {
-     "output_type": "stream",
-     "name": "stdout",
-     "text": "NDS (amoeba) count: 5107\n"
-    }
-   ],
-   "source": [
-    "# count number\n",
-    "print('NDS (amoeba) count:', len(list(query_nds_trial_stats(None, 'amoeba', None, None, None, None, None))))"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 11,
-   "metadata": {
-    "tags": []
-   },
-   "outputs": [
+      "cell_type": "code",
+      "execution_count": 3,
+      "metadata": {
+        "tags": []
+      },
+      "outputs": [],
+      "source": [
+        "arch = {\n",
+        "    '0_1': 'avg_pool_3x3',\n",
+        "    '0_2': 'conv_1x1',\n",
+        "    '1_2': 'skip_connect',\n",
+        "    '0_3': 'conv_1x1',\n",
+        "    '1_3': 'skip_connect',\n",
+        "    '2_3': 'skip_connect'\n",
+        "}\n",
+        "for t in query_nb201_trial_stats(arch, 200, 'cifar100'):\n",
+        "    pprint.pprint(t)"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "Intermediate results are also available."
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": 4,
+      "metadata": {
+        "tags": []
+      },
+      "outputs": [],
+      "source": [
+        "for t in query_nb201_trial_stats(arch, None, 'imagenet16-120', include_intermediates=True):\n",
+        "    print(t['config'])\n",
+        "    print('Intermediates:', len(t['intermediates']))"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "## NDS"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "Use the following architecture as an example:<br>\n",
+        "![nds](../../img/nas-bench-nds-example.png)\n",
+        "\n",
+        "Here, `bot_muls`, `ds`, `num_gs`, `ss` and `ws` stand for \"bottleneck multipliers\", \"depths\", \"number of groups\", \"strides\" and \"widths\" respectively."
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": 5,
+      "metadata": {
+        "tags": []
+      },
+      "outputs": [],
+      "source": [
+        "model_spec = {\n",
+        "    'bot_muls': [0.0, 0.25, 0.25, 0.25],\n",
+        "    'ds': [1, 16, 1, 4],\n",
+        "    'num_gs': [1, 2, 1, 2],\n",
+        "    'ss': [1, 1, 2, 2],\n",
+        "    'ws': [16, 64, 128, 16]\n",
+        "}\n",
+        "# Use None as a wildcard\n",
+        "for t in query_nds_trial_stats('residual_bottleneck', None, None, model_spec, None, 'cifar10'):\n",
+        "    pprint.pprint(t)"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": 6,
+      "metadata": {
+        "tags": []
+      },
+      "outputs": [],
+      "source": [
+        "model_spec = {\n",
+        "    'bot_muls': [0.0, 0.25, 0.25, 0.25],\n",
+        "    'ds': [1, 16, 1, 4],\n",
+        "    'num_gs': [1, 2, 1, 2],\n",
+        "    'ss': [1, 1, 2, 2],\n",
+        "    'ws': [16, 64, 128, 16]\n",
+        "}\n",
+        "for t in query_nds_trial_stats('residual_bottleneck', None, None, model_spec, None, 'cifar10', include_intermediates=True):\n",
+        "    pprint.pprint(t['intermediates'][:10])"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": 7,
+      "metadata": {
+        "tags": []
+      },
+      "outputs": [],
+      "source": [
+        "model_spec = {'ds': [1, 12, 12, 12], 'ss': [1, 1, 2, 2], 'ws': [16, 24, 24, 40]}\n",
+        "for t in query_nds_trial_stats('residual_basic', 'resnet', 'random', model_spec, {}, 'cifar10'):\n",
+        "    pprint.pprint(t)"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": 8,
+      "metadata": {
+        "tags": []
+      },
+      "outputs": [],
+      "source": [
+        "# get the first one\n",
+        "pprint.pprint(next(query_nds_trial_stats('vanilla', None, None, None, None, None)))"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": 9,
+      "metadata": {
+        "tags": []
+      },
+      "outputs": [],
+      "source": [
+        "# query a NAS cell by model_spec and cell_spec\n",
+        "model_spec = {'num_nodes_normal': 5, 'num_nodes_reduce': 5, 'depth': 12, 'width': 32, 'aux': False, 'drop_prob': 0.0}\n",
+        "cell_spec = {\n",
+        "    'normal_0_op_x': 'avg_pool_3x3',\n",
+        "    'normal_0_input_x': 0,\n",
+        "    'normal_0_op_y': 'conv_7x1_1x7',\n",
+        "    'normal_0_input_y': 1,\n",
+        "    'normal_1_op_x': 'sep_conv_3x3',\n",
+        "    'normal_1_input_x': 2,\n",
+        "    'normal_1_op_y': 'sep_conv_5x5',\n",
+        "    'normal_1_input_y': 0,\n",
+        "    'normal_2_op_x': 'dil_sep_conv_3x3',\n",
+        "    'normal_2_input_x': 2,\n",
+        "    'normal_2_op_y': 'dil_sep_conv_3x3',\n",
+        "    'normal_2_input_y': 2,\n",
+        "    'normal_3_op_x': 'skip_connect',\n",
+        "    'normal_3_input_x': 4,\n",
+        "    'normal_3_op_y': 'dil_sep_conv_3x3',\n",
+        "    'normal_3_input_y': 4,\n",
+        "    'normal_4_op_x': 'conv_7x1_1x7',\n",
+        "    'normal_4_input_x': 2,\n",
+        "    'normal_4_op_y': 'sep_conv_3x3',\n",
+        "    'normal_4_input_y': 4,\n",
+        "    'normal_concat': [3, 5, 6],\n",
+        "    'reduce_0_op_x': 'avg_pool_3x3',\n",
+        "    'reduce_0_input_x': 0,\n",
+        "    'reduce_0_op_y': 'dil_sep_conv_3x3',\n",
+        "    'reduce_0_input_y': 1,\n",
+        "    'reduce_1_op_x': 'sep_conv_3x3',\n",
+        "    'reduce_1_input_x': 0,\n",
+        "    'reduce_1_op_y': 'sep_conv_3x3',\n",
+        "    'reduce_1_input_y': 0,\n",
+        "    'reduce_2_op_x': 'skip_connect',\n",
+        "    'reduce_2_input_x': 2,\n",
+        "    'reduce_2_op_y': 'sep_conv_7x7',\n",
+        "    'reduce_2_input_y': 0,\n",
+        "    'reduce_3_op_x': 'conv_7x1_1x7',\n",
+        "    'reduce_3_input_x': 4,\n",
+        "    'reduce_3_op_y': 'skip_connect',\n",
+        "    'reduce_3_input_y': 4,\n",
+        "    'reduce_4_op_x': 'conv_7x1_1x7',\n",
+        "    'reduce_4_input_x': 0,\n",
+        "    'reduce_4_op_y': 'conv_7x1_1x7',\n",
+        "    'reduce_4_input_y': 5,\n",
+        "    'reduce_concat': [3, 6]\n",
+        "}\n",
+        "\n",
+        "for t in query_nds_trial_stats('nas_cell', None, None, model_spec, cell_spec, 'cifar10'):\n",
+        "    assert t['config']['model_spec'] == model_spec\n",
+        "    assert t['config']['cell_spec'] == cell_spec\n",
+        "    pprint.pprint(t)"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": 10,
+      "metadata": {
+        "tags": []
+      },
+      "outputs": [],
+      "source": [
+        "# count number\n",
+        "print('NDS (amoeba) count:', len(list(query_nds_trial_stats(None, 'amoeba', None, None, None, None, None))))"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "## NLP"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "pycharm": {
+          "metadata": false
+        }
+      },
+      "source": [
+        "Use the following two architectures as examples.\n",
+        "In the paper, an arch is called a \"recipe\" and is stored as a nested structure; in the NNI benchmark it is unnested (flattened).\n",
+        "An arch consists of multiple `Node`, `Node_input_n` and `Node_op` entries; refer to the documentation for more details.\n",
+        "\n",
+        "arch1 : <img src=\"../../img/nas-bench-nlp-example1.jpeg\" width=400 height=300 /> \n",
+        "\n",
+        "\n",
+        "arch2 : <img src=\"../../img/nas-bench-nlp-example2.jpeg\" width=400 height=300 /> \n",
+        "\n"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": 1,
+      "metadata": {},
+      "outputs": [
+        {
+          "output_type": "stream",
+          "name": "stdout",
+          "text": [
+            "{'config': {'arch': {'h_new_0_input_0': 'node_3',\n                     'h_new_0_input_1': 'node_2',\n                     'h_new_0_input_2': 'node_1',\n                     'h_new_0_op': 'blend',\n                     'node_0_input_0': 'x',\n                     'node_0_input_1': 'h_prev_0',\n                     'node_0_op': 'linear',\n                     'node_1_input_0': 'node_0',\n                     'node_1_op': 'activation_tanh',\n                     'node_2_input_0': 'h_prev_0',\n                     'node_2_input_1': 'node_1',\n                     'node_2_input_2': 'x',\n                     'node_2_op': 'linear',\n                     'node_3_input_0': 'node_2',\n                     'node_3_op': 'activation_leaky_relu'},\n            'dataset': 'ptb',\n            'id': 20003},\n 'id': 16291,\n 'test_loss': 4.680262297102549,\n 'train_loss': 4.132040537087838,\n 'training_time': 177.05208373069763,\n 'val_loss': 4.707944253177966}\n"
+          ]
+        }
+      ],
+      "source": [
+        "import pprint\n",
+        "from nni.nas.benchmarks.nlp import query_nlp_trial_stats\n",
+        "\n",
+        "arch1 = {'h_new_0_input_0': 'node_3', 'h_new_0_input_1': 'node_2', 'h_new_0_input_2': 'node_1', 'h_new_0_op': 'blend', 'node_0_input_0': 'x', 'node_0_input_1': 'h_prev_0', 'node_0_op': 'linear','node_1_input_0': 'node_0', 'node_1_op': 'activation_tanh', 'node_2_input_0': 'h_prev_0', 'node_2_input_1': 'node_1', 'node_2_input_2': 'x', 'node_2_op': 'linear', 'node_3_input_0': 'node_2', 'node_3_op': 'activation_leaky_relu'}\n",
+        "for i in query_nlp_trial_stats(arch=arch1, dataset=\"ptb\"):\n",
+        "    pprint.pprint(i)"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": 6,
+      "metadata": {},
+      "outputs": [
+        {
+          "output_type": "stream",
+          "name": "stdout",
+          "text": [
+            "[{'current_epoch': 46,\n  'id': 1796,\n  'test_loss': 6.233430054978619,\n  'train_loss': 6.4866799231542664,\n  'training_time': 146.5680329799652,\n  'val_loss': 6.326836978687959},\n {'current_epoch': 47,\n  'id': 1797,\n  'test_loss': 6.2402057403023825,\n  'train_loss': 6.485401405247535,\n  'training_time': 146.05511450767517,\n  'val_loss': 6.3239741605870865},\n {'current_epoch': 48,\n  'id': 1798,\n  'test_loss': 6.351145308363877,\n  'train_loss': 6.611281181173992,\n  'training_time': 145.8849437236786,\n  'val_loss': 6.436160816865809},\n {'current_epoch': 49,\n  'id': 1799,\n  'test_loss': 6.227155079159031,\n  'train_loss': 6.473414458249545,\n  'training_time': 145.51414465904236,\n  'val_loss': 6.313294354607077}]\n"
+          ]
+        }
+      ],
+      "source": [
+        "arch2 = {\"h_new_0_input_0\":\"node_0\",\"h_new_0_input_1\":\"node_1\",\"h_new_0_op\":\"elementwise_sum\",\"node_0_input_0\":\"x\",\"node_0_input_1\":\"h_prev_0\",\"node_0_op\":\"linear\",\"node_1_input_0\":\"node_0\",\"node_1_op\":\"activation_tanh\"}\n",
+        "for i in query_nlp_trial_stats(arch=arch2, dataset='wikitext-2', include_intermediates=True):\n",
+        "    pprint.pprint(i['intermediates'][45:49])"
+      ]
+    },
    {
-     "output_type": "stream",
-     "name": "stdout",
-     "text": "Elapsed time:  2.2023813724517822 seconds\n"
+      "cell_type": "code",
+      "execution_count": 4,
+      "metadata": {
+        "pycharm": {},
+        "tags": []
+      },
+      "outputs": [
+        {
+          "output_type": "stream",
+          "name": "stdout",
+          "text": [
+            "Elapsed time:  5.60982608795166 seconds\n"
+          ]
+        }
+      ],
+      "source": [
+        "print('Elapsed time: ', time.time() - ti, 'seconds')"
+      ]
    }
-   ],
-   "source": [
-    "print('Elapsed time: ', time.time() - ti, 'seconds')"
-   ]
-  }
- ],
- "metadata": {
-  "language_info": {
-   "name": "python",
-   "codemirror_mode": {
-    "name": "ipython",
+  ],
+  "metadata": {
+    "file_extension": ".py",
+    "kernelspec": {
+      "display_name": "Python 3",
+      "language": "python",
+      "name": "python3"
+    },
+    "language_info": {
+      "codemirror_mode": {
+        "name": "ipython",
+        "version": 3
+      },
+      "name": "python",
+      "version": "3.8.5-final"
+    },
+    "mimetype": "text/x-python",
+    "name": "python",
+    "npconvert_exporter": "python",
+    "orig_nbformat": 2,
+    "pygments_lexer": "ipython3",
    "version": 3
-   },
-   "version": "3.6.10-final"
  },
-  "orig_nbformat": 2,
-  "file_extension": ".py",
-  "mimetype": "text/x-python",
-  "name": "python",
-  "npconvert_exporter": "python",
-  "pygments_lexer": "ipython3",
-  "version": 3,
-  "kernelspec": {
-   "name": "python361064bitnnilatestcondabff8d66a619a4d26af34fe0fe687c7b0",
-   "display_name": "Python 3.6.10 64-bit ('nnilatest': conda)"
-  }
- },
- "nbformat": 4,
- "nbformat_minor": 2
+  "nbformat": 4,
+  "nbformat_minor": 2
 }
\ No newline at end of file
--- a/docs/img/nas-bench-nlp-example1.jpeg
+++ b/docs/img/nas-bench-nlp-example1.jpeg
--- a/docs/img/nas-bench-nlp-example2.jpeg
+++ b/docs/img/nas-bench-nlp-example2.jpeg
--- a/examples/nas/benchmarks/.gitignore
+++ b/examples/nas/benchmarks/.gitignore
@@ -2,3 +2,4 @@ nasbench_full.tfrecord
 a.pth
 data.zip
 nds_data
+nlp_data
\ No newline at end of file
--- a/examples/nas/benchmarks/nlp.requirements.txt
+++ b/examples/nas/benchmarks/nlp.requirements.txt
+peewee
--- a/examples/nas/benchmarks/nlp.sh
+++ b/examples/nas/benchmarks/nlp.sh
+#!/bin/bash
+set -e
+
+if [ -z "${NASBENCHMARK_DIR}" ]; then
+    NASBENCHMARK_DIR=~/.nni/nasbenchmark
+fi
+
+
+mkdir -p nlp_data
+cd nlp_data
+echo "Downloading NLP[1/3] wikitext2_data.zip..."
+if [ -f "wikitext2_data.zip" ]; then
+    echo "wikitext2_data.zip found. Skip download."
+else
+    wget -O wikitext2_data.zip https://github.com/fmsnew/nas-bench-nlp-release/blob/master/train_logs_wikitext-2/logs.zip?raw=true
+fi
+echo "Downloading NLP[2/3] ptb_single_run_data.zip..."
+if [ -f "ptb_single_run_data.zip" ]; then
+    echo "ptb_single_run_data.zip found. Skip download."
+else
+    wget -O ptb_single_run_data.zip https://github.com/fmsnew/nas-bench-nlp-release/blob/master/train_logs_single_run/logs.zip?raw=true
+fi
+echo "Downloading NLP[3/3] ptb_multi_runs_data.zip..."
+if [ -f "ptb_multi_runs_data.zip" ]; then
+    echo "ptb_multi_runs_data.zip found. Skip download."
+else
+    wget -O ptb_multi_runs_data.zip https://github.com/fmsnew/nas-bench-nlp-release/blob/master/train_logs_multi_runs/logs.zip?raw=true
+fi
+echo "### There exist duplicate log files in ptb_single_run_data.zip and ptb_multi_runs_data.zip; you can ignore all or replace all ###"
+unzip -q wikitext2_data.zip
+unzip -q ptb_single_run_data.zip
+unzip -q ptb_multi_runs_data.zip
+cd ..
+
+echo "Generating database..."
+rm -f ${NASBENCHMARK_DIR}/nlp.db ${NASBENCHMARK_DIR}/nlp.db-journal
+mkdir -p ${NASBENCHMARK_DIR}
+python3 -m nni.nas.benchmarks.nlp.db_gen nlp_data
+rm -rf nlp_data
--- a/nni/nas/benchmarks/nlp/__init__.py
+++ b/nni/nas/benchmarks/nlp/__init__.py
+from .model import NlpTrialStats, NlpIntermediateStats, NlpTrialConfig
+from .query import query_nlp_trial_stats
+
+
--- a/nni/nas/benchmarks/nlp/db_gen.py
+++ b/nni/nas/benchmarks/nlp/db_gen.py
+import json
+import os
+import argparse
+import tqdm
+
+from .model import db, NlpTrialConfig, NlpTrialStats, NlpIntermediateStats
+
+def main():
+    parser = argparse.ArgumentParser()
+    parser.add_argument('input_dir', help='Path to extracted NLP data dir.')
+    args = parser.parse_args()
+    with db, tqdm.tqdm(total=len(os.listdir(args.input_dir)), desc="creating tables") as pbar:
+        db.create_tables([NlpTrialConfig, NlpTrialStats, NlpIntermediateStats])
+        json_files = os.listdir(args.input_dir)
+        for json_file in json_files:
+            pbar.update(1)
+            if json_file.endswith('.json'):
+                log_path = os.path.join(args.input_dir, json_file)
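+                # Close the log file promptly instead of leaking the handle
+                with open(log_path, 'r') as f:
+                    cur = json.load(f)
+                # 'recepie' (sic) is the key this script reads from each log file
+                arch = json.loads(cur['recepie'])
+                unnested_arch = {}
+                for k in arch.keys():
+                    unnested_arch['{}_op'.format(k)] = arch[k]['op']
+                    for i in range(len(arch[k]['input'])):
+                        unnested_arch['{}_input_{}'.format(k, i)] = arch[k]['input'][i]
+                # Drop the first 5 characters of cur['data'] to get the dataset name
+                config = NlpTrialConfig.create(arch=unnested_arch, dataset=cur['data'][5:])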
+                if cur['status'] == 'OK':
+                    trial_stats = NlpTrialStats.create(config=config, train_loss=cur['train_losses'][-1], val_loss=cur['val_losses'][-1],
+                                                       test_loss=cur['test_losses'][-1], training_time=cur['wall_times'][-1])
+                    epochs = 50
+                    intermediate_stats = []
+                    for epoch in range(epochs):
+                        epoch_res = {
+                            'train_loss' : cur['train_losses'][epoch],
+                            'val_loss' : cur['val_losses'][epoch],
+                            'test_loss' : cur['test_losses'][epoch],
+                            'training_time' : cur['wall_times'][epoch]
+                        }
+                        epoch_res.update(current_epoch=epoch + 1, trial=trial_stats)
+                        intermediate_stats.append(epoch_res)
+                    NlpIntermediateStats.insert_many(intermediate_stats).execute(db)
+
+
+if __name__ == '__main__':
+    main()
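
The flattening loop in `db_gen.py` can be tried in isolation. The following is a minimal sketch, assuming each nested entry carries an `op` string and an `input` list, as the loop above implies. `unnest_recipe` is a hypothetical helper name (not part of NNI), and the sample input is hand-written to mirror `arch2` from the notebook rather than taken from the benchmark logs.

```python
import json


def unnest_recipe(recipe_json):
    """Flatten a nested recipe into flat `<node>_op` / `<node>_input_<i>` keys.

    Hypothetical helper for illustration only; it is not an NNI API.
    """
    recipe = json.loads(recipe_json)
    flat = {}
    for node, spec in recipe.items():
        flat['{}_op'.format(node)] = spec['op']
        for i, src in enumerate(spec['input']):
            flat['{}_input_{}'.format(node, i)] = src
    return flat


# Hand-written sample (an assumption about the nested layout, not real log data).
sample = json.dumps({
    'node_0': {'op': 'linear', 'input': ['x', 'h_prev_0']},
    'node_1': {'op': 'activation_tanh', 'input': ['node_0']},
    'h_new_0': {'op': 'elementwise_sum', 'input': ['node_0', 'node_1']},
})
flat_arch = unnest_recipe(sample)
print(flat_arch)
```

The resulting dict has the same flat shape that `query_nlp_trial_stats` expects for its `arch` argument.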
--- a/nni/nas/benchmarks/nlp/model.py
+++ b/nni/nas/benchmarks/nlp/model.py
+import os
+
+from peewee import CharField, FloatField, ForeignKeyField, IntegerField, Model
+from playhouse.sqlite_ext import JSONField, SqliteExtDatabase
+
+from nni.nas.benchmarks.utils import json_dumps
+from nni.nas.benchmarks.constants import DATABASE_DIR
+
+db = SqliteExtDatabase(os.path.join(DATABASE_DIR, 'nlp.db'), autoconnect=True)
+
+class NlpTrialConfig(Model):
+    """
+    Trial config for NLP. epoch_num is fixed at 50.
+
+    Attributes
+    ----------
+    arch: dict
+        Known as ``recepie`` in the NAS-NLP-Benchmark repo (https://github.com/fmsnew/nas-bench-nlp-release).
+        An arch consists of multiple ``Node``, ``Node_input_n`` and ``Node_op`` entries.
+        ``Node`` can be ``node_n``, ``h_new_n``, ``f/i/o/j(_act)``, etc. (``n`` is an integer and need not be consecutive).
+        ``Node_input_n`` can be a ``Node``, ``x``, etc.
+        ``Node_op`` can be ``linear``, ``activation_sigm``, ``activation_tanh``, ``elementwise_prod``,
+        ``elementwise_sum``, ``activation_leaky_relu``, etc.
+        e.g., {"h_new_0_input_0":"node_3","h_new_0_input_1":"x","h_new_0_op":"linear","node_2_input_0":"x",
+        "node_2_input_1":"h_prev_0","node_2_op":"linear","node_3_input_0":"node_2","node_3_op":"activation_leaky_relu"}
+    dataset: str
+        Dataset used. Could be ``ptb`` or ``wikitext-2``.
+    """
+    arch = JSONField(json_dumps=json_dumps, index=True)
+    dataset = CharField(max_length=15, index=True, choices=[
+        'ptb',
+        'wikitext-2'
+    ])
+
+    class Meta:
+        database = db
+
+class NlpTrialStats(Model):
+    """
+    Computation statistics for NAS-NLP-Benchmark.
+    Each corresponds to one trial result after 50 epochs.
+
+    Attributes
+    ----------
+    config : NlpTrialConfig
+        Corresponding config for trial.
+    train_loss : float or None
+        Final loss on training data. Could be NaN (None).
+    val_loss : float or None
+        Final loss on validation data. Could be NaN (None).
+    test_loss : float or None
+        Final loss on test data. Could be NaN (None).
+    training_time : float
+        Time elapsed in seconds. aka wall_time in the NAS-NLP-Benchmark repo.
+    """
+    config = ForeignKeyField(NlpTrialConfig, backref='trial_stats', index=True)
+    train_loss = FloatField(null=True)
+    val_loss = FloatField(null=True)
+    test_loss = FloatField(null=True)
+    training_time = FloatField(null=True)
+
+    class Meta:
+        database = db
+
+class NlpIntermediateStats(Model):
+    """
+    Computation statistics for NAS-NLP-Benchmark.
+    Each corresponds to the result of one trial at one epoch (epochs 1 to 50).
+
+    Attributes
+    ----------
+    trial : NlpTrialStats
+        Corresponding trial stats.
+    current_epoch : int
+        Epoch number, from 1 to 50.
+    train_loss : float or None
+        Loss on training data at this epoch. Could be NaN (None).
+    val_loss : float or None
+        Loss on validation data at this epoch. Could be NaN (None).
+    test_loss : float or None
+        Loss on test data at this epoch. Could be NaN (None).
+    training_time : float
+        Time elapsed in seconds. aka wall_time in the NAS-NLP-Benchmark repo.
+    """
+    trial = ForeignKeyField(NlpTrialStats, backref='intermediates', index=True)
+    current_epoch = IntegerField(index=True)
+    train_loss = FloatField(null=True)
+    val_loss = FloatField(null=True)
+    test_loss = FloatField(null=True)
+    training_time = FloatField(null=True)
+
+    class Meta:
+        database = db
+    
\ No newline at end of file
--- a/nni/nas/benchmarks/nlp/query.py
+++ b/nni/nas/benchmarks/nlp/query.py
+import functools
+
+from peewee import fn
+from playhouse.shortcuts import model_to_dict
+from .model import NlpTrialStats, NlpTrialConfig
+
+def query_nlp_trial_stats(arch, dataset, reduction=None, include_intermediates=False):
+    """
+    Query trial stats of the NLP benchmark given conditions, including the config (arch + dataset) and training results after 50 epochs.
+
+    Parameters
+    ----------
+    arch : dict or None
+        If a dict, it is in the format that is described in
+        :class:`nni.nas.benchmarks.nlp.NlpTrialConfig`. Only trial stats matched will be returned.
+        If None, all architectures in the database will be matched.
+    dataset : str or None
+        If specified, can be one of the datasets available in :class:`nni.nas.benchmarks.nlp.NlpTrialConfig`.
+        Otherwise a wildcard.
+    reduction : str or None
+        If 'none' or None, all trial stats will be returned directly.
+        If 'mean', fields in trial stats will be averaged given the same trial config.
+        Please note that some trial configs have multiple runs which make "reduction" meaningful, while some may not.
+    include_intermediates : boolean
+        If true, intermediate results will be returned.
+
+    Returns
+    -------
+    generator of dict
+        A generator of :class:`nni.nas.benchmarks.nlp.NlpTrialStats` objects,
+        where each of them has been converted into a dict.
+    """
+    fields = []
+    if reduction == 'none':
+        reduction = None
+    if reduction == 'mean':
+        for field_name in NlpTrialStats._meta.sorted_field_names:
+            if field_name not in ['id', 'config']:
+                fields.append(fn.AVG(getattr(NlpTrialStats, field_name)).alias(field_name))
+    elif reduction is None:
+        fields.append(NlpTrialStats)
+    else:
+        raise ValueError('Unsupported reduction: \'%s\'' % reduction)
+    query = NlpTrialStats.select(*fields, NlpTrialConfig).join(NlpTrialConfig)
+
+    conditions = []
+    if arch is not None:
+        conditions.append(NlpTrialConfig.arch == arch)
+    if dataset is not None:
+        conditions.append(NlpTrialConfig.dataset == dataset)
+
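+    # Only filter when at least one condition is given; reduce() fails on an empty list
+    if conditions:
+        query = query.where(functools.reduce(lambda a, b: a & b, conditions))
+    # Average per trial config rather than over all matched rows
+    if reduction is not None:
+        query = query.group_by(NlpTrialStats.config)
+    for trial in query: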
+        if include_intermediates:
+            data = model_to_dict(trial)
+            # exclude 'trial' from intermediates as it is already available in data
+            data['intermediates'] = [
+                {k: v for k, v in model_to_dict(t).items() if k != 'trial'} for t in trial.intermediates
+            ]
+            yield data
+        else:
+            yield model_to_dict(trial)
\ No newline at end of file
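
`query_nlp_trial_stats` documents `None` as a wildcard for both `arch` and `dataset`, and it combines its filters with `functools.reduce`. Python's `functools.reduce` raises `TypeError` on an empty sequence when no initial value is given, so the all-wildcard case needs a guard. The following is a minimal sketch of such a guard. `combine_conditions` is a hypothetical helper (not part of NNI or peewee), and plain Python sets stand in for query expressions, since both support `&`.

```python
import functools
import operator


def combine_conditions(conditions):
    """AND a list of filter expressions together; return None for an empty list.

    Hypothetical helper for illustration; not part of NNI or peewee.
    """
    if not conditions:
        return None  # no filter at all: the caller should skip .where()
    return functools.reduce(operator.and_, conditions)


# Sets stand in for query expressions here: `&` is set intersection.
both = combine_conditions([{1, 2, 3}, {2, 3, 4}])
nothing = combine_conditions([])

# Without the guard, reduce() on an empty list has no value to return.
try:
    functools.reduce(operator.and_, [])
    unguarded_ok = True
except TypeError:
    unguarded_ok = False

print(both, nothing, unguarded_ok)
```

A caller would then apply the filter only when the helper returns something other than `None`.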