TextNAS
=======

Introduction
------------

This is an implementation of the TextNAS algorithm proposed in the paper `TextNAS: A Neural Architecture Search Space tailored for Text Representation <https://arxiv.org/pdf/1912.10729.pdf>`__. TextNAS is a neural architecture search algorithm tailored for text representation. It is based on a novel search space consisting of operators widely adopted for various NLP tasks, and it supports multi-path ensembles within a single network to balance the width and depth of the architecture.

The search space of TextNAS contains:

* 1-D convolutional operator with filter size 1, 3, 5, 7
* recurrent operator (bi-directional GRU)
* self-attention operator
* pooling operator (max/average)


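As a rough illustration, the operator set above can be written down as plain data from which a controller samples one operator per layer. This is a sketch only, not the repository's implementation; ``CANDIDATE_OPS`` and ``sample_architecture`` are hypothetical names.

.. code-block:: python

   import random

   # Hypothetical encoding of the TextNAS candidate operators listed above.
   CANDIDATE_OPS = (
       ["conv1d_k%d" % k for k in (1, 3, 5, 7)]  # 1-D convolutions, filter sizes 1/3/5/7
       + ["bi_gru", "self_attention", "max_pool", "avg_pool"]
   )

   def sample_architecture(num_layers, rng):
       """Pick one operator per layer, the way a controller might sample a network."""
       return [rng.choice(CANDIDATE_OPS) for _ in range(num_layers)]

   arch = sample_architecture(4, random.Random(0))
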
Following the ENAS algorithm, TextNAS also utilizes parameter sharing to accelerate the search and adopts a reinforcement-learning controller for architecture sampling and generation. Please refer to the paper for more details on TextNAS.

Preparation
-----------

Prepare the word vectors and the SST dataset, and organize them in the ``data`` directory as shown below:

.. code-block:: bash

   textnas
   ├── data
   │   ├── sst
   │   │   └── trees
   │   │       ├── dev.txt
   │   │       ├── test.txt
   │   │       └── train.txt
   │   └── glove.840B.300d.txt
   ├── dataloader.py
   ├── model.py
   ├── ops.py
   ├── README.md
   ├── search.py
   └── utils.py

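Each line of ``train.txt``, ``dev.txt``, and ``test.txt`` is a PTB-style parse tree whose root node carries the sentence-level sentiment label (0-4) and whose leaves are the tokens. A minimal sketch of reading one line follows; ``parse_sst_line`` is a hypothetical helper for illustration, and the repository's own ``dataloader.py`` may do this differently.

.. code-block:: python

   import re

   def parse_sst_line(line):
       """Return (root sentiment label, tokens) for one SST tree line."""
       root_label = int(line.strip()[1])  # the digit right after the opening paren
       # every leaf looks like "(<label> <word>)"; grab the word part
       tokens = re.findall(r"\(\d ([^()]+)\)", line)
       return root_label, tokens

For example, ``parse_sst_line("(3 (2 It) (4 (4 rocks) (2 .)))")`` yields label ``3`` and the tokens ``["It", "rocks", "."]``.
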
The following links might be helpful for finding and downloading the corresponding datasets:


* `GloVe: Global Vectors for Word Representation <https://nlp.stanford.edu/projects/glove/>`__

  * `glove.840B.300d.txt <http://nlp.stanford.edu/data/glove.840B.300d.zip>`__

* `Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank <https://nlp.stanford.edu/sentiment/>`__

  * `trainDevTestTrees_PTB.zip <https://nlp.stanford.edu/sentiment/trainDevTestTrees_PTB.zip>`__
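
Each line of ``glove.840B.300d.txt`` stores a word followed by its 300-dimensional vector, separated by spaces. A hedged sketch of loading it is below; ``load_glove`` and its ``dim``/``vocab`` parameters are illustrative names, not the repository's API.

.. code-block:: python

   def load_glove(path, dim=300, vocab=None):
       """Load word vectors from a whitespace-separated GloVe .txt file."""
       vectors = {}
       with open(path, encoding="utf-8") as f:
           for line in f:
               parts = line.rstrip("\n").split(" ")
               # glove.840B.300d contains a few tokens that themselves include
               # spaces, so peel the vector off from the right, not the left
               word = " ".join(parts[:-dim])
               if vocab is None or word in vocab:
                   vectors[word] = [float(v) for v in parts[-dim:]]
       return vectors

Passing a ``vocab`` set keeps memory bounded by loading only the words that actually occur in the dataset.
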

Examples
--------

Search Space
^^^^^^^^^^^^

:githublink:`Example code <examples/nas/textnas>`

.. code-block:: bash

   # In case the NNI code is not cloned yet. If it is already cloned, skip this line.
   git clone https://github.com/Microsoft/nni.git

   # enter the code folder
   cd examples/nas/textnas

   # view more options for search
   python3 search.py -h

After each search epoch, 10 sampled architectures are tested directly. Their performance is expected to reach 40% to 42% after 10 epochs.

By default, 20 sampled architectures are exported into the ``checkpoints`` directory for the next step.

Retrain
^^^^^^^

.. code-block:: bash

   # In case the NNI code is not cloned yet. If it is already cloned, skip this line.
   git clone https://github.com/Microsoft/nni.git

   # enter the code folder
   cd examples/nas/textnas

   # retrain on sst-2 by default
   sh run_retrain.sh

Reference
---------

TextNAS directly uses ``EnasTrainer``; please refer to `ENAS <./ENAS.rst>`__ for the trainer APIs.