Commit eeab424e authored by Sinan Tan, committed by xuehui

Add more detailed introduction about the evolution algorithm.

parent 08a75265
@@ -85,6 +85,116 @@ nnictl create --config ~/nni/examples/trials/ga_squad/config.yaml
# Technical details about the trial
## How does it work
Like the other examples, the evolution-algorithm-based architecture for question answering has two parts: the trial and the tuner.
### The trial
The trial consists of many files, functions, and classes. Here we give only a brief introduction to most of those files:
* `attention.py` contains an implementation of the attention mechanism in TensorFlow.
* `data.py` contains functions for data preprocessing.
* `evaluate.py` contains the evaluation script.
* `graph.py` contains the definition of the computation graph.
* `rnn.py` contains an implementation of GRU in TensorFlow.
* `train_model.py` is a wrapper for the whole question answering model.
Among the trial's files, `trial.py` and `graph_to_tf.py` are special.
`graph_to_tf.py` has a function named `graph_to_network`. Here is its skeleton code:
```
def graph_to_network(input1,
                     input2,
                     input1_lengths,
                     input2_lengths,
                     graph,
                     dropout_rate,
                     is_training,
                     num_heads=1,
                     rnn_units=256):
    topology = graph.is_topology()
    layers = dict()
    layers_sequence_lengths = dict()
    num_units = input1.get_shape().as_list()[-1]
    layers[0] = input1*tf.sqrt(tf.cast(num_units, tf.float32)) + \
        positional_encoding(input1, scale=False, zero_pad=False)
    layers[1] = input2*tf.sqrt(tf.cast(num_units, tf.float32))
    layers[0] = dropout(layers[0], dropout_rate, is_training)
    layers[1] = dropout(layers[1], dropout_rate, is_training)
    layers_sequence_lengths[0] = input1_lengths
    layers_sequence_lengths[1] = input2_lengths
    for _, topo_i in enumerate(topology):
        if topo_i == '|':
            continue
        if graph.layers[topo_i].graph_type == LayerType.input.value:
            # ......
        elif graph.layers[topo_i].graph_type == LayerType.attention.value:
            # ......
        # More layers to handle
```
As we can see, this function is actually a compiler that converts the internal model DAG configuration `graph` (which is introduced in the `Model configuration format` section) into a TensorFlow computation graph.
```
topology = graph.is_topology()
```
performs topological sorting on the internal graph representation, and the code inside the loop:
```
for _, topo_i in enumerate(topology):
```
performs the actual conversion, mapping each layer to a part of the TensorFlow computation graph.
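The two steps described above (sort the layers topologically, then build each layer from its already-built inputs) can be sketched without TensorFlow. This is only an illustration under stated assumptions: `topological_order`, `compile_graph`, the edge-list representation, and the toy layer names are hypothetical and are not taken from `graph.py` or `graph_to_tf.py`. The sort shown is Kahn's algorithm, which may differ from what `graph.is_topology()` does internally.

```python
from collections import deque


def topological_order(edges, num_nodes):
    """Return node ids so that each node comes after all of its inputs (Kahn's algorithm)."""
    indegree = [0] * num_nodes
    outputs = [[] for _ in range(num_nodes)]
    for src, dst in edges:
        outputs[src].append(dst)
        indegree[dst] += 1
    # Nodes without inputs can be emitted immediately.
    ready = deque(i for i in range(num_nodes) if indegree[i] == 0)
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for nxt in outputs[node]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)
    if len(order) != num_nodes:
        raise ValueError("graph contains a cycle")
    return order


def compile_graph(layer_types, edges):
    """Visit layers in topological order and 'build' each one from its built inputs.

    A real compiler would create tensors here; this toy version builds strings.
    """
    inputs = {i: [] for i in range(len(layer_types))}
    for src, dst in edges:
        inputs[dst].append(src)
    built = {}
    for node in topological_order(edges, len(layer_types)):
        args = ", ".join(built[i] for i in inputs[node])
        built[node] = f"{layer_types[node]}({args})"
    return built


# Hypothetical toy DAG: two inputs feed an attention layer, then an RNN, then the output.
layer_types = ["input", "input", "attention", "rnn", "output"]
edges = [(0, 2), (1, 2), (2, 3), (3, 4)]
compiled = compile_graph(layer_types, edges)
print(compiled[4])
```

Because every layer is visited only after its inputs, the lookup `built[i]` never fails for an acyclic graph. That ordering guarantee is the reason the conversion loop runs over the sorted topology rather than over the layers in storage order.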
### The tuner
The tuner is much simpler than the trial. The two actually share the same `graph.py`. In addition, the tuner has a `customer_tuner.py`, whose most important class is `CustomerTuner`:
```
class CustomerTuner(Tuner):
    # ......
    def generate_parameters(self, parameter_id):
        """Returns a set of trial graph config, as a serializable object.
        parameter_id : int
        """
        if len(self.population) <= 0:
            logger.debug("the len of population lower than zero.")
            raise Exception('The population is empty')
        pos = -1
        for i in range(len(self.population)):
            if self.population[i].result is None:
                pos = i
                break
        if pos != -1:
            indiv = copy.deepcopy(self.population[pos])
            self.population.pop(pos)
            temp = json.loads(graph_dumps(indiv.config))
        else:
            random.shuffle(self.population)
            if self.population[0].result > self.population[1].result:
                self.population[0] = self.population[1]
            indiv = copy.deepcopy(self.population[0])
            self.population.pop(1)
            indiv.mutation()
            graph = indiv.config
            temp = json.loads(graph_dumps(graph))
        # ......
```
As we can see, the overridden method `generate_parameters` implements a pretty naive mutation algorithm. The code lines:
```
            if self.population[0].result > self.population[1].result:
                self.population[0] = self.population[1]
            indiv = copy.deepcopy(self.population[0])
```
control the mutation process. Each time, the tuner takes two random individuals from the population, keeps only the one with the better result, and mutates a copy of it.
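The selection step can be tried out in isolation with a small, self-contained sketch. Everything here is an assumption made for illustration: the `Individual` class, its toy `mutation`, the `next_individual` function, and the `higher_is_better` flag are hypothetical stand-ins, not the classes from `customer_tuner.py`. In particular, which direction counts as the "better" result is a parameter of the sketch rather than something stated in the text above.

```python
import copy
import random


class Individual:
    """Hypothetical population member: a config plus an optional score."""

    def __init__(self, config, result=None):
        self.config = config
        self.result = result

    def mutation(self):
        # Toy mutation: append one random layer type. The mutated config
        # has not been evaluated yet, so its result is reset.
        self.config = self.config + [random.choice(["attention", "rnn"])]
        self.result = None


def next_individual(population, higher_is_better=True):
    """Pick the next config to try, following the two branches described above."""
    if not population:
        raise ValueError("The population is empty")
    # Branch 1: hand out any individual that has not been evaluated yet.
    for pos, indiv in enumerate(population):
        if indiv.result is None:
            return population.pop(pos)
    # Branch 2: draw two distinct random individuals (needs at least two),
    # drop the loser, and mutate a copy of the winner.
    first, second = random.sample(range(len(population)), 2)
    a, b = population[first], population[second]
    a_wins = a.result >= b.result if higher_is_better else a.result <= b.result
    winner, loser = (first, second) if a_wins else (second, first)
    child = copy.deepcopy(population[winner])
    population.pop(loser)
    child.mutation()
    return child


random.seed(0)
population = [
    Individual(["rnn"], 0.3),
    Individual(["attention"], 0.7),
    Individual(["rnn", "rnn"], 0.5),
]
child = next_individual(population)
print(len(population), child.config, child.result)
```

In this sketch the loser leaves the population while the winner stays, and the mutated copy is returned without a result. Putting the copy back into the population once it has been scored is left to the caller and is not shown here.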
## Model configuration format
Here is an example of the model configuration, which is passed from the tuner to the trial in the architecture search procedure.
......