# Gated Graph Neural Network (GGNN)
- Paper link: https://arxiv.org/pdf/1511.05493.pdf
## Dependencies
- PyTorch 1.0+
- DGL 0.3.1+
## GGNN implemented in dgl
In dgl, GGNN is implemented as module `GatedGraphConv`, it can be imported as follows:
```python
from dgl.nn.pytorch import GatedGraphConv
```
## Solving bAbI tasks
In this example, we use GGNN to solve some of the [bAbI](https://github.com/facebook/bAbI-tasks)
tasks solved in the paper.
#### Overview of bAbI tasks
bAbI is a set of question answering tasks that require a system to do multi-step reasoning.
Datasets of bAbI tasks are generated by templates, which can be natural language or symbolic
form. In this example, we follow the paper to generate the datasets using symbolic form.
There are 20 tasks in bAbI, in this example, we follow the paper to do task 4, 15, 16, 18 and 19.
#### Task 4: Two argument relations: subject vs. object
An example of task 4 is as follows
```
1 C e A
2 A e B
3 eval A w C
```
A, B, C are nodes; e, w are edges, there are totally four kinds of edges: `n, s, w, e`, which can
be viewed as north, south, west, east.
The first two lines are conditions, and the third line are the question and answer.
So the explanation of the example is:
```
1 Go east from C, we can reach A
2 Go east from A, we can reach B
3 Question: where can we reach if we go west from A? Answer: C
```
If we represent the conditions using a graph, we can view this task as a `Node Selection` task.
For different edges in questions, we view them as different question types, we train
separate models for each question type. The module for solving node selection tasks is
implemented in `ggnn_ns.py`.
For four question types `n, s, w, e`, we assign a question id for them ranging from 0 to 3.
For each question id, run the following commands for training and testing:
```bash
python train_ns.py --task_id=4 --question_id=0 --train_num=50 --epochs=10
python train_ns.py --task_id=4 --question_id=1 --train_num=50 --epochs=10
python train_ns.py --task_id=4 --question_id=2 --train_num=50 --epochs=10
python train_ns.py --task_id=4 --question_id=3 --train_num=50 --epochs=10
```
The training file name `train_ns` means training node selection. `train_num` means the number of
training examples used.
#### Task 15: Basic deduction
Task 15 is similar to task 4, it's also a Node Selection task. An example is shown below:
```
1 I has_fear C
2 H is C
3 G is I
4 A is B
5 E has_fear C
6 C has_fear I
7 B has_fear C
8 F is E
9 eval H has_fear I
```
There are two types of edges in this task: `is, has_fear`. There is only one question type in
this task: `has_fear`, we assign question id `1` for it.
Run the following command for training and testing:
```bash
python train_ns.py --task_id=15 --question_id=1 --train_num=50 --epochs=15 --lr=1e-2
```
#### Task 16: Basic induction
Task 16 is similar to task 15. An example of task 16 is shown below
```
1 J has_color F
2 K has_color I
3 A has_color I
4 G is D
5 J is C
6 H has_color I
7 H is D
8 A is D
9 K is D
10 eval G has_color I
```
There are two types of edges in this task: `is, has_color`. There is only one question type in
this task: `has_color`, we assign question id `1` for it.
Run the following command for training and testing:
```bash
python train_ns.py --task_id=16 --question_id=1 --train_num=50 --epochs=20 --lr=1e-2
```
#### Task 18: Reasoning about size
Task 18 is a `Graph Classification` task, an example is shown below:
```
1 G > B
2 G > D
3 E > F
4 E > A
5 B > A
6 E > B
7 eval G < A false
```
Line 1 to line 6 give some conditions for comparision of the size of entities, line 7 is the
question, asking whether `G < A` is `true` or `false`. So the input is a graph, the output is a
binary classification result. We view it as a `Graph Classification` task.
Following the paper, we use GGNN to encode the graph, followed by a `GlobalAttentionPooling`
layer to pool the graph into a hidden vector, which is used to classify the graph.
The module for solving graph classification tasks is implemented in `ggnn_gc.py`.
There are two types of edges in this task: `>, <`, and so are the question types. We assign
question ids `0, 1` to them.
Run the following commands for training and testing:
```bash
python train_gc.py --task_id=18 --question_id=0 --train_num=50 --batch_size=10 --lr=1e-3 --epochs=20
python train_gc.py --task_id=18 --question_id=1 --train_num=50 --batch_size=10 --lr=1e-3 --epochs=20
```
#### Task 19: Path finding
An example of task 19 is as follows:
```
1 D n A
2 D s E
3 G w D
4 E s B
5 eval path G A w,n
```
Similar to task 4, there are four types of edges: `n, s, w, e`, which can
be viewed as north, south, west, east. The conditions are the same as task 4, the question in
line 5 means `Question: find a path from G to A. Answer: first go west, then go north`. The
output is a sequence of edges. So there is no question type in this task.
The paper uses *Gated Graph Sequence Neural Networks (GGS-NNs)* to solve this kind of problems.
In this example, we implemented GGS-NNs in `ggsnn.py`, run the following command for training
and testing:
```bash
python train_path_finding.py --train_num=250 --epochs=200
```
#### Results
Following the paper, we use 10 different test sets for evaluation. The result is the mean and
standard deviation of the evaluation performance across the 10 datasets. Numbers in the parentheses
are the number of training data used.
| Task ID | Reported
Accuracy | DGL
Accuracy |
|:---------:|-----------------------------|------------------------------|
| 4 | 100.0 ± 0.00 (50) | 100.0 ± 0.00 (50)|
| 15 | 100.0 ± 0.00 (50) | 100.0 ± 0.00 (50)|
| 16 | 100.0 ± 0.00 (50) | 100.0 ± 0.00 (50)|
| 18 | 100.0 ± 0.00 (50) | 100.0 ± 0.00 (50)|
| 19 | 99.0 ± 1.1 (250) | 97.8 ± 0.02 (50) |