Graph Convolutional Networks (GCN)
============

Paper link: [https://arxiv.org/abs/1609.02907](https://arxiv.org/abs/1609.02907)
Author's code repo: [https://github.com/tkipf/gcn](https://github.com/tkipf/gcn)

Dependencies
------------
- MXNet nightly build
- requests

```bash
pip install mxnet --pre
pip install requests
```

Codes
-----
The folder contains three implementations of GCN:
- `gcn.py` uses DGL's predefined graph convolution module.
- `gcn_mp.py` uses user-defined message and reduce functions.
- `gcn_spmv.py` improves on `gcn_mp.py` by using DGL's built-in functions
   so that the SpMV optimization can be applied.
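
To illustrate why built-in functions allow an SpMV-style optimization, here is a minimal pure-Python sketch; it does not use DGL and is not the code in `gcn_mp.py` or `gcn_spmv.py`. It assumes a copy-source message and a sum reduce, with a toy graph and hypothetical helper names. Under those assumptions, per-edge message passing gives the same result as one adjacency-times-feature product.

```python
# Toy edge list (src, dst); each undirected edge is listed in both directions.
EDGES = [(0, 1), (1, 0), (1, 2), (2, 1)]
NUM_NODES = 3
FEATURES = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]


def aggregate_by_messages(edges, features, num_nodes):
    """Send one message per edge, then sum each node's mailbox."""
    dim = len(features[0])
    mailbox = [[] for _ in range(num_nodes)]
    for src, dst in edges:
        # Message function (assumed): copy the source node's feature.
        mailbox[dst].append(features[src])
    # Reduce function (assumed): sum the messages that arrived at each node.
    return [[sum(msg[k] for msg in mailbox[v]) for k in range(dim)]
            for v in range(num_nodes)]


def aggregate_by_matmul(edges, features, num_nodes):
    """Same aggregation as an adjacency-times-feature product (dense for readability)."""
    dim = len(features[0])
    adj = [[0.0] * num_nodes for _ in range(num_nodes)]
    for src, dst in edges:
        adj[dst][src] += 1.0
    return [[sum(adj[v][u] * features[u][k] for u in range(num_nodes))
             for k in range(dim)]
            for v in range(num_nodes)]


by_messages = aggregate_by_messages(EDGES, FEATURES, NUM_NODES)
by_matmul = aggregate_by_matmul(EDGES, FEATURES, NUM_NODES)
print(by_messages == by_matmul)
```

A real implementation would store the adjacency matrix in a sparse format; the dense matrix here only keeps the sketch short.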

The implementation in `gcn_concat.py` differs slightly from the original paper
in order to achieve better performance; credit goes to @yifeim and @ZiyueHuang.

Results
-------
Run with the following command (available datasets: "cora", "citeseer", "pubmed"):
```bash
DGLBACKEND=mxnet python3 train.py --dataset cora --gpu 0
```

* cora: ~0.810 (paper: 0.815)
* citeseer: ~0.702 (paper: 0.703)
* pubmed: ~0.780 (paper: 0.790)

Results (`gcn_concat.py` vs. `gcn.py`)
--------------------------------------
`gcn_concat.py` concatenates hidden units to account for multi-hop
skip connections, while `gcn_spmv.py` uses simple addition (the original paper
omitted this detail). We feel concatenation is superior
because all neighboring information is presented without additional modeling
assumptions.

These results are based on single-run training to minimize the cross-entropy
loss. They suggest that skip connections can help train a GCN with many layers.
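
The difference between concatenation and addition can be seen in a simplified, weight-free sketch. It is not the code in `gcn_concat.py` or `gcn_spmv.py`; the toy path graph, the unnormalized propagation step, and the helper name `propagate` are assumptions made for illustration. Concatenation keeps each hop's representation in its own slot, while addition merges all hops into a single value.

```python
def propagate(adj, h):
    """One weight-free propagation step: each node sums its neighbours' rows of h."""
    n, dim = len(adj), len(h[0])
    return [[sum(adj[v][u] * h[u][k] for u in range(n)) for k in range(dim)]
            for v in range(n)]


# Path graph 0 - 1 - 2 with one scalar feature per node.
ADJ = [[0.0, 1.0, 0.0],
       [1.0, 0.0, 1.0],
       [0.0, 1.0, 0.0]]
H0 = [[1.0], [10.0], [100.0]]

H1 = propagate(ADJ, H0)  # 1-hop information
H2 = propagate(ADJ, H1)  # 2-hop information

# Concatenation: every hop keeps its own slot, so the width grows with depth.
concat = [h0 + h1 + h2 for h0, h1, h2 in zip(H0, H1, H2)]

# Addition: all hops are merged into one slot, so the width stays fixed.
added = [[a + b + c for a, b, c in zip(h0, h1, h2)]
         for h0, h1, h2 in zip(H0, H1, H2)]

print(concat)
print(added)
```

With concatenation, a later layer can weight each hop separately. With addition, the relative contribution of each hop is fixed before the next layer sees it.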

The experiments show that adding depth may or may not improve accuracy.
While adding depth is a clear way to mimic the power iterations of matrix factorization,
training for multiple epochs to reach a stationary point could equivalently solve the matrix
factorization. Given the small datasets, we cannot draw firm conclusions about this from these experiments.
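
One way to read the power-iteration remark is sketched below. The sketch drops the weights and nonlinearities that real GCN layers have, and it assumes the symmetric normalization with self-loops from the paper, D^{-1/2} (A + I) D^{-1/2}. Whether the `--normalization 'sym'` and `--self-loop` flags compute exactly this is an assumption. The helper names are hypothetical. Under these assumptions, stacking k linear layers repeatedly multiplies the features by the same matrix, which is a power iteration.

```python
import math


def sym_normalize_with_self_loops(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} and the degrees of A + I."""
    n = len(adj)
    a_tilde = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
               for i in range(n)]
    deg = [sum(row) for row in a_tilde]
    a_hat = [[a_tilde[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
             for i in range(n)]
    return a_hat, deg


def propagate(a_hat, x):
    """One linear propagation step on a single feature column."""
    n = len(x)
    return [sum(a_hat[i][j] * x[j] for j in range(n)) for i in range(n)]


# Path graph 0 - 1 - 2.
ADJ = [[0.0, 1.0, 0.0],
       [1.0, 0.0, 1.0],
       [0.0, 1.0, 0.0]]
A_HAT, DEG = sym_normalize_with_self_loops(ADJ)

x = [1.0, 0.0, 0.0]
for _ in range(50):  # 50 stacked linear "layers"
    x = propagate(A_HAT, x)

# After many steps x is numerically proportional to sqrt(degree),
# which is the eigenvector of A_HAT with eigenvalue 1.
ratios = [x[i] / math.sqrt(DEG[i]) for i in range(len(x))]
print(ratios)
```

The three printed ratios agree to within floating-point error, so the repeated propagation has converged to a single eigenvector direction, as a power iteration does.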

```
# Final accuracy 57.70% MLP without GCN
DGLBACKEND=mxnet python3 examples/mxnet/gcn/gcn_concat.py --dataset "citeseer" --n-epochs 200 --n-layers 0

# Final accuracy 68.20% with 2-layer GCN
DGLBACKEND=mxnet python3 examples/mxnet/gcn/gcn_spmv.py --dataset "citeseer" --n-epochs 200 --n-layers 1

# Final accuracy 18.40% with 10-layer GCN
DGLBACKEND=mxnet python3 examples/mxnet/gcn/gcn_spmv.py --dataset "citeseer" --n-epochs 200 --n-layers 9

# Final accuracy 65.70% with 2-layer GCN with skip connection
DGLBACKEND=mxnet python3 examples/mxnet/gcn/gcn_concat.py --dataset "citeseer" --n-epochs 200 --n-layers 2 --normalization 'sym' --self-loop

# Final accuracy 64.70% with 10-layer GCN with skip connection
DGLBACKEND=mxnet python3 examples/mxnet/gcn/gcn_concat.py --dataset "citeseer" --n-epochs 200 --n-layers 10 --normalization 'sym' --self-loop
```

```
# Final accuracy 53.20% MLP without GCN
DGLBACKEND=mxnet python3 examples/mxnet/gcn/gcn_concat.py --dataset "cora" --n-epochs 200 --n-layers 0

# Final accuracy 81.40% with 2-layer GCN
DGLBACKEND=mxnet python3 examples/mxnet/gcn/gcn_spmv.py --dataset "cora" --n-epochs 200 --n-layers 1

# Final accuracy 27.60% with 10-layer GCN
DGLBACKEND=mxnet python3 examples/mxnet/gcn/gcn_spmv.py --dataset "cora" --n-epochs 200 --n-layers 9

# Final accuracy 72.60% with 2-layer GCN with skip connection
DGLBACKEND=mxnet python3 examples/mxnet/gcn/gcn_concat.py --dataset "cora" --n-epochs 200 --n-layers 2 --normalization 'sym' --self-loop

# Final accuracy 78.90% with 10-layer GCN with skip connection
DGLBACKEND=mxnet python3 examples/mxnet/gcn/gcn_concat.py --dataset "cora" --n-epochs 200 --n-layers 10 --normalization 'sym' --self-loop
```

```
# Final accuracy 70.30% MLP without GCN
DGLBACKEND=mxnet python3 examples/mxnet/gcn/gcn_concat.py --dataset "pubmed" --n-epochs 200 --n-layers 0

# Final accuracy 77.40% with 2-layer GCN
DGLBACKEND=mxnet python3 examples/mxnet/gcn/gcn_spmv.py --dataset "pubmed" --n-epochs 200 --n-layers 1

# Final accuracy 36.20% with 10-layer GCN
DGLBACKEND=mxnet python3 examples/mxnet/gcn/gcn_spmv.py --dataset "pubmed" --n-epochs 200 --n-layers 9

# Final accuracy 78.30% with 2-layer GCN with skip connection
DGLBACKEND=mxnet python3 examples/mxnet/gcn/gcn_concat.py --dataset "pubmed" --n-epochs 200 --n-layers 2 --normalization 'sym' --self-loop

# Final accuracy 76.30% with 10-layer GCN with skip connection
DGLBACKEND=mxnet python3 examples/mxnet/gcn/gcn_concat.py --dataset "pubmed" --n-epochs 200 --n-layers 10 --normalization 'sym' --self-loop
```