# NGNN + GraphSage/GCN
## Introduction
This is an example of implementing [NGNN](https://arxiv.org/abs/2111.11638) for link prediction in DGL.
NGNN (Network In Graph Neural Network) is a model-agnostic methodology that allows arbitrary GNN models to increase their model capacity.
The script in this folder runs full-batch GCN/GraphSage experiments (with and without NGNN) on the ogbl-ddi, ogbl-collab, and ogbl-ppa datasets.
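Conceptually, NGNN deepens each GNN layer by inserting an extra nonlinear feed-forward transform after the usual neighbor aggregation, without adding more message-passing rounds. The toy sketch below illustrates the idea in plain Python on a hypothetical 3-node path graph with 1-dimensional features and hand-picked scalar weights; it is not the DGL implementation used by `main.py`:

```python
# Toy illustration of the NGNN idea: after a GNN layer's neighbor
# aggregation, an extra nonlinear transform is applied, deepening
# the layer. Hand-picked 1-d features/weights, for illustration only.

def relu(x):
    return [max(0.0, v) for v in x]

def mean_aggregate(adj, feats):
    # GraphSage-style mean aggregation over each node and its neighbors.
    out = []
    for node, nbrs in enumerate(adj):
        group = [feats[node]] + [feats[n] for n in nbrs]
        out.append(sum(group) / len(group))
    return out

def gnn_layer(adj, feats, w):
    # A plain GNN layer: aggregate, then one linear map + ReLU.
    return relu([w * v for v in mean_aggregate(adj, feats)])

def ngnn_layer(adj, feats, w, w_ngnn):
    # NGNN: the same layer, followed by an extra nonlinear transform
    # (the "network in" the GNN layer).
    return relu([w_ngnn * v for v in gnn_layer(adj, feats, w)])

adj = [[1], [0, 2], [1]]   # path graph 0-1-2
feats = [1.0, 2.0, 3.0]
print(gnn_layer(adj, feats, 0.5))        # -> [0.75, 1.0, 1.25]
print(ngnn_layer(adj, feats, 0.5, 2.0))  # -> [1.5, 2.0, 2.5]
```

In the scripts below, the `--ngnn_type` flag controls which GNN layers receive this extra transform (e.g. `input` or `hidden`).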
## Installation requirements
```
ogb>=1.3.3
torch>=1.11.0
dgl>=0.8
```
## Experiments
We do not fix random seeds; results are averaged over 10 runs for each model (50 runs for ogbl-ddi). All models are trained on a single V100 GPU with 16GB memory.
### ogbl-ddi
#### Performance
| Model | Hits@20 (test) | Hits@50 (test) | Hits@100 (test) | Hits@20 (val) | Hits@50 (val) | Hits@100 (val) | #Parameters |
|---|---|---|---|---|---|---|---|
| GCN+NGNN (paper) | 48.22% ± 7.00% | 82.56% ± 4.03% | 89.48% ± 1.68% | 65.95% ± 1.16% | 70.24% ± 0.50% | 72.54% ± 0.62% | 1,487,361 |
| GCN+NGNN (ours; 50 runs) | 54.83% ± 15.81% | 93.15% ± 2.59% | 97.05% ± 0.56% | 71.21% ± 0.38% | 73.55% ± 0.25% | 76.24% ± 1.33% | |
| GraphSage+NGNN (paper) | 60.75% ± 4.94% | 84.58% ± 1.89% | 92.58% ± 0.88% | 68.05% ± 0.68% | 71.14% ± 0.33% | 72.77% ± 0.09% | 1,618,433 |
| GraphSage+NGNN (ours; 50 runs) | 57.70% ± 15.23% | 96.18% ± 0.94% | 98.58% ± 0.17% | 73.23% ± 0.40% | 87.20% ± 5.29% | 98.71% ± 0.22% | |
A 3-layer MLP is used as the LinkPredictor here, while the NGNN paper uses a 2-layer one; this is the main reason for the better performance.
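For reference, the sketch below is a hypothetical plain-Python illustration of such a 3-layer MLP link predictor: the element-wise product of the two endpoint embeddings is fed through three linear layers with a sigmoid output. The weights are hand-picked for illustration; this is not the actual model in `main.py`:

```python
import math

def relu(x):
    return [max(0.0, v) for v in x]

def linear(x, w):
    # w is a list of weight rows; returns the matrix-vector product w @ x.
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]

def link_score(h_u, h_v, w1, w2, w3):
    # Score an edge (u, v): element-wise product of the endpoint
    # embeddings, then a 3-layer MLP ending in a sigmoid probability.
    x = [a * b for a, b in zip(h_u, h_v)]
    x = relu(linear(x, w1))
    x = relu(linear(x, w2))
    logit = linear(x, w3)[0]
    return 1.0 / (1.0 + math.exp(-logit))

h_u, h_v = [1.0, 2.0], [0.5, -1.0]
w1 = [[1.0, 0.0], [0.0, 1.0]]
w2 = [[0.5, 0.5], [1.0, -1.0]]
w3 = [[1.0, 1.0]]
print(link_score(h_u, h_v, w1, w2, w3))  # sigmoid(0.75) ~= 0.679
```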
#### Reproduction of performance
- GCN + NGNN
```{.bash}
python main.py --dataset ogbl-ddi --device 0 --ngnn_type input --epochs 800 --dropout 0.5 --num_layers 2 --lr 0.0025 --batch_size 16384 --runs 50
```
- GraphSage + NGNN
```{.bash}
python main.py --dataset ogbl-ddi --device 1 --ngnn_type input --use_sage --epochs 600 --dropout 0.25 --num_layers 2 --lr 0.0012 --batch_size 32768 --runs 50
```
### ogbl-collab
#### Performance
| Model | Hits@10 (test) | Hits@50 (test) | Hits@100 (test) | Hits@10 (val) | Hits@50 (val) | Hits@100 (val) | #Parameters |
|---|---|---|---|---|---|---|---|
| GCN+NGNN (paper) | 36.69% ± 0.82% | 51.83% ± 0.50% | 57.41% ± 0.22% | 44.97% ± 0.97% | 60.84% ± 0.63% | 66.09% ± 0.30% | 428,033 |
| GCN+NGNN (ours) | 39.29% ± 1.21% | 53.48% ± 0.40% | 58.34% ± 0.45% | 48.28% ± 1.39% | 62.73% ± 0.40% | 67.13% ± 0.39% | |
| GraphSage+NGNN (paper) | 36.83% ± 2.56% | 52.62% ± 1.04% | 57.96% ± 0.56% | 45.62% ± 2.56% | 61.34% ± 1.05% | 66.26% ± 0.44% | 591,873 |
| GraphSage+NGNN (ours) | 40.30% ± 1.03% | 53.59% ± 0.56% | 58.75% ± 0.57% | 49.85% ± 1.07% | 62.81% ± 0.46% | 67.33% ± 0.38% | |
#### Reproduction of performance
- GCN + NGNN
```{.bash}
python main.py --dataset ogbl-collab --device 2 --ngnn_type hidden --epochs 600 --dropout 0.2 --num_layers 3 --lr 0.001 --batch_size 32768 --runs 10
```
- GraphSage + NGNN
```{.bash}
python main.py --dataset ogbl-collab --device 3 --ngnn_type input --use_sage --epochs 800 --dropout 0.2 --num_layers 3 --lr 0.0005 --batch_size 32768 --runs 10
```
### ogbl-ppa
#### Performance
| Model | Hits@10 (test) | Hits@50 (test) | Hits@100 (test) | Hits@10 (val) | Hits@50 (val) | Hits@100 (val) | #Parameters |
|---|---|---|---|---|---|---|---|
| GCN+NGNN (paper) | 5.64% ± 0.93% | 18.44% ± 1.88% | 26.78% ± 0.90% | 8.14% ± 0.71% | 19.69% ± 0.94% | 27.86% ± 0.81% | 673,281 |
| GCN+NGNN (ours) | 13.07% ± 3.24% | 28.55% ± 1.62% | 36.83% ± 0.99% | 16.36% ± 1.89% | 30.56% ± 0.72% | 38.34% ± 0.82% | 410,113 |
| GraphSage+NGNN (paper) | 3.52% ± 1.24% | 15.55% ± 1.92% | 24.45% ± 2.34% | 5.59% ± 0.93% | 17.21% ± 0.69% | 25.42% ± 0.50% | 819,201 |
| GraphSage+NGNN (ours) | 11.73% ± 2.42% | 29.88% ± 1.84% | 40.05% ± 1.38% | 14.73% ± 2.36% | 31.59% ± 1.72% | 40.58% ± 1.23% | 556,033 |
The main difference between this implementation and the NGNN paper is where NGNN is applied (all layers in the paper vs. only the input layer here).
#### Reproduction of performance
- GCN + NGNN
```{.bash}
python main.py --dataset ogbl-ppa --device 4 --ngnn_type input --epochs 80 --dropout 0.2 --num_layers 3 --lr 0.001 --batch_size 49152 --runs 10
```
- GraphSage + NGNN
```{.bash}
python main.py --dataset ogbl-ppa --device 5 --ngnn_type input --use_sage --epochs 80 --dropout 0.2 --num_layers 3 --lr 0.001 --batch_size 49152 --runs 10
```
## References
```{.tex}
@article{DBLP:journals/corr/abs-2111-11638,
  author     = {Xiang Song and
                Runjie Ma and
                Jiahang Li and
                Muhan Zhang and
                David Paul Wipf},
  title      = {Network In Graph Neural Network},
  journal    = {CoRR},
  volume     = {abs/2111.11638},
  year       = {2021},
  url        = {https://arxiv.org/abs/2111.11638},
  eprinttype = {arXiv},
  eprint     = {2111.11638},
  timestamp  = {Fri, 26 Nov 2021 13:48:43 +0100},
  biburl     = {https://dblp.org/rec/journals/corr/abs-2111-11638.bib},
  bibsource  = {dblp computer science bibliography, https://dblp.org}
}
```