README.md 1.91 KB
Newer Older
1
2
# Node classification on heterogeneous graph with RGCN

3
4
This example aims to demonstrate how to run node classification task on heterogeneous graph with **GraphBolt**. Models are not tuned to achieve the best accuracy yet.

5
6
7
8
9
10
11
12
## Run on `ogbn-mag` dataset

### Command
```
python3 hetero_rgcn.py
```

### Statistics of train/validation/test
13
14
15
16
17
18
Below results are run on AWS EC2 r6idn.metal, 1024GB RAM, 128 vCPUs(Ice Lake 8375C), 0 GPUs.

| Dataset Size | Peak CPU RAM Usage | Time Per Epoch(Training) | Time Per Epoch(Inference: train/val/test set)      |
| ------------ | ------------- | ------------------------ | ---------------------------    |
| ~1.1GB       | ~5GB          | ~3min                    | ~1min40s + ~0min9s + ~0min7s    |

19
20
21
22
23
24
25
```
Final performance: 
All runs:
Highest Train: 49.29 ± 0.85
Highest Valid: 34.69 ± 0.49
  Final Train: 48.14 ± 1.09
   Final Test: 33.65 ± 0.63
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
```

## Run on `ogb-lsc-mag240m` dataset

### Command
```
python3 hetero_rgcn.py --dataset ogb-lsc-mag240m --runs 2
```

### Statistics of train/validation/test
Below results are run on AWS EC2 r6idn.metal, 1024GB RAM, 128 vCPUs(Ice Lake 8375C), 0 GPUs.

| Dataset Size | Peak CPU RAM Usage | Time Per Epoch(Training) | Time Per Epoch(Inference: train/val/test set) |
| ------------ | ------------- | ------------------------ | ------------------------- |
| ~404GB       | ~110GB        | ~2min45s                 | ~28min25s + ~4min21s + ~2min54s   |


As labels are hidden for test set, test accuray is always **0.00**. Test submission is saved as `y_pred_mag240m_test-dev.npz` under current directory.

As we can see from above table, the time per epoch is quite close to the one in `ogbn-mag`. This is due to no embedding layer is applied for `ogb-lsc-mag240m`. All required node features are generated in advance.
```
Final performance: 
All runs:
Highest Train: 54.75 ± 0.29
Highest Valid: 52.08 ± 0.09
  Final Train: 54.75 ± 0.29
   Final Test: 0.00 ± 0.00
53
```