Commit 3cf42fd1 authored by Tianqi Zhang (张天启), committed by GitHub

[Bug Fix][Example] hgp-sl: More info for running (#2634)



* [Bug Fix][Example] hgp-sl: More info for running

* change list to table

* change dataset summary in gxn and sagpool

* remove redundant
Co-authored-by: Tong He <hetong007@gmail.com>
parent 3ca52b1e
......@@ -10,47 +10,13 @@ The DGL's built-in LegacyTUDataset. This is a serial of graph kernel datasets fo
NOTE: Following the setting of the author's implementation, for 'DD' and 'PROTEINS' we use the one-hot node label as the input node feature. For 'ENZYMES', 'IMDB-BINARY', 'IMDB-MULTI' and 'COLLAB', we use the concatenation of the one-hot node label (if available) and the one-hot node degree as the input node features.
DD
- NumGraphs: 1178
- AvgNodesPerGraph: 284.32
- AvgEdgesPerGraph: 715.66
- NumFeats: 89
- NumClasses: 2
PROTEINS
- NumGraphs: 1113
- AvgNodesPerGraph: 39.06
- AvgEdgesPerGraph: 72.82
- NumFeats: 1
- NumClasses: 2
ENZYMES
- NumGraphs: 600
- AvgNodesPerGraph: 32.63
- AvgEdgesPerGraph: 62.14
- NumFeats: 18
- NumClasses: 6
IMDB-BINARY
- NumGraphs: 1000
- AvgNodesPerGraph: 19.77
- AvgEdgesPerGraph: 96.53
- NumFeats: -
- NumClasses: 2
IMDB-MULTI
- NumGraphs: 1500
- AvgNodesPerGraph: 13.00
- AvgEdgesPerGraph: 65.94
- NumFeats: -
- NumClasses: 3
COLLAB
- NumGraphs: 5000
- AvgNodesPerGraph: 74.49
- AvgEdgesPerGraph: 2457.78
- NumFeats: -
- NumClasses: 3
| | DD | PROTEINS | ENZYMES | IMDB-BINARY | IMDB-MULTI | COLLAB |
| ---------------- | ------ | -------- | ------- | ------------ | ---------- | -------- |
| NumGraphs | 1178 | 1113 | 600 | 1000 | 1500 | 5000 |
| AvgNodesPerGraph | 284.32 | 39.06 | 32.63 | 19.77 | 13.00 | 74.49 |
| AvgEdgesPerGraph | 715.66 | 72.82 | 62.14 | 96.53 | 65.94 | 2457.78 |
| NumFeats | 89 | 1 | 18 | - | - | - |
| NumClasses | 2 | 2 | 6 | 2 | 3 | 3 |
How to run example files
......
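The NOTE in the hunk above describes node features built by concatenating a one-hot node label (when the dataset provides one) with a one-hot node degree. A minimal, framework-free sketch of that idea follows. The function names, the `max_degree` argument and the toy graph are all hypothetical, and the example's actual preprocessing code may differ.

```python
def one_hot(index, size):
    """Return a list of length `size` with 1.0 at `index` and 0.0 elsewhere."""
    if not 0 <= index < size:
        raise ValueError(f"index {index} out of range for size {size}")
    vec = [0.0] * size
    vec[index] = 1.0
    return vec


def build_node_features(degrees, max_degree, labels=None, num_labels=0):
    """Concatenate one-hot node label (if available) with one-hot node degree.

    Hypothetical helper for illustration only, not the example's real code.
    """
    features = []
    for i, deg in enumerate(degrees):
        # Datasets without node labels contribute only the degree part.
        label_part = one_hot(labels[i], num_labels) if labels is not None else []
        degree_part = one_hot(deg, max_degree + 1)
        features.append(label_part + degree_part)
    return features


# Toy graph: 3 nodes with degrees 1, 2, 1 and node labels 0, 2, 1 (3 label classes).
feats = build_node_features([1, 2, 1], max_degree=2, labels=[0, 2, 1], num_labels=3)

# Toy graph without node labels: features are the one-hot degree alone.
feats_no_label = build_node_features([0, 1], max_degree=1)
```

With labels present, each node's feature length is `num_labels + max_degree + 1`; without labels it is `max_degree + 1`.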
......@@ -15,62 +15,39 @@ The DGL's built-in [LegacyTUDataset](https://docs.dgl.ai/api/python/dgl.data.htm
NOTE: Since there are no data attributes in some of these datasets, we use node_id (as a one-hot vector whose length is the max number of nodes across all graphs) as the node feature. Also note that the node_id in some datasets is not unique (e.g. a graph may have two nodes with the same id).
DD
- NumGraphs: 1178
- AvgNodesPerGraph: 284.32
- AvgEdgesPerGraph: 715.66
- NumFeats: 89
- NumClasses: 2
PROTEINS
- NumGraphs: 1113
- AvgNodesPerGraph: 39.06
- AvgEdgesPerGraph: 72.82
- NumFeats: 1
- NumClasses: 2
NCI1
- NumGraphs: 4110
- AvgNodesPerGraph: 29.87
- AvgEdgesPerGraph: 32.30
- NumFeats: 37
- NumClasses: 2
NCI109
- NumGraphs: 4127
- AvgNodesPerGraph: 29.68
- AvgEdgesPerGraph: 32.13
- NumFeats: 38
- NumClasses: 2
Mutagenicity
- NumGraphs: 4337
- AvgNodesPerGraph: 30.32
- AvgEdgesPerGraph: 30.77
- NumFeats: 14
- NumClasses: 2
ENZYMES
- NumGraphs: 600
- AvgNodesPerGraph: 32.63
- AvgEdgesPerGraph: 62.14
- NumFeats: 18
- NumClasses: 6
| | DD | PROTEINS | NCI1 | NCI109 | Mutagenicity | ENZYMES |
| ---------------- | ------ | -------- | ----- | ------ | ------------ | ------- |
| NumGraphs | 1178 | 1113 | 4110 | 4127 | 4337 | 600 |
| AvgNodesPerGraph | 284.32 | 39.06 | 29.87 | 29.68 | 30.32 | 32.63 |
| AvgEdgesPerGraph | 715.66 | 72.82 | 32.30 | 32.13 | 30.77 | 62.14 |
| NumFeats | 89 | 1 | 37 | 38 | 14 | 18 |
| NumClasses | 2 | 2 | 2 | 2 | 2 | 6 |
How to run example files
--------------------------------
In the HGP-SL-DGL folder, run
```bash
python main.py --dataset ${your_dataset_name_here}
python main.py --dataset ${your_dataset_name_here} [hyper-parameters]
```
If you want to use a GPU, run
```bash
python main.py --device ${your_device_id_here} --dataset ${your_dataset_name_here}
python main.py --device ${your_device_id_here} --dataset ${your_dataset_name_here} [hyper-parameters]
```
For example, to perform experiments on DD dataset on GPU, run:
```bash
python main.py --device 0 --dataset DD --lr 0.0001 --batch_size 64 --pool_ratio 0.3 --dropout 0.5 --conv_layers 2
```
NOTE: Be careful when modifying `batch_size` and `pool_ratio` for large datasets like DD. Too large a batch size or pooling ratio may cause out-of-memory and other severe errors.
You can find the detailed hyper-parameter settings below (in the Performance section).
Performance
-------------------------
......
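The run instructions in the hunk above pass hyper-parameters to `main.py` as command-line flags. The sketch below shows how flags with those names could be parsed with `argparse`. Only the flag names come from the README text. The defaults, the types and the convention that `--device -1` means CPU are assumptions made for illustration, and the real `main.py` may define them differently.

```python
import argparse


def build_parser():
    """Hypothetical stand-in for the example's CLI; defaults are placeholders."""
    parser = argparse.ArgumentParser(description="Sketch of an HGP-SL style CLI")
    parser.add_argument("--dataset", type=str, required=True)
    parser.add_argument("--device", type=int, default=-1)  # assumption: -1 means CPU
    parser.add_argument("--lr", type=float, default=1e-3)  # placeholder default
    parser.add_argument("--batch_size", type=int, default=128)  # placeholder default
    parser.add_argument("--pool_ratio", type=float, default=0.5)  # placeholder default
    parser.add_argument("--dropout", type=float, default=0.0)  # placeholder default
    parser.add_argument("--conv_layers", type=int, default=3)  # placeholder default
    return parser


# Parse the DD command line quoted in the README hunk above.
cmd = ("--device 0 --dataset DD --lr 0.0001 --batch_size 64 "
       "--pool_ratio 0.3 --dropout 0.5 --conv_layers 2")
args = build_parser().parse_args(cmd.split())
```

Here `args.batch_size` and `args.pool_ratio` hold the two values that the README's NOTE warns about for large datasets such as DD.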
......@@ -10,40 +10,13 @@ The DGL's built-in LegacyTUDataset. This is a serial of graph kernel datasets fo
NOTE: Since there are no data attributes in some of these datasets, we use node_id (as a one-hot vector whose length is the max number of nodes across all graphs) as the node feature. Also note that the node_id in some datasets is not unique (e.g. a graph may have two nodes with the same id).
DD
- NumGraphs: 1178
- AvgNodesPerGraph: 284.32
- AvgEdgesPerGraph: 715.66
- NumFeats: 89
- NumClasses: 2
PROTEINS
- NumGraphs: 1113
- AvgNodesPerGraph: 39.06
- AvgEdgesPerGraph: 72.82
- NumFeats: 1
- NumClasses: 2
NCI1
- NumGraphs: 4110
- AvgNodesPerGraph: 29.87
- AvgEdgesPerGraph: 32.30
- NumFeats: 37
- NumClasses: 2
NCI109
- NumGraphs: 4127
- AvgNodesPerGraph: 29.68
- AvgEdgesPerGraph: 32.13
- NumFeats: 38
- NumClasses: 2
Mutagenicity
- NumGraphs: 4337
- AvgNodesPerGraph: 30.32
- AvgEdgesPerGraph: 30.77
- NumFeats: 14
- NumClasses: 2
| | DD | PROTEINS | NCI1 | NCI109 | Mutagenicity |
| ---------------- | ------ | -------- | ----- | ------ | ------------ |
| NumGraphs | 1178 | 1113 | 4110 | 4127 | 4337 |
| AvgNodesPerGraph | 284.32 | 39.06 | 29.87 | 29.68 | 30.32 |
| AvgEdgesPerGraph | 715.66 | 72.82 | 32.30 | 32.13 | 30.77 |
| NumFeats | 89 | 1 | 37 | 38 | 14 |
| NumClasses | 2 | 2 | 2 | 2 | 2 |
How to run example files
......
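The NOTE in the hunk above says that, where a dataset has no attributes, the node_id is one-hot encoded into a vector whose length is the maximum number of nodes across all graphs. A rough sketch of that encoding follows, assuming node ids are integers in `[0, max_nodes)`. The function name and the toy data are hypothetical, and the example's real preprocessing may differ.

```python
def node_id_features(graphs_node_ids):
    """One-hot encode node ids, one feature row per node.

    `graphs_node_ids` is a list with one list of integer node ids per graph.
    Hypothetical helper for illustration only.
    """
    # Vector length is the largest node count over all graphs.
    max_nodes = max(len(ids) for ids in graphs_node_ids)
    all_feats = []
    for ids in graphs_node_ids:
        rows = []
        for nid in ids:
            vec = [0.0] * max_nodes
            vec[nid] = 1.0
            rows.append(vec)
        all_feats.append(rows)
    return all_feats


# Two toy graphs; the second repeats node id 0, mirroring the non-unique ids
# mentioned in the NOTE.
node_feats = node_id_features([[0, 1, 2], [0, 0]])
```

Nodes that share an id end up with identical feature rows, which is the practical consequence of the non-unique ids the NOTE warns about.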