"examples/git@developer.sourcefind.cn:OpenDAS/vision.git" did not exist on "876117b5d3c72aa7b2472ab3051bab6380f379d2"
Unverified commit b2b531e0 authored by Hengrui Zhang, committed by GitHub

[Example] add implementation of grace (#2828)



* [Example] add implementation of grace

* [Doc] add grace in the index file

* fix

* fix typos
Co-authored-by: Mufei Li <mufeili1996@gmail.com>
parent e7046f1e
@@ -11,6 +11,7 @@ The folder contains example implementations of selected research papers related
| [Network Embedding with Completely-imbalanced Labels](#rect) | :heavy_check_mark: | | | | |
| [Boost then Convolve: Gradient Boosting Meets Graph Neural Networks](#bgnn) | :heavy_check_mark: | | | | |
| [Contrastive Multi-View Representation Learning on Graphs](#mvgrl) | :heavy_check_mark: | | :heavy_check_mark: | | |
| [Deep Graph Contrastive Representation Learning](#grace) | :heavy_check_mark: | | | | |
| [Graph Random Neural Network for Semi-Supervised Learning on Graphs](#grand) | :heavy_check_mark: | | | | |
| [Heterogeneous Graph Transformer](#hgt) | :heavy_check_mark: | :heavy_check_mark: | | | |
| [Graph Convolutional Networks for Graphs with Multi-Dimensionally Weighted Edges](#mwe) | :heavy_check_mark: | | | | :heavy_check_mark: |
@@ -109,6 +110,9 @@ The folder contains example implementations of selected research papers related
- <a name="mvgrl"></a> Hassani and Khasahmadi. Contrastive Multi-View Representation Learning on Graphs. [Paper link](https://arxiv.org/abs/2006.05582).
- Example code: [PyTorch](../examples/pytorch/mvgrl)
- Tags: graph diffusion, self-supervised learning on graphs.
- <a name="grace"></a> Zhu et al. Deep Graph Contrastive Representation Learning. [Paper link](https://arxiv.org/abs/2006.04131).
- Example code: [PyTorch](../examples/pytorch/grace)
- Tags: contrastive learning for node classification.
- <a name="grand"></a> Feng et al. Graph Random Neural Network for Semi-Supervised Learning on Graphs. [Paper link](https://arxiv.org/abs/2005.11079).
- Example code: [PyTorch](../examples/pytorch/grand)
- Tags: semi-supervised node classification, simplifying graph convolution, data augmentation
@@ -136,19 +140,12 @@ The folder contains example implementations of selected research papers related
- <a name="dimenet"></a> Klicpera et al. Directional Message Passing for Molecular Graphs. [Paper link](https://arxiv.org/abs/2003.03123).
- Example code: [PyTorch](../examples/pytorch/dimenet)
- Tags: molecules, molecular property prediction, quantum chemistry
- <a name="dagnn"></a> Rossi et al. Temporal Graph Networks For Deep Learning on Dynamic Graphs. [Paper link](https://arxiv.org/abs/2006.10637).
- Example code: [PyTorch](../examples/pytorch/tgn)
- Tags: over-smoothing, node classification
- <a name="dagnn"></a> Rossi et al. Temporal Graph Networks For Deep Learning on Dynamic Graphs. [Paper link](https://arxiv.org/abs/2006.10637).
- Example code: [PyTorch](../examples/pytorch/tgn)
- <a name="tgn"></a> Rossi et al. Temporal Graph Networks For Deep Learning on Dynamic Graphs. [Paper link](https://arxiv.org/abs/2006.10637).
- Example code: [PyTorch](../examples/pytorch/tgn)
- Tags: over-smoothing, node classification
- <a name="compgcn"></a> Vashishth, Shikhar, et al. Composition-based Multi-Relational Graph Convolutional Networks. [Paper link](https://arxiv.org/abs/1911.03082).
- Example code: [PyTorch](../examples/pytorch/compGCN)
- Tags: multi-relational graphs, graph neural network
- <a name="deepergcn"></a> Li et al. DeeperGCN: All You Need to Train Deeper GCNs. [Paper link](https://arxiv.org/abs/2006.07739).
- Example code: [PyTorch](../examples/pytorch/deepergcn)
- Tags: over-smoothing, deeper gnn, OGB
examples/pytorch/grace/README.md
# DGL Implementation of GRACE
This DGL example implements the model proposed in the paper [Deep Graph Contrastive Representation Learning](https://arxiv.org/abs/2006.04131).
Author's code: https://github.com/CRIPAC-DIG/GRACE
## Example Implementor
This example was implemented by [Hengrui Zhang](https://github.com/hengruizhang98) when he was an applied scientist intern at AWS Shanghai AI Lab.
## Dependencies
- Python 3.7
- PyTorch 1.7.1
- dgl 0.6.0
## Datasets
##### Unsupervised Node Classification Datasets:
'Cora', 'Citeseer' and 'Pubmed'
| Dataset | # Nodes | # Edges | # Classes |
| -------- | ------- | ------- | --------- |
| Cora | 2,708 | 10,556 | 7 |
| Citeseer | 3,327 | 9,228 | 6 |
| Pubmed | 19,717 | 88,651 | 3 |
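These citation graphs ship with DGL and can be loaded directly through its built-in dataset classes (the same classes `dataset.py` in this example uses). A minimal sketch; the printed counts match the Cora row above:
```python
from dgl.data import CoraGraphDataset

# Downloads the data on first use, then returns the single citation graph.
dataset = CoraGraphDataset()
graph = dataset[0]

print(graph.num_nodes(), graph.num_edges())  # 2708, 10556
print(graph.ndata['feat'].shape)             # bag-of-words node features
```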
## Arguments
```
--dataname  str   The graph dataset name ('cora', 'citeseer' or 'pubmed'). Default is 'cora'.
--gpu       int   GPU index; set to -1 to run on CPU. Default is 0.
--split     str   Dataset splitting method ('random' or 'public'). Default is 'random'.
```
## How to run examples
In the paper (as well as in the authors' repo), the training and test sets are split randomly at a 1:9 ratio. To allow a fair comparison with models evaluated on the standard split of these datasets, termed the public split, this repo also provides results under the public split. To run the examples:
```bash
# Cora with random split
python main.py --dataname cora
# Cora with public split
python main.py --dataname cora --split public
```
Replace 'cora' with 'citeseer' or 'pubmed' to run the example on the other datasets.
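For instance, to train on Pubmed with the public split on CPU (main.py falls back to CPU when `--gpu -1` is passed):
```bash
# Pubmed with public split, forced onto CPU
python main.py --dataname pubmed --split public --gpu -1
```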
## Performance
We use the same hyperparameter settings as the authors; see config.yaml for the detailed hyperparameters of each dataset.
Random split (Train/Test = 1:9)
| Dataset | Cora | Citeseer | Pubmed |
| :---------------: | :--: | :------: | :----: |
| Accuracy Reported | 83.3 | 72.1 | 86.7 |
| Author's Code | 83.1 | 71.0 | 86.3 |
| DGL | 83.4 | 71.4 | 86.1 |
Public split
| Dataset | Cora | Citeseer | Pubmed |
| :-----------: | :--: | :------: | :----: |
| Author's Code | 79.9 | 68.6 | 81.3 |
| DGL | 80.1 | 68.9 | 81.2 |
examples/pytorch/grace/aug.py

# Data augmentation on graphs via edge dropping and feature masking
import torch as th
import numpy as np
import dgl


def aug(graph, x, feat_drop_rate, edge_mask_rate):
    # Build one augmented view: drop edges, mask feature dimensions, re-add self-loops.
    ng = drop_edge(graph, edge_mask_rate)
    feat = drop_feat(x, feat_drop_rate)
    ng = ng.add_self_loop()
    return ng, feat


def drop_edge(graph, drop_prob):
    # Keep each edge independently with probability 1 - drop_prob.
    E = graph.num_edges()
    mask_rates = th.FloatTensor(np.ones(E) * drop_prob)
    masks = th.bernoulli(1 - mask_rates)
    edge_idx = masks.nonzero().squeeze(1)
    sg = dgl.edge_subgraph(graph, edge_idx, preserve_nodes=True)
    return sg


def drop_feat(x, drop_prob):
    # Zero out each feature dimension independently with probability drop_prob.
    D = x.shape[1]
    mask_rates = th.FloatTensor(np.ones(D) * drop_prob)
    drop_mask = th.bernoulli(mask_rates).to(th.bool)
    x = x.clone()
    x[:, drop_mask] = 0
    return x
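As a usage sketch (not part of the example files), the training script imports this module with `from aug import aug` and calls it twice per step with different drop rates to build the two contrastive views. On a hypothetical toy graph:
```python
import dgl
import torch as th

from aug import aug  # the augmentation module above

# Toy 4-node cycle with random 8-dimensional node features (hypothetical sizes).
g = dgl.graph(([0, 1, 2, 3], [1, 2, 3, 0]))
x = th.randn(4, 8)

# Two stochastic views with different edge/feature drop rates, as in GRACE.
g1, x1 = aug(g, x, feat_drop_rate=0.3, edge_mask_rate=0.2)
g2, x2 = aug(g, x, feat_drop_rate=0.4, edge_mask_rate=0.0)

print(g1.num_edges(), g2.num_edges())  # counts differ across views due to edge dropping
```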
examples/pytorch/grace/config.yaml

cora:
  learning_rate: 0.0005
  num_hidden: 128
  num_proj_hidden: 128
  activation: 'relu'
  num_layers: 2
  drop_edge_rate_1: 0.2
  drop_edge_rate_2: 0.4
  drop_feature_rate_1: 0.3
  drop_feature_rate_2: 0.4
  tau: 0.4
  num_epochs: 400
  weight_decay: 0.00001

citeseer:
  learning_rate: 0.001
  num_hidden: 256
  num_proj_hidden: 256
  activation: 'prelu'
  num_layers: 2
  drop_edge_rate_1: 0.2
  drop_edge_rate_2: 0.0
  drop_feature_rate_1: 0.3
  drop_feature_rate_2: 0.2
  tau: 0.9
  num_epochs: 200
  weight_decay: 0.00001

pubmed:
  learning_rate: 0.001
  num_hidden: 256
  num_proj_hidden: 256
  activation: 'relu'
  num_layers: 2
  drop_edge_rate_1: 0.4
  drop_edge_rate_2: 0.1
  drop_feature_rate_1: 0.0
  drop_feature_rate_2: 0.2
  tau: 0.7
  num_epochs: 1500
  weight_decay: 0.00001
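The training script reads this file with PyYAML and indexes it by dataset name. A minimal sketch of that lookup, assuming config.yaml sits in the working directory:
```python
import yaml
from yaml import SafeLoader

# Load all per-dataset hyperparameter blocks and select one of them.
with open('config.yaml') as f:
    config = yaml.load(f, Loader=SafeLoader)['cora']

print(config['learning_rate'], config['tau'])  # 0.0005 0.4
```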
examples/pytorch/grace/dataset.py

from dgl.data import CoraGraphDataset, CiteseerGraphDataset, PubmedGraphDataset


def load(name):
    if name == 'cora':
        dataset = CoraGraphDataset()
    elif name == 'citeseer':
        dataset = CiteseerGraphDataset()
    elif name == 'pubmed':
        dataset = PubmedGraphDataset()

    graph = dataset[0]

    train_mask = graph.ndata.pop('train_mask')
    test_mask = graph.ndata.pop('test_mask')

    feat = graph.ndata.pop('feat')
    labels = graph.ndata.pop('label')

    return graph, feat, labels, train_mask, test_mask
examples/pytorch/grace/eval.py

'''
Code adapted from https://github.com/CRIPAC-DIG/GRACE
Linear evaluation on learned node embeddings
'''
import numpy as np
import functools

from sklearn.metrics import f1_score
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.multiclass import OneVsRestClassifier
from sklearn.preprocessing import normalize, OneHotEncoder


def repeat(n_times):
    # Decorator: run the evaluation n_times and report the mean/std of each metric.
    def decorator(f):
        @functools.wraps(f)
        def wrapper(*args, **kwargs):
            results = [f(*args, **kwargs) for _ in range(n_times)]
            statistics = {}
            for key in results[0].keys():
                values = [r[key] for r in results]
                statistics[key] = {
                    'mean': np.mean(values),
                    'std': np.std(values)}
            print_statistics(statistics, f.__name__)
            return statistics
        return wrapper
    return decorator


def prob_to_one_hot(y_pred):
    # Convert class probabilities to a one-hot prediction matrix.
    ret = np.zeros(y_pred.shape, bool)
    indices = np.argmax(y_pred, axis=1)
    for i in range(y_pred.shape[0]):
        ret[i][indices[i]] = True
    return ret


def print_statistics(statistics, function_name):
    print(f'(E) | {function_name}:', end=' ')
    for i, key in enumerate(statistics.keys()):
        mean = statistics[key]['mean']
        std = statistics[key]['std']
        print(f'{key}={mean:.4f}+-{std:.4f}', end='')
        if i != len(statistics.keys()) - 1:
            print(',', end=' ')
        else:
            print()


@repeat(3)
def label_classification(embeddings, y, train_mask, test_mask, split='random', ratio=0.1):
    # Fit an l2-regularized logistic regression on the frozen embeddings and
    # report micro- and macro-F1 on the held-out nodes.
    X = embeddings.detach().cpu().numpy()
    Y = y.detach().cpu().numpy()
    Y = Y.reshape(-1, 1)
    onehot_encoder = OneHotEncoder(categories='auto').fit(Y)
    Y = onehot_encoder.transform(Y).toarray().astype(bool)

    X = normalize(X, norm='l2')

    if split == 'random':
        X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=1 - ratio)
    elif split == 'public':
        X_train = X[train_mask]
        X_test = X[test_mask]
        y_train = Y[train_mask]
        y_test = Y[test_mask]

    logreg = LogisticRegression(solver='liblinear')
    c = 2.0 ** np.arange(-10, 10)

    clf = GridSearchCV(estimator=OneVsRestClassifier(logreg),
                       param_grid=dict(estimator__C=c), n_jobs=8, cv=5,
                       verbose=0)
    clf.fit(X_train, y_train)

    y_pred = clf.predict_proba(X_test)
    y_pred = prob_to_one_hot(y_pred)

    micro = f1_score(y_test, y_pred, average="micro")
    macro = f1_score(y_test, y_pred, average="macro")

    return {
        'F1Mi': micro,
        'F1Ma': macro
    }
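As a usage sketch with hypothetical sizes (not taken from the example), `label_classification` only needs frozen embeddings, integer labels and two boolean masks, so the evaluation pipeline can be smoke-tested on random tensors:
```python
import torch as th

from eval import label_classification  # the module above

N, D, C = 200, 32, 3                   # hypothetical node / feature / class counts
embeds = th.randn(N, D)                # stand-in for model.get_embedding(...)
labels = th.randint(0, C, (N,))

train_mask = th.zeros(N, dtype=th.bool)
train_mask[:100] = True
test_mask = ~train_mask

# Runs 3 times (see @repeat) and prints mean/std of micro- and macro-F1.
label_classification(embeds, labels, train_mask, test_mask, split='public')
```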
examples/pytorch/grace/main.py

import argparse
import warnings

import torch as th
import torch.nn as nn
import yaml
from yaml import SafeLoader

from model import Grace
from aug import aug
from dataset import load
from eval import label_classification

warnings.filterwarnings('ignore')

parser = argparse.ArgumentParser()
parser.add_argument('--dataname', type=str, default='cora', choices=['cora', 'citeseer', 'pubmed'])
parser.add_argument('--gpu', type=int, default=0)
parser.add_argument('--split', type=str, default='random', choices=['random', 'public'])
args = parser.parse_args()

if args.gpu != -1 and th.cuda.is_available():
    args.device = 'cuda:{}'.format(args.gpu)
else:
    args.device = 'cpu'

if __name__ == '__main__':
    # Step 1: Load hyperparameters =================================================================== #
    config = 'config.yaml'
    config = yaml.load(open(config), Loader=SafeLoader)[args.dataname]

    lr = config['learning_rate']
    hid_dim = config['num_hidden']
    out_dim = config['num_proj_hidden']

    num_layers = config['num_layers']
    act_fn = ({'relu': nn.ReLU(), 'prelu': nn.PReLU()})[config['activation']]

    drop_edge_rate_1 = config['drop_edge_rate_1']
    drop_edge_rate_2 = config['drop_edge_rate_2']
    drop_feature_rate_1 = config['drop_feature_rate_1']
    drop_feature_rate_2 = config['drop_feature_rate_2']

    temp = config['tau']
    epochs = config['num_epochs']
    wd = config['weight_decay']

    # Step 2: Prepare data =================================================================== #
    graph, feat, labels, train_mask, test_mask = load(args.dataname)
    in_dim = feat.shape[1]

    # Step 3: Create model =================================================================== #
    model = Grace(in_dim, hid_dim, out_dim, num_layers, act_fn, temp)
    model = model.to(args.device)

    optimizer = th.optim.Adam(model.parameters(), lr=lr, weight_decay=wd)

    # Step 4: Training ======================================================================= #
    for epoch in range(epochs):
        model.train()
        optimizer.zero_grad()

        # Generate two stochastically augmented views of the same graph.
        graph1, feat1 = aug(graph, feat, drop_feature_rate_1, drop_edge_rate_1)
        graph2, feat2 = aug(graph, feat, drop_feature_rate_2, drop_edge_rate_2)

        graph1 = graph1.to(args.device)
        graph2 = graph2.to(args.device)

        feat1 = feat1.to(args.device)
        feat2 = feat2.to(args.device)

        loss = model(graph1, graph2, feat1, feat2)
        loss.backward()
        optimizer.step()

        print(f'Epoch={epoch:03d}, loss={loss.item():.4f}')

    # Step 5: Linear evaluation ============================================================== #
    print("=== Final Evaluation ===")
    graph = graph.add_self_loop()
    graph = graph.to(args.device)
    feat = feat.to(args.device)
    embeds = model.get_embedding(graph, feat)

    # Evaluate the frozen embeddings with a logistic-regression probe.
    label_classification(embeds, labels, train_mask, test_mask, split=args.split, ratio=0.1)
examples/pytorch/grace/model.py

import torch as th
import torch.nn as nn
import torch.nn.functional as F

from dgl.nn import GraphConv


# Multi-layer graph convolutional network encoder
class GCN(nn.Module):
    def __init__(self, in_dim, out_dim, act_fn, num_layers=2):
        super(GCN, self).__init__()
        assert num_layers >= 2

        self.num_layers = num_layers
        self.convs = nn.ModuleList()
        self.convs.append(GraphConv(in_dim, out_dim * 2))

        for _ in range(self.num_layers - 2):
            self.convs.append(GraphConv(out_dim * 2, out_dim * 2))

        self.convs.append(GraphConv(out_dim * 2, out_dim))
        self.act_fn = act_fn

    def forward(self, graph, feat):
        for i in range(self.num_layers):
            feat = self.act_fn(self.convs[i](graph, feat))
        return feat


# Two-layer perceptron used as the projection head
class MLP(nn.Module):
    def __init__(self, in_dim, out_dim):
        super(MLP, self).__init__()
        self.fc1 = nn.Linear(in_dim, out_dim)
        self.fc2 = nn.Linear(out_dim, in_dim)

    def forward(self, x):
        z = F.elu(self.fc1(x))
        return self.fc2(z)


class Grace(nn.Module):
    r"""
    GRACE model

    Parameters
    ----------
    in_dim: int
        Input feature size.
    hid_dim: int
        Hidden feature size.
    out_dim: int
        Output feature size.
    num_layers: int
        Number of GNN encoder layers.
    act_fn: nn.Module
        Activation function.
    temp: float
        Temperature constant.
    """
    def __init__(self, in_dim, hid_dim, out_dim, num_layers, act_fn, temp):
        super(Grace, self).__init__()
        self.encoder = GCN(in_dim, hid_dim, act_fn, num_layers)
        self.temp = temp
        self.proj = MLP(hid_dim, out_dim)

    def sim(self, z1, z2):
        # normalize embeddings across the feature dimension, then cosine similarity
        z1 = F.normalize(z1)
        z2 = F.normalize(z2)
        s = th.mm(z1, z2.t())
        return s

    def get_loss(self, z1, z2):
        # SimCLR-style (NT-Xent) contrastive loss: for node i, the positive pair is
        # the same node in the other view; all other nodes in both views are negatives.
        f = lambda x: th.exp(x / self.temp)
        refl_sim = f(self.sim(z1, z1))       # intra-view pairs
        between_sim = f(self.sim(z1, z2))    # inter-view pairs

        # between_sim.diag(): positive pairs
        x1 = refl_sim.sum(1) + between_sim.sum(1) - refl_sim.diag()
        loss = -th.log(between_sim.diag() / x1)
        return loss

    def get_embedding(self, graph, feat):
        # get embeddings from the encoder for evaluation (gradients detached)
        h = self.encoder(graph, feat)
        return h.detach()

    def forward(self, graph1, graph2, feat1, feat2):
        # encoding
        h1 = self.encoder(graph1, feat1)
        h2 = self.encoder(graph2, feat2)

        # projection
        z1 = self.proj(h1)
        z2 = self.proj(h2)

        # symmetric loss over both view orderings
        l1 = self.get_loss(z1, z2)
        l2 = self.get_loss(z2, z1)

        ret = (l1 + l2) * 0.5
        return ret.mean()
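For a quick end-to-end check with toy sizes (hypothetical, not from the example), the model can be driven the same way main.py does: encode two views, take one optimization step, then read the frozen embeddings:
```python
import dgl
import torch as th
import torch.nn as nn

from model import Grace  # the module above

# Tiny random graph with self-loops added (GraphConv expects non-zero in-degrees).
g = dgl.rand_graph(20, 60).add_self_loop()
feat = th.randn(20, 16)

model = Grace(in_dim=16, hid_dim=32, out_dim=32, num_layers=2,
              act_fn=nn.ReLU(), temp=0.5)
opt = th.optim.Adam(model.parameters(), lr=5e-4)

# In the real training loop the two views come from aug(); identical views
# are enough for a smoke test.
loss = model(g, g, feat, feat)
loss.backward()
opt.step()

embeds = model.get_embedding(g, feat)
print(loss.item(), embeds.shape)  # scalar loss, torch.Size([20, 32])
```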