"tests/git@developer.sourcefind.cn:OpenDAS/dgl.git" did not exist on "de54891fc89c958a4c1b1b37bf3b3d2155c6e9ab"
Commit 85231dc9 authored by xnouhz, committed by GitHub

[Example] Label Propagation and Correct&Smooth (#2852)



* [example] label propagation and correct&smooth

* update

* update

* update

* update

* [docs] add speed for c&s

* update

* [fix] remove gat&consistent with the author's code

* [feat] multi-type adj norm supported

* update
Co-authored-by: Mufei Li <mufeili1996@gmail.com>
parent 5de4edd1
@@ -95,12 +95,17 @@ The folder contains example implementations of selected research papers related
| [DeeperGCN: All You Need to Train Deeper GCNs](#deepergcn) | | | :heavy_check_mark: | | :heavy_check_mark: |
| [Diffusion Convolutional Recurrent Neural Network: Data-Driven Traffic Forecasting](#dcrnn) | | | :heavy_check_mark: | | |
| [GaAN: Gated Attention Networks for Learning on large and Spatiotemporal Graphs](#gaan) | | | :heavy_check_mark: | | |
| [Combining Label Propagation and Simple Models Out-performs Graph Neural Networks](#correct_and_smooth) | :heavy_check_mark: | | | | :heavy_check_mark: |
| [Learning from Labeled and Unlabeled Data with Label Propagation](#label_propagation) | :heavy_check_mark: | | | | |
## 2021
- <a name="bgnn"></a> Ivanov et al. Boost then Convolve: Gradient Boosting Meets Graph Neural Networks. [Paper link](https://openreview.net/forum?id=ebS5NUfoMKL).
- Example code: [PyTorch](../examples/pytorch/bgnn)
- Tags: semi-supervised node classification, tabular data, GBDT
- <a name="correct_and_smooth"></a> Huang et al. Combining Label Propagation and Simple Models Out-performs Graph Neural Networks. [Paper link](https://arxiv.org/abs/2010.13993).
- Example code: [PyTorch](../examples/pytorch/correct_and_smooth)
- Tags: efficiency, node classification, label propagation
## 2020
@@ -142,7 +147,7 @@ The folder contains example implementations of selected research papers related
- Tags: molecules, molecular property prediction, quantum chemistry
- <a name="tgn"></a> Rossi et al. Temporal Graph Networks For Deep Learning on Dynamic Graphs. [Paper link](https://arxiv.org/abs/2006.10637).
- Example code: [Pytorch](../examples/pytorch/tgn)
- Tags: temporal, node classification
- <a name="compgcn"></a> Vashishth, Shikhar, et al. Composition-based Multi-Relational Graph Convolutional Networks. [Paper link](https://arxiv.org/abs/1911.03082).
- Example code: [PyTorch](../examples/pytorch/compGCN)
- Tags: multi-relational graphs, graph neural network
@@ -152,7 +157,6 @@ The folder contains example implementations of selected research papers related
## 2019
- <a name="infograph"></a> Sun et al. InfoGraph: Unsupervised and Semi-supervised Graph-Level Representation Learning via Mutual Information Maximization. [Paper link](https://arxiv.org/abs/1908.01000).
- Example code: [PyTorch](../examples/pytorch/infograph)
- Tags: semi-supervised graph regression, unsupervised graph classification
@@ -229,7 +233,6 @@ The folder contains example implementations of selected research papers related
- Example code: [PyTorch](../examples/pytorch/gnn_explainer)
- Tags: Graph Neural Network, Explainability
## 2018
- <a name="dgmg"></a> Li et al. Learning Deep Generative Models of Graphs. [Paper link](https://arxiv.org/abs/1803.03324).
@@ -419,6 +422,12 @@ The folder contains example implementations of selected research papers related
- Example code: [PyTorch](../examples/pytorch/graph_matching)
- Tags: graph edit distance, graph matching
## 2002
- <a name="label_propagation"></a> Zhu & Ghahramani. Learning from Labeled and Unlabeled Data with Label Propagation. [Paper link](https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.14.3864&rep=rep1&type=pdf).
- Example code: [PyTorch](../examples/pytorch/label_propagation)
- Tags: node classification, label propagation
## 1998
- <a name="pagerank"></a> Page et al. The PageRank Citation Ranking: Bringing Order to the Web. [Paper link](http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.38.5427).
# DGL Implementation of CorrectAndSmooth
This DGL example implements the method proposed in the paper [Combining Label Propagation and Simple Models Out-performs Graph Neural Networks](https://arxiv.org/abs/2010.13993). For the original implementation, see [here](https://github.com/CUAI/CorrectAndSmooth).
Contributor: [xnuohz](https://github.com/xnuohz)
### Requirements
The codebase is implemented in Python 3.7. Package version requirements are listed below.
```
dgl 0.6.0.post1
torch 1.7.0
ogb 1.3.0
```
### The graph datasets used in this example
Open Graph Benchmark (OGB). Dataset summary:
| Dataset | #Nodes | #Edges | #Node Feats | Metric |
| :-----------: | :-------: | :--------: | :---------: | :------: |
| ogbn-arxiv | 169,343 | 1,166,243 | 128 | Accuracy |
| ogbn-products | 2,449,029 | 61,859,140 | 100 | Accuracy |
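For reference, here is a minimal sketch of loading one of these datasets with OGB's DGL loader; these are the same calls used in `main.py` below.

```python
from ogb.nodeproppred import DglNodePropPredDataset

# Downloads the dataset on first use and returns a DGLGraph plus node labels
dataset = DglNodePropPredDataset(name='ogbn-arxiv')
g, labels = dataset[0]               # graph and label tensor of shape (num_nodes, 1)
split_idx = dataset.get_idx_split()  # dict with 'train' / 'valid' / 'test' node indices
print(g.num_nodes(), g.num_edges(), labels.shape)
```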
### Usage
Train a **base predictor** first, then apply **Correct & Smooth**; the commands below follow the original hyperparameters for each dataset.
##### ogbn-arxiv
* **MLP + C&S**
```bash
python main.py --dropout 0.5
python main.py --pretrain --correction-adj DA --smoothing-adj AD
```
* **Linear + C&S**
```bash
python main.py --model linear --dropout 0.5 --epochs 1000
python main.py --model linear --pretrain --correction-alpha 0.8 --smoothing-alpha 0.6 --correction-adj AD
```
##### ogbn-products
* **Linear + C&S**
```bash
python main.py --dataset ogbn-products --model linear --dropout 0.5 --epochs 1000 --lr 0.1
python main.py --dataset ogbn-products --model linear --pretrain --correction-alpha 0.6 --smoothing-alpha 0.9
```
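Each second command loads the saved base predictor and post-processes its probabilities in two stages. Below is a minimal sketch of that post-processing step using the classes defined in `model.py`; variable names follow `main.py`, and `g`, `y_soft`, `labels` and `mask_idx` are assumed to be prepared as in the script.

```python
from model import CorrectAndSmooth

# y_soft: (N, C) class probabilities from the frozen base predictor
# labels: (N, 1) ground-truth labels; mask_idx: train + valid node indices
cs = CorrectAndSmooth(num_correction_layers=50, correction_alpha=0.979, correction_adj='DAD',
                      num_smoothing_layers=50, smoothing_alpha=0.756, smoothing_adj='DAD',
                      scale=20.)
y_soft = cs.correct(g, y_soft, labels[mask_idx], mask_idx)  # "correct": propagate residual errors
y_soft = cs.smooth(g, y_soft, labels[mask_idx], mask_idx)   # "smooth": propagate corrected labels
y_pred = y_soft.argmax(dim=-1, keepdim=True)
```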
### Performance
#### ogbn-arxiv
| | MLP | MLP + C&S | Linear | Linear + C&S |
| :-------------: | :---: | :-------: | :----: | :----------: |
| Results (Author) | 55.58 | 68.72 | 51.06 | 70.24 |
| Results (DGL) | 56.12 | 68.63 | 52.49 | 71.69 |
#### ogbn-products
| | Linear | Linear + C&S |
| :-------------: | :----: | :----------: |
| Results (Author) | 47.67 | 82.34 |
| Results (DGL) | 47.71 | 79.57 |
### Speed
| ogbn-arxiv | Time | GPU Memory | Params |
| :------------------: | :-----------: | :--------: | :-----: |
| Author, Linear + C&S | 6.3 × 10^-3 | 1,248 MB | 5,160 |
| DGL, Linear + C&S | 5.6 × 10^-3 | 1,252 MB | 5,160 |
import argparse
import copy
import os
import torch
import torch.nn.functional as F
import torch.optim as optim
import dgl
from ogb.nodeproppred import DglNodePropPredDataset, Evaluator
from model import MLP, MLPLinear, CorrectAndSmooth
def evaluate(y_pred, y_true, idx, evaluator):
return evaluator.eval({
'y_true': y_true[idx],
'y_pred': y_pred[idx]
})['acc']
def main():
# check cuda
device = f'cuda:{args.gpu}' if torch.cuda.is_available() and args.gpu >= 0 else 'cpu'
# load data
dataset = DglNodePropPredDataset(name=args.dataset)
evaluator = Evaluator(name=args.dataset)
split_idx = dataset.get_idx_split()
g, labels = dataset[0] # graph: DGLGraph object, label: torch tensor of shape (num_nodes, num_tasks)
if args.dataset == 'ogbn-arxiv':
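# ogbn-arxiv is stored as a directed graph: make it bidirected and standardize the node features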
g = dgl.to_bidirected(g, copy_ndata=True)
feat = g.ndata['feat']
feat = (feat - feat.mean(0)) / feat.std(0)
g.ndata['feat'] = feat
g = g.to(device)
feats = g.ndata['feat']
labels = labels.to(device)
# load masks for train / validation / test
train_idx = split_idx["train"].to(device)
valid_idx = split_idx["valid"].to(device)
test_idx = split_idx["test"].to(device)
n_features = feats.size()[-1]
n_classes = dataset.num_classes
# load model
if args.model == 'mlp':
model = MLP(n_features, args.hid_dim, n_classes, args.num_layers, args.dropout)
elif args.model == 'linear':
model = MLPLinear(n_features, n_classes)
else:
raise NotImplementedError(f'Model {args.model} is not supported.')
model = model.to(device)
print(f'Model parameters: {sum(p.numel() for p in model.parameters())}')
if args.pretrain:
print('---------- Before ----------')
model.load_state_dict(torch.load(f'base/{args.dataset}-{args.model}.pt'))
model.eval()
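# the base predictor outputs log-probabilities (log_softmax); exponentiate to get class probabilities for C&S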
y_soft = model(feats).exp()
y_pred = y_soft.argmax(dim=-1, keepdim=True)
valid_acc = evaluate(y_pred, labels, valid_idx, evaluator)
test_acc = evaluate(y_pred, labels, test_idx, evaluator)
print(f'Valid acc: {valid_acc:.4f} | Test acc: {test_acc:.4f}')
print('---------- Correct & Smoothing ----------')
cs = CorrectAndSmooth(num_correction_layers=args.num_correction_layers,
correction_alpha=args.correction_alpha,
correction_adj=args.correction_adj,
num_smoothing_layers=args.num_smoothing_layers,
smoothing_alpha=args.smoothing_alpha,
smoothing_adj=args.smoothing_adj,
scale=args.scale)
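# both train and validation labels serve as the ground-truth set for the correct & smooth steps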
mask_idx = torch.cat([train_idx, valid_idx])
y_soft = cs.correct(g, y_soft, labels[mask_idx], mask_idx)
y_soft = cs.smooth(g, y_soft, labels[mask_idx], mask_idx)
y_pred = y_soft.argmax(dim=-1, keepdim=True)
valid_acc = evaluate(y_pred, labels, valid_idx, evaluator)
test_acc = evaluate(y_pred, labels, test_idx, evaluator)
print(f'Valid acc: {valid_acc:.4f} | Test acc: {test_acc:.4f}')
else:
opt = optim.Adam(model.parameters(), lr=args.lr)
best_acc = 0
best_model = copy.deepcopy(model)
# training
print('---------- Training ----------')
for i in range(args.epochs):
model.train()
opt.zero_grad()
logits = model(feats)
train_loss = F.nll_loss(logits[train_idx], labels.squeeze(1)[train_idx])
train_loss.backward()
opt.step()
model.eval()
with torch.no_grad():
logits = model(feats)
y_pred = logits.argmax(dim=-1, keepdim=True)
train_acc = evaluate(y_pred, labels, train_idx, evaluator)
valid_acc = evaluate(y_pred, labels, valid_idx, evaluator)
print(f'Epoch {i} | Train loss: {train_loss.item():.4f} | Train acc: {train_acc:.4f} | Valid acc {valid_acc:.4f}')
if valid_acc > best_acc:
best_acc = valid_acc
best_model = copy.deepcopy(model)
# testing & saving model
print('---------- Testing ----------')
best_model.eval()
logits = best_model(feats)
y_pred = logits.argmax(dim=-1, keepdim=True)
test_acc = evaluate(y_pred, labels, test_idx, evaluator)
print(f'Test acc: {test_acc:.4f}')
if not os.path.exists('base'):
os.makedirs('base')
torch.save(best_model.state_dict(), f'base/{args.dataset}-{args.model}.pt')
if __name__ == '__main__':
"""
Correct & Smoothing Hyperparameters
"""
parser = argparse.ArgumentParser(description='Base predictor(C&S)')
# Dataset
parser.add_argument('--gpu', type=int, default=0, help='-1 for cpu')
parser.add_argument('--dataset', type=str, default='ogbn-arxiv', choices=['ogbn-arxiv', 'ogbn-products'])
# Base predictor
parser.add_argument('--model', type=str, default='mlp', choices=['mlp', 'linear'])
parser.add_argument('--num-layers', type=int, default=3)
parser.add_argument('--hid-dim', type=int, default=256)
parser.add_argument('--dropout', type=float, default=0.4)
parser.add_argument('--lr', type=float, default=0.01)
parser.add_argument('--epochs', type=int, default=300)
# extra options for gat
parser.add_argument('--n-heads', type=int, default=3)
parser.add_argument('--attn_drop', type=float, default=0.05)
# C & S
parser.add_argument('--pretrain', action='store_true', help='Load the saved base predictor and apply C&S')
parser.add_argument('--num-correction-layers', type=int, default=50)
parser.add_argument('--correction-alpha', type=float, default=0.979)
parser.add_argument('--correction-adj', type=str, default='DAD')
parser.add_argument('--num-smoothing-layers', type=int, default=50)
parser.add_argument('--smoothing-alpha', type=float, default=0.756)
parser.add_argument('--smoothing-adj', type=str, default='DAD')
parser.add_argument('--scale', type=float, default=20.)
args = parser.parse_args()
print(args)
main()
import torch
import torch.nn as nn
import torch.nn.functional as F
import dgl.function as fn
class MLPLinear(nn.Module):
def __init__(self, in_dim, out_dim):
super(MLPLinear, self).__init__()
self.linear = nn.Linear(in_dim, out_dim)
self.reset_parameters()
def reset_parameters(self):
self.linear.reset_parameters()
def forward(self, x):
return F.log_softmax(self.linear(x), dim=-1)
class MLP(nn.Module):
def __init__(self, in_dim, hid_dim, out_dim, num_layers, dropout=0.):
super(MLP, self).__init__()
assert num_layers >= 2
self.linears = nn.ModuleList()
self.bns = nn.ModuleList()
self.linears.append(nn.Linear(in_dim, hid_dim))
self.bns.append(nn.BatchNorm1d(hid_dim))
for _ in range(num_layers - 2):
self.linears.append(nn.Linear(hid_dim, hid_dim))
self.bns.append(nn.BatchNorm1d(hid_dim))
self.linears.append(nn.Linear(hid_dim, out_dim))
self.dropout = dropout
self.reset_parameters()
def reset_parameters(self):
for layer in self.linears:
layer.reset_parameters()
for layer in self.bns:
layer.reset_parameters()
def forward(self, x):
for linear, bn in zip(self.linears[:-1], self.bns):
x = linear(x)
x = F.relu(x, inplace=True)
x = bn(x)
x = F.dropout(x, p=self.dropout, training=self.training)
x = self.linears[-1](x)
return F.log_softmax(x, dim=-1)
class LabelPropagation(nn.Module):
r"""
Description
-----------
Introduced in `Learning from Labeled and Unlabeled Data with Label Propagation <https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.14.3864&rep=rep1&type=pdf>`_
.. math::
\mathbf{Y}^{\prime} = \alpha \cdot \mathbf{D}^{-1/2} \mathbf{A}
\mathbf{D}^{-1/2} \mathbf{Y} + (1 - \alpha) \mathbf{Y},
where labels on unlabeled nodes are inferred from labeled nodes via propagation.
Parameters
----------
num_layers: int
The number of propagations.
alpha: float
The :math:`\alpha` coefficient.
adj: str
'DAD': D^-0.5 * A * D^-0.5
'DA': D^-1 * A
'AD': A * D^-1
"""
def __init__(self, num_layers, alpha, adj='DAD'):
super(LabelPropagation, self).__init__()
self.num_layers = num_layers
self.alpha = alpha
self.adj = adj
@torch.no_grad()
def forward(self, g, labels, mask=None, post_step=lambda y: y.clamp_(0., 1.)):
with g.local_scope():
if labels.dtype == torch.long:
labels = F.one_hot(labels.view(-1)).to(torch.float32)
y = labels
if mask is not None:
y = torch.zeros_like(labels)
y[mask] = labels[mask]
last = (1 - self.alpha) * y
degs = g.in_degrees().float().clamp(min=1)
norm = torch.pow(degs, -0.5 if self.adj == 'DAD' else -1).to(labels.device).unsqueeze(1)
for _ in range(self.num_layers):
# Assume the graphs to be undirected
if self.adj in ['DAD', 'AD']:
y = norm * y
g.ndata['h'] = y
g.update_all(fn.copy_u('h', 'm'), fn.sum('m', 'h'))
y = self.alpha * g.ndata.pop('h')
if self.adj in ['DAD', 'DA']:
y = y * norm
y = post_step(last + y)
return y
class CorrectAndSmooth(nn.Module):
r"""
Description
-----------
Introduced in `Combining Label Propagation and Simple Models Out-performs Graph Neural Networks <https://arxiv.org/abs/2010.13993>`_
Parameters
----------
num_correction_layers: int
The number of propagation steps in the correction phase.
correction_alpha: float
The coefficient of correction.
correction_adj: str
'DAD': D^-0.5 * A * D^-0.5
'DA': D^-1 * A
'AD': A * D^-1
num_smoothing_layers: int
The number of propagation steps in the smoothing phase.
smoothing_alpha: float
The coefficient of smoothing.
smoothing_adj: str
'DAD': D^-0.5 * A * D^-0.5
'DA': D^-1 * A
'AD': A * D^-1
autoscale: bool, optional
If set to True, will automatically determine the scaling factor :math:`\sigma`. Default is True.
scale: float, optional
The scaling factor :math:`\sigma`, in case :obj:`autoscale = False`. Default is 1.
"""
def __init__(self,
num_correction_layers,
correction_alpha,
correction_adj,
num_smoothing_layers,
smoothing_alpha,
smoothing_adj,
autoscale=True,
scale=1.):
super(CorrectAndSmooth, self).__init__()
self.autoscale = autoscale
self.scale = scale
self.prop1 = LabelPropagation(num_correction_layers,
correction_alpha,
correction_adj)
self.prop2 = LabelPropagation(num_smoothing_layers,
smoothing_alpha,
smoothing_adj)
def correct(self, g, y_soft, y_true, mask):
with g.local_scope():
assert abs(float(y_soft.sum()) / y_soft.size(0) - 1.0) < 1e-2
numel = int(mask.sum()) if mask.dtype == torch.bool else mask.size(0)
assert y_true.size(0) == numel
if y_true.dtype == torch.long:
y_true = F.one_hot(y_true.view(-1), y_soft.size(-1)).to(y_soft.dtype)
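# residual error: zero on unlabeled nodes, (one-hot label - soft prediction) on labeled nodes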
error = torch.zeros_like(y_soft)
error[mask] = y_true - y_soft[mask]
if self.autoscale:
smoothed_error = self.prop1(g, error, post_step=lambda x: x.clamp_(-1., 1.))
sigma = error[mask].abs().sum() / numel
scale = sigma / smoothed_error.abs().sum(dim=1, keepdim=True)
scale[scale.isinf() | (scale > 1000)] = 1.0
result = y_soft + scale * smoothed_error
result[result.isnan()] = y_soft[result.isnan()]
return result
else:
def fix_input(x):
x[mask] = error[mask]
return x
smoothed_error = self.prop1(g, error, post_step=fix_input)
result = y_soft + self.scale * smoothed_error
result[result.isnan()] = y_soft[result.isnan()]
return result
def smooth(self, g, y_soft, y_true, mask):
with g.local_scope():
numel = int(mask.sum()) if mask.dtype == torch.bool else mask.size(0)
assert y_true.size(0) == numel
if y_true.dtype == torch.long:
y_true = F.one_hot(y_true.view(-1), y_soft.size(-1)).to(y_soft.dtype)
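# replace predictions on labeled nodes with their true labels, then propagate (smooth) over the graph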
y_soft[mask] = y_true
return self.prop2(g, y_soft)
# DGL Implementation of Label Propagation
This DGL example implements the method proposed in the paper [Learning from Labeled and Unlabeled Data with Label Propagation](https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.14.3864&rep=rep1&type=pdf).
Contributor: [xnuohz](https://github.com/xnuohz)
### Requirements
The codebase is implemented in Python 3.7. Package version requirements are listed below.
```
dgl 0.6.0.post1
torch 1.7.0
```
### The graph datasets used in this example
DGL's built-in Cora, Pubmed and Citeseer datasets. Dataset summary:
| Dataset | #Nodes | #Edges | #Feats | #Classes | #Train Nodes | #Val Nodes | #Test Nodes |
| :------: | :----: | :----: | :----: | :------: | :----------: | :--------: | :---------: |
| Citeseer | 3,327 | 9,228 | 3,703 | 6 | 120 | 500 | 1000 |
| Cora | 2,708 | 10,556 | 1,433 | 7 | 140 | 500 | 1000 |
| Pubmed | 19,717 | 88,651 | 500 | 3 | 60 | 500 | 1000 |
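A minimal sketch of loading one of these built-in datasets; these are the same calls used in `main.py` below.

```python
from dgl.data import CoraGraphDataset

# Downloads the dataset on first use; the graph carries features, labels and split masks
dataset = CoraGraphDataset()
g = dataset[0]
print(g.num_nodes(), g.num_edges())
print(list(g.ndata.keys()))  # includes 'feat', 'label', 'train_mask', 'val_mask', 'test_mask'
```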
### Usage
```bash
# Cora
python main.py
# Citeseer
python main.py --dataset Citeseer --num-layers 100 --alpha 0.99
# Pubmed
python main.py --dataset Pubmed --num-layers 60 --alpha 1
```
### Performance
| Dataset | Cora | Citeseer | Pubmed |
| :----------: | :---: | :------: | :----: |
| Results (DGL) | 69.20 | 51.30 | 71.40 |
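The commands above boil down to a single propagation call. A minimal end-to-end sketch on Cora, using the `LabelPropagation` class from `model.py` below (default hyperparameters):

```python
from dgl import add_self_loop
from dgl.data import CoraGraphDataset
from model import LabelPropagation

# Load Cora and add self-loops so each node keeps part of its own label mass
g = add_self_loop(CoraGraphDataset()[0])
labels = g.ndata['label']
train_idx = g.ndata['train_mask'].nonzero(as_tuple=False).squeeze()
test_idx = g.ndata['test_mask'].nonzero(as_tuple=False).squeeze()

lp = LabelPropagation(num_layers=10, alpha=0.5)  # no learnable parameters
logits = lp(g, labels, mask=train_idx)           # (N, C) propagated soft labels
acc = (logits[test_idx].argmax(dim=1) == labels[test_idx]).float().mean().item()
print(f'Test acc: {acc:.4f}')
```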
import argparse
import torch
import dgl
from dgl.data import CoraGraphDataset, CiteseerGraphDataset, PubmedGraphDataset
from model import LabelPropagation
def main():
# check cuda
device = f'cuda:{args.gpu}' if torch.cuda.is_available() and args.gpu >= 0 else 'cpu'
# load data
if args.dataset == 'Cora':
dataset = CoraGraphDataset()
elif args.dataset == 'Citeseer':
dataset = CiteseerGraphDataset()
elif args.dataset == 'Pubmed':
dataset = PubmedGraphDataset()
else:
raise ValueError('Dataset {} is invalid.'.format(args.dataset))
g = dataset[0]
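# add self-loops so each node also aggregates its own label during propagation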
g = dgl.add_self_loop(g)
labels = g.ndata.pop('label').to(device).long()
# load masks for train / test, valid is not used.
train_mask = g.ndata.pop('train_mask')
test_mask = g.ndata.pop('test_mask')
train_idx = torch.nonzero(train_mask, as_tuple=False).squeeze().to(device)
test_idx = torch.nonzero(test_mask, as_tuple=False).squeeze().to(device)
g = g.to(device)
# label propagation
lp = LabelPropagation(args.num_layers, args.alpha)
logits = lp(g, labels, mask=train_idx)
test_acc = torch.sum(logits[test_idx].argmax(dim=1) == labels[test_idx]).item() / len(test_idx)
print("Test Acc {:.4f}".format(test_acc))
if __name__ == '__main__':
"""
Label Propagation Hyperparameters
"""
parser = argparse.ArgumentParser(description='LP')
parser.add_argument('--gpu', type=int, default=0)
parser.add_argument('--dataset', type=str, default='Cora')
parser.add_argument('--num-layers', type=int, default=10)
parser.add_argument('--alpha', type=float, default=0.5)
args = parser.parse_args()
print(args)
main()
import torch
import torch.nn as nn
import torch.nn.functional as F
import dgl.function as fn
class LabelPropagation(nn.Module):
r"""
Description
-----------
Introduced in `Learning from Labeled and Unlabeled Data with Label Propagation <https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.14.3864&rep=rep1&type=pdf>`_
.. math::
\mathbf{Y}^{\prime} = \alpha \cdot \mathbf{D}^{-1/2} \mathbf{A}
\mathbf{D}^{-1/2} \mathbf{Y} + (1 - \alpha) \mathbf{Y},
where labels on unlabeled nodes are inferred from labeled nodes via propagation.
Parameters
----------
num_layers: int
The number of propagations.
alpha: float
The :math:`\alpha` coefficient.
"""
def __init__(self, num_layers, alpha):
super(LabelPropagation, self).__init__()
self.num_layers = num_layers
self.alpha = alpha
@torch.no_grad()
def forward(self, g, labels, mask=None, post_step=lambda y: y.clamp_(0., 1.)):
with g.local_scope():
if labels.dtype == torch.long:
labels = F.one_hot(labels.view(-1)).to(torch.float32)
y = labels
if mask is not None:
y = torch.zeros_like(labels)
y[mask] = labels[mask]
last = (1 - self.alpha) * y
degs = g.in_degrees().float().clamp(min=1)
norm = torch.pow(degs, -0.5).to(labels.device).unsqueeze(1)
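# symmetric normalization: scale labels by D^-1/2 before and after neighbor aggregation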
for _ in range(self.num_layers):
# Assume the graphs to be undirected
g.ndata['h'] = y * norm
g.update_all(fn.copy_u('h', 'm'), fn.sum('m', 'h'))
y = last + self.alpha * g.ndata.pop('h') * norm
y = post_step(y)
last = (1 - self.alpha) * y
return y