OpenDAS / dgl · Commit c63a926d (unverified)
Authored Nov 27, 2023 by Rhett Ying; committed by GitHub, Nov 27, 2023

[doc] remove deprecated tutorials for minibatch training (#6625)
parent 74684bbe

Showing 8 changed files with 4 additions and 699 deletions (+4, -699)
docs/source/conf.py                                    +0    -2
docs/source/index.rst                                  +1    -2
docs/source/notebooks/stochastic_training/index.rst    +3    -3
tutorials/large/.gitignore                             +0    -2
tutorials/large/L0_neighbor_sampling_overview.py       +0    -119
tutorials/large/L1_large_node_classification.py        +0    -312
tutorials/large/L2_large_link_prediction.py            +0    -257
tutorials/large/README.txt                             +0    -2
docs/source/conf.py

@@ -211,7 +211,6 @@ from sphinx_gallery.sorting import FileNameSortKey
 examples_dirs = [
     "../../tutorials/blitz",
-    "../../tutorials/large",
     "../../tutorials/dist",
     "../../tutorials/models",
     "../../tutorials/multi",
@@ -219,7 +218,6 @@ examples_dirs = [
 ]  # path to find sources
 gallery_dirs = [
     "tutorials/blitz/",
-    "tutorials/large/",
     "tutorials/dist/",
     "tutorials/models/",
     "tutorials/multi/",
docs/source/index.rst

@@ -28,7 +28,6 @@ Welcome to Deep Graph Library Tutorials and Documentation
    guide_ko/index
    notebooks/sparse/index
    notebooks/stochastic_training/index
-   tutorials/large/index
    tutorials/cpu/index
    tutorials/multi/index
    tutorials/dist/index
@@ -100,7 +99,7 @@ For acquainted users who wish to learn more advanced usage,
 * `Learn DGL by examples <https://github.com/dmlc/dgl/tree/master/examples>`_.
 * Read the :doc:`User Guide<guide/index>` (:doc:`中文版链接<guide_cn/index>`), which explains the concepts
   and usage of DGL in much more details.
-* Go through the tutorials for :doc:`Stochastic Training of GNNs <tutorials/large/index>`,
+* Go through the tutorials for :doc:`Stochastic Training of GNNs <notebooks/stochastic_training/index>`,
   which covers the basic steps for training GNNs on large graphs in mini-batches.
 * :doc:`Study classical papers <tutorials/models/index>` on graph machine learning alongside DGL.
 * Search for the usage of a specific API in the :doc:`API reference manual <api/python/index>`,
docs/source/notebooks/stochastic_training/index.rst

-GNN Stochastic Training
-=========================
+Stochastic Training of GNNs
+===========================
 This tutorial introduces how to train GNNs with stochastic training.
@@ -7,6 +7,6 @@ This tutorial introduces how to train GNNs with stochastic training.
    :maxdepth: 1
    :titlesonly:
+   neighbor_sampling_overview.nblink
    node_classification.nblink
    link_prediction.nblink
-   neighbor_sampling_overview.nblink
tutorials/large/.gitignore  (deleted, 100644 → 0)

dataset
model.pt
tutorials/large/L0_neighbor_sampling_overview.py  (deleted, 100644 → 0)
"""
Introduction of Neighbor Sampling
=================================
In :doc:`previous tutorials <../blitz/1_introduction>` you have learned how to
train GNNs by computing the representations of all nodes on a graph.
However, sometimes your graph is too large to fit the computation of all
nodes in a single GPU.
By the end of this tutorial, you will be able to
- Understand the pipeline of stochastic GNN training.
- Understand what neighbor sampling is and why it yields a bipartite
  graph for each GNN layer.
"""
######################################################################
# Message Passing Review
# ----------------------
#
# Recall that in `Gilmer et al. <https://arxiv.org/abs/1704.01212>`__
# (also in :doc:`message passing tutorial <../blitz/3_message_passing>`), the
# message passing formulation is as follows:
#
# .. math::
#
#
# m_{u\to v}^{(l)} = M^{(l)}\left(h_v^{(l-1)}, h_u^{(l-1)}, e_{u\to v}^{(l-1)}\right)
#
# .. math::
#
#
# m_{v}^{(l)} = \sum_{u\in\mathcal{N}(v)}m_{u\to v}^{(l)}
#
# .. math::
#
#
# h_v^{(l)} = U^{(l)}\left(h_v^{(l-1)}, m_v^{(l)}\right)
#
# where DGL calls :math:`M^{(l)}` the *message function*, :math:`\sum` the
# *reduce function* and :math:`U^{(l)}` the *update function*. Note that
# :math:`\sum` here can represent any function and is not necessarily a
# summation.
#
# Essentially, the :math:`l`-th layer representation of a single node
# depends on the :math:`(l-1)`-th layer representation of the same node,
# as well as the :math:`(l-1)`-th layer representation of the neighboring
# nodes. Those :math:`(l-1)`-th layer representations then depend on the
# :math:`(l-2)`-th layer representation of those nodes, as well as their
# neighbors.
#
# The following animation shows how a 2-layer GNN is supposed to compute
# the output of node 5:
#
# |image1|
#
# You can see that to compute node 5 from the second layer, you will need
# its direct neighbors’ first layer representations (colored in yellow),
# which in turn need their direct neighbors’ (i.e. node 5’s second-hop
# neighbors’) representations (colored in green).
#
# .. |image1| image:: https://data.dgl.ai/tutorial/img/sampling.gif
#
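# To make the formulation concrete, the following is a minimal sketch (not
# part of the original tutorial; the toy graph and feature names are made up)
# of one round of message passing with DGL's built-in functions, using a copy
# message function and a sum reduce function:
import dgl
import dgl.function as fn
import torch

g = dgl.rand_graph(6, 15)  # a small random graph for illustration
g.ndata["h"] = torch.randn(6, 4)  # h^{(l-1)} for every node
# M: copy the source node feature as the message; sum: aggregate into m_v.
g.update_all(fn.copy_u("h", "m"), fn.sum("m", "h_sum"))
print(g.ndata["h_sum"].shape)  # aggregated messages, one row per node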
######################################################################
# Neighbor Sampling Overview
# --------------------------
#
# You can also see from the previous example that computing representations
# for a small number of nodes often requires input features of a
# significantly larger number of nodes. Taking all neighbors for message
# aggregation is often too costly since the nodes needed for input
# features would easily cover a large portion of the graph, especially for
# real-world graphs which are often
# `scale-free <https://en.wikipedia.org/wiki/Scale-free_network>`__.
#
# Neighbor sampling addresses this issue by selecting a subset of the
# neighbors to perform aggregation. For instance, to compute
# :math:`\boldsymbol{h}_5^{(2)}`, you can choose two of the neighbors
# instead of all of them to aggregate, as in the following animation:
#
# |image2|
#
# You can see that this method requires far fewer nodes for message
# passing in a single minibatch.
#
# .. |image2| image:: https://data.dgl.ai/tutorial/img/bipartite.gif
#
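# The following is a minimal sketch of neighbor sampling itself (not part of
# the original tutorial; the random graph, seed node, and fanout of 2 are
# illustrative assumptions). ``dgl.sampling.sample_neighbors`` keeps at most
# ``fanout`` incoming neighbors for each seed node:
import dgl
import torch

big_g = dgl.rand_graph(100, 1000)
seed_nodes = torch.tensor([5])
frontier = dgl.sampling.sample_neighbors(big_g, seed_nodes, fanout=2)
print(frontier.num_edges())  # at most 2 sampled in-edges for node 5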
######################################################################
# You can also notice that the computation dependencies in the animation
# above can be described as a series of bipartite graphs.
# The output nodes (called *destination nodes*) are on one side and all the
# nodes necessary for inputs (called *source nodes*) are on the other side.
# The arrows indicate how the sampled neighbors propagate messages to the nodes.
# DGL calls such graphs *message flow graphs* (MFGs).
#
# Note that some GNN modules, such as `SAGEConv`, need to use the destination
# nodes' features on the previous layer to compute the outputs. Without
# loss of generality, DGL always includes the destination nodes themselves
# in the source nodes.
#
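# Continuing the sketch above (again not part of the original tutorial), a
# sampled frontier can be converted into an MFG with ``dgl.to_block``, which
# makes the bipartite structure explicit and places the destination nodes
# first among the source nodes:
block = dgl.to_block(frontier, dst_nodes=seed_nodes)
print(block.num_src_nodes(), block.num_dst_nodes())
print(block.srcdata[dgl.NID][: block.num_dst_nodes()])  # the destination node IDs
print(block.dstdata[dgl.NID])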
######################################################################
# What’s next?
# ------------
#
# :doc:`Stochastic GNN Training for Node Classification in
# DGL <L1_large_node_classification>`
#
# Thumbnail credits: Understanding graph embedding methods and their applications, Mengjia Xu
# sphinx_gallery_thumbnail_path = '_static/large_L0_neighbor_sampling_overview.png'
tutorials/large/L1_large_node_classification.py  (deleted, 100644 → 0)
"""
Node Classification
===========================================================
This tutorial shows how to train a multi-layer GraphSAGE for node
classification on ``ogbn-arxiv`` provided by `Open Graph
Benchmark (OGB) <https://ogb.stanford.edu/>`__. The dataset contains around
170 thousand nodes and 1 million edges.
By the end of this tutorial, you will be able to
- Train a GNN model for node classification on a single GPU with DGL's
neighbor sampling components.
This tutorial assumes that you have read the :doc:`Introduction of Neighbor
Sampling for GNN Training <L0_neighbor_sampling_overview>`.
"""
######################################################################
# Loading Dataset
# ---------------
#
# `ogbn-arxiv` is already prepared as ``BuiltinDataset`` in GraphBolt.
#
import os

os.environ["DGLBACKEND"] = "pytorch"
import dgl
import dgl.graphbolt as gb
import numpy as np
import torch

dataset = gb.BuiltinDataset("ogbn-arxiv").load()
device = "cpu"  # change to 'cuda' for GPU
######################################################################
# The dataset consists of a graph, features, and tasks. You can get the
# training-validation-test sets from the tasks. Seed nodes and corresponding
# labels are already stored in each training-validation-test set. Other
# metadata such as the number of classes is also stored in the tasks. In this
# dataset, there is only one task: `node classification`.
#
graph = dataset.graph
feature = dataset.feature
train_set = dataset.tasks[0].train_set
valid_set = dataset.tasks[0].validation_set
test_set = dataset.tasks[0].test_set
task_name = dataset.tasks[0].metadata["name"]
num_classes = dataset.tasks[0].metadata["num_classes"]
print(f"Task: {task_name}. Number of classes: {num_classes}")
######################################################################
# How DGL Handles Computation Dependency
# --------------------------------------
#
# In the :doc:`previous tutorial <L0_neighbor_sampling_overview>`, you
# have seen that the computation dependency for message passing of a
# single node can be described as a series of *message flow graphs* (MFG).
#
# |image1|
#
# .. |image1| image:: https://data.dgl.ai/tutorial/img/bipartite.gif
#
######################################################################
# Defining Neighbor Sampler and Data Loader in DGL
# ------------------------------------------------
#
# DGL provides tools to iterate over the dataset in minibatches
# while generating the computation dependencies to compute their outputs
# with the MFGs above. For node classification, you can use
# ``dgl.graphbolt.MultiProcessDataLoader`` for iterating over the dataset.
# It accepts a data pipe that generates minibatches of nodes and their
# labels, samples neighbors for each node, and generates the computation
# dependencies in the form of MFGs. Feature fetching, block creation and
# copying to target device are also supported. All these operations are
# split into separate stages in the data pipe, so that you can customize
# the data pipeline by inserting your own operations.
#
# .. note::
#
# To write your own neighbor sampler, please refer to :ref:`this user
# guide section <guide-minibatch-customizing-neighborhood-sampler>`.
#
#
# Let’s say that each node will gather messages from 4 neighbors on each
# layer. The code defining the data loader and neighbor sampler will look
# like the following.
#
datapipe = gb.ItemSampler(train_set, batch_size=1024, shuffle=True)
datapipe = datapipe.sample_neighbor(graph, [4, 4])
datapipe = datapipe.fetch_feature(feature, node_feature_keys=["feat"])
datapipe = datapipe.to_dgl()
datapipe = datapipe.copy_to(device)
train_dataloader = gb.MultiProcessDataLoader(datapipe, num_workers=0)
######################################################################
# .. note::
#
# In this example, neighborhood sampling runs on CPU. If you are
# interested in running it on GPU, please refer to
# :ref:`guide-minibatch-gpu-sampling`.
#
######################################################################
# You can iterate over the data loader and a ``DGLMiniBatch`` object
# is yielded.
#
data = next(iter(train_dataloader))
print(data)
######################################################################
# You can get the input node IDs from MFGs.
#
mfgs = data.blocks
input_nodes = mfgs[0].srcdata[dgl.NID]
print(f"Input nodes: {input_nodes}.")
######################################################################
# Defining Model
# --------------
#
# Let’s consider training a 2-layer GraphSAGE with neighbor sampling. The
# model can be written as follows:
#
import torch.nn as nn
import torch.nn.functional as F
from dgl.nn import SAGEConv


class Model(nn.Module):
    def __init__(self, in_feats, h_feats, num_classes):
        super(Model, self).__init__()
        self.conv1 = SAGEConv(in_feats, h_feats, aggregator_type="mean")
        self.conv2 = SAGEConv(h_feats, num_classes, aggregator_type="mean")
        self.h_feats = h_feats

    def forward(self, mfgs, x):
        # Lines that are changed are marked with an arrow: "<---"
        h_dst = x[: mfgs[0].num_dst_nodes()]  # <---
        h = self.conv1(mfgs[0], (x, h_dst))  # <---
        h = F.relu(h)
        h_dst = h[: mfgs[1].num_dst_nodes()]  # <---
        h = self.conv2(mfgs[1], (h, h_dst))  # <---
        return h


in_size = feature.size("node", None, "feat")[0]
model = Model(in_size, 64, num_classes).to(device)
######################################################################
# If you compare against the code in the
# :doc:`introduction <../blitz/1_introduction>`, you will notice several
# differences:
#
# - **DGL GNN layers on MFGs**. Instead of computing on the
# full graph:
#
# .. code:: python
#
# h = self.conv1(g, x)
#
# you only compute on the sampled MFG:
#
# .. code:: python
#
# h = self.conv1(mfgs[0], (x, h_dst))
#
# All DGL’s GNN modules support message passing on MFGs,
# where you supply a pair of features, one for source nodes and another
# for destination nodes.
#
# - **Feature slicing for self-dependency**. There are statements that
# perform slicing to obtain the previous-layer representation of the
# nodes:
#
# .. code:: python
#
# h_dst = x[:mfgs[0].num_dst_nodes()]
#
# The ``num_dst_nodes`` method works with MFGs and returns the number of
# destination nodes.
#
# Since the first few source nodes of the yielded MFG are
# always the same as the destination nodes, these statements obtain the
# representations of the destination nodes on the previous layer. They are
# then combined with neighbor aggregation in the ``dgl.nn.SAGEConv`` layer.
#
# .. note::
#
# See the :doc:`custom message passing
# tutorial <L4_message_passing>` for more details on how to
# manipulate MFGs produced in this way, such as the usage
# of ``num_dst_nodes``.
#
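# As a small optional check (a sketch, not part of the original tutorial; it
# assumes the minibatch ``data`` and its ``mfgs`` from above are still in
# scope): because the destination nodes are listed first among the source
# nodes, ``x[:num_dst_nodes()]`` selects exactly their feature rows.
mfg = mfgs[0]
print(mfg.num_src_nodes(), mfg.num_dst_nodes())
print(mfg.srcdata[dgl.NID][: mfg.num_dst_nodes()])  # IDs of the destination nodes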
######################################################################
# Defining Training Loop
# ----------------------
#
# The following initializes the model and defines the optimizer.
#
opt = torch.optim.Adam(model.parameters())
######################################################################
# When computing the validation score for model selection, usually you can
# also do neighbor sampling. To do that, you need to define another data
# loader.
#
datapipe = gb.ItemSampler(valid_set, batch_size=1024, shuffle=False)
datapipe = datapipe.sample_neighbor(graph, [4, 4])
datapipe = datapipe.fetch_feature(feature, node_feature_keys=["feat"])
datapipe = datapipe.to_dgl()
datapipe = datapipe.copy_to(device)
valid_dataloader = gb.MultiProcessDataLoader(datapipe, num_workers=0)

import sklearn.metrics
######################################################################
# The following is a training loop that performs validation every epoch.
# It also saves the model with the best validation accuracy into a file.
#
import tqdm

best_accuracy = 0
best_model_path = "model.pt"
for epoch in range(10):
    model.train()

    with tqdm.tqdm(train_dataloader) as tq:
        for step, data in enumerate(tq):
            x = data.node_features["feat"]
            labels = data.labels

            predictions = model(data.blocks, x)

            loss = F.cross_entropy(predictions, labels)
            opt.zero_grad()
            loss.backward()
            opt.step()

            accuracy = sklearn.metrics.accuracy_score(
                labels.cpu().numpy(),
                predictions.argmax(1).detach().cpu().numpy(),
            )

            tq.set_postfix(
                {"loss": "%.03f" % loss.item(), "acc": "%.03f" % accuracy},
                refresh=False,
            )

    model.eval()

    predictions = []
    labels = []
    with tqdm.tqdm(valid_dataloader) as tq, torch.no_grad():
        for data in tq:
            x = data.node_features["feat"]
            labels.append(data.labels.cpu().numpy())
            predictions.append(model(data.blocks, x).argmax(1).cpu().numpy())
        predictions = np.concatenate(predictions)
        labels = np.concatenate(labels)
        accuracy = sklearn.metrics.accuracy_score(labels, predictions)
        print("Epoch {} Validation Accuracy {}".format(epoch, accuracy))
        if best_accuracy < accuracy:
            best_accuracy = accuracy
            torch.save(model.state_dict(), best_model_path)
        # Note that this tutorial does not train the whole model to the end.
        break
######################################################################
# Conclusion
# ----------
#
# In this tutorial, you have learned how to train a multi-layer GraphSAGE
# with neighbor sampling.
#
# What’s next?
# ------------
#
# - :doc:`Stochastic training of GNN for link
# prediction <L2_large_link_prediction>`.
# - :doc:`Adapting your custom GNN module for stochastic
# training <L4_message_passing>`.
# - During inference you may wish to disable neighbor sampling. If so,
#   please refer to the :ref:`user guide on exact offline
#   inference <guide-minibatch-inference>`. A simple full-neighbor sampling
#   alternative is sketched below.
#
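######################################################################
# The following sketch is not part of the original tutorial: it shows a
# simple full-neighbor alternative for evaluation by setting every fanout to
# -1 (keep all neighbors), reusing the GraphBolt pipeline defined above. For
# true layer-wise offline inference, see the user guide section linked above.
#
datapipe = gb.ItemSampler(valid_set, batch_size=1024, shuffle=False)
datapipe = datapipe.sample_neighbor(graph, [-1, -1])  # -1 keeps all neighbors
datapipe = datapipe.fetch_feature(feature, node_feature_keys=["feat"])
datapipe = datapipe.to_dgl()
datapipe = datapipe.copy_to(device)
full_neighbor_dataloader = gb.MultiProcessDataLoader(datapipe, num_workers=0)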
tutorials/large/L2_large_link_prediction.py  (deleted, 100644 → 0)
"""
Link Prediction
==============================================
This tutorial will show how to train a multi-layer GraphSAGE for link
prediction on `CoraGraphDataset <https://data.dgl.ai/dataset/cora_v2.zip>`__.
The dataset contains 2708 nodes and 10556 edges.
By the end of this tutorial, you will be able to
- Train a GNN model for link prediction on a target device with DGL's
neighbor sampling components.
This tutorial assumes that you have read the :doc:`Introduction of Neighbor
Sampling for GNN Training <L0_neighbor_sampling_overview>` and :doc:`Neighbor
Sampling for Node Classification <L1_large_node_classification>`.
"""
######################################################################
# Link Prediction Overview
# ------------------------
#
# Unlike node classification, which predicts labels for nodes based on their
# local neighborhoods, link prediction assesses the likelihood of an edge
# existing between two nodes, necessitating different sampling strategies
# that account for pairs of nodes and their joint neighborhoods.
#
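######################################################################
# As a tiny illustrative sketch (not part of the original tutorial; the sizes
# and tensors below are made up): given node embeddings, the score of a
# candidate edge (u, v) can be computed by combining the two endpoint
# embeddings, e.g. an elementwise product fed to an MLP. The ``predictor``
# defined later in this tutorial follows the same design.
#
import torch
import torch.nn as nn

toy_hidden_size = 8
toy_predictor = nn.Sequential(
    nn.Linear(toy_hidden_size, toy_hidden_size),
    nn.ReLU(),
    nn.Linear(toy_hidden_size, 1),
)
toy_embeddings = torch.randn(10, toy_hidden_size)  # embeddings of 10 nodes
u, v = torch.tensor([0, 3]), torch.tensor([1, 4])  # two candidate edges
toy_scores = toy_predictor(toy_embeddings[u] * toy_embeddings[v]).squeeze(-1)
print(toy_scores)  # one logit per candidate edge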
######################################################################
# Loading Dataset
# ---------------
#
# `cora` is already prepared as ``BuiltinDataset`` in GraphBolt.
#
import os

os.environ["DGLBACKEND"] = "pytorch"
import dgl.graphbolt as gb
import numpy as np
import torch
import tqdm

dataset = gb.BuiltinDataset("cora").load()
device = torch.device("cpu")  # change to 'cuda' for GPU
######################################################################
# The dataset consists of a graph, features, and tasks. You can get the
# training-validation-test sets from the tasks. Seed nodes and corresponding
# labels are already stored in each training-validation-test set. This
# dataset contains 2 tasks, one for node classification and the other for
# link prediction. We will use the link prediction task.
#
graph = dataset.graph
feature = dataset.feature
train_set = dataset.tasks[1].train_set
test_set = dataset.tasks[1].test_set
task_name = dataset.tasks[1].metadata["name"]
print(f"Task: {task_name}.")
######################################################################
# Defining Neighbor Sampler and Data Loader in DGL
# ------------------------------------------------
#
# Different from the :doc:`link prediction tutorial for full
# graph <../blitz/4_link_predict>`, a common practice for training GNNs on
# large graphs is to iterate over the edges in minibatches, since computing
# the probability of all edges is usually infeasible. For each minibatch of
# edges, you compute the output representations of their incident nodes using
# neighbor sampling and a GNN, in a fashion similar to the :doc:`large-scale
# node classification tutorial <L1_large_node_classification>`.
#
# To perform link prediction, you need to specify a negative sampler. DGL
# provides builtin negative samplers such as
# ``dgl.graphbolt.UniformNegativeSampler``. Here this tutorial uniformly
# draws 5 negative examples per positive example.
#
# Except for the negative sampler, the rest of the code is identical to
# the :doc:`node classification tutorial <L1_large_node_classification>`.
#
datapipe = gb.ItemSampler(train_set, batch_size=256, shuffle=True)
datapipe = datapipe.sample_uniform_negative(graph, 5)
datapipe = datapipe.sample_neighbor(graph, [5, 5, 5])
datapipe = datapipe.fetch_feature(feature, node_feature_keys=["feat"])
datapipe = datapipe.to_dgl()
datapipe = datapipe.copy_to(device)
train_dataloader = gb.MultiProcessDataLoader(datapipe, num_workers=0)
######################################################################
# You can peek one minibatch from ``train_dataloader`` and see what it
# will give you.
#
data = next(iter(train_dataloader))
print(f"DGLMiniBatch: {data}")
######################################################################
# Defining Model for Node Representation
# --------------------------------------
#
import dgl.nn as dglnn
import torch.nn as nn
import torch.nn.functional as F


class SAGE(nn.Module):
    def __init__(self, in_size, hidden_size):
        super().__init__()
        self.layers = nn.ModuleList()
        self.layers.append(dglnn.SAGEConv(in_size, hidden_size, "mean"))
        self.layers.append(dglnn.SAGEConv(hidden_size, hidden_size, "mean"))
        self.hidden_size = hidden_size
        self.predictor = nn.Sequential(
            nn.Linear(hidden_size, hidden_size),
            nn.ReLU(),
            nn.Linear(hidden_size, 1),
        )

    def forward(self, blocks, x):
        hidden_x = x
        for layer_idx, (layer, block) in enumerate(zip(self.layers, blocks)):
            hidden_x = layer(block, hidden_x)
            is_last_layer = layer_idx == len(self.layers) - 1
            if not is_last_layer:
                hidden_x = F.relu(hidden_x)
        return hidden_x
######################################################################
# Defining Training Loop
# ----------------------
#
# The following initializes the model and defines the optimizer.
#
in_size = feature.size("node", None, "feat")[0]
model = SAGE(in_size, 128).to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
#####################################################################
# Convert the minibatch to a training pair and a label tensor.
#
def to_binary_link_dgl_computing_pack(data: gb.DGLMiniBatch):
    """Convert the minibatch to a training pair and a label tensor."""
    pos_src, pos_dst = data.positive_node_pairs
    neg_src, neg_dst = data.negative_node_pairs
    node_pairs = (
        torch.cat((pos_src, neg_src), dim=0),
        torch.cat((pos_dst, neg_dst), dim=0),
    )
    pos_label = torch.ones_like(pos_src)
    neg_label = torch.zeros_like(neg_src)
    labels = torch.cat([pos_label, neg_label], dim=0)
    return (node_pairs, labels.float())
######################################################################
# The following is the training loop for link prediction and
# evaluation.
#
for epoch in range(10):
    model.train()
    total_loss = 0
    for step, data in tqdm.tqdm(enumerate(train_dataloader)):
        # Unpack MiniBatch.
        compacted_pairs, labels = to_binary_link_dgl_computing_pack(data)
        node_feature = data.node_features["feat"]
        # Convert sampled subgraphs to DGL blocks.
        blocks = data.blocks

        # Get the embeddings of the input nodes.
        y = model(blocks, node_feature)
        logits = model.predictor(
            y[compacted_pairs[0]] * y[compacted_pairs[1]]
        ).squeeze()

        # Compute loss.
        loss = F.binary_cross_entropy_with_logits(logits, labels)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        total_loss += loss.item()

    print(f"Epoch {epoch:03d} | Loss {total_loss / (step + 1):.3f}")
######################################################################
# Evaluating Performance with Link Prediction
# -------------------------------------------
#
model.eval()

datapipe = gb.ItemSampler(test_set, batch_size=256, shuffle=False)
# Since we need to use all neighborhoods for evaluation, we set the fanout
# to -1.
datapipe = datapipe.sample_neighbor(graph, [-1, -1])
datapipe = datapipe.fetch_feature(feature, node_feature_keys=["feat"])
datapipe = datapipe.to_dgl()
datapipe = datapipe.copy_to(device)
eval_dataloader = gb.MultiProcessDataLoader(datapipe, num_workers=0)

logits = []
labels = []
for step, data in enumerate(eval_dataloader):
    # Unpack MiniBatch.
    compacted_pairs, label = to_binary_link_dgl_computing_pack(data)

    # The features of sampled nodes.
    x = data.node_features["feat"]

    # Forward.
    y = model(data.blocks, x)
    logit = (
        model.predictor(y[compacted_pairs[0]] * y[compacted_pairs[1]])
        .squeeze()
        .detach()
    )

    logits.append(logit)
    labels.append(label)

logits = torch.cat(logits, dim=0)
labels = torch.cat(labels, dim=0)

# Compute the AUROC score.
from sklearn.metrics import roc_auc_score

auc = roc_auc_score(labels, logits)
print("Link Prediction AUC:", auc)
######################################################################
# Conclusion
# ----------
#
# In this tutorial, you have learned how to train a multi-layer GraphSAGE
# for link prediction with neighbor sampling.
#
tutorials/large/README.txt  (deleted, 100644 → 0)
[Deprecated] Stochastic Training of GNNs
========================================