OpenDAS / dgl · Commit 1d2a1cdc (Unverified)

Authored Nov 30, 2023 by Rhett Ying; committed by GitHub, Nov 30, 2023.

[doc] update edge classification chapter (#6642)

Parent: e3752754

Showing 1 changed file with 124 additions and 112 deletions:
docs/source/guide/minibatch-edge.rst (+124, -112)
@@ -16,36 +16,45 @@ You can use the

.. code:: python

-    sampler = dgl.dataloading.MultiLayerFullNeighborSampler(2)
+    datapipe = datapipe.sample_neighbor(g, [10, 10])
+    # Or equivalently
+    datapipe = dgl.graphbolt.NeighborSampler(datapipe, g, [10, 10])
-To use the neighborhood sampler provided by DGL for edge classification,
-one needs to instead combine it with
-:func:`~dgl.dataloading.as_edge_prediction_sampler`, which iterates
-over a set of edges in minibatches, yielding the subgraph induced by the
-edge minibatch and *message flow graphs* (MFGs) to be consumed by the
-module below.
-
-For example, the following code creates a PyTorch DataLoader that
-iterates over the training edge ID array ``train_eids`` in batches,
-putting the list of generated MFGs onto GPU.
+The code for defining a data loader is also the same as that of node
+classification. The only difference is that it iterates over the
+edges (namely, node pairs) in the training set instead of the nodes.
.. code:: python

-    sampler = dgl.dataloading.as_edge_prediction_sampler(sampler)
-    dataloader = dgl.dataloading.DataLoader(
-        g, train_eid_dict, sampler,
-        batch_size=1024,
-        shuffle=True,
-        drop_last=False,
-        num_workers=4)
+    import dgl.graphbolt as gb
+    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
+    g = gb.SamplingGraph()
+    node_pairs = torch.arange(0, 1000).reshape(-1, 2)
+    labels = torch.randint(0, 2, (500,))
+    train_set = gb.ItemSet((node_pairs, labels), names=("node_pairs", "labels"))
+    datapipe = gb.ItemSampler(train_set, batch_size=128, shuffle=True)
+    datapipe = datapipe.sample_neighbor(g, [10, 10])  # 2 layers.
+    # Or equivalently:
+    # datapipe = gb.NeighborSampler(datapipe, g, [10, 10])
+    datapipe = datapipe.fetch_feature(feature, node_feature_keys=["feat"])
+    datapipe = datapipe.to_dgl()
+    datapipe = datapipe.copy_to(device)
+    dataloader = gb.MultiProcessDataLoader(datapipe, num_workers=0)
+Iterating over the DataLoader will yield :class:`~dgl.graphbolt.DGLMiniBatch`
+which contains a list of specially created graphs representing the computation
+dependencies on each layer. They are called *message flow graphs* (MFGs) in DGL.
+
+.. code:: python
+
+    mini_batch = next(iter(dataloader))
+    print(mini_batch.blocks)
.. note::

   See the :doc:`Stochastic Training Tutorial
-   <tutorials/large/L0_neighbor_sampling_overview>`
-   for the concept of
-   message flow graph.
+   <../notebooks/stochastic_training/neighbor_sampling_overview.nblink>`__
+   for the concept of message flow graph.
-For a complete list of supported builtin samplers, please refer to the
-:ref:`neighborhood sampler API reference <api-dataloading-neighbor-sampling>`.
-
If you wish to develop your own neighborhood sampler or you want a more
detailed explanation of the concept of MFGs, please refer to
@@ -63,26 +72,29 @@ an edge exists between the two nodes, and potentially use it for

advantage.
Therefore in edge classification you sometimes would like to exclude the
-edges sampled in the minibatch from the original graph for neighborhood
-sampling, as well as the reverse edges of the sampled edges on an
-undirected graph. You can specify ``exclude='reverse_id'`` in calling
-:func:`~dgl.dataloading.as_edge_prediction_sampler`, with the mapping of the
-edge IDs to their reverse edge IDs. Usually doing so will lead to a much
-slower sampling process due to locating the reverse edges involved in the
-minibatch and removing them.
+seed edges as well as their reverse edges from the sampled minibatch.
+You can use :func:`~dgl.graphbolt.exclude_seed_edges` alongside
+:class:`~dgl.graphbolt.MiniBatchTransformer` to achieve this.
.. code:: python

-    n_edges = g.num_edges()
-    sampler = dgl.dataloading.as_edge_prediction_sampler(
-        sampler, exclude='reverse_id',
-        reverse_eids=torch.cat([
-            torch.arange(n_edges // 2, n_edges), torch.arange(0, n_edges // 2)]))
-    dataloader = dgl.dataloading.DataLoader(
-        g, train_eid_dict, sampler,
-        batch_size=1024,
-        shuffle=True,
-        drop_last=False,
-        num_workers=4)
+    import dgl.graphbolt as gb
+    from functools import partial
+    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
+    g = gb.SamplingGraph()
+    node_pairs = torch.arange(0, 1000).reshape(-1, 2)
+    labels = torch.randint(0, 2, (500,))
+    train_set = gb.ItemSet((node_pairs, labels), names=("node_pairs", "labels"))
+    datapipe = gb.ItemSampler(train_set, batch_size=128, shuffle=True)
+    datapipe = datapipe.sample_neighbor(g, [10, 10])  # 2 layers.
+    exclude_seed_edges = partial(gb.exclude_seed_edges, include_reverse_edges=True)
+    datapipe = datapipe.transform(exclude_seed_edges)
+    datapipe = datapipe.fetch_feature(feature, node_feature_keys=["feat"])
+    datapipe = datapipe.to_dgl()
+    datapipe = datapipe.copy_to(device)
+    dataloader = gb.MultiProcessDataLoader(datapipe, num_workers=0)
Adapt your model for minibatch training
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

@@ -113,14 +125,12 @@ input features.

            return x
The input to the latter part is usually the output from the
-former part, as well as the subgraph of the original graph induced by the
-edges in the minibatch. The subgraph is yielded from the same data
-loader. One can call :meth:`dgl.DGLGraph.apply_edges` to compute the
-scores on the edges with the edge subgraph.
+former part, as well as the subgraph (node pairs) of the original graph induced
+by the edges in the minibatch. The subgraph is yielded from the same data
+loader.

The following code shows an example of predicting scores on the edges by
concatenating the incident node features and projecting it with a dense
layer.

.. code:: python
@@ -129,19 +139,15 @@ layer.

            super().__init__()
            self.W = nn.Linear(2 * in_features, num_classes)

-        def apply_edges(self, edges):
-            data = torch.cat([edges.src['x'], edges.dst['x']], 1)
-            return {'score': self.W(data)}
-
-        def forward(self, edge_subgraph, x):
-            with edge_subgraph.local_scope():
-                edge_subgraph.ndata['x'] = x
-                edge_subgraph.apply_edges(self.apply_edges)
-                return edge_subgraph.edata['score']
+        def forward(self, node_pairs, x):
+            src_x = x[node_pairs[0]]
+            dst_x = x[node_pairs[1]]
+            data = torch.cat([src_x, dst_x], 1)
+            return self.W(data)
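As a side note (not part of the diff), the updated ``forward`` above takes index
vectors into the node representations rather than an edge subgraph. A minimal,
self-contained sketch of the new interface, assuming ``node_pairs`` is a
``(2, E)`` index tensor and with all sizes made up for illustration:

.. code:: python

    # Editor's sketch: mirrors the updated ScorePredictor and checks the shapes.
    import torch
    import torch.nn as nn

    class ScorePredictor(nn.Module):
        def __init__(self, num_classes, in_features):
            super().__init__()
            self.W = nn.Linear(2 * in_features, num_classes)

        def forward(self, node_pairs, x):
            src_x = x[node_pairs[0]]             # features of source nodes
            dst_x = x[node_pairs[1]]             # features of destination nodes
            data = torch.cat([src_x, dst_x], 1)  # concatenate incident node features
            return self.W(data)                  # one score vector per node pair

    x = torch.randn(100, 8)                      # GNN outputs for 100 seed nodes
    node_pairs = torch.randint(0, 100, (2, 32))  # 32 node pairs as indices into x
    scores = ScorePredictor(num_classes=3, in_features=8)(node_pairs, x)
    print(scores.shape)  # torch.Size([32, 3])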
-The entire model will take the list of MFGs and the edge subgraph
+The entire model will take the list of MFGs and the edges
generated by the data loader, as well as the input node features as
follows:

.. code:: python

@@ -151,10 +157,10 @@ follows:
            self.gcn = StochasticTwoLayerGCN(
                in_features, hidden_features, out_features)
            self.predictor = ScorePredictor(num_classes, out_features)

-        def forward(self, edge_subgraph, blocks, x):
+        def forward(self, blocks, x, node_pairs):
            x = self.gcn(blocks, x)
-            return self.predictor(edge_subgraph, x)
+            return self.predictor(node_pairs, x)
DGL ensures that the nodes in the edge subgraph are the same as the
output nodes of the last MFG in the generated list of MFGs.
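This alignment is what makes it safe to feed the GNN output straight into the
score predictor. A hypothetical check of that property, written against the
pre-graphbolt ``dgl.dataloading``-style loader shown in the removed lines above
(the loader and variable names are assumptions):

.. code:: python

    # Editor's sketch: the pair graph's original node IDs should coincide with
    # the output (destination) nodes of the last MFG, in the same order.
    import dgl
    import torch

    input_nodes, edge_subgraph, blocks = next(iter(dataloader))
    assert torch.equal(edge_subgraph.ndata[dgl.NID], blocks[-1].dstdata[dgl.NID])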
@@ -169,21 +175,21 @@ their incident node representations.

.. code:: python
+    import torch.nn.functional as F
    model = Model(in_features, hidden_features, out_features, num_classes)
-    model = model.cuda()
+    model = model.to(device)
    opt = torch.optim.Adam(model.parameters())

-    for input_nodes, edge_subgraph, blocks in dataloader:
-        blocks = [b.to(torch.device('cuda')) for b in blocks]
-        edge_subgraph = edge_subgraph.to(torch.device('cuda'))
-        input_features = blocks[0].srcdata['features']
-        edge_labels = edge_subgraph.edata['labels']
-        edge_predictions = model(edge_subgraph, blocks, input_features)
-        loss = compute_loss(edge_labels, edge_predictions)
+    for data in dataloader:
+        blocks = data.blocks
+        x = data.edge_features("feat")
+        y_hat = model(data.blocks, x, data.positive_node_pairs)
+        loss = F.cross_entropy(y_hat, data.labels)
        opt.zero_grad()
        loss.backward()
        opt.step()
For heterogeneous graphs
~~~~~~~~~~~~~~~~~~~~~~~~

@@ -212,7 +218,7 @@ classification/regression.
For score prediction, the only implementation difference between the
homogeneous graph and the heterogeneous graph is that we are looping
-over the edge types for :meth:`~dgl.DGLGraph.apply_edges`.
+over the edge types.

.. code:: python

@@ -221,16 +227,13 @@ over the edge types for :meth:`~dgl.DGLGraph.apply_edges`.
            super().__init__()
            self.W = nn.Linear(2 * in_features, num_classes)

-        def apply_edges(self, edges):
-            data = torch.cat([edges.src['x'], edges.dst['x']], 1)
-            return {'score': self.W(data)}
-
-        def forward(self, edge_subgraph, x):
-            with edge_subgraph.local_scope():
-                edge_subgraph.ndata['x'] = x
-                for etype in edge_subgraph.canonical_etypes:
-                    edge_subgraph.apply_edges(self.apply_edges, etype=etype)
-                return edge_subgraph.edata['score']
+        def forward(self, node_pairs, x):
+            scores = {}
+            for etype in node_pairs.keys():
+                src, dst = node_pairs[etype]
+                data = torch.cat([x[etype][src], x[etype][dst]], 1)
+                scores[etype] = self.W(data)
+            return scores

    class Model(nn.Module):
        def __init__(self, in_features, hidden_features, out_features, num_classes,

@@ -240,34 +243,46 @@ over the edge types for :meth:`~dgl.DGLGraph.apply_edges`.
                in_features, hidden_features, out_features, etypes)
            self.pred = ScorePredictor(num_classes, out_features)

-        def forward(self, edge_subgraph, blocks, x):
+        def forward(self, node_pairs, blocks, x):
            x = self.rgcn(blocks, x)
-            return self.pred(edge_subgraph, x)
+            return self.pred(node_pairs, x)

-Data loader definition is also very similar to that of node
-classification. The only difference is that you need
-:func:`~dgl.dataloading.as_edge_prediction_sampler`,
-and you will be supplying a dictionary of edge types and edge ID tensors
-instead of a dictionary of node types and node ID tensors.
+Data loader definition is almost identical to that of homogeneous graph. The
+only difference is that ``train_set`` is now an instance of
+:class:`~dgl.graphbolt.ItemSetDict` instead of :class:`~dgl.graphbolt.ItemSet`.

.. code:: python

-    sampler = dgl.dataloading.MultiLayerFullNeighborSampler(2)
-    sampler = dgl.dataloading.as_edge_prediction_sampler(sampler)
-    dataloader = dgl.dataloading.DataLoader(
-        g, train_eid_dict, sampler,
-        batch_size=1024,
-        shuffle=True,
-        drop_last=False,
-        num_workers=4)
+    import dgl.graphbolt as gb
+    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
+    g = gb.SamplingGraph()
+    node_pairs = torch.arange(0, 1000).reshape(-1, 2)
+    labels = torch.randint(0, 3, (500,))
+    node_pairs_labels = {
+        "user:like:item": gb.ItemSet(
+            (node_pairs, labels), names=("node_pairs", "labels")),
+        "user:follow:user": gb.ItemSet(
+            (node_pairs, labels), names=("node_pairs", "labels")),
+    }
+    train_set = gb.ItemSetDict(node_pairs_labels)
+    datapipe = gb.ItemSampler(train_set, batch_size=128, shuffle=True)
+    datapipe = datapipe.sample_neighbor(g, [10, 10])  # 2 layers.
+    datapipe = datapipe.fetch_feature(
+        feature, node_feature_keys={"item": ["feat"], "user": ["feat"]})
+    datapipe = datapipe.to_dgl()
+    datapipe = datapipe.copy_to(device)
+    dataloader = gb.MultiProcessDataLoader(datapipe, num_workers=0)
Things become a little different if you wish to exclude the reverse
edges on heterogeneous graphs. On heterogeneous graphs, reverse edges
usually have a different edge type from the edges themselves, in order
to differentiate the “forward” and “backward” relationships (e.g.
-``follow`` and ``followed by`` are reverse relations of each other,
-``purchase`` and ``purchased by`` are reverse relations of each other,
+``follow`` and ``followed_by`` are reverse relations of each other,
+``like`` and ``liked_by`` are reverse relations of each other,
etc.).

If each edge in a type has a reverse edge with the same ID in another

@@ -277,16 +292,17 @@ reverse edges then goes as follows.
.. code:: python

-    sampler = dgl.dataloading.as_edge_prediction_sampler(
-        sampler, exclude='reverse_types',
-        reverse_etypes={'follow': 'followed by', 'followed by': 'follow',
-                        'purchase': 'purchased by', 'purchased by': 'purchase'})
-    dataloader = dgl.dataloading.DataLoader(
-        g, train_eid_dict, sampler,
-        batch_size=1024,
-        shuffle=True,
-        drop_last=False,
-        num_workers=4)
+    exclude_seed_edges = partial(
+        gb.exclude_seed_edges,
+        include_reverse_edges=True,
+        reverse_etypes_mapping={
+            "user:like:item": "item:liked_by:user",
+            "user:follow:user": "user:followed_by:user",
+        },
+    )
+    datapipe = datapipe.transform(exclude_seed_edges)
The training loop is again almost the same as that on homogeneous graph,
except for the implementation of ``compute_loss`` that will take in two

@@ -309,7 +325,3 @@ dictionaries of node types and predictions here.
        loss.backward()
        opt.step()

-`GCMC <https://github.com/dmlc/dgl/tree/master/examples/pytorch/gcmc>`__ is an
-example of edge classification on a bipartite graph.
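The ``compute_loss`` mentioned above is left to the user in the guide. A
possible sketch for the heterogeneous case, assuming both arguments are
dictionaries keyed by edge type as produced by the ``ScorePredictor`` in this
chapter (the function body and dictionary layout are assumptions, not part of
the original file):

.. code:: python

    # Editor's sketch: hypothetical per-edge-type classification loss.
    # `labels` and `predictions` are assumed to be dicts keyed by edge type.
    import torch.nn.functional as F

    def compute_loss(labels, predictions):
        loss = 0.0
        for etype, pred in predictions.items():
            loss = loss + F.cross_entropy(pred, labels[etype])
        return loss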