OpenDAS / dgl / Commits / 7e0107c3

Commit 7e0107c3 (unverified), authored Sep 23, 2020 by Mufei Li,
committed by GitHub on Sep 23, 2020

Update (#2225)

Co-authored-by: Ubuntu <ubuntu@ip-172-31-52-53.us-west-2.compute.internal>

parent 7a6d6668

Showing 1 changed file with 59 additions and 54 deletions:
docs/source/guide/training-graph.rst (+59, -54)
...
@@ -3,10 +3,10 @@
5.4 Graph Classification
----------------------------------

Instead of a big single graph, sometimes one might have the data in the
form of multiple graphs, for example a list of different types of
communities of people. By characterizing the friendship among people in
the same community by a graph, one can get a list of graphs to classify.
In this scenario, a graph classification model could help identify the
type of the community, i.e. to classify each graph based on the structure
and overall information.
...
@@ -16,11 +16,11 @@ Overview
The major difference between graph classification and node
classification or link prediction is that the prediction result
characterizes the property of the entire input graph. One can perform
the message passing over nodes/edges just like the previous tasks, but
also needs to retrieve a graph-level representation.

The graph classification pipeline proceeds as follows:

.. figure:: https://data.dgl.ai/tutorial/batch/graph_classifier.png
   :alt: Graph Classification Process
...
@@ -29,23 +29,23 @@ The graph classification proceeds as follows:
From left to right, the common practice is:

- Prepare a batch of graphs
- Perform message passing on the batched graphs to update node/edge features
- Aggregate node/edge features into graph-level representations
- Classify graphs based on graph-level representations

Batch of Graphs
^^^^^^^^^^^^^^^

Usually a graph classification task trains on a lot of graphs, and it
will be very inefficient to use only one graph at a time when training
the model. Borrowing the idea of mini-batch training from common deep
learning practice, one can build a batch of multiple graphs and send
them together for one training iteration.

In DGL, one can build a single batched graph from a list of graphs. This
batched graph can be simply used as a single large graph, with connected
components corresponding to the original small graphs.
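
For instance, a minimal sketch of batching with the built-in
:func:`dgl.batch` (the toy graphs below and the DGL 0.5 API are assumed):

.. code:: python

    import dgl

    g1 = dgl.graph(([0, 1], [1, 0]))  # 2 nodes, 2 edges
    g2 = dgl.graph(([0, 1], [1, 2]))  # 3 nodes, 2 edges
    bg = dgl.batch([g1, g2])          # one graph with 5 nodes, 4 edges
    print(bg.batch_size)              # 2
    print(bg.batch_num_nodes())       # tensor([2, 3])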

.. figure:: https://data.dgl.ai/tutorial/batch/batch.png
   :alt: Batched Graph
...
@@ -56,45 +56,46 @@ Graph Readout
^^^^^^^^^^^^^
Every graph in the data may have its unique structure, as well as its
node and edge features. In order to make a single prediction, one
usually aggregates and summarizes over the possibly abundant
information. This type of operation is named *readout*. Common readout
operations include summation, average, maximum or minimum over all node
or edge features.

Given a graph :math:`g`, one can define the average node feature readout
as

.. math:: h_g = \frac{1}{|\mathcal{V}|}\sum_{v\in \mathcal{V}} h_v

where :math:`h_g` is the representation of :math:`g`, :math:`\mathcal{V}`
is the set of nodes in :math:`g`, :math:`h_v` is the feature of node
:math:`v`.

DGL provides built-in support for common readout operations. For example,
:func:`dgl.readout_nodes` implements the above readout operation.
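
As a quick illustration (a sketch; the ``op`` keyword of
:func:`dgl.readout_nodes` selecting the aggregation is assumed from the
DGL 0.5 API):

.. code:: python

    import dgl
    import torch

    g = dgl.graph(([0, 1], [1, 0]))
    g.ndata['h'] = torch.tensor([[1.], [3.]])
    # Average node feature: (1 + 3) / 2 = 2
    dgl.readout_nodes(g, 'h', op='mean')
    # tensor([[2.]])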

Once :math:`h_g` is available, one can pass it through an MLP layer for
classification output.

Writing Neural Network Model
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The input to the model is the batched graph with node and edge features.

One thing to note is that the node and edge features in the batched graph
have no batch dimension. A little special care should be put in the model:

Computation on a Batched Graph
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This section discusses the computational properties of a batched graph.

First, different graphs in a batch are entirely separated, i.e. no
edge connecting two graphs. With this nice property, all message passing
functions still have the same results.

Second, the readout function on a batched graph will be conducted over
each graph separately. Assuming the batch size is :math:`B` and the
feature to be aggregated has dimension :math:`D`, the shape of the
readout result will be :math:`(B, D)`.

.. code:: python

    import dgl
    import torch

    g1 = dgl.graph(([0, 1], [1, 0]))
    g1.ndata['h'] = torch.tensor([1., 2.])
    g2 = dgl.graph(([0, 1], [1, 2]))
...
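
The elided lines presumably assign ``g2``'s node features and batch the
two graphs; a completion consistent with the readout result shown in the
next hunk:

.. code:: python

    g2.ndata['h'] = torch.tensor([1., 2., 3.])
    bg = dgl.batch([g1, g2])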
@@ -107,23 +108,24 @@ readout result will be :math:`(B, D)`.

    dgl.readout_nodes(bg, 'h')
    # tensor([3., 6.])  # [1 + 2, 1 + 2 + 3]

Finally, each node/edge feature in a batched graph is obtained by
concatenating the corresponding features from all graphs.

.. code:: python

    bg.ndata['h']
    # tensor([1., 2., 1., 2., 3.])
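
Conversely, the original graphs, with their corresponding feature slices,
can be recovered from a batched graph; a brief sketch using the built-in
:func:`dgl.unbatch`:

.. code:: python

    g1_back, g2_back = dgl.unbatch(bg)
    g1_back.ndata['h']
    # tensor([1., 2.])
    g2_back.ndata['h']
    # tensor([1., 2., 3.])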

Model Definition
^^^^^^^^^^^^^^^^

Being aware of the above computation rules, one can define a model as
follows.

.. code:: python

    import dgl.nn.pytorch as dglnn
    import torch.nn as nn
    import torch.nn.functional as F  # needed for F.relu below

    class Classifier(nn.Module):
        def __init__(self, in_dim, hidden_dim, n_classes):
            super(Classifier, self).__init__()
...
@@ -131,7 +133,7 @@ model.
            self.conv2 = dglnn.GraphConv(hidden_dim, hidden_dim)
            self.classify = nn.Linear(hidden_dim, n_classes)

        def forward(self, g, h):
            # Apply graph convolution and activation.
            h = F.relu(self.conv1(g, h))
            h = F.relu(self.conv2(g, h))
...
@@ -141,19 +143,19 @@ model.
            hg = dgl.mean_nodes(g, 'h')
            return self.classify(hg)

Training Loop
~~~~~~~~~~~~~

Data Loading
^^^^^^^^^^^^

Once the model is defined, one can start training. Since graph
classification deals with lots of relatively small graphs instead of a
big single one, one can train efficiently on stochastic mini-batches of
graphs, without the need to design sophisticated graph sampling
algorithms.

Assuming that one has a graph classification dataset as introduced in
:ref:`guide-data-pipeline`.

.. code:: python
...
@@ -162,7 +164,7 @@ Assuming that we have a graph classification dataset as introduced in

    dataset = dgl.data.GINDataset('MUTAG', False)

Each item in the graph classification dataset is a pair of a graph and
its label. One can speed up the data loading process by taking advantage
of the DataLoader, by customizing the collate function to batch the
graphs:

...
@@ -175,7 +177,7 @@ graphs:
        return batched_graph, batched_labels
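
The rest of the collate function is elided in this hunk; a minimal sketch
of such a function, assuming each dataset item is a ``(graph, label)``
pair as described above:

.. code:: python

    import dgl
    import torch

    def collate(samples):
        # samples is a list of (graph, label) pairs
        graphs, labels = map(list, zip(*samples))
        batched_graph = dgl.batch(graphs)
        batched_labels = torch.tensor(labels)
        return batched_graph, batched_labels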

Then one can create a DataLoader that iterates over the dataset of
graphs in mini-batches.

.. code:: python
...
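
The construction itself is elided above; a sketch assuming PyTorch's
``torch.utils.data.DataLoader`` with the ``collate`` function sketched
earlier (the ``batch_size`` value is illustrative):

.. code:: python

    from torch.utils.data import DataLoader

    dataloader = DataLoader(
        dataset,
        batch_size=32,      # illustrative value
        collate_fn=collate,
        shuffle=True)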
@@ -195,20 +197,23 @@ updating the model.

.. code:: python

    import torch.nn.functional as F

    # Only an example, 7 is the input feature size
    model = Classifier(7, 20, 5)
    opt = torch.optim.Adam(model.parameters())
    for epoch in range(20):
        for batched_graph, labels in dataloader:
            feats = batched_graph.ndata['attr'].float()
            logits = model(batched_graph, feats)
            loss = F.cross_entropy(logits, labels)
            opt.zero_grad()
            loss.backward()
            opt.step()

For an end-to-end example of graph classification, see
`DGL's GIN example <https://github.com/dmlc/dgl/tree/master/examples/pytorch/gin>`__.
The training loop is inside the function ``train`` in
`main.py <https://github.com/dmlc/dgl/blob/master/examples/pytorch/gin/main.py>`__.
The model implementation is inside
...
@@ -221,8 +226,8 @@ Heterogeneous graph
~~~~~~~~~~~~~~~~~~~
Graph classification with heterogeneous graphs is a little different
from that with homogeneous graphs. In addition to graph convolution
modules compatible with heterogeneous graphs, one also needs to
aggregate over the nodes of different types in the readout function.

The following shows an example of summing up the average of node
...
@@ -242,7 +247,7 @@ representations for each node type.
            for rel in rel_names}, aggregate='sum')

    def forward(self, graph, inputs):
        # inputs is a dict mapping each node type to its node features
        h = self.conv1(graph, inputs)
        h = {k: F.relu(v) for k, v in h.items()}
        h = self.conv2(graph, h)
...
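
The readout part of the heterogeneous model is elided above; a hedged
sketch of the tail of ``forward``, summing the per-node-type average
readouts via the ``ntype`` argument of :func:`dgl.mean_nodes`
(``self.classify`` is assumed to be the linear classification head, as
in the homogeneous model):

.. code:: python

        with graph.local_scope():
            graph.ndata['h'] = h   # h maps each node type to its features
            # Sum the average node representation over all node types.
            hg = 0
            for ntype in graph.ntypes:
                hg = hg + dgl.mean_nodes(graph, 'h', ntype=ntype)
            return self.classify(hg)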