Unverified Commit 99751d49 authored by Quan (Andy) Gan, committed by GitHub

[Doc] Rename block to message flow graph (#2702)

* rename block to mfg

* revert

* rename
parent 491d908b
......@@ -39,6 +39,19 @@ the ``sample_blocks`` methods.
.. autoclass:: MultiLayerFullNeighborSampler
:show-inheritance:
.. _api-dataloading-collators:
Collators
---------
.. currentmodule:: dgl.dataloading
Collators are platform-agnostic classes that generate the mini-batches
given the graphs and indices to sample from.
.. autoclass:: NodeCollator
.. autoclass:: EdgeCollator
.. autoclass:: GraphCollator
.. _api-dataloading-negative-sampling:
Negative Samplers for Link Prediction
......
......@@ -148,40 +148,47 @@ Since the number of nodes
for input and output is different, we need to perform message passing on
a small, bipartite-structured graph instead. We call such a
bipartite-structured graph that only contains the necessary input nodes
and output nodes a *block*. The following figure shows the block of the
second GNN layer for node 8.
(referred to as *source* nodes) and output nodes (referred to as *destination* nodes)
a *message flow graph* (MFG).
The following figure shows the MFG of the second GNN layer for node 8.
.. figure:: https://data.dgl.ai/asset/image/guide_6_4_4.png
:alt: Imgur
.. note::
See the :doc:`Stochastic Training Tutorial
<tutorials/large/L0_neighbor_sampling_overview>` for the concept of
message flow graph.
Note that the output nodes also appear in the input nodes. The reason is
that representations of output nodes from the previous layer are needed
Note that the destination nodes also appear in the source nodes. The reason is
that representations of destination nodes from the previous layer are needed
for feature combination after message passing (i.e. :math:`\phi^{(2)}`).
DGL provides :func:`dgl.to_block` to convert any frontier
to a block where the first argument specifies the frontier and the
second argument specifies the output nodes. For instance, the frontier
above can be converted to a block with output node 8 with the code as
to an MFG where the first argument specifies the frontier and the
second argument specifies the destination nodes. For instance, the frontier
above can be converted to an MFG with destination node 8 with the code as
follows.
.. code:: python
output_nodes = torch.LongTensor([8])
block = dgl.to_block(frontier, output_nodes)
dst_nodes = torch.LongTensor([8])
block = dgl.to_block(frontier, dst_nodes)
To find the number of input nodes and output nodes of a given node type,
To find the number of source nodes and destination nodes of a given node type,
one can use :meth:`dgl.DGLHeteroGraph.number_of_src_nodes` and
:meth:`dgl.DGLHeteroGraph.number_of_dst_nodes` methods.
.. code:: python
num_input_nodes, num_output_nodes = block.number_of_src_nodes(), block.number_of_dst_nodes()
print(num_input_nodes, num_output_nodes)
num_src_nodes, num_dst_nodes = block.number_of_src_nodes(), block.number_of_dst_nodes()
print(num_src_nodes, num_dst_nodes)
The block’s input node features can be accessed via member
The MFG’s source node features can be accessed via member
:attr:`dgl.DGLHeteroGraph.srcdata` and :attr:`dgl.DGLHeteroGraph.srcnodes`, and
its output node features can be accessed via member
its destination node features can be accessed via member
:attr:`dgl.DGLHeteroGraph.dstdata` and :attr:`dgl.DGLHeteroGraph.dstnodes`. The
syntax of ``srcdata``/``dstdata`` and ``srcnodes``/``dstnodes`` is
identical to :attr:`dgl.DGLHeteroGraph.ndata` and
......@@ -189,46 +196,36 @@ identical to :attr:`dgl.DGLHeteroGraph.ndata` and
.. code:: python
block.srcdata['h'] = torch.randn(num_input_nodes, 5)
block.dstdata['h'] = torch.randn(num_output_nodes, 5)
block.srcdata['h'] = torch.randn(num_src_nodes, 5)
block.dstdata['h'] = torch.randn(num_dst_nodes, 5)
If a block is converted from a frontier, which is in turn converted from
a graph, one can directly read the feature of the block’s input and
output nodes via
If an MFG is converted from a frontier, which is in turn converted from
a graph, one can directly read the features of the MFG’s source and
destination nodes via
.. code:: python
print(block.srcdata['x'])
print(block.dstdata['y'])
.. raw:: html
<div class="alert alert-info">
::
The original node IDs of the input nodes and output nodes in the block
can be found as the feature ``dgl.NID``, and the mapping from the
block’s edge IDs to the input frontier’s edge IDs can be found as the
feature ``dgl.EID``.
.. raw:: html
</div>
.. note::
**Output Nodes**
The original node IDs of the source nodes and destination nodes in the MFG
can be found as the feature ``dgl.NID``, and the mapping from the
MFG’s edge IDs to the input frontier’s edge IDs can be found as the
feature ``dgl.EID``.
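For instance, reusing the ``block`` created above, the frontier edge ID of every
edge in the MFG can be read back as follows (a small sketch; ``frontier`` is the
graph that was passed to :func:`dgl.to_block`):
.. code:: python
   # Maps each edge in the MFG to its edge ID in the frontier.
   frontier_eids = block.edata[dgl.EID]
   print(frontier_eids)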
DGL ensures that the output nodes of a block will always appear in the
input nodes. The output nodes will always index firstly in the input
DGL ensures that the destination nodes of an MFG will always appear in the
source nodes. The destination nodes are always indexed first among the source
nodes.
.. code:: python
input_nodes = block.srcdata[dgl.NID]
output_nodes = block.dstdata[dgl.NID]
assert torch.equal(input_nodes[:len(output_nodes)], output_nodes)
src_nodes = block.srcdata[dgl.NID]
dst_nodes = block.dstdata[dgl.NID]
assert torch.equal(src_nodes[:len(dst_nodes)], dst_nodes)
As a result, the output nodes must cover all nodes that are the
As a result, the destination nodes must cover all nodes that are the
destination of an edge in the frontier.
For example, consider the following frontier
......@@ -240,15 +237,15 @@ For example, consider the following frontier
where the red and green nodes (i.e. nodes 4, 5, 7, 8, and 11) are all
nodes that are destinations of an edge. Then the following code will
raise an error because the output nodes did not cover all those nodes.
raise an error because the destination nodes did not cover all those nodes.
.. code:: python
dgl.to_block(frontier2, torch.LongTensor([4, 5])) # ERROR
However, the output nodes can have more nodes than above. In this case,
However, the destination nodes can include more nodes than the above. In this case,
there will be isolated nodes that do not have any edges connecting to them.
The isolated nodes will be included in both input nodes and output
The isolated nodes will be included in both source nodes and destination
nodes.
.. code:: python
......@@ -261,7 +258,7 @@ nodes.
Heterogeneous Graphs
^^^^^^^^^^^^^^^^^^^^
Blocks also work on heterogeneous graphs. Let’s say that we have the
MFGs also work on heterogeneous graphs. Let’s say that we have the
following frontier:
.. code:: python
......@@ -272,20 +269,20 @@ following frontier:
('game', 'played-by', 'user'): ([2], [6])
}, num_nodes_dict={'user': 10, 'game': 10})
One can also create a block with output nodes User #3, #6, and #8, as
One can also create an MFG with destination nodes User #3, #6, and #8, as
well as Game #2 and #6.
.. code:: python
hetero_block = dgl.to_block(hetero_frontier, {'user': [3, 6, 8], 'block': [2, 6]})
hetero_block = dgl.to_block(hetero_frontier, {'user': [3, 6, 8], 'game': [2, 6]})
One can also get the input nodes and output nodes by type:
One can also get the source nodes and destination nodes by type:
.. code:: python
# input users and games
# source users and games
print(hetero_block.srcnodes['user'].data[dgl.NID], hetero_block.srcnodes['game'].data[dgl.NID])
# output users and games
# destination users and games
print(hetero_block.dstnodes['user'].data[dgl.NID], hetero_block.dstnodes['game'].data[dgl.NID])
......@@ -307,10 +304,10 @@ see what :class:`~dgl.dataloading.dataloader.BlockSampler`, the parent class of
:class:`~dgl.dataloading.neighbor.MultiLayerFullNeighborSampler`, is.
:class:`~dgl.dataloading.dataloader.BlockSampler` is responsible for
generating the list of blocks starting from the last layer, with method
generating the list of MFGs starting from the last layer, with method
:meth:`~dgl.dataloading.dataloader.BlockSampler.sample_blocks`. The default implementation of
``sample_blocks`` is to iterate backwards, generating the frontiers and
converting them to blocks.
converting them to MFGs.
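Conceptually, this default procedure resembles the following sketch (a simplified
illustration rather than DGL's actual implementation; it assumes a homogeneous
graph and that ``sampler``, ``g``, and ``seed_nodes`` are already given):
.. code:: python
   blocks = []
   for layer_id in reversed(range(sampler.num_layers)):
       # Sample a frontier for this layer from the current seed nodes.
       frontier = sampler.sample_frontier(layer_id, g, seed_nodes)
       # Convert the frontier into an MFG with the seed nodes as destinations.
       block = dgl.to_block(frontier, seed_nodes)
       # The MFG's source nodes become the seed nodes of the previous layer.
       seed_nodes = block.srcdata[dgl.NID]
       blocks.insert(0, block)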
Therefore, for neighborhood sampling, **you only need to implement
the**\ :meth:`~dgl.dataloading.dataloader.BlockSampler.sample_frontier`\ **method**. Given which
......@@ -386,7 +383,7 @@ nodes with a probability, one can simply define the sampler as follows:
return self.n_layers
After implementing your sampler, you can create a data loader that takes
in your sampler and it will keep generating lists of blocks while
in your sampler and it will keep generating lists of MFGs while
iterating over the seed nodes as usual.
.. code:: python
......
......@@ -22,11 +22,11 @@ To use the neighborhood sampler provided by DGL for edge classification,
one needs to instead combine it with
:class:`~dgl.dataloading.pytorch.EdgeDataLoader`, which iterates
over a set of edges in minibatches, yielding the subgraph induced by the
edge minibatch and ``blocks`` to be consumed by the module below.
edge minibatch and *message flow graphs* (MFGs) to be consumed by the module below.
For example, the following code creates a PyTorch DataLoader that
iterates over the training edge ID array ``train_eids`` in batches,
putting the list of generated blocks onto GPU.
putting the list of generated MFGs onto GPU.
.. code:: python
......@@ -37,12 +37,18 @@ putting the list of generated blocks onto GPU.
drop_last=False,
num_workers=4)
For a complete list of supported builtin samplers, please refer to the
:ref:`neighborhood sampler API reference <api-dataloading-neighbor-sampling>`.
.. note::
If you wish to develop your own neighborhood sampler or you want a more
detailed explanation of the concept of blocks, please refer to
:ref:`guide-minibatch-customizing-neighborhood-sampler`.
See the :doc:`Stochastic Training Tutorial
<tutorials/large/L0_neighbor_sampling_overview>` for the concept of
message flow graph.
For a complete list of supported builtin samplers, please refer to the
:ref:`neighborhood sampler API reference <api-dataloading-neighbor-sampling>`.
If you wish to develop your own neighborhood sampler or you want a more
detailed explanation of the concept of MFGs, please refer to
:ref:`guide-minibatch-customizing-neighborhood-sampler`.
Removing edges in the minibatch from the original graph for neighbor sampling
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
......@@ -92,7 +98,7 @@ The edge classification model usually consists of two parts:
The former part is exactly the same as
:ref:`that from node classification <guide-minibatch-node-classification-model>`
and we can simply reuse it. The input is still the list of
blocks generated from a data loader provided by DGL, as well as the
MFGs generated from a data loader provided by DGL, as well as the
input features.
.. code:: python
......@@ -135,7 +141,7 @@ layer.
edge_subgraph.apply_edges(self.apply_edges)
return edge_subgraph.edata['score']
The entire model will take the list of blocks and the edge subgraph
The entire model will take the list of MFGs and the edge subgraph
generated by the data loader, as well as the input node features as
follows:
......@@ -153,14 +159,14 @@ follows:
return self.predictor(edge_subgraph, x)
DGL ensures that the nodes in the edge subgraph are the same as the
output nodes of the last block in the generated list of blocks.
output nodes of the last MFG in the generated list of MFGs.
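As a quick check (a hedged sketch assuming a homogeneous graph, where
``edge_subgraph`` and ``blocks`` are the items yielded by the data loader), the
original node IDs of both should match element-wise:
.. code:: python
   assert torch.equal(edge_subgraph.ndata[dgl.NID], blocks[-1].dstdata[dgl.NID])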
Training Loop
~~~~~~~~~~~~~
The training loop is very similar to node classification. You can
iterate over the dataloader and get a subgraph induced by the edges in
the minibatch, as well as the list of blocks necessary for computing
the minibatch, as well as the list of MFGs necessary for computing
their incident node representations.
.. code:: python
......
......@@ -109,9 +109,11 @@ When a negative sampler is provided, DGL’s data loader will generate
three items per minibatch:
- A positive graph containing all the edges sampled in the minibatch.
- A negative graph containing all the non-existent edges generated by
the negative sampler.
- A list of blocks generated by the neighborhood sampler.
- A list of *message flow graphs* (MFGs) generated by the neighborhood sampler.
So one can define the link prediction model as follows, which takes in the
three items as well as the input features.
......
......@@ -5,10 +5,16 @@
:ref:`(中文版) <guide_cn-minibatch-custom-gnn-module>`
.. note::
:doc:`This tutorial <tutorials/large/L4_message_passing>` has similar
content to this section for the homogeneous graph case.
If you are familiar with how to write a custom GNN module for updating
the entire graph for homogeneous or heterogeneous graphs (see
:ref:`guide-nn`), the code for computing on
blocks is similar, with the exception that the nodes are divided into
MFGs is similar, with the exception that the nodes are divided into
input nodes and output nodes.
For example, consider the following custom graph convolution module
......@@ -30,7 +36,7 @@ like.
return self.W(torch.cat([g.ndata['h'], g.ndata['h_neigh']], 1))
If you have a custom message passing NN module for the full graph, and
you would like to make it work for blocks, you only need to rewrite the
you would like to make it work for MFGs, you only need to rewrite the
forward function as follows. Note that the corresponding statements from
the full-graph implementation are commented; you can compare the
original statements with the new statements.
......@@ -62,7 +68,7 @@ original statements with the new statements.
[block.dstdata['h'], block.dstdata['h_neigh']], 1))
In general, you need to do the following to make your NN module work for
blocks.
MFGs.
- Obtain the features for output nodes from the input features by
slicing the first few rows. The number of rows can be obtained by
......@@ -149,22 +155,22 @@ serve for input or output.
return {ntype: g.dstnodes[ntype].data['h_dst']
for ntype in g.ntypes}
Writing modules that work on homogeneous graphs, bipartite graphs, and blocks
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Writing modules that work on homogeneous graphs, bipartite graphs, and MFGs
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
All message passing modules in DGL work on homogeneous graphs,
unidirectional bipartite graphs (that have two node types and one edge
type), and a block with one edge type. Essentially, the input graph and
type), and an MFG with one edge type. Essentially, the input graph and
feature of a builtin DGL neural network module must satisfy either of
the following cases.
- If the input feature is a pair of tensors, then the input graph must
be unidirectional bipartite.
- If the input feature is a single tensor and the input graph is a
block, DGL will automatically set the feature on the output nodes as
MFG, DGL will automatically set the feature on the output nodes as
the first few rows of the input node features.
- If the input feature is a single tensor and the input graph is
not a block, then the input graph must be homogeneous.
not an MFG, then the input graph must be homogeneous.
For example, the following is simplified from the PyTorch implementation
of :class:`dgl.nn.pytorch.SAGEConv` (also available in MXNet and Tensorflow)
......@@ -194,6 +200,6 @@ of :class:`dgl.nn.pytorch.SAGEConv` (also available in MXNet and Tensorflow)
self.W(torch.cat([g.dstdata['h'], g.dstdata['h_neigh']], 1)))
:ref:`guide-nn` also provides a walkthrough on :class:`dgl.nn.pytorch.SAGEConv`,
which works on unidirectional bipartite graphs, homogeneous graphs, and blocks.
which works on unidirectional bipartite graphs, homogeneous graphs, and MFGs.
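For illustration, the following is a hedged usage sketch of the three cases above
with :class:`dgl.nn.pytorch.SAGEConv`; the graph sizes and feature dimensions are
arbitrary.
.. code:: python
   import dgl
   import torch
   from dgl.nn import SAGEConv

   conv = SAGEConv(10, 10, 'mean')
   g = dgl.rand_graph(30, 100)                       # homogeneous graph
   seeds = torch.LongTensor([0, 1, 2])
   block = dgl.to_block(dgl.in_subgraph(g, seeds), seeds)

   # Single tensor on a homogeneous graph.
   h = conv(g, torch.randn(g.num_nodes(), 10))

   # Single tensor on an MFG: the first rows are sliced automatically as the
   # destination node features.
   h = conv(block, torch.randn(block.num_src_nodes(), 10))

   # A pair of tensors on a unidirectional bipartite graph or an MFG.
   h = conv(block, (torch.randn(block.num_src_nodes(), 10),
                    torch.randn(block.num_dst_nodes(), 10)))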
......@@ -31,7 +31,7 @@ over a set of nodes in minibatches.
For example, the following code creates a PyTorch DataLoader that
iterates over the training node ID array ``train_nids`` in batches,
putting the list of generated blocks onto GPU.
putting the list of generated MFGs onto GPU.
.. code:: python
......@@ -51,7 +51,7 @@ putting the list of generated blocks onto GPU.
Iterating over the DataLoader will yield a list of specially created
graphs representing the computation dependencies on each layer. They are
called *blocks* in DGL.
called *message flow graphs* (MFGs) in DGL.
.. code:: python
......@@ -65,12 +65,19 @@ be computed as output, which node representations are needed as input,
and how the representations from the input nodes propagate to the output
nodes.
For a complete list of supported builtin samplers, please refer to the
:ref:`neighborhood sampler API reference <api-dataloading-neighbor-sampling>`.
.. note::
See the :doc:`Stochastic Training Tutorial
<tutorials/large/L0_neighbor_sampling_overview>` for the concept of
message flow graph.
For a complete list of supported builtin samplers, please refer to the
:ref:`neighborhood sampler API reference <api-dataloading-neighbor-sampling>`.
If you wish to develop your own neighborhood sampler or you want a more
detailed explanation of the concept of MFGs, please refer to
:ref:`guide-minibatch-customizing-neighborhood-sampler`.
If you wish to develop your own neighborhood sampler or you want a more
detailed explanation of the concept of blocks, please refer to
:ref:`guide-minibatch-customizing-neighborhood-sampler`.
.. _guide-minibatch-node-classification-model:
......@@ -114,7 +121,7 @@ The DGL ``GraphConv`` modules above accepts an element in ``blocks``
generated by the data loader as an argument.
:ref:`The API reference of each NN module <apinn>` will tell you
whether it supports accepting a block as an argument.
whether it supports accepting an MFG as an argument.
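As a minimal hedged sketch (the feature size and the ``blocks``/``x`` variables
are assumed from the surrounding training code), a module such as
:class:`dgl.nn.pytorch.GraphConv` is applied to an MFG the same way it is applied
to a full graph:
.. code:: python
   from dgl.nn import GraphConv

   layer = GraphConv(in_feats=100, out_feats=16)
   # A single input tensor on an MFG is sliced automatically to obtain the
   # destination node features.
   h = layer(blocks[0], x)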
If you wish to use your own message passing module, please refer to
:ref:`guide-minibatch-custom-gnn-module`.
......@@ -124,7 +131,7 @@ Training Loop
The training loop simply consists of iterating over the dataset with the
customized batching iterator. During each iteration that yields a list
of blocks, we:
of MFGs, we:
1. Load the node features corresponding to the input nodes onto GPU. The
node features can be stored in either memory or external storage.
......@@ -133,10 +140,10 @@ of blocks, we:
If the features are stored in ``g.ndata``, then the features can be loaded
by accessing the features in ``blocks[0].srcdata``, the features of
input nodes of the first block, which is identical to all the
source nodes of the first MFG, which is identical to all the
nodes necessary for computing the final representations.
2. Feed the list of blocks and the input node features to the multilayer
2. Feed the list of MFGs and the input node features to the multilayer
GNN and get the outputs.
3. Load the node labels corresponding to the output nodes onto GPU.
......@@ -147,7 +154,7 @@ of blocks, we:
If the features are stored in ``g.ndata``, then the labels
can be loaded by accessing the features in ``blocks[-1].dstdata``,
the features of output nodes of the last block, which is identical to
the features of destination nodes of the last MFG, which is identical to
the nodes we wish to compute the final representation.
4. Compute the loss and backpropagate.
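Putting the four steps together, a minimal sketch of the loop body could look like
the following (``dataloader``, ``model``, and ``opt`` are assumed to be defined as
in the earlier sections; the feature and label names are illustrative):
.. code:: python
   import torch.nn.functional as F

   for input_nodes, output_nodes, blocks in dataloader:
       input_features = blocks[0].srcdata['feat']     # step 1
       predictions = model(blocks, input_features)    # step 2
       labels = blocks[-1].dstdata['label']           # step 3
       loss = F.cross_entropy(predictions, labels)    # step 4
       opt.zero_grad()
       loss.backward()
       opt.step()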
......
......@@ -166,7 +166,7 @@ def batch(graphs, ndata=ALL, edata=ALL, *,
raise DGLError('Invalid argument edata: must be a string list but got {}.'.format(
type(edata)))
if any(g.is_block for g in graphs):
raise DGLError("Batching a block is not supported.")
raise DGLError("Batching a MFG is not supported.")
relations = list(sorted(graphs[0].canonical_etypes))
relation_ids = [graphs[0].get_etype_id(r) for r in relations]
......
......@@ -358,14 +358,14 @@ def heterograph(data_dict,
return retg.to(device)
def create_block(data_dict, num_src_nodes=None, num_dst_nodes=None, idtype=None, device=None):
"""Create a :class:`DGLBlock` object.
"""Create a message flow graph (MFG) as a :class:`DGLBlock` object.
Parameters
----------
data_dict : graph data
The dictionary data for constructing a block. The keys are in the form of
string triplets (src_type, edge_type, dst_type), specifying the input node type,
edge type, and output node type. The values are graph data in the form of
The dictionary data for constructing a MFG. The keys are in the form of
string triplets (src_type, edge_type, dst_type), specifying the source node type,
edge type, and destination node type. The values are graph data in the form of
:math:`(U, V)`, where :math:`(U[i], V[i])` forms the edge with ID :math:`i`.
The allowed graph data formats are:
......@@ -376,35 +376,35 @@ def create_block(data_dict, num_src_nodes=None, num_dst_nodes=None, idtype=None,
- ``(iterable[int], iterable[int])``: Similar to the tuple of node-tensors
format, but stores node IDs in two sequences (e.g. list, tuple, numpy.ndarray).
If you would like to create a block with a single input node type, a single output
If you would like to create an MFG with a single source node type, a single destination
node type, and a single edge type, then you can pass in the graph data directly
without wrapping it as a dictionary.
num_src_nodes : dict[str, int] or int, optional
The number of nodes for each input node type, which is a dictionary mapping a node type
:math:`T` to the number of :math:`T`-typed input nodes.
The number of nodes for each source node type, which is a dictionary mapping a node type
:math:`T` to the number of :math:`T`-typed source nodes.
If not given for a node type :math:`T`, DGL finds the largest ID appearing in *every*
graph data whose input node type is :math:`T`, and sets the number of nodes to
graph data whose source node type is :math:`T`, and sets the number of nodes to
be that ID plus one. If given and the value is no greater than the largest ID for some
input node type, DGL will raise an error. By default, DGL infers the number of nodes for
all input node types.
source node type, DGL will raise an error. By default, DGL infers the number of nodes for
all source node types.
If you would like to create a block with a single input node type, a single output
If you would like to create an MFG with a single source node type, a single destination
node type, and a single edge type, then you can pass in an integer to directly
represent the number of input nodes.
represent the number of source nodes.
num_dst_nodes : dict[str, int] or int, optional
The number of nodes for each output node type, which is a dictionary mapping a node type
:math:`T` to the number of :math:`T`-typed output nodes.
The number of nodes for each destination node type, which is a dictionary mapping a node
type :math:`T` to the number of :math:`T`-typed destination nodes.
If not given for a node type :math:`T`, DGL finds the largest ID appearing in *every*
graph data whose output node type is :math:`T`, and sets the number of nodes to
graph data whose destination node type is :math:`T`, and sets the number of nodes to
be that ID plus one. If given and the value is no greater than the largest ID for some
output node type, DGL will raise an error. By default, DGL infers the number of nodes for
all output node types.
destination node type, DGL will raise an error. By default, DGL infers the number of nodes
for all destination node types.
If you would like to create a block with a single output node type, a single output
node type, and a single edge type, then you can pass in an integer to directly
represent the number of output nodes.
If you would like to create an MFG with a single source node type, a single
destination node type, and a single edge type, then you can pass in an integer to directly
represent the number of destination nodes.
idtype : int32 or int64, optional
The data type for storing the structure-related graph information such as node and
edge IDs. It should be a framework-specific data type object (e.g., ``torch.int32``).
......@@ -419,7 +419,7 @@ def create_block(data_dict, num_src_nodes=None, num_dst_nodes=None, idtype=None,
Returns
-------
DGLBlock
The created block.
The created MFG.
Notes
-----
......@@ -501,12 +501,12 @@ def create_block(data_dict, num_src_nodes=None, num_dst_nodes=None, idtype=None,
num_dst_nodes[dty] = max(num_dst_nodes[dty], vrange)
else: # sanity check
if num_src_nodes[sty] < urange:
raise DGLError('The given number of nodes of input node type {} must be larger'
raise DGLError('The given number of nodes of source node type {} must be larger'
' than the max ID in the data, but got {} and {}.'.format(
sty, num_src_nodes[sty], urange - 1))
if num_dst_nodes[dty] < vrange:
raise DGLError('The given number of nodes of output node type {} must be larger'
' than the max ID in the data, but got {} and {}.'.format(
raise DGLError('The given number of nodes of destination node type {} must be'
' larger than the max ID in the data, but got {} and {}.'.format(
dty, num_dst_nodes[dty], vrange - 1))
# Create the graph
......@@ -546,17 +546,17 @@ def create_block(data_dict, num_src_nodes=None, num_dst_nodes=None, idtype=None,
return retg.to(device)
def block_to_graph(block):
"""Convert a :class:`DGLBlock` object to a :class:`DGLGraph`.
"""Convert a message flow graph (MFG) as a :class:`DGLBlock` object to a :class:`DGLGraph`.
DGL will rename all the input node types by suffixing with ``_src``, and
all the output node types by suffixing with ``_dst``.
DGL will rename all the source node types by suffixing with ``_src``, and
all the destination node types by suffixing with ``_dst``.
Features on the returned graph will be preserved.
Parameters
----------
block : DGLBlock
The block.
The MFG.
Returns
-------
......
......@@ -15,7 +15,7 @@ from ..distributed.dist_graph import DistGraph
# pylint: disable=unused-argument
def assign_block_eids(block, frontier):
"""Assigns edge IDs from the original graph to the block.
"""Assigns edge IDs from the original graph to the message flow graph (MFG).
See also
--------
......@@ -117,8 +117,8 @@ class BlockSampler(object):
"""Abstract class specifying the neighborhood sampling strategy for DGL data loaders.
The main method for BlockSampler is :meth:`sample_blocks`,
which generates a list of blocks for a multi-layer GNN given a set of seed nodes to
have their outputs computed.
which generates a list of message flow graphs (MFGs) for a multi-layer GNN given a set of
seed nodes to have their outputs computed.
The default implementation of :meth:`sample_blocks` is
to repeat :attr:`num_layers` times the following procedure from the last layer to the first
......@@ -133,13 +133,13 @@ class BlockSampler(object):
reverse edges. This is controlled by the argument :attr:`exclude_eids` in
:meth:`sample_blocks` method.
* Convert the frontier into a block.
* Convert the frontier into an MFG.
* Optionally assign the IDs of the edges in the original graph selected in the first step
to the block, controlled by the argument ``return_eids`` in
to the MFG, controlled by the argument ``return_eids`` in
:meth:`sample_blocks` method.
* Prepend the block to the block list to be returned.
* Prepend the MFG to the MFG list to be returned.
All subclasses should override :meth:`sample_frontier`
method while specifying the number of layers to sample in :attr:`num_layers` argument.
......@@ -149,19 +149,21 @@ class BlockSampler(object):
num_layers : int
The number of layers to sample.
return_eids : bool, default False
Whether to return the edge IDs involved in message passing in the block.
Whether to return the edge IDs involved in message passing in the MFG.
If True, the edge IDs will be stored as an edge feature named ``dgl.EID``.
Notes
-----
For the concept of frontiers and blocks, please refer to User Guide Section 6 [TODO].
For the concept of frontiers and MFGs, please refer to
:ref:`User Guide Section 6 <guide-minibatch>` and
:doc:`Minibatch Training Tutorials <tutorials/large/L0_neighbor_sampling_overview>`.
"""
def __init__(self, num_layers, return_eids):
self.num_layers = num_layers
self.return_eids = return_eids
def sample_frontier(self, block_id, g, seed_nodes):
"""Generate the frontier given the output nodes.
"""Generate the frontier given the destination nodes.
The subclasses should override this function.
......@@ -172,7 +174,7 @@ class BlockSampler(object):
g : DGLGraph
The original graph.
seed_nodes : Tensor or dict[ntype, Tensor]
The output nodes by node type.
The destination nodes by node type.
If the graph only has one node type, one can just specify a single tensor
of node IDs.
......@@ -184,19 +186,21 @@ class BlockSampler(object):
Notes
-----
For the concept of frontiers and blocks, please refer to User Guide Section 6 [TODO].
For the concept of frontiers and MFGs, please refer to
:ref:`User Guide Section 6 <guide-minibatch>` and
:doc:`Minibatch Training Tutorials <tutorials/large/L0_neighbor_sampling_overview>`.
"""
raise NotImplementedError
def sample_blocks(self, g, seed_nodes, exclude_eids=None):
"""Generate the a list of blocks given the output nodes.
"""Generate the a list of MFGs given the destination nodes.
Parameters
----------
g : DGLGraph
The original graph.
seed_nodes : Tensor or dict[ntype, Tensor]
The output nodes by node type.
The destination nodes by node type.
If the graph only has one node type, one can just specify a single tensor
of node IDs.
......@@ -206,11 +210,13 @@ class BlockSampler(object):
Returns
-------
list[DGLGraph]
The blocks generated for computing the multi-layer GNN output.
The MFGs generated for computing the multi-layer GNN output.
Notes
-----
For the concept of frontiers and blocks, please refer to User Guide Section 6 [TODO].
For the concept of frontiers and MFGs, please refer to
:ref:`User Guide Section 6 <guide-minibatch>` and
:doc:`Minibatch Training Tutorials <tutorials/large/L0_neighbor_sampling_overview>`.
"""
blocks = []
exclude_eids = (
......@@ -259,11 +265,13 @@ class Collator(ABC):
Provides a :attr:`dataset` object containing the collection of all nodes or edges,
as well as a :attr:`collate` method that combines a set of items from
:attr:`dataset` and obtains the blocks.
:attr:`dataset` and obtains the message flow graphs (MFGs).
Notes
-----
For the concept of blocks, please refer to User Guide Section 6 [TODO].
For the concept of MFGs, please refer to
:ref:`User Guide Section 6 <guide-minibatch>` and
:doc:`Minibatch Training Tutorials <tutorials/large/L0_neighbor_sampling_overview>`.
"""
@abstractproperty
def dataset(self):
......@@ -272,7 +280,7 @@ class Collator(ABC):
@abstractmethod
def collate(self, items):
"""Combines the items from the dataset object and obtains the list of blocks.
"""Combines the items from the dataset object and obtains the list of MFGs.
Parameters
----------
......@@ -281,7 +289,9 @@ class Collator(ABC):
Notes
-----
For the concept of blocks, please refer to User Guide Section 6 [TODO].
For the concept of MFGs, please refer to
:ref:`User Guide Section 6 <guide-minibatch>` and
:doc:`Minibatch Training Tutorials <tutorials/large/L0_neighbor_sampling_overview>`.
"""
raise NotImplementedError
......@@ -330,6 +340,12 @@ class NodeCollator(Collator):
... batch_size=1024, shuffle=True, drop_last=False, num_workers=4)
>>> for input_nodes, output_nodes, blocks in dataloader:
... train_on(input_nodes, output_nodes, blocks)
Notes
-----
For the concept of MFGs, please refer to
:ref:`User Guide Section 6 <guide-minibatch>` and
:doc:`Minibatch Training Tutorials <tutorials/large/L0_neighbor_sampling_overview>`.
"""
def __init__(self, g, nids, block_sampler):
self.g = g
......@@ -351,7 +367,7 @@ class NodeCollator(Collator):
return self._dataset
def collate(self, items):
"""Find the list of blocks necessary for computing the representation of given
"""Find the list of MFGs necessary for computing the representation of given
nodes for a node classification/regression task.
Parameters
......@@ -372,8 +388,8 @@ class NodeCollator(Collator):
If the original graph has multiple node types, return a dictionary of
node type names and node ID tensors. Otherwise, return a single tensor.
blocks : list[DGLGraph]
The list of blocks necessary for computing the representation.
MFGs : list[DGLGraph]
The list of MFGs necessary for computing the representation.
"""
if isinstance(items[0], tuple):
# returns a list of pairs: group them by node types into a dict
......@@ -404,7 +420,7 @@ class EdgeCollator(Collator):
* If a negative sampler is given, another graph that contains the "negative edges",
connecting the source and destination nodes yielded from the given negative sampler.
* A list of blocks necessary for computing the representation of the incident nodes
* A list of MFGs necessary for computing the representation of the incident nodes
of the edges in the minibatch.
Parameters
......@@ -552,6 +568,12 @@ class EdgeCollator(Collator):
... batch_size=1024, shuffle=True, drop_last=False, num_workers=4)
>>> for input_nodes, pos_pair_graph, neg_pair_graph, blocks in dataloader:
... train_on(input_nodes, pos_pair_graph, neg_pair_graph, blocks)
Notes
-----
For the concept of MFGs, please refer to
:ref:`User Guide Section 6 <guide-minibatch>` and
:doc:`Minibatch Training Tutorials <tutorials/large/L0_neighbor_sampling_overview>`.
"""
def __init__(self, g, eids, block_sampler, g_sampling=None, exclude=None,
reverse_eids=None, reverse_etypes=None, negative_sampler=None):
......@@ -690,7 +712,7 @@ class EdgeCollator(Collator):
Note that the metagraph of this graph will be identical to that of the original
graph.
blocks : list[DGLGraph]
The list of blocks necessary for computing the representation of the edges.
The list of MFGs necessary for computing the representation of the edges.
"""
if self.negative_sampler is None:
return self._collate(items)
......
......@@ -25,7 +25,7 @@ class MultiLayerNeighborSampler(BlockSampler):
replace : bool, default True
Whether to sample with replacement
return_eids : bool, default False
Whether to return the edge IDs involved in message passing in the block.
Whether to return the edge IDs involved in message passing in the MFG.
If True, the edge IDs will be stored as an edge feature named ``dgl.EID``.
Examples
......@@ -50,6 +50,12 @@ class MultiLayerNeighborSampler(BlockSampler):
... {('user', 'follows', 'user'): 5,
... ('user', 'plays', 'game'): 4,
... ('game', 'played-by', 'user'): 3}] * 3)
Notes
-----
For the concept of MFGs, please refer to
:ref:`User Guide Section 6 <guide-minibatch>` and
:doc:`Minibatch Training Tutorials <tutorials/large/L0_neighbor_sampling_overview>`.
"""
def __init__(self, fanouts, replace=False, return_eids=False):
super().__init__(len(fanouts), return_eids)
......@@ -84,7 +90,7 @@ class MultiLayerFullNeighborSampler(MultiLayerNeighborSampler):
n_layers : int
The number of GNN layers to sample.
return_eids : bool, default False
Whether to return the edge IDs involved in message passing in the block.
Whether to return the edge IDs involved in message passing in the MFG.
If True, the edge IDs will be stored as an edge feature named ``dgl.EID``.
Examples
......@@ -100,6 +106,12 @@ class MultiLayerFullNeighborSampler(MultiLayerNeighborSampler):
... batch_size=1024, shuffle=True, drop_last=False, num_workers=4)
>>> for blocks in dataloader:
... train_on(blocks)
Notes
-----
For the concept of MFGs, please refer to
:ref:`User Guide Section 6 <guide-minibatch>` and
:doc:`Minibatch Training Tutorials <tutorials/large/L0_neighbor_sampling_overview>`.
"""
def __init__(self, n_layers, return_eids=False):
super().__init__([None] * n_layers, return_eids=return_eids)
......@@ -16,8 +16,8 @@ def _remove_kwargs_dist(kwargs):
# The following code is a fix to the PyTorch-specific issue in
# https://github.com/dmlc/dgl/issues/2137
#
# Basically the sampled blocks/subgraphs contain the features extracted from the
# parent graph. In DGL, the blocks/subgraphs will hold a reference to the parent
# Basically the sampled MFGs/subgraphs contain the features extracted from the
# parent graph. In DGL, the MFGs/subgraphs will hold a reference to the parent
# graph feature tensor and an index tensor, so that the features could be extracted upon
# request. However, in the context of multiprocessed sampling, we do not need to
# transmit the parent graph feature tensor from the subprocess to the main process,
......@@ -26,13 +26,13 @@ def _remove_kwargs_dist(kwargs):
# it with the following trick:
#
# In the collator running in the sampler processes:
# For each frame in the block, we check each column and the column with the same name
# For each frame in the MFG, we check each column and the column with the same name
# in the corresponding parent frame. If the storage of the former column is the
# same object as the latter column, we are sure that the former column is a
# subcolumn of the latter, and set the storage of the former column as None.
#
# In the iterator of the main process:
# For each frame in the block, we check each column and the column with the same name
# For each frame in the MFG, we check each column and the column with the same name
# in the corresponding parent frame. If the storage of the former column is None,
# we replace it with the storage of the latter column.
......@@ -118,7 +118,7 @@ def _restore_blocks_storage(blocks, g):
class _NodeCollator(NodeCollator):
def collate(self, items):
# input_nodes, output_nodes, [items], blocks
# input_nodes, output_nodes, blocks
result = super().collate(items)
_pop_blocks_storage(result[-1], self.g)
return result
......@@ -173,7 +173,7 @@ class _EdgeDataLoaderIter:
result_ = next(self.iter_)
if self.edge_dataloader.collator.negative_sampler is not None:
# input_nodes, pair_graph, neg_pair_graph, blocks
# With a negative sampler: input_nodes, pair_graph, neg_pair_graph, blocks.
# Otherwise: input_nodes, pair_graph, blocks
_restore_subgraph_storage(result_[2], self.edge_dataloader.collator.g)
_restore_subgraph_storage(result_[1], self.edge_dataloader.collator.g)
......@@ -184,7 +184,7 @@ class _EdgeDataLoaderIter:
class NodeDataLoader:
"""PyTorch dataloader for batch-iterating over a set of nodes, generating the list
of blocks as computation dependency of the said minibatch.
of message flow graphs (MFGs) as the computation dependency of the minibatch.
Parameters
----------
......@@ -195,7 +195,7 @@ class NodeDataLoader:
block_sampler : dgl.dataloading.BlockSampler
The neighborhood sampler.
device : device context, optional
The device of the generated blocks in each iteration, which should be a
The device of the generated MFGs in each iteration, which should be a
PyTorch device object (e.g., ``torch.device``).
kwargs : dict
Arguments being passed to :py:class:`torch.utils.data.DataLoader`.
......@@ -212,6 +212,12 @@ class NodeDataLoader:
... batch_size=1024, shuffle=True, drop_last=False, num_workers=4)
>>> for input_nodes, output_nodes, blocks in dataloader:
... train_on(input_nodes, output_nodes, blocks)
Notes
-----
Please refer to
:doc:`Minibatch Training Tutorials <tutorials/large/L0_neighbor_sampling_overview>`
and :ref:`User Guide Section 6 <guide-minibatch>` for usage.
"""
collator_arglist = inspect.getfullargspec(NodeCollator).args
......@@ -261,8 +267,8 @@ class NodeDataLoader:
class EdgeDataLoader:
"""PyTorch dataloader for batch-iterating over a set of edges, generating the list
of blocks as computation dependency of the said minibatch for edge classification,
edge regression, and link prediction.
of message flow graphs (MFGs) as the computation dependency of the minibatch for
edge classification, edge regression, and link prediction.
For each iteration, the object will yield
......@@ -275,7 +281,7 @@ class EdgeDataLoader:
* If a negative sampler is given, another graph that contains the "negative edges",
connecting the source and destination nodes yielded from the given negative sampler.
* A list of blocks necessary for computing the representation of the incident nodes
* A list of MFGs necessary for computing the representation of the incident nodes
of the edges in the minibatch.
For more details, please refer to :ref:`guide-minibatch-edge-classification-sampler`
......@@ -290,7 +296,7 @@ class EdgeDataLoader:
block_sampler : dgl.dataloading.BlockSampler
The neighborhood sampler.
device : device context, optional
The device of the generated blocks and graphs in each iteration, which should be a
The device of the generated MFGs and graphs in each iteration, which should be a
PyTorch device object (e.g., ``torch.device``).
g_sampling : DGLGraph, optional
The graph where neighborhood sampling is performed.
......@@ -406,11 +412,17 @@ class EdgeDataLoader:
... negative_sampler=neg_sampler,
... batch_size=1024, shuffle=True, drop_last=False, num_workers=4)
>>> for input_nodes, pos_pair_graph, neg_pair_graph, blocks in dataloader:
... train_on(input_nodse, pair_graph, neg_pair_graph, blocks)
... train_on(input_nodes, pos_pair_graph, neg_pair_graph, blocks)
See also
--------
:class:`~dgl.dataloading.dataloader.EdgeCollator`
dgl.dataloading.dataloader.EdgeCollator
Notes
-----
Please refer to
:doc:`Minibatch Training Tutorials <tutorials/large/L0_neighbor_sampling_overview>`
and :ref:`User Guide Section 6 <guide-minibatch>` for usage.
For end-to-end usages, please refer to the following tutorial/examples:
......
......@@ -1668,35 +1668,35 @@ def to_block(g, dst_nodes=None, include_dst_in_src=True):
"""Convert a graph into a bipartite-structured *block* for message passing.
A block is a graph consisting of two sets of nodes: the
*input* nodes and *output* nodes. The input and output nodes can have multiple
node types. All the edges connect from input nodes to output nodes.
*source* nodes and *destination* nodes. The source and destination nodes can have multiple
node types. All the edges connect from source nodes to destination nodes.
Specifically, the input nodes and output nodes will have the same node types as the
Specifically, the source nodes and destination nodes will have the same node types as the
ones in the original graph. DGL maps each edge ``(u, v)`` with edge type
``(utype, etype, vtype)`` in the original graph to the edge with type
``etype`` connecting from node ID ``u`` of type ``utype`` in the input side to node
ID ``v`` of type ``vtype`` in the output side.
``etype`` connecting from node ID ``u`` of type ``utype`` in the source side to node
ID ``v`` of type ``vtype`` in the destination side.
For blocks returned by :func:`to_block`, the output nodes of the block will only
contain the nodes that have at least one inbound edge of any type. The input nodes
of the block will only contain the nodes that appear in the output nodes, as well
as the nodes that have at least one outbound edge connecting to one of the output nodes.
For blocks returned by :func:`to_block`, the destination nodes of the block will only
contain the nodes that have at least one inbound edge of any type. The source nodes
of the block will only contain the nodes that appear in the destination nodes, as well
as the nodes that have at least one outbound edge connecting to one of the destination nodes.
If the :attr:`dst_nodes` argument is not None, it specifies the output nodes instead.
The destination nodes are specified by the :attr:`dst_nodes` argument if it is not None.
Parameters
----------
g : DGLGraph
The graph.
dst_nodes : Tensor or dict[str, Tensor], optional
The list of output nodes.
The list of destination nodes.
If a tensor is given, the graph must have only one node type.
If given, it must be a superset of all the nodes that have at least one inbound
edge. An error will be raised otherwise.
include_dst_in_src : bool
If False, do not include output nodes in input nodes.
If False, do not include destination nodes in source nodes.
(Default: True)
......@@ -1734,13 +1734,13 @@ def to_block(g, dst_nodes=None, include_dst_in_src=True):
>>> g = dgl.graph(([1, 2], [2, 3]))
>>> block = dgl.to_block(g, torch.LongTensor([3, 2]))
The output nodes would be exactly the same as the ones given: [3, 2].
The destination nodes would be exactly the same as the ones given: [3, 2].
>>> induced_dst = block.dstdata[dgl.NID]
>>> induced_dst
tensor([3, 2])
The first few input nodes would also be exactly the same as
The first few source nodes would also be exactly the same as
the ones given. The rest of the nodes are the ones necessary for message passing
into nodes 3, 2. This means that the node 1 would be included.
......@@ -1749,7 +1749,7 @@ def to_block(g, dst_nodes=None, include_dst_in_src=True):
tensor([3, 2, 1])
You can notice that the first two nodes are identical to the given nodes as well as
the output nodes.
the destination nodes.
The induced edges can also be obtained by the following:
......@@ -1764,20 +1764,20 @@ def to_block(g, dst_nodes=None, include_dst_in_src=True):
>>> induced_src[src], induced_dst[dst]
(tensor([2, 1]), tensor([3, 2]))
The output nodes specified must be a superset of the nodes that have edges connecting
to them. For example, the following will raise an error since the output nodes
The destination nodes specified must be a superset of the nodes that have edges connecting
to them. For example, the following will raise an error since the destination nodes
do not contain node 3, which has an edge connecting to it.
>>> g = dgl.graph(([1, 2], [2, 3]))
>>> dgl.to_block(g, torch.LongTensor([2])) # error
Converting a heterogeneous graph to a block is similar, except that when specifying
the output nodes, you have to give a dict:
the destination nodes, you have to give a dict:
>>> g = dgl.heterograph({('A', '_E', 'B'): ([1, 2], [2, 3])})
If you don't specify any node of type A on the output side, the node type ``A``
in the block would have zero nodes on the output side.
If you don't specify any node of type A on the destination side, the node type ``A``
in the block would have zero nodes on the destination side.
>>> block = dgl.to_block(g, {'B': torch.LongTensor([3, 2])})
>>> block.number_of_dst_nodes('A')
......@@ -1787,12 +1787,12 @@ def to_block(g, dst_nodes=None, include_dst_in_src=True):
>>> block.dstnodes['B'].data[dgl.NID]
tensor([3, 2])
The input side would contain all the nodes on the output side:
The source side would contain all the nodes on the destination side:
>>> block.srcnodes['B'].data[dgl.NID]
tensor([3, 2])
As well as all the nodes that have connections to the nodes on the output side:
As well as all the nodes that have connections to the nodes on the destination side:
>>> block.srcnodes['A'].data[dgl.NID]
tensor([2, 1])
......
......@@ -93,15 +93,16 @@ By the end of this tutorial, you will be able to
######################################################################
# You can also notice in the animation above that the computation
# dependencies can be described as a series of
# *bipartite graphs*.
# The output nodes are on one side and all the nodes necessary for inputs
# are on the other side. The arrows indicate how the sampled neighbors
# propagates messages to the nodes.
# bipartite graphs.
# The output nodes (called *destination nodes*) are on one side and all the
# nodes necessary for inputs (called *source nodes*) are on the other side.
# The arrows indicate how the sampled neighbors propagate messages to the nodes.
# DGL calls such graphs *message flow graphs* (MFGs).
#
# Note that some GNN modules, such as `SAGEConv`, need to use the output
# Note that some GNN modules, such as `SAGEConv`, need to use the destination
# nodes' features on the previous layer to compute the outputs. Without
# loss of generality, DGL always includes the output nodes themselves
# in the input nodes.
# loss of generality, DGL always includes the destination nodes themselves
# in the source nodes.
#
......
......@@ -70,7 +70,7 @@ test_nids = idx_split['test']
#
# In the :doc:`previous tutorial <L0_neighbor_sampling_overview>`, you
# have seen that the computation dependency for message passing of a
# single node can be described as a series of bipartite graphs.
# single node can be described as a series of *message flow graphs* (MFGs).
#
# |image1|
#
......@@ -84,10 +84,10 @@ test_nids = idx_split['test']
#
# DGL provides tools to iterate over the dataset in minibatches
# while generating the computation dependencies to compute their outputs
# with the bipartite graphs above. For node classification, you can use
# with the MFGs above. For node classification, you can use
# ``dgl.dataloading.NodeDataLoader`` for iterating over the dataset.
# It accepts a sampler object to control how to generate the computation
# dependencies in the form of bipartite graphs. DGL provides
# dependencies in the form of MFGs. DGL provides
# implementations of common sampling algorithms such as
# ``dgl.dataloading.MultiLayerNeighborSampler`` which randomly picks
# a fixed number of neighbors for each node.
......@@ -113,7 +113,7 @@ train_dataloader = dgl.dataloading.NodeDataLoader(
graph, # The graph
train_nids, # The node IDs to iterate over in minibatches
sampler, # The neighbor sampler
device=device, # Put the sampled bipartite graphs on CPU or GPU
device=device, # Put the sampled MFGs on CPU or GPU
# The following arguments are inherited from PyTorch DataLoader.
batch_size=1024, # Batch size
shuffle=True, # Whether to shuffle the nodes for every epoch
......@@ -126,7 +126,7 @@ train_dataloader = dgl.dataloading.NodeDataLoader(
# You can iterate over the data loader and see what it yields.
#
input_nodes, output_nodes, bipartites = example_minibatch = next(iter(train_dataloader))
input_nodes, output_nodes, mfgs = example_minibatch = next(iter(train_dataloader))
print(example_minibatch)
print("To compute {} nodes' outputs, we need {} nodes' input features".format(len(output_nodes), len(input_nodes)))
......@@ -138,24 +138,24 @@ print("To compute {} nodes' outputs, we need {} nodes' input features".format(le
# are needed on the first GNN layer for this minibatch.
# - An ID tensor for the output nodes, i.e. nodes whose representations
# are to be computed.
# - A list of bipartite graphs storing the computation dependencies
# - A list of MFGs storing the computation dependencies
# for each GNN layer.
#
######################################################################
# You can get the input and output node IDs of the bipartite graphs
# and verify that the first few input nodes are always the same as the output
# You can get the source and destination node IDs of the MFGs
# and verify that the first few source nodes are always the same as the destination
# nodes. As we described in the :doc:`overview <L0_neighbor_sampling_overview>`,
# output nodes' own features from the previous layer may also be necessary in
# destination nodes' own features from the previous layer may also be necessary in
# the computation of the new features.
#
bipartite_0_src = bipartites[0].srcdata[dgl.NID]
bipartite_0_dst = bipartites[0].dstdata[dgl.NID]
print(bipartite_0_src)
print(bipartite_0_dst)
print(torch.equal(bipartite_0_src[:bipartites[0].num_dst_nodes()], bipartite_0_dst))
mfg_0_src = mfgs[0].srcdata[dgl.NID]
mfg_0_dst = mfgs[0].dstdata[dgl.NID]
print(mfg_0_src)
print(mfg_0_dst)
print(torch.equal(mfg_0_src[:mfgs[0].num_dst_nodes()], mfg_0_dst))
######################################################################
......@@ -177,14 +177,14 @@ class Model(nn.Module):
self.conv2 = SAGEConv(h_feats, num_classes, aggregator_type='mean')
self.h_feats = h_feats
def forward(self, bipartites, x):
def forward(self, mfgs, x):
# Lines that are changed are marked with an arrow: "<---"
h_dst = x[:bipartites[0].num_dst_nodes()] # <---
h = self.conv1(bipartites[0], (x, h_dst)) # <---
h_dst = x[:mfgs[0].num_dst_nodes()] # <---
h = self.conv1(mfgs[0], (x, h_dst)) # <---
h = F.relu(h)
h_dst = h[:bipartites[1].num_dst_nodes()] # <---
h = self.conv2(bipartites[1], (h, h_dst)) # <---
h_dst = h[:mfgs[1].num_dst_nodes()] # <---
h = self.conv2(mfgs[1], (h, h_dst)) # <---
return h
model = Model(num_features, 128, num_classes).to(device)
......@@ -195,44 +195,44 @@ model = Model(num_features, 128, num_classes).to(device)
# :doc:`introduction <../blitz/1_introduction>`, you will notice several
# differences:
#
# - **DGL GNN layers on bipartite graphs**. Instead of computing on the
# - **DGL GNN layers on MFGs**. Instead of computing on the
# full graph:
#
# .. code:: python
#
# h = self.conv1(g, x)
#
# you only compute on the sampled bipartite graph:
# you only compute on the sampled MFG:
#
# .. code:: python
#
# h = self.conv1(bipartites[0], (x, h_dst))
# h = self.conv1(mfgs[0], (x, h_dst))
#
# All DGL’s GNN modules support message passing on bipartite graphs,
# where you supply a pair of features, one for input nodes and another
# for output nodes.
# All DGL’s GNN modules support message passing on MFGs,
# where you supply a pair of features, one for source nodes and another
# for destination nodes.
#
# - **Feature slicing for self-dependency**. There are statements that
# perform slicing to obtain the previous-layer representation of the
# output nodes:
# destination nodes:
#
# .. code:: python
#
# h_dst = x[:bipartites[0].num_dst_nodes()]
# h_dst = x[:mfgs[0].num_dst_nodes()]
#
# ``num_dst_nodes`` method works with bipartite graphs, where it will
# return the number of output nodes.
# ``num_dst_nodes`` method works with MFGs, where it will
# return the number of destination nodes.
#
# Since the first few input nodes of the yielded bipartite graph are
# always the same as the output nodes, these statements obtain the
# representations of the output nodes on the previous layer. They are
# Since the first few source nodes of the yielded MFG are
# always the same as the destination nodes, these statements obtain the
# representations of the destination nodes on the previous layer. They are
# then combined with neighbor aggregation in ``dgl.nn.SAGEConv`` layer.
#
# .. note::
#
# See the :doc:`custom message passing
# tutorial <L4_message_passing>` for more details on how to
# manipulate bipartite graphs produced in this way, such as the usage
# manipulate MFGs produced in this way, such as the usage
# of ``num_dst_nodes``.
#
......@@ -277,12 +277,12 @@ for epoch in range(10):
model.train()
with tqdm.tqdm(train_dataloader) as tq:
for step, (input_nodes, output_nodes, bipartites) in enumerate(tq):
for step, (input_nodes, output_nodes, mfgs) in enumerate(tq):
# feature copy from CPU to GPU takes place here
inputs = bipartites[0].srcdata['feat']
labels = bipartites[-1].dstdata['label']
inputs = mfgs[0].srcdata['feat']
labels = mfgs[-1].dstdata['label']
predictions = model(bipartites, inputs)
predictions = model(mfgs, inputs)
loss = F.cross_entropy(predictions, labels)
opt.zero_grad()
......@@ -298,10 +298,10 @@ for epoch in range(10):
predictions = []
labels = []
with tqdm.tqdm(valid_dataloader) as tq, torch.no_grad():
for input_nodes, output_nodes, bipartites in tq:
inputs = bipartites[0].srcdata['feat']
labels.append(bipartites[-1].dstdata['label'].cpu().numpy())
predictions.append(model(bipartites, inputs).argmax(1).cpu().numpy())
for input_nodes, output_nodes, mfgs in tq:
inputs = mfgs[0].srcdata['feat']
labels.append(mfgs[-1].dstdata['label'].cpu().numpy())
predictions.append(model(mfgs, inputs).argmax(1).cpu().numpy())
predictions = np.concatenate(predictions)
labels = np.concatenate(labels)
accuracy = sklearn.metrics.accuracy_score(labels, predictions)
......
......@@ -117,7 +117,7 @@ train_dataloader = dgl.dataloading.EdgeDataLoader(
torch.arange(graph.number_of_edges()), # The edges to iterate over
sampler, # The neighbor sampler
negative_sampler=negative_sampler, # The negative sampler
device=device, # Put the bipartite graphs on CPU or GPU
device=device, # Put the MFGs on CPU or GPU
# The following arguments are inherited from PyTorch DataLoader.
batch_size=1024, # Batch size
shuffle=True, # Whether to shuffle the nodes for every epoch
......@@ -131,11 +131,11 @@ train_dataloader = dgl.dataloading.EdgeDataLoader(
# will give you.
#
input_nodes, pos_graph, neg_graph, bipartites = next(iter(train_dataloader))
input_nodes, pos_graph, neg_graph, mfgs = next(iter(train_dataloader))
print('Number of input nodes:', len(input_nodes))
print('Positive graph # nodes:', pos_graph.number_of_nodes(), '# edges:', pos_graph.number_of_edges())
print('Negative graph # nodes:', neg_graph.number_of_nodes(), '# edges:', neg_graph.number_of_edges())
print(bipartites)
print(mfgs)
######################################################################
......@@ -152,9 +152,9 @@ print(bipartites)
# necessary for computing the pair-wise scores of positive and negative examples
# in the current minibatch.
#
# The last element is a list of bipartite graphs storing the computation
# dependencies for each GNN layer.
# The bipartite graphs are used to compute the GNN outputs of the nodes
# The last element is a list of :doc:`MFGs <L0_neighbor_sampling_overview>`
# storing the computation dependencies for each GNN layer.
# The MFGs are used to compute the GNN outputs of the nodes
# involved in positive/negative graph.
#
......@@ -180,12 +180,12 @@ class Model(nn.Module):
self.conv2 = SAGEConv(h_feats, h_feats, aggregator_type='mean')
self.h_feats = h_feats
def forward(self, bipartites, x):
h_dst = x[:bipartites[0].num_dst_nodes()]
h = self.conv1(bipartites[0], (x, h_dst))
def forward(self, mfgs, x):
h_dst = x[:mfgs[0].num_dst_nodes()]
h = self.conv1(mfgs[0], (x, h_dst))
h = F.relu(h)
h_dst = h[:bipartites[1].num_dst_nodes()]
h = self.conv2(bipartites[1], (h, h_dst))
h_dst = h[:mfgs[1].num_dst_nodes()]
h = self.conv2(mfgs[1], (h, h_dst))
return h
model = Model(num_features, 128).to(device)
......@@ -256,10 +256,10 @@ def inference(model, graph, node_features):
device=device)
result = []
for input_nodes, output_nodes, bipartites in train_dataloader:
for input_nodes, output_nodes, mfgs in train_dataloader:
# feature copy from CPU to GPU takes place here
inputs = bipartites[0].srcdata['feat']
result.append(model(bipartites, inputs))
inputs = mfgs[0].srcdata['feat']
result.append(model(mfgs, inputs))
return torch.cat(result)
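# A hypothetical usage sketch (not part of the original code): the helper
# above can be called once per evaluation pass to obtain representations for
# every node in the full graph.
with torch.no_grad():
    all_representations = inference(model, graph, graph.ndata['feat'])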
......@@ -324,11 +324,11 @@ best_accuracy = 0
best_model_path = 'model.pt'
for epoch in range(1):
with tqdm.tqdm(train_dataloader) as tq:
for step, (input_nodes, pos_graph, neg_graph, bipartites) in enumerate(tq):
for step, (input_nodes, pos_graph, neg_graph, mfgs) in enumerate(tq):
# feature copy from CPU to GPU takes place here
inputs = bipartites[0].srcdata['feat']
inputs = mfgs[0].srcdata['feat']
outputs = model(bipartites, inputs)
outputs = model(mfgs, inputs)
pos_score = predictor(pos_graph, outputs)
neg_score = predictor(neg_graph, outputs)
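# The loss computation that follows is collapsed in this diff.  One common
# formulation (hypothetical here, not necessarily the one this tutorial uses)
# is binary cross-entropy over the positive and negative pair scores:
score = torch.cat([pos_score, neg_score])
label = torch.cat([torch.ones_like(pos_score), torch.zeros_like(neg_score)])
loss = F.binary_cross_entropy_with_logits(score, label)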
......
......@@ -38,33 +38,34 @@ train_dataloader = dgl.dataloading.NodeDataLoader(
num_workers=0
)
input_nodes, output_nodes, bipartites = next(iter(train_dataloader))
input_nodes, output_nodes, mfgs = next(iter(train_dataloader))
######################################################################
# DGL Bipartite Graph Introduction
# --------------------------------
#
# In the previous tutorials, you have seen the concept *bipartite graph*,
# where nodes are divided into two parts.
# In the previous tutorials, you have seen the concept of a *message flow graph*
# (MFG), where nodes are divided into two parts. It is a kind of (directional)
# bipartite graph.
# This section introduces how you can manipulate (directional) bipartite
# graphs.
#
# You can access the input node features and output node features via
# You can access the source node features and destination node features via
# ``srcdata`` and ``dstdata`` attributes:
#
bipartite = bipartites[0]
print(bipartite.srcdata)
print(bipartite.dstdata)
mfg = mfgs[0]
print(mfg.srcdata)
print(mfg.dstdata)
######################################################################
# It also has ``num_src_nodes`` and ``num_dst_nodes`` functions to query
# how many input nodes and output nodes exist in the bipartite graph:
# how many source nodes and destination nodes exist in the bipartite graph:
#
print(bipartite.num_src_nodes(), bipartite.num_dst_nodes())
print(mfg.num_src_nodes(), mfg.num_dst_nodes())
######################################################################
......@@ -72,18 +73,18 @@ print(bipartite.num_src_nodes(), bipartite.num_dst_nodes())
# will do with ``ndata`` on the graphs you have seen earlier:
#
bipartite.srcdata['x'] = torch.zeros(bipartite.num_src_nodes(), bipartite.num_dst_nodes())
dst_feat = bipartite.dstdata['feat']
mfg.srcdata['x'] = torch.zeros(mfg.num_src_nodes(), mfg.num_dst_nodes())
dst_feat = mfg.dstdata['feat']
######################################################################
# Also, since the bipartite graphs are constructed by DGL, you can
# retrieve the input node IDs (i.e. those that are required to compute the
# output) and output node IDs (i.e. those whose representations the
# retrieve the source node IDs (i.e. those that are required to compute the
# output) and destination node IDs (i.e. those whose representations the
# current GNN layer should compute) as follows.
#
bipartite.srcdata[dgl.NID], bipartite.dstdata[dgl.NID]
mfg.srcdata[dgl.NID], mfg.dstdata[dgl.NID]
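# For example (an illustrative sketch, assuming ``graph`` is the full graph
# loaded earlier in this tutorial), these IDs let you look up data on the
# original graph for the nodes appearing in the MFG:
src_orig_feat = graph.ndata['feat'][mfg.srcdata[dgl.NID].cpu()]   # features of the MFG's source nodes
dst_orig_ids = mfg.dstdata[dgl.NID]                               # original IDs of the destination nodes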
######################################################################
......@@ -93,30 +94,30 @@ bipartite.srcdata[dgl.NID], bipartite.dstdata[dgl.NID]
######################################################################
# Recall that the bipartite graphs yielded by the ``NodeDataLoader`` and
# ``EdgeDataLoader`` have the property that the first few input nodes are
# always identical to the output nodes:
# Recall that the MFGs yielded by the ``NodeDataLoader`` and
# ``EdgeDataLoader`` have the property that the first few source nodes are
# always identical to the destination nodes:
#
# |image1|
#
# .. |image1| image:: https://data.dgl.ai/tutorial/img/bipartite.gif
#
print(torch.equal(bipartite.srcdata[dgl.NID][:bipartite.num_dst_nodes()], bipartite.dstdata[dgl.NID]))
print(torch.equal(mfg.srcdata[dgl.NID][:mfg.num_dst_nodes()], mfg.dstdata[dgl.NID]))
######################################################################
# Suppose you have obtained the input node representations
# Suppose you have obtained the source node representations
# :math:`h_u^{(l-1)}`:
#
bipartite.srcdata['h'] = torch.randn(bipartite.num_src_nodes(), 10)
mfg.srcdata['h'] = torch.randn(mfg.num_src_nodes(), 10)
######################################################################
# Recall that DGL provides the `update_all` interface for expressing how
# to compute messages and how to aggregate them on the nodes that receive
# them. This concept naturally applies to bipartite graphs -- message
# them. This concept naturally applies to bipartite graphs like MFGs -- message
# computation happens on the edges between source and destination nodes,
# and message aggregation happens on the destination nodes.
#
......@@ -129,8 +130,8 @@ bipartite.srcdata['h'] = torch.randn(bipartite.num_src_nodes(), 10)
import dgl.function as fn
bipartite.update_all(message_func=fn.copy_u('h', 'm'), reduce_func=fn.mean('m', 'h'))
m_v = bipartite.dstdata['h']
mfg.update_all(message_func=fn.copy_u('h', 'm'), reduce_func=fn.mean('m', 'h'))
m_v = mfg.dstdata['h']
m_v
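# A hedged sketch (``W`` is illustrative and not defined in this tutorial
# excerpt) of how the aggregated messages could then be combined with the
# destination-node features, in the spirit of GraphSAGE:
import torch.nn as nn
W = nn.Linear(20, 10).to(mfg.device)             # 10-dim h_dst concatenated with 10-dim m_v
h_dst = mfg.srcdata['h'][:mfg.num_dst_nodes()]   # destination features are the first source rows
h_new = W(torch.cat([h_dst, m_v], dim=1))        # concatenate and project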
......@@ -165,9 +166,9 @@ class SAGEConv(nn.Module):
Parameters
----------
g : Graph
The input bipartite graph.
The input MFG.
h : (Tensor, Tensor)
The feature of input nodes and output nodes as a pair of Tensors.
The features of the source nodes and destination nodes as a pair of Tensors.
"""
with g.local_scope():
h_src, h_dst = h
......@@ -185,12 +186,12 @@ class Model(nn.Module):
self.conv1 = SAGEConv(in_feats, h_feats)
self.conv2 = SAGEConv(h_feats, num_classes)
def forward(self, bipartites, x):
h_dst = x[:bipartites[0].num_dst_nodes()]
h = self.conv1(bipartites[0], (x, h_dst))
def forward(self, mfgs, x):
h_dst = x[:mfgs[0].num_dst_nodes()]
h = self.conv1(mfgs[0], (x, h_dst))
h = F.relu(h)
h_dst = h[:bipartites[1].num_dst_nodes()]
h = self.conv2(bipartites[1], (h, h_dst))
h_dst = h[:mfgs[1].num_dst_nodes()]
h = self.conv2(mfgs[1], (h, h_dst))
return h
sampler = dgl.dataloading.MultiLayerNeighborSampler([4, 4])
......@@ -205,15 +206,15 @@ train_dataloader = dgl.dataloading.NodeDataLoader(
model = Model(graph.ndata['feat'].shape[1], 128, dataset.num_classes).to(device)
with tqdm.tqdm(train_dataloader) as tq:
for step, (input_nodes, output_nodes, bipartites) in enumerate(tq):
inputs = bipartites[0].srcdata['feat']
labels = bipartites[-1].dstdata['label']
predictions = model(bipartites, inputs)
for step, (input_nodes, output_nodes, mfgs) in enumerate(tq):
inputs = mfgs[0].srcdata['feat']
labels = mfgs[-1].dstdata['label']
predictions = model(mfgs, inputs)
######################################################################
# Both ``update_all`` and the functions in the ``nn.functional`` namespace
# support bipartite graphs, so you can migrate the code working for small
# support MFGs, so you can migrate the code working for small
# graphs to large-graph training with the minimal changes introduced above.
#
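# As one illustration (a hypothetical sketch, not part of the original
# tutorial), ``edge_softmax`` from ``dgl.nn.functional`` accepts an MFG just
# like an ordinary graph, normalizing one logit per edge over each
# destination node's incoming edges:

from dgl.nn.functional import edge_softmax

edge_logits = torch.randn(mfg.num_edges(), 1, device=mfg.device)   # one attention logit per MFG edge
attention = edge_softmax(mfg, edge_logits)                         # softmax over each destination node's in-edges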
......