Commit 99751d49 authored by Quan (Andy) Gan, committed by GitHub

[Doc] Rename block to message flow graph (#2702)

* rename block to mfg

* revert

* rename
parent 491d908b
@@ -39,6 +39,19 @@ the ``sample_blocks`` methods.
.. autoclass:: MultiLayerFullNeighborSampler
   :show-inheritance:
.. _api-dataloading-collators:
Collators
---------
.. currentmodule:: dgl.dataloading
Collators are platform-agnostic classes that generate the mini-batches
given the graphs and the indices to sample from.
.. autoclass:: NodeCollator
.. autoclass:: EdgeCollator
.. autoclass:: GraphCollator
.. _api-dataloading-negative-sampling:
Negative Samplers for Link Prediction
......
@@ -148,40 +148,47 @@ Since the number of nodes
for input and output is different, we need to perform message passing on
a small, bipartite-structured graph instead. We call such a
bipartite-structured graph, which contains only the necessary input nodes
(referred to as *source* nodes) and output nodes (referred to as *destination*
nodes), a *message flow graph* (MFG). The following figure shows the MFG of
the second GNN layer for node 8.
.. figure:: https://data.dgl.ai/asset/image/guide_6_4_4.png
   :alt: Imgur

.. note::

   See the :doc:`Stochastic Training Tutorial
   <tutorials/large/L0_neighbor_sampling_overview>` for the concept of
   message flow graphs.
Note that the destination nodes also appear in the source nodes. The reason is
that representations of the destination nodes from the previous layer are needed
for feature combination after message passing (i.e. :math:`\phi^{(2)}`).

DGL provides :func:`dgl.to_block` to convert any frontier to an MFG, where the
first argument specifies the frontier and the second argument specifies the
destination nodes. For instance, the frontier above can be converted to an MFG
with destination node 8 with the following code.
.. code:: python

   dst_nodes = torch.LongTensor([8])
   block = dgl.to_block(frontier, dst_nodes)
To find the number of source nodes and destination nodes of a given node type,
one can use :meth:`dgl.DGLHeteroGraph.number_of_src_nodes` and
:meth:`dgl.DGLHeteroGraph.number_of_dst_nodes` methods.

.. code:: python

   num_src_nodes, num_dst_nodes = block.number_of_src_nodes(), block.number_of_dst_nodes()
   print(num_src_nodes, num_dst_nodes)
The MFG's source node features can be accessed via
:attr:`dgl.DGLHeteroGraph.srcdata` and :attr:`dgl.DGLHeteroGraph.srcnodes`, and
its destination node features can be accessed via
:attr:`dgl.DGLHeteroGraph.dstdata` and :attr:`dgl.DGLHeteroGraph.dstnodes`. The
syntax of ``srcdata``/``dstdata`` and ``srcnodes``/``dstnodes`` is
identical to :attr:`dgl.DGLHeteroGraph.ndata` and
@@ -189,46 +196,36 @@ identical to :attr:`dgl.DGLHeteroGraph.ndata` and
.. code:: python

   block.srcdata['h'] = torch.randn(num_src_nodes, 5)
   block.dstdata['h'] = torch.randn(num_dst_nodes, 5)
If an MFG is converted from a frontier, which is in turn converted from
a graph, one can directly read the features of the MFG's source and
destination nodes via

.. code:: python

   print(block.srcdata['x'])
   print(block.dstdata['y'])
.. note::

   The original node IDs of the source nodes and destination nodes in the MFG
   can be found as the feature ``dgl.NID``, and the mapping from the
   MFG's edge IDs to the input frontier's edge IDs can be found as the
   feature ``dgl.EID``.
DGL ensures that the destination nodes of an MFG always appear in the
source nodes, and that they are always indexed first among the source nodes.

.. code:: python

   src_nodes = block.srcdata[dgl.NID]
   dst_nodes = block.dstdata[dgl.NID]
   assert torch.equal(src_nodes[:len(dst_nodes)], dst_nodes)
As a result, the destination nodes must cover all nodes that are the
destination of an edge in the frontier.

For example, consider the following frontier
@@ -240,15 +237,15 @@ For example, consider the following frontier
where the red and green nodes (i.e. nodes 4, 5, 7, 8, and 11) are all
nodes that are the destination of an edge. The following code will then
raise an error because the destination nodes do not cover all those nodes.

.. code:: python

   dgl.to_block(frontier2, torch.LongTensor([4, 5]))  # ERROR
However, the destination nodes can contain more nodes than the ones above.
In this case, the MFG will have isolated nodes that do not have any edges
connected to them. The isolated nodes are included in both the source nodes
and the destination nodes.

.. code:: python
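   # Illustrative sketch (the extra node ID below is an assumption, not from
   # the original text): besides the required destination nodes 4, 5, 7, 8 and
   # 11, we also pass node 3, which is not the destination of any edge in
   # ``frontier2``.  It simply becomes an isolated node on both sides of the MFG.
   block3 = dgl.to_block(frontier2, torch.LongTensor([4, 5, 7, 8, 11, 3]))
   print(block3.number_of_dst_nodes())  # counts the isolated node as well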
@@ -261,7 +258,7 @@ nodes.
Heterogeneous Graphs
^^^^^^^^^^^^^^^^^^^^

MFGs also work on heterogeneous graphs. Let's say that we have the
following frontier:
.. code:: python
@@ -272,20 +269,20 @@ following frontier:
        ('game', 'played-by', 'user'): ([2], [6])
    }, num_nodes_dict={'user': 10, 'game': 10})
One can also create an MFG with destination nodes User #3, #6, and #8, as
well as Game #2 and #6.

.. code:: python

   hetero_block = dgl.to_block(hetero_frontier, {'user': [3, 6, 8], 'game': [2, 6]})
One can also get the source nodes and destination nodes by type:

.. code:: python

   # source users and games
   print(hetero_block.srcnodes['user'].data[dgl.NID], hetero_block.srcnodes['game'].data[dgl.NID])
   # destination users and games
   print(hetero_block.dstnodes['user'].data[dgl.NID], hetero_block.dstnodes['game'].data[dgl.NID])
@@ -307,10 +304,10 @@ see what :class:`~dgl.dataloading.dataloader.BlockSampler`, the parent class of
:class:`~dgl.dataloading.neighbor.MultiLayerFullNeighborSampler`, is.

:class:`~dgl.dataloading.dataloader.BlockSampler` is responsible for
generating the list of MFGs starting from the last layer, with method
:meth:`~dgl.dataloading.dataloader.BlockSampler.sample_blocks`. The default implementation of
``sample_blocks`` is to iterate backwards, generating the frontiers and
converting them to MFGs.
Therefore, for neighborhood sampling, **you only need to implement
the**\ :meth:`~dgl.dataloading.dataloader.BlockSampler.sample_frontier`\ **method**. Given which
@@ -386,7 +383,7 @@ nodes with a probability, one can simply define the sampler as follows:

        return self.n_layers
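For orientation, here is a minimal sketch of a custom sampler written against the
same ``BlockSampler`` interface (the constructor plus ``sample_frontier``); it keeps
every inbound edge, mirroring what ``MultiLayerFullNeighborSampler`` does. The class
name is illustrative.

.. code:: python

   import dgl

   class FullFrontierSampler(dgl.dataloading.BlockSampler):
       """Sketch: every layer's frontier is the in-subgraph of the seed nodes."""
       def __init__(self, n_layers):
           super().__init__(n_layers, return_eids=False)

       def sample_frontier(self, block_id, g, seed_nodes):
           # The frontier contains all inbound edges of the seed nodes,
           # i.e. full neighborhood sampling for this layer.
           return dgl.in_subgraph(g, seed_nodes)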
After implementing your sampler, you can create a data loader that takes
in your sampler and keeps generating lists of MFGs while
iterating over the seed nodes as usual.

.. code:: python
......
@@ -22,11 +22,11 @@ To use the neighborhood sampler provided by DGL for edge classification,
one needs to instead combine it with
:class:`~dgl.dataloading.pytorch.EdgeDataLoader`, which iterates
over a set of edges in minibatches, yielding the subgraph induced by the
edge minibatch and *message flow graphs* (MFGs) to be consumed by the module below.

For example, the following code creates a PyTorch DataLoader that
iterates over the training edge ID array ``train_eids`` in batches,
putting the list of generated MFGs onto GPU.
.. code:: python
@@ -37,12 +37,18 @@ putting the list of generated blocks onto GPU.
        drop_last=False,
        num_workers=4)
.. note::

   See the :doc:`Stochastic Training Tutorial
   <tutorials/large/L0_neighbor_sampling_overview>` for the concept of
   message flow graphs.

For a complete list of supported builtin samplers, please refer to the
:ref:`neighborhood sampler API reference <api-dataloading-neighbor-sampling>`.

If you wish to develop your own neighborhood sampler or you want a more
detailed explanation of the concept of MFGs, please refer to
:ref:`guide-minibatch-customizing-neighborhood-sampler`.
Removing edges in the minibatch from the original graph for neighbor sampling
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -92,7 +98,7 @@ The edge classification model usually consists of two parts:
The former part is exactly the same as
:ref:`that from node classification <guide-minibatch-node-classification-model>`
and we can simply reuse it. The input is still the list of
MFGs generated from a data loader provided by DGL, as well as the
input features.

.. code:: python
@@ -135,7 +141,7 @@ layer.
        edge_subgraph.apply_edges(self.apply_edges)
        return edge_subgraph.edata['score']

The entire model will take the list of MFGs and the edge subgraph
generated by the data loader, as well as the input node features as
follows:
@@ -153,14 +159,14 @@ follows:
        return self.predictor(edge_subgraph, x)

DGL ensures that the nodes in the edge subgraph are the same as the
output nodes of the last MFG in the generated list of MFGs.
Training Loop
~~~~~~~~~~~~~

The training loop is very similar to node classification. You can
iterate over the dataloader and get a subgraph induced by the edges in
the minibatch, as well as the list of MFGs necessary for computing
their incident node representations.

.. code:: python
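   # Illustrative sketch only: ``model``, ``opt``, ``compute_loss`` and the
   # feature/label keys ('feat', 'label') are assumptions, not part of the
   # original text; the model's argument order is assumed to match the
   # forward signature sketched above.
   for input_nodes, edge_subgraph, blocks in dataloader:
       blocks = [b.to(torch.device('cuda')) for b in blocks]
       edge_subgraph = edge_subgraph.to(torch.device('cuda'))
       input_features = blocks[0].srcdata['feat']
       edge_labels = edge_subgraph.edata['label']
       edge_predictions = model(edge_subgraph, blocks, input_features)
       loss = compute_loss(edge_labels, edge_predictions)
       opt.zero_grad()
       loss.backward()
       opt.step()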
......
@@ -109,9 +109,11 @@ When a negative sampler is provided, DGL’s data loader will generate
three items per minibatch:

- A positive graph containing all the edges sampled in the minibatch.
- A negative graph containing all the non-existent edges generated by
  the negative sampler.
- A list of *message flow graphs* (MFGs) generated by the neighborhood sampler.
One can then define the link prediction model as follows; it takes in the
three items as well as the input features.
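A hedged sketch of such a model is below; the submodule names (``self.gcn``,
``self.predictor``) are illustrative and assumed to behave like the
node-representation and score-prediction modules described earlier.

.. code:: python

   import torch.nn as nn

   class LinkPredictionModel(nn.Module):
       def __init__(self, gcn, predictor):
           super().__init__()
           self.gcn = gcn              # computes node representations from MFGs
           self.predictor = predictor  # scores node pairs on a given graph

       def forward(self, positive_graph, negative_graph, blocks, x):
           h = self.gcn(blocks, x)
           # Score the sampled (positive) and non-existent (negative) edges.
           pos_score = self.predictor(positive_graph, h)
           neg_score = self.predictor(negative_graph, h)
           return pos_score, neg_score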
......
@@ -5,10 +5,16 @@
:ref:`(中文版) <guide_cn-minibatch-custom-gnn-module>`

.. note::

   :doc:`This tutorial <tutorials/large/L4_message_passing>` has similar
   content to this section for the homogeneous graph case.
If you are familiar with how to write a custom GNN module for updating
the entire graph for homogeneous or heterogeneous graphs (see
:ref:`guide-nn`), the code for computing on
MFGs is similar, with the exception that the nodes are divided into
input nodes and output nodes.

For example, consider the following custom graph convolution module
@@ -30,7 +36,7 @@ like.

        return self.W(torch.cat([g.ndata['h'], g.ndata['h_neigh']], 1))
If you have a custom message passing NN module for the full graph and
you would like to make it work for MFGs, you only need to rewrite the
forward function as follows. The corresponding statements from
the full-graph implementation are kept as comments so that you can compare
the original statements with the new ones.
@@ -62,7 +68,7 @@ original statements with the new statements.
            [block.dstdata['h'], block.dstdata['h_neigh']], 1))
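For context, here is a sketch of what the complete rewritten ``forward`` could
look like. It assumes the module uses the usual ``copy_u``/``mean`` message and
reduce functions of the full-graph version above; treat it as an illustration,
not the exact code from the guide.

.. code:: python

   import torch
   import dgl.function as fn

   def forward(self, block, h):
       with block.local_scope():
           h_src = h
           # Destination nodes are guaranteed to come first among the source
           # nodes, so their input features are the first few rows of ``h``.
           h_dst = h[:block.number_of_dst_nodes()]
           block.srcdata['h'] = h_src
           block.dstdata['h'] = h_dst
           # g.update_all(...) in the full-graph version
           block.update_all(fn.copy_u('h', 'm'), fn.mean('m', 'h_neigh'))
           # return self.W(torch.cat([g.ndata['h'], g.ndata['h_neigh']], 1))
           return self.W(torch.cat(
               [block.dstdata['h'], block.dstdata['h_neigh']], 1))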
In general, you need to do the following to make your NN module work for
MFGs.

- Obtain the features for output nodes from the input features by
  slicing the first few rows. The number of rows can be obtained by
@@ -149,22 +155,22 @@ serve for input or output.
        return {ntype: g.dstnodes[ntype].data['h_dst']
                for ntype in g.ntypes}
Writing modules that work on homogeneous graphs, bipartite graphs, and MFGs
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

All message passing modules in DGL work on homogeneous graphs,
unidirectional bipartite graphs (that have two node types and one edge
type), and MFGs with one edge type. Essentially, the input graph and
feature of a builtin DGL neural network module must satisfy one of
the following cases.
- If the input feature is a pair of tensors, then the input graph must
  be unidirectional bipartite.
- If the input feature is a single tensor and the input graph is an
  MFG, DGL will automatically set the feature on the output nodes as
  the first few rows of the input node features.
- If the input feature is a single tensor and the input graph is
  not an MFG, then the input graph must be homogeneous.
For example, the following is simplified from the PyTorch implementation
of :class:`dgl.nn.pytorch.SAGEConv` (also available in MXNet and Tensorflow)
@@ -194,6 +200,6 @@ of :class:`dgl.nn.pytorch.SAGEConv` (also available in MXNet and Tensorflow)
            self.W(torch.cat([g.dstdata['h'], g.dstdata['h_neigh']], 1)))

:ref:`guide-nn` also provides a walkthrough on :class:`dgl.nn.pytorch.SAGEConv`,
which works on unidirectional bipartite graphs, homogeneous graphs, and MFGs.
@@ -31,7 +31,7 @@ over a set of nodes in minibatches.
For example, the following code creates a PyTorch DataLoader that
iterates over the training node ID array ``train_nids`` in batches,
putting the list of generated MFGs onto GPU.

.. code:: python
@@ -51,7 +51,7 @@ putting the list of generated blocks onto GPU.
Iterating over the DataLoader will yield a list of specially created
graphs representing the computation dependencies on each layer. They are
called *message flow graphs* (MFGs) in DGL.

.. code:: python
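   # Illustrative sketch: the loader yields the input nodes, the output
   # (seed) nodes, and one MFG per GNN layer, ordered from the input layer
   # to the output layer.
   input_nodes, output_nodes, blocks = next(iter(dataloader))
   print(blocks)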
@@ -65,12 +65,19 @@ be computed as output, which node representations are needed as input,
and how representations from the input nodes propagate to the output
nodes.
.. note::

   See the :doc:`Stochastic Training Tutorial
   <tutorials/large/L0_neighbor_sampling_overview>` for the concept of
   message flow graphs.

For a complete list of supported builtin samplers, please refer to the
:ref:`neighborhood sampler API reference <api-dataloading-neighbor-sampling>`.

If you wish to develop your own neighborhood sampler or you want a more
detailed explanation of the concept of MFGs, please refer to
:ref:`guide-minibatch-customizing-neighborhood-sampler`.
.. _guide-minibatch-node-classification-model:
@@ -114,7 +121,7 @@ The DGL ``GraphConv`` modules above accepts an element in ``blocks``
generated by the data loader as an argument.

:ref:`The API reference of each NN module <apinn>` will tell you
whether it supports accepting an MFG as an argument.
If you wish to use your own message passing module, please refer to
:ref:`guide-minibatch-custom-gnn-module`.
@@ -124,7 +131,7 @@ Training Loop
The training loop simply consists of iterating over the dataset with the
customized batching iterator. During each iteration that yields a list
of MFGs, we do the following (a sketch of the resulting loop follows the list):

1. Load the node features corresponding to the input nodes onto GPU. The
   node features can be stored in either memory or external storage.
@@ -133,10 +140,10 @@ of blocks, we:
   If the features are stored in ``g.ndata``, then they can be loaded
   by accessing ``blocks[0].srcdata``, the features of the source nodes
   of the first MFG, which include all the nodes needed for computing
   the final representations.
2. Feed the list of MFGs and the input node features to the multilayer
   GNN and get the outputs.

3. Load the node labels corresponding to the output nodes onto GPU.
@@ -147,7 +154,7 @@ of blocks, we:
   If the labels are stored in ``g.ndata``, then they can be loaded
   by accessing ``blocks[-1].dstdata``, the features of the destination
   nodes of the last MFG, which are identical to the nodes whose final
   representations we wish to compute.
4. Compute the loss and backpropagate.
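Putting the four steps together, a hedged sketch of one iteration is shown
below; the ``model``, ``opt``, and ``compute_loss`` objects and the feature
and label keys (``'feat'``, ``'label'``) are illustrative assumptions.

.. code:: python

   import torch

   for input_nodes, output_nodes, blocks in dataloader:
       blocks = [b.to(torch.device('cuda')) for b in blocks]
       input_features = blocks[0].srcdata['feat']                # step 1
       output_labels = blocks[-1].dstdata['label']               # step 3
       output_predictions = model(blocks, input_features)        # step 2
       loss = compute_loss(output_labels, output_predictions)    # step 4
       opt.zero_grad()
       loss.backward()
       opt.step()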
......
@@ -166,7 +166,7 @@ def batch(graphs, ndata=ALL, edata=ALL, *,
        raise DGLError('Invalid argument edata: must be a string list but got {}.'.format(
            type(edata)))
    if any(g.is_block for g in graphs):
        raise DGLError("Batching an MFG is not supported.")
    relations = list(sorted(graphs[0].canonical_etypes))
    relation_ids = [graphs[0].get_etype_id(r) for r in relations]
......
@@ -358,14 +358,14 @@ def heterograph(data_dict,
    return retg.to(device)

def create_block(data_dict, num_src_nodes=None, num_dst_nodes=None, idtype=None, device=None):
    """Create a message flow graph (MFG) as a :class:`DGLBlock` object.
    Parameters
    ----------
    data_dict : graph data
        The dictionary data for constructing an MFG. The keys are in the form of
        string triplets (src_type, edge_type, dst_type), specifying the source node type,
        edge type, and destination node type. The values are graph data in the form of
        :math:`(U, V)`, where :math:`(U[i], V[i])` forms the edge with ID :math:`i`.
        The allowed graph data formats are:
@@ -376,35 +376,35 @@ def create_block(data_dict, num_src_nodes=None, num_dst_nodes=None, idtype=None,
        - ``(iterable[int], iterable[int])``: Similar to the tuple of node-tensors
          format, but stores node IDs in two sequences (e.g. list, tuple, numpy.ndarray).
        If you would like to create an MFG with a single source node type, a single destination
        node type, and a single edge type, then you can pass in the graph data directly
        without wrapping it as a dictionary.
    num_src_nodes : dict[str, int] or int, optional
        The number of nodes for each source node type, which is a dictionary mapping a node type
        :math:`T` to the number of :math:`T`-typed source nodes.

        If not given for a node type :math:`T`, DGL finds the largest ID appearing in *every*
        graph data whose source node type is :math:`T`, and sets the number of nodes to
        be that ID plus one. If given and the value is no greater than the largest ID for some
        source node type, DGL will raise an error. By default, DGL infers the number of nodes for
        all source node types.

        If you would like to create an MFG with a single source node type, a single destination
        node type, and a single edge type, then you can pass in an integer to directly
        represent the number of source nodes.
    num_dst_nodes : dict[str, int] or int, optional
        The number of nodes for each destination node type, which is a dictionary mapping a node
        type :math:`T` to the number of :math:`T`-typed destination nodes.

        If not given for a node type :math:`T`, DGL finds the largest ID appearing in *every*
        graph data whose destination node type is :math:`T`, and sets the number of nodes to
        be that ID plus one. If given and the value is no greater than the largest ID for some
        destination node type, DGL will raise an error. By default, DGL infers the number of nodes
        for all destination node types.

        If you would like to create an MFG with a single source node type, a single
        destination node type, and a single edge type, then you can pass in an integer to directly
        represent the number of destination nodes.
    idtype : int32 or int64, optional
        The data type for storing the structure-related graph information such as node and
        edge IDs. It should be a framework-specific data type object (e.g., ``torch.int32``).
@@ -419,7 +419,7 @@ def create_block(data_dict, num_src_nodes=None, num_dst_nodes=None, idtype=None,
    Returns
    -------
    DGLBlock
        The created MFG.
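    A minimal usage sketch (the edge list and node counts here are made up for
    illustration):

    >>> block = dgl.create_block(([0, 1, 2], [1, 2, 3]), num_src_nodes=4, num_dst_nodes=5)
    >>> block.number_of_src_nodes(), block.number_of_dst_nodes()
    (4, 5)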
    Notes
    -----
@@ -501,12 +501,12 @@ def create_block(data_dict, num_src_nodes=None, num_dst_nodes=None, idtype=None,
            num_dst_nodes[dty] = max(num_dst_nodes[dty], vrange)
        else:  # sanity check
            if num_src_nodes[sty] < urange:
                raise DGLError('The given number of nodes of source node type {} must be larger'
                               ' than the max ID in the data, but got {} and {}.'.format(
                                   sty, num_src_nodes[sty], urange - 1))
            if num_dst_nodes[dty] < vrange:
                raise DGLError('The given number of nodes of destination node type {} must be'
                               ' larger than the max ID in the data, but got {} and {}.'.format(
                                   dty, num_dst_nodes[dty], vrange - 1))

    # Create the graph
@@ -546,17 +546,17 @@ def create_block(data_dict, num_src_nodes=None, num_dst_nodes=None, idtype=None,
    return retg.to(device)
def block_to_graph(block):
    """Convert a message flow graph (MFG), represented as a :class:`DGLBlock` object,
    to a :class:`DGLGraph`.

    DGL will rename all the source node types by suffixing with ``_src``, and
    all the destination node types by suffixing with ``_dst``.
    Features on the returned graph will be preserved.
    Parameters
    ----------
    block : DGLBlock
        The MFG.

    Returns
    -------
......
@@ -15,7 +15,7 @@ from ..distributed.dist_graph import DistGraph
# pylint: disable=unused-argument
def assign_block_eids(block, frontier):
    """Assigns edge IDs from the original graph to the message flow graph (MFG).

    See also
    --------
@@ -117,8 +117,8 @@ class BlockSampler(object):
    """Abstract class specifying the neighborhood sampling strategy for DGL data loaders.

    The main method for BlockSampler is :meth:`sample_blocks`,
    which generates a list of message flow graphs (MFGs) for a multi-layer GNN given a set of
    seed nodes to have their outputs computed.

    The default implementation of :meth:`sample_blocks` is
    to repeat :attr:`num_layers` times the following procedure from the last layer to the first
@@ -133,13 +133,13 @@ class BlockSampler(object):
      reverse edges. This is controlled by the argument :attr:`exclude_eids` in
      :meth:`sample_blocks` method.

    * Convert the frontier into an MFG.

    * Optionally assign the IDs of the edges in the original graph selected in the first step
      to the MFG, controlled by the argument ``return_eids`` in
      :meth:`sample_blocks` method.

    * Prepend the MFG to the MFG list to be returned.

    All subclasses should override the :meth:`sample_frontier`
    method while specifying the number of layers to sample in the :attr:`num_layers` argument.
@@ -149,19 +149,21 @@ class BlockSampler(object):
    num_layers : int
        The number of layers to sample.
    return_eids : bool, default False
        Whether to return the edge IDs involved in message passing in the MFG.
        If True, the edge IDs will be stored as an edge feature named ``dgl.EID``.

    Notes
    -----
    For the concept of frontiers and MFGs, please refer to
    :ref:`User Guide Section 6 <guide-minibatch>` and
    :doc:`Minibatch Training Tutorials <tutorials/large/L0_neighbor_sampling_overview>`.
""" """
def __init__(self, num_layers, return_eids): def __init__(self, num_layers, return_eids):
self.num_layers = num_layers self.num_layers = num_layers
self.return_eids = return_eids self.return_eids = return_eids
def sample_frontier(self, block_id, g, seed_nodes): def sample_frontier(self, block_id, g, seed_nodes):
"""Generate the frontier given the output nodes. """Generate the frontier given the destination nodes.
The subclasses should override this function. The subclasses should override this function.
@@ -172,7 +174,7 @@ class BlockSampler(object):
        g : DGLGraph
            The original graph.
        seed_nodes : Tensor or dict[ntype, Tensor]
            The destination nodes by node type.

            If the graph only has one node type, one can just specify a single tensor
            of node IDs.
@@ -184,19 +186,21 @@ class BlockSampler(object):
        Notes
        -----
        For the concept of frontiers and MFGs, please refer to
        :ref:`User Guide Section 6 <guide-minibatch>` and
        :doc:`Minibatch Training Tutorials <tutorials/large/L0_neighbor_sampling_overview>`.
        """
        raise NotImplementedError
    def sample_blocks(self, g, seed_nodes, exclude_eids=None):
        """Generate a list of MFGs given the destination nodes.
        Parameters
        ----------
        g : DGLGraph
            The original graph.
        seed_nodes : Tensor or dict[ntype, Tensor]
            The destination nodes by node type.

            If the graph only has one node type, one can just specify a single tensor
            of node IDs.
@@ -206,11 +210,13 @@ class BlockSampler(object):
        Returns
        -------
        list[DGLGraph]
            The MFGs generated for computing the multi-layer GNN output.
        Notes
        -----
        For the concept of frontiers and MFGs, please refer to
        :ref:`User Guide Section 6 <guide-minibatch>` and
        :doc:`Minibatch Training Tutorials <tutorials/large/L0_neighbor_sampling_overview>`.
        """
        blocks = []
        exclude_eids = (
@@ -259,11 +265,13 @@ class Collator(ABC):
    Provides a :attr:`dataset` object containing the collection of all nodes or edges,
    as well as a :attr:`collate` method that combines a set of items from
    :attr:`dataset` and obtains the message flow graphs (MFGs).

    Notes
    -----
    For the concept of MFGs, please refer to
    :ref:`User Guide Section 6 <guide-minibatch>` and
    :doc:`Minibatch Training Tutorials <tutorials/large/L0_neighbor_sampling_overview>`.
    """
    @abstractproperty
    def dataset(self):
@@ -272,7 +280,7 @@ class Collator(ABC):
    @abstractmethod
    def collate(self, items):
        """Combines the items from the dataset object and obtains the list of MFGs.

        Parameters
        ----------
@@ -281,7 +289,9 @@ class Collator(ABC):
        Notes
        -----
        For the concept of MFGs, please refer to
        :ref:`User Guide Section 6 <guide-minibatch>` and
        :doc:`Minibatch Training Tutorials <tutorials/large/L0_neighbor_sampling_overview>`.
        """
        raise NotImplementedError
@@ -330,6 +340,12 @@ class NodeCollator(Collator):
    ...     batch_size=1024, shuffle=True, drop_last=False, num_workers=4)
    >>> for input_nodes, output_nodes, blocks in dataloader:
    ...     train_on(input_nodes, output_nodes, blocks)

    Notes
    -----
    For the concept of MFGs, please refer to
    :ref:`User Guide Section 6 <guide-minibatch>` and
    :doc:`Minibatch Training Tutorials <tutorials/large/L0_neighbor_sampling_overview>`.
""" """
def __init__(self, g, nids, block_sampler): def __init__(self, g, nids, block_sampler):
self.g = g self.g = g
...@@ -351,7 +367,7 @@ class NodeCollator(Collator): ...@@ -351,7 +367,7 @@ class NodeCollator(Collator):
return self._dataset return self._dataset
def collate(self, items): def collate(self, items):
"""Find the list of blocks necessary for computing the representation of given """Find the list of MFGs necessary for computing the representation of given
nodes for a node classification/regression task. nodes for a node classification/regression task.
        Parameters
@@ -372,8 +388,8 @@ class NodeCollator(Collator):
            If the original graph has multiple node types, return a dictionary of
            node type names and node ID tensors. Otherwise, return a single tensor.
        MFGs : list[DGLGraph]
            The list of MFGs necessary for computing the representation.
        """
        if isinstance(items[0], tuple):
            # returns a list of pairs: group them by node types into a dict
@@ -404,7 +420,7 @@ class EdgeCollator(Collator):
    * If a negative sampler is given, another graph that contains the "negative edges",
      connecting the source and destination nodes yielded from the given negative sampler.

    * A list of MFGs necessary for computing the representation of the incident nodes
      of the edges in the minibatch.
    Parameters
@@ -552,6 +568,12 @@ class EdgeCollator(Collator):
    ...     batch_size=1024, shuffle=True, drop_last=False, num_workers=4)
    >>> for input_nodes, pos_pair_graph, neg_pair_graph, blocks in dataloader:
    ...     train_on(input_nodes, pos_pair_graph, neg_pair_graph, blocks)

    Notes
    -----
    For the concept of MFGs, please refer to
    :ref:`User Guide Section 6 <guide-minibatch>` and
    :doc:`Minibatch Training Tutorials <tutorials/large/L0_neighbor_sampling_overview>`.
""" """
def __init__(self, g, eids, block_sampler, g_sampling=None, exclude=None, def __init__(self, g, eids, block_sampler, g_sampling=None, exclude=None,
reverse_eids=None, reverse_etypes=None, negative_sampler=None): reverse_eids=None, reverse_etypes=None, negative_sampler=None):
...@@ -690,7 +712,7 @@ class EdgeCollator(Collator): ...@@ -690,7 +712,7 @@ class EdgeCollator(Collator):
Note that the metagraph of this graph will be identical to that of the original Note that the metagraph of this graph will be identical to that of the original
graph. graph.
blocks : list[DGLGraph] blocks : list[DGLGraph]
The list of blocks necessary for computing the representation of the edges. The list of MFGs necessary for computing the representation of the edges.
""" """
if self.negative_sampler is None: if self.negative_sampler is None:
return self._collate(items) return self._collate(items)
......
@@ -25,7 +25,7 @@ class MultiLayerNeighborSampler(BlockSampler):
    replace : bool, default True
        Whether to sample with replacement
    return_eids : bool, default False
        Whether to return the edge IDs involved in message passing in the MFG.
        If True, the edge IDs will be stored as an edge feature named ``dgl.EID``.

    Examples
@@ -50,6 +50,12 @@ class MultiLayerNeighborSampler(BlockSampler):
    ...     {('user', 'follows', 'user'): 5,
    ...      ('user', 'plays', 'game'): 4,
    ...      ('game', 'played-by', 'user'): 3}] * 3)
    Notes
    -----
    For the concept of MFGs, please refer to
    :ref:`User Guide Section 6 <guide-minibatch>` and
    :doc:`Minibatch Training Tutorials <tutorials/large/L0_neighbor_sampling_overview>`.
    """
    def __init__(self, fanouts, replace=False, return_eids=False):
        super().__init__(len(fanouts), return_eids)
@@ -84,7 +90,7 @@ class MultiLayerFullNeighborSampler(MultiLayerNeighborSampler):
    n_layers : int
        The number of GNN layers to sample.
    return_eids : bool, default False
        Whether to return the edge IDs involved in message passing in the MFG.
        If True, the edge IDs will be stored as an edge feature named ``dgl.EID``.

    Examples
@@ -100,6 +106,12 @@ class MultiLayerFullNeighborSampler(MultiLayerNeighborSampler):
    ...     batch_size=1024, shuffle=True, drop_last=False, num_workers=4)
    >>> for blocks in dataloader:
    ...     train_on(blocks)
    Notes
    -----
    For the concept of MFGs, please refer to
    :ref:`User Guide Section 6 <guide-minibatch>` and
    :doc:`Minibatch Training Tutorials <tutorials/large/L0_neighbor_sampling_overview>`.
    """
    def __init__(self, n_layers, return_eids=False):
        super().__init__([None] * n_layers, return_eids=return_eids)
@@ -16,8 +16,8 @@ def _remove_kwargs_dist(kwargs):
# The following code is a fix to the PyTorch-specific issue in
# https://github.com/dmlc/dgl/issues/2137
#
# Basically the sampled MFGs/subgraphs contain the features extracted from the
# parent graph. In DGL, the MFGs/subgraphs will hold a reference to the parent
# graph feature tensor and an index tensor, so that the features could be extracted upon
# request. However, in the context of multiprocessed sampling, we do not need to
# transmit the parent graph feature tensor from the subprocess to the main process,
@@ -26,13 +26,13 @@ def _remove_kwargs_dist(kwargs):
# it with the following trick:
#
# In the collator running in the sampler processes:
# For each frame in the MFG, we check each column and the column with the same name
# in the corresponding parent frame. If the storage of the former column is the
# same object as the latter column, we are sure that the former column is a
# subcolumn of the latter, and set the storage of the former column as None.
#
# In the iterator of the main process:
# For each frame in the MFG, we check each column and the column with the same name
# in the corresponding parent frame. If the storage of the former column is None,
# we replace it with the storage of the latter column.
@@ -118,7 +118,7 @@ def _restore_blocks_storage(blocks, g):
class _NodeCollator(NodeCollator):
    def collate(self, items):
        # input_nodes, output_nodes, blocks
        result = super().collate(items)
        _pop_blocks_storage(result[-1], self.g)
        return result
@@ -173,7 +173,7 @@ class _EdgeDataLoaderIter:
        result_ = next(self.iter_)
        if self.edge_dataloader.collator.negative_sampler is not None:
            # input_nodes, pair_graph, neg_pair_graph, blocks if a negative
            # sampler is given.  Otherwise, input_nodes, pair_graph, blocks.
            _restore_subgraph_storage(result_[2], self.edge_dataloader.collator.g)
        _restore_subgraph_storage(result_[1], self.edge_dataloader.collator.g)
@@ -184,7 +184,7 @@ class _EdgeDataLoaderIter:
class NodeDataLoader:
    """PyTorch dataloader for batch-iterating over a set of nodes, generating the list
    of message flow graphs (MFGs) as computation dependency of the said minibatch.

    Parameters
    ----------
@@ -195,7 +195,7 @@ class NodeDataLoader:
    block_sampler : dgl.dataloading.BlockSampler
        The neighborhood sampler.
    device : device context, optional
        The device of the generated MFGs in each iteration, which should be a
        PyTorch device object (e.g., ``torch.device``).
    kwargs : dict
        Arguments being passed to :py:class:`torch.utils.data.DataLoader`.
...@@ -212,6 +212,12 @@ class NodeDataLoader: ...@@ -212,6 +212,12 @@ class NodeDataLoader:
... batch_size=1024, shuffle=True, drop_last=False, num_workers=4) ... batch_size=1024, shuffle=True, drop_last=False, num_workers=4)
>>> for input_nodes, output_nodes, blocks in dataloader: >>> for input_nodes, output_nodes, blocks in dataloader:
... train_on(input_nodes, output_nodes, blocks) ... train_on(input_nodes, output_nodes, blocks)
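If training runs on a GPU while the graph stays on CPU, the ``device`` argument places the generated MFGs there. A minimal sketch, not part of the original docstring, reusing ``g``, ``train_nid`` and ``sampler`` from the example above:

>>> dataloader = dgl.dataloading.NodeDataLoader(
...     g, train_nid, sampler, device=torch.device('cuda:0'),
...     batch_size=1024, shuffle=True, drop_last=False, num_workers=0)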
Notes
-----
Please refer to
:doc:`Minibatch Training Tutorials <tutorials/large/L0_neighbor_sampling_overview>`
and :ref:`User Guide Section 6 <guide-minibatch>` for usage.
""" """
collator_arglist = inspect.getfullargspec(NodeCollator).args collator_arglist = inspect.getfullargspec(NodeCollator).args
...@@ -261,8 +267,8 @@ class NodeDataLoader: ...@@ -261,8 +267,8 @@ class NodeDataLoader:
class EdgeDataLoader: class EdgeDataLoader:
"""PyTorch dataloader for batch-iterating over a set of edges, generating the list """PyTorch dataloader for batch-iterating over a set of edges, generating the list
of blocks as computation dependency of the said minibatch for edge classification, of message flow graphs (MFGs) as computation dependency of the said minibatch for
edge regression, and link prediction. edge classification, edge regression, and link prediction.
For each iteration, the object will yield For each iteration, the object will yield
...@@ -275,7 +281,7 @@ class EdgeDataLoader: ...@@ -275,7 +281,7 @@ class EdgeDataLoader:
* If a negative sampler is given, another graph that contains the "negative edges", * If a negative sampler is given, another graph that contains the "negative edges",
connecting the source and destination nodes yielded from the given negative sampler. connecting the source and destination nodes yielded from the given negative sampler.
* A list of blocks necessary for computing the representation of the incident nodes * A list of MFGs necessary for computing the representation of the incident nodes
of the edges in the minibatch. of the edges in the minibatch.
For more details, please refer to :ref:`guide-minibatch-edge-classification-sampler` For more details, please refer to :ref:`guide-minibatch-edge-classification-sampler`
...@@ -290,7 +296,7 @@ class EdgeDataLoader: ...@@ -290,7 +296,7 @@ class EdgeDataLoader:
block_sampler : dgl.dataloading.BlockSampler block_sampler : dgl.dataloading.BlockSampler
The neighborhood sampler. The neighborhood sampler.
device : device context, optional device : device context, optional
The device of the generated blocks and graphs in each iteration, which should be a The device of the generated MFGs and graphs in each iteration, which should be a
PyTorch device object (e.g., ``torch.device``). PyTorch device object (e.g., ``torch.device``).
g_sampling : DGLGraph, optional g_sampling : DGLGraph, optional
The graph where neighborhood sampling is performed. The graph where neighborhood sampling is performed.
...@@ -406,11 +412,17 @@ class EdgeDataLoader: ...@@ -406,11 +412,17 @@ class EdgeDataLoader:
... negative_sampler=neg_sampler, ... negative_sampler=neg_sampler,
... batch_size=1024, shuffle=True, drop_last=False, num_workers=4) ... batch_size=1024, shuffle=True, drop_last=False, num_workers=4)
>>> for input_nodes, pos_pair_graph, neg_pair_graph, blocks in dataloader: >>> for input_nodes, pos_pair_graph, neg_pair_graph, blocks in dataloader:
... train_on(input_nodse, pair_graph, neg_pair_graph, blocks) ... train_on(input_nodes, pos_pair_graph, neg_pair_graph, blocks)
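Without a negative sampler (for example, edge classification or regression), each iteration yields three items instead of four. A minimal sketch under the same assumptions as the example above (``g``, ``train_eid``, ``sampler`` and ``train_on`` are placeholders carried over from it):

>>> dataloader = dgl.dataloading.EdgeDataLoader(
...     g, train_eid, sampler,
...     batch_size=1024, shuffle=True, drop_last=False, num_workers=4)
>>> for input_nodes, pair_graph, blocks in dataloader:
...     train_on(input_nodes, pair_graph, blocks)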
See also See also
-------- --------
:class:`~dgl.dataloading.dataloader.EdgeCollator` dgl.dataloading.dataloader.EdgeCollator
Notes
-----
Please refer to
:doc:`Minibatch Training Tutorials <tutorials/large/L0_neighbor_sampling_overview>`
and :ref:`User Guide Section 6 <guide-minibatch>` for usage.
For end-to-end usages, please refer to the following tutorial/examples: For end-to-end usages, please refer to the following tutorial/examples:
......
...@@ -1668,35 +1668,35 @@ def to_block(g, dst_nodes=None, include_dst_in_src=True): ...@@ -1668,35 +1668,35 @@ def to_block(g, dst_nodes=None, include_dst_in_src=True):
"""Convert a graph into a bipartite-structured *block* for message passing. """Convert a graph into a bipartite-structured *block* for message passing.
A block is a graph consisting of two sets of nodes: the A block is a graph consisting of two sets of nodes: the
*input* nodes and *output* nodes. The input and output nodes can have multiple *source* nodes and *destination* nodes. The source and destination nodes can have multiple
node types. All the edges connect from input nodes to output nodes. node types. All the edges connect from source nodes to destination nodes.
Specifically, the input nodes and output nodes will have the same node types as the Specifically, the source nodes and destination nodes will have the same node types as the
ones in the original graph. DGL maps each edge ``(u, v)`` with edge type ones in the original graph. DGL maps each edge ``(u, v)`` with edge type
``(utype, etype, vtype)`` in the original graph to the edge with type ``(utype, etype, vtype)`` in the original graph to the edge with type
``etype`` connecting from node ID ``u`` of type ``utype`` in the input side to node ``etype`` connecting from node ID ``u`` of type ``utype`` in the source side to node
ID ``v`` of type ``vtype`` in the output side. ID ``v`` of type ``vtype`` in the destination side.
For blocks returned by :func:`to_block`, the output nodes of the block will only For blocks returned by :func:`to_block`, the destination nodes of the block will only
contain the nodes that have at least one inbound edge of any type. The input nodes contain the nodes that have at least one inbound edge of any type. The source nodes
of the block will only contain the nodes that appear in the output nodes, as well of the block will only contain the nodes that appear in the destination nodes, as well
as the nodes that have at least one outbound edge connecting to one of the output nodes. as the nodes that have at least one outbound edge connecting to one of the destination nodes.
If the :attr:`dst_nodes` argument is not None, it specifies the output nodes instead. The destination nodes are specified by the :attr:`dst_nodes` argument if it is not None.
Parameters Parameters
---------- ----------
graph : DGLGraph graph : DGLGraph
The graph. The graph.
dst_nodes : Tensor or dict[str, Tensor], optional dst_nodes : Tensor or dict[str, Tensor], optional
The list of output nodes. The list of destination nodes.
If a tensor is given, the graph must have only one node type. If a tensor is given, the graph must have only one node type.
If given, it must be a superset of all the nodes that have at least one inbound If given, it must be a superset of all the nodes that have at least one inbound
edge. An error will be raised otherwise. edge. An error will be raised otherwise.
include_dst_in_src : bool include_dst_in_src : bool
If False, do not include output nodes in input nodes. If False, do not include destination nodes in source nodes.
(Default: True) (Default: True)
...@@ -1734,13 +1734,13 @@ def to_block(g, dst_nodes=None, include_dst_in_src=True): ...@@ -1734,13 +1734,13 @@ def to_block(g, dst_nodes=None, include_dst_in_src=True):
>>> g = dgl.graph(([1, 2], [2, 3])) >>> g = dgl.graph(([1, 2], [2, 3]))
>>> block = dgl.to_block(g, torch.LongTensor([3, 2])) >>> block = dgl.to_block(g, torch.LongTensor([3, 2]))
The output nodes would be exactly the same as the ones given: [3, 2]. The destination nodes would be exactly the same as the ones given: [3, 2].
>>> induced_dst = block.dstdata[dgl.NID] >>> induced_dst = block.dstdata[dgl.NID]
>>> induced_dst >>> induced_dst
tensor([3, 2]) tensor([3, 2])
The first few input nodes would also be exactly the same as The first few source nodes would also be exactly the same as
the ones given. The rest of the nodes are the ones necessary for message passing the ones given. The rest of the nodes are the ones necessary for message passing
into nodes 3, 2. This means that the node 1 would be included. into nodes 3, 2. This means that node 1 would be included.
...@@ -1749,7 +1749,7 @@ def to_block(g, dst_nodes=None, include_dst_in_src=True): ...@@ -1749,7 +1749,7 @@ def to_block(g, dst_nodes=None, include_dst_in_src=True):
tensor([3, 2, 1]) tensor([3, 2, 1])
You can notice that the first two nodes are identical to the given nodes as well as You can notice that the first two nodes are identical to the given nodes as well as
the output nodes. the destination nodes.
The induced edges can also be obtained by the following: The induced edges can also be obtained by the following:
...@@ -1764,20 +1764,20 @@ def to_block(g, dst_nodes=None, include_dst_in_src=True): ...@@ -1764,20 +1764,20 @@ def to_block(g, dst_nodes=None, include_dst_in_src=True):
>>> induced_src[src], induced_dst[dst] >>> induced_src[src], induced_dst[dst]
(tensor([2, 1]), tensor([3, 2])) (tensor([2, 1]), tensor([3, 2]))
The output nodes specified must be a superset of the nodes that have edges connecting The destination nodes specified must be a superset of the nodes that have edges connecting
to them. For example, the following will raise an error since the output nodes to them. For example, the following will raise an error since the destination nodes
does not contain node 3, which has an edge connecting to it. do not contain node 3, which has an edge connecting to it.
>>> g = dgl.graph(([1, 2], [2, 3])) >>> g = dgl.graph(([1, 2], [2, 3]))
>>> dgl.to_block(g, torch.LongTensor([2])) # error >>> dgl.to_block(g, torch.LongTensor([2])) # error
Converting a heterogeneous graph to a block is similar, except that when specifying Converting a heterogeneous graph to a block is similar, except that when specifying
the output nodes, you have to give a dict: the destination nodes, you have to give a dict:
>>> g = dgl.heterograph({('A', '_E', 'B'): ([1, 2], [2, 3])}) >>> g = dgl.heterograph({('A', '_E', 'B'): ([1, 2], [2, 3])})
If you don't specify any node of type A on the output side, the node type ``A`` If you don't specify any node of type A on the destination side, the node type ``A``
in the block would have zero nodes on the output side. in the block would have zero nodes on the destination side.
>>> block = dgl.to_block(g, {'B': torch.LongTensor([3, 2])}) >>> block = dgl.to_block(g, {'B': torch.LongTensor([3, 2])})
>>> block.number_of_dst_nodes('A') >>> block.number_of_dst_nodes('A')
...@@ -1787,12 +1787,12 @@ def to_block(g, dst_nodes=None, include_dst_in_src=True): ...@@ -1787,12 +1787,12 @@ def to_block(g, dst_nodes=None, include_dst_in_src=True):
>>> block.dstnodes['B'].data[dgl.NID] >>> block.dstnodes['B'].data[dgl.NID]
tensor([3, 2]) tensor([3, 2])
The input side would contain all the nodes on the output side: The source side would contain all the nodes on the destination side:
>>> block.srcnodes['B'].data[dgl.NID] >>> block.srcnodes['B'].data[dgl.NID]
tensor([3, 2]) tensor([3, 2])
As well as all the nodes that have connections to the nodes on the output side: As well as all the nodes that have connections to the nodes on the destination side:
>>> block.srcnodes['A'].data[dgl.NID] >>> block.srcnodes['A'].data[dgl.NID]
tensor([2, 1]) tensor([2, 1])
......
...@@ -93,15 +93,16 @@ By the end of this tutorial, you will be able to ...@@ -93,15 +93,16 @@ By the end of this tutorial, you will be able to
###################################################################### ######################################################################
# You can also notice in the animation above that the computation # You can also notice in the animation above that the computation
# dependencies in the animation above can be described as a series of # dependencies in the animation above can be described as a series of
# *bipartite graphs*. # bipartite graphs.
# The output nodes are on one side and all the nodes necessary for inputs # The output nodes (called *destination nodes*) are on one side and all the
# are on the other side. The arrows indicate how the sampled neighbors # nodes necessary for inputs (called *source nodes*) are on the other side.
# propagates messages to the nodes. # The arrows indicate how the sampled neighbors propagate messages to the nodes.
# DGL calls such graphs *message flow graphs* (MFGs).
# #
# Note that some GNN modules, such as `SAGEConv`, need to use the output # Note that some GNN modules, such as `SAGEConv`, need to use the destination
# nodes' features on the previous layer to compute the outputs. Without # nodes' features on the previous layer to compute the outputs. Without
# loss of generality, DGL always includes the output nodes themselves # loss of generality, DGL always includes the destination nodes themselves
# in the input nodes. # in the source nodes.
# #
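######################################################################
# A minimal sketch (not part of the original tutorial) of the property above:
# after a sampled frontier is converted into an MFG with ``dgl.to_block``, the
# destination nodes always form a prefix of the source nodes.
#

import dgl
import torch

g = dgl.graph(([1, 2, 4], [3, 3, 3]))
frontier = dgl.sampling.sample_neighbors(g, torch.LongTensor([3]), 2)
mfg = dgl.to_block(frontier, torch.LongTensor([3]))
print(torch.equal(mfg.srcdata[dgl.NID][:mfg.num_dst_nodes()], mfg.dstdata[dgl.NID]))  # True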
......
...@@ -70,7 +70,7 @@ test_nids = idx_split['test'] ...@@ -70,7 +70,7 @@ test_nids = idx_split['test']
# #
# In the :doc:`previous tutorial <L0_neighbor_sampling_overview>`, you # In the :doc:`previous tutorial <L0_neighbor_sampling_overview>`, you
# have seen that the computation dependency for message passing of a # have seen that the computation dependency for message passing of a
# single node can be described as a series of bipartite graphs. # single node can be described as a series of *message flow graphs* (MFGs).
# #
# |image1| # |image1|
# #
...@@ -84,10 +84,10 @@ test_nids = idx_split['test'] ...@@ -84,10 +84,10 @@ test_nids = idx_split['test']
# #
# DGL provides tools to iterate over the dataset in minibatches # DGL provides tools to iterate over the dataset in minibatches
# while generating the computation dependencies to compute their outputs # while generating the computation dependencies to compute their outputs
# with the bipartite graphs above. For node classification, you can use # with the MFGs above. For node classification, you can use
# ``dgl.dataloading.NodeDataLoader`` for iterating over the dataset. # ``dgl.dataloading.NodeDataLoader`` for iterating over the dataset.
# It accepts a sampler object to control how to generate the computation # It accepts a sampler object to control how to generate the computation
# dependencies in the form of bipartite graphs. DGL provides # dependencies in the form of MFGs. DGL provides
# implementations of common sampling algorithms such as # implementations of common sampling algorithms such as
# ``dgl.dataloading.MultiLayerNeighborSampler`` which randomly picks # ``dgl.dataloading.MultiLayerNeighborSampler`` which randomly picks
# a fixed number of neighbors for each node. # a fixed number of neighbors for each node.
...@@ -113,7 +113,7 @@ train_dataloader = dgl.dataloading.NodeDataLoader( ...@@ -113,7 +113,7 @@ train_dataloader = dgl.dataloading.NodeDataLoader(
graph, # The graph graph, # The graph
train_nids, # The node IDs to iterate over in minibatches train_nids, # The node IDs to iterate over in minibatches
sampler, # The neighbor sampler sampler, # The neighbor sampler
device=device, # Put the sampled bipartite graphs on CPU or GPU device=device, # Put the sampled MFGs on CPU or GPU
# The following arguments are inherited from PyTorch DataLoader. # The following arguments are inherited from PyTorch DataLoader.
batch_size=1024, # Batch size batch_size=1024, # Batch size
shuffle=True, # Whether to shuffle the nodes for every epoch shuffle=True, # Whether to shuffle the nodes for every epoch
...@@ -126,7 +126,7 @@ train_dataloader = dgl.dataloading.NodeDataLoader( ...@@ -126,7 +126,7 @@ train_dataloader = dgl.dataloading.NodeDataLoader(
# You can iterate over the data loader and see what it yields. # You can iterate over the data loader and see what it yields.
# #
input_nodes, output_nodes, bipartites = example_minibatch = next(iter(train_dataloader)) input_nodes, output_nodes, mfgs = example_minibatch = next(iter(train_dataloader))
print(example_minibatch) print(example_minibatch)
print("To compute {} nodes' outputs, we need {} nodes' input features".format(len(output_nodes), len(input_nodes))) print("To compute {} nodes' outputs, we need {} nodes' input features".format(len(output_nodes), len(input_nodes)))
...@@ -138,24 +138,24 @@ print("To compute {} nodes' outputs, we need {} nodes' input features".format(le ...@@ -138,24 +138,24 @@ print("To compute {} nodes' outputs, we need {} nodes' input features".format(le
# are needed on the first GNN layer for this minibatch. # are needed on the first GNN layer for this minibatch.
# - An ID tensor for the output nodes, i.e. nodes whose representations # - An ID tensor for the output nodes, i.e. nodes whose representations
# are to be computed. # are to be computed.
# - A list of bipartite graphs storing the computation dependencies # - A list of MFGs storing the computation dependencies
# for each GNN layer. # for each GNN layer.
# #
###################################################################### ######################################################################
# You can get the input and output node IDs of the bipartite graphs # You can get the source and destination node IDs of the MFGs
# and verify that the first few input nodes are always the same as the output # and verify that the first few source nodes are always the same as the destination
# nodes. As we described in the :doc:`overview <L0_neighbor_sampling_overview>`, # nodes. As we described in the :doc:`overview <L0_neighbor_sampling_overview>`,
# output nodes' own features from the previous layer may also be necessary in # destination nodes' own features from the previous layer may also be necessary in
# the computation of the new features. # the computation of the new features.
# #
bipartite_0_src = bipartites[0].srcdata[dgl.NID] mfg_0_src = mfgs[0].srcdata[dgl.NID]
bipartite_0_dst = bipartites[0].dstdata[dgl.NID] mfg_0_dst = mfgs[0].dstdata[dgl.NID]
print(bipartite_0_src) print(mfg_0_src)
print(bipartite_0_dst) print(mfg_0_dst)
print(torch.equal(bipartite_0_src[:bipartites[0].num_dst_nodes()], bipartite_0_dst)) print(torch.equal(mfg_0_src[:mfgs[0].num_dst_nodes()], mfg_0_dst))
###################################################################### ######################################################################
...@@ -177,14 +177,14 @@ class Model(nn.Module): ...@@ -177,14 +177,14 @@ class Model(nn.Module):
self.conv2 = SAGEConv(h_feats, num_classes, aggregator_type='mean') self.conv2 = SAGEConv(h_feats, num_classes, aggregator_type='mean')
self.h_feats = h_feats self.h_feats = h_feats
def forward(self, bipartites, x): def forward(self, mfgs, x):
# Lines that are changed are marked with an arrow: "<---" # Lines that are changed are marked with an arrow: "<---"
h_dst = x[:bipartites[0].num_dst_nodes()] # <--- h_dst = x[:mfgs[0].num_dst_nodes()] # <---
h = self.conv1(bipartites[0], (x, h_dst)) # <--- h = self.conv1(mfgs[0], (x, h_dst)) # <---
h = F.relu(h) h = F.relu(h)
h_dst = h[:bipartites[1].num_dst_nodes()] # <--- h_dst = h[:mfgs[1].num_dst_nodes()] # <---
h = self.conv2(bipartites[1], (h, h_dst)) # <--- h = self.conv2(mfgs[1], (h, h_dst)) # <---
return h return h
model = Model(num_features, 128, num_classes).to(device) model = Model(num_features, 128, num_classes).to(device)
...@@ -195,44 +195,44 @@ model = Model(num_features, 128, num_classes).to(device) ...@@ -195,44 +195,44 @@ model = Model(num_features, 128, num_classes).to(device)
# :doc:`introduction <../blitz/1_introduction>`, you will notice several # :doc:`introduction <../blitz/1_introduction>`, you will notice several
# differences: # differences:
# #
# - **DGL GNN layers on bipartite graphs**. Instead of computing on the # - **DGL GNN layers on MFGs**. Instead of computing on the
# full graph: # full graph:
# #
# .. code:: python # .. code:: python
# #
# h = self.conv1(g, x) # h = self.conv1(g, x)
# #
# you only compute on the sampled bipartite graph: # you only compute on the sampled MFG:
# #
# .. code:: python # .. code:: python
# #
# h = self.conv1(bipartites[0], (x, h_dst)) # h = self.conv1(mfgs[0], (x, h_dst))
# #
# All DGL’s GNN modules support message passing on bipartite graphs, # All DGL’s GNN modules support message passing on MFGs,
# where you supply a pair of features, one for input nodes and another # where you supply a pair of features, one for source nodes and another
# for output nodes. # for destination nodes.
# #
# - **Feature slicing for self-dependency**. There are statements that # - **Feature slicing for self-dependency**. There are statements that
# perform slicing to obtain the previous-layer representation of the # perform slicing to obtain the previous-layer representation of the
# output nodes: # destination nodes:
# #
# .. code:: python # .. code:: python
# #
# h_dst = x[:bipartites[0].num_dst_nodes()] # h_dst = x[:mfgs[0].num_dst_nodes()]
# #
# ``num_dst_nodes`` method works with bipartite graphs, where it will # The ``num_dst_nodes`` method works with MFGs, where it will
# return the number of output nodes. # return the number of destination nodes.
# #
# Since the first few input nodes of the yielded bipartite graph are # Since the first few source nodes of the yielded MFG are
# always the same as the output nodes, these statements obtain the # always the same as the destination nodes, these statements obtain the
# representations of the output nodes on the previous layer. They are # representations of the destination nodes on the previous layer. They are
# then combined with neighbor aggregation in ``dgl.nn.SAGEConv`` layer. # then combined with neighbor aggregation in ``dgl.nn.SAGEConv`` layer.
# #
# .. note:: # .. note::
# #
# See the :doc:`custom message passing # See the :doc:`custom message passing
# tutorial <L4_message_passing>` for more details on how to # tutorial <L4_message_passing>` for more details on how to
# manipulate bipartite graphs produced in this way, such as the usage # manipulate MFGs produced in this way, such as the usage
# of ``num_dst_nodes``. # of ``num_dst_nodes``.
# #
...@@ -277,12 +277,12 @@ for epoch in range(10): ...@@ -277,12 +277,12 @@ for epoch in range(10):
model.train() model.train()
with tqdm.tqdm(train_dataloader) as tq: with tqdm.tqdm(train_dataloader) as tq:
for step, (input_nodes, output_nodes, bipartites) in enumerate(tq): for step, (input_nodes, output_nodes, mfgs) in enumerate(tq):
# feature copy from CPU to GPU takes place here # feature copy from CPU to GPU takes place here
inputs = bipartites[0].srcdata['feat'] inputs = mfgs[0].srcdata['feat']
labels = bipartites[-1].dstdata['label'] labels = mfgs[-1].dstdata['label']
predictions = model(bipartites, inputs) predictions = model(mfgs, inputs)
loss = F.cross_entropy(predictions, labels) loss = F.cross_entropy(predictions, labels)
opt.zero_grad() opt.zero_grad()
...@@ -298,10 +298,10 @@ for epoch in range(10): ...@@ -298,10 +298,10 @@ for epoch in range(10):
predictions = [] predictions = []
labels = [] labels = []
with tqdm.tqdm(valid_dataloader) as tq, torch.no_grad(): with tqdm.tqdm(valid_dataloader) as tq, torch.no_grad():
for input_nodes, output_nodes, bipartites in tq: for input_nodes, output_nodes, mfgs in tq:
inputs = bipartites[0].srcdata['feat'] inputs = mfgs[0].srcdata['feat']
labels.append(bipartites[-1].dstdata['label'].cpu().numpy()) labels.append(mfgs[-1].dstdata['label'].cpu().numpy())
predictions.append(model(bipartites, inputs).argmax(1).cpu().numpy()) predictions.append(model(mfgs, inputs).argmax(1).cpu().numpy())
predictions = np.concatenate(predictions) predictions = np.concatenate(predictions)
labels = np.concatenate(labels) labels = np.concatenate(labels)
accuracy = sklearn.metrics.accuracy_score(labels, predictions) accuracy = sklearn.metrics.accuracy_score(labels, predictions)
......
...@@ -117,7 +117,7 @@ train_dataloader = dgl.dataloading.EdgeDataLoader( ...@@ -117,7 +117,7 @@ train_dataloader = dgl.dataloading.EdgeDataLoader(
torch.arange(graph.number_of_edges()), # The edges to iterate over torch.arange(graph.number_of_edges()), # The edges to iterate over
sampler, # The neighbor sampler sampler, # The neighbor sampler
negative_sampler=negative_sampler, # The negative sampler negative_sampler=negative_sampler, # The negative sampler
device=device, # Put the bipartite graphs on CPU or GPU device=device, # Put the MFGs on CPU or GPU
# The following arguments are inherited from PyTorch DataLoader. # The following arguments are inherited from PyTorch DataLoader.
batch_size=1024, # Batch size batch_size=1024, # Batch size
shuffle=True, # Whether to shuffle the nodes for every epoch shuffle=True, # Whether to shuffle the nodes for every epoch
...@@ -131,11 +131,11 @@ train_dataloader = dgl.dataloading.EdgeDataLoader( ...@@ -131,11 +131,11 @@ train_dataloader = dgl.dataloading.EdgeDataLoader(
# will give you. # will give you.
# #
input_nodes, pos_graph, neg_graph, bipartites = next(iter(train_dataloader)) input_nodes, pos_graph, neg_graph, mfgs = next(iter(train_dataloader))
print('Number of input nodes:', len(input_nodes)) print('Number of input nodes:', len(input_nodes))
print('Positive graph # nodes:', pos_graph.number_of_nodes(), '# edges:', pos_graph.number_of_edges()) print('Positive graph # nodes:', pos_graph.number_of_nodes(), '# edges:', pos_graph.number_of_edges())
print('Negative graph # nodes:', neg_graph.number_of_nodes(), '# edges:', neg_graph.number_of_edges()) print('Negative graph # nodes:', neg_graph.number_of_nodes(), '# edges:', neg_graph.number_of_edges())
print(bipartites) print(mfgs)
###################################################################### ######################################################################
...@@ -152,9 +152,9 @@ print(bipartites) ...@@ -152,9 +152,9 @@ print(bipartites)
# necessary for computing the pair-wise scores of positive and negative examples # necessary for computing the pair-wise scores of positive and negative examples
# in the current minibatch. # in the current minibatch.
# #
# The last element is a list of bipartite graphs storing the computation # The last element is a list of :doc:`MFGs <L0_neighbor_sampling_overview>`
# dependencies for each GNN layer. # storing the computation dependencies for each GNN layer.
# The bipartite graphs are used to compute the GNN outputs of the nodes # The MFGs are used to compute the GNN outputs of the nodes
# involved in positive/negative graph. # involved in positive/negative graph.
# #
...@@ -180,12 +180,12 @@ class Model(nn.Module): ...@@ -180,12 +180,12 @@ class Model(nn.Module):
self.conv2 = SAGEConv(h_feats, h_feats, aggregator_type='mean') self.conv2 = SAGEConv(h_feats, h_feats, aggregator_type='mean')
self.h_feats = h_feats self.h_feats = h_feats
def forward(self, bipartites, x): def forward(self, mfgs, x):
h_dst = x[:bipartites[0].num_dst_nodes()] h_dst = x[:mfgs[0].num_dst_nodes()]
h = self.conv1(bipartites[0], (x, h_dst)) h = self.conv1(mfgs[0], (x, h_dst))
h = F.relu(h) h = F.relu(h)
h_dst = h[:bipartites[1].num_dst_nodes()] h_dst = h[:mfgs[1].num_dst_nodes()]
h = self.conv2(bipartites[1], (h, h_dst)) h = self.conv2(mfgs[1], (h, h_dst))
return h return h
model = Model(num_features, 128).to(device) model = Model(num_features, 128).to(device)
...@@ -256,10 +256,10 @@ def inference(model, graph, node_features): ...@@ -256,10 +256,10 @@ def inference(model, graph, node_features):
device=device) device=device)
result = [] result = []
for input_nodes, output_nodes, bipartites in train_dataloader: for input_nodes, output_nodes, mfgs in train_dataloader:
# feature copy from CPU to GPU takes place here # feature copy from CPU to GPU takes place here
inputs = bipartites[0].srcdata['feat'] inputs = mfgs[0].srcdata['feat']
result.append(model(bipartites, inputs)) result.append(model(mfgs, inputs))
return torch.cat(result) return torch.cat(result)
...@@ -324,11 +324,11 @@ best_accuracy = 0 ...@@ -324,11 +324,11 @@ best_accuracy = 0
best_model_path = 'model.pt' best_model_path = 'model.pt'
for epoch in range(1): for epoch in range(1):
with tqdm.tqdm(train_dataloader) as tq: with tqdm.tqdm(train_dataloader) as tq:
for step, (input_nodes, pos_graph, neg_graph, bipartites) in enumerate(tq): for step, (input_nodes, pos_graph, neg_graph, mfgs) in enumerate(tq):
# feature copy from CPU to GPU takes place here # feature copy from CPU to GPU takes place here
inputs = bipartites[0].srcdata['feat'] inputs = mfgs[0].srcdata['feat']
outputs = model(bipartites, inputs) outputs = model(mfgs, inputs)
pos_score = predictor(pos_graph, outputs) pos_score = predictor(pos_graph, outputs)
neg_score = predictor(neg_graph, outputs) neg_score = predictor(neg_graph, outputs)
......
...@@ -38,33 +38,34 @@ train_dataloader = dgl.dataloading.NodeDataLoader( ...@@ -38,33 +38,34 @@ train_dataloader = dgl.dataloading.NodeDataLoader(
num_workers=0 num_workers=0
) )
input_nodes, output_nodes, bipartites = next(iter(train_dataloader)) input_nodes, output_nodes, mfgs = next(iter(train_dataloader))
###################################################################### ######################################################################
# DGL Bipartite Graph Introduction # DGL Bipartite Graph Introduction
# -------------------------------- # --------------------------------
# #
# In the previous tutorials, you have seen the concept *bipartite graph*, # In the previous tutorials, you have seen the concept *message flow graph*
# where nodes are divided into two parts. # (MFG), where nodes are divided into two parts. It is a kind of (directional)
# bipartite graph.
# This section introduces how you can manipulate (directional) bipartite # This section introduces how you can manipulate (directional) bipartite
# graphs. # graphs.
# #
# You can access the input node features and output node features via # You can access the source node features and destination node features via
# ``srcdata`` and ``dstdata`` attributes: # ``srcdata`` and ``dstdata`` attributes:
# #
bipartite = bipartites[0] mfg = mfgs[0]
print(bipartite.srcdata) print(mfg.srcdata)
print(bipartite.dstdata) print(mfg.dstdata)
###################################################################### ######################################################################
# It also has ``num_src_nodes`` and ``num_dst_nodes`` functions to query # It also has ``num_src_nodes`` and ``num_dst_nodes`` functions to query
# how many input nodes and output nodes exist in the bipartite graph: # how many source nodes and destination nodes exist in the bipartite graph:
# #
print(bipartite.num_src_nodes(), bipartite.num_dst_nodes()) print(mfg.num_src_nodes(), mfg.num_dst_nodes())
###################################################################### ######################################################################
...@@ -72,18 +73,18 @@ print(bipartite.num_src_nodes(), bipartite.num_dst_nodes()) ...@@ -72,18 +73,18 @@ print(bipartite.num_src_nodes(), bipartite.num_dst_nodes())
# will do with ``ndata`` on the graphs you have seen earlier: # will do with ``ndata`` on the graphs you have seen earlier:
# #
bipartite.srcdata['x'] = torch.zeros(bipartite.num_src_nodes(), bipartite.num_dst_nodes()) mfg.srcdata['x'] = torch.zeros(mfg.num_src_nodes(), mfg.num_dst_nodes())
dst_feat = bipartite.dstdata['feat'] dst_feat = mfg.dstdata['feat']
###################################################################### ######################################################################
# Also, since the bipartite graphs are constructed by DGL, you can # Also, since the bipartite graphs are constructed by DGL, you can
# retrieve the input node IDs (i.e. those that are required to compute the # retrieve the source node IDs (i.e. those that are required to compute the
# output) and output node IDs (i.e. those whose representations the # output) and destination node IDs (i.e. those whose representations the
# current GNN layer should compute) as follows. # current GNN layer should compute) as follows.
# #
bipartite.srcdata[dgl.NID], bipartite.dstdata[dgl.NID] mfg.srcdata[dgl.NID], mfg.dstdata[dgl.NID]
###################################################################### ######################################################################
...@@ -93,30 +94,30 @@ bipartite.srcdata[dgl.NID], bipartite.dstdata[dgl.NID] ...@@ -93,30 +94,30 @@ bipartite.srcdata[dgl.NID], bipartite.dstdata[dgl.NID]
###################################################################### ######################################################################
# Recall that the bipartite graphs yielded by the ``NodeDataLoader`` and # Recall that the MFGs yielded by the ``NodeDataLoader`` and
# ``EdgeDataLoader`` have the property that the first few input nodes are # ``EdgeDataLoader`` have the property that the first few source nodes are
# always identical to the output nodes: # always identical to the destination nodes:
# #
# |image1| # |image1|
# #
# .. |image1| image:: https://data.dgl.ai/tutorial/img/bipartite.gif # .. |image1| image:: https://data.dgl.ai/tutorial/img/bipartite.gif
# #
print(torch.equal(bipartite.srcdata[dgl.NID][:bipartite.num_dst_nodes()], bipartite.dstdata[dgl.NID])) print(torch.equal(mfg.srcdata[dgl.NID][:mfg.num_dst_nodes()], mfg.dstdata[dgl.NID]))
###################################################################### ######################################################################
# Suppose you have obtained the input node representations # Suppose you have obtained the source node representations
# :math:`h_u^{(l-1)}`: # :math:`h_u^{(l-1)}`:
# #
bipartite.srcdata['h'] = torch.randn(bipartite.num_src_nodes(), 10) mfg.srcdata['h'] = torch.randn(mfg.num_src_nodes(), 10)
###################################################################### ######################################################################
# Recall that DGL provides the `update_all` interface for expressing how # Recall that DGL provides the `update_all` interface for expressing how
# to compute messages and how to aggregate them on the nodes that receive # to compute messages and how to aggregate them on the nodes that receive
# them. This concept naturally applies to bipartite graphs -- message # them. This concept naturally applies to bipartite graphs like MFGs -- message
# computation happens on the edges between source and destination nodes of # computation happens on the edges between source and destination nodes of
# the edges, and message aggregation happens on the destination nodes. # the edges, and message aggregation happens on the destination nodes.
# #
...@@ -129,8 +130,8 @@ bipartite.srcdata['h'] = torch.randn(bipartite.num_src_nodes(), 10) ...@@ -129,8 +130,8 @@ bipartite.srcdata['h'] = torch.randn(bipartite.num_src_nodes(), 10)
import dgl.function as fn import dgl.function as fn
bipartite.update_all(message_func=fn.copy_u('h', 'm'), reduce_func=fn.mean('m', 'h')) mfg.update_all(message_func=fn.copy_u('h', 'm'), reduce_func=fn.mean('m', 'h'))
m_v = bipartite.dstdata['h'] m_v = mfg.dstdata['h']
m_v m_v
...@@ -165,9 +166,9 @@ class SAGEConv(nn.Module): ...@@ -165,9 +166,9 @@ class SAGEConv(nn.Module):
Parameters Parameters
---------- ----------
g : Graph g : Graph
The input bipartite graph. The input MFG.
h : (Tensor, Tensor) h : (Tensor, Tensor)
The feature of input nodes and output nodes as a pair of Tensors. The features of source nodes and destination nodes as a pair of Tensors.
""" """
with g.local_scope(): with g.local_scope():
h_src, h_dst = h h_src, h_dst = h
...@@ -185,12 +186,12 @@ class Model(nn.Module): ...@@ -185,12 +186,12 @@ class Model(nn.Module):
self.conv1 = SAGEConv(in_feats, h_feats) self.conv1 = SAGEConv(in_feats, h_feats)
self.conv2 = SAGEConv(h_feats, num_classes) self.conv2 = SAGEConv(h_feats, num_classes)
def forward(self, bipartites, x): def forward(self, mfgs, x):
h_dst = x[:bipartites[0].num_dst_nodes()] h_dst = x[:mfgs[0].num_dst_nodes()]
h = self.conv1(bipartites[0], (x, h_dst)) h = self.conv1(mfgs[0], (x, h_dst))
h = F.relu(h) h = F.relu(h)
h_dst = h[:bipartites[1].num_dst_nodes()] h_dst = h[:mfgs[1].num_dst_nodes()]
h = self.conv2(bipartites[1], (h, h_dst)) h = self.conv2(mfgs[1], (h, h_dst))
return h return h
sampler = dgl.dataloading.MultiLayerNeighborSampler([4, 4]) sampler = dgl.dataloading.MultiLayerNeighborSampler([4, 4])
...@@ -205,15 +206,15 @@ train_dataloader = dgl.dataloading.NodeDataLoader( ...@@ -205,15 +206,15 @@ train_dataloader = dgl.dataloading.NodeDataLoader(
model = Model(graph.ndata['feat'].shape[1], 128, dataset.num_classes).to(device) model = Model(graph.ndata['feat'].shape[1], 128, dataset.num_classes).to(device)
with tqdm.tqdm(train_dataloader) as tq: with tqdm.tqdm(train_dataloader) as tq:
for step, (input_nodes, output_nodes, bipartites) in enumerate(tq): for step, (input_nodes, output_nodes, mfgs) in enumerate(tq):
inputs = bipartites[0].srcdata['feat'] inputs = mfgs[0].srcdata['feat']
labels = bipartites[-1].dstdata['label'] labels = mfgs[-1].dstdata['label']
predictions = model(bipartites, inputs) predictions = model(mfgs, inputs)
###################################################################### ######################################################################
# Both ``update_all`` and the functions in ``nn.functional`` namespace # Both ``update_all`` and the functions in ``nn.functional`` namespace
# support bipartite graphs, so you can migrate the code working for small # support MFGs, so you can migrate the code working for small
# graphs to large graph training with minimal changes introduced above. # graphs to large-graph training with the minimal changes introduced above.
# #
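######################################################################
# A minimal sketch (not in the original tutorial) of that point: the same
# ``update_all``-based aggregation runs unchanged on a full graph and on an MFG,
# because messages are always aggregated onto the destination nodes.
#

import dgl.function as fn

def mean_neighbor_feat(g, src_feat):
    with g.local_scope():
        g.srcdata['h'] = src_feat
        g.update_all(fn.copy_u('h', 'm'), fn.mean('m', 'h_agg'))
        return g.dstdata['h_agg']

# Works alike on a full homogeneous graph (srcdata/dstdata alias ndata) and on an MFG:
# mean_neighbor_feat(graph, graph.ndata['feat'])
# mean_neighbor_feat(mfg, mfg.srcdata['feat'])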
......