.. _api-dataloading:

dgl.dataloading
=================================

.. automodule:: dgl.dataloading

DataLoaders
-----------
.. currentmodule:: dgl.dataloading.pytorch

The DGL DataLoader for mini-batch training works similarly to PyTorch's DataLoader.
It has a generator interface that returns mini-batches sampled from the given graphs.
DGL provides two DataLoaders: a ``NodeDataLoader`` for node classification tasks
and an ``EdgeDataLoader`` for edge/link prediction tasks.

.. autoclass:: NodeDataLoader
.. autoclass:: EdgeDataLoader
.. autoclass:: GraphDataLoader
.. autoclass:: DistNodeDataLoader
.. autoclass:: DistEdgeDataLoader

.. _api-dataloading-neighbor-sampling:

Neighbor Sampler
----------------
.. currentmodule:: dgl.dataloading.neighbor

Neighbor samplers are classes that control how a ``DataLoader`` samples
neighbors. All of them inherit from the base :class:`BlockSampler` class, but implement
different neighbor sampling strategies by overriding the ``sample_frontier`` or
the ``sample_blocks`` method.
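
The layer-by-layer idea behind these two methods can be illustrated with a
small pure-Python sketch. It is not DGL's implementation: the adjacency
dictionary and the names ``sample_frontier_sketch`` and ``sample_blocks_sketch``
are hypothetical, chosen only to mirror the roles of ``sample_frontier`` and
``sample_blocks`` under the assumption of a fixed fan-out per layer.

.. code:: python

   import random


   def sample_frontier_sketch(in_neighbors, seed_nodes, fanout, rng):
       """Pick at most ``fanout`` in-neighbors for every seed node."""
       frontier = {}
       for v in seed_nodes:
           candidates = in_neighbors.get(v, [])
           k = min(fanout, len(candidates))
           frontier[v] = rng.sample(candidates, k)
       return frontier


   def sample_blocks_sketch(in_neighbors, seed_nodes, fanouts, rng):
       """Build one frontier per layer, from the output layer back to the input."""
       blocks = []
       seeds = list(seed_nodes)
       for fanout in reversed(fanouts):
           frontier = sample_frontier_sketch(in_neighbors, seeds, fanout, rng)
           blocks.insert(0, frontier)
           # The earlier layer needs the current seeds plus every sampled neighbor.
           sampled = {u for nbrs in frontier.values() for u in nbrs}
           seeds = sorted(set(seeds) | sampled)
       return blocks


   # Hypothetical toy graph: node -> list of its in-neighbors.
   in_neighbors = {0: [1, 2, 3], 1: [2], 2: [0, 3], 3: []}
   blocks = sample_blocks_sketch(
       in_neighbors, [0], fanouts=[2, 2], rng=random.Random(0))

In this sketch, overriding only the frontier function changes the per-layer
strategy, while overriding the block function changes how layers are chained.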

.. autoclass:: BlockSampler
    :members: sample_frontier, sample_blocks, sample

.. autoclass:: MultiLayerNeighborSampler
    :members: sample_frontier
    :show-inheritance:

.. autoclass:: MultiLayerFullNeighborSampler
    :show-inheritance:

Subgraph Iterators
------------------
Subgraph iterators iterate over the original graph in subgraphs. Use subgraph
iterators with ``GraphDataLoader`` as follows:

.. code:: python

   import dgl

   # g is an existing DGLGraph; train_on is a user-defined training step.
   sgiter = dgl.dataloading.ClusterGCNSubgraphIterator(
       g, num_partitions=100, cache_directory='.', refresh=True)
   dataloader = dgl.dataloading.GraphDataLoader(sgiter, batch_size=4, num_workers=0)
   for subgraph_batch in dataloader:
       train_on(subgraph_batch)

.. autoclass:: dgl.dataloading.dataloader.SubgraphIterator

.. autoclass:: dgl.dataloading.cluster_gcn.ClusterGCNSubgraphIterator

ShaDow-GNN Subgraph Sampler
---------------------------
.. currentmodule:: dgl.dataloading.shadow

.. autoclass:: ShaDowKHopSampler

.. _api-dataloading-collators:

Collators
---------
.. currentmodule:: dgl.dataloading

Collators are platform-agnostic classes that generate mini-batches
from the given graphs and the indices to sample from.
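
To make the division of labor concrete, here is a minimal pure-Python sketch
of the collator pattern. It assumes a collator exposes a ``dataset`` of items
to draw from and a ``collate`` step that turns a list of items into one
mini-batch; ``ToyCollatorSketch`` and ``iterate_batches`` are hypothetical
names, and the loop only stands in for whichever framework data loader is used.

.. code:: python

   class ToyCollatorSketch:
       """Hypothetical collator: holds the items and builds mini-batches."""

       def __init__(self, node_ids, features):
           self._node_ids = list(node_ids)
           self._features = features

       @property
       def dataset(self):
           # The items (here node IDs) that a data loader draws from.
           return self._node_ids

       def collate(self, items):
           # Turn a list of sampled items into one mini-batch.
           return {
               "nodes": items,
               "features": [self._features[i] for i in items],
           }


   def iterate_batches(collator, batch_size):
       """Stand-in for a framework data loader: slice, then collate."""
       data = collator.dataset
       for start in range(0, len(data), batch_size):
           yield collator.collate(data[start:start + batch_size])


   collator = ToyCollatorSketch(range(5), features=[10, 11, 12, 13, 14])
   batches = list(iterate_batches(collator, batch_size=2))

Because the collator itself does no framework-specific work, the same object
could be paired with a different batching loop without changes.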

.. autoclass:: NodeCollator
.. autoclass:: EdgeCollator
.. autoclass:: GraphCollator

.. _api-dataloading-negative-sampling:

Negative Samplers for Link Prediction
-------------------------------------
.. currentmodule:: dgl.dataloading.negative_sampler

Negative samplers are classes that control how the ``EdgeDataLoader``
generates negative edges.
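
As a rough illustration of uniform negative sampling, the pure-Python sketch
below draws ``k`` random destination nodes for the source of every positive
edge. It is not DGL's implementation: ``UniformNegativeSketch`` and its
arguments are hypothetical, and the sketch does not check whether a sampled
pair happens to be an actual edge.

.. code:: python

   import random


   class UniformNegativeSketch:
       """Hypothetical callable that corrupts the destination of each edge."""

       def __init__(self, k, seed=None):
           self.k = k
           self._rng = random.Random(seed)

       def __call__(self, edges, num_nodes):
           neg_src, neg_dst = [], []
           for src, _dst in edges:
               for _ in range(self.k):
                   # Keep the source, draw the destination uniformly at random.
                   neg_src.append(src)
                   neg_dst.append(self._rng.randrange(num_nodes))
           return neg_src, neg_dst


   positive_edges = [(0, 1), (2, 3)]
   sampler = UniformNegativeSketch(k=3, seed=0)
   neg_src, neg_dst = sampler(positive_edges, num_nodes=5)

Each positive edge contributes ``k`` negative pairs, so the two returned
lists always have ``k`` times as many entries as there are positive edges.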

.. autoclass:: Uniform
    :members: __call__

.. autoclass:: GlobalUniform
    :members: __call__

Async Copying to/from GPUs
--------------------------
.. currentmodule:: dgl.dataloading

Data can be copied from the CPU to the GPU while the GPU is being used for
computation, using the :class:`AsyncTransferer`.
For the transfer to be fully asynchronous, the context the
:class:`AsyncTransferer` is created with must be a GPU context, and the
input tensor must be in pinned memory.


.. autoclass:: AsyncTransferer
    :members: __init__, async_copy

.. autoclass:: async_transferer.Transfer
    :members: wait