.. _api-dataloading:

dgl.dataloading
=================================

.. automodule:: dgl.dataloading

DataLoaders
-----------
.. currentmodule:: dgl.dataloading.pytorch

DGL's DataLoader for mini-batch training works similarly to PyTorch's DataLoader.
It has a generator interface that returns mini-batches sampled from the given graphs.
DGL provides two DataLoaders: a ``NodeDataLoader`` for node classification tasks
and an ``EdgeDataLoader`` for edge/link prediction tasks.
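
The generator interface described above can be pictured with a small
pure-Python sketch. It is only an illustration of iterating over seed node IDs
in mini-batches and does not use DGL; the helper name ``iter_minibatches`` and
its parameters are hypothetical and are not part of the DGL API.

.. code:: python

   import random

   def iter_minibatches(node_ids, batch_size, shuffle=False, drop_last=False, seed=0):
       """Yield lists of seed node IDs, one list per mini-batch (hypothetical helper)."""
       ids = list(node_ids)
       if shuffle:
           # Shuffle a copy so the caller's sequence is left untouched.
           random.Random(seed).shuffle(ids)
       for start in range(0, len(ids), batch_size):
           batch = ids[start:start + batch_size]
           if drop_last and len(batch) < batch_size:
               break  # skip the final, incomplete batch
           yield batch

   batches = list(iter_minibatches(range(10), batch_size=4))

A real DataLoader would additionally sample a computation graph around each
batch of seed nodes before yielding it.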

.. autoclass:: NodeDataLoader
.. autoclass:: EdgeDataLoader
.. autoclass:: GraphDataLoader

.. _api-dataloading-neighbor-sampling:

Neighbor Sampler
----------------
.. currentmodule:: dgl.dataloading.neighbor

Neighbor samplers are classes that control how the DataLoaders sample
neighbors. All of them inherit from the base :class:`BlockSampler` class, but implement
different neighbor sampling strategies by overriding the ``sample_frontier`` or
the ``sample_blocks`` method.
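
The split between the two methods can be sketched in plain Python. The code
below is a conceptual illustration under simplifying assumptions, not DGL's
implementation: the graph is a hypothetical dictionary mapping each node to its
in-neighbors, and the free functions ``sample_frontier`` and ``sample_blocks``
only mirror the roles of the methods with the same names.

.. code:: python

   import random

   def sample_frontier(adj, seeds, fanout, rng):
       """Pick at most ``fanout`` neighbors for every seed node."""
       frontier = {}
       for v in seeds:
           neighbors = adj.get(v, [])
           k = min(fanout, len(neighbors))
           frontier[v] = rng.sample(neighbors, k)
       return frontier

   def sample_blocks(adj, seeds, fanouts, rng=None):
       """Build one frontier per layer, working from the output layer backwards."""
       rng = random.Random(0) if rng is None else rng
       blocks = []
       for fanout in reversed(fanouts):
           frontier = sample_frontier(adj, seeds, fanout, rng)
           blocks.insert(0, frontier)
           # Seeds of the next (earlier) layer: current seeds plus sampled neighbors.
           seeds = sorted(set(seeds) | {u for ns in frontier.values() for u in ns})
       return blocks

   adj = {0: [1, 2, 3], 1: [0, 2], 2: [3], 3: [0, 1, 2]}
   blocks = sample_blocks(adj, seeds=[0], fanouts=[2, 2])

In this sketch, changing the strategy for a single layer means replacing
``sample_frontier``, while changing how the layers are chained means replacing
``sample_blocks``.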

.. autoclass:: BlockSampler
    :members: sample_frontier, sample_blocks, sample

.. autoclass:: MultiLayerNeighborSampler
    :members: sample_frontier
    :show-inheritance:

.. autoclass:: MultiLayerFullNeighborSampler
    :show-inheritance:

Subgraph Iterators
------------------
Subgraph iterators iterate over the original graph one subgraph at a time. Use subgraph
iterators with ``GraphDataLoader`` as follows:

.. code:: python

   import dgl

   # g is the graph to partition; train_on is a user-defined training step.
   sgiter = dgl.dataloading.ClusterGCNSubgraphIterator(
       g, num_partitions=100, cache_directory='.', refresh=True)
   dataloader = dgl.dataloading.GraphDataLoader(sgiter, batch_size=4, num_workers=0)
   for subgraph_batch in dataloader:
       train_on(subgraph_batch)

.. autoclass:: dgl.dataloading.dataloader.SubgraphIterator

.. autoclass:: dgl.dataloading.cluster_gcn.ClusterGCNSubgraphIterator

ShaDow-GNN Subgraph Sampler
---------------------------
.. currentmodule:: dgl.dataloading.shadow

.. autoclass:: ShaDowKHopSampler

.. _api-dataloading-collators:

Collators
---------
.. currentmodule:: dgl.dataloading

Collators are platform-agnostic classes that generate the mini-batches
given the graphs and indices to sample from.

.. autoclass:: NodeCollator
.. autoclass:: EdgeCollator
.. autoclass:: GraphCollator

.. _api-dataloading-negative-sampling:

Negative Samplers for Link Prediction
-------------------------------------
.. currentmodule:: dgl.dataloading.negative_sampler

Negative samplers are classes that control how the ``EdgeDataLoader``
generates negative edges.
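
As a rough illustration of uniform negative sampling, the pure-Python sketch
below keeps the source node of each positive edge and draws ``k`` destination
nodes uniformly at random. It assumes a homogeneous graph with node IDs
``0 .. num_nodes - 1``; the function ``uniform_negative_edges`` is hypothetical
and is not part of the DGL API.

.. code:: python

   import random

   def uniform_negative_edges(edges, num_nodes, k, rng=None):
       """Return k negative (src, dst) pairs for every positive edge."""
       rng = random.Random(0) if rng is None else rng
       negatives = []
       for u, _v in edges:
           for _ in range(k):
               # Keep the source node and draw the destination uniformly.
               negatives.append((u, rng.randrange(num_nodes)))
       return negatives

   positive_edges = [(0, 1), (2, 3)]
   negative_edges = uniform_negative_edges(positive_edges, num_nodes=5, k=3)

In this sketch a sampled pair may coincide with an edge that exists in the
graph, because no check against the positive edges is made.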

.. autoclass:: Uniform
    :members: __call__

.. autoclass:: GlobalUniform
    :members: __call__

Async Copying to/from GPUs
--------------------------
.. currentmodule:: dgl.dataloading

Data can be copied from the CPU to the GPU while the GPU is being used for
computation, using the :class:`AsyncTransferer`.
For the transfer to be fully asynchronous, the context the
:class:`AsyncTransferer` is created with must be a GPU context, and the input
tensor must be in pinned memory.


.. autoclass:: AsyncTransferer
    :members: __init__, async_copy

.. autoclass:: async_transferer.Transfer
    :members: wait