Commit a88f3511 authored by Minjie Wang, committed by GitHub

[Doc] update doc string (#426)

parent 3cc32a97
@@ -15,3 +15,4 @@ API Reference
     data
     transform
     nn
+    subgraph
 .. _apigraph:
 DGLSubGraph -- Class for subgraph data structure
-=========================================
+================================================
-.. currentmodule:: dgl
+.. currentmodule:: dgl.subgraph
 .. autoclass:: DGLSubGraph
 Mapping between subgraph and parent graph
--------------------------------------
+-----------------------------------------
+.. autosummary::
     :toctree: ../../generated/
     DGLSubGraph.parent_nid
@@ -16,6 +17,7 @@ DGLSubGraph -- Class for subgraph data structure
 Synchronize features between subgraph and parent graph
 ------------------------------------------------------
+.. autosummary::
     :toctree: ../../generated/
     DGLSubGraph.copy_from_parent
......
@@ -192,52 +192,77 @@ def NeighborSampler(g, batch_size, expand_factor, num_hops=1,
                     shuffle=False, num_workers=1, prefetch=False, add_self_loop=False):
     '''Create a sampler that samples neighborhood.

-    This creates a NodeFlow loader that samples subgraphs from the input graph
-    with neighbor sampling. This sampling method is implemented in C and can perform
-    sampling very efficiently.
+    It returns a generator of :class:`~dgl.NodeFlow`. This can be viewed as
+    an analogy of *mini-batch training* on graph data -- the given graph represents
+    the whole dataset and the returned generator produces mini-batches (in the form
+    of :class:`~dgl.NodeFlow` objects).

-    A NodeFlow grows from a seed vertex. It contains sampled neighbors
-    of the seed vertex as well as the edges that connect neighbor nodes with
-    seed nodes. When the number of hops is k (>1), the neighbors are sampled
-    from the k-hop neighborhood. In this case, the sampled edges are the ones
-    that connect the source nodes and the sampled neighbor nodes of the source
-    nodes.
+    A NodeFlow grows from sampled nodes. It first samples a set of nodes from the given
+    ``seed_nodes`` (or all the nodes if not given), then samples their neighbors
+    and extracts the subgraph. If the number of hops is :math:`k(>1)`, the process is repeated
+    recursively, with the neighbor nodes just sampled become the new seed nodes.
+    The result is a graph we defined as :class:`~dgl.NodeFlow` that contains :math:`k+1`
+    layers. The last layer is the initial seed nodes. The sampled neighbor nodes in
+    layer :math:`i+1` are in layer :math:`i`. All the edges are from nodes
+    in layer :math:`i` to layer :math:`i+1`.

-    The NodeFlow loader returns a list of NodeFlows. The size of the NodeFlow list
-    is the number of workers.
+    TODO(minjie): give a figure here.
+
+    As an analogy to mini-batch training, the ``batch_size`` here is equal to the number
+    of the initial seed nodes (number of nodes in the last layer).
+    The number of nodeflow objects (the number of batches) is calculated by
+    ``len(seed_nodes) // batch_size`` (if ``seed_nodes`` is None, then it is equal
+    to the set of all nodes in the graph).

     Parameters
     ----------
-    g: the DGLGraph where we sample NodeFlows.
-    batch_size: The number of NodeFlows in a batch.
-    expand_factor: the number of neighbors sampled from the neighbor list
-        of a vertex. The value of this parameter can be
-        an integer: indicates the number of neighbors sampled from a neighbor list.
-        a floating-point: indicates the ratio of the sampled neighbors in a neighbor list.
-        string: indicates some common ways of calculating the number of sampled neighbors,
-        e.g., 'sqrt(deg)'.
-    num_hops: The size of the neighborhood where we sample vertices.
-    neighbor_type: indicates the neighbors on different types of edges.
-        "in" means the neighbors on the in-edges, "out" means the neighbors on
-        the out-edges and "both" means neighbors on both types of edges.
-    node_prob: the probability that a neighbor node is sampled.
-        1D Tensor. None means uniform sampling. Otherwise, the number of elements
-        should be the same as the number of vertices in the graph.
-    seed_nodes: a list of nodes where we sample NodeFlows from.
-        If it's None, the seed vertices are all vertices in the graph.
-    shuffle: indicates the sampled NodeFlows are shuffled.
-    num_workers: the number of worker threads that sample NodeFlows in parallel.
-    prefetch : bool, default False
-        Whether to prefetch the samples in the next batch.
-    add_self_loop : bool, default False
-        Whether to add self loop to the sampled NodeFlow.
-        If True, the edge IDs of the self loop edges are -1.
+    g : DGLGraph
+        The DGLGraph where we sample NodeFlows.
+    batch_size : int
+        The batch size (i.e, the number of nodes in the last layer)
+    expand_factor : int, float, str
+        The number of neighbors sampled from the neighbor list of a vertex.
+        The value of this parameter can be:
+
+        * int: indicates the number of neighbors sampled from a neighbor list.
+        * float: indicates the ratio of the sampled neighbors in a neighbor list.
+        * str: indicates some common ways of calculating the number of sampled neighbors,
+          e.g., ``sqrt(deg)``.
+
+    num_hops : int, optional
+        The number of hops to sample (i.e, the number of layers in the NodeFlow).
+        Default: 1
+    neighbor_type: str, optional
+        Indicates the neighbors on different types of edges.
+
+        * "in": the neighbors on the in-edges.
+        * "out": the neighbors on the out-edges.
+        * "both": the neighbors on both types of edges.
+
+        Default: "in"
+    node_prob : Tensor, optional
+        A 1D tensor for the probability that a neighbor node is sampled.
+        None means uniform sampling. Otherwise, the number of elements
+        should be equal to the number of vertices in the graph.
+        Default: None
+    seed_nodes : Tensor, optional
+        A 1D tensor list of nodes where we sample NodeFlows from.
+        If None, the seed vertices are all the vertices in the graph.
+        Default: None
+    shuffle : bool, optional
+        Indicates the sampled NodeFlows are shuffled. Default: False
+    num_workers : int, optional
+        The number of worker threads that sample NodeFlows in parallel. Default: 1
+    prefetch : bool, optional
+        If true, prefetch the samples in the next batch. Default: False
+    add_self_loop : bool, optional
+        If true, add self loop to the sampled NodeFlow.
+        The edge IDs of the self loop edges are -1. Default: False

     Returns
     -------
-    A NodeFlow iterator
-    The iterator returns a list of batched NodeFlows and a dictionary of additional
-    information about the NodeFlows.
+    generator
+        The generator of NodeFlows.
     '''
     loader = NSSubgraphLoader(g, batch_size, expand_factor, num_hops, neighbor_type, node_prob,
                               seed_nodes, shuffle, num_workers, add_self_loop)
......
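The layered sampling described in the `NeighborSampler` docstring can be illustrated with a small plain-Python sketch. This is not DGL's implementation (the real sampler is backed by `NSSubgraphLoader` and returns `NodeFlow` objects). The helper names `num_to_sample`, `sample_layers` and `neighbor_batches`, the adjacency-dict input, and the truncating treatment of a float `expand_factor` are all assumptions made for illustration. Dropping a trailing partial batch simply mirrors the `len(seed_nodes) // batch_size` count stated in the docstring.

```python
import math
import random


def num_to_sample(expand_factor, degree):
    """Resolve expand_factor (int, float, or 'sqrt(deg)') to a neighbor count.

    Assumption: a float ratio is truncated toward zero; DGL may round differently.
    """
    if isinstance(expand_factor, str):
        if expand_factor != "sqrt(deg)":
            raise ValueError("unsupported expand_factor string: %r" % expand_factor)
        count = int(math.sqrt(degree))
    elif isinstance(expand_factor, float):
        count = int(expand_factor * degree)
    else:
        count = expand_factor
    # Never request more neighbors than the node actually has.
    return max(0, min(count, degree))


def sample_layers(in_neighbors, seeds, expand_factor, num_hops, rng):
    """Return num_hops + 1 layers of node IDs; the last layer holds the seeds.

    Neighbors sampled for the nodes of layer i+1 are placed in layer i,
    so every sampled edge points from layer i to layer i+1.
    """
    layers = [sorted(seeds)]
    for _ in range(num_hops):
        frontier = set()
        for node in layers[0]:
            neighbors = in_neighbors.get(node, [])
            k = num_to_sample(expand_factor, len(neighbors))
            frontier.update(rng.sample(neighbors, k))
        # The nodes just sampled become the seeds of the next hop.
        layers.insert(0, sorted(frontier))
    return layers


def neighbor_batches(in_neighbors, seed_nodes, batch_size, expand_factor,
                     num_hops=1, seed=0):
    """Yield one list of layers per mini-batch of seed nodes.

    Produces len(seed_nodes) // batch_size batches, as the docstring states.
    """
    rng = random.Random(seed)
    for start in range(0, len(seed_nodes) - batch_size + 1, batch_size):
        batch = seed_nodes[start:start + batch_size]
        yield sample_layers(in_neighbors, batch, expand_factor, num_hops, rng)


# Toy graph: node -> list of in-neighbors (hypothetical data).
in_neighbors = {0: [1, 2, 3, 4], 1: [2], 2: [0, 3], 3: [4], 4: [0]}
batches = list(neighbor_batches(in_neighbors, [0, 1, 2, 3, 4],
                                batch_size=2, expand_factor=2, num_hops=2))
```

With five seed nodes and `batch_size=2`, the sketch yields two batches, each with `num_hops + 1 = 3` layers whose last layer contains the two seed nodes of that batch.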