OpenDAS / dgl · Commit 34ae70b5 (unverified)
Authored Mar 07, 2024 by Rhett Ying; committed by GitHub, Mar 07, 2024

[DistGB] update documentation (#7201)

parent 996a9364
Changes: 2 changed files with 54 additions and 0 deletions (+54, -0)
docs/source/api/python/dgl.distributed.rst (+1, -0)
tutorials/dist/1_node_classification.py (+53, -0)
docs/source/api/python/dgl.distributed.rst
@@ -104,3 +104,4 @@ Split and Load Partitions
     load_partition_feats
     load_partition_book
     partition_graph
+    dgl_partition_to_graphbolt
tutorials/dist/1_node_classification.py
@@ -436,4 +436,57 @@ If we split the graph into four partitions as demonstrated at the beginning of t
     ip_addr3
     ip_addr4
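Each `ip_addrN` line above is an entry in `ip_config.txt`: one machine per
line, an IP address optionally followed by a port. A tiny hypothetical parser
sketch (`parse_ip_config` is not a DGL API; it only illustrates the assumed
file layout):

.. code-block:: python

    # Hypothetical helper, not part of DGL: illustrates the assumed
    # "ip [port]" per-line layout of ip_config.txt.
    def parse_ip_config(text):
        machines = []
        for line in text.splitlines():
            fields = line.split()
            if not fields:
                continue  # skip blank lines
            ip = fields[0]
            port = int(fields[1]) if len(fields) > 1 else None
            machines.append((ip, port))
        return machines

With four machines, the file simply contains four such lines, one per
partition server.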
Sample neighbors with `GraphBolt`
----------------------------------
Since DGL 2.0, DGL ships a new dataloading framework,
`GraphBolt <https://doc.dgl.ai/stochastic_training/index.html>`_, whose
sampling is substantially faster than DGL's previous implementations. As a
result, `GraphBolt` has been brought to distributed training to improve the
performance of distributed sampling. In addition, the graph partitions can be
much smaller than before, which reduces loading time and memory usage during
distributed training.
Graph partitioning
^^^^^^^^^^^^^^^^^^^
To benefit from `GraphBolt` during distributed sampling, we need to convert
partitions from `DGL` format to `GraphBolt` format with the
`dgl.distributed.dgl_partition_to_graphbolt` function. Alternatively, the
`dgl.distributed.partition_graph` function can generate partitions in
`GraphBolt` format directly.
1. Convert partitions from `DGL` format to `GraphBolt` format.

   .. code-block:: python

       part_config = "4part_data/ogbn-products.json"
       dgl.distributed.dgl_partition_to_graphbolt(part_config)

   The new partitions will be stored in the same directory as the original
   partitions.
2. Generate partitions in `GraphBolt` format directly by setting the
   `use_graphbolt` flag to `True` in the `partition_graph` function.

   .. code-block:: python

       dgl.distributed.partition_graph(graph, graph_name='ogbn-products',
                                       num_parts=4,
                                       out_path='4part_data',
                                       balance_ntypes=graph.ndata['train_mask'],
                                       balance_edges=True,
                                       use_graphbolt=True)
Enable `GraphBolt` sampling in the training script
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Set the `use_graphbolt` flag to `True` in the `dgl.distributed.initialize`
function. This is the only change needed in the training script to enable
`GraphBolt` sampling.

.. code-block:: python

    dgl.distributed.initialize('ip_config.txt', use_graphbolt=True)
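For context, the surrounding training script stays exactly as in the earlier
parts of this tutorial. A condensed sketch of that unchanged loop (the
fan-outs, batch size, and graph name are placeholders, not prescribed values;
it requires a running cluster and the partitioned data, so it is not runnable
standalone):

.. code-block:: python

    import dgl
    import torch as th

    # The only GraphBolt-specific change: pass use_graphbolt=True here.
    dgl.distributed.initialize('ip_config.txt', use_graphbolt=True)
    th.distributed.init_process_group(backend='gloo')
    g = dgl.distributed.DistGraph('ogbn-products')

    # Split the training set across workers; the neighbor sampling below
    # now runs through GraphBolt under the hood.
    train_nids = dgl.distributed.node_split(g.ndata['train_mask'])
    sampler = dgl.dataloading.NeighborSampler([25, 10])
    dataloader = dgl.dataloading.DistNodeDataLoader(
        g, train_nids, sampler, batch_size=1024, shuffle=True)

    for input_nodes, seeds, blocks in dataloader:
        ...  # forward/backward/step as usual; nothing else changes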
"""