[Doc] user guide CN chapter 6 (#2452)

* [Doc] Add 1st CN in chapter 6
* [Doc] Add 1st CN in chapter 6
* [Doc] Add 1st CN in chapter 6.1
* [Doc] Add 1st CN in chapter 6.1
* [Doc] Add 1st CN in chapter 6.2
* [Doc] Add CN in chapter 6.2
* [Doc] Add CN in chapter 6.3
* [Doc] Fix a code bug in chapter 6.3
* [Doc] Add CN in chapter 6.4, 6.5, and 6.6.
* [Doc] Add CN in chapter 6.4, 6.5, and 6.6, then remove EN parts.
* [Doc] revised 6.4, 6.5, and 6.6.
* [Doc] 2nd round of copyediting.
* Delete syn1.bin
  Remove this unrelated file to this branch
* [Doc] Update guide_cn index.rst.
* [Doc] Update guide_cn index.rst.
* [Doc] Update guide_cn index.rst.
* [Doc] Update guide_cn index.rst.
* [Doc] Add chapter 7 in index page
* [Doc] edit chapter 6 in PR
* [Doc] edit chapter 6 in PR
* [Doc] Copyediting with Murphy and Minjie's comments.
* [Doc] Fix doc line in the previous chapters.

Co-authored-by: Jinjing Zhou <VoVAllen@users.noreply.github.com>
Co-authored-by: Mufei Li <mufeili1996@gmail.com>
Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>

Unverified Commit 31f65f03 authored Jan 13, 2021 by zhjwy9343 Committed by GitHub Jan 13, 2021
20 changed files
--- a/docs/source/guide/minibatch-custom-sampler.rst
+++ b/docs/source/guide/minibatch-custom-sampler.rst
@@ -3,6 +3,8 @@
 6.4 Customizing Neighborhood Sampler
 ----------------------------------------------

+:ref:`(中文版) <guide_cn-minibatch-customizing-neighborhood-sampler>`
+
 Although DGL provides some neighborhood sampling strategies, sometimes
 users would want to write their own sampling strategy. This section
 explains how to write your own strategy and plug it into your stochastic
@@ -51,9 +53,7 @@ green nodes:
 Neighborhood sampling with pencil and paper
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

-We then consider how multi-layer message passing works for computing the
-output of a single node. In the following text we refer to the nodes
-whose GNN outputs are to be computed as *seed nodes*.
+Let's first define a DGL graph according to the above image.

 .. code:: python

@@ -67,8 +67,10 @@ whose GNN outputs are to be computed as *seed nodes*.
        [1, 2, 3, 3, 3, 4, 5, 5, 6, 5, 8, 6, 8, 9, 8, 11, 11, 10, 11,
         0, 0, 0, 1, 2, 2, 2, 3, 3, 4, 4, 5, 5, 6, 7, 7, 8, 9, 10])
    g = dgl.graph((src, dst))
-    g.ndata['x'] = torch.randn(12, 5)
-    g.ndata['y'] = torch.randn(12, 1)
+
+We then consider how multi-layer message passing works for computing the
+output of a single node. In the following text we refer to the nodes
+whose GNN outputs are to be computed as *seed nodes*.

 Finding the message passing dependency
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -153,7 +155,6 @@ second GNN layer for node 8.
   :alt: Imgur


-
 Note that the output nodes also appear in the input nodes. The reason is
 that representations of output nodes from the previous layer are needed
 for feature combination after message passing (i.e. :math:`\phi^{(2)}`).
@@ -206,8 +207,6 @@ output nodes via

 ::

-   <b>ID Mappings</b>
-
 The original node IDs of the input nodes and output nodes in the block
 can be found as the feature ``dgl.NID``, and the mapping from the
 block’s edge IDs to the input frontier’s edge IDs can be found as the

--- a/docs/source/guide/minibatch-edge.rst
+++ b/docs/source/guide/minibatch-edge.rst
@@ -3,6 +3,8 @@
 6.2 Training GNN for Edge Classification with Neighborhood Sampling
 ----------------------------------------------------------------------

+:ref:`(中文版) <guide_cn-minibatch-edge-classification-sampler>`
+
 Training for edge classification/regression is somewhat similar to that
 of node classification/regression with several notable differences.

@@ -20,7 +22,7 @@ To use the neighborhood sampler provided by DGL for edge classification,
 one needs to combine it instead with
 :class:`~dgl.dataloading.pytorch.EdgeDataLoader`, which iterates
 over a set of edges in minibatches, yielding the subgraph induced by the
-edge minibatch and ``blocks`` to be consumed by the module above.
+edge minibatch and ``blocks`` to be consumed by the module below.

 For example, the following code creates a PyTorch DataLoader that
 iterates over the training edge ID array ``train_eids`` in batches,

--- a/docs/source/guide/minibatch-inference.rst
+++ b/docs/source/guide/minibatch-inference.rst
@@ -3,6 +3,8 @@
 6.6 Exact Offline Inference on Large Graphs
 ------------------------------------------------------

+:ref:`(中文版) <guide_cn-minibatch-inference>`
+
 Both subgraph sampling and neighborhood sampling are to reduce the
 memory and time consumption for training GNNs with GPUs. When performing
 inference it is usually better to truly aggregate over all neighbors
@@ -34,7 +36,8 @@ that for every layer only the first three minibatches are drawn).
 Implementing Offline Inference
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

-Consider the two-layer GCN we have mentioned in Section 6.5.1. The way
+Consider the two-layer GCN we have mentioned in Section 6.1
+:ref:`guide-minibatch-node-classification-model`. The way
 to implement offline inference still involves using
 :class:`~dgl.dataloading.neighbor.MultiLayerFullNeighborSampler`, but sampling for
 only one layer at a time. Note that offline inference is implemented as

--- a/docs/source/guide/minibatch-link.rst
+++ b/docs/source/guide/minibatch-link.rst
@@ -3,6 +3,8 @@
 6.3 Training GNN for Link Prediction with Neighborhood Sampling
 --------------------------------------------------------------------

+:ref:`(中文版) <guide_cn-minibatch-link-classification-sampler>`
+
 Define a neighborhood sampler and data loader with negative sampling
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

@@ -262,6 +264,7 @@ Then you can give the dataloader a dictionary of edge types and edge IDs as well
 sampler.  For instance, the following iterates over all edges of the heterogeneous graph.

 .. code:: python
+
    train_eid_dict = {
        etype: g.edges(etype=etype, form='eid')
        for etype in g.etypes}

--- a/docs/source/guide/minibatch-nn.rst
+++ b/docs/source/guide/minibatch-nn.rst
@@ -3,6 +3,8 @@
 6.5 Implementing Custom GNN Module for Mini-batch Training
 -------------------------------------------------------------

+:ref:`(中文版) <guide_cn-minibatch-custom-gnn-module>`
+
 If you are familiar with how to write a custom GNN module for updating
 the entire graph for homogeneous or heterogeneous graphs (see
 :ref:`guide-nn`), the code for computing on

--- a/docs/source/guide/minibatch-node.rst
+++ b/docs/source/guide/minibatch-node.rst
@@ -3,6 +3,8 @@
 6.1 Training GNN for Node Classification with Neighborhood Sampling
 -----------------------------------------------------------------------

+:ref:`(中文版) <guide_cn-minibatch-node-classification-sampler>`
+
 To train your model stochastically, you need to do the
 following:


--- a/docs/source/guide/minibatch.rst
+++ b/docs/source/guide/minibatch.rst
@@ -3,6 +3,8 @@
 Chapter 6: Stochastic Training on Large Graphs
 =======================================================

+:ref:`(中文版) <guide_cn-minibatch>`
+
 If we have a massive graph with, say, millions or even billions of nodes
 or edges, usually full-graph training as described in
 :ref:`guide-training`

--- a/docs/source/guide_cn/data-process.rst
+++ b/docs/source/guide_cn/data-process.rst
@@ -97,9 +97,9 @@ DGL recommends that ``__getitem__(idx)`` return the tuple shown in the code above, ``(graph,
            # your own training code
            pass

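-The complete guide to training a whole-graph classification model can be found in :ref:`guide-training-graph-classification`.
+The complete guide to training a whole-graph classification model can be found in :ref:`guide_cn-training-graph-classification`.

-For more examples of whole-graph classification datasets, see :ref:`guide-training-graph-classification`:
+For more examples of whole-graph classification datasets, see :ref:`guide_cn-training-graph-classification`: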

 * :ref:`gindataset`

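@@ -195,7 +195,7 @@ DGL recommends using node masks to specify the dataset split.
    # Get the labels
    labels = graph.ndata['label']

-:ref:`guide-training-node-classification` provides a complete guide to training a node classification model.
+:ref:`guide_cn-training-node-classification` provides a complete guide to training a node classification model.

 For more examples of node classification datasets, see the following built-in datasets: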

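@@ -285,7 +285,7 @@ DGL recommends using node masks to specify the dataset split.
    # Get the edge types in the training set
    rel = graph.edata['etype'][train_idx]

-For a complete guide to training a link prediction model, see :ref:`guide-training-link-prediction`.
+For a complete guide to training a link prediction model, see :ref:`guide_cn-training-link-prediction`.

 For more examples of link prediction datasets, see DGL's built-in datasets: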


--- a/docs/source/guide_cn/index.rst
+++ b/docs/source/guide_cn/index.rst
@@ -15,91 +15,128 @@
  minibatch
  distributed

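-In September 2020, with the help of a group of enthusiastic contributors from the DGL community, the DGL *User Guide* was translated into Chinese, making it easier for Chinese-speaking users to learn and use DGL.
+In September 2020, a group of enthusiastic contributors from the DGL community translated the DGL User Guide into Chinese, making it easier for Chinese-speaking users to learn and use DGL.

 We would like to thank the following contributors: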

 .. list-table::
-   :widths: 50 25 25 50
+   :widths: 20 20 20
   :header-rows: 1

   * - Chapter
-     - Translator
     - Name / nickname
     - Personal link
   * - :ref:`guide_cn-graph`
-     - 怀文
     - 张怀文/Huaiwen Zhang
     - https://github.com/huaiwen
-   * - :ref:`guide_cn-graph-basic`, :ref:`guide_cn-graph-feature`
-     - 枫轺
+   * - :ref:`guide_cn-graph-basic`
     - 沈成 / mlsoar
     - https://github.com/mlsoar
   * - :ref:`guide_cn-graph-graphs-nodes-edges`
-     - 塔目农民
     - 张建 / zhjwy9343
     - https://github.com/zhjwy9343
+   * - :ref:`guide_cn-graph-feature`
+     - 沈成 / mlsoar
+     - https://github.com/mlsoar
   * - :ref:`guide_cn-graph-external`
-     - 枫轺
     - 沈成 / mlsoar
     - https://github.com/mlsoar
   * - :ref:`guide_cn-graph-heterogeneous`
-     - 怀文
     - 张怀文/Huaiwen Zhang
     - https://github.com/huaiwen
   * - :ref:`guide_cn-message-passing`,
-     - Brook
     - 黄崟/Brook Huang
     - https://github.com/brookhuang16211
   * - :ref:`guide_cn-message-passing-api`
-     - Brook
     - 黄崟/Brook Huang
     - https://github.com/brookhuang16211
-   * - :ref:`guide_cn-message-passing-efficient`, :ref:`guide_cn-message-passing-part`
-     - Zhiyu
+   * - :ref:`guide_cn-message-passing-efficient`
+     - 黄崟/Brook Huang
+     - https://github.com/brookhuang16211
+   * - :ref:`guide_cn-message-passing-part`
+     - 陈知雨/Zhiyu Chen
+     - https://www.zhiyuchen.com
+   * - :ref:`guide_cn-message-passing-edge`
     - 陈知雨/Zhiyu Chen
     - https://www.zhiyuchen.com
-   * - :ref:`guide_cn-message-passing-edge`, :ref:`guide_cn-message-passing-heterograph`
-     - Zhiyu
+   * - :ref:`guide_cn-message-passing-heterograph`
     - 陈知雨/Zhiyu Chen
     - https://www.zhiyuchen.com
   * - :ref:`guide_cn-nn`
-     - Zhiyu
     - 陈知雨/Zhiyu Chen
     - https://www.zhiyuchen.com
   * - :ref:`guide_cn-nn-construction`
-     - Zhiyu
     - 陈知雨/Zhiyu Chen
     - https://www.zhiyuchen.com
-   * - :ref:`guide_cn-nn-forward`, :ref:`guide_cn-nn-heterograph`
+   * - :ref:`guide_cn-nn-forward`
     - 栩栩的夏天
     -
+   * - :ref:`guide_cn-nn-heterograph`
+     - 栩栩的夏天
     -
   * - :ref:`guide_cn-data-pipeline`
-     - 吴紫薇
     - 吴紫薇/ Maggie Wu
     - https://github.com/hhhiddleston
-   * - :ref:`guide_cn-data-pipeline-dataset`, :ref:`guide_cn-data-pipeline-download`, :ref:`guide_cn-data-pipeline-process`
-     - 吴紫薇
+   * - :ref:`guide_cn-data-pipeline-dataset`
     - 吴紫薇/ Maggie Wu
     - https://github.com/hhhiddleston
-   * - :ref:`guide_cn-data-pipeline-savenload`, :ref:`guide_cn-data-pipeline-loadogb`,
-     - 王建民-DrugAI
+   * - :ref:`guide_cn-data-pipeline-download`
+     - 吴紫薇/ Maggie Wu
+     - https://github.com/hhhiddleston
+   * - :ref:`guide_cn-data-pipeline-process`
+     - 吴紫薇/ Maggie Wu
+     - https://github.com/hhhiddleston
+   * - :ref:`guide_cn-data-pipeline-savenload`
+     - 王建民/DrugAI
+     - https://github.com/AspirinCode
+   * - :ref:`guide_cn-data-pipeline-loadogb`
     - 王建民/DrugAI
     - https://github.com/AspirinCode
   * - :ref:`guide_cn-training`
-     - 王建民-DrugAI
     - 王建民/DrugAI
     - https://github.com/AspirinCode
   * - :ref:`guide_cn-training-node-classification`,
-     - 王建民-DrugAI
     - 王建民/DrugAI
     - https://github.com/AspirinCode
-   * - :ref:`guide_cn-training-edge-classification`, :ref:`guide_cn-training-link-prediction`
-     - XDH
+   * - :ref:`guide_cn-training-edge-classification`
+     - 徐东辉/DonghuiXu
+     - https://github.com/rewonderful
+   * - :ref:`guide_cn-training-link-prediction`
     - 徐东辉/DonghuiXu
     - https://github.com/rewonderful
   * - :ref:`guide_cn-training-graph-classification`
-     - ੭ ᐕ)੭*⁾⁾ 蜜糖
     - 莫佳帅子/Molasses
     - https://github.com/sleeplessai
+   * - :ref:`guide_cn-minibatch`
+     - 莫佳帅子/Molasses
+     - https://github.com/sleeplessai
+   * - :ref:`guide_cn-minibatch-node-classification-sampler`
+     - 孟凡荣/kevin-meng
+     - https://github.com/kevin-meng
+   * - :ref:`guide_cn-minibatch-edge-classification-sampler`
+     - 莫佳帅子/Molasses
+     - https://github.com/sleeplessai
+   * - :ref:`guide_cn-minibatch-link-classification-sampler`
+     - 孟凡荣/kevin-meng
+     - https://github.com/kevin-meng
+   * - :ref:`guide_cn-minibatch-customizing-neighborhood-sampler`
+     - 孟凡荣/kevin-meng
+     - https://github.com/kevin-meng
+   * - :ref:`guide_cn-minibatch-custom-gnn-module`
+     - 胡骏
+     - https://github.com/CrawlScript
+   * - :ref:`guide_cn-minibatch-inference`
+     - 胡骏
+     - https://github.com/CrawlScript
+   * - :ref:`guide_cn-distributed`
+     - 宋怡然/Yiran Song
+     - https://github.com/rr-Yiran
+   * - :ref:`guide_cn-distributed-preprocessing`
+     - 宋怡然/Yiran Song
+     - https://github.com/rr-Yiran
+   * - :ref:`guide_cn-distributed-apis`
+     - 李庆标/Qingbiao Li
+     - https://qingbiaoli.github.io/
+   * - :ref:`guide_cn-distributed-tools`
+     - 李庆标/Qingbiao Li
+     - https://qingbiaoli.github.io/
--- a/docs/source/guide_cn/message-part.rst
+++ b/docs/source/guide_cn/message-part.rst
@@ -14,4 +14,4 @@
    sg = g.subgraph(nid)
    sg.update_all(message_func, reduce_func, apply_node_func)

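-This is a common usage in mini-batch training. For more details, see :ref:`Chapter 6: Stochastic Training on Large Graphs <guide-minibatch>` in the user guide.
\ No newline at end of file
+This is a common usage in mini-batch training. For more details, see :ref:`guide_cn-minibatch` in the user guide.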
\ No newline at end of file
--- a/docs/source/guide_cn/minibatch-custom-sampler.rst
+++ b/docs/source/guide_cn/minibatch-custom-sampler.rst
+.. _guide_cn-minibatch-customizing-neighborhood-sampler:
+
+6.4 Customizing Neighborhood Sampler
+----------------------------------------------
+
+:ref:`(English Version) <guide-minibatch-customizing-neighborhood-sampler>`
+
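+Although DGL provides some neighborhood samplers, users sometimes want to write their own.
+This section explains how to write a custom sampler and plug it into a GNN training framework.
+
+Recall that in
+`How Powerful are Graph Neural Networks <https://arxiv.org/pdf/1810.00826.pdf>`__,
+message passing is defined as: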
+
+.. math::
+
+   \begin{gathered}
+     \boldsymbol{a}_v^{(l)} = \rho^{(l)} \left(
+       \left\lbrace
+         \boldsymbol{h}_u^{(l-1)} : u \in \mathcal{N} \left( v \right)
+       \right\rbrace
+     \right)
+   \\
+     \boldsymbol{h}_v^{(l)} = \phi^{(l)} \left(
+       \boldsymbol{h}_v^{(l-1)}, \boldsymbol{a}_v^{(l)}
+     \right)
+   \end{gathered}
+
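+Here :math:`\rho^{(l)}` and :math:`\phi^{(l)}` are customizable functions that aggregate the neighbor messages and combine them with the node's own representation, respectively,
+and :math:`\mathcal{N}(v)` is the set of predecessors of node :math:`v` on the directed graph :math:`\mathcal{G}` (or its neighbors if the graph is undirected).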
+
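To make the definition concrete, here is a minimal plain-Python sketch of one message passing layer. It assumes sum aggregation for :math:`\rho` and addition for :math:`\phi`; these are illustrative choices, not something the formula or DGL prescribes. The toy graph and the names ``message_passing_layer``, ``edges`` and ``h`` are hypothetical.

```python
from collections import defaultdict


def message_passing_layer(edges, h):
    """One layer: a_v = sum of h_u over predecessors u, then h_v' = h_v + a_v.

    Sum and addition stand in for rho and phi; both are assumptions.
    """
    preds = defaultdict(list)
    for u, v in edges:
        preds[v].append(u)
    new_h = {}
    for v, h_v in h.items():
        a_v = sum(h[u] for u in preds[v])   # rho: aggregate over N(v)
        new_h[v] = h_v + a_v                # phi: combine h_v with a_v
    return new_h


# Hypothetical toy graph: nodes 0 and 1 both point to node 2.
edges = [(0, 2), (1, 2)]
h = {0: 1.0, 1: 2.0, 2: 10.0}
new_h = message_passing_layer(edges, h)
print(new_h)
```

Nodes without predecessors keep their representation unchanged, because the aggregate over an empty neighborhood is zero under the sum assumption.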
+Take the figure below as an example, and suppose the red node is the target node to be updated:
+
+.. figure:: https://data.dgl.ai/asset/image/guide_6_4_0.png
+   :alt: Imgur
+
+
+Message passing needs to gather the features of its neighbors (the green nodes), as shown below:
+
+.. figure:: https://data.dgl.ai/asset/image/guide_6_4_1.png
+   :alt: Imgur
+
+
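+Neighborhood sampling with pencil and paper
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Before showing how neighborhood sampling is used in DGL, this section explains how it works, continuing with the example above.
+First, define a DGLGraph that matches the figure above.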
+
+.. code:: python
+
+    import torch
+    import dgl
+
+    src = torch.LongTensor(
+        [0, 0, 0, 1, 2, 2, 2, 3, 3, 4, 4, 5, 5, 6, 7, 7, 8, 9, 10,
+         1, 2, 3, 3, 3, 4, 5, 5, 6, 5, 8, 6, 8, 9, 8, 11, 11, 10, 11])
+    dst = torch.LongTensor(
+        [1, 2, 3, 3, 3, 4, 5, 5, 6, 5, 8, 6, 8, 9, 8, 11, 11, 10, 11,
+         0, 0, 0, 1, 2, 2, 2, 3, 3, 4, 4, 5, 5, 6, 7, 7, 8, 9, 10])
+    g = dgl.graph((src, dst))
+
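+The goal of this example is to compute the output of a single node (node 8). DGL refers to the nodes whose GNN outputs are to be computed as *seed nodes*.
+
+Finding the message passing dependency
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Suppose a 2-layer GNN is used to compute the output of the seed node 8 (the red node):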
+
+.. figure:: https://data.dgl.ai/asset/image/guide_6_4_2.png
+   :alt: Imgur
+
+
+Its message passing computation is:
+
+.. math::
+
+   \begin{gathered}
+     \boldsymbol{a}_8^{(2)} = \rho^{(2)} \left(
+       \left\lbrace
+         \boldsymbol{h}_u^{(1)} : u \in \mathcal{N} \left( 8 \right)
+       \right\rbrace
+     \right) = \rho^{(2)} \left(
+       \left\lbrace
+         \boldsymbol{h}_4^{(1)}, \boldsymbol{h}_5^{(1)},
+         \boldsymbol{h}_7^{(1)}, \boldsymbol{h}_{11}^{(1)}
+       \right\rbrace
+     \right)
+   \\
+     \boldsymbol{h}_8^{(2)} = \phi^{(2)} \left(
+       \boldsymbol{h}_8^{(1)}, \boldsymbol{a}_8^{(2)}
+     \right)
+   \end{gathered}
+
+The formula shows that computing :math:`\boldsymbol{h}_8^{(2)}` requires messages from nodes 4, 5, 7 and 11 (the green nodes), as shown in the figure below.
+
+.. figure:: https://data.dgl.ai/asset/image/guide_6_4_3.png
+   :alt: Imgur
+
+
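+The figure above hides the edges that are irrelevant to the computation and keeps only the edges along which the output node gathers messages. DGL calls this graph the *frontier* of the second GNN layer for the red node 8.
+
+DGL implements several functions for generating frontiers. For example,
+:func:`dgl.in_subgraph()` returns a subgraph that contains all the nodes of the original graph but only the incoming edges of the given nodes.
+It can be used as a frontier for message passing along all incoming edges.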
+
+.. code:: python
+
+    frontier = dgl.in_subgraph(g, [8])
+    print(frontier.all_edges())
+
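As a cross-check that needs neither DGL nor PyTorch, the in-edge frontier of node 8 can be reproduced in plain Python. This is only a sketch of the idea, not DGL's implementation; ``in_edge_frontier`` is a hypothetical helper, and the edge list repeats the ``src``/``dst`` tensors defined earlier in this section.

```python
# Forward edges of the example graph; the reverse edges are appended below,
# mirroring how `src` and `dst` are built in the snippet that defines `g`.
forward = [(0, 1), (0, 2), (0, 3), (1, 3), (2, 3), (2, 4), (2, 5), (3, 5),
           (3, 6), (4, 5), (4, 8), (5, 6), (5, 8), (6, 9), (7, 8), (7, 11),
           (8, 11), (9, 10), (10, 11)]
edges = forward + [(v, u) for u, v in forward]


def in_edge_frontier(edges, seed_nodes):
    """Keep only the edges whose destination is a seed node."""
    seeds = set(seed_nodes)
    return [(u, v) for u, v in edges if v in seeds]


frontier_edges = in_edge_frontier(edges, [8])
print(sorted(frontier_edges))
```

The surviving edges come from nodes 4, 5, 7 and 11, which are exactly the predecessors that appear in the formula for :math:`\boldsymbol{a}_8^{(2)}`.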
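+For more such functions, see :ref:`api-subgraph-extraction` and :ref:`api-sampling`.
+
+In DGL, any graph that has the same nodes as the original graph can serve as a frontier. This point comes up again in
+:ref:`guide_cn-minibatch-customizing-neighborhood-sampler-impl`.
+
+The bipartite structure for multi-layer minibatch message passing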
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
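+As the figure above shows, computing :math:`\boldsymbol{h}_8^{(2)}` from
+:math:`\boldsymbol{h}_\cdot^{(1)}` only needs nodes 4, 5, 7, 8 and 11 (the green and red nodes) as input.
+The other nodes of the original graph take no part in the computation, so running message passing directly on the frontier is wasteful.
+DGL therefore transforms the frontier into a small bipartite graph that captures only the computation dependency.
+DGL calls such a bipartite graph, which contains only the necessary input nodes and output nodes, a *block*.
+The figure below shows the block required by the second GNN layer when node 8 is the seed node.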
+
+.. figure:: https://data.dgl.ai/asset/image/guide_6_4_4.png
+   :alt: Imgur
+
+
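+Note that the output nodes also appear among the input nodes. The reason is that the feature combination after message passing
+(i.e. :math:`\phi^{(2)}`) needs the representations of the output nodes from the previous layer.
+
+DGL provides :func:`dgl.to_block` to convert any frontier into a block. The first argument specifies the frontier
+and the second argument specifies the output nodes. For example, the following code converts the frontier above into a block whose output node is node 8.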
+
+.. code:: python
+
+    output_nodes = torch.LongTensor([8])
+    block = dgl.to_block(frontier, output_nodes)
+
+To find the number of input nodes and output nodes of a given node type, use the
+:meth:`dgl.DGLHeteroGraph.number_of_src_nodes` and
+:meth:`dgl.DGLHeteroGraph.number_of_dst_nodes` methods.
+
+.. code:: python
+
+    num_input_nodes, num_output_nodes = block.number_of_src_nodes(), block.number_of_dst_nodes()
+    print(num_input_nodes, num_output_nodes)
+
+The input node features of the block can be accessed via :attr:`dgl.DGLHeteroGraph.srcdata` and
+:attr:`dgl.DGLHeteroGraph.srcnodes`,
+and its output node features via :attr:`dgl.DGLHeteroGraph.dstdata` and
+:attr:`dgl.DGLHeteroGraph.dstnodes`.
+The syntax of ``srcdata``/``dstdata`` and ``srcnodes``/``dstnodes``
+is the same as that of :attr:`dgl.DGLHeteroGraph.ndata` and :attr:`dgl.DGLHeteroGraph.nodes` on regular graphs.
+
+.. code:: python
+
+    block.srcdata['h'] = torch.randn(num_input_nodes, 5)
+    block.dstdata['h'] = torch.randn(num_output_nodes, 5)
+
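+If the frontier was obtained from a graph and then converted into a block, the features of the block's input and output nodes can be read directly as follows, provided the original graph stores node features under the names used here (``x`` and ``y``).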
+
+.. code:: python
+
+    print(block.srcdata['x'])
+    print(block.dstdata['y'])
+
+.. raw:: html
+
+   <div class="alert alert-info">
+
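+The original node IDs of the input and output nodes of a block are available as the feature ``dgl.NID``, and the mapping from
+the block's edge IDs to the edge IDs of the input frontier is available as the feature ``dgl.EID``.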
+
+.. raw:: html
+
+   </div>
+
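+**Output Nodes**
+
+DGL ensures that the output nodes of a block always appear among its input nodes. As the following code shows, the output node IDs come before all other IDs in the input nodes.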
+
+.. code:: python
+
+    input_nodes = block.srcdata[dgl.NID]
+    output_nodes = block.dstdata[dgl.NID]
+    assert torch.equal(input_nodes[:len(output_nodes)], output_nodes)
+
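+As a result, the output nodes passed to ``dgl.to_block`` must cover every node that is the destination of an edge in the frontier. For example, consider the following frontier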
+
+.. figure:: https://data.dgl.ai/asset/image/guide_6_4_5.png
+   :alt: Imgur
+
+
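+where the red and green nodes (i.e. nodes 4, 5, 7, 8 and 11) are all destinations of some edge.
+The following code raises an error because the output nodes do not cover all of these nodes.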
+
+.. code:: python
+
+    dgl.to_block(frontier2, torch.LongTensor([4, 5]))   # ERROR
+
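+However, the output nodes may contain more nodes than those above. In the example below, the output nodes include an isolated node with no incoming edge.
+Both the input nodes and the output nodes will contain such isolated nodes.
+
+.. code:: python
+
+    # Node 3 is an isolated node: no edge points to it.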
+    block3 = dgl.to_block(frontier2, torch.LongTensor([4, 5, 7, 8, 11, 3]))
+    print(block3.srcdata[dgl.NID])
+    print(block3.dstdata[dgl.NID])
+
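The relabeling behind a block can be sketched in plain Python. This is an illustration under stated assumptions, not how ``dgl.to_block`` is implemented: ``to_block_sketch`` is a hypothetical helper, and ordering the remaining input nodes by first appearance is an assumption. Placing the output nodes first and rejecting output nodes that do not cover all edge destinations follow the behavior described above.

```python
def to_block_sketch(frontier_edges, output_nodes):
    """Relabel a frontier into compact input/output node IDs.

    Output nodes come first among the input nodes, as the text states;
    the order of the remaining input nodes is an assumption of this sketch.
    """
    destinations = {v for _, v in frontier_edges}
    missing = destinations - set(output_nodes)
    if missing:
        raise ValueError(
            f"output nodes do not cover destinations {sorted(missing)}")
    input_nodes = list(output_nodes)
    for u, _ in frontier_edges:
        if u not in input_nodes:
            input_nodes.append(u)
    src_id = {n: i for i, n in enumerate(input_nodes)}
    dst_id = {n: i for i, n in enumerate(output_nodes)}
    block_edges = [(src_id[u], dst_id[v]) for u, v in frontier_edges]
    return input_nodes, list(output_nodes), block_edges


frontier_edges = [(4, 8), (5, 8), (7, 8), (11, 8)]
input_nodes, output_nodes, block_edges = to_block_sketch(frontier_edges, [8])
print(input_nodes, output_nodes, block_edges)
```

Passing an extra isolated node, for example ``to_block_sketch(frontier_edges, [8, 3])``, simply adds it to both node lists without adding any edge.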
+Heterogeneous Graphs
+^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Blocks also work on heterogeneous graphs. Suppose we have the following frontier:
+
+.. code:: python
+
+    hetero_frontier = dgl.heterograph({
+        ('user', 'follow', 'user'): ([1, 3, 7], [3, 6, 8]),
+        ('user', 'play', 'game'): ([5, 5, 4], [6, 6, 2]),
+        ('game', 'played-by', 'user'): ([2], [6])
+    }, num_nodes_dict={'user': 10, 'game': 10})
+
+One can create a block whose output nodes are ``user`` nodes 3, 6 and 8 and ``game`` nodes 2 and 6.
+
+.. code:: python
+
+    hetero_block = dgl.to_block(hetero_frontier, {'user': [3, 6, 8], 'game': [2, 6]})
+
+For this block, the input nodes and output nodes can be retrieved by node type:
+
+.. code:: python
+
+    # Input user and game nodes
+    print(hetero_block.srcnodes['user'].data[dgl.NID], hetero_block.srcnodes['game'].data[dgl.NID])
+    # Output user and game nodes
+    print(hetero_block.dstnodes['user'].data[dgl.NID], hetero_block.dstnodes['game'].data[dgl.NID])
+
+
+.. _guide_cn-minibatch-customizing-neighborhood-sampler-impl:
+
+Implementing a Custom Neighbor Sampler
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Recall that an earlier section used the following neighbor sampler for node classification.
+
+.. code:: python
+
+    sampler = dgl.dataloading.MultiLayerFullNeighborSampler(2)
+
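+To implement a custom neighborhood sampling strategy, replace the sampler object with a custom one.
+To do so, first look at
+:class:`~dgl.dataloading.dataloader.BlockSampler`,
+the parent class of
+:class:`~dgl.dataloading.neighbor.MultiLayerFullNeighborSampler`.
+
+:class:`~dgl.dataloading.dataloader.BlockSampler`
+generates a list of blocks, starting from the last layer, with the method
+:meth:`~dgl.dataloading.dataloader.BlockSampler.sample_blocks`.
+The default implementation of ``sample_blocks`` iterates backwards, generating a frontier for each layer and converting it into a block.
+
+Therefore, for neighborhood sampling, **users only need to implement the** :meth:`~dgl.dataloading.dataloader.BlockSampler.sample_frontier` **method**.
+Given the GNN layer, the original graph, and the nodes whose representations are to be computed, this method generates a frontier for them.
+
+Users must also pass the number of GNN layers to the parent class.
+
+For example, :class:`~dgl.dataloading.neighbor.MultiLayerFullNeighborSampler` can be implemented as follows.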
+
+.. code:: python
+
+    class MultiLayerFullNeighborSampler(dgl.dataloading.BlockSampler):
+        def __init__(self, n_layers):
+            super().__init__(n_layers)
+    
+        def sample_frontier(self, block_id, g, seed_nodes):
+            frontier = dgl.in_subgraph(g, seed_nodes)
+            return frontier
+
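The backward iteration attributed to ``sample_blocks`` above can be sketched without DGL. This is a simplified illustration, not the library's code: ``sample_blocks_sketch`` is a hypothetical name, the frontier is the full in-edge frontier, and the conversion of each frontier into a block is left out.

```python
def sample_blocks_sketch(edges, seed_nodes, n_layers):
    """Walk from the last layer to the first, building one frontier per layer."""
    frontiers = []
    seeds = list(seed_nodes)
    for _ in range(n_layers):
        # Full-neighbor frontier: every edge that points at a current seed.
        frontier = [(u, v) for u, v in edges if v in seeds]
        frontiers.insert(0, frontier)        # earlier layers go to the front
        # The earlier layer must produce the seeds and all of their sources.
        for u, _v in frontier:
            if u not in seeds:
                seeds.append(u)
    return frontiers, seeds


# The example graph of this section: forward edges plus their reverses.
forward = [(0, 1), (0, 2), (0, 3), (1, 3), (2, 3), (2, 4), (2, 5), (3, 5),
           (3, 6), (4, 5), (4, 8), (5, 6), (5, 8), (6, 9), (7, 8), (7, 11),
           (8, 11), (9, 10), (10, 11)]
edges = forward + [(v, u) for u, v in forward]

frontiers, first_layer_inputs = sample_blocks_sketch(edges, [8], n_layers=2)
print(sorted(first_layer_inputs))
```

For seed node 8 and two layers, the first layer needs nine input nodes; nodes 0, 1 and 9 are more than two hops away and never enter the computation.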
+:class:`dgl.dataloading.neighbor.MultiLayerNeighborSampler`
+is a more complicated neighbor sampler class that lets users sample a subset of each node's neighbors to gather messages from, as shown below.
+
+.. code:: python
+
+    class MultiLayerNeighborSampler(dgl.dataloading.BlockSampler):
+        def __init__(self, fanouts):
+            super().__init__(len(fanouts))
+    
+            self.fanouts = fanouts
+    
+        def sample_frontier(self, block_id, g, seed_nodes):
+            fanout = self.fanouts[block_id]
+            if fanout is None:
+                frontier = dgl.in_subgraph(g, seed_nodes)
+            else:
+                frontier = dgl.sampling.sample_neighbors(g, seed_nodes, fanout)
+            return frontier
+
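+The functions above generate frontiers, but any graph that has the same nodes as the original graph can serve as a frontier.
+
+For example, to randomly drop inbound edges of the seed nodes, keeping each edge with probability ``p``, one can define the sampler as follows: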
+
+.. code:: python
+
+    class MultiLayerDropoutSampler(dgl.dataloading.BlockSampler):
+        def __init__(self, p, n_layers):
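+            super().__init__(n_layers)
+    
+            self.n_layers = n_layers
+            self.p = p
+    
+        def sample_frontier(self, block_id, g, seed_nodes, *args, **kwargs):
+            # Get all inbound edges to `seed_nodes`
+            src, dst = dgl.in_subgraph(g, seed_nodes).all_edges()
+            # Keep each edge with probability p. The mask must be boolean:
+            # an integer mask would be interpreted as index selection.
+            mask = torch.zeros_like(src, dtype=torch.float).bernoulli_(self.p).bool()
+            src = src[mask]
+            dst = dst[mask]
+            # Return a frontier with the same nodes as the original graph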
+            frontier = dgl.graph((src, dst), num_nodes=g.number_of_nodes())
+            return frontier
+    
+        def __len__(self):
+            return self.n_layers
+
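The edge selection at the heart of this sampler is an independent Bernoulli draw per edge. A minimal plain-Python sketch, with the hypothetical helper ``dropout_frontier`` and Python's ``random`` module standing in for the tensor operations:

```python
import random


def dropout_frontier(in_edges, p, rng):
    """Keep each in-edge independently with probability p."""
    return [edge for edge in in_edges if rng.random() < p]


in_edges = [(4, 8), (5, 8), (7, 8), (11, 8)]
rng = random.Random(0)   # fixed seed only to make the sketch repeatable
kept = dropout_frontier(in_edges, 0.5, rng)
print(kept)
```

Only edges are removed. The node set is untouched, which is what allows the result to serve as a frontier for the original graph.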
+After implementing a custom sampler, users can create a data loader that takes the sampler
+and iterates over the seed nodes to generate lists of blocks.
+
+.. code:: python
+
+    sampler = MultiLayerDropoutSampler(0.5, 2)
+    dataloader = dgl.dataloading.NodeDataLoader(
+        g, train_nids, sampler,
+        batch_size=1024,
+        shuffle=True,
+        drop_last=False,
+        num_workers=4)
+    
+    model = StochasticTwoLayerRGCN(in_features, hidden_features, out_features)
+    model = model.cuda()
+    opt = torch.optim.Adam(model.parameters())
+    
+    for input_nodes, blocks in dataloader:
+        blocks = [b.to(torch.device('cuda')) for b in blocks]
+        input_features = blocks[0].srcdata     # returns a dict
+        output_labels = blocks[-1].dstdata     # returns a dict
+        output_predictions = model(blocks, input_features)
+        loss = compute_loss(output_labels, output_predictions)
+        opt.zero_grad()
+        loss.backward()
+        opt.step()
+
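+Custom samplers for heterogeneous graphs
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Generating a frontier for a heterogeneous graph is no different from generating one for a homogeneous graph. As long as the returned graph has the same nodes as the original graph,
+it will work. For example, the ``MultiLayerDropoutSampler`` above can be rewritten to iterate over all edge types
+so that it also works on heterogeneous graphs.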
+
+.. code:: python
+
+    class MultiLayerDropoutSampler(dgl.dataloading.BlockSampler):
+        def __init__(self, p, n_layers):
+            super().__init__(n_layers)
+    
+            self.n_layers = n_layers
+            self.p = p
+    
+        def sample_frontier(self, block_id, g, seed_nodes, *args, **kwargs):
+            # Get all inbound edges to `seed_nodes`
+            sg = dgl.in_subgraph(g, seed_nodes)
+    
+            new_edges_masks = {}
+            # Iterate over all edge types
+            for etype in sg.canonical_etypes:
+                edge_mask = torch.zeros(sg.number_of_edges(etype))
+                edge_mask.bernoulli_(self.p)
+                new_edges_masks[etype] = edge_mask.bool()
+    
+            # Return a graph with the same nodes as the original graph as the frontier
+            frontier = dgl.edge_subgraph(sg, new_edges_masks, preserve_nodes=True)
+            return frontier
+    
+        def __len__(self):
+            return self.n_layers
\ No newline at end of file
--- a/docs/source/guide_cn/minibatch-edge.rst
+++ b/docs/source/guide_cn/minibatch-edge.rst
+.. _guide_cn-minibatch-edge-classification-sampler:
+
+6.2 Training GNN for Edge Classification with Neighborhood Sampling
+----------------------------------------------------------------------
+
+:ref:`(English Version) <guide-minibatch-edge-classification-sampler>`
+
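+Training for edge classification/regression is similar to that of node classification/regression, with several notable differences.
+
+Define a neighborhood sampler and data loader
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Users can use the
+:ref:`same neighborhood samplers as for node classification <guide_cn-minibatch-node-classification-sampler>`.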
+
+.. code:: python
+
+    sampler = dgl.dataloading.MultiLayerFullNeighborSampler(2)
+
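+To use the neighborhood sampler provided by DGL for edge classification, combine it with
+:class:`~dgl.dataloading.pytorch.EdgeDataLoader`.
+:class:`~dgl.dataloading.pytorch.EdgeDataLoader` iterates over a set of edges in minibatches,
+yielding the subgraph induced by each edge minibatch together with the ``blocks`` to be consumed by the module below.
+
+For example, the following code creates a PyTorch DataLoader that iterates over the training edge ID array
+``train_eids`` in batches and yields the list of generated blocks. The blocks are moved onto the GPU later, in the training loop.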
+
+.. code:: python
+
+    dataloader = dgl.dataloading.EdgeDataLoader(
+        g, train_eids, sampler,
+        batch_size=1024,
+        shuffle=True,
+        drop_last=False,
+        num_workers=4)
+
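+For the complete list of DGL's built-in samplers, see the
+:ref:`neighborhood sampler API reference <api-dataloading-neighbor-sampling>`.
+
+To develop a custom neighborhood sampler, or for a more detailed explanation of the concept of blocks, see
+:ref:`guide_cn-minibatch-customizing-neighborhood-sampler`.
+
+Removing edges in the minibatch from the original graph for neighbor sampling
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+When training edge classification models, users sometimes wish to remove the edges that appear in the training data from the computation dependency, as if they never existed.
+Otherwise, the model will "know" that an edge exists between the two nodes and may exploit that fact to "cheat".
+
+Therefore, in edge classification with neighborhood sampling, users sometimes want to exclude the edges of the sampled minibatch, together with their reverse edges, from the original graph.
+To do so, specify ``exclude='reverse_id'`` when instantiating
+:class:`~dgl.dataloading.pytorch.EdgeDataLoader`,
+along with a mapping from each edge ID to its reverse edge ID.
+This usually makes sampling much slower, because DGL has to locate and remove the reverse edges involved in the minibatch.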
+
+.. code:: python
+
+    n_edges = g.number_of_edges()
+    dataloader = dgl.dataloading.EdgeDataLoader(
+        g, train_eids, sampler,
+
+        # The following two arguments exclude the minibatch edges and
+        # their reverse edges during neighbor sampling
+        exclude='reverse_id',
+        reverse_eids=torch.cat([
+            torch.arange(n_edges // 2, n_edges), torch.arange(0, n_edges // 2)]),
+    
+        batch_size=1024,
+        shuffle=True,
+        drop_last=False,
+        num_workers=4)
+
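The ``reverse_eids`` tensor in the snippet above assumes a particular edge layout: the second half of the edge list holds the reverses of the first half, in the same order. A plain-Python sketch of that mapping follows. The toy edges and the helper ``excluded_edge_ids`` are hypothetical; the helper only illustrates which IDs would be hidden and is not DGL's exclusion logic.

```python
forward = [(0, 1), (1, 2), (2, 3)]            # hypothetical toy edges
edges = forward + [(v, u) for u, v in forward]
n_edges = len(edges)

# Same layout as the `reverse_eids` argument above: edge i pairs with i + n/2.
reverse_eids = list(range(n_edges // 2, n_edges)) + list(range(0, n_edges // 2))


def excluded_edge_ids(batch_eids, reverse_eids):
    """IDs to hide from sampling: the batch edges plus their reverse edges."""
    return sorted(set(batch_eids) | {reverse_eids[e] for e in batch_eids})


print(excluded_edge_ids([0, 4], reverse_eids))
```

If a graph stores its reverse edges in a different order, the mapping has to be built to match that order instead.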
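+Adapt your model for minibatch training
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+An edge classification model usually consists of two parts:
+
+-  One part that obtains the representations of the incident nodes.
+-  Another part that computes a score for each class from the incident node representations.
+
+The first part is exactly the same as in
+:ref:`minibatch node classification <guide_cn-minibatch-node-classification-model>`
+and can simply be reused. The input is still the list of blocks generated by the DGL data loader, together with the input features.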
+
+.. code:: python
+
+    class StochasticTwoLayerGCN(nn.Module):
+        def __init__(self, in_features, hidden_features, out_features):
+            super().__init__()
+            self.conv1 = dglnn.GraphConv(in_features, hidden_features)
+            self.conv2 = dglnn.GraphConv(hidden_features, out_features)
+    
+        def forward(self, blocks, x):
+            x = F.relu(self.conv1(blocks[0], x))
+            x = F.relu(self.conv2(blocks[1], x))
+            return x
+
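+The input to the second part is usually the output of the first part, plus the subgraph of the original graph induced by the edges in the minibatch.
+The subgraph is yielded by the same data loader. Users can call :meth:`dgl.DGLHeteroGraph.apply_edges` to compute the scores of the edges in the edge subgraph.
+
+The following code predicts edge scores by concatenating the features of the incident nodes and projecting them with a dense layer.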
+
+.. code:: python
+
+    class ScorePredictor(nn.Module):
+        def __init__(self, num_classes, in_features):
+            super().__init__()
+            self.W = nn.Linear(2 * in_features, num_classes)
+    
+        def apply_edges(self, edges):
+            data = torch.cat([edges.src['x'], edges.dst['x']], 1)
+            return {'score': self.W(data)}
+    
+        def forward(self, edge_subgraph, x):
+            with edge_subgraph.local_scope():
+                edge_subgraph.ndata['x'] = x
+                edge_subgraph.apply_edges(self.apply_edges)
+                return edge_subgraph.edata['score']
+
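The predictor concatenates features per edge, so every edge contributes one vector of length ``2 * in_features`` to the linear layer. A plain-Python sketch of that computation, with hypothetical names (``edge_scores``, ``x``, ``weight``, ``bias``) and hand-picked weights chosen only for illustration:

```python
def edge_scores(edges, x, weight, bias):
    """Score each edge from the concatenated features of its two endpoints."""
    scores = []
    for u, v in edges:
        pair = x[u] + x[v]                    # list concatenation: length 2 * F
        scores.append([sum(w * f for w, f in zip(row, pair)) + b
                       for row, b in zip(weight, bias)])
    return scores


x = {0: [1.0, 0.0], 1: [0.0, 2.0]}            # F = 2 features per node
weight = [[1.0, 1.0, 1.0, 1.0],               # class 0: sums all four features
          [1.0, 0.0, 0.0, 0.0]]               # class 1: first source feature only
bias = [0.0, 0.5]
scores = edge_scores([(0, 1), (1, 0)], x, weight, bias)
print(scores)
```

The result has one row per edge and one column per class, the same shape as ``edge_subgraph.edata['score']`` in the module above.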
+The model takes the list of blocks and the edge subgraph generated by the data loader, as well as the input node features, as follows:
+
+.. code:: python
+
+    class Model(nn.Module):
+        def __init__(self, in_features, hidden_features, out_features, num_classes):
+            super().__init__()
+            self.gcn = StochasticTwoLayerGCN(
+                in_features, hidden_features, out_features)
+            self.predictor = ScorePredictor(num_classes, out_features)
+    
+        def forward(self, edge_subgraph, blocks, x):
+            x = self.gcn(blocks, x)
+            return self.predictor(edge_subgraph, x)
+
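+DGL ensures that the nodes in the edge subgraph are the same as the output nodes of the last block in the generated list of blocks.
+
+Training Loop
+~~~~~~~~~~~~~~~~~~~~
+
+The training loop is very similar to that of minibatch node classification. Iterate over the data loader to get the subgraph induced by the edges in the minibatch,
+as well as the list of blocks needed to compute the representations of their incident nodes.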
+
+.. code:: python
+
+    model = Model(in_features, hidden_features, out_features, num_classes)
+    model = model.cuda()
+    opt = torch.optim.Adam(model.parameters())
+    
+    for input_nodes, edge_subgraph, blocks in dataloader:
+        blocks = [b.to(torch.device('cuda')) for b in blocks]
+        edge_subgraph = edge_subgraph.to(torch.device('cuda'))
+        input_features = blocks[0].srcdata['features']
+        edge_labels = edge_subgraph.edata['labels']
+        edge_predictions = model(edge_subgraph, blocks, input_features)
+        loss = compute_loss(edge_labels, edge_predictions)
+        opt.zero_grad()
+        loss.backward()
+        opt.step()
+
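+For heterogeneous graphs
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+On heterogeneous graphs, the models that compute node representations can also be used to compute the incident node representations needed for edge classification/regression.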
+
+.. code:: python
+
+    class StochasticTwoLayerRGCN(nn.Module):
+        def __init__(self, in_feat, hidden_feat, out_feat, rel_names):
+            super().__init__()
+            self.conv1 = dglnn.HeteroGraphConv({
+                    rel : dglnn.GraphConv(in_feat, hidden_feat, norm='right')
+                    for rel in rel_names
+                })
+            self.conv2 = dglnn.HeteroGraphConv({
+                    rel : dglnn.GraphConv(hidden_feat, out_feat, norm='right')
+                    for rel in rel_names
+                })
+    
+        def forward(self, blocks, x):
+            x = self.conv1(blocks[0], x)
+            x = self.conv2(blocks[1], x)
+            return x
+
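+For score prediction, the only implementation difference between homogeneous and heterogeneous graphs is that one iterates over the edge types when calling
+:meth:`~dgl.DGLHeteroGraph.apply_edges`.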
+
+.. code:: python
+
+    class ScorePredictor(nn.Module):
+        def __init__(self, num_classes, in_features):
+            super().__init__()
+            self.W = nn.Linear(2 * in_features, num_classes)
+    
+        def apply_edges(self, edges):
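+            # Concatenate along the feature dimension so that the result has
+            # 2 * in_features columns, as self.W expects
+            data = torch.cat([edges.src['x'], edges.dst['x']], 1)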
+            return {'score': self.W(data)}
+    
+        def forward(self, edge_subgraph, x):
+            with edge_subgraph.local_scope():
+                edge_subgraph.ndata['x'] = x
+                for etype in edge_subgraph.canonical_etypes:
+                    edge_subgraph.apply_edges(self.apply_edges, etype=etype)
+                return edge_subgraph.edata['score']
+
+    class Model(nn.Module):
+        def __init__(self, in_features, hidden_features, out_features, num_classes,
+                     etypes):
+            super().__init__()
+            self.rgcn = StochasticTwoLayerRGCN(
+                in_features, hidden_features, out_features, etypes)
+            self.pred = ScorePredictor(num_classes, out_features)
+
+        def forward(self, edge_subgraph, blocks, x):
+            x = self.rgcn(blocks, x)
+            return self.pred(edge_subgraph, x)
+
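+The data loader definition is also very similar to that of node classification. The only
+difference is that :class:`~dgl.dataloading.pytorch.EdgeDataLoader` is used instead of
+:class:`~dgl.dataloading.pytorch.NodeDataLoader`, and that it is given a dictionary of edge
+types and edge ID tensors instead of a dictionary of node types and node ID tensors.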
+
+.. code:: python
+
+    sampler = dgl.dataloading.MultiLayerFullNeighborSampler(2)
+    dataloader = dgl.dataloading.EdgeDataLoader(
+        g, train_eid_dict, sampler,
+        batch_size=1024,
+        shuffle=True,
+        drop_last=False,
+        num_workers=4)
+
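+Things are a little different when excluding reverse edges on heterogeneous graphs. There,
+a reverse edge usually has a different edge type from the edge itself, in order to
+distinguish the "forward" and "backward" relations. For example, ``follow`` and
+``followed by`` are reverses of each other, as are ``purchase`` and ``purchased by``.
+
+If each edge of one type has a reverse edge with the same ID in another type, the mapping
+between edge types and their reverse types can be specified. The edges in the minibatch and
+their reverse edges are then excluded as follows.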
+
+.. code:: python
+
+    dataloader = dgl.dataloading.EdgeDataLoader(
+        g, train_eid_dict, sampler,
+    
+        # The following two arguments exclude the minibatch edges and their
+        # reverse edges from neighbor sampling
+        exclude='reverse_types',
+        reverse_etypes={'follow': 'followed by', 'followed by': 'follow',
+                        'purchase': 'purchased by', 'purchased by': 'purchase'},
+    
+        batch_size=1024,
+        shuffle=True,
+        drop_last=False,
+        num_workers=4)
+
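+The training loop is almost the same as on homogeneous graphs, except that ``compute_loss``
+takes two dictionaries here, the labels and the predictions, both keyed by edge type.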
+
+.. code:: python
+
+    model = Model(in_features, hidden_features, out_features, num_classes, etypes)
+    model = model.cuda()
+    opt = torch.optim.Adam(model.parameters())
+    
+    for input_nodes, edge_subgraph, blocks in dataloader:
+        blocks = [b.to(torch.device('cuda')) for b in blocks]
+        edge_subgraph = edge_subgraph.to(torch.device('cuda'))
+        input_features = blocks[0].srcdata['features']
+        edge_labels = edge_subgraph.edata['labels']
+        edge_predictions = model(edge_subgraph, blocks, input_features)
+        loss = compute_loss(edge_labels, edge_predictions)
+        opt.zero_grad()
+        loss.backward()
+        opt.step()
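+
+One possible ``compute_loss`` for this loop is sketched below. It is only an illustration
+under assumptions: both arguments are taken to be dictionaries keyed by edge type, holding
+integer class labels and unnormalized class scores respectively, and the edge type names in
+the toy check are made up.
+
+.. code:: python
+
+    import torch
+    import torch.nn.functional as F
+
+    def compute_loss(edge_labels, edge_predictions):
+        """Sum the cross-entropy losses of all edge types present in the minibatch."""
+        loss = 0
+        for etype, logits in edge_predictions.items():
+            if logits.shape[0] == 0:
+                # No edge of this type was sampled in the minibatch
+                continue
+            loss = loss + F.cross_entropy(logits, edge_labels[etype])
+        return loss
+
+    # Toy check with hypothetical edge types and 3 classes
+    toy_labels = {
+        ('user', 'follow', 'user'): torch.tensor([1, 0]),
+        ('user', 'purchase', 'item'): torch.tensor([2]),
+    }
+    toy_predictions = {
+        ('user', 'follow', 'user'): torch.zeros(2, 3),
+        ('user', 'purchase', 'item'): torch.zeros(1, 3),
+    }
+    toy_loss = compute_loss(toy_labels, toy_predictions)
+
+Summing over edge types weights every relation equally; averaging over all edges instead
+would be an equally reasonable choice.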
+
+`GCMC <https://github.com/dmlc/dgl/tree/master/examples/pytorch/gcmc>`__
+is an example of edge classification on a bipartite graph.
+
--- a/docs/source/guide_cn/minibatch-inference.rst
+++ b/docs/source/guide_cn/minibatch-inference.rst
+.. _guide_cn-minibatch-inference:
+
+6.6 Exact Offline Inference on Large Graphs
+------------------------------------------------------
+
+:ref:`(English Version) <guide-minibatch-inference>`
+
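+Both subgraph sampling and neighbor sampling aim to reduce the memory and time consumption of
+training GNNs on GPUs. At inference time it is usually better to aggregate over all neighbors,
+which removes the randomness introduced by sampling. However, full-graph forward propagation
+is usually infeasible on a GPU because of limited memory, and slow on a CPU. This section
+describes how to perform full-graph forward propagation under limited GPU memory, using
+minibatches and neighbor sampling.
+
+The inference algorithm differs from the training algorithm because the representations of all
+nodes must be computed layer by layer, starting from the first layer. Specifically, for a given
+layer, the output representations of all nodes are computed in minibatches. As a result, the
+inference algorithm has an outer loop over the layers and an inner loop over the minibatches
+of nodes. In contrast, the training algorithm has an outer loop over the minibatches of nodes
+and an inner loop over the layers, for both neighbor sampling and message passing.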
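+
+The loop order of offline inference can be sketched in plain Python. This is a toy
+illustration rather than DGL code: the graph, the features and the mean-aggregation ``layer``
+function are assumptions made up for the example.
+
+.. code:: python
+
+    # Toy graph given as the in-neighbors of each node (made-up data)
+    in_neighbors = {0: [1, 2], 1: [0], 2: [0, 1], 3: [2]}
+    features = {0: 1.0, 1: 2.0, 2: 3.0, 3: 4.0}
+    num_layers = 2
+    batch_size = 2
+
+    def layer(h, node):
+        """Stand-in for one GNN layer: the mean over all in-neighbors."""
+        neigh = in_neighbors[node]
+        return sum(h[u] for u in neigh) / len(neigh)
+
+    def offline_inference(x):
+        nodes = sorted(x)
+        for _ in range(num_layers):                         # outer loop: layers
+            y = {}
+            for start in range(0, len(nodes), batch_size):  # inner loop: minibatches
+                for v in nodes[start:start + batch_size]:
+                    y[v] = layer(x, v)   # reads only the previous layer's outputs
+            x = y                        # this layer's output feeds the next layer
+        return x
+
+    def full_graph_inference(x):
+        for _ in range(num_layers):
+            x = {v: layer(x, v) for v in x}
+        return x
+
+    result = offline_inference(features)
+
+Because every minibatch reads only the completed output of the previous layer, the batched
+result matches the full-graph result regardless of the batch size.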
+
+The following animation shows the computation (only the first three minibatches of each layer
+are drawn):
+
+.. figure:: https://data.dgl.ai/asset/image/guide_6_6_0.gif
+   :alt: Animation of layer-by-layer offline inference
+
+
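+Implementing Offline Inference
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Consider the two-layer GCN from Section 6.1,
+:ref:`guide_cn-minibatch-node-classification-model`. Offline inference still uses
+``MultiLayerFullNeighborSampler``, but samples for only one layer at a time. Note that
+offline inference is implemented as a method of the GNN module, because the computation on
+one layer also depends on how messages are aggregated and combined.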
+
+.. code:: python
+
+    class StochasticTwoLayerGCN(nn.Module):
+        def __init__(self, in_features, hidden_features, out_features):
+            super().__init__()
+            self.hidden_features = hidden_features
+            self.out_features = out_features
+            self.conv1 = dgl.nn.GraphConv(in_features, hidden_features)
+            self.conv2 = dgl.nn.GraphConv(hidden_features, out_features)
+            self.n_layers = 2
+    
+        def forward(self, blocks, x):
+            x_dst = x[:blocks[0].number_of_dst_nodes()]
+            x = F.relu(self.conv1(blocks[0], (x, x_dst)))
+            x_dst = x[:blocks[1].number_of_dst_nodes()]
+            x = F.relu(self.conv2(blocks[1], (x, x_dst)))
+            return x
+    
+        def inference(self, g, x, batch_size, device):
+            """Offline inference with this module."""
+            # Compute the representations layer by layer
+            for l, layer in enumerate([self.conv1, self.conv2]):
+                y = torch.zeros(g.number_of_nodes(),
+                                self.hidden_features
+                                if l != self.n_layers - 1
+                                else self.out_features)
+                sampler = dgl.dataloading.MultiLayerFullNeighborSampler(1)
+                dataloader = dgl.dataloading.NodeDataLoader(
+                    g, torch.arange(g.number_of_nodes()), sampler,
+                    batch_size=batch_size,
+                    shuffle=True,
+                    drop_last=False)
+
+                # Within a layer, iterate over the nodes in batches
+                for input_nodes, output_nodes, blocks in dataloader:
+                    block = blocks[0]
+
+                    # Copy the features of the necessary input nodes to the GPU
+                    h = x[input_nodes].to(device)
+
+                    # Compute the output; the computation is the same, but for one layer only
+                    h_dst = h[:block.number_of_dst_nodes()]
+                    h = F.relu(layer(block, (h, h_dst)))
+
+                    # Copy the output back to the CPU
+                    y[output_nodes] = h.cpu()
+
+                x = y
+    
+            return y
+
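+Note that exact offline inference is usually unnecessary when computing evaluation metrics on
+the validation set for model selection. It requires computing the representation of every
+node at every layer, which is very costly, especially in semi-supervised settings with a lot
+of unlabeled data. Neighbor sampling works well for this purpose.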
+
+For examples of offline inference, see
+`GraphSAGE <https://github.com/dmlc/dgl/blob/master/examples/pytorch/graphsage/train_sampling.py>`__
+and
+`RGCN <https://github.com/dmlc/dgl/blob/master/examples/pytorch/rgcn-hetero/entity_classify_mb.py>`__.
--- a/docs/source/guide_cn/minibatch-link.rst
+++ b/docs/source/guide_cn/minibatch-link.rst
+.. _guide_cn-minibatch-link-classification-sampler:
+
+6.3 Training GNNs for Link Prediction with Neighborhood Sampling
+--------------------------------------------------------------------
+
+:ref:`(English Version) <guide-minibatch-link-classification-sampler>`
+
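+Define a Neighborhood Sampler and Data Loader with Negative Sampling
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The same neighborhood samplers as in node/edge classification can still be used.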
+
+.. code:: python
+
+    sampler = dgl.dataloading.MultiLayerFullNeighborSampler(2)
+
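+:class:`~dgl.dataloading.pytorch.EdgeDataLoader` in DGL also supports generating negative
+samples for link prediction. To do so, provide a negative sampling function. For example,
+:class:`~dgl.dataloading.negative_sampler.Uniform` samples ``k`` negative destination nodes
+uniformly at random for the source node of each edge.
+
+The following data loader uniformly samples 5 negative destination nodes for the source node
+of each edge.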
+
+.. code:: python
+
+    dataloader = dgl.dataloading.EdgeDataLoader(
+        g, train_seeds, sampler,
+        negative_sampler=dgl.dataloading.negative_sampler.Uniform(5),
+        batch_size=args.batch_size,
+        shuffle=True,
+        drop_last=False,
+        pin_memory=True,
+        num_workers=args.num_workers)
+
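+For the built-in negative samplers, see :ref:`api-dataloading-negative-sampling`.
+
+A custom negative sampling function can also be supplied. It takes the original graph ``g``
+and the minibatch edge ID array ``eids`` as arguments, and returns a pair of source node ID
+and destination node ID arrays.
+
+The following custom negative sampler draws negative destination nodes with probability
+proportional to a power of the node degree.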
+
+.. code:: python
+
+    class NegativeSampler(object):
+        def __init__(self, g, k):
+            # Cache the (unnormalized) sampling weights
+            self.weights = g.in_degrees().float() ** 0.75
+            self.k = k
+    
+        def __call__(self, g, eids):
+            src, _ = g.find_edges(eids)
+            src = src.repeat_interleave(self.k)
+            dst = self.weights.multinomial(len(src), replacement=True)
+            return src, dst
+    
+    dataloader = dgl.dataloading.EdgeDataLoader(
+        g, train_seeds, sampler,
+        negative_sampler=NegativeSampler(g, 5),
+        batch_size=args.batch_size,
+        shuffle=True,
+        drop_last=False,
+        pin_memory=True,
+        num_workers=args.num_workers)
+
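+Adapting the Model for Minibatch Training
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+As explained in :ref:`guide_cn-training-link-prediction`, link prediction is trained by
+comparing the scores of existing edges (positive examples) against those of non-existent edges
+(negative examples). The node representation model from edge classification/regression can
+be reused to compute the edge scores.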
+
+.. code:: python
+
+    class StochasticTwoLayerGCN(nn.Module):
+        def __init__(self, in_features, hidden_features, out_features):
+            super().__init__()
+            self.conv1 = dgl.nn.GraphConv(in_features, hidden_features)
+            self.conv2 = dgl.nn.GraphConv(hidden_features, out_features)
+    
+        def forward(self, blocks, x):
+            x = F.relu(self.conv1(blocks[0], x))
+            x = F.relu(self.conv2(blocks[1], x))
+            return x
+
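+For score prediction, only a scalar score per edge is needed rather than a probability
+distribution over classes, so this example computes the score as the dot product of the
+representations of the two incident nodes.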
+
+.. code:: python
+
+    class ScorePredictor(nn.Module):
+        def forward(self, edge_subgraph, x):
+            with edge_subgraph.local_scope():
+                edge_subgraph.ndata['x'] = x
+                edge_subgraph.apply_edges(dgl.function.u_dot_v('x', 'x', 'score'))
+                return edge_subgraph.edata['score']
+
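+With a negative sampler, the DGL data loader generates three items per minibatch, in addition
+to the input node IDs:
+
+-  A positive graph containing all the edges sampled in the minibatch.
+-  A negative graph containing all the non-existent edges generated by the negative sampler.
+-  A list of blocks generated by the neighborhood sampler.
+
+The link prediction model can therefore be defined as follows. It takes the three items
+above as well as the input features.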
+
+.. code:: python
+
+    class Model(nn.Module):
+        def __init__(self, in_features, hidden_features, out_features):
+            super().__init__()
+            self.gcn = StochasticTwoLayerGCN(
+                in_features, hidden_features, out_features)
+            # Score predictor used by forward() on both graphs
+            self.predictor = ScorePredictor()
+    
+        def forward(self, positive_graph, negative_graph, blocks, x):
+            x = self.gcn(blocks, x)
+            pos_score = self.predictor(positive_graph, x)
+            neg_score = self.predictor(negative_graph, x)
+            return pos_score, neg_score
+
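+Training Loop
+~~~~~~~~~~~~~
+
+The training loop simply iterates over the data loader and feeds the graphs and the input
+features to the model defined above.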
+
+.. code:: python
+
+    model = Model(in_features, hidden_features, out_features)
+    model = model.cuda()
+    opt = torch.optim.Adam(model.parameters())
+    
+    for input_nodes, positive_graph, negative_graph, blocks in dataloader:
+        blocks = [b.to(torch.device('cuda')) for b in blocks]
+        positive_graph = positive_graph.to(torch.device('cuda'))
+        negative_graph = negative_graph.to(torch.device('cuda'))
+        input_features = blocks[0].srcdata['features']
+        pos_score, neg_score = model(positive_graph, negative_graph, blocks, input_features)
+        loss = compute_loss(pos_score, neg_score)
+        opt.zero_grad()
+        loss.backward()
+        opt.step()
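+
+The loop above leaves ``compute_loss`` undefined. A minimal sketch is given below, assuming a
+binary cross-entropy objective in which positive edges are labeled 1 and negative edges 0.
+This is one common choice, not necessarily the one used in the DGL examples, and the toy
+scores are made up.
+
+.. code:: python
+
+    import torch
+    import torch.nn.functional as F
+
+    def compute_loss(pos_score, neg_score):
+        """Binary cross-entropy over the scores of positive and negative edges."""
+        # Scores may have shape (E, 1); flatten them to shape (E,)
+        scores = torch.cat([pos_score, neg_score]).reshape(-1)
+        labels = torch.cat([
+            torch.ones(pos_score.shape[0]),
+            torch.zeros(neg_score.shape[0]),
+        ]).to(scores.device)
+        return F.binary_cross_entropy_with_logits(scores, labels)
+
+    # Toy check: 2 positive edges and 5 negatives per positive edge
+    toy_loss = compute_loss(torch.zeros(2, 1), torch.zeros(10, 1))
+
+A margin-based ranking loss is another option when each positive edge should be compared
+directly against its own negative samples.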
+
+DGL provides an example of link prediction on homogeneous graphs:
+`unsupervised GraphSAGE <https://github.com/dmlc/dgl/blob/master/examples/pytorch/graphsage/train_sampling_unsupervised.py>`__.
+
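+Minibatch Training on Heterogeneous Graphs
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The models that compute node representations on heterogeneous graphs can also be used to
+compute the representations of the incident nodes, as in edge classification/regression.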
+
+.. code:: python
+
+    class StochasticTwoLayerRGCN(nn.Module):
+        def __init__(self, in_feat, hidden_feat, out_feat, rel_names):
+            super().__init__()
+            self.conv1 = dglnn.HeteroGraphConv({
+                    rel : dglnn.GraphConv(in_feat, hidden_feat, norm='right')
+                    for rel in rel_names
+                })
+            self.conv2 = dglnn.HeteroGraphConv({
+                    rel : dglnn.GraphConv(hidden_feat, out_feat, norm='right')
+                    for rel in rel_names
+                })
+    
+        def forward(self, blocks, x):
+            x = self.conv1(blocks[0], x)
+            x = self.conv2(blocks[1], x)
+            return x
+
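+For score prediction, the only implementation difference between homogeneous and heterogeneous
+graphs is that the latter loops over all the edge types when calling
+:meth:`dgl.DGLHeteroGraph.apply_edges`.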
+
+.. code:: python
+
+    class ScorePredictor(nn.Module):
+        def forward(self, edge_subgraph, x):
+            with edge_subgraph.local_scope():
+                edge_subgraph.ndata['x'] = x
+                for etype in edge_subgraph.canonical_etypes:
+                    edge_subgraph.apply_edges(
+                        dgl.function.u_dot_v('x', 'x', 'score'), etype=etype)
+                return edge_subgraph.edata['score']
+
+    class Model(nn.Module):
+        def __init__(self, in_features, hidden_features, out_features, num_classes,
+                     etypes):
+            super().__init__()
+            self.rgcn = StochasticTwoLayerRGCN(
+                in_features, hidden_features, out_features, etypes)
+            self.pred = ScorePredictor()
+
+        def forward(self, positive_graph, negative_graph, blocks, x):
+            x = self.rgcn(blocks, x)
+            pos_score = self.pred(positive_graph, x)
+            neg_score = self.pred(negative_graph, x)
+            return pos_score, neg_score
+
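+The data loader definition is also very similar to that of edge classification/regression. The
+only difference is that a negative sampler must be provided. As before, the training edges
+are given as a dictionary of edge types and edge ID tensors.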
+
+.. code:: python
+
+    sampler = dgl.dataloading.MultiLayerFullNeighborSampler(2)
+    dataloader = dgl.dataloading.EdgeDataLoader(
+        g, train_eid_dict, sampler,
+        negative_sampler=dgl.dataloading.negative_sampler.Uniform(5),
+        batch_size=1024,
+        shuffle=True,
+        drop_last=False,
+        num_workers=4)
+
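+A custom negative sampling function takes the original graph and a dictionary of edge types
+and edge ID tensors as input. It returns a dictionary mapping each edge type to a pair of
+source and destination node ID arrays. For example: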
+
+.. code:: python
+
+   class NegativeSampler(object):
+       def __init__(self, g, k):
+           # Cache the (unnormalized) sampling weights of each edge type
+           self.weights = {
+               etype: g.in_degrees(etype=etype).float() ** 0.75
+               for _, etype, _ in g.canonical_etypes
+           }
+           self.k = k
+
+       def __call__(self, g, eids_dict):
+           result_dict = {}
+           for etype, eids in eids_dict.items():
+               src, _ = g.find_edges(eids, etype=etype)
+               src = src.repeat_interleave(self.k)
+               dst = self.weights[etype].multinomial(len(src), replacement=True)
+               result_dict[etype] = (src, dst)
+           return result_dict
+
+Then pass the dictionary of edge types and edge IDs, together with the negative sampler, to
+the data loader. For example:
+
+.. code:: python
+
+    # Map every edge type to the IDs of all its edges
+    train_eid_dict = {
+        etype: g.edges(etype=etype, form='eid')
+        for etype in g.etypes}
+
+    dataloader = dgl.dataloading.EdgeDataLoader(
+        g, train_eid_dict, sampler,
+        negative_sampler=NegativeSampler(g, 5),
+        batch_size=1024,
+        shuffle=True,
+        drop_last=False,
+        num_workers=4)
+
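+Minibatch training on heterogeneous graphs is almost the same as on homogeneous graphs, except
+that ``compute_loss`` takes two dictionaries of scores keyed by edge type, one for the
+positive graph and one for the negative graph.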
+
+.. code:: python
+
+    model = Model(in_features, hidden_features, out_features, num_classes, etypes)
+    model = model.cuda()
+    opt = torch.optim.Adam(model.parameters())
+    
+    for input_nodes, positive_graph, negative_graph, blocks in dataloader:
+        blocks = [b.to(torch.device('cuda')) for b in blocks]
+        positive_graph = positive_graph.to(torch.device('cuda'))
+        negative_graph = negative_graph.to(torch.device('cuda'))
+        input_features = blocks[0].srcdata['features']
+        pos_score, neg_score = model(positive_graph, negative_graph, blocks, input_features)
+        loss = compute_loss(pos_score, neg_score)
+        opt.zero_grad()
+        loss.backward()
+        opt.step()
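+
+A dictionary-based ``compute_loss`` could look like the sketch below. It assumes that
+``pos_score`` and ``neg_score`` are dictionaries with the same edge type keys, that at least
+one edge type has edges in the minibatch, and that a binary cross-entropy objective is
+wanted; the edge type names in the toy check are hypothetical.
+
+.. code:: python
+
+    import torch
+    import torch.nn.functional as F
+
+    def compute_loss(pos_score, neg_score):
+        """Average the binary cross-entropy losses of all non-empty edge types."""
+        losses = []
+        for etype, pos in pos_score.items():
+            neg = neg_score[etype]
+            scores = torch.cat([pos, neg]).reshape(-1)
+            if scores.numel() == 0:
+                # No edge of this type in the minibatch
+                continue
+            labels = torch.cat([
+                torch.ones(pos.numel()), torch.zeros(neg.numel())
+            ]).to(scores.device)
+            losses.append(F.binary_cross_entropy_with_logits(scores, labels))
+        return torch.stack(losses).mean()
+
+    # Toy check with made-up edge types; the second one has no edges
+    toy_pos = {'follow': torch.zeros(2, 1), 'purchase': torch.zeros(0, 1)}
+    toy_neg = {'follow': torch.zeros(10, 1), 'purchase': torch.zeros(0, 1)}
+    toy_loss = compute_loss(toy_pos, toy_neg)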
+
+
+
--- a/docs/source/guide_cn/minibatch-nn.rst
+++ b/docs/source/guide_cn/minibatch-nn.rst
+.. _guide_cn-minibatch-custom-gnn-module:
+
+6.5 Implementing Custom GNN Modules for Minibatch Training
+-------------------------------------------------------------
+
+:ref:`(English Version) <guide-minibatch-custom-gnn-module>`
+
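+If you know how to write a custom GNN module that updates an entire homogeneous or
+heterogeneous graph (see :ref:`guide_cn-nn`), the code for computing on blocks is similar,
+except that the nodes are divided into input nodes and output nodes.
+
+Take the following custom graph convolution module as an example. Note that it is not
+necessarily the most efficient implementation; it only serves as an example of a custom GNN
+module.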
+
+.. code:: python
+
+    class CustomGraphConv(nn.Module):
+        def __init__(self, in_feats, out_feats):
+            super().__init__()
+            self.W = nn.Linear(in_feats * 2, out_feats)
+    
+        def forward(self, g, h):
+            with g.local_scope():
+                g.ndata['h'] = h
+                g.update_all(fn.copy_u('h', 'm'), fn.mean('m', 'h_neigh'))
+                return self.W(torch.cat([g.ndata['h'], g.ndata['h_neigh']], 1))
+
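+A custom message passing module written for the full graph can be ported to blocks by
+rewriting only its ``forward`` function as follows. The full-graph statements are kept as
+comments so that they can be compared with the statements for blocks.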
+
+.. code:: python
+
+    class CustomGraphConv(nn.Module):
+        def __init__(self, in_feats, out_feats):
+            super().__init__()
+            self.W = nn.Linear(in_feats * 2, out_feats)
+
+        # h now holds the features of the input nodes; the features of the
+        # output nodes are its first rows
+
+        # def forward(self, g, h):
+        def forward(self, block, h):
+            # with g.local_scope():
+            with block.local_scope():
+                # g.ndata['h'] = h
+                h_src = h
+                h_dst = h[:block.number_of_dst_nodes()]
+                block.srcdata['h'] = h_src
+                block.dstdata['h'] = h_dst
+    
+                # g.update_all(fn.copy_u('h', 'm'), fn.mean('m', 'h_neigh'))
+                block.update_all(fn.copy_u('h', 'm'), fn.mean('m', 'h_neigh'))
+    
+                # return self.W(torch.cat([g.ndata['h'], g.ndata['h_neigh']], 1))
+                return self.W(torch.cat(
+                    [block.dstdata['h'], block.dstdata['h_neigh']], 1))
+
+In general, a full-graph GNN module needs the following changes to work on blocks:
+
+-  Obtain the features of the output nodes by slicing the first few rows of the input
+   features. The number of rows is given by
+   :meth:`block.number_of_dst_nodes <dgl.DGLHeteroGraph.number_of_dst_nodes>`.
+-  If the original graph has only one node type, replace
+   :attr:`g.ndata <dgl.DGLHeteroGraph.ndata>` with
+   :attr:`block.srcdata <dgl.DGLHeteroGraph.srcdata>` for features on input nodes, and with
+   :attr:`block.dstdata <dgl.DGLHeteroGraph.dstdata>` for features on output nodes.
+-  If the original graph has multiple node types, replace
+   :attr:`g.nodes <dgl.DGLHeteroGraph.nodes>` with
+   :attr:`block.srcnodes <dgl.DGLHeteroGraph.srcnodes>` for features on input nodes, and with
+   :attr:`block.dstnodes <dgl.DGLHeteroGraph.dstnodes>` for features on output nodes.
+-  Replace :meth:`g.number_of_nodes <dgl.DGLHeteroGraph.number_of_nodes>` with
+   :meth:`block.number_of_src_nodes <dgl.DGLHeteroGraph.number_of_src_nodes>` for the number
+   of input nodes, and with
+   :meth:`block.number_of_dst_nodes <dgl.DGLHeteroGraph.number_of_dst_nodes>` for the number
+   of output nodes.
+
+Heterogeneous Graphs
+~~~~~~~~~~~~~~~~~~~~
+
+Modifying a GNN module for heterogeneous graphs is similar. Take the following full-graph
+module as an example:
+
+.. code:: python
+
+    class CustomHeteroGraphConv(nn.Module):
+        def __init__(self, g, in_feats, out_feats):
+            super().__init__()
+            # nn.ModuleDict only accepts string keys, so the canonical edge
+            # types (tuples) are converted with str()
+            self.Ws = nn.ModuleDict()
+            self.Vs = nn.ModuleDict()
+            for etype in g.canonical_etypes:
+                utype, _, vtype = etype
+                self.Ws[str(etype)] = nn.Linear(in_feats[utype], out_feats[vtype])
+            for ntype in g.ntypes:
+                self.Vs[ntype] = nn.Linear(in_feats[ntype], out_feats[ntype])
+    
+        def forward(self, g, h):
+            with g.local_scope():
+                for ntype in g.ntypes:
+                    g.nodes[ntype].data['h_dst'] = self.Vs[ntype](h[ntype])
+                    g.nodes[ntype].data['h_src'] = h[ntype]
+                for etype in g.canonical_etypes:
+                    utype, _, vtype = etype
+                    g.update_all(
+                        fn.copy_u('h_src', 'm'), fn.mean('m', 'h_neigh'),
+                        etype=etype)
+                    g.nodes[vtype].data['h_dst'] = g.nodes[vtype].data['h_dst'] + \
+                        self.Ws[str(etype)](g.nodes[vtype].data['h_neigh'])
+                return {ntype: g.nodes[ntype].data['h_dst'] for ntype in g.ntypes}
+
+For ``CustomHeteroGraphConv``, the principle is to replace ``g.nodes`` with ``g.srcnodes`` or
+``g.dstnodes``, depending on whether the features belong to input or output nodes.
+
+.. code:: python
+
+    class CustomHeteroGraphConv(nn.Module):
+        def __init__(self, g, in_feats, out_feats):
+            super().__init__()
+            # nn.ModuleDict only accepts string keys, so the canonical edge
+            # types (tuples) are converted with str()
+            self.Ws = nn.ModuleDict()
+            self.Vs = nn.ModuleDict()
+            for etype in g.canonical_etypes:
+                utype, _, vtype = etype
+                self.Ws[str(etype)] = nn.Linear(in_feats[utype], out_feats[vtype])
+            for ntype in g.ntypes:
+                self.Vs[ntype] = nn.Linear(in_feats[ntype], out_feats[ntype])
+    
+        def forward(self, g, h):
+            with g.local_scope():
+                for ntype in g.ntypes:
+                    # h[ntype] is a pair of input and output node features
+                    h_src, h_dst = h[ntype]
+                    g.dstnodes[ntype].data['h_dst'] = self.Vs[ntype](h_dst)
+                    g.srcnodes[ntype].data['h_src'] = h_src
+                for etype in g.canonical_etypes:
+                    utype, _, vtype = etype
+                    g.update_all(
+                        fn.copy_u('h_src', 'm'), fn.mean('m', 'h_neigh'),
+                        etype=etype)
+                    g.dstnodes[vtype].data['h_dst'] = \
+                        g.dstnodes[vtype].data['h_dst'] + \
+                        self.Ws[str(etype)](g.dstnodes[vtype].data['h_neigh'])
+                return {ntype: g.dstnodes[ntype].data['h_dst']
+                        for ntype in g.ntypes}
+
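+Implementing Modules for Homogeneous Graphs, Bipartite Graphs, and Blocks
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+All message passing modules in DGL (see :ref:`apinn`) work on homogeneous graphs,
+unidirectional bipartite graphs (with two node types and one edge type), and blocks with one
+edge type. Essentially, the input graph and features of a built-in DGL neural network module
+must satisfy one of the following:
+
+-  If the input feature is a pair of tensors, the input graph must be unidirectional bipartite.
+-  If the input feature is a single tensor and the input graph is a block, DGL automatically
+   takes the first rows of the input node features as the output node features.
+-  If the input feature is a single tensor and the input graph is not a block, the input
+   graph must be homogeneous.
+
+For example, the following is a simplified version of :class:`dgl.nn.pytorch.SAGEConv` (which
+DGL also implements for the MXNet and TensorFlow backends). Normalization is removed and only
+mean aggregation is handled.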
+
+.. code:: python
+
+    import dgl.function as fn
+    class SAGEConv(nn.Module):
+        def __init__(self, in_feats, out_feats):
+            super().__init__()
+            self.W = nn.Linear(in_feats * 2, out_feats)
+    
+        def forward(self, g, h):
+            if isinstance(h, tuple):
+                h_src, h_dst = h
+            elif g.is_block:
+                h_src = h
+                h_dst = h[:g.number_of_dst_nodes()]
+            else:
+                h_src = h_dst = h
+                 
+            g.srcdata['h'] = h_src
+            g.dstdata['h'] = h_dst
+            # Mean aggregation over the incoming messages
+            g.update_all(fn.copy_u('h', 'm'), fn.mean('m', 'h_neigh'))
+            return F.relu(
+                self.W(torch.cat([g.dstdata['h'], g.dstdata['h_neigh']], 1)))
+
+:ref:`guide_cn-nn` walks through the code of :class:`dgl.nn.pytorch.SAGEConv`, which works on
+unidirectional bipartite graphs, homogeneous graphs, and blocks.
--- a/docs/source/guide_cn/minibatch-node.rst
+++ b/docs/source/guide_cn/minibatch-node.rst
+.. _guide_cn-minibatch-node-classification-sampler:
+
+6.1 Training GNNs for Node Classification with Neighborhood Sampling
+-----------------------------------------------------------------------
+
+:ref:`(English Version) <guide-minibatch-node-classification-sampler>`
+
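+To train a model stochastically in minibatches, do the following:
+
+- Define a neighborhood sampler.
+- Adapt the model for minibatch training.
+- Modify the training loop.
+
+The following subsections address these steps one by one.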
+
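+Define a Neighborhood Sampler and Data Loader
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+DGL provides several neighborhood sampler classes that generate the computation dependencies
+needed at each layer for the nodes whose representations are to be computed.
+
+The simplest neighborhood sampler is
+:class:`~dgl.dataloading.neighbor.MultiLayerFullNeighborSampler`, which takes all the
+neighbors of a node.
+
+To use a sampler provided by DGL, combine it with
+:class:`~dgl.dataloading.pytorch.NodeDataLoader`, which iterates over a set of nodes in
+minibatches.
+
+For example, the following code creates a PyTorch DataLoader that iterates over the training
+node ID array ``train_nids`` in batches and generates a list of blocks for each batch. The
+blocks are moved to the GPU later, in the training loop.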
+
+.. code:: python
+
+    import dgl
+    import dgl.nn as dglnn
+    import torch
+    import torch.nn as nn
+    import torch.nn.functional as F
+    
+    sampler = dgl.dataloading.MultiLayerFullNeighborSampler(2)
+    dataloader = dgl.dataloading.NodeDataLoader(
+        g, train_nids, sampler,
+        batch_size=1024,
+        shuffle=True,
+        drop_last=False,
+        num_workers=4)
+
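+Iterating over the DataLoader yields a list of specially created graphs that represent the
+computation dependencies of each layer. They are called *blocks* in DGL.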
+
+.. code:: python
+
+    input_nodes, output_nodes, blocks = next(iter(dataloader))
+    print(blocks)
+
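+Each iteration of the data loader above yields three items. ``input_nodes`` are the nodes
+needed to compute the representations of ``output_nodes``. ``blocks`` describes, for each GNN
+layer, which node representations are computed as output, which node representations are
+needed as input, and how the representations of the input nodes propagate to the output nodes.
+
+For a complete list of built-in samplers, see the
+:ref:`neighborhood sampler API reference <api-dataloading-neighbor-sampling>`.
+
+To write a custom neighborhood sampler, or for a more detailed introduction to blocks, see
+:ref:`guide_cn-minibatch-customizing-neighborhood-sampler`.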
+
+.. _guide_cn-minibatch-node-classification-model:
+
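+Adapting the Model for Minibatch Training
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+If the message passing modules are all built into DGL, the model needs only minor changes
+for minibatch training. Take a multi-layer GCN as an example. Suppose the full-graph model is
+implemented as follows: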
+
+.. code:: python
+
+    class TwoLayerGCN(nn.Module):
+        def __init__(self, in_features, hidden_features, out_features):
+            super().__init__()
+            self.conv1 = dglnn.GraphConv(in_features, hidden_features)
+            self.conv2 = dglnn.GraphConv(hidden_features, out_features)
+    
+        def forward(self, g, x):
+            x = F.relu(self.conv1(g, x))
+            x = F.relu(self.conv2(g, x))
+            return x
+
+Then all that is needed is to replace the graph ``g`` with the ``blocks`` generated above.
+
+.. code:: python
+
+    class StochasticTwoLayerGCN(nn.Module):
+        def __init__(self, in_features, hidden_features, out_features):
+            super().__init__()
+            self.conv1 = dgl.nn.GraphConv(in_features, hidden_features)
+            self.conv2 = dgl.nn.GraphConv(hidden_features, out_features)
+    
+        def forward(self, blocks, x):
+            x = F.relu(self.conv1(blocks[0], x))
+            x = F.relu(self.conv2(blocks[1], x))
+            return x
+
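+Each DGL ``GraphConv`` module above takes one element of the ``blocks`` list generated by the
+data loader as its graph argument.
+
+The :ref:`API reference of each NN module <apinn>` states whether the module accepts a block
+as an argument.
+
+To use a custom message passing module, see :ref:`guide_cn-minibatch-custom-gnn-module`.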
+
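+Training Loop
+~~~~~~~~~~~~~
+
+The training loop simply iterates over the dataset with the customized batching iterator.
+In each iteration, which yields a list of blocks:
+
+
+1. Load the node features of the input nodes onto the GPU. The node features can be stored in
+   memory or in external storage. Note that only the features of the input nodes need to be
+   loaded, as opposed to the features of all nodes in full-graph training.
+
+   If the features are stored in ``g.ndata``, they can be loaded from ``blocks[0].srcdata``,
+   the features of the input nodes of the first block, which are all the nodes needed to
+   compute the final representations.
+
+2. Feed the list of blocks and the input node features to the multi-layer GNN and get the
+   outputs.
+
+3. Load the node labels of the output nodes onto the GPU. Likewise, the node labels can be
+   stored in memory or in external storage. Again, only the labels of the output nodes need
+   to be loaded, as opposed to the labels of all nodes in full-graph training.
+
+   If the labels are stored in ``g.ndata``, they can be loaded from ``blocks[-1].dstdata``,
+   the features of the output nodes of the last block, which are the nodes whose final
+   representations are computed.
+
+4. Compute the loss and backpropagate.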
+
+.. code:: python
+
+    model = StochasticTwoLayerGCN(in_features, hidden_features, out_features)
+    model = model.cuda()
+    opt = torch.optim.Adam(model.parameters())
+    
+    for input_nodes, output_nodes, blocks in dataloader:
+        blocks = [b.to(torch.device('cuda')) for b in blocks]
+        input_features = blocks[0].srcdata['features']
+        output_labels = blocks[-1].dstdata['label']
+        output_predictions = model(blocks, input_features)
+        loss = compute_loss(output_labels, output_predictions)
+        opt.zero_grad()
+        loss.backward()
+        opt.step()
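+
+``compute_loss`` is not defined in the loop above. For multi-class node classification, a
+minimal sketch could use cross-entropy, assuming the labels are integer class indices and the
+model outputs one unnormalized score per class; the toy values are made up.
+
+.. code:: python
+
+    import torch
+    import torch.nn.functional as F
+
+    def compute_loss(output_labels, output_predictions):
+        """Cross-entropy between class indices and unnormalized class scores."""
+        return F.cross_entropy(output_predictions, output_labels)
+
+    # Toy check: 3 output nodes, 4 classes, all scores equal
+    toy_labels = torch.tensor([0, 2, 1])
+    toy_predictions = torch.zeros(3, 4)
+    toy_loss = compute_loss(toy_labels, toy_predictions)
+
+Note that ``StochasticTwoLayerGCN`` applies ``F.relu`` to its last layer, so the scores it
+produces are non-negative; dropping that final activation may suit a classification loss
+better.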
+
+DGL provides an end-to-end minibatch training example, an
+`implementation of GraphSAGE <https://github.com/dmlc/dgl/blob/master/examples/pytorch/graphsage/train_sampling.py>`__.
+
+
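+Training on Heterogeneous Graphs
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Training a graph neural network for node classification on heterogeneous graphs is similar.
+
+For instance, :ref:`guide_cn-training-rgcn-node-classification` shows how to train a 2-layer
+RGCN on the full graph. The code for minibatch training of RGCN looks very similar (with
+self-loops, non-linearity and basis decomposition removed for simplicity):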
+
+.. code:: python
+
+    class StochasticTwoLayerRGCN(nn.Module):
+        def __init__(self, in_feat, hidden_feat, out_feat, rel_names):
+            super().__init__()
+            self.conv1 = dglnn.HeteroGraphConv({
+                    rel : dglnn.GraphConv(in_feat, hidden_feat, norm='right')
+                    for rel in rel_names
+                })
+            self.conv2 = dglnn.HeteroGraphConv({
+                    rel : dglnn.GraphConv(hidden_feat, out_feat, norm='right')
+                    for rel in rel_names
+                })
+    
+        def forward(self, blocks, x):
+            x = self.conv1(blocks[0], x)
+            x = self.conv2(blocks[1], x)
+            return x
+
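+Some of the samplers provided by DGL also support heterogeneous graphs. For example,
+:class:`~dgl.dataloading.neighbor.MultiLayerFullNeighborSampler` and
+:class:`~dgl.dataloading.pytorch.NodeDataLoader` can still be used for minibatch training.
+For full-neighbor sampling, the only difference is that the training set is specified as a
+dictionary of node types and node IDs.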
+
+.. code:: python
+
+    sampler = dgl.dataloading.MultiLayerFullNeighborSampler(2)
+    dataloader = dgl.dataloading.NodeDataLoader(
+        g, train_nid_dict, sampler,
+        batch_size=1024,
+        shuffle=True,
+        drop_last=False,
+        num_workers=4)
+
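+The training loop is almost the same as on homogeneous graphs, except that ``compute_loss``
+takes two dictionaries keyed by node type: the labels and the predictions.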
+
+.. code:: python
+
+    model = StochasticTwoLayerRGCN(in_features, hidden_features, out_features, etypes)
+    model = model.cuda()
+    opt = torch.optim.Adam(model.parameters())
+    
+    for input_nodes, output_nodes, blocks in dataloader:
+        blocks = [b.to(torch.device('cuda')) for b in blocks]
+        input_features = blocks[0].srcdata     # returns a dict
+        output_labels = blocks[-1].dstdata     # returns a dict
+        output_predictions = model(blocks, input_features)
+        loss = compute_loss(output_labels, output_predictions)
+        opt.zero_grad()
+        loss.backward()
+        opt.step()
+
+DGL provides an end-to-end minibatch training
+`implementation of RGCN <https://github.com/dmlc/dgl/blob/master/examples/pytorch/rgcn-hetero/entity_classify_mb.py>`__.
\ No newline at end of file
--- a/docs/source/guide_cn/minibatch.rst
+++ b/docs/source/guide_cn/minibatch.rst
+.. _guide_cn-minibatch:
+
+Chapter 6: Stochastic Training on Large Graphs
+=======================================================
+
+:ref:`(English Version) <guide-minibatch>`
+
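+If the graph has millions or even billions of nodes or edges, full-graph training as described
+in :ref:`guide_cn-training` is usually infeasible. Consider an :math:`L`-layer graph
+convolutional network with hidden state size :math:`H` running on a graph with :math:`N`
+nodes. Storing the intermediate hidden states requires :math:`O(NLH)` memory, which easily
+exceeds the capacity of a single GPU when :math:`N` is large.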
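+
+To make the :math:`O(NLH)` estimate concrete, the following back-of-the-envelope calculation
+uses hypothetical sizes (100 million nodes, 3 layers, hidden size 256, 32-bit floats). The
+numbers are chosen only for illustration.
+
+.. code:: python
+
+    def hidden_state_bytes(num_nodes, num_layers, hidden_size, bytes_per_value=4):
+        """Memory for one hidden vector per node per layer (float32 by default)."""
+        return num_nodes * num_layers * hidden_size * bytes_per_value
+
+    full_graph = hidden_state_bytes(
+        num_nodes=100_000_000, num_layers=3, hidden_size=256)
+    print(full_graph / 2**30)  # size in GiB, roughly 286
+
+This counts the hidden states alone and ignores input features, gradients and optimizer
+state, so the actual footprint of full-graph training would be larger still.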
+
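+This chapter describes a way to perform stochastic minibatch training on large graphs, which
+does not require copying the features of all nodes onto the GPU at once.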
+
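+Overview of Neighborhood Sampling Approaches
+--------------------------------------------
+
+Neighborhood sampling methods generally work as follows. For each gradient descent step,
+select a minibatch of nodes whose final representations at the :math:`L`-th layer are to be
+computed. Then take all or some of their neighbors at the :math:`L-1`-th layer. Repeat this
+process until the input layer is reached. This iterative process builds the dependency graph
+from the output toward the input, as the figure below shows: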
+
+.. figure:: https://data.dgl.ai/asset/image/guide_6_0_0.png
+   :alt: Computation dependency graph built by neighborhood sampling
+
+This approach reduces the workload and the computation resources needed to train a GNN on a
+large graph.
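+
+The workflow can be sketched in plain Python as follows. This is a toy illustration rather
+than the DGL implementation: the graph, the ``sample_dependencies`` helper and its ``fanout``
+parameter are assumptions made up for the example.
+
+.. code:: python
+
+    import random
+
+    # Toy graph given as the in-neighbors of each node (made-up data)
+    in_neighbors = {0: [1, 2, 3], 1: [0, 4], 2: [0], 3: [2, 4], 4: [1]}
+
+    def sample_dependencies(seeds, num_layers, fanout=None, rng=random):
+        """Return one (input_nodes, output_nodes, edges) triple per layer,
+        ordered from the input layer to the output layer."""
+        layers = []
+        output_nodes = list(seeds)
+        for _ in range(num_layers):
+            edges = []
+            input_nodes = list(output_nodes)   # output nodes come first
+            for v in output_nodes:
+                neigh = in_neighbors[v]
+                if fanout is not None and len(neigh) > fanout:
+                    neigh = rng.sample(neigh, fanout)   # keep only some neighbors
+                for u in neigh:
+                    edges.append((u, v))
+                    if u not in input_nodes:
+                        input_nodes.append(u)
+            layers.append((input_nodes, output_nodes, edges))
+            output_nodes = input_nodes         # move one layer toward the input
+        return layers[::-1]
+
+    # Full neighborhoods (fanout=None) for a minibatch holding only node 0
+    deps = sample_dependencies(seeds=[0], num_layers=2)
+
+With ``fanout=None`` every neighbor is kept; passing a small ``fanout`` bounds the number of
+input nodes per layer at the cost of a noisier aggregation.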
+
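+DGL provides a few neighborhood samplers and a pipeline for training a GNN with neighborhood
+sampling, as well as ways to customize the sampling strategy.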
+
+Roadmap
+-----------
+
+The first part of this chapter covers stochastic training in different scenarios.
+
+* :ref:`guide_cn-minibatch-node-classification-sampler`
+* :ref:`guide_cn-minibatch-edge-classification-sampler`
+* :ref:`guide_cn-minibatch-link-classification-sampler`
+
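+The remaining sections cover more advanced topics, for those who wish to develop new sampling
+algorithms, implement GNN modules compatible with minibatch training, or understand how
+evaluation and inference are conducted in minibatches.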
+
+* :ref:`guide_cn-minibatch-customizing-neighborhood-sampler`
+* :ref:`guide_cn-minibatch-custom-gnn-module`
+* :ref:`guide_cn-minibatch-inference`
+
+
+.. toctree::
+    :maxdepth: 1
+    :hidden:
+    :glob:
+
+    minibatch-node
+    minibatch-edge
+    minibatch-link
+    minibatch-custom-sampler
+    minibatch-nn
+    minibatch-inference
--- a/docs/source/guide_cn/nn-forward.rst
+++ b/docs/source/guide_cn/nn-forward.rst
@@ -32,7 +32,7 @@ DGL NN模块额外增加了1个参数 :class:`dgl.DGLGraph`。``forward()`` 函
 这可能会导致模型性能不佳。但是，在 :class:`~dgl.nn.pytorch.conv.SAGEConv` 模块中，被聚合的特征将会与节点的初始特征拼接起来，
 ``forward()`` 函数的输出不会全为0。在这种情况下，无需进行此类检验。

-DGL NN模块可在不同类型的图输入中重复使用，包括：同构图、异构图（:ref:`guide_cn-graph-heterogeneous`）和子图区块（:ref:`guide-minibatch`）。
+DGL NN模块可在不同类型的图输入中重复使用，包括：同构图、异构图（:ref:`guide_cn-graph-heterogeneous`）和子图块（:ref:`guide_cn-minibatch`）。

 SAGEConv的数学公式如下：


--- a/docs/source/guide_cn/nn-heterograph.rst
+++ b/docs/source/guide_cn/nn-heterograph.rst
@@ -77,7 +77,7 @@ HeteroGraphConv的实现逻辑

 上述代码中的for循环为处理异构图计算的主要逻辑。首先我们遍历图中所有的关系(通过调用 ``canonical_etypes``)。
 通过关系名，我们可以使用g[ ``stype, etype, dtype`` ]的语法将只包含该关系的子图( ``rel_graph`` )抽取出来。
-对于二部图，输入特征将被组织为元组 ``(src_inputs[stype], dst_inputs[dtype])``。
+对于二分图，输入特征将被组织为元组 ``(src_inputs[stype], dst_inputs[dtype])``。
 接着调用用户预先注册在该关系上的NN模块，并将结果保存在outputs字典中。

 .. code::

--- a/docs/source/guide_cn/training.rst
+++ b/docs/source/guide_cn/training.rst
@@ -11,7 +11,7 @@
 本章通过使用 :ref:`guide_cn-message-passing` 中介绍的消息传递方法和 :ref:`guide_cn-nn` 中介绍的图神经网络模块，
 讲解了如何对小规模的图数据进行节点分类、边分类、链接预测和整图分类的图神经网络的训练。

-本章假设用户的图以及所有的节点和边特征都能存进GPU。对于无法全部载入的情况，请参考用户指南的 :ref:`guide-minibatch`。
+本章假设用户的图以及所有的节点和边特征都能存进GPU。对于无法全部载入的情况，请参考用户指南的 :ref:`guide_cn-minibatch`。

 后续章节的内容均假设用户已经准备好了图和节点/边的特征数据。如果用户希望使用DGL提供的数据集或其他兼容
 ``DGLDataset`` 的数据(如 :ref:`guide_cn-data-pipeline` 所述)，