"git@developer.sourcefind.cn:OpenDAS/torchaudio.git" did not exist on "afb6626c440e92423549818401afa54f4addfb39"
Unverified Commit 50f1f4ae authored by Tianjun Xiao, committed by GitHub

Remove details about message passing optimization (#2425)

* remove details about message passing optimization

* address comments
parent 8d081138
@@ -6,41 +6,12 @@
:ref:`(中文版) <guide_cn-message-passing-efficient>`
DGL optimizes memory consumption and computing speed for message
passing. The optimization includes:
- Merge multiple kernels into a single one: This is achieved by using
:meth:`~dgl.DGLGraph.update_all` to call multiple built-in functions
at once. (Speed optimization)
- Parallelism on nodes and edges: DGL abstracts edge-wise computation
:meth:`~dgl.DGLGraph.apply_edges` as a generalized sampled dense-dense
matrix multiplication (**gSDDMM**) operation and parallelizes the computing
across edges. Likewise, DGL abstracts node-wise computation
:meth:`~dgl.DGLGraph.update_all` as a generalized sparse-dense matrix
multiplication (**gSPMM**) operation and parallelizes the computing across
nodes. (Speed optimization)
- Avoid unnecessary memory copy from nodes to edges: To generate a
message that requires the feature from source and destination node,
one option is to copy the source and destination node feature to
that edge. For some graphs, the number of edges is much larger than
the number of nodes. This copy can be costly. DGL's built-in message
functions avoid this memory copy by sampling out the node feature using
entry index. (Memory and speed optimization)
- Avoid materializing feature vectors on edges: the complete message
passing process includes message generation, message aggregation and
node update. In :meth:`~dgl.DGLGraph.update_all` call, message function
and reduce function are merged into one kernel if those functions are
built-in. There is no message materialization on edges. (Memory
optimization)
According to the above, a common practice to leverage those
passing. A common practice to leverage those
optimizations is to construct one's own message passing functionality as
a combination of :meth:`~dgl.DGLGraph.update_all` calls with built-in
functions as parameters.
For some cases like
Besides that, considering that for some graphs the number of edges is much larger than the number of nodes, avoiding unnecessary memory copies from nodes to edges is beneficial. For some cases like
:class:`~dgl.nn.pytorch.conv.GATConv`,
where it is necessary to save messages on the edges, one needs to call
:meth:`~dgl.DGLGraph.apply_edges` with built-in functions. Sometimes the
......
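For illustration, here is a minimal sketch of the two practices named above: a single :meth:`~dgl.DGLGraph.update_all` call with built-in functions, and :meth:`~dgl.DGLGraph.apply_edges` with a built-in function for cases where messages must be kept on the edges. The toy graph, feature names, and sizes are assumptions made for this sketch only.

```python
import torch
import dgl
import dgl.function as fn

# Toy graph and features; the graph structure, feature names, and sizes
# are illustrative assumptions for this sketch.
g = dgl.graph(([0, 1, 2], [1, 2, 3]))           # 4 nodes, 3 edges
g.ndata['h'] = torch.randn(4, 16)

# Recommended practice: one update_all call with built-in functions.
# copy_u builds the message from the source feature and sum aggregates it
# on the destination nodes; DGL fuses both into a single kernel, so no
# per-edge message is materialized.
g.update_all(fn.copy_u('h', 'm'), fn.sum('m', 'h_sum'))

# GATConv-like case: messages must be saved on the edges, so call
# apply_edges with a built-in function rather than a Python callback.
g.ndata['el'] = torch.randn(4, 1)
g.ndata['er'] = torch.randn(4, 1)
g.apply_edges(fn.u_add_v('el', 'er', 'e'))      # per-edge scalar in g.edata['e']
```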
@@ -5,25 +5,9 @@
:ref:`(English Version) <guide-message-passing-efficient>`
DGL optimizes the memory consumption and computing speed of message passing. This includes:
DGL optimizes the memory consumption and computing speed of message passing. A common practice to leverage these optimizations is to build one's own message passing functionality with :meth:`~dgl.DGLGraph.update_all` calls that take built-in functions as parameters.
- Merging multiple kernels into a single one: this is achieved by using :meth:`~dgl.DGLGraph.update_all` to call multiple built-in functions at once. (Speed optimization)
- Parallel computation on nodes and edges: DGL abstracts edge-wise computation :meth:`~dgl.DGLGraph.apply_edges` as a generalized sampled dense-dense matrix multiplication
  (**gSDDMM**) operation and parallelizes the computation across edges. Likewise, DGL abstracts node-wise computation :meth:`~dgl.DGLGraph.update_all` as a generalized sparse-dense matrix multiplication (gSPMM) operation,
  parallelizing the computation across nodes. (Speed optimization)
- Avoiding unnecessary memory copies from nodes to edges: to generate a message that carries source and destination node features, one option is to copy those features onto the edges.
  For some graphs the number of edges is far larger than the number of nodes, so this copy can be costly. DGL's built-in message functions avoid it by gathering node features with entry indices.
  (Memory and speed optimization)
- Avoiding materializing feature vectors on edges: the complete message passing process consists of message generation, message aggregation, and node update.
  In an :meth:`~dgl.DGLGraph.update_all` call, the message function and the reduce function are fused into a single kernel if both are built-in,
  so no message objects are stored. (Memory optimization)
Based on the above, a common practice to leverage these optimizations is to build one's own message passing functionality with :meth:`~dgl.DGLGraph.update_all` calls that take built-in functions as parameters.
For some cases, such as :class:`~dgl.nn.pytorch.conv.GATConv`, the computation has to save messages on the edges,
Besides that, considering that for some graphs the number of edges is far larger than the number of nodes, DGL recommends avoiding unnecessary memory copies from nodes to edges. For some cases, such as :class:`~dgl.nn.pytorch.conv.GATConv`, the computation has to save messages on the edges,
so the user needs to call :meth:`~dgl.DGLGraph.apply_edges` with built-in functions. Sometimes the messages on the edges can be high-dimensional, which is memory consuming.
DGL recommends reducing the feature dimension of edges as much as possible.
......
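Related to the last point, keeping edge features low-dimensional, a common pattern is to project node features on the nodes first and only combine the small projected vectors on the edges. The sketch below uses assumed module names and sizes and mirrors the idea behind GATConv rather than quoting its actual code.

```python
import torch
import torch.nn as nn
import dgl
import dgl.function as fn

in_dim, out_dim = 16, 8                      # illustrative sizes
proj_src = nn.Linear(in_dim, out_dim)        # hypothetical per-node projections
proj_dst = nn.Linear(in_dim, out_dim)

g = dgl.graph(([0, 1, 2], [1, 2, 3]))
feat = torch.randn(4, in_dim)

# Project per node (|V| rows) instead of per edge (|E| rows), then let the
# built-in u_add_v combine the two projections on each edge, so only the
# small out_dim vector is ever materialized per edge.
g.ndata['h_src'] = proj_src(feat)
g.ndata['h_dst'] = proj_dst(feat)
g.apply_edges(fn.u_add_v('h_src', 'h_dst', 'score'))
```

Splitting one linear layer over the concatenated source and destination features into two per-node projections gives the same result while never copying the full-width node features onto the edges.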