Graph Convolutional Networks (GCN)
============

- Paper link: [https://arxiv.org/abs/1609.02907](https://arxiv.org/abs/1609.02907)
- Author's code repo: [https://github.com/tkipf/gcn](https://github.com/tkipf/gcn). Note that the original code for the paper is implemented in TensorFlow.

The folder contains two different implementations using DGL.

Batched GCN (gcn.py)
-----------
Defining the model on a single node or edge makes it hard to fully utilize GPUs. As a result, DGL allows users to define the model on a *batch of* nodes and edges.

* The message function `gcn_msg` computes the messages for a batch of edges. Here, `edge.src` holds the batched representation of the source endpoints of the edges. The function simply returns the source node representations.
  ```python
  def gcn_msg(edge):
    # edge.src contains the node data on the source node.
    # It's a dictionary. edge.src['h'] is a tensor of shape (B, D).
    # B is the number of edges being batched.
    return {'m': edge.src['h']}
  ```
* The reduce function `gcn_reduce` likewise accumulates messages for a batch of nodes. The messages are batched on the second dimension of `node.mailbox['m']`, which corresponds to the neighbors of the nodes:
  ```python
  def gcn_reduce(node):
    # node.mailbox contains the messages sent by `gcn_msg`.
    # node.mailbox['m'] is a tensor of shape (B, deg, D). B is the number of nodes in the batch;
    # deg is the number of in-coming messages; D is the message dimension. DGL guarantees
    # that all the nodes in a batch have the same in-degree (through "degree-bucketing").
    # Summing over the second dimension accumulates all the in-coming messages.
    return {'accum': mx.nd.sum(node.mailbox['m'], 1)}
  ```
* The update function `NodeUpdateModule` computes the new node representation `h` by applying a non-linear transformation to the accumulated messages and concatenating the result with the current representation.
  ```python
  class NodeUpdateModule(gluon.Block):
    def __init__(self, out_feats, activation=None):
      super(NodeUpdateModule, self).__init__()
      self.linear = gluon.nn.Dense(out_feats, activation=activation)

    def forward(self, node):
      accum = self.linear(node.data['accum'])
      return {'h': mx.nd.concat(node.data['h'], accum, dim=1)}
  ```
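
The batched reduce semantics above can be sketched in plain NumPy (this is only an illustration, not DGL code; the shapes mirror `node.mailbox['m']` and the toy values are made up):

```python
import numpy as np

# Hypothetical mailbox for a degree bucket: B=2 nodes, each with deg=3
# in-coming messages of dimension D=4, mirroring the (B, deg, D) shape above.
B, deg, D = 2, 3, 4
mailbox = np.arange(B * deg * D, dtype=np.float32).reshape(B, deg, D)

# Summing over the second dimension accumulates, for each node in the
# batch, all of its in-coming messages -- the same operation as gcn_reduce.
accum = mailbox.sum(axis=1)

print(accum.shape)  # (2, 4)
```

Because every node in a bucket has the same in-degree, the mailbox is a dense tensor and the reduction is a single vectorized operation.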

After defining the functions on each node/edge, message passing is triggered by calling `update_all` on the `DGLGraph` object (in the GCN module).
```python
self.g.update_all(gcn_msg, gcn_reduce, layer)
```

Batched GCN with spMV optimization (gcn_spmv.py)
-----------
Batched computation is much more efficient than the naive vertex-centric approach, but it is still not ideal. For example, the batched message function needs to look up the source node data and save it on the edges; such lookups are very common and incur extra memory copies. In fact, the message and reduce phases of the GCN model can be fused into one sparse matrix-vector multiplication (spMV). Therefore, DGL provides many built-in message/reduce functions, which let it recognize such optimization opportunities. In gcn_spmv.py, the user only needs to write the update module and trigger the message passing as follows:
```python
self.g.update_all(fn.copy_src(src='h', out='m'), fn.sum(msg='m', out='h'), layer)
```
Here, `fn.copy_src` and `fn.sum` are built-in message and reduce functions that perform the same operations as `gcn_msg` and `gcn_reduce` in gcn.py.
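
To see why copying source features and sum-reducing fuse into an spMV, here is a small NumPy sketch (not DGL code; the toy graph and feature values are made up for illustration). Multiplying the adjacency matrix by the node-feature matrix gives each node the sum of its in-neighbors' features in one step:

```python
import numpy as np

# Toy directed graph with 3 nodes and edges 0->1, 0->2, 1->2.
# A[dst, src] = 1, so row `dst` selects the features of dst's in-neighbors.
A = np.array([[0, 0, 0],
              [1, 0, 0],
              [1, 1, 0]], dtype=np.float32)

# Node features, shape (num_nodes, D).
H = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]], dtype=np.float32)

# copy_src + sum fused as one matrix-vector product per feature column:
# each node's new feature is the sum of its in-neighbors' features.
H_new = A @ H

print(H_new)
```

In a real graph `A` is sparse, so the whole message-plus-reduce phase becomes a single spMV kernel with no per-edge feature copies.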