"doc/vscode:/vscode.git/clone" did not exist on "41ce92bb2287f99d66cd614b4c838af056a289ce"
Unverified Commit 8a07ab77 authored by Minjie Wang, committed by GitHub

[Doc] Tutorials re-organization (#2683)

* reorg

* change titles

* rm some stale API doc; minor fix

* fix docs

* add warning

* rm new-tutorial run in ci

* lint
parent 0fc64952
......@@ -107,7 +107,7 @@ DGL provides an interface for distributed tensors similar to that of ordinary single-machine tensors, for accessing
.. code:: python
tensor = dgl.distributed.DistTensor((g.number_of_nodes(), 10), th.float32, name=test)
tensor = dgl.distributed.DistTensor((g.number_of_nodes(), 10), th.float32, name='test')
**Note**: Creating a :class:`~dgl.distributed.DistTensor` is a synchronous operation. All trainers must invoke the creation,
and it succeeds only after all trainers have called it.
......
......@@ -53,7 +53,7 @@ The JSON file contains the configurations of all the partitions. If this API does not assign nodes and edges
.. code:: python
dgl.distributed.partition_graph(g, graph_name, 4, /tmp/test, balance_ntypes=g.ndata[train_mask])
dgl.distributed.partition_graph(g, 'graph_name', 4, '/tmp/test', balance_ntypes=g.ndata['train_mask'])
In addition to balancing the node types, :func:`dgl.distributed.partition_graph` also allows balancing the in-degrees
of nodes of each type across partitions by specifying ``balance_edges``. This balances the number of edges incident to nodes of different types.
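For example, a sketch reusing the call above with edge balancing also enabled:

.. code:: python

    dgl.distributed.partition_graph(g, 'graph_name', 4, '/tmp/test',
                                    balance_ntypes=g.ndata['train_mask'],
                                    balance_edges=True)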
......
......@@ -27,9 +27,9 @@ Through its core data structure :class:`~dgl.DGLGraph`, DGL provides a graph-centric
:hidden:
:glob:
graph_cn-basic
graph_cn-graphs-nodes-edges
graph_cn-feature
graph_cn-external
graph_cn-heterogeneous
graph_cn-gpu
graph-basic
graph-graphs-nodes-edges
graph-feature
graph-external
graph-heterogeneous
graph-gpu
User Guide
==========
(work in progress)
.. toctree::
:maxdepth: 2
:titlesonly:
......
.. _guide_cn-message-passing:
Chapter 2: Message Passing
================
===========================
:ref:`(English Version) <guide-message-passing>`
Message passing is a general framework and programming paradigm for implementing GNNs. It summarizes the implementations of a variety of GNN models from the perspective of aggregation and update.
The Message Passing Paradigm
----------
----------------------
Suppose that node :math:`v` has feature :math:`x_v\in\mathbb{R}^{d_1}` and edge :math:`({u}, {v})` has feature :math:`w_{e}\in\mathbb{R}^{d_2}`.
The **message passing paradigm** defines the following edge-wise and node-wise computations:
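Restating the formulation from the corresponding English guide chapter, with :math:`\phi` denoting the **message function** defined on each edge:

Edge-wise: :math:`m_{e}^{(t+1)} = \phi \left( x_v^{(t)}, x_u^{(t)}, w_{e}^{(t)} \right), \quad (u, e, v) \in \mathcal{E}.`

Node-wise: :math:`x_v^{(t+1)} = \psi \left( x_v^{(t)}, \rho \left( \left\lbrace m_{e}^{(t+1)} : (u, e, v) \in \mathcal{E} \right\rbrace \right) \right).`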
......@@ -21,7 +21,7 @@
The **aggregate function** :math:`\rho` aggregates the messages a node has received. The **update function** :math:`\psi` combines the aggregated messages with the node's own feature to update the node's feature.
Roadmap of This Chapter
--------
--------------------
This chapter first introduces DGL's message passing APIs, and then explains how to apply them efficiently on nodes and edges. Its final section explains how to implement message passing on heterographs.
......
......@@ -3,75 +3,8 @@
You can adapt this file completely to your liking, but it should at least
contain the root `toctree` directive.
Overview of DGL
===============
Deep Graph Library (DGL) is a Python package built for easy implementation of
the graph neural network model family, on top of existing DL frameworks (e.g.,
PyTorch, MXNet, Gluon).
DGL reduces the implementation of graph neural networks to declaring a set
of *functions* (or *modules* in PyTorch terminology). In addition, DGL
provides:
* Versatile controls over message passing, ranging from low-level operations
such as sending along selected edges and receiving on specific nodes, to
high-level control such as graph-wide feature updates.
* Transparent speed optimization with automatic batching of computations and
sparse matrix multiplication.
* Seamless integration with existing deep learning frameworks.
* Easy and friendly interfaces for node/edge feature access and graph
structure manipulation.
* Good scalability to graphs with tens of millions of vertices.
To begin with, we have prototyped 10 models across various domains:
semi-supervised learning on graphs (with potentially billions of nodes/edges),
generative models on graphs, (previously) difficult-to-parallelize tree-based
models like TreeLSTM, etc. We also implement some conventional models in DGL
from a new graph-centric perspective that yields simpler implementations.
Getting Started
---------------
* :doc:`Installation<install/index>`.
* :doc:`Quickstart tutorial<tutorials/basics/1_first>` for absolute beginners.
* :doc:`User guide<guide/index>`.
* :doc:`用户指南(User guide)中文版<guide_cn/index>`.
* :doc:`API reference manual<api/python/index>`.
* :doc:`End-to-end model tutorials<tutorials/models/index>` for learning DGL by popular models on graphs.
..
Follow the :doc:`instructions<install/index>` to install DGL.
:doc:`The introductory tutorial <new-tutorial/1_introduction>` is the most common place to get started.
It offers a broad experience of using DGL for deep learning on graph data.
The API reference lists detailed specifications of each API and GNN module,
a useful manual for in-depth developers.
You can learn other basic concepts of DGL through the dedicated tutorials.
* Learn constructing, saving and loading graphs with node and edge features :doc:`here<new-tutorial/2_dglgraph>`.
* Learn performing computation on graph using message passing :doc:`here<new-tutorial/3_message_passing>`.
* Learn link prediction with DGL :doc:`here<new-tutorial/4_link_predict>`.
* Learn graph classification with DGL :doc:`here<new-tutorial/5_graph_classification>`.
* Learn creating your own dataset for DGL :doc:`here<new-tutorial/6_load_data>`.
* Learn working with heterogeneous graph data :doc:`here<tutorials/basics/5_hetero>`.
End-to-end model tutorials are other good starting points for learning DGL and popular
models on graphs. The model tutorials are categorized based on the way they utilize DGL APIs.
* :ref:`Graph Neural Network and its variant <tutorials1-index>`: Learn how to use DGL to train
popular **GNN models** on one input graph.
* :ref:`Dealing with many small graphs <tutorials2-index>`: Learn how to train models for
many graph samples such as sentence parse trees.
* :ref:`Generative models <tutorials3-index>`: Learn how to deal with **dynamically-changing graphs**.
* :ref:`Old (new) wines in new bottle <tutorials4-index>`: Learn how to combine DGL with tensor-based
  frameworks in a flexible way, and explore new perspectives on traditional models through graphs.
* :ref:`Training on giant graphs <tutorials5-index>`: Learn how to train graph neural networks
on giant graphs.
Each tutorial is accompanied by a runnable Python script and Jupyter notebook that
can be downloaded. If you would like the tutorials improved, please raise a GitHub issue.
Welcome to Deep Graph Library Tutorials and Documentation
=========================================================
.. toctree::
:maxdepth: 1
......@@ -80,33 +13,19 @@ Getting Started
:glob:
install/index
install/backend
tutorials/blitz/index
.. toctree::
:maxdepth: 2
:caption: Tutorials
:hidden:
:glob:
new-tutorial/blitz/index
new-tutorial/large/index
.. toctree::
:maxdepth: 3
:caption: Model Examples
:hidden:
:glob:
tutorials/models/index
.. toctree::
:maxdepth: 2
:caption: User Guide
:caption: Advanced Materials
:hidden:
:titlesonly:
:glob:
guide/index
guide_cn/index
tutorials/large/index
tutorials/models/index
.. toctree::
:maxdepth: 2
......@@ -121,6 +40,7 @@ Getting Started
api/python/dgl.distributed
api/python/dgl.function
api/python/nn
api/python/nn.functional
api/python/dgl.ops
api/python/dgl.optim
api/python/dgl.sampling
......@@ -145,34 +65,39 @@ Getting Started
env_var
resources
Relationship of DGL to other frameworks
---------------------------------------
DGL is designed to be compatible with and agnostic to existing tensor
frameworks. It provides a backend adapter interface that allows easy porting
to other tensor-based, autograd-enabled frameworks.
Deep Graph Library (DGL) is a Python package built for easy implementation of
the graph neural network model family, on top of existing DL frameworks (currently
supporting PyTorch, MXNet and TensorFlow). It offers versatile control of message passing,
speed optimization via auto-batching and highly tuned sparse matrix kernels,
and multi-GPU/CPU training to scale to graphs of hundreds of millions of
nodes and edges.
Free software
Getting Started
---------------
For absolute beginners, start with the :doc:`Blitz Introduction to DGL <tutorials/blitz/index>`.
It covers the basic concepts of common graph machine learning tasks and gives a step-by-step
walkthrough of building Graph Neural Networks (GNNs) to solve them.
For experienced users who wish to learn more advanced usage,
* `Learn DGL by examples <https://github.com/dmlc/dgl/tree/master/examples>`_.
* Read the :doc:`User Guide<guide/index>` (:doc:`Chinese version<guide_cn/index>`), which explains the concepts
  and usage of DGL in much more detail.
* Go through the tutorials for :doc:`Stochastic Training of GNNs <tutorials/large/index>`,
which covers the basic steps for training GNNs on large graphs in mini-batches.
* :doc:`Study classical papers <tutorials/models/index>` on graph machine learning alongside DGL.
* Search for the usage of a specific API in the :doc:`API reference manual <api/python/index>`,
which organizes all DGL APIs by their namespace.
Contribution
-------------
DGL is free software; you can redistribute it and/or modify it under the terms
of the Apache License 2.0. We welcome contributions.
Join us on `GitHub <https://github.com/dmlc/dgl>`_ and check out our
:doc:`contribution guidelines <contribute>`.
History
-------
The prototype of DGL started in early spring 2018 at NYU Shanghai by Prof. `Zheng
Zhang <https://shanghai.nyu.edu/academics/faculty/directory/zheng-zhang>`_ and
Quan Gan. Serious development began when `Minjie
<https://jermainewang.github.io/>`_, `Lingfan <https://cs.nyu.edu/~lingfan/>`_
and Prof. `Jinyang Li <http://www.news.cs.nyu.edu/~jinyang/>`_ from NYU's
system group joined, flanked by a team of student volunteers at NYU Shanghai,
Fudan and other universities (Yu, Zihao, Murphy, Allen, Qipeng, Qi, Hao), as
well as early adopters at the CILVR lab (Jake Zhao). Development accelerated
when the AWS MXNet Science team joined forces, with Da Zheng, Alex Smola, Haibin
Lin, Chao Ma and a number of others. For full credit, see `here
<https://www.dgl.ai/ack>`_.
Index
-----
* :ref:`genindex`
.. _backends:
Working with different backends
===============================
DGL supports PyTorch, MXNet and TensorFlow backends.
DGL chooses the backend based on the following options (from highest to lowest priority):
- The `DGLBACKEND` environment variable
- You can use `DGLBACKEND=[BACKEND] python gcn.py ...` to specify the backend
- Or `export DGLBACKEND=[BACKEND]` to set the global environment variable
- `config.json` file under "~/.dgl"
- You can use `python -m dgl.backend.set_default_backend [BACKEND]` to set the default backend
Currently, `BACKEND` can be one of `mxnet`, `pytorch`, or `tensorflow`.
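For example, using the two commands listed above:

.. code:: bash

    DGLBACKEND=tensorflow python gcn.py                    # one-off, per run
    python -m dgl.backend.set_default_backend tensorflow   # persistent default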
PyTorch backend
---------------
Export ``DGLBACKEND`` as ``pytorch`` to specify the PyTorch backend. The required PyTorch
version is 1.5.0 or later. See `pytorch.org <https://pytorch.org>`_ for installation instructions.
MXNet backend
-------------
Export ``DGLBACKEND`` as ``mxnet`` to specify the MXNet backend. The required MXNet version is
1.5 or later. See `mxnet.apache.org <https://mxnet.apache.org/get_started>`_ for installation
instructions.
MXNet uses uint32 as the default data type for integer tensors, which only supports graphs
with fewer than 2^32 nodes or edges. To enable large graph training, *build* MXNet with the ``USE_INT64_TENSOR_SIZE=1``
flag. See `this FAQ <https://mxnet.apache.org/api/faq/large_tensor_support>`_ for more information.
MXNet 1.5 and later has an option to enable Numpy shape mode for ``NDArray`` objects, and some
DGL models need this mode to run correctly. However, this mode may not be compatible with
model parameters pretrained with the mode disabled, e.g., pretrained models from GluonCV and GluonNLP.
By setting ``DGL_MXNET_SET_NP_SHAPE``, users can switch this mode on or off.
Tensorflow backend
------------------
Export ``DGLBACKEND`` as ``tensorflow`` to specify the TensorFlow backend. The required TensorFlow
version is 2.2.0 or later. See `tensorflow.org <https://www.tensorflow.org/install>`_ for installation
instructions. In addition, DGL will set ``TF_FORCE_GPU_ALLOW_GROWTH`` to ``true`` to prevent TensorFlow from taking over the whole GPU memory:
.. code:: bash
pip install "tensorflow>=2.2.0" # when using tensorflow cpu version
Install DGL
===========
This topic explains how to install DGL. We recommend installing DGL by using ``conda`` or ``pip``.
Install and Setup
=================
System requirements
-------------------
......@@ -22,7 +20,8 @@ CPU build, then the CPU build is overwritten.
Install from Conda or Pip
-------------------------
Check out the `Get Started page <https://www.dgl.ai/pages/start.html>`_.
We recommend installing DGL by ``conda`` or ``pip``.
Check out the instructions on the `Get Started page <https://www.dgl.ai/pages/start.html>`_.
.. _install-from-source:
......@@ -63,15 +62,14 @@ configuration as you wish. For example, changing ``USE_CUDA`` to ``ON`` will
enable a CUDA build. You could also pass ``-DKEY=VALUE`` to the cmake command
for the same purpose.
- CPU-only build
.. code:: bash
* CPU-only build::
mkdir build
cd build
cmake ..
make -j4
- CUDA build
.. code:: bash
* CUDA build::
mkdir build
cd build
......@@ -125,8 +123,7 @@ You can build DGL with MSBuild. With `MS Build Tools <https://go.microsoft.com/
and `CMake on Windows <https://cmake.org/download/>`_ installed, run the following
in VS2019 x64 Native tools command prompt.
- CPU only build
.. code::
* CPU only build::
MD build
CD build
......@@ -134,8 +131,8 @@ in VS2019 x64 Native tools command prompt.
msbuild dgl.sln /m
CD ..\python
python setup.py install
- CUDA build
.. code::
* CUDA build::
MD build
CD build
......@@ -144,9 +141,61 @@ in VS2019 x64 Native tools command prompt.
CD ..\python
python setup.py install
Optional Flags
``````````````
Compilation Flags
`````````````````
See `config.cmake <https://github.com/dmlc/dgl/blob/master/cmake/config.cmake>`_.
.. _backends:
Working with different backends
-------------------------------
DGL supports PyTorch, MXNet and TensorFlow backends.
DGL chooses the backend based on the following options (from highest to lowest priority):
* Use the ``DGLBACKEND`` environment variable:
- You can use ``DGLBACKEND=[BACKEND] python gcn.py ...`` to specify the backend
- Or ``export DGLBACKEND=[BACKEND]`` to set the global environment variable
* Modify the ``config.json`` file under "~/.dgl":
- You can use ``python -m dgl.backend.set_default_backend [BACKEND]`` to set the default backend
Currently, ``BACKEND`` can be one of ``mxnet``, ``pytorch``, or ``tensorflow``.
PyTorch backend
```````````````
Export ``DGLBACKEND`` as ``pytorch`` to specify the PyTorch backend. The required PyTorch
version is 1.5.0 or later. See `pytorch.org <https://pytorch.org>`_ for installation instructions.
MXNet backend
`````````````
Export ``DGLBACKEND`` as ``mxnet`` to specify the MXNet backend. The required MXNet version is
1.5 or later. See `mxnet.apache.org <https://mxnet.apache.org/get_started>`_ for installation
instructions.
MXNet uses uint32 as the default data type for integer tensors, which only supports graphs
with fewer than 2^32 nodes or edges. To enable large graph training, *build* MXNet with the ``USE_INT64_TENSOR_SIZE=1``
flag. See `this FAQ <https://mxnet.apache.org/api/faq/large_tensor_support>`_ for more information.
MXNet 1.5 and later has an option to enable Numpy shape mode for ``NDArray`` objects, and some
DGL models need this mode to run correctly. However, this mode may not be compatible with
model parameters pretrained with the mode disabled, e.g., pretrained models from GluonCV and GluonNLP.
By setting ``DGL_MXNET_SET_NP_SHAPE``, users can switch this mode on or off.
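For example, a shell sketch (assuming the conventional ``1``/``0`` encoding for on/off):

.. code:: bash

    export DGL_MXNET_SET_NP_SHAPE=1   # enable Numpy shape mode for the MXNet backend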
Tensorflow backend
``````````````````
Export ``DGLBACKEND`` as ``tensorflow`` to specify the TensorFlow backend. The required TensorFlow
version is 2.2.0 or later. See `tensorflow.org <https://www.tensorflow.org/install>`_ for installation
instructions. In addition, DGL will set ``TF_FORCE_GPU_ALLOW_GROWTH`` to ``true`` to prevent TensorFlow from taking over the whole GPU memory:
.. code:: bash
pip install "tensorflow>=2.2.0" # when using tensorflow cpu version
- If you are using PyTorch, you can add the ``-DBUILD_TORCH=ON`` flag in CMake
  to build PyTorch plugins for further performance optimization. This applies to Linux,
  Windows, and Mac; see the sketch below.
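For example, a sketch of a CUDA build with the plugin enabled, combining flags named above:

.. code:: bash

    cmake -DUSE_CUDA=ON -DBUILD_TORCH=ON ..
    make -j4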
......@@ -9,8 +9,6 @@ __all__ = ['edge_softmax']
def edge_softmax(graph, logits, eids=ALL, norm_by='dst'):
r"""Compute softmax over weights of incoming edges for every node.
Description
-----------
For a node :math:`i`, edge softmax is an operation that computes
.. math::
......@@ -28,6 +26,9 @@ def edge_softmax(graph, logits, eids=ALL, norm_by='dst'):
An example of using edge softmax is in
`Graph Attention Network <https://arxiv.org/pdf/1710.10903.pdf>`__ where
the attention weights are computed with this operation.
Other non-GNN examples using this are
`Transformer <https://papers.nips.cc/paper/7181-attention-is-all-you-need.pdf>`__,
`Capsule <https://arxiv.org/pdf/1710.09829.pdf>`__, etc.
Parameters
----------
......
......@@ -13,6 +13,7 @@ def gsddmm(g, op, lhs_data, rhs_data, lhs_target='u', rhs_target='v'):
It computes edge features by applying :attr:`op` to the lhs and rhs features.
.. math::
x_{e} = \phi(x_{lhs}, x_{rhs}), \forall (u,e,v)\in \mathcal{G}
where :math:`x_{e}` is the returned feature on edges and :math:`x_u`,
......@@ -33,9 +34,9 @@ def gsddmm(g, op, lhs_data, rhs_data, lhs_target='u', rhs_target='v'):
rhs_data : tensor or None
The right operand, could be None if it's not required by op.
lhs_target: str
Choice of `u`(source), `e`(edge) or `v`(destination) for left operand.
Choice of ``u``(source), ``e``(edge) or ``v``(destination) for left operand.
rhs_target: str
Choice of `u`(source), `e`(edge) or `v`(destination) for right operand.
Choice of ``u``(source), ``e``(edge) or ``v``(destination) for right operand.
Returns
-------
......
......@@ -4,7 +4,6 @@
. /opt/conda/etc/profile.d/conda.sh
conda activate pytorch-ci
TUTORIAL_ROOT="./tutorials"
NEW_TUTORIAL_ROOT="./new-tutorial"
function fail {
echo FAIL: $@
......@@ -29,11 +28,3 @@ do
done
popd > /dev/null
pushd ${NEW_TUTORIAL_ROOT} > /dev/null
for f in $(find . -name "*.py" ! -name "*_mx.py")
do
echo "Running tutorial ${f} ..."
python3 $f || fail "run ${f}"
done
popd > /dev/null
"""
.. currentmodule:: dgl
DGL at a Glance
=========================
**Author**: `Minjie Wang <https://jermainewang.github.io/>`_, Quan Gan, `Jake
Zhao <https://cs.nyu.edu/~jakezhao/>`_, Zheng Zhang
DGL is a Python package dedicated to deep learning on graphs, built atop
existing tensor DL frameworks (e.g., PyTorch, MXNet), simplifying the
implementation of graph-based neural networks.
The goals of this tutorial:
- Understand how DGL enables computation on graphs from a high level.
- Train a simple graph neural network in DGL to classify nodes in a graph.
At the end of this tutorial, we hope you have gained a basic feel for how DGL works.
*This tutorial assumes basic familiarity with PyTorch.*
"""
###############################################################################
# Tutorial problem description
# ----------------------------
#
# The tutorial is based on the "Zachary's karate club" problem. The karate club
# is a social network that includes 34 members and documents pairwise links
# between members who interact outside the club. The club later divides into
# two communities led by the instructor (node 0) and the club president (node
# 33). The network is visualized as follows with the color indicating the
# community:
#
# .. image:: https://data.dgl.ai/tutorial/img/karate-club.png
# :align: center
#
# The task is to predict which side (0 or 33) each member tends to join given
# the social network itself.
###############################################################################
# Step 1: Creating a graph in DGL
# -------------------------------
# Create the graph for Zachary's karate club as follows:
import dgl
import numpy as np
def build_karate_club_graph():
    # All 78 edges are stored in two numpy arrays, one for the source endpoints
    # and the other for the destination endpoints.
src = np.array([1, 2, 2, 3, 3, 3, 4, 5, 6, 6, 6, 7, 7, 7, 7, 8, 8, 9, 10, 10,
10, 11, 12, 12, 13, 13, 13, 13, 16, 16, 17, 17, 19, 19, 21, 21,
25, 25, 27, 27, 27, 28, 29, 29, 30, 30, 31, 31, 31, 31, 32, 32,
32, 32, 32, 32, 32, 32, 32, 32, 32, 33, 33, 33, 33, 33, 33, 33,
33, 33, 33, 33, 33, 33, 33, 33, 33, 33])
dst = np.array([0, 0, 1, 0, 1, 2, 0, 0, 0, 4, 5, 0, 1, 2, 3, 0, 2, 2, 0, 4,
5, 0, 0, 3, 0, 1, 2, 3, 5, 6, 0, 1, 0, 1, 0, 1, 23, 24, 2, 23,
24, 2, 23, 26, 1, 8, 0, 24, 25, 28, 2, 8, 14, 15, 18, 20, 22, 23,
29, 30, 31, 8, 9, 13, 14, 15, 18, 19, 20, 22, 23, 26, 27, 28, 29, 30,
31, 32])
    # Edges are directional in DGL; make them bidirectional.
u = np.concatenate([src, dst])
v = np.concatenate([dst, src])
# Construct a DGLGraph
return dgl.graph((u, v))
###############################################################################
# Print out the number of nodes and edges in our newly constructed graph:
G = build_karate_club_graph()
print('We have %d nodes.' % G.number_of_nodes())
print('We have %d edges.' % G.number_of_edges())
###############################################################################
# Visualize the graph by converting it to a `networkx
# <https://networkx.github.io/documentation/stable/>`_ graph:
import networkx as nx
# Since the actual graph is undirected, we convert it for visualization
# purpose.
nx_G = G.to_networkx().to_undirected()
# The Kamada-Kawai layout usually looks good for arbitrary graphs
pos = nx.kamada_kawai_layout(nx_G)
nx.draw(nx_G, pos, with_labels=True, node_color=[[.7, .7, .7]])
###############################################################################
# Step 2: Assign features to nodes or edges
# --------------------------------------------
# Graph neural networks associate features with nodes and edges for training.
# For our classification example, since there is no input feature, we assign
# each node a learnable embedding vector.
# In DGL, you can add features for all nodes at once, using a feature tensor that
# batches node features along the first dimension. The code below adds the learnable
# embeddings for all nodes:
import torch
import torch.nn as nn
import torch.nn.functional as F
embed = nn.Embedding(34, 5) # 34 nodes with embedding dim equal to 5
G.ndata['feat'] = embed.weight
###############################################################################
# Print out the node features to verify:
# print out node 2's input feature
print(G.ndata['feat'][2])
# print out node 10 and 11's input features
print(G.ndata['feat'][[10, 11]])
###############################################################################
# Step 3: Define a Graph Convolutional Network (GCN)
# --------------------------------------------------
# To perform node classification, use the Graph Convolutional Network
# (GCN) developed by `Kipf and Welling <https://arxiv.org/abs/1609.02907>`_. Here
# is the simplest definition of a GCN framework. We recommend that you
# read the original paper for more details.
#
# - At layer :math:`l`, each node :math:`v_i^l` carries a feature vector :math:`h_i^l`.
# - Each GCN layer aggregates the features :math:`h_u^{l}` of the neighbors :math:`u`
#   of node :math:`v_i` to compute its next-layer representation
#   :math:`h_i^{l+1}`. This is followed by an affine transformation with some
#   non-linearity.
#
# The above definition of GCN fits into a **message-passing** paradigm: Each
# node will update its own feature with information sent from neighboring
# nodes. A graphical demonstration is displayed below.
#
# .. image:: https://data.dgl.ai/tutorial/1_first/mailbox.png
# :alt: mailbox
# :align: center
#
# In DGL, we provide implementations of popular Graph Neural Network layers under
# the ``dgl.nn.<backend>`` subpackage. The :class:`~dgl.nn.pytorch.GraphConv` module
# implements one Graph Convolutional layer.
from dgl.nn.pytorch import GraphConv
###############################################################################
# Define a deeper GCN model that contains two GCN layers:
class GCN(nn.Module):
def __init__(self, in_feats, hidden_size, num_classes):
super(GCN, self).__init__()
self.conv1 = GraphConv(in_feats, hidden_size)
self.conv2 = GraphConv(hidden_size, num_classes)
def forward(self, g, inputs):
h = self.conv1(g, inputs)
h = torch.relu(h)
h = self.conv2(g, h)
return h
# The first layer transforms input features of size 5 to a hidden size of 5.
# The second layer transforms the hidden layer and produces output features of
# size 2, corresponding to the two groups of the karate club.
net = GCN(5, 5, 2)
###############################################################################
# Step 4: Data preparation and initialization
# -------------------------------------------
#
# We use learnable embeddings to initialize the node features. Since this is a
# semi-supervised setting, only the instructor (node 0) and the club president
# (node 33) are assigned labels. The implementation is as follows.
inputs = embed.weight
labeled_nodes = torch.tensor([0, 33]) # only the instructor and the president nodes are labeled
labels = torch.tensor([0, 1]) # their labels are different
###############################################################################
# Step 5: Train then visualize
# ----------------------------
# The training loop is exactly the same as for other PyTorch models.
# We (1) create an optimizer, (2) feed the inputs to the model,
# (3) calculate the loss and (4) use autograd to optimize the model.
import itertools
optimizer = torch.optim.Adam(itertools.chain(net.parameters(), embed.parameters()), lr=0.01)
all_logits = []
for epoch in range(50):
logits = net(G, inputs)
# we save the logits for visualization later
all_logits.append(logits.detach())
logp = F.log_softmax(logits, 1)
# we only compute loss for labeled nodes
loss = F.nll_loss(logp[labeled_nodes], labels)
optimizer.zero_grad()
loss.backward()
optimizer.step()
print('Epoch %d | Loss: %.4f' % (epoch, loss.item()))
###############################################################################
# This is a rather toy example, so it does not even have a validation or test
# set. Instead, since the model produces an output feature of size 2 for each
# node, we can visualize it by plotting the output features in a 2D space.
# The following code animates the training process from initial guess
# (where the nodes are not classified correctly at all) to the end
# (where the nodes are linearly separable).
import matplotlib.animation as animation
import matplotlib.pyplot as plt
def draw(i):
cls1color = '#00FFFF'
cls2color = '#FF00FF'
pos = {}
colors = []
for v in range(34):
pos[v] = all_logits[i][v].numpy()
cls = pos[v].argmax()
colors.append(cls1color if cls else cls2color)
ax.cla()
ax.axis('off')
ax.set_title('Epoch: %d' % i)
nx.draw_networkx(nx_G.to_undirected(), pos, node_color=colors,
with_labels=True, node_size=300, ax=ax)
fig = plt.figure(dpi=150)
fig.clf()
ax = fig.subplots()
draw(0) # draw the prediction of the first epoch
plt.close()
###############################################################################
# .. image:: https://data.dgl.ai/tutorial/1_first/karate0.png
# :height: 300px
# :width: 400px
# :align: center
###############################################################################
# The following animation shows how the model correctly predicts the community
# after a series of training epochs.
ani = animation.FuncAnimation(fig, draw, frames=len(all_logits), interval=200)
###############################################################################
# .. image:: https://data.dgl.ai/tutorial/1_first/karate.gif
# :height: 300px
# :width: 400px
# :align: center
###############################################################################
# Next steps
# ----------
#
# In the :doc:`next tutorial <2_basics>`, we will go through some more basics
# of DGL, such as reading and writing node/edge features.
"""
.. currentmodule:: dgl
DGLGraph and Node/edge Features
===============================
**Author**: `Minjie Wang <https://jermainewang.github.io/>`_, Quan Gan, Yu Gai,
Zheng Zhang
In this tutorial, you learn how to create a graph and how to read and write node and edge representations.
"""
###############################################################################
# Creating a graph
# ----------------
# The design of :class:`DGLGraph` was influenced by other graph libraries. You
# can create a graph from networkx and convert it into a :class:`DGLGraph` and
# vice versa.
import networkx as nx
import dgl
g_nx = nx.petersen_graph()
g_dgl = dgl.DGLGraph(g_nx)
import matplotlib.pyplot as plt
plt.subplot(121)
nx.draw(g_nx, with_labels=True)
plt.subplot(122)
nx.draw(g_dgl.to_networkx(), with_labels=True)
plt.show()
###############################################################################
# There are many ways to construct a :class:`DGLGraph`. Below are the allowed
# data types, ordered by our recommendation.
#
# * A pair of arrays ``(u, v)`` storing the source and destination nodes respectively.
# They can be numpy arrays or tensor objects from the backend framework.
# * ``scipy`` sparse matrix representing the adjacency matrix of the graph to be
# constructed.
# * ``networkx`` graph object.
# * A list of edges in the form of integer pairs.
#
# The examples below construct the same star graph via different methods.
#
# :class:`DGLGraph` nodes are a consecutive range of integers between 0 and
# :func:`number_of_nodes() <DGLGraph.number_of_nodes>`.
# :class:`DGLGraph` edges are in the order of their addition. Note that
# edges are accessed in much the same way as nodes, with one extra feature:
# *edge broadcasting*.
import torch as th
import numpy as np
import scipy.sparse as spp
# Create a star graph from a pair of arrays (using ``numpy.array`` works too).
u = th.tensor([0, 0, 0, 0, 0])
v = th.tensor([1, 2, 3, 4, 5])
star1 = dgl.DGLGraph((u, v))
# Create the same graph from a scipy sparse matrix (using ``scipy.sparse.csr_matrix`` works too).
adj = spp.coo_matrix((np.ones(len(u)), (u.numpy(), v.numpy())))
star3 = dgl.DGLGraph(adj)
###############################################################################
# You can also create a graph by progressively adding more nodes and edges.
# Although it is not as efficient as the above constructors, it is suitable
# for applications where the graph cannot be constructed in one shot.
g = dgl.DGLGraph()
g.add_nodes(10)
# A couple edges one-by-one
for i in range(1, 4):
g.add_edge(i, 0)
# A few more with a paired list
src = list(range(5, 8)); dst = [0]*3
g.add_edges(src, dst)
# finish with a pair of tensors
src = th.tensor([8, 9]); dst = th.tensor([0, 0])
g.add_edges(src, dst)
# Edge broadcasting will build the star graph in one go!
g = dgl.DGLGraph()
g.add_nodes(10)
src = th.tensor(list(range(1, 10)));
g.add_edges(src, 0)
# Visualize the graph.
nx.draw(g.to_networkx(), with_labels=True)
plt.show()
###############################################################################
# Assigning a feature
# -------------------
# You can also assign features to nodes and edges of a :class:`DGLGraph`. The
# features are represented as a dictionary mapping names (strings) to tensors,
# called **fields**.
#
# The following code snippet assigns each node a vector (len=3).
#
# .. note::
#
# DGL aims to be framework-agnostic, and currently it supports PyTorch and
# MXNet tensors. The following examples use PyTorch only.
import dgl
import torch as th
x = th.randn(10, 3)
g.ndata['x'] = x
###############################################################################
# :func:`ndata <DGLGraph.ndata>` is syntactic sugar for accessing the feature
# data of all nodes. To get the features of some particular nodes, slice out
# the corresponding rows.
g.ndata['x'][0] = th.zeros(1, 3)
g.ndata['x'][[0, 1, 2]] = th.zeros(3, 3)
g.ndata['x'][th.tensor([0, 1, 2])] = th.randn((3, 3))
###############################################################################
# Assigning edge features is similar to assigning node features,
# except that you can also do it by specifying the endpoints of the edges.
g.edata['w'] = th.randn(9, 2)
# Access edge set with IDs in integer, list, or integer tensor
g.edata['w'][1] = th.randn(1, 2)
g.edata['w'][[0, 1, 2]] = th.zeros(3, 2)
g.edata['w'][th.tensor([0, 1, 2])] = th.zeros(3, 2)
# You can get the edge IDs from their endpoints, which is useful for accessing the features.
g.edata['w'][g.edge_id(1, 0)] = th.ones(1, 2) # edge 1 -> 0
g.edata['w'][g.edge_ids([1, 2, 3], [0, 0, 0])] = th.ones(3, 2) # edges [1, 2, 3] -> 0
# Use edge broadcasting whenever applicable.
g.edata['w'][g.edge_ids([1, 2, 3], 0)] = th.ones(3, 2)   # edges [1, 2, 3] -> 0
###############################################################################
# After assignments, each node or edge field will be associated with a scheme
# containing the shape and data type (dtype) of its field value.
print(g.node_attr_schemes())
g.ndata['x'] = th.zeros((10, 4))
print(g.node_attr_schemes())
###############################################################################
# You can also remove node or edge states from the graph. This is particularly
# useful to save memory during inference.
g.ndata.pop('x')
g.edata.pop('w')
###############################################################################
# Working with multigraphs
# ~~~~~~~~~~~~~~~~~~~~~~~~
# Many graph applications need parallel edges,
# which :class:`DGLGraph` supports by default.
g_multi = dgl.DGLGraph()
g_multi.add_nodes(10)
g_multi.ndata['x'] = th.randn(10, 2)
g_multi.add_edges(list(range(1, 10)), 0)
g_multi.add_edge(1, 0) # two edges on 1->0
g_multi.edata['w'] = th.randn(10, 2)
g_multi.edges[1].data['w'] = th.zeros(1, 2)
print(g_multi.edges())
###############################################################################
# An edge in a multigraph cannot be uniquely identified by its incident nodes
# :math:`u` and :math:`v`; to query its edge IDs, use the ``edge_id`` interface.
_, _, eid_10 = g_multi.edge_id(1, 0, return_uv=True)
g_multi.edges[eid_10].data['w'] = th.ones(len(eid_10), 2)
print(g_multi.edata['w'])
###############################################################################
# .. note::
#
# * Updating a feature with a different scheme (shape or dtype) on individual
#   nodes (or a node subset) risks an error.
###############################################################################
# Next steps
# ----------
# In the :doc:`next tutorial <3_pagerank>` you learn the
# DGL message passing interface by implementing PageRank.
"""
.. currentmodule:: dgl
Message Passing Tutorial
========================
**Author**: `Minjie Wang <https://jermainewang.github.io/>`_, Quan Gan, Yu Gai,
Zheng Zhang
In this tutorial, you learn how to use different levels of the message
passing API with PageRank on a small graph. In DGL, the message passing and
feature transformations are **user-defined functions** (UDFs).
"""
###############################################################################
# The PageRank algorithm
# ----------------------
# In each iteration of PageRank, every node (web page) first scatters its
# PageRank value uniformly to its downstream nodes. The new PageRank value of
# each node is computed by aggregating the received PageRank values from its
# neighbors, which is then adjusted by the damping factor:
#
# .. math::
#
# PV(u) = \frac{1-d}{N} + d \times \sum_{v \in \mathcal{N}(u)}
# \frac{PV(v)}{D(v)}
#
# where :math:`N` is the number of nodes in the graph; :math:`D(v)` is the
# out-degree of a node :math:`v`; and :math:`\mathcal{N}(u)` is the set of
# neighbors of :math:`u`.
###############################################################################
# A naive implementation
# ----------------------
# Create a graph with 100 nodes by using ``networkx`` and then convert it to a
# :class:`DGLGraph`.
import networkx as nx
import matplotlib.pyplot as plt
import torch
import dgl
N = 100 # number of nodes
DAMP = 0.85 # damping factor
K = 10 # number of iterations
g = nx.erdos_renyi_graph(N, 0.1)
g = dgl.DGLGraph(g)
nx.draw(g.to_networkx(), node_size=50, node_color=[[.5, .5, .5,]])
plt.show()
###############################################################################
# According to the algorithm, PageRank consists of two phases in a typical
# scatter-gather pattern. Initialize the PageRank value of each node
# to :math:`\frac{1}{N}` and then store each node's out-degree as a node feature.
g.ndata['pv'] = torch.ones(N) / N
g.ndata['deg'] = g.out_degrees(g.nodes()).float()
###############################################################################
# Define the message function, which divides every node's PageRank
# value by its out-degree and passes the result as message to its neighbors.
def pagerank_message_func(edges):
return {'pv' : edges.src['pv'] / edges.src['deg']}
###############################################################################
# In DGL, the message functions are expressed as **Edge UDFs**. Edge UDFs
# take in a single argument ``edges``. It has three members ``src``, ``dst``,
# and ``data`` for accessing source node features, destination node features,
# and edge features. Here, the function computes messages only
# from source node features.
#
# Define the reduce function, which removes and aggregates the
# messages from its ``mailbox``, and computes its new PageRank value.
def pagerank_reduce_func(nodes):
msgs = torch.sum(nodes.mailbox['pv'], dim=1)
pv = (1 - DAMP) / N + DAMP * msgs
return {'pv' : pv}
###############################################################################
# The reduce functions are **Node UDFs**. Node UDFs have a single argument
# ``nodes``, which has two members ``data`` and ``mailbox``. ``data``
# contains the node features and ``mailbox`` contains all incoming message
# features, stacked along the second dimension (hence the ``dim=1`` argument).
#
# The message UDF works on a batch of edges, whereas the reduce UDF works on
# a batch of nodes, each of which reduces the messages received on its incoming
# edges. Their relationships are as follows:
#
# .. image:: https://i.imgur.com/kIMiuFb.png
#
###############################################################################
# The algorithm is straightforward. Here is the code for one
# PageRank iteration.
def pagerank_naive(g):
# Phase #1: send out messages along all edges.
for u, v in zip(*g.edges()):
g.send((u, v), pagerank_message_func)
# Phase #2: receive messages to compute new PageRank values.
for v in g.nodes():
g.recv(v, pagerank_reduce_func)
###############################################################################
# Batching semantics for a large graph
# ------------------------------------
# The above code does not scale to a large graph because it iterates over all
# the nodes. DGL solves this by allowing you to compute on a *batch* of nodes or
# edges. For example, the following code triggers message and reduce functions
# on multiple nodes and edges at one time.
def pagerank_batch(g):
g.send(g.edges(), pagerank_message_func)
g.recv(g.nodes(), pagerank_reduce_func)
###############################################################################
# You are still using the same reduce function ``pagerank_reduce_func``,
# where ``nodes.mailbox['pv']`` is a *single* tensor, stacking the incoming
# messages along the second dimension.
#
# You might wonder whether it is even possible to perform reduce on all
# nodes in parallel, since each node may have a different number of incoming
# messages and you cannot really "stack" tensors of different lengths together.
# In general, DGL solves the problem by grouping the nodes by the number of
# incoming messages, and calling the reduce function for each group.
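# As a rough sketch of this degree-bucketing idea (an illustration only, not
# DGL's actual internal implementation): nodes with the same in-degree share
# one dense mailbox tensor, so each bucket is reduced in a single batched call.

msgs_deg2 = torch.randn(5, 2)    # a bucket of 5 nodes, each with 2 incoming messages
msgs_deg3 = torch.randn(7, 3)    # a bucket of 7 nodes, each with 3 incoming messages
out_deg2 = msgs_deg2.sum(dim=1)  # one batched reduce per bucket
out_deg3 = msgs_deg3.sum(dim=1)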
###############################################################################
# Use higher-level APIs for efficiency
# ---------------------------------------
# DGL provides many routines that combine basic ``send`` and ``recv`` in
# various ways. These routines are called **level-2 APIs**. For example, the next code example
# shows how to further simplify the PageRank example with such an API.
def pagerank_level2(g):
    g.update_all(pagerank_message_func, pagerank_reduce_func)
###############################################################################
# In addition to ``update_all``, you can use ``pull``, ``push``, and ``send_and_recv``
# in this level-2 category. For more information, see :doc:`API reference <../../api/python/graph>`.
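# For instance, a sketch of the same PageRank step written with ``pull``
# (assuming the ``pull(nodes, message_func, reduce_func)`` signature of the
# send/recv-era API used throughout this tutorial):

def pagerank_pull(g):
    # Every node pulls messages from its predecessors, then reduces them.
    g.pull(g.nodes(), pagerank_message_func, pagerank_reduce_func)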
###############################################################################
# Use DGL ``builtin`` functions for efficiency
# ------------------------------------------------
# Some of the message and reduce functions are used frequently. For this reason, DGL also
# provides ``builtin`` functions. For example, two ``builtin`` functions can be
# used in the PageRank example.
#
# * :func:`dgl.function.copy_src(src, out) <function.copy_src>` - A builtin
#   edge UDF that computes the output using the source node feature data.
#   To use this, specify the name of the source feature data (``src``) and the
#   output name (``out``).
#
# * :func:`dgl.function.sum(msg, out) <function.sum>` - A builtin node UDF
#   that sums the messages in the node's mailbox. To use this, specify the
#   message name (``msg``) and the output name (``out``).
#
# The following PageRank example shows such functions.
import dgl.function as fn
def pagerank_builtin(g):
g.ndata['pv'] = g.ndata['pv'] / g.ndata['deg']
g.update_all(message_func=fn.copy_src(src='pv', out='m'),
reduce_func=fn.sum(msg='m',out='m_sum'))
g.ndata['pv'] = (1 - DAMP) / N + DAMP * g.ndata['m_sum']
###############################################################################
# In the previous example code, you directly provide the UDFs to
# :func:`update_all <DGLGraph.update_all>` as its arguments.
# This will override any previously registered UDFs.
#
# In addition to cleaner code, using ``builtin`` functions also gives DGL the
# opportunity to fuse operations together. This results in faster execution. For
# example, DGL will fuse the ``copy_src`` message function and ``sum`` reduce
# function into one sparse matrix-vector (spMV) multiplication.
#
# `The following section <spmv_>`_ describes why spMV can speed up the scatter-gather
# phase in PageRank. For more details about the ``builtin`` functions in DGL,
# see :doc:`API reference <../../api/python/function>`.
#
# You can also download and run the different code examples to see the differences.
for k in range(K):
# Uncomment the corresponding line to select different version.
# pagerank_naive(g)
# pagerank_batch(g)
# pagerank_level2(g)
pagerank_builtin(g)
print(g.ndata['pv'])
###############################################################################
# .. _spmv:
#
# Using spMV for PageRank
# -----------------------
# Using ``builtin`` functions allows DGL to understand the semantics of UDFs.
# This allows you to create an efficient implementation. For example, in the case
# of PageRank, one common method to accelerate it is by using its linear algebra
# form.
#
# .. math::
#
#    \mathbf{R}^{k} = \frac{1-d}{N} \mathbf{1} + d \mathbf{A} \mathbf{R}^{k-1}
#
# Here, :math:`\mathbf{R}^k` is the vector of the PageRank values of all nodes
# at iteration :math:`k`; :math:`\mathbf{A}` is the sparse adjacency matrix
# of the graph.
# Computing this equation is quite efficient because there is an efficient
# GPU kernel for the sparse matrix-vector multiplication (spMV). DGL
# detects whether such optimization is available through the ``builtin``
# functions. If a certain combination of ``builtin`` can be mapped to an spMV
# kernel (e.g., the PageRank example), DGL uses it automatically. We recommend
# using ``builtin`` functions whenever possible.
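# A sketch of this matrix form written directly with torch sparse ops
# (assuming ``g``, ``N``, ``K``, and ``DAMP`` from above); entry (u, v) of the
# matrix holds 1/D(v) for every edge v -> u:

src, dst = g.edges()
vals = 1.0 / g.ndata['deg'][src]
A_hat = torch.sparse_coo_tensor(torch.stack([dst, src]), vals, (N, N))
pv = torch.ones(N) / N
for _ in range(K):
    pv = (1 - DAMP) / N + DAMP * torch.sparse.mm(A_hat, pv.unsqueeze(1)).squeeze(1)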
###############################################################################
# Next steps
# ----------
#
# * Learn how to use DGL (:doc:`builtin functions<../../features/builtin>`) to write
# more efficient message passing.
# * To see model tutorials, see the :doc:`overview page<../models/index>`.
# * To learn about Graph Neural Networks, see :doc:`GCN tutorial<../models/1_gnn/1_gcn>`.
# * To see how DGL batches multiple graphs, see :doc:`TreeLSTM tutorial<../models/2_small_graph/3_tree-lstm>`.
# * Play with some graph generative models by following tutorial for :doc:`Deep Generative Model of Graphs<../models/3_generative_model/5_dgmg>`.
# * To learn how traditional models are interpreted from a graph perspective, see
# the tutorials on :doc:`CapsuleNet<../models/4_old_wines/2_capsule>` and
# :doc:`Transformer<../models/4_old_wines/7_transformer>`.
"""
.. currentmodule:: dgl
Graph Classification Tutorial
=============================
**Author**: `Mufei Li <https://github.com/mufeili>`_,
`Minjie Wang <https://jermainewang.github.io/>`_,
`Zheng Zhang <https://shanghai.nyu.edu/academics/faculty/directory/zheng-zhang>`_.
In this tutorial, you learn how to use DGL to batch multiple graphs of variable size and shape. The
tutorial also demonstrates training a graph neural network for a simple graph classification task.
Graph classification is an important problem
with applications across many fields, such as bioinformatics, chemoinformatics, social
network analysis, urban computing, and cybersecurity. Applying graph neural
networks to this problem has been a popular approach recently. This can be seen in the following research references:
`Ying et al., 2018 <https://arxiv.org/abs/1806.08804>`_,
`Cangea et al., 2018 <https://arxiv.org/abs/1811.01287>`_,
`Knyazev et al., 2018 <https://arxiv.org/abs/1811.09595>`_,
`Bianchi et al., 2019 <https://arxiv.org/abs/1901.01343>`_,
`Liao et al., 2019 <https://arxiv.org/abs/1901.01484>`_,
`Gao et al., 2019 <https://openreview.net/forum?id=HJePRoAct7>`_.
"""
###############################################################################
# Simple graph classification task
# --------------------------------
# In this tutorial, you learn how to perform batched graph classification
# with DGL. The example task objective is to classify eight types of topologies shown here.
#
# .. image:: https://data.dgl.ai/tutorial/batch/dataset_overview.png
# :align: center
#
# The tutorial uses a synthetic dataset, :class:`data.MiniGCDataset`, implemented in
# DGL. The dataset has eight different types of graphs, and each class has the same
# number of graph samples.
import dgl
import torch
from dgl.data import MiniGCDataset
import matplotlib.pyplot as plt
import networkx as nx
# A dataset with 80 samples, where each graph
# has 10 to 20 nodes
dataset = MiniGCDataset(80, 10, 20)
graph, label = dataset[0]
fig, ax = plt.subplots()
nx.draw(graph.to_networkx(), ax=ax)
ax.set_title('Class: {:d}'.format(label))
plt.show()
###############################################################################
# Form a graph mini-batch
# -----------------------
# To train neural networks efficiently, a common practice is to batch
# multiple samples together to form a mini-batch. Batching fixed-shaped tensor
# inputs is common. For example, batching two images of size 28 x 28
# gives a tensor of shape 2 x 28 x 28. By contrast, batching graph inputs
# has two challenges:
#
# * Graphs are sparse.
# * Graphs can vary in size, for example, in their numbers of nodes and edges.
#
# To address this, DGL provides a :func:`dgl.batch` API. It leverages the idea that
# a batch of graphs can be viewed as a large graph that has many disjoint
# connected components. Below is a visualization that gives the general idea.
#
# .. image:: https://data.dgl.ai/tutorial/batch/batch.png
# :width: 400pt
# :align: center
#
# The return type of :func:`dgl.batch` is still a graph. In the same way,
# a batch of tensors is still a tensor. This means that any code that works
# for one graph immediately works for a batch of graphs. More importantly,
# because DGL processes messages on all nodes and edges in parallel, this greatly
# improves efficiency.
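# A minimal sketch of :func:`dgl.batch` on two toy graphs:

g1 = dgl.graph((torch.tensor([0, 1]), torch.tensor([1, 2])))  # 3 nodes, 2 edges
g2 = dgl.graph((torch.tensor([0]), torch.tensor([1])))        # 2 nodes, 1 edge
bg = dgl.batch([g1, g2])
print(bg.number_of_nodes())  # 5 -- g2's node IDs are relabeled to follow g1's
print(bg.batch_size)         # 2

###############################################################################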
#
# Graph classifier
# ----------------
# Graph classification proceeds as follows.
#
# .. image:: https://data.dgl.ai/tutorial/batch/graph_classifier.png
#
# From a batch of graphs, perform message passing and graph convolution
# for nodes to communicate with others. After message passing, compute a
# tensor for graph representation from node (and edge) attributes. This step might
# be called readout or aggregation. Finally, the graph
# representations are fed into a classifier :math:`g` to predict the graph labels.
#
# Graph convolution layer can be found in the ``dgl.nn.<backend>`` submodule.
from dgl.nn.pytorch import GraphConv
###############################################################################
# Readout and classification
# --------------------------
# For this demonstration, consider initial node features to be their degrees.
# After two rounds of graph convolution, perform a graph readout by averaging
# over all node features for each graph in the batch.
#
# .. math::
#
# h_g=\frac{1}{|\mathcal{V}|}\sum_{v\in\mathcal{V}}h_{v}
#
# In DGL, :func:`dgl.mean_nodes` handles this task for a batch of
# graphs with variable size. You then feed the graph representations into a
# classifier with one linear layer to obtain pre-softmax logits.
import torch.nn as nn
import torch.nn.functional as F
class Classifier(nn.Module):
def __init__(self, in_dim, hidden_dim, n_classes):
super(Classifier, self).__init__()
self.conv1 = GraphConv(in_dim, hidden_dim)
self.conv2 = GraphConv(hidden_dim, hidden_dim)
self.classify = nn.Linear(hidden_dim, n_classes)
def forward(self, g):
        # Use node degree as the initial node feature. For undirected graphs, the in-degree
        # is the same as the out-degree.
h = g.in_degrees().view(-1, 1).float()
# Perform graph convolution and activation function.
h = F.relu(self.conv1(g, h))
h = F.relu(self.conv2(g, h))
g.ndata['h'] = h
# Calculate graph representation by averaging all the node representations.
hg = dgl.mean_nodes(g, 'h')
return self.classify(hg)
###############################################################################
# Setup and training
# ------------------
# Create a synthetic dataset of :math:`400` graphs with :math:`10` ~
# :math:`20` nodes. :math:`320` graphs constitute a training set and
# :math:`80` graphs constitute a test set.
import torch.optim as optim
from dgl.dataloading import GraphDataLoader
# Create training and test sets.
trainset = MiniGCDataset(320, 10, 20)
testset = MiniGCDataset(80, 10, 20)
# Use DGL's GraphDataLoader. It by default handles the
# graph batching operation for every mini-batch.
data_loader = GraphDataLoader(trainset, batch_size=32, shuffle=True)
# Create model
model = Classifier(1, 256, trainset.num_classes)
loss_func = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)
model.train()
epoch_losses = []
for epoch in range(80):
epoch_loss = 0
for iter, (bg, label) in enumerate(data_loader):
prediction = model(bg)
loss = loss_func(prediction, label)
optimizer.zero_grad()
loss.backward()
optimizer.step()
epoch_loss += loss.detach().item()
epoch_loss /= (iter + 1)
print('Epoch {}, loss {:.4f}'.format(epoch, epoch_loss))
epoch_losses.append(epoch_loss)
###############################################################################
# The learning curve of a run is presented below.
plt.title('cross entropy averaged over minibatches')
plt.plot(epoch_losses)
plt.show()
###############################################################################
# The trained model is evaluated on the test set created. Note that to keep the
# running time of the tutorial short, training is restricted; with more training
# you are likely to get a higher accuracy (:math:`80` % ~ :math:`90` %) than the
# ones printed below.
model.eval()
# Convert a list of tuples to two lists
test_X, test_Y = map(list, zip(*testset))
test_bg = dgl.batch(test_X)
test_Y = torch.tensor(test_Y).float().view(-1, 1)
probs_Y = torch.softmax(model(test_bg), 1)
sampled_Y = torch.multinomial(probs_Y, 1)
argmax_Y = torch.max(probs_Y, 1)[1].view(-1, 1)
print('Accuracy of sampled predictions on the test set: {:.4f}%'.format(
(test_Y == sampled_Y.float()).sum().item() / len(test_Y) * 100))
print('Accuracy of argmax predictions on the test set: {:.4f}%'.format(
(test_Y == argmax_Y.float()).sum().item() / len(test_Y) * 100))
###############################################################################
# The animation here plots the probability that a trained model predicts the correct graph type.
#
# .. image:: https://data.dgl.ai/tutorial/batch/test_eval4.gif
#
# To understand the node and graph representations that a trained model learned,
# we use `t-SNE <https://lvdmaaten.github.io/tsne/>`_ for dimensionality reduction
# and visualization.
#
# .. image:: https://data.dgl.ai/tutorial/batch/tsne_node2.png
# :align: center
#
# .. image:: https://data.dgl.ai/tutorial/batch/tsne_graph2.png
# :align: center
#
# The two small figures on the top separately visualize node representations after one and two
# layers of graph convolution. The figure on the bottom visualizes
# the pre-softmax logits for graphs as graph representations.
#
# While the visualization does suggest some clustering effect in the node features,
# you would not expect a perfect result, as node degrees are deterministic for
# these node features. The graph features are much better separated.
#
# What's next?
# ------------
# Graph classification with graph neural networks is still a new field.
# It's waiting for people to bring more exciting discoveries. The work requires
# mapping different graphs to different embeddings, while preserving
# their structural similarity in the embedding space. To learn more about it, see
# `How Powerful Are Graph Neural Networks? <https://arxiv.org/abs/1810.00826>`_, a research paper
# published at the International Conference on Learning Representations 2019.
#
# For more examples about batched graph processing, see the following:
#
# * Tutorials for `Tree LSTM <https://docs.dgl.ai/tutorials/models/2_small_graph/3_tree-lstm.html>`_ and `Deep Generative Models of Graphs <https://docs.dgl.ai/tutorials/models/3_generative_model/5_dgmg.html>`_
# * An example implementation of `Junction Tree VAE <https://github.com/dmlc/dgl/tree/master/examples/pytorch/jtnn>`_
"""
.. currentmodule:: dgl
Working with Heterogeneous Graphs
=================================
**Author**: Quan Gan, `Minjie Wang <https://jermainewang.github.io/>`_, Mufei Li,
George Karypis, Zheng Zhang
In this tutorial, you learn about:
* Examples of heterogeneous graph data and typical applications.
* Creating and manipulating a heterogeneous graph in DGL.
* Implementing `Relational-GCN <https://arxiv.org/abs/1703.06103>`_, a popular GNN model,
  for heterogeneous graph input.
* Training a model to solve a node classification task.
Heterogeneous graphs, or *heterographs* for short, are graphs that contain
different types of nodes and edges. The different types of nodes and edges tend
to have different types of attributes that are designed to capture the
characteristics of each node and edge type. Within the context of
graph neural networks, depending on their complexity, certain node and edge types
might need to be modeled with representations that have a different number of dimensions.
DGL supports graph neural network computations on such heterogeneous graphs, by
using the heterograph class and its associated API.
"""
###############################################################################
# Examples of heterographs
# ------------------------
# Many graph datasets represent relationships among various types of entities.
# This section provides an overview for several graph use-cases that show such relationships
# and can have their data represented as heterographs.
#
# Citation graph
# ~~~~~~~~~~~~~~~
# The Association for Computing Machinery publishes an `ACM dataset <https://aminer.org/citation>`_ that contains two
# million papers, their authors, publication venues, and the other papers
# that were cited. This information can be represented as a heterogeneous graph.
#
# The following diagram shows several entities in the ACM dataset and the relationships among them
# (taken from `Shi et al., 2015 <https://arxiv.org/pdf/1511.04854.pdf>`_).
#
# .. figure:: https://data.dgl.ai/tutorial/hetero/acm-example.png#
#
# This graph has three types of entities that correspond to papers, authors, and publication venues.
# It also contains three types of edges that connect the following:
#
# * Authors with papers corresponding to *written-by* relationships
#
# * Papers with publication venues corresponding to *published-in* relationships
#
# * Papers with other papers corresponding to *cited-by* relationships
#
#
# Recommender systems
# ~~~~~~~~~~~~~~~~~~~~
# The datasets used in recommender systems often contain
# interactions between users and items. For example, the data could include the
# ratings that users have provided to movies. Such interactions can be modeled
# as heterographs.
#
# The nodes in these heterographs will have two types, *users* and *movies*. The edges
# will correspond to the user-movie interactions. Furthermore, if an interaction is
# marked with a rating, then each rating value could correspond to a different edge type.
# The following diagram shows an example of user-item interactions as a heterograph.
#
# .. figure:: https://data.dgl.ai/tutorial/hetero/recsys-example.png
#
#
# Knowledge graph
# ~~~~~~~~~~~~~~~~
# Knowledge graphs are inherently heterogeneous. For example, in
# Wikidata, Barack Obama (item Q76) is an instance of a human, which could be viewed as
# the entity class, whose spouse (item P26) is Michelle Obama (item Q13133) and
# occupation (item P106) is politician (item Q82955). The relationships are shown in the following.
# diagram.
#
# .. figure:: https://data.dgl.ai/tutorial/hetero/kg-example.png
#
###############################################################################
# Creating a heterograph in DGL
# -----------------------------
# You can create a heterograph in DGL using the :func:`dgl.heterograph` API.
# The argument to :func:`dgl.heterograph` is a dictionary. The keys are tuples
# in the form of ``(srctype, edgetype, dsttype)`` specifying the relation name
# and the two entity types it connects. Such tuples are called *canonical edge types*.
# The values are the data used to initialize the graph structure, that is, which
# nodes the edges actually connect.
#
# For instance, the following code creates the user-item interactions heterograph shown earlier.
# Each value of the dictionary is a pair of source and destination arrays.
# Nodes are assigned integer IDs starting from zero, and node IDs of
# different types are counted separately.
import dgl
import numpy as np
ratings = dgl.heterograph(
{('user', '+1', 'movie') : (np.array([0, 0, 1]), np.array([0, 1, 0])),
('user', '-1', 'movie') : (np.array([2]), np.array([1]))})
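###############################################################################
# As a quick check (an extra step, not required by the rest of the tutorial),
# you can inspect the toy graph just created. Each node and edge type has its
# own ID space, so the counts below are per type.
print(ratings.ntypes)                   # node types: 'user' and 'movie'
print(ratings.etypes)                   # edge types: '+1' and '-1'
print(ratings.number_of_nodes('user'))  # 3 users (IDs 0, 1, 2)
print(ratings.number_of_edges('+1'))    # 3 positive ratings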
###############################################################################
# Manipulating a heterograph
# ---------------------------
# You can create a more realistic heterograph using the ACM dataset. To do this, first
# download the dataset as follows:
import scipy.io
import urllib.request
data_url = 'https://data.dgl.ai/dataset/ACM.mat'
data_file_path = '/tmp/ACM.mat'
urllib.request.urlretrieve(data_url, data_file_path)
data = scipy.io.loadmat(data_file_path)
print(list(data.keys()))
###############################################################################
# The dataset stores node information by type: ``P`` for paper, ``A``
# for author, ``C`` for conference, ``L`` for subject code, and so on. The relationships
# are stored as SciPy sparse matrices under keys of the form ``XvsY``, where ``X`` and ``Y``
# can be any of the node type codes.
#
# The following code prints out some statistics about the paper-author relationships.
print(type(data['PvsA']))
print('#Papers:', data['PvsA'].shape[0])
print('#Authors:', data['PvsA'].shape[1])
print('#Links:', data['PvsA'].nnz)
###############################################################################
# Converting this SciPy matrix to a heterograph in DGL is straightforward.
pa_g = dgl.heterograph({('paper', 'written-by', 'author') : data['PvsA'].nonzero()})
###############################################################################
# You can easily print out the type names and other structural information.
print('Node types:', pa_g.ntypes)
print('Edge types:', pa_g.etypes)
print('Canonical edge types:', pa_g.canonical_etypes)
# Nodes and edges are assigned integer IDs starting from zero, counted separately per type.
# To distinguish the nodes and edges of different types, specify the type name as the argument.
print(pa_g.number_of_nodes('paper'))
# A canonical edge type name can be shortened to just the edge type name if it is
# uniquely distinguishable.
print(pa_g.number_of_edges(('paper', 'written-by', 'author')))
print(pa_g.number_of_edges('written-by'))
print(pa_g.successors(1, etype='written-by'))  # get the authors who wrote paper #1
# The type name argument can be omitted whenever the behavior is unambiguous.
print(pa_g.number_of_edges())  # Only one edge type, so the edge type argument can be omitted
###############################################################################
# A homogeneous graph is just a special case of a heterograph with only one type
# of node and edge.
# A paper-citing-paper graph is a homogeneous graph.
pp_g = dgl.heterograph({('paper', 'citing', 'paper') : data['PvsP'].nonzero()})
# An equivalent (shorter) API for creating a homogeneous graph:
pp_g = dgl.from_scipy(data['PvsP'])
# All the ntype and etype arguments can be omitted because the behavior is unambiguous.
print(pp_g.number_of_nodes())
print(pp_g.number_of_edges())
print(pp_g.successors(3))
###############################################################################
# Create a subset of the ACM graph using the paper-author, paper-paper,
# and paper-subject relationships. Also add the reverse relationships
# to prepare for the later sections.
G = dgl.heterograph({
('paper', 'written-by', 'author') : data['PvsA'].nonzero(),
('author', 'writing', 'paper') : data['PvsA'].transpose().nonzero(),
('paper', 'citing', 'paper') : data['PvsP'].nonzero(),
('paper', 'cited', 'paper') : data['PvsP'].transpose().nonzero(),
('paper', 'is-about', 'subject') : data['PvsL'].nonzero(),
('subject', 'has', 'paper') : data['PvsL'].transpose().nonzero(),
})
print(G)
###############################################################################
# **Metagraph** (or network schema) is a useful summary of a heterograph.
# Serving as a template for a heterograph, it tells how many types of objects
# exist in the network and where the possible links are.
#
# DGL provides easy access to the metagraph, which can be visualized using
# external tools.
# Draw the metagraph using graphviz.
import pygraphviz as pgv
def plot_graph(nxg):
ag = pgv.AGraph(strict=False, directed=True)
for u, v, k in nxg.edges(keys=True):
ag.add_edge(u, v, label=k)
ag.layout('dot')
ag.draw('graph.png')
plot_graph(G.metagraph())
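###############################################################################
# If ``pygraphviz`` is not installed, you can still inspect the metagraph in
# plain text: ``G.metagraph()`` returns the same NetworkX ``MultiDiGraph`` used
# by ``plot_graph`` above, so its edges can simply be printed.
for srctype, dsttype, etype in G.metagraph().edges(keys=True):
    print('(%s) --[%s]--> (%s)' % (srctype, etype, dsttype))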
###############################################################################
# Learning tasks associated with heterographs
# -------------------------------------------
# Some of the typical learning tasks that involve heterographs include:
#
# * *Node classification and regression* to predict the class of each node or
# estimate a value associated with it.
#
# * *Link prediction* to predict if there is an edge of a certain
# type between a pair of nodes, or predict which other nodes a particular
# node is connected with (and optionally the edge types of such connections).
#
# * *Graph classification/regression* to assign an entire
# heterograph into one of the target classes or to estimate a numerical
# value associated with it.
#
# In this tutorial, we design a simple example for the first task.
#
# A semi-supervised node classification example
# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Our goal is to predict the publishing conference of a paper using the ACM
# academic graph we just created. To further simplify the task, we only focus
# on papers published in three conferences: *KDD*, *ICML*, and *VLDB*. All
# other papers are left unlabeled, making this a semi-supervised setting.
#
# The following code extracts those papers from the raw dataset and prepares
# the training, validation, testing split.
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F
pvc = data['PvsC'].tocsr()
# find all papers published in KDD, ICML, VLDB
c_selected = [0, 11, 13] # KDD, ICML, VLDB
p_selected = pvc[:, c_selected].tocoo()
# generate labels
# Each paper appears in exactly one conference in this dataset, so for the CSR
# matrix ``pvc`` the column index of each row's single nonzero entry gives the
# conference of the corresponding paper.
labels = pvc.indices
# Remap the selected conference IDs to class labels: KDD (0) stays 0,
# ICML (11) becomes 1, VLDB (13) becomes 2.
labels[labels == 11] = 1
labels[labels == 13] = 2
labels = torch.tensor(labels).long()
# generate train/val/test split
pid = p_selected.row
shuffle = np.random.permutation(pid)
train_idx = torch.tensor(shuffle[0:800]).long()
val_idx = torch.tensor(shuffle[800:900]).long()
test_idx = torch.tensor(shuffle[900:]).long()
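###############################################################################
# Before training, it is worth sanity-checking the split (an extra step on top
# of the original flow). Classes 0, 1, and 2 correspond to KDD, ICML, and VLDB.
print('Train/val/test sizes:', len(train_idx), len(val_idx), len(test_idx))
print('Training label distribution:', torch.bincount(labels[train_idx]))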
###############################################################################
# Relational-GCN on heterograph
# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# We use `Relational-GCN <https://arxiv.org/abs/1703.06103>`_ to learn the
# representation of nodes in the graph. Its message-passing equation is as
# follows:
#
# .. math::
#
# h_i^{(l+1)} = \sigma\left(\sum_{r\in \mathcal{R}}
# \sum_{j\in\mathcal{N}_r(i)}W_r^{(l)}h_j^{(l)}\right)
#
# Breaking down the equation, you see that there are two parts in the
# computation.
#
# (i) Message computation and aggregation within each relation :math:`r`
#
# (ii) Reduction that merges the results from multiple relationships
#
# Following this intuition, perform message passing on a heterograph in
# two steps.
#
# (i) Per-edge-type message passing
#
# (ii) Type-wise reduction
import dgl.function as fn
class HeteroRGCNLayer(nn.Module):
def __init__(self, in_size, out_size, etypes):
super(HeteroRGCNLayer, self).__init__()
# W_r for each relation
self.weight = nn.ModuleDict({
name : nn.Linear(in_size, out_size) for name in etypes
})
def forward(self, G, feat_dict):
# The input is a dictionary of node features for each type
funcs = {}
for srctype, etype, dsttype in G.canonical_etypes:
# Compute W_r * h
Wh = self.weight[etype](feat_dict[srctype])
# Save it in graph for message passing
G.nodes[srctype].data['Wh_%s' % etype] = Wh
# Specify per-relation message passing functions: (message_func, reduce_func).
# Note that the results are saved to the same destination feature 'h', which
            # hints at the type-wise reducer for aggregation.
funcs[etype] = (fn.copy_u('Wh_%s' % etype, 'm'), fn.mean('m', 'h'))
        # Trigger message passing over multiple edge types.
        # The first argument is the message passing functions for each relation.
        # The second one is the type-wise reducer, which can be "sum", "max",
        # "min", "mean", or "stack".
G.multi_update_all(funcs, 'sum')
# return the updated node feature dictionary
return {ntype : G.nodes[ntype].data['h'] for ntype in G.ntypes}
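###############################################################################
# As a quick smoke test of the layer (an addition for illustration), feed
# random features for every node type and confirm that the output dictionary
# contains one tensor of the expected shape per node type.
layer = HeteroRGCNLayer(10, 10, G.etypes)
feat_dict = {ntype : torch.randn(G.number_of_nodes(ntype), 10) for ntype in G.ntypes}
out_dict = layer(G, feat_dict)
print({ntype : h.shape for ntype, h in out_dict.items()})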
###############################################################################
# Create a simple GNN by stacking two ``HeteroRGCNLayer`` modules. Since the
# nodes do not have input features, make their embeddings trainable.
class HeteroRGCN(nn.Module):
def __init__(self, G, in_size, hidden_size, out_size):
super(HeteroRGCN, self).__init__()
# Use trainable node embeddings as featureless inputs.
embed_dict = {ntype : nn.Parameter(torch.Tensor(G.number_of_nodes(ntype), in_size))
for ntype in G.ntypes}
for key, embed in embed_dict.items():
nn.init.xavier_uniform_(embed)
self.embed = nn.ParameterDict(embed_dict)
# create layers
self.layer1 = HeteroRGCNLayer(in_size, hidden_size, G.etypes)
self.layer2 = HeteroRGCNLayer(hidden_size, out_size, G.etypes)
def forward(self, G):
h_dict = self.layer1(G, self.embed)
h_dict = {k : F.leaky_relu(h) for k, h in h_dict.items()}
h_dict = self.layer2(G, h_dict)
# get paper logits
return h_dict['paper']
###############################################################################
# Train and evaluate
# ~~~~~~~~~~~~~~~~~~
# Train and evaluate this network.
# Create the model. The output has three logits for three classes.
model = HeteroRGCN(G, 10, 10, 3)
opt = torch.optim.Adam(model.parameters(), lr=0.01, weight_decay=5e-4)
best_val_acc = 0
best_test_acc = 0
for epoch in range(100):
logits = model(G)
# The loss is computed only for labeled nodes.
loss = F.cross_entropy(logits[train_idx], labels[train_idx])
pred = logits.argmax(1)
train_acc = (pred[train_idx] == labels[train_idx]).float().mean()
val_acc = (pred[val_idx] == labels[val_idx]).float().mean()
test_acc = (pred[test_idx] == labels[test_idx]).float().mean()
if best_val_acc < val_acc:
best_val_acc = val_acc
best_test_acc = test_acc
opt.zero_grad()
loss.backward()
opt.step()
if epoch % 5 == 0:
print('Loss %.4f, Train Acc %.4f, Val Acc %.4f (Best %.4f), Test Acc %.4f (Best %.4f)' % (
loss.item(),
train_acc.item(),
val_acc.item(),
best_val_acc.item(),
test_acc.item(),
best_test_acc.item(),
))
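###############################################################################
# As a final (extra) step, you can report the test accuracy of the trained
# model without tracking gradients.
with torch.no_grad():
    pred = model(G).argmax(1)
    print('Final test accuracy: %.4f' % (pred[test_idx] == labels[test_idx]).float().mean().item())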
###############################################################################
# What's next?
# ------------
# * Check out our full implementation in PyTorch
# `here <https://github.com/dmlc/dgl/tree/master/examples/pytorch/rgcn-hetero>`_.
#
# * We also provide the following model examples:
#
# * `Graph Convolutional Matrix Completion <https://arxiv.org/abs/1706.02263>`_,
# which we implement in MXNet
# `here <https://github.com/dmlc/dgl/tree/v0.4.0/examples/mxnet/gcmc>`_.
#
# * `Heterogeneous Graph Attention Network <https://arxiv.org/abs/1903.07293>`_
# requires transforming a heterograph into a homogeneous graph according to
# a given metapath (i.e. a path template consisting of edge types). We
# provide :func:`dgl.transform.metapath_reachable_graph` to do this. See full
# implementation
# `here <https://github.com/dmlc/dgl/tree/master/examples/pytorch/han>`_.
#
# * `Metapath2vec <https://dl.acm.org/citation.cfm?id=3098036>`_ requires
# generating random walk paths according to a given metapath. Please
# refer to the full metapath2vec implementation
# `here <https://github.com/dmlc/dgl/tree/master/examples/pytorch/metapath2vec>`_.
#
# * :doc:`Full heterograph API reference <../../api/python/heterograph>`.