Unverified Commit e0189397 authored by Quan (Andy) Gan, committed by GitHub

[Tutorial] New tutorials for small graphs (#2482)

* new tutorials for small graphs

* address changes and use GraphDataLoader

* fix and add data

* fix load_data

* style fixes

* Update 5_graph_classification.py
parent 16169f3a
@@ -35,6 +35,8 @@ when the graph is heterogeneous.
     DGLGraph.metagraph
     DGLGraph.to_canonical_etype
+.. _apigraph-querying-graph-structure:
+
 Querying graph structure
 ------------------------
@@ -196,8 +196,9 @@ intersphinx_mapping = {
 from sphinx_gallery.sorting import FileNameSortKey
 examples_dirs = ['../../tutorials/basics',
-                 '../../tutorials/models']   # path to find sources
-gallery_dirs = ['tutorials/basics', 'tutorials/models']  # path to generate docs
+                 '../../tutorials/models',
+                 '../../new-tutorial']   # path to find sources
+gallery_dirs = ['tutorials/basics', 'tutorials/models', 'new-tutorial']  # path to generate docs
 reference_url = {
     'dgl' : None,
     'numpy': 'http://docs.scipy.org/doc/numpy/',
@@ -42,7 +42,7 @@ Getting Started
 ..
    Follow the :doc:`instructions<install/index>` to install DGL.
-:doc:`DGL at a glance<tutorials/basics/1_first>` is the most common place to get started with.
+:doc:`<new-tutorial/1_introduction>` is the most common place to get started with.
 It offers a broad experience of using DGL for deep learning on graph data.
 API reference document lists more endetailed specifications of each API and GNN modules,
@@ -50,9 +50,11 @@ Getting Started
 You can learn other basic concepts of DGL through the dedicated tutorials.
-* Learn constructing graphs and set/get node and edge features :doc:`here<tutorials/basics/2_basics>`.
-* Learn performing computation on graph using message passing :doc:`here<tutorials/basics/3_pagerank>`.
-* Learn processing multiple graph samples in a batch :doc:`here<tutorials/basics/4_batch>`.
+* Learn constructing, saving and loading graphs with node and edge features :doc:`here<new-tutorial/2_dglgraph>`.
+* Learn performing computation on graph using message passing :doc:`here<new-tutorial/3_message_passing>`.
+* Learn link prediction with DGL :doc:`here<new-tutorial/4_link_predict>`.
+* Learn graph classification with DGL :doc:`here<new-tutorial/5_graph_classification>`.
+* Learn creating your own dataset for DGL :doc:`here<new-tutorial/6_load_data>`.
 * Learn working with heterogeneous graph data :doc:`here<tutorials/basics/5_hetero>`.
@@ -79,7 +81,7 @@ Getting Started
    install/index
    install/backend
-   tutorials/basics/1_first
+   new-tutorial/1_introduction
 .. toctree::
    :maxdepth: 2
"""
A Blitz Introduction to DGL - Node Classification
=================================================
GNNs are powerful tools for many machine learning tasks on graphs. In
this introductory tutorial, you will learn the basic workflow of using
GNNs for node classification, i.e. predicting the category of a node in
a graph.
By completing this tutorial, you will be able to
- Load a DGL-provided dataset.
- Build a GNN model with DGL-provided neural network modules.
- Train and evaluate a GNN model for node classification on either CPU
or GPU.
This tutorial assumes that you have experience in building neural
networks with PyTorch.
(Time estimate: 13 minutes)
"""
import dgl
import torch
import torch.nn as nn
import torch.nn.functional as F
######################################################################
# Overview of Node Classification with GNN
# ----------------------------------------
#
# One of the most popular and widely adopted tasks on graph data is node
# classification, where a model needs to predict the ground truth category
# of each node. Before graph neural networks, many proposed methods used
# either connectivity alone (such as DeepWalk or node2vec) or simple
# combinations of connectivity and the node's own features. GNNs, by
# contrast, offer an opportunity to obtain node representations by
# combining the connectivity and features of a *local neighborhood*.
#
# `Kipf et al. <https://arxiv.org/abs/1609.02907>`__ is an example that
# formulates node classification as a semi-supervised task: with the help
# of only a small portion of labeled nodes, a graph neural network (GNN)
# can accurately predict the node category of the others.
#
# This tutorial will show how to build such a GNN for semi-supervised node
# classification with only a small number of labels on the Cora
# dataset,
# a citation network with papers as nodes and citations as edges. The task
# is to predict the category of a given paper. Each paper node contains a
# word count vector as its features, normalized so that they sum up to one,
# as described in Section 5.2 of
# `the paper <https://arxiv.org/abs/1609.02907>`__.
#
# Loading Cora Dataset
# --------------------
#
import dgl.data
dataset = dgl.data.CoraGraphDataset()
print('Number of categories:', dataset.num_classes)
######################################################################
# A DGL Dataset object may contain one or multiple graphs. The Cora
# dataset used in this tutorial consists of only a single graph.
#
g = dataset[0]
######################################################################
# A DGL graph can store node features and edge features in two
# dictionary-like attributes called ``ndata`` and ``edata``.
# In the DGL Cora dataset, the graph contains the following node features:
#
# - ``train_mask``: A boolean tensor indicating whether the node is in the
# training set.
#
# - ``val_mask``: A boolean tensor indicating whether the node is in the
# validation set.
#
# - ``test_mask``: A boolean tensor indicating whether the node is in the
# test set.
#
# - ``label``: The ground truth node category.
#
# - ``feat``: The node features.
#
print('Node features')
print(g.ndata)
print('Edge features')
print(g.edata)
######################################################################
# Defining a Graph Convolutional Network (GCN)
# --------------------------------------------
#
# This tutorial will build a two-layer `Graph Convolutional Network
# (GCN) <http://tkipf.github.io/graph-convolutional-networks/>`__. Each
# layer computes new node representations by aggregating neighbor
# information.
#
# To build a multi-layer GCN you can simply stack ``dgl.nn.GraphConv``
# modules, which inherit from ``torch.nn.Module``.
#
from dgl.nn import GraphConv
class GCN(nn.Module):
def __init__(self, in_feats, h_feats, num_classes):
super(GCN, self).__init__()
self.conv1 = GraphConv(in_feats, h_feats)
self.conv2 = GraphConv(h_feats, num_classes)
def forward(self, g, in_feat):
h = self.conv1(g, in_feat)
h = F.relu(h)
h = self.conv2(g, h)
return h
# Create the model with given dimensions
model = GCN(g.ndata['feat'].shape[1], 16, dataset.num_classes)
######################################################################
# DGL provides implementations of many popular neighbor aggregation
# modules. You can easily invoke them with one line of code.
#
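######################################################################
# For instance, the following lines (a quick sketch, not used in the
# rest of this tutorial; the feature sizes are arbitrary) construct a
# single GraphSAGE layer and a single GAT layer with ``dgl.nn``:
#

from dgl.nn import SAGEConv, GATConv

sage_layer = SAGEConv(in_feats=10, out_feats=16, aggregator_type='mean')
gat_layer = GATConv(in_feats=10, out_feats=16, num_heads=4)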
######################################################################
# Training the GCN
# ----------------
#
# Training this GCN is similar to training other PyTorch neural networks.
#
def train(g, model):
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
best_val_acc = 0
best_test_acc = 0
features = g.ndata['feat']
labels = g.ndata['label']
train_mask = g.ndata['train_mask']
val_mask = g.ndata['val_mask']
test_mask = g.ndata['test_mask']
for e in range(100):
# Forward
logits = model(g, features)
# Compute prediction
pred = logits.argmax(1)
# Compute loss
# Note that you should only compute the losses of the nodes in the training set.
loss = F.cross_entropy(logits[train_mask], labels[train_mask])
# Compute accuracy on training/validation/test
train_acc = (pred[train_mask] == labels[train_mask]).float().mean()
val_acc = (pred[val_mask] == labels[val_mask]).float().mean()
test_acc = (pred[test_mask] == labels[test_mask]).float().mean()
# Save the best validation accuracy and the corresponding test accuracy.
if best_val_acc < val_acc:
best_val_acc = val_acc
best_test_acc = test_acc
# Backward
optimizer.zero_grad()
loss.backward()
optimizer.step()
if e % 5 == 0:
print('In epoch {}, loss: {:.3f}, val acc: {:.3f} (best {:.3f}), test acc: {:.3f} (best {:.3f})'.format(
e, loss, val_acc, best_val_acc, test_acc, best_test_acc))
model = GCN(g.ndata['feat'].shape[1], 16, dataset.num_classes)
train(g, model)
######################################################################
# Training on GPU
# ---------------
#
# Training on GPU requires putting both the model and the graph onto the
# GPU with the ``to`` method, similar to what you would do in PyTorch.
#
# The following requires a CUDA-enabled machine and a CUDA build of DGL.
g = g.to('cuda')
model = GCN(g.ndata['feat'].shape[1], 16, dataset.num_classes).to('cuda')
train(g, model)
######################################################################
# What’s next?
# ------------
#
# - :doc:`How does DGL represent a graph <2_dglgraph>`?
# - :doc:`Write your own GNN module <3_message_passing>`.
# - :doc:`Link prediction (predicting existence of edges) on full
# graph <4_link_predict>`.
# - :doc:`Graph classification <5_graph_classification>`.
# - :doc:`Make your own dataset <6_load_data>`.
# - :ref:`The list of supported graph convolution
# modules <apinn-pytorch>`.
# - :ref:`The list of datasets provided by DGL <apidata>`.
#
"""
How Does DGL Represent A Graph?
===============================
By the end of this tutorial you will be able to:
- Construct a graph in DGL from scratch.
- Assign node and edge features to a graph.
- Query properties of a DGL graph such as node degrees and
connectivity.
- Transform a DGL graph into another graph.
- Load and save DGL graphs.
(Time estimate: 16 minutes)
"""
######################################################################
# DGL Graph Construction
# ----------------------
#
# DGL represents a directed graph as a ``DGLGraph`` object. You can
# construct a graph by specifying the number of nodes in the graph as well
# as the list of source and destination nodes. Nodes in the graph have
# consecutive IDs starting from 0.
#
# For instance, the following code constructs a directed star graph with 5
# leaves. The center node's ID is 0. The edges go from the
# center node to the leaves.
#
import dgl
import numpy as np
import torch
g = dgl.graph(([0, 0, 0, 0, 0], [1, 2, 3, 4, 5]), num_nodes=6)
# Equivalently, PyTorch LongTensors also work.
g = dgl.graph((torch.LongTensor([0, 0, 0, 0, 0]), torch.LongTensor([1, 2, 3, 4, 5])), num_nodes=6)
# You can omit the number of nodes argument if you can tell the number of nodes from the edge list alone.
g = dgl.graph(([0, 0, 0, 0, 0], [1, 2, 3, 4, 5]))
######################################################################
# Edges in the graph have consecutive IDs starting from 0, and are
# in the same order as the list of source and destination nodes during
# creation.
#
# Print the source and destination nodes of every edge.
print(g.edges())
######################################################################
# .. note::
#
# ``DGLGraph`` objects are always directed to best fit the computation
# pattern of graph neural networks, where the messages sent
# from one node to another are often different between the two
# directions. If you want to handle undirected graphs, you may consider
# treating them as bidirectional graphs. See `Graph
# Transformations`_ for an example of making
# a bidirectional graph.
#
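######################################################################
# As a quick sketch, ``dgl.to_bidirected`` builds such a bidirectional
# graph from the star graph above by adding an edge in the opposite
# direction for every existing edge (``dgl.add_reverse_edges`` is
# demonstrated later in this tutorial):
#

bg = dgl.to_bidirected(g)
# The bidirected graph has 10 edges: 5 out of the center and 5 into it.
print(bg.edges())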
######################################################################
# Assigning Node and Edge Features to Graph
# -----------------------------------------
#
# Many graph datasets contain attributes on nodes and edges.
# Although the types of node and edge attributes can be arbitrary in the
# real world, ``DGLGraph`` only accepts attributes stored in tensors (with
# numerical contents). Consequently, a given attribute must have the same
# shape across all nodes (or across all edges). In the context of deep
# learning, those attributes are often called *features*.
#
# You can assign and retrieve node and edge features via ``ndata`` and
# ``edata`` interface.
#
# Assign a 3-dimensional node feature vector for each node.
g.ndata['x'] = torch.randn(6, 3)
# Assign a 4-dimensional edge feature vector for each edge.
g.edata['a'] = torch.randn(5, 4)
# Assign a 5x4 node feature matrix for each node. Node and edge features in DGL can be multi-dimensional.
g.ndata['y'] = torch.randn(6, 5, 4)
print(g.edata['a'])
######################################################################
# .. note::
#
# The rapid development of deep learning has provided us with many
# ways to encode various types of attributes into numerical features.
# Here are some general suggestions:
#
# - For categorical attributes (e.g. gender, occupation), consider
# converting them to integers or one-hot encoding.
# - For variable length string contents (e.g. news article, quote),
# consider applying a language model.
# - For images, consider applying a vision model such as CNNs.
#
# You can find plenty of materials on how to encode such attributes
# into a tensor in the `PyTorch Deep Learning
# Tutorials <https://pytorch.org/tutorials/>`__.
#
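######################################################################
# As a minimal sketch of the first suggestion (the categorical attribute
# below is made up purely for illustration and is not part of this
# tutorial's graph), integer category codes can be one-hot encoded and
# stored as a node feature:
#

# A hypothetical categorical attribute with 3 categories, one per node.
category_codes = torch.tensor([0, 2, 1, 0, 2, 1])
g.ndata['category'] = torch.nn.functional.one_hot(category_codes, num_classes=3).float()
print(g.ndata['category'])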
######################################################################
# Querying Graph Structures
# -------------------------
#
# A ``DGLGraph`` object provides various methods to query the graph structure.
#
print(g.num_nodes())
print(g.num_edges())
# Out degrees of the center node
print(g.out_degrees(0))
# In degrees of the center node - note that the graph is directed so the in degree should be 0.
print(g.in_degrees(0))
######################################################################
# Graph Transformations
# ---------------------
#
######################################################################
# DGL provides many APIs to transform a graph into another, such as
# extracting a subgraph:
#
# Induce a subgraph from node 0, node 1 and node 3 from the original graph.
sg1 = g.subgraph([0, 1, 3])
# Induce a subgraph from edge 0, edge 1 and edge 3 from the original graph.
sg2 = g.edge_subgraph([0, 1, 3])
######################################################################
# You can obtain the node/edge mapping from the subgraph to the original
# graph by looking into the node feature ``dgl.NID`` or edge feature
# ``dgl.EID`` in the new graph.
#
# The original IDs of each node in sg1
print(sg1.ndata[dgl.NID])
# The original IDs of each edge in sg1
print(sg1.edata[dgl.EID])
# The original IDs of each node in sg2
print(sg2.ndata[dgl.NID])
# The original IDs of each edge in sg2
print(sg2.edata[dgl.EID])
######################################################################
# ``subgraph`` and ``edge_subgraph`` also copy the original features
# to the subgraph:
#
# The original node feature of each node in sg1
print(sg1.ndata['x'])
# The original edge feature of each edge in sg1
print(sg1.edata['a'])
# The original node feature of each node in sg2
print(sg2.ndata['x'])
# The original edge feature of each edge in sg2
print(sg2.edata['a'])
######################################################################
# Another common transformation is to add a reverse edge for each edge in
# the original graph with ``dgl.add_reverse_edges``.
#
# .. note::
#
# If you have an undirected graph, it is better to convert it
# into a bidirectional graph first by adding reverse edges.
#
newg = dgl.add_reverse_edges(g)
print(newg.edges())
######################################################################
# Loading and Saving Graphs
# -------------------------
#
# You can save a graph or a list of graphs via ``dgl.save_graphs`` and
# load them back with ``dgl.load_graphs``.
#
# Save graphs
dgl.save_graphs('graph.dgl', g)
dgl.save_graphs('graphs.dgl', [g, sg1, sg2])
# Load graphs
(g,), _ = dgl.load_graphs('graph.dgl')
print(g)
(g, sg1, sg2), _ = dgl.load_graphs('graphs.dgl')
print(g)
print(sg1)
print(sg2)
######################################################################
# What’s next?
# ------------
#
# - See
# :ref:`here <apigraph-querying-graph-structure>`
# for a list of graph structure query APIs.
# - See
# :ref:`here <api-subgraph-extraction>`
# for a list of subgraph extraction routines.
# - See
# :ref:`here <api-transform>`
# for a list of graph transformation routines.
# - API reference of :func:`dgl.save_graphs`
# and
# :func:`dgl.load_graphs`
#
"""
Write your own GNN module
=========================
Sometimes, your model goes beyond simply stacking existing GNN modules.
For example, you would like to invent a new way of aggregating neighbor
information by considering node importance or edge weights.
By the end of this tutorial you will be able to
- Understand DGL’s message passing APIs.
- Implement your own GraphSAGE convolution module.
This tutorial assumes that you already know :doc:`the basics of training a
GNN for node classification <1_introduction>`.
(Time estimate: 10 minutes)
"""
import dgl
import torch
import torch.nn as nn
import torch.nn.functional as F
######################################################################
# Message passing and GNNs
# ------------------------
#
# DGL follows the *message passing paradigm* inspired by the Message
# Passing Neural Network proposed by `Gilmer et
# al. <https://arxiv.org/abs/1704.01212>`__ Essentially, they found that many
# GNN models can fit into the following framework:
#
# .. math::
#
#
# m_{u\to v}^{(l)} = M^{(l)}\left(h_v^{(l-1)}, h_u^{(l-1)}, e_{u\to v}^{(l-1)}\right)
#
# .. math::
#
#
# m_{v}^{(l)} = \sum_{u\in\mathcal{N}(v)}m_{u\to v}^{(l)}
#
# .. math::
#
#
# h_v^{(l)} = U^{(l)}\left(h_v^{(l-1)}, m_v^{(l)}\right)
#
# where DGL calls :math:`M^{(l)}` the *message function*, :math:`\sum` the
# *reduce function* and :math:`U^{(l)}` the *update function*. Note that
# :math:`\sum` here can represent any function and is not necessarily a
# summation.
#
######################################################################
# For example, the `GraphSAGE convolution (Hamilton et al.,
# 2017) <https://cs.stanford.edu/people/jure/pubs/graphsage-nips17.pdf>`__
# takes the following mathematical form:
#
# .. math::
#
#
# h_{\mathcal{N}(v)}^k\leftarrow \text{Average}\{h_u^{k-1},\forall u\in\mathcal{N}(v)\}
#
# .. math::
#
#
# h_v^k\leftarrow \text{ReLU}\left(W^k\cdot \text{CONCAT}(h_v^{k-1}, h_{\mathcal{N}(v)}^k) \right)
#
# You can see that message passing is directional: the message sent from
# a node :math:`u` to another node :math:`v` is not necessarily the same
# as the message sent from node :math:`v` to node :math:`u` in the
# opposite direction.
#
# Although DGL has builtin support for GraphSAGE via
# :class:`dgl.nn.SAGEConv <dgl.nn.pytorch.SAGEConv>`,
# here is how you can implement GraphSAGE convolution in DGL on your own.
#
import dgl.function as fn
class SAGEConv(nn.Module):
"""Graph convolution module used by the GraphSAGE model.
Parameters
----------
in_feat : int
Input feature size.
out_feat : int
Output feature size.
"""
def __init__(self, in_feat, out_feat):
super(SAGEConv, self).__init__()
# A linear submodule for projecting the input and neighbor feature to the output.
self.linear = nn.Linear(in_feat * 2, out_feat)
def forward(self, g, h):
"""Forward computation
Parameters
----------
g : Graph
The input graph.
h : Tensor
The input node feature.
"""
with g.local_scope():
g.ndata['h'] = h
# update_all is a message passing API.
g.update_all(message_func=fn.copy_u('h', 'm'), reduce_func=fn.mean('m', 'h_N'))
h_N = g.ndata['h_N']
h_total = torch.cat([h, h_N], dim=1)
return self.linear(h_total)
######################################################################
# The central piece in this code is the
# :func:`g.update_all <dgl.DGLGraph.update_all>`
# function, which gathers and averages the neighbor features. There are
# three concepts here:
#
# * Message function ``fn.copy_u('h', 'm')`` that
# copies the node feature under name ``'h'`` as *messages* sent to
# neighbors.
#
# * Reduce function ``fn.mean('m', 'h_N')`` that averages
# all the received messages under name ``'m'`` and saves the result as a
# new node feature ``'h_N'``.
#
# * ``update_all`` tells DGL to trigger the
# message and reduce functions for all the nodes and edges.
#
######################################################################
# Afterwards, you can stack your own GraphSAGE convolution layers to form
# a multi-layer GraphSAGE network.
#
class Model(nn.Module):
def __init__(self, in_feats, h_feats, num_classes):
super(Model, self).__init__()
self.conv1 = SAGEConv(in_feats, h_feats)
self.conv2 = SAGEConv(h_feats, num_classes)
def forward(self, g, in_feat):
h = self.conv1(g, in_feat)
h = F.relu(h)
h = self.conv2(g, h)
return h
######################################################################
# Training loop
# ~~~~~~~~~~~~~
# The following code for data loading and training loop is directly copied
# from the introduction tutorial.
#
import dgl.data
dataset = dgl.data.CoraGraphDataset()
g = dataset[0]
def train(g, model):
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
all_logits = []
best_val_acc = 0
best_test_acc = 0
features = g.ndata['feat']
labels = g.ndata['label']
train_mask = g.ndata['train_mask']
val_mask = g.ndata['val_mask']
test_mask = g.ndata['test_mask']
for e in range(200):
# Forward
logits = model(g, features)
# Compute prediction
pred = logits.argmax(1)
# Compute loss
# Note that we should only compute the losses of the nodes in the training set,
# i.e. with train_mask 1.
loss = F.cross_entropy(logits[train_mask], labels[train_mask])
# Compute accuracy on training/validation/test
train_acc = (pred[train_mask] == labels[train_mask]).float().mean()
val_acc = (pred[val_mask] == labels[val_mask]).float().mean()
test_acc = (pred[test_mask] == labels[test_mask]).float().mean()
# Save the best validation accuracy and the corresponding test accuracy.
if best_val_acc < val_acc:
best_val_acc = val_acc
best_test_acc = test_acc
# Backward
optimizer.zero_grad()
loss.backward()
optimizer.step()
all_logits.append(logits.detach())
if e % 5 == 0:
print('In epoch {}, loss: {:.3f}, val acc: {:.3f} (best {:.3f}), test acc: {:.3f} (best {:.3f})'.format(
e, loss, val_acc, best_val_acc, test_acc, best_test_acc))
model = Model(g.ndata['feat'].shape[1], 16, dataset.num_classes)
train(g, model)
######################################################################
# More customization
# ------------------
#
# In DGL, we provide many built-in message and reduce functions under the
# ``dgl.function`` package. You can find more details in :ref:`the API
# doc <apifunction>`.
#
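######################################################################
# As a small illustration (a sketch, not needed for the rest of this
# tutorial), builtin message functions also work with ``apply_edges``,
# which computes edge-wise data without any reduction. The following
# computes, for every edge, the element-wise sum of its two endpoints'
# features:
#

with g.local_scope():
    g.apply_edges(fn.u_add_v('feat', 'feat', 'feat_sum'))
    print(g.edata['feat_sum'].shape)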
######################################################################
# These APIs allow one to quickly implement new graph convolution modules.
# For example, the following implements a new ``SAGEConv`` that aggregates
# neighbor representations using a weighted average. Note that the
# ``edata`` member can hold edge features, which can also take part in
# message passing.
#
class WeightedSAGEConv(nn.Module):
"""Graph convolution module used by the GraphSAGE model with edge weights.
Parameters
----------
in_feat : int
Input feature size.
out_feat : int
Output feature size.
"""
def __init__(self, in_feat, out_feat):
super(WeightedSAGEConv, self).__init__()
# A linear submodule for projecting the input and neighbor feature to the output.
self.linear = nn.Linear(in_feat * 2, out_feat)
def forward(self, g, h, w):
"""Forward computation
Parameters
----------
g : Graph
The input graph.
h : Tensor
The input node feature.
w : Tensor
The edge weight.
"""
with g.local_scope():
g.ndata['h'] = h
g.edata['w'] = w
g.update_all(message_func=fn.u_mul_e('h', 'w', 'm'), reduce_func=fn.mean('m', 'h_N'))
h_N = g.ndata['h_N']
h_total = torch.cat([h, h_N], dim=1)
return self.linear(h_total)
######################################################################
# Because the graph in this dataset does not have edge weights, we
# manually assign all edge weights to one in the ``forward()`` function of
# the model. You can replace it with your own edge weights.
#
class Model(nn.Module):
def __init__(self, in_feats, h_feats, num_classes):
super(Model, self).__init__()
self.conv1 = WeightedSAGEConv(in_feats, h_feats)
self.conv2 = WeightedSAGEConv(h_feats, num_classes)
def forward(self, g, in_feat):
h = self.conv1(g, in_feat, torch.ones(g.num_edges()).to(g.device))
h = F.relu(h)
h = self.conv2(g, h, torch.ones(g.num_edges()).to(g.device))
return h
model = Model(g.ndata['feat'].shape[1], 16, dataset.num_classes)
train(g, model)
######################################################################
# Even more customization by user-defined function
# ------------------------------------------------
#
# DGL allows user-defined message and reduce functions for maximal
# expressiveness. Here is a user-defined message function that is
# equivalent to ``fn.u_mul_e('h', 'w', 'm')``.
#
def u_mul_e_udf(edges):
return {'m' : edges.src['h'] * edges.data['w']}
######################################################################
# ``edges`` has three members: ``src``, ``data`` and ``dst``, representing
# the source node feature, edge feature, and destination node feature for
# all edges.
#
######################################################################
# You can also write your own reduce function. For example, the following
# is equivalent to the builtin ``fn.sum('m', 'h')`` function that sums up
# the incoming messages:
#
def sum_udf(nodes):
return {'h': nodes.mailbox['m'].sum(1)}
######################################################################
# In short, DGL will group the nodes by their in-degrees, and for each
# group DGL stacks the incoming messages along the second dimension. You
# can then perform a reduction along the second dimension to aggregate
# messages.
#
# For more details on customizing message and reduce function with
# user-defined function, please refer to the :ref:`API
# reference <apiudf>`.
#
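######################################################################
# As one more sketch (not required by the rest of this tutorial), a
# user-defined reduce function equivalent to the builtin
# ``fn.max('m', 'h')`` reduces the stacked messages with a max along the
# second dimension:
#

def max_udf(nodes):
    # nodes.mailbox['m'] stacks the incoming messages of each degree
    # group along dimension 1; max(1) reduces over that dimension.
    return {'h': nodes.mailbox['m'].max(1)[0]}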
######################################################################
# Best practice of writing custom GNN modules
# -------------------------------------------
#
# DGL recommends the following practices, ranked by preference:
#
# - Use ``dgl.nn`` modules.
# - Use ``dgl.nn.functional`` functions which contain lower-level complex
# operations such as computing a softmax for each node over incoming
# edges.
# - Use ``update_all`` with builtin message and reduce functions.
# - Use user-defined message or reduce functions.
#
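######################################################################
# For example (a sketch of the second option above; the edge scores here
# are random and purely illustrative), ``dgl.nn.functional.edge_softmax``
# normalizes one score per edge over the incoming edges of each
# destination node, which is the core operation of attention modules
# such as GAT:
#

from dgl.nn.functional import edge_softmax

scores = torch.randn(g.num_edges(), 1)   # one raw score per edge
attention = edge_softmax(g, scores)      # normalized over incoming edges
print(attention.shape)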
######################################################################
# What’s next?
# ------------
#
# - :ref:`Writing Efficient Message Passing
# Code <guide-message-passing-efficient>`.
#
"""
Link Prediction using Graph Neural Networks
===========================================
In the :doc:`introduction <1_introduction>`, you have already learned the
basic workflow of using GNNs for node classification, i.e. predicting
the category of a node in a graph. This tutorial will teach you how to
train a GNN for link prediction, i.e. predicting the existence of an
edge between two arbitrary nodes in a graph.
By the end of this tutorial you will be able to
- Build a GNN-based link prediction model.
- Train and evaluate the model on a small DGL-provided dataset.
(Time estimate: 20 minutes)
"""
import dgl
import torch
import torch.nn as nn
import torch.nn.functional as F
import numpy as np
import scipy.sparse as sp
######################################################################
# Overview of Link Prediction with GNN
# ------------------------------------
#
# Many applications such as social recommendation, item recommendation,
# knowledge graph completion, etc., can be formulated as link prediction,
# which predicts whether an edge exists between two particular nodes. This
# tutorial shows an example of predicting whether a citation relationship,
# either citing or being cited, between two papers exists in a citation
# network.
#
# This tutorial follows a relatively simple practice from
# `SEAL <https://papers.nips.cc/paper/2018/file/53f0d7c537d99b3824f0f99d62ea2428-Paper.pdf>`__.
# It formulates the link prediction problem as a binary classification
# problem as follows:
#
# - Treat the edges in the graph as *positive examples*.
# - Sample a number of non-existent edges (i.e. node pairs with no edges
# between them) as *negative* examples.
# - Divide the positive examples and negative examples into a training
# set and a test set.
# - Evaluate the model with any binary classification metric such as Area
# Under Curve (AUC).
#
# In some domains such as large-scale recommender systems or information
# retrieval, you may favor metrics that emphasize good performance of
# top-K predictions. In these cases you may want to consider other metrics
# such as mean average precision, and use other negative sampling methods,
# which are beyond the scope of this tutorial.
#
# Loading graph and features
# --------------------------
#
# Following the :doc:`introduction <1_introduction>`, we first load the
# Cora dataset.
#
import dgl.data
dataset = dgl.data.CoraGraphDataset()
g = dataset[0]
######################################################################
# Preparing training and testing sets
# -----------------------------------
#
# This tutorial randomly picks 10% of the edges for positive examples in
# the test set, and leaves the rest for the training set. It then samples
# the same number of edges for negative examples in both sets.
#
# Split edge set for training and testing
u, v = g.edges()
eids = np.arange(g.number_of_edges())
eids = np.random.permutation(eids)
test_size = int(len(eids) * 0.1)
train_size = g.number_of_edges() - test_size
test_pos_u, test_pos_v = u[eids[:test_size]], v[eids[:test_size]]
train_pos_u, train_pos_v = u[eids[test_size:]], v[eids[test_size:]]
# Find all negative edges and split them for training and testing
adj = sp.coo_matrix((np.ones(len(u)), (u.numpy(), v.numpy())),
                    shape=(g.number_of_nodes(), g.number_of_nodes()))
adj_neg = 1 - adj.todense() - np.eye(g.number_of_nodes())
neg_u, neg_v = np.where(adj_neg != 0)
# Sample as many negative edges as there are positive edges so that the
# training and test sets stay balanced.
neg_eids = np.random.choice(len(neg_u), g.number_of_edges())
test_neg_u, test_neg_v = neg_u[neg_eids[:test_size]], neg_v[neg_eids[:test_size]]
train_neg_u, train_neg_v = neg_u[neg_eids[test_size:]], neg_v[neg_eids[test_size:]]
# Create the training set. Positive examples are labeled 1 and negative
# examples are labeled 0.
train_u = torch.cat([torch.as_tensor(train_pos_u), torch.as_tensor(train_neg_u)])
train_v = torch.cat([torch.as_tensor(train_pos_v), torch.as_tensor(train_neg_v)])
train_label = torch.cat([torch.ones(len(train_pos_u)), torch.zeros(len(train_neg_u))])
# Create the testing set.
test_u = torch.cat([torch.as_tensor(test_pos_u), torch.as_tensor(test_neg_u)])
test_v = torch.cat([torch.as_tensor(test_pos_v), torch.as_tensor(test_neg_v)])
test_label = torch.cat([torch.ones(len(test_pos_u)), torch.zeros(len(test_neg_u))])
######################################################################
# When training, you will need to remove the edges in the test set from
# the original graph. You can do this via ``dgl.remove_edges``.
#
# .. note::
#
# ``dgl.remove_edges`` works by creating a subgraph from the original
# graph, resulting in a copy, and could therefore be slow for large
# graphs. If so, you can save the training and test graphs to
# disk, as you would do for other preprocessing steps.
#
train_g = dgl.remove_edges(g, eids[:test_size])
######################################################################
# Defining a GraphSAGE model
# --------------------------
#
# This tutorial builds a model consisting of two
# `GraphSAGE <https://arxiv.org/abs/1706.02216>`__ layers, each of which computes
# new node representations by averaging neighbor information. DGL provides
# ``dgl.nn.SAGEConv`` that conveniently creates a GraphSAGE layer.
#
from dgl.nn import SAGEConv
# ----------- 2. create model -------------- #
# build a two-layer GraphSAGE model
class GraphSAGE(nn.Module):
def __init__(self, in_feats, h_feats):
super(GraphSAGE, self).__init__()
self.conv1 = SAGEConv(in_feats, h_feats, 'mean')
self.conv2 = SAGEConv(h_feats, h_feats, 'mean')
def forward(self, g, in_feat):
h = self.conv1(g, in_feat)
h = F.relu(h)
h = self.conv2(g, h)
return h
model = GraphSAGE(train_g.ndata['feat'].shape[1], 16)
######################################################################
# The model then predicts the probability of existence of an edge by
# computing a dot product between the representations of both incident
# nodes.
#
# .. math::
#
#
# \hat{y}_{u\sim v} = \sigma(h_u^T h_v)
#
# The loss function is simply binary cross entropy loss.
#
# .. math::
#
#
# \mathcal{L} = -\sum_{u\sim v\in \mathcal{D}}\left( y_{u\sim v}\log(\hat{y}_{u\sim v}) + (1-y_{u\sim v})\log(1-\hat{y}_{u\sim v})\right)
#
# .. note::
#
# This tutorial does not include evaluation on a validation
# set. In practice you should save and evaluate the best model based on
# performance on the validation set.
#
# ----------- 3. set up loss and optimizer -------------- #
# in this case, the loss is computed inside the training loop
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
# ----------- 4. training -------------------------------- #
for e in range(100):
# forward
logits = model(train_g, train_g.ndata['feat'])
pred = torch.sigmoid((logits[train_u] * logits[train_v]).sum(dim=1))
# compute loss
loss = F.binary_cross_entropy(pred, train_label)
# backward
optimizer.zero_grad()
loss.backward()
optimizer.step()
if e % 5 == 0:
print('In epoch {}, loss: {}'.format(e, loss))
# ----------- 5. check results ------------------------ #
from sklearn.metrics import roc_auc_score
with torch.no_grad():
pred = torch.sigmoid((logits[test_u] * logits[test_v]).sum(dim=1))
pred = pred.numpy()
label = test_label.numpy()
print('AUC', roc_auc_score(label, pred))
"""
Training a GNN for Graph Classification
=======================================
By the end of this tutorial, you will be able to
- Load a DGL-provided graph classification dataset.
- Understand what *readout* function does.
- Understand how to create and use a minibatch of graphs.
- Build a GNN-based graph classification model.
- Train and evaluate the model on a DGL-provided dataset.
(Time estimate: 18 minutes)
"""
import dgl
import torch
import torch.nn as nn
import torch.nn.functional as F
######################################################################
# Overview of Graph Classification with GNN
# -----------------------------------------
#
# Graph classification or regression requires a model to predict certain
# graph-level properties of a single graph given its node and edge
# features. Molecular property prediction is one particular application.
#
# This tutorial shows how to train a graph classification model for a
# small dataset from the paper `How Powerful Are Graph Neural
# Networks <https://arxiv.org/abs/1810.00826>`__.
#
# Loading Data
# ------------
#
import dgl.data
# Load the PROTEINS dataset, a collection of protein graphs used in the paper above.
dataset = dgl.data.GINDataset('PROTEINS', self_loop=True)
######################################################################
# The dataset is a set of graphs, each with node features and a single
# label. One can see the node feature dimensionality and the number of
# possible graph categories of a ``GINDataset`` object in its ``dim_nfeats``
# and ``gclasses`` attributes.
#
print('Node feature dimensionality:', dataset.dim_nfeats)
print('Number of graph categories:', dataset.gclasses)
######################################################################
# Defining Data Loader
# --------------------
#
# A graph classification dataset usually contains two types of elements: a
# set of graphs, and their graph-level labels. Similar to an image
# classification task, when the dataset is large enough, we need to train
# with mini-batches. When you train a model for image classification or
# language modeling, you will use a ``DataLoader`` to iterate over the
# dataset. In DGL, you can use the ``GraphDataLoader``.
#
# You can also use various dataset samplers provided in
# `torch.utils.data.sampler <https://pytorch.org/docs/stable/data.html#data-loading-order-and-sampler>`__.
# For example, this tutorial creates a training ``GraphDataLoader`` and
# test ``GraphDataLoader``, using ``SubsetRandomSampler`` to tell PyTorch
# to sample from only a subset of the dataset.
#
from dgl.dataloading import GraphDataLoader
from torch.utils.data.sampler import SubsetRandomSampler
num_examples = len(dataset)
num_train = int(num_examples * 0.8)
train_sampler = SubsetRandomSampler(torch.arange(num_train))
test_sampler = SubsetRandomSampler(torch.arange(num_train, num_examples))
train_dataloader = GraphDataLoader(
dataset, sampler=train_sampler, batch_size=5, drop_last=False)
test_dataloader = GraphDataLoader(
dataset, sampler=test_sampler, batch_size=5, drop_last=False)
######################################################################
# You can try to iterate over the created ``GraphDataLoader`` and see what it
# gives:
#
it = iter(train_dataloader)
batch = next(it)
print(batch)
######################################################################
# As each element in ``dataset`` has a graph and a label, the
# ``GraphDataLoader`` will return two objects for each iteration. The
# first element is the batched graph, and the second element is simply a
# label vector representing the category of each graph in the mini-batch.
# Next, we’ll talk about the batched graph.
#
# A Batched Graph in DGL
# ----------------------
#
# In each mini-batch, the sampled graphs are combined into a single bigger
# batched graph via ``dgl.batch``. The single bigger batched graph merges
# all original graphs as separate connected components, with the node
# and edge features concatenated. This bigger graph is also a ``DGLGraph``
# instance (so you can still treat it as a normal ``DGLGraph`` object as
# in :doc:`the DGLGraph tutorial <2_dglgraph>`). It however contains the information
# necessary for recovering the original graphs, such as the number of
# nodes and edges of each graph element.
#
batched_graph, labels = batch
print('Number of nodes for each graph element in the batch:', batched_graph.batch_num_nodes())
print('Number of edges for each graph element in the batch:', batched_graph.batch_num_edges())
# Recover the original graph elements from the minibatch
graphs = dgl.unbatch(batched_graph)
print('The original graphs in the minibatch:')
print(graphs)
######################################################################
# Define Model
# ------------
#
# This tutorial will build a two-layer `Graph Convolutional Network
# (GCN) <http://tkipf.github.io/graph-convolutional-networks/>`__. Each of
# its layers computes new node representations by aggregating neighbor
# information. If you have gone through the
# :doc:`introduction <1_introduction>`, you will notice two
# differences:
#
# - Since the task is to predict a single category for the *entire graph*
# instead of for every node, you will need to aggregate the
# representations of all the nodes and potentially the edges to form a
# graph-level representation. Such a process is more commonly referred to as
# a *readout*. A simple choice is to average the node features of a
# graph with ``dgl.mean_nodes()``.
#
# - The input graph to the model will be a batched graph yielded by the
# ``GraphDataLoader``. The readout functions provided by DGL can handle
# batched graphs so that they will return one representation for each
# minibatch element.
#
from dgl.nn import GraphConv
class GCN(nn.Module):
def __init__(self, in_feats, h_feats, num_classes):
super(GCN, self).__init__()
self.conv1 = GraphConv(in_feats, h_feats)
self.conv2 = GraphConv(h_feats, num_classes)
def forward(self, g, in_feat):
h = self.conv1(g, in_feat)
h = F.relu(h)
h = self.conv2(g, h)
g.ndata['h'] = h
return dgl.mean_nodes(g, 'h')
######################################################################
# Training Loop
# -------------
#
# The training loop iterates over the training set with the
# ``GraphDataLoader`` object and computes the gradients, just like
# image classification or language modeling.
#
# Create the model with given dimensions
model = GCN(dataset.dim_nfeats, 16, dataset.gclasses)
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
for epoch in range(20):
for batched_graph, labels in train_dataloader:
pred = model(batched_graph, batched_graph.ndata['attr'].float())
loss = F.cross_entropy(pred, labels)
optimizer.zero_grad()
loss.backward()
optimizer.step()
num_correct = 0
num_tests = 0
for batched_graph, labels in test_dataloader:
pred = model(batched_graph, batched_graph.ndata['attr'].float())
num_correct += (pred.argmax(1) == labels).sum().item()
num_tests += len(labels)
print('Test accuracy:', num_correct / num_tests)
######################################################################
# What’s next
# -----------
#
# - See `GIN
# example <https://github.com/dmlc/dgl/tree/master/examples/pytorch/gin>`__
# for an end-to-end graph classification model.
#
"""
Make Your Own Dataset
=====================
This tutorial assumes that you already know :doc:`the basics of training a
GNN for node classification <1_introduction>` and :doc:`how to
create, load, and store a DGL graph <2_dglgraph>`.
By the end of this tutorial, you will be able to
- Create your own graph dataset for node classification, link
prediction, or graph classification.
(Time estimate: 15 minutes)
"""
######################################################################
# ``DGLDataset`` Object Overview
# ------------------------------
#
# Your custom graph dataset should inherit the ``dgl.data.DGLDataset``
# class and implement the following methods:
#
# - ``__getitem__(self, i)``: retrieve the ``i``-th example of the
# dataset. An example often contains a single DGL graph, and
# occasionally its label.
# - ``__len__(self)``: the number of examples in the dataset.
# - ``process(self)``: load and process raw data from disk.
#
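######################################################################
# A minimal skeleton looks like the following (a sketch only; the class
# name and the in-memory lists are made up for illustration):
#

from dgl.data import DGLDataset

class MyDataset(DGLDataset):
    def __init__(self):
        super().__init__(name='my_dataset')

    def process(self):
        # Load raw data from disk and build the graphs and labels here.
        self.graphs = []
        self.labels = []

    def __getitem__(self, i):
        return self.graphs[i], self.labels[i]

    def __len__(self):
        return len(self.graphs)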
######################################################################
# Creating a Dataset for Node Classification or Link Prediction from CSV
# ----------------------------------------------------------------------
#
# A node classification dataset often consists of a single graph, as well
# as its node and edge features.
#
# This tutorial takes a small dataset based on `Zachary’s Karate Club
# network <https://en.wikipedia.org/wiki/Zachary%27s_karate_club>`__. It
# contains
#
# * A ``members.csv`` file containing the attributes of all
# club members.
#
# * An ``interactions.csv`` file
# containing the pair-wise interactions between two club members.
#
import urllib.request
import pandas as pd
urllib.request.urlretrieve(
'https://data.dgl.ai/tutorial/dataset/members.csv', './members.csv')
urllib.request.urlretrieve(
'https://data.dgl.ai/tutorial/dataset/interactions.csv', './interactions.csv')
members = pd.read_csv('./members.csv')
members.head()
interactions = pd.read_csv('./interactions.csv')
interactions.head()
######################################################################
# This tutorial treats the members as nodes and interactions as edges. It
# takes age as a numeric feature of the nodes, affiliated club as the label
# of the nodes, and edge weight as a numeric feature of the edges.
#
# .. note::
#
# The original Zachary’s Karate Club network does not have
# member ages. The ages in this tutorial are generated synthetically
# for demonstrating how to add node features into the graph for dataset
# creation.
#
# .. note::
#
# In practice, taking age directly as a numeric feature may
# not work well in machine learning; strategies like binning or
# normalizing the feature would work better. This tutorial directly
# takes the values as-is for simplicity.
#
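######################################################################
# As a quick sketch of such a strategy (not used in the rest of this
# tutorial), one could standardize the age column before storing it as a
# feature:
#

import torch

ages = torch.from_numpy(members['Age'].to_numpy()).float()
normalized_ages = (ages - ages.mean()) / ages.std()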
import dgl
from dgl.data import DGLDataset
import torch
import os
class KarateClubDataset(DGLDataset):
def __init__(self):
super().__init__(name='karate_club')
def process(self):
nodes_data = pd.read_csv('./members.csv')
edges_data = pd.read_csv('./interactions.csv')
node_features = torch.from_numpy(nodes_data['Age'].to_numpy())
node_labels = torch.from_numpy(nodes_data['Club'].astype('category').cat.codes.to_numpy())
edge_features = torch.from_numpy(edges_data['Weight'].to_numpy())
edges_src = torch.from_numpy(edges_data['Src'].to_numpy())
edges_dst = torch.from_numpy(edges_data['Dst'].to_numpy())
self.graph = dgl.graph((edges_src, edges_dst), num_nodes=nodes_data.shape[0])
self.graph.ndata['feat'] = node_features
self.graph.ndata['label'] = node_labels
self.graph.edata['weight'] = edge_features
# If your dataset is a node classification dataset, you will need to assign
# masks indicating whether a node belongs to training, validation, and test set.
n_nodes = nodes_data.shape[0]
n_train = int(n_nodes * 0.6)
n_val = int(n_nodes * 0.2)
train_mask = torch.zeros(n_nodes, dtype=torch.bool)
val_mask = torch.zeros(n_nodes, dtype=torch.bool)
test_mask = torch.zeros(n_nodes, dtype=torch.bool)
train_mask[:n_train] = True
val_mask[n_train:n_train + n_val] = True
test_mask[n_train + n_val:] = True
self.graph.ndata['train_mask'] = train_mask
self.graph.ndata['val_mask'] = val_mask
self.graph.ndata['test_mask'] = test_mask
def __getitem__(self, i):
return self.graph
def __len__(self):
return 1
dataset = KarateClubDataset()
graph = dataset[0]
print(graph)
######################################################################
# Since a link prediction dataset only involves a single graph, preparing
# a link prediction dataset follows the same process as preparing a
# node classification dataset.
#
######################################################################
# Creating a Dataset for Graph Classification from CSV
# ----------------------------------------------------
#
# Creating a graph classification dataset involves implementing
# ``__getitem__`` to return both the graph and its graph-level label.
#
# This tutorial demonstrates how to create a graph classification dataset
# with the following synthetic CSV data:
#
# - ``graph_edges.csv``: containing three columns:
#
# - ``graph_id``: the ID of the graph.
# - ``src``: the source node of an edge of the given graph.
# - ``dst``: the destination node of an edge of the given graph.
#
# - ``graph_properties.csv``: containing three columns:
#
# - ``graph_id``: the ID of the graph.
# - ``label``: the label of the graph.
# - ``num_nodes``: the number of nodes in the graph.
#
urllib.request.urlretrieve(
'https://data.dgl.ai/tutorial/dataset/graph_edges.csv', './graph_edges.csv')
urllib.request.urlretrieve(
'https://data.dgl.ai/tutorial/dataset/graph_properties.csv', './graph_properties.csv')
edges = pd.read_csv('./graph_edges.csv')
properties = pd.read_csv('./graph_properties.csv')
edges.head()
properties.head()
class SyntheticDataset(DGLDataset):
def __init__(self):
super().__init__(name='synthetic')
def process(self):
edges = pd.read_csv('./graph_edges.csv')
properties = pd.read_csv('./graph_properties.csv')
self.graphs = []
self.labels = []
# Create a graph for each graph ID from the edges table.
# First process the properties table into two dictionaries with graph IDs as keys.
# The label and number of nodes are values.
label_dict = {}
num_nodes_dict = {}
for _, row in properties.iterrows():
label_dict[row['graph_id']] = row['label']
num_nodes_dict[row['graph_id']] = row['num_nodes']
# For the edges, first group the table by graph IDs.
edges_group = edges.groupby('graph_id')
# For each graph ID...
for graph_id in edges_group.groups:
# Find the edges as well as the number of nodes and its label.
edges_of_id = edges_group.get_group(graph_id)
src = edges_of_id['src'].to_numpy()
dst = edges_of_id['dst'].to_numpy()
num_nodes = num_nodes_dict[graph_id]
label = label_dict[graph_id]
# Create a graph and add it to the list of graphs and labels.
g = dgl.graph((src, dst), num_nodes=num_nodes)
self.graphs.append(g)
self.labels.append(label)
# Convert the label list to tensor for saving.
self.labels = torch.LongTensor(self.labels)
def __getitem__(self, i):
return self.graphs[i], self.labels[i]
def __len__(self):
return len(self.graphs)
dataset = SyntheticDataset()
graph, label = dataset[0]
print(graph, label)