"...pipelines/controlnet/pipeline_flax_controlnet.py" did not exist on "b562b6611fb53dae9bcffcaaf44d944539ae22de"
Unverified Commit 8801154b authored by VoVAllen, committed by GitHub

Merge pull request #1 from jermainewang/cpp

Cpp
parents b46abb09 b2c1c4fa
# CI docker CPU env
# Adapted from github.com/dmlc/tvm/docker/Dockerfile.ci_cpu
FROM ubuntu:16.04
RUN apt-get update --fix-missing
COPY install/ubuntu_install_core.sh /install/ubuntu_install_core.sh
RUN bash /install/ubuntu_install_core.sh
COPY install/ubuntu_install_python.sh /install/ubuntu_install_python.sh
RUN bash /install/ubuntu_install_python.sh
COPY install/ubuntu_install_python_package.sh /install/ubuntu_install_python_package.sh
RUN bash /install/ubuntu_install_python_package.sh
# CI docker GPU env
FROM nvidia/cuda:9.0-cudnn7-devel
# Base scripts
RUN apt-get update --fix-missing
COPY install/ubuntu_install_core.sh /install/ubuntu_install_core.sh
RUN bash /install/ubuntu_install_core.sh
COPY install/ubuntu_install_python.sh /install/ubuntu_install_python.sh
RUN bash /install/ubuntu_install_python.sh
COPY install/ubuntu_install_python_package.sh /install/ubuntu_install_python_package.sh
RUN bash /install/ubuntu_install_python_package.sh
# Environment variables
ENV PATH=/usr/local/nvidia/bin:${PATH}
ENV PATH=/usr/local/cuda/bin:${PATH}
ENV CPLUS_INCLUDE_PATH=/usr/local/cuda/include:${CPLUS_INCLUDE_PATH}
ENV C_INCLUDE_PATH=/usr/local/cuda/include:${C_INCLUDE_PATH}
ENV LIBRARY_PATH=/usr/local/cuda/lib64:/usr/local/nvidia/lib64:${LIBRARY_PATH}
ENV LD_LIBRARY_PATH=/usr/local/cuda/lib64:/usr/local/nvidia/lib64:${LD_LIBRARY_PATH}
## Build docker image for CI
### CPU image
docker build -t dgl-cpu -f Dockerfile.ci_cpu .
### GPU image
docker build -t dgl-gpu -f Dockerfile.ci_gpu .
# install libraries for building c++ core on ubuntu
apt update && apt install -y --no-install-recommends --force-yes \
apt-utils git build-essential make cmake wget unzip sudo libz-dev libxml2-dev
# install python and pip, don't modify this, modify install_python_package.sh
# apt-get update && apt-get install -y python-dev python-pip
# python 3.6
apt-get update && yes | apt-get install software-properties-common
add-apt-repository ppa:jonathonf/python-3.6
apt-get update && apt-get install -y python3.6 python3.6-dev
rm -f /usr/bin/python3 && ln -s /usr/bin/python3.6 /usr/bin/python3
# Install pip
cd /tmp && wget https://bootstrap.pypa.io/get-pip.py
# python2 get-pip.py
python3.6 get-pip.py
# install libraries for python package on ubuntu
# pip2 install pylint numpy cython scipy nltk requests[security]
pip3 install pylint numpy cython scipy nltk requests[security]
# install DL Framework
# pip2 install torch torchvision
pip3 install torch torchvision
# Minimal makefile for Sphinx documentation
#
# You can set these variables from the command line.
SPHINXOPTS =
SPHINXBUILD = sphinx-build
SOURCEDIR = source
BUILDDIR = build
# Put it first so that "make" without argument is like "make help".
help:
@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
.PHONY: help Makefile
# Catch-all target: route all unknown targets to Sphinx using the new
# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS).
%: Makefile
@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
\ No newline at end of file
dgl.BatchedDGLGraph
-------------------
.. autoclass:: dgl.BatchedDGLGraph
:members:
:show-inheritance:
.. autofunction:: dgl.batch
.. autofunction:: dgl.unbatch
dgl.DGLGraph
------------
.. automodule:: dgl.graph
.. autoclass:: dgl.DGLGraph
:members:
:inherited-members:
Python APIs
===========
.. toctree::
:maxdepth: 2
graph
batch
# -*- coding: utf-8 -*-
#
# Configuration file for the Sphinx documentation builder.
#
# This file does only contain a selection of the most common options. For a
# full list see the documentation:
# http://www.sphinx-doc.org/en/master/config
# -- Path setup --------------------------------------------------------------
# If extensions (or modules to document with autodoc) are in another directory,
# add these directories to sys.path here. If the directory is relative to the
# documentation root, use os.path.abspath to make it absolute, like shown here.
#
import os
import sys
sys.path.insert(0, os.path.abspath('../../python'))
# -- Project information -----------------------------------------------------
project = 'DGL'
copyright = '2018, DGL Team'
author = 'DGL Team'
# The short X.Y version
version = '0.0.1'
# The full version, including alpha/beta/rc tags
release = '0.0.1'
# -- General configuration ---------------------------------------------------
# If your documentation needs a minimal Sphinx version, state it here.
#
# needs_sphinx = '1.0'
# Add any Sphinx extension module names here, as strings. They can be
# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
# ones.
extensions = [
'sphinx.ext.autodoc',
'sphinx.ext.autosummary',
'sphinx.ext.coverage',
'sphinx.ext.mathjax',
'sphinx.ext.napoleon',
'sphinx.ext.viewcode',
]
# Add any paths that contain templates here, relative to this directory.
templates_path = ['_templates']
# The suffix(es) of source filenames.
# You can specify multiple suffix as a list of string:
#
source_suffix = ['.rst', '.md']
# The master toctree document.
master_doc = 'index'
# The language for content autogenerated by Sphinx. Refer to documentation
# for a list of supported languages.
#
# This is also used if you do content translation via gettext catalogs.
# Usually you set "language" from the command line for these cases.
language = None
# List of patterns, relative to source directory, that match files and
# directories to ignore when looking for source files.
# This pattern also affects html_static_path and html_extra_path.
exclude_patterns = []
# The name of the Pygments (syntax highlighting) style to use.
pygments_style = None
# -- Options for HTML output -------------------------------------------------
# The theme to use for HTML and HTML Help pages. See the documentation for
# a list of builtin themes.
#
html_theme = 'sphinx_rtd_theme'
# Theme options are theme-specific and customize the look and feel of a theme
# further. For a list of options available for each theme, see the
# documentation.
#
# html_theme_options = {}
# Add any paths that contain custom static files (such as style sheets) here,
# relative to this directory. They are copied after the builtin static files,
# so a file named "default.css" will overwrite the builtin "default.css".
html_static_path = ['_static']
# Custom sidebar templates, must be a dictionary that maps document names
# to template names.
#
# The default sidebars (for documents that don't match any pattern) are
# defined by theme itself. Builtin themes are using these templates by
# default: ``['localtoc.html', 'relations.html', 'sourcelink.html',
# 'searchbox.html']``.
#
# html_sidebars = {}
# -- Options for HTMLHelp output ---------------------------------------------
# Output file base name for HTML help builder.
htmlhelp_basename = 'dgldoc'
# -- Options for LaTeX output ------------------------------------------------
latex_elements = {
# The paper size ('letterpaper' or 'a4paper').
#
# 'papersize': 'letterpaper',
# The font size ('10pt', '11pt' or '12pt').
#
# 'pointsize': '10pt',
# Additional stuff for the LaTeX preamble.
#
# 'preamble': '',
# Latex figure (float) alignment
#
# 'figure_align': 'htbp',
}
# Grouping the document tree into LaTeX files. List of tuples
# (source start file, target name, title,
# author, documentclass [howto, manual, or own class]).
latex_documents = [
(master_doc, 'dgl.tex', 'dgl Documentation',
'DGL Team', 'manual'),
]
# -- Options for manual page output ------------------------------------------
# One entry per manual page. List of tuples
# (source start file, name, description, authors, manual section).
man_pages = [
(master_doc, 'dgl', 'dgl Documentation',
[author], 1)
]
# -- Options for Texinfo output ----------------------------------------------
# Grouping the document tree into Texinfo files. List of tuples
# (source start file, target name, title, author,
# dir menu entry, description, category)
texinfo_documents = [
(master_doc, 'dgl', 'dgl Documentation',
author, 'dgl', 'One line description of project.',
'Miscellaneous'),
]
# -- Options for Epub output -------------------------------------------------
# Bibliographic Dublin Core info.
epub_title = project
# The unique identifier of the text. This can be a ISBN number
# or the project homepage.
#
# epub_identifier = ''
# A unique identification for the text.
#
# epub_uid = ''
# A list of files that should not be packed into the epub file.
epub_exclude_files = ['search.html']
# -- Extension configuration -------------------------------------------------
.. DGL documentation master file, created by
sphinx-quickstart on Fri Oct 5 14:18:01 2018.
You can adapt this file completely to your liking, but it should at least
contain the root `toctree` directive.
Welcome to DGL's documentation!
===============================
.. toctree::
:maxdepth: 2
:caption: Contents:
Get Started
-----------
.. toctree::
:maxdepth: 2
install/index
tutorials/index
API Reference
-------------
.. toctree::
:maxdepth: 2
api/python/index
Index
-----
* :ref:`genindex`
Install DGL
============
At this stage, we recommend installing DGL from source. To quickly try out DGL and its demos/tutorials, check out `Install from docker`_.
Get the source code
--------------------
First, clone the source repository from GitHub. Note that you need the ``--recursive`` option to
also clone the submodules.
.. code:: bash
git clone --recursive https://github.com/jermainewang/dgl.git
You can also clone the repository first and then run the following commands:
.. code:: bash
git submodule init
git submodule update
Build shared library
--------------------
Before building the library, please make sure the following dependencies are installed
(using Ubuntu as an example):
.. code:: bash
sudo apt-get update
sudo apt-get install -y python
We use cmake (minimum version 2.8) to build the library.
.. code:: bash
mkdir build
cd build
cmake ..
make -j4
Build python binding
--------------------
DGL's python binding depends on the following packages (tested versions):
* numpy (>= 1.14.0)
* scipy (>= 1.1.0)
* networkx (>= 2.1)
To install them, use the following command:
.. code:: bash
pip install --user numpy scipy networkx
There are several ways to set up DGL's python binding. At the current stage, we recommend that
developers use environment variables to locate the python package and the shared library.
.. code:: bash
export DGL_HOME=/path/to/dgl
export PYTHONPATH=$DGL_HOME/python:${PYTHONPATH}
export DGL_LIBRARY_PATH=$DGL_HOME/build
The ``DGL_LIBRARY_PATH`` variable is used by the python package to locate the shared library
built above. Use the following command to test whether the installation was successful.
.. code:: bash
python -c 'import dgl'
Install from docker
-------------------
TBD
Tutorials
=========
TBD: Get started on DGL
@@ -9,33 +9,35 @@ GAT with batch processing
import argparse
import numpy as np
import time
import torch
import torch.nn as nn
import torch.nn.functional as F
import mxnet as mx
from mxnet import gluon
import dgl
from dgl import DGLGraph
from dgl.data import register_data_args, load_data
def elu(data):
return mx.nd.LeakyReLU(data, act_type='elu')
def gat_message(src, edge):
return {'ft' : src['ft'], 'a2' : src['a2']}
class GATReduce(nn.Module):
class GATReduce(gluon.Block):
def __init__(self, attn_drop):
super(GATReduce, self).__init__()
self.attn_drop = attn_drop
def forward(self, node, msgs):
a1 = torch.unsqueeze(node['a1'], 1) # shape (B, 1, 1)
a1 = mx.nd.expand_dims(node['a1'], 1) # shape (B, 1, 1)
a2 = msgs['a2'] # shape (B, deg, 1)
ft = msgs['ft'] # shape (B, deg, D)
# attention
a = a1 + a2 # shape (B, deg, 1)
e = F.softmax(F.leaky_relu(a), dim=1)
e = mx.nd.softmax(mx.nd.LeakyReLU(a))
if self.attn_drop != 0.0:
e = F.dropout(e, self.attn_drop)
return {'accum' : torch.sum(e * ft, dim=1)} # shape (B, D)
e = mx.nd.Dropout(e, self.attn_drop)
return {'accum' : mx.nd.sum(e * ft, axis=1)} # shape (B, D)
class GATFinalize(nn.Module):
class GATFinalize(gluon.Block):
def __init__(self, headid, indim, hiddendim, activation, residual):
super(GATFinalize, self).__init__()
self.headid = headid
@@ -44,7 +46,7 @@ class GATFinalize(nn.Module):
self.residual_fc = None
if residual:
if indim != hiddendim:
self.residual_fc = nn.Linear(indim, hiddendim)
self.residual_fc = gluon.nn.Dense(hiddendim)
def forward(self, node):
ret = node['accum']
@@ -55,24 +57,24 @@ class GATFinalize(nn.Module):
ret = node['h'] + ret
return {'head%d' % self.headid : self.activation(ret)}
class GATPrepare(nn.Module):
class GATPrepare(gluon.Block):
def __init__(self, indim, hiddendim, drop):
super(GATPrepare, self).__init__()
self.fc = nn.Linear(indim, hiddendim)
self.fc = gluon.nn.Dense(hiddendim)
self.drop = drop
self.attn_l = nn.Linear(hiddendim, 1)
self.attn_r = nn.Linear(hiddendim, 1)
self.attn_l = gluon.nn.Dense(1)
self.attn_r = gluon.nn.Dense(1)
def forward(self, feats):
h = feats
if self.drop != 0.0:
h = F.dropout(h, self.drop)
h = mx.nd.Dropout(h, self.drop)
ft = self.fc(h)
a1 = self.attn_l(ft)
a2 = self.attn_r(ft)
return {'h' : h, 'ft' : ft, 'a1' : a1, 'a2' : a2}
class GAT(nn.Module):
class GAT(gluon.Block):
def __init__(self,
g,
num_layers,
@@ -88,27 +90,27 @@ class GAT(nn.Module):
self.g = g
self.num_layers = num_layers
self.num_heads = num_heads
self.prp = nn.ModuleList()
self.red = nn.ModuleList()
self.fnl = nn.ModuleList()
self.prp = gluon.nn.Sequential()
self.red = gluon.nn.Sequential()
self.fnl = gluon.nn.Sequential()
# input projection (no residual)
for hid in range(num_heads):
self.prp.append(GATPrepare(in_dim, num_hidden, in_drop))
self.red.append(GATReduce(attn_drop))
self.fnl.append(GATFinalize(hid, in_dim, num_hidden, activation, False))
self.prp.add(GATPrepare(in_dim, num_hidden, in_drop))
self.red.add(GATReduce(attn_drop))
self.fnl.add(GATFinalize(hid, in_dim, num_hidden, activation, False))
# hidden layers
for l in range(num_layers - 1):
for hid in range(num_heads):
# due to multi-head, the in_dim = num_hidden * num_heads
self.prp.append(GATPrepare(num_hidden * num_heads, num_hidden, in_drop))
self.red.append(GATReduce(attn_drop))
self.fnl.append(GATFinalize(hid, num_hidden * num_heads,
num_hidden, activation, residual))
self.prp.add(GATPrepare(num_hidden * num_heads, num_hidden, in_drop))
self.red.add(GATReduce(attn_drop))
self.fnl.add(GATFinalize(hid, num_hidden * num_heads,
num_hidden, activation, residual))
# output projection
self.prp.append(GATPrepare(num_hidden * num_heads, num_classes, in_drop))
self.red.append(GATReduce(attn_drop))
self.fnl.append(GATFinalize(0, num_hidden * num_heads,
num_classes, activation, residual))
self.prp.add(GATPrepare(num_hidden * num_heads, num_classes, in_drop))
self.red.add(GATReduce(attn_drop))
self.fnl.add(GATFinalize(0, num_hidden * num_heads,
num_classes, activation, residual))
# sanity check
assert len(self.prp) == self.num_layers * self.num_heads + 1
assert len(self.red) == self.num_layers * self.num_heads + 1
@@ -122,23 +124,23 @@ class GAT(nn.Module):
# prepare
self.g.set_n_repr(self.prp[i](last))
# message passing
self.g.update_all(gat_message, self.red[i], self.fnl[i], batchable=True)
self.g.update_all(gat_message, self.red[i], self.fnl[i])
# merge all the heads
last = torch.cat(
[self.g.pop_n_repr('head%d' % hid) for hid in range(self.num_heads)],
last = mx.nd.concat(
*[self.g.pop_n_repr('head%d' % hid) for hid in range(self.num_heads)],
dim=1)
# output projection
self.g.set_n_repr(self.prp[-1](last))
self.g.update_all(gat_message, self.red[-1], self.fnl[-1], batchable=True)
self.g.update_all(gat_message, self.red[-1], self.fnl[-1])
return self.g.pop_n_repr('head0')
def main(args):
# load and preprocess dataset
data = load_data(args)
features = torch.FloatTensor(data.features)
labels = torch.LongTensor(data.labels)
mask = torch.ByteTensor(data.train_mask)
features = mx.nd.array(data.features)
labels = mx.nd.array(data.labels)
mask = mx.nd.array(data.train_mask)
in_feats = features.shape[1]
n_classes = data.num_labels
n_edges = data.graph.number_of_edges()
@@ -162,16 +164,17 @@ def main(args):
args.num_hidden,
n_classes,
args.num_heads,
F.elu,
elu,
args.in_drop,
args.attn_drop,
args.residual)
if cuda:
model.cuda()
model.initialize()
# use optimizer
optimizer = torch.optim.Adam(model.parameters(), lr=args.lr)
trainer = gluon.Trainer(model.collect_params(), 'adam', {'learning_rate': args.lr})
# initialize graph
dur = []
@@ -179,19 +182,18 @@
if epoch >= 3:
t0 = time.time()
# forward
logits = model(features)
logp = F.log_softmax(logits, 1)
loss = F.nll_loss(logp, labels)
with mx.autograd.record():
logits = model(features)
loss = mx.nd.softmax_cross_entropy(logits, labels)
optimizer.zero_grad()
#optimizer.zero_grad()
loss.backward()
optimizer.step()
trainer.step(features.shape[0])
if epoch >= 3:
dur.append(time.time() - t0)
print("Epoch {:05d} | Loss {:.4f} | Time(s) {:.4f} | ETputs(KTEPS) {:.2f}".format(
epoch, loss.item(), np.mean(dur), n_edges / np.mean(dur) / 1000))
print("Epoch {:05d} | Loss {:.4f} | Time(s) {:.4f} | ETputs(KTEPS) {:.2f}".format(
epoch, loss.asnumpy()[0], np.mean(dur), n_edges / np.mean(dur) / 1000))
if __name__ == '__main__':
parser = argparse.ArgumentParser(description='GAT')
......
Graph Convolutional Networks (GCN)
============
Paper link: [https://arxiv.org/abs/1609.02907](https://arxiv.org/abs/1609.02907)
Author's code repo: [https://github.com/tkipf/gcn](https://github.com/tkipf/gcn)
The folder contains three different implementations using DGL.
Naive GCN (gcn.py)
-------
The model is defined at the finest granularity (i.e., on *one* edge and *one* node).
* The message function `gcn_msg` computes the message for one edge. It simply returns the `h` representation of the source node.
```python
def gcn_msg(src, edge):
# src['h'] is a tensor of shape (D,). D is the feature length.
return src['h']
```
* The reduce function `gcn_reduce` accumulates the incoming messages for one node. The `msgs` argument is a list of all the messages. In GCN, the incoming messages are summed up.
```python
def gcn_reduce(node, msgs):
# msgs is a list of in-coming messages.
return sum(msgs)
```
* The update function `NodeUpdateModule` computes the new node representation `h` by applying a non-linear transformation to the reduced messages.
```python
class NodeUpdateModule(nn.Module):
def __init__(self, in_feats, out_feats, activation=None):
super(NodeUpdateModule, self).__init__()
self.linear = nn.Linear(in_feats, out_feats)
self.activation = activation
def forward(self, node, accum):
# accum is a tensor of shape (D,).
h = self.linear(accum)
if self.activation:
h = self.activation(h)
return {'h' : h}
```
After defining the functions on each node/edge, the message passing is triggered by calling `update_all` on the DGLGraph object (in the GCN module); a sketch of what that can look like is given below.
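For reference, here is a minimal sketch of how the GCN module could wrap this call, reusing `gcn_msg`, `gcn_reduce`, and `NodeUpdateModule` from above. The class body and the `set_n_repr`/`pop_n_repr` calls are assumptions for illustration (they mirror the batched example later in this README), not the exact code in gcn.py:
```python
import torch.nn as nn

class GCN(nn.Module):
    # Hypothetical single-layer wrapper around the per-node/per-edge functions above.
    def __init__(self, g, in_feats, n_hidden, activation=None):
        super(GCN, self).__init__()
        self.g = g  # a dgl.DGLGraph
        self.layer = NodeUpdateModule(in_feats, n_hidden, activation)

    def forward(self, features):
        # store the input features as the node representation 'h'
        self.g.set_n_repr({'h': features})
        # one round of message passing: per-edge message, per-node reduce,
        # then the node update module
        self.g.update_all(gcn_msg, gcn_reduce, self.layer)
        return self.g.pop_n_repr('h')
```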
Batched GCN (gcn_batch.py)
-----------
Defining the model on only one node and edge makes it hard to fully utilize GPUs. As a result, we also allow users to define the model on a *batch of* nodes and edges.
* The message function `gcn_msg` computes the message for a batch of edges. Here, the `src` argument is the batched representation of the source endpoints of the edges. The function simply returns the source node representations.
```python
def gcn_msg(src, edge):
# src is a tensor of shape (B, D). B is the number of edges being batched.
return src
```
* The reduce function `gcn_reduce` also accumulates messages for a batch of nodes. We batch the messages on the second dimension of the `msgs` argument:
```python
def gcn_reduce(node, msgs):
# msgs is a tensor of shape (B, deg, D). B is the number of nodes in the batch;
# deg is the number of messages per node; D is the message feature dimension. DGL guarantees
# that all the nodes in a batch have the same in-degree (through "degree bucketing").
# Reducing along the second dimension is equivalent to summing up all the incoming messages.
return torch.sum(msgs, 1)
```
* The update module is similar. The first dimension of each tensor is the batch dimension. Since PyTorch operations are typically batch-aware, the code is the same as in the naive GCN; a sketch is shown below.
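For completeness, a sketch of the batched update module under that assumption; it is identical to the naive one except that the tensors now carry a leading batch dimension:
```python
import torch.nn as nn

class NodeUpdateModule(nn.Module):
    def __init__(self, in_feats, out_feats, activation=None):
        super(NodeUpdateModule, self).__init__()
        self.linear = nn.Linear(in_feats, out_feats)
        self.activation = activation

    def forward(self, node, accum):
        # accum is now a tensor of shape (B, D): B nodes in the batch,
        # D features per node. nn.Linear operates on the last dimension,
        # so the body is unchanged from the naive version.
        h = self.linear(accum)
        if self.activation:
            h = self.activation(h)
        return {'h': h}
```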
Triggering message passing is also similar. The user needs to set `batchable=True` to indicate that the functions all support batching.
```python
self.g.update_all(gcn_msg, gcn_reduce, layer, batchable=True)
```
Batched GCN with spMV optimization (gcn_spmv.py)
-----------
Batched computation is much more efficient than the naive vertex-centric approach, but it is still not ideal. For example, the batched message function needs to look up source node data and save it on the edges. Such lookups are very common and incur extra memory copies. In fact, the message and reduce phases of the GCN model can be fused into one sparse matrix-vector multiplication (spMV). Therefore, DGL provides many built-in message/reduce functions so that it can recognize such optimization opportunities. In gcn_spmv.py, the user only needs to write the update module and trigger the message passing as follows:
```python
self.g.update_all('from_src', 'sum', layer, batchable=True)
```
Here, `'from_src'` and `'sum'` are the builtin message and reduce functions.
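Putting it together, a hypothetical sketch of how an spMV-based GCN module could drive its layers (the class and attribute names are assumptions for illustration; only the update modules are user-defined, while the message/reduce steps are the builtin strings):
```python
import torch.nn as nn

class GCNSpMV(nn.Module):
    def __init__(self, g, layers):
        super(GCNSpMV, self).__init__()
        self.g = g                           # a dgl.DGLGraph
        self.layers = nn.ModuleList(layers)  # NodeUpdateModule instances

    def forward(self, features):
        self.g.set_n_repr(features)
        for layer in self.layers:
            # 'from_src' copies the source node representation onto the edges,
            # 'sum' accumulates the incoming messages, and `layer` applies the
            # user-defined node update module.
            self.g.update_all('from_src', 'sum', layer, batchable=True)
        return self.g.pop_n_repr()
```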
@@ -8,9 +8,8 @@ GCN with batch processing
import argparse
import numpy as np
import time
import torch
import torch.nn as nn
import torch.nn.functional as F
import mxnet as mx
from mxnet import gluon
import dgl
from dgl import DGLGraph
from dgl.data import register_data_args, load_data
@@ -19,21 +18,17 @@ def gcn_msg(src, edge):
return src
def gcn_reduce(node, msgs):
return torch.sum(msgs, 1)
return mx.nd.sum(msgs, 1)
class NodeApplyModule(nn.Module):
def __init__(self, in_feats, out_feats, activation=None):
super(NodeApplyModule, self).__init__()
self.linear = nn.Linear(in_feats, out_feats)
self.activation = activation
class NodeUpdateModule(gluon.Block):
def __init__(self, out_feats, activation=None):
super(NodeUpdateModule, self).__init__()
self.linear = gluon.nn.Dense(out_feats, activation=activation)
def forward(self, node):
h = self.linear(node)
if self.activation:
h = self.activation(h)
return h
return self.linear(node)
class GCN(nn.Module):
class GCN(gluon.Block):
def __init__(self,
g,
in_feats,
@@ -46,12 +41,13 @@ class GCN(nn.Module):
self.g = g
self.dropout = dropout
# input layer
self.layers = nn.ModuleList([NodeApplyModule(in_feats, n_hidden, activation)])
self.layers = gluon.nn.Sequential()
self.layers.add(NodeUpdateModule(n_hidden, activation))
# hidden layers
for i in range(n_layers - 1):
self.layers.append(NodeApplyModule(n_hidden, n_hidden, activation))
self.layers.add(NodeUpdateModule(n_hidden, activation))
# output layer
self.layers.append(NodeApplyModule(n_hidden, n_classes))
self.layers.add(NodeUpdateModule(n_classes))
def forward(self, features):
self.g.set_n_repr(features)
@@ -60,28 +56,29 @@ class GCN(nn.Module):
if self.dropout:
val = F.dropout(self.g.get_n_repr(), p=self.dropout)
self.g.set_n_repr(val)
self.g.update_all(gcn_msg, gcn_reduce, layer, batchable=True)
self.g.update_all(gcn_msg, gcn_reduce, layer)
return self.g.pop_n_repr()
def main(args):
# load and preprocess dataset
data = load_data(args)
features = torch.FloatTensor(data.features)
labels = torch.LongTensor(data.labels)
mask = torch.ByteTensor(data.train_mask)
features = mx.nd.array(data.features)
labels = mx.nd.array(data.labels)
mask = mx.nd.array(data.train_mask)
in_feats = features.shape[1]
n_classes = data.num_labels
n_edges = data.graph.number_of_edges()
if args.gpu < 0:
if args.gpu <= 0:
cuda = False
ctx = mx.cpu(0)
else:
cuda = True
torch.cuda.set_device(args.gpu)
features = features.cuda()
labels = labels.cuda()
mask = mask.cuda()
features = features.as_in_context(mx.gpu(0))
labels = labels.as_in_context(mx.gpu(0))
mask = mask.as_in_context(mx.gpu(0))
ctx = mx.gpu(0)
# create GCN model
g = DGLGraph(data.graph)
@@ -90,14 +87,12 @@ def main(args):
args.n_hidden,
n_classes,
args.n_layers,
F.relu,
'relu',
args.dropout)
if cuda:
model.cuda()
model.initialize(ctx=ctx)
# use optimizer
optimizer = torch.optim.Adam(model.parameters(), lr=args.lr)
trainer = gluon.Trainer(model.collect_params(), 'adam', {'learning_rate': args.lr})
# initialize graph
dur = []
@@ -105,19 +100,18 @@
if epoch >= 3:
t0 = time.time()
# forward
logits = model(features)
logp = F.log_softmax(logits, 1)
loss = F.nll_loss(logp[mask], labels[mask])
with mx.autograd.record():
logits = model(features)
loss = mx.nd.softmax_cross_entropy(logits, labels)
optimizer.zero_grad()
#optimizer.zero_grad()
loss.backward()
optimizer.step()
trainer.step(features.shape[0])
if epoch >= 3:
dur.append(time.time() - t0)
print("Epoch {:05d} | Loss {:.4f} | Time(s) {:.4f} | ETputs(KTEPS) {:.2f}".format(
epoch, loss.item(), np.mean(dur), n_edges / np.mean(dur) / 1000))
print("Epoch {:05d} | Loss {:.4f} | Time(s) {:.4f} | ETputs(KTEPS) {:.2f}".format(
epoch, loss.asnumpy()[0], np.mean(dur), n_edges / np.mean(dur) / 1000))
if __name__ == '__main__':
parser = argparse.ArgumentParser(description='GCN')
@@ -135,6 +129,5 @@ if __name__ == '__main__':
parser.add_argument("--n-layers", type=int, default=1,
help="number of hidden gcn layers")
args = parser.parse_args()
print(args)
main(args)
@@ -2,6 +2,8 @@
Graph Attention Networks
Paper: https://arxiv.org/abs/1710.10903
Code: https://github.com/PetarV-/GAT
GAT with batch processing
"""
import argparse
@@ -10,6 +12,7 @@ import time
import torch
import torch.nn as nn
import torch.nn.functional as F
import dgl
from dgl import DGLGraph
from dgl.data import register_data_args, load_data
@@ -22,15 +25,15 @@ class GATReduce(nn.Module):
self.attn_drop = attn_drop
def forward(self, node, msgs):
a1 = torch.unsqueeze(node['a1'], 0) # shape (1, 1)
a2 = torch.cat([torch.unsqueeze(m['a2'], 0) for m in msgs], dim=0) # shape (deg, 1)
ft = torch.cat([torch.unsqueeze(m['ft'], 0) for m in msgs], dim=0) # shape (deg, D)
a1 = torch.unsqueeze(node['a1'], 1) # shape (B, 1, 1)
a2 = msgs['a2'] # shape (B, deg, 1)
ft = msgs['ft'] # shape (B, deg, D)
# attention
a = a1 + a2 # shape (deg, 1)
e = F.softmax(F.leaky_relu(a), dim=0)
a = a1 + a2 # shape (B, deg, 1)
e = F.softmax(F.leaky_relu(a), dim=1)
if self.attn_drop != 0.0:
e = F.dropout(e, self.attn_drop)
return {'accum' : torch.sum(e * ft, dim=0)} # shape (D,)
return {'accum' : torch.sum(e * ft, dim=1)} # shape (B, D)
class GATFinalize(nn.Module):
def __init__(self, headid, indim, hiddendim, activation, residual):
@@ -71,7 +74,7 @@ class GATPrepare(nn.Module):
class GAT(nn.Module):
def __init__(self,
nx_graph,
g,
num_layers,
in_dim,
num_hidden,
@@ -82,8 +85,8 @@ class GAT(nn.Module):
attn_drop,
residual):
super(GAT, self).__init__()
self.g = DGLGraph(nx_graph)
self.num_layers = num_layers # one extra output projection
self.g = g
self.num_layers = num_layers
self.num_heads = num_heads
self.prp = nn.ModuleList()
self.red = nn.ModuleList()
@@ -104,48 +107,39 @@ class GAT(nn.Module):
# output projection
self.prp.append(GATPrepare(num_hidden * num_heads, num_classes, in_drop))
self.red.append(GATReduce(attn_drop))
self.fnl.append(GATFinalize(0, num_hidden * num_heads, num_classes, activation, residual))
self.fnl.append(GATFinalize(0, num_hidden * num_heads,
num_classes, activation, residual))
# sanity check
assert len(self.prp) == self.num_layers * self.num_heads + 1
assert len(self.red) == self.num_layers * self.num_heads + 1
assert len(self.fnl) == self.num_layers * self.num_heads + 1
def forward(self, features, train_nodes):
def forward(self, features):
last = features
for l in range(self.num_layers):
for hid in range(self.num_heads):
i = l * self.num_heads + hid
# prepare
for n, h in last.items():
self.g.nodes[n].update(self.prp[i](h))
self.g.set_n_repr(self.prp[i](last))
# message passing
self.g.update_all(gat_message, self.red[i], self.fnl[i])
# merge all the heads
last = {}
for n in self.g.nodes():
last[n] = torch.cat(
[self.g.nodes[n]['head%d' % hid] for hid in range(self.num_heads)])
last = torch.cat(
[self.g.pop_n_repr('head%d' % hid) for hid in range(self.num_heads)],
dim=1)
# output projection
for n, h in last.items():
self.g.nodes[n].update(self.prp[-1](h))
self.g.set_n_repr(self.prp[-1](last))
self.g.update_all(gat_message, self.red[-1], self.fnl[-1])
return torch.cat([torch.unsqueeze(self.g.nodes[n]['head0'], 0) for n in train_nodes])
return self.g.pop_n_repr('head0')
def main(args):
# load and preprocess dataset
data = load_data(args)
# features of each samples
features = {}
labels = []
train_nodes = []
for n in data.graph.nodes():
features[n] = torch.FloatTensor(data.features[n, :])
if data.train_mask[n] == 1:
train_nodes.append(n)
labels.append(data.labels[n])
labels = torch.LongTensor(labels)
in_feats = data.features.shape[1]
features = torch.FloatTensor(data.features)
labels = torch.LongTensor(data.labels)
mask = torch.ByteTensor(data.train_mask)
in_feats = features.shape[1]
n_classes = data.num_labels
n_edges = data.graph.number_of_edges()
@@ -154,11 +148,15 @@ def main(args):
else:
cuda = True
torch.cuda.set_device(args.gpu)
features = {k : v.cuda() for k, v in features.items()}
features = features.cuda()
labels = labels.cuda()
mask = mask.cuda()
# create GCN model
g = DGLGraph(data.graph)
# create model
model = GAT(data.graph,
model = GAT(g,
args.num_layers,
in_feats,
args.num_hidden,
@@ -181,7 +179,7 @@ def main(args):
if epoch >= 3:
t0 = time.time()
# forward
logits = model(features, train_nodes)
logits = model(features)
logp = F.log_softmax(logits, 1)
loss = F.nll_loss(logp, labels)
@@ -202,7 +200,7 @@ if __name__ == '__main__':
help="Which GPU to use. Set -1 to use CPU.")
parser.add_argument("--epochs", type=int, default=20,
help="number of training epochs")
parser.add_argument("--num-heads", type=int, default=8,
parser.add_argument("--num-heads", type=int, default=3,
help="number of attentional heads to use")
parser.add_argument("--num-layers", type=int, default=1,
help="number of hidden layers")
......
@@ -4,43 +4,9 @@ Graph Convolutional Networks (GCN)
Paper link: [https://arxiv.org/abs/1609.02907](https://arxiv.org/abs/1609.02907)
Author's code repo: [https://github.com/tkipf/gcn](https://github.com/tkipf/gcn)
The folder contains three different implementations using DGL.
The folder contains two different implementations using DGL.
Naive GCN (gcn.py)
-------
The model is defined at the finest granularity (i.e., on *one* edge and *one* node).
* The message function `gcn_msg` computes the message for one edge. It simply returns the `h` representation of the source node.
```python
def gcn_msg(src, edge):
# src['h'] is a tensor of shape (D,). D is the feature length.
return src['h']
```
* The reduce function `gcn_reduce` accumulates the incoming messages for one node. The `msgs` argument is a list of all the messages. In GCN, the incoming messages are summed up.
```python
def gcn_reduce(node, msgs):
# msgs is a list of in-coming messages.
return sum(msgs)
```
* The update function `NodeUpdateModule` computes the new node representation `h` using a non-linear transformation on the reduced messages.
```python
class NodeUpdateModule(nn.Module):
def __init__(self, in_feats, out_feats, activation=None):
super(NodeUpdateModule, self).__init__()
self.linear = nn.Linear(in_feats, out_feats)
self.activation = activation
def forward(self, node, accum):
# accum is a tensor of shape (D,).
h = self.linear(accum)
if self.activation:
h = self.activation(h)
return {'h' : h}
```
After defining the functions on each node/edge, the message passing is triggered by calling `update_all` on the DGLGraph object (in GCN module).
Batched GCN (gcn_batch.py)
Batched GCN (gcn.py)
-----------
Defining the model on only one node and edge makes it hard to fully utilize GPUs. As a result, we allow users to define model on a *batch of* nodes and edges.
......