Commit ec17136e authored by Mufei Li, committed by GitHub

[Deprecation] Examples (#4751)



* Update from master (#4584)

* [Example][Refactor] Refactor graphsage multigpu and full-graph example (#4430)

* Add refactors for multi-gpu and full-graph example

* Fix format

* Update

* Update

* Update

* [Cleanup] Remove async_transferer (#4505)

* Remove async_transferer

* remove test

* Remove AsyncTransferer
Co-authored-by: Xin Yao <xiny@nvidia.com>
Co-authored-by: Xin Yao <yaox12@outlook.com>

* [Cleanup] Remove duplicate entries of CUB submodule (issue #4395) (#4499)

* remove third_part/cub

* remove from third_party
Co-authored-by: Israt Nisa <nisisrat@amazon.com>
Co-authored-by: Xin Yao <xiny@nvidia.com>

* [Bug] Enable turning libxsmm on/off at runtime (#4455)

* enable turning libxsmm on/off at runtime by adding a global config and related API
Co-authored-by: Ubuntu <ubuntu@ip-172-31-19-194.ap-northeast-1.compute.internal>

* [Feature] Unify the cuda stream used in core library (#4480)

* Use an internal cuda stream for CopyDataFromTo

* small fix white space

* Fix to compile

* Make stream optional in copydata for compile

* fix lint issue

* Update cub functions to use internal stream

* Lint check

* Update CopyTo/CopyFrom/CopyFromTo to use internal stream

* Address comments

* Fix backward CUDA stream

* Avoid overloading CopyFromTo()

* Minor comment update

* Overload copydatafromto in cuda device api
Co-authored-by: xiny <xiny@nvidia.com>

* [Feature] Added exclude_self and output_batch to knn graph construction (Issues #4323 #4316) (#4389)

* * Added "exclude_self" and "output_batch" options to knn_graph and segmented_knn_graph
* Updated out-of-date comments on remove_edges and remove_self_loop, since they now preserve batch information

* * Changed defaults on new knn_graph and segmented_knn_graph function parameters, for compatibility; pytorch/test_geometry.py was failing

* * Added test to ensure dgl.remove_self_loop function correctly updates batch information

* * Added new knn_graph and segmented_knn_graph parameters to dgl.nn.KNNGraph and dgl.nn.SegmentedKNNGraph

* * Formatting

* * Oops, I missed the one in segmented_knn_graph when I fixed the similar thing in knn_graph

* * Fixed edge case handling when invalid k specified, since it still needs to be handled consistently for tests to pass
* Fixed context of batch info, since it must match the context of the input position data for remove_self_loop to succeed

* * Fixed batch info resulting from knn_graph when output_batch is true, for case of 3D input tensor, representing multiple segments

* * Added testing of new exclude_self and output_batch parameters on knn_graph and segmented_knn_graph, and their wrappers, KNNGraph and SegmentedKNNGraph, into the test_knn_cuda test

* * Added doc comments for new parameters

* * Added correct handling for uncommon case of k or more coincident points when excluding self edges in knn_graph and segmented_knn_graph
* Added test cases for more than k coincident points

* * Updated doc comments for output_batch parameters for clarity

* * Linter formatting fixes

* * Extracted out common function for test_knn_cpu and test_knn_cuda, to add the new test cases to test_knn_cpu

* * Rewording in doc comments

* * Removed output_batch parameter from knn_graph and segmented_knn_graph, in favour of always setting the batch information, except in knn_graph if x is a 2D tensor
Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>

* [CI] only known devs are authorized to trigger CI (#4518)

* [CI] only known devs are authorized to trigger CI

* fix if author is null

* add comments

* [Readability] Auto fix setup.py and update-version.py (#4446)

* Auto fix update-version

* Auto fix setup.py

* Auto fix update-version

* Auto fix setup.py

* [Doc] Change random.py to random_partition.py in guide on distributed partition pipeline (#4438)

* Update distributed-preprocessing.rst

* Update
Co-authored-by: Ubuntu <ubuntu@ip-172-31-9-26.ap-northeast-1.compute.internal>

* fix unpinning when tensoradaptor is not available (#4450)

* [Doc] fix print issue in tutorial (#4459)

* [Example][Refactor] Refactor RGCN example (#4327)

* Refactor full graph entity classification

* Refactor rgcn with sampling

* README update

* Update

* Results update

* Respect default setting of self_loop=false in entity.py

* Update

* Update README

* Update for multi-gpu

* Update

* [doc] fix invalid link in user guide (#4468)

* [Example] directional_GSN for ogbg-molpcba (#4405)

* version-1

* version-2

* version-3

* update examples/README

* Update .gitignore

* update performance in README, delete scripts

* 1st approving review

* 2nd approving review
Co-authored-by: Mufei Li <mufeili1996@gmail.com>

* Clarify the message name, which is 'm'. (#4462)
Co-authored-by: Ubuntu <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>
Co-authored-by: Rhett Ying <85214957+Rhett-Ying@users.noreply.github.com>

* [Refactor] Auto fix view.py. (#4461)
Co-authored-by: Ubuntu <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>
Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>

* [Example] SEAL for OGBL (#4291)

* [Example] SEAL for OGBL

* update index

* update

* fix readme typo

* add seal sampler

* modify set ops

* prefetch

* efficiency test

* update

* optimize

* fix ScatterAdd dtype issue

* update sampler style

* update
Co-authored-by: Quan Gan <coin2028@hotmail.com>

* [CI] use https instead of http (#4488)

* [BugFix] fix crash due to incorrect dtype in dgl.to_block() (#4487)

* [BugFix] fix crash due to incorrect dtype in dgl.to_block()

* fix test failure in TF

* [Feature] Make TensorAdapter Stream Aware (#4472)

* Allocate tensors in DGL's current stream

* make tensoradaptor stream-aware

* replace TAempty with cpu allocator

* fix typo

* try fix cpu allocation

* clean header

* redirect AllocDataSpace as well

* resolve comments

* [Build][Doc] Specify the sphinx version (#4465)
Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>

* reformat

* reformat

* Auto fix update-version

* Auto fix setup.py

* reformat

* reformat
Co-authored-by: Ubuntu <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>
Co-authored-by: Rhett Ying <85214957+Rhett-Ying@users.noreply.github.com>
Co-authored-by: Mufei Li <mufeili1996@gmail.com>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-9-26.ap-northeast-1.compute.internal>
Co-authored-by: Xin Yao <xiny@nvidia.com>
Co-authored-by: Chang Liu <chang.liu@utexas.edu>
Co-authored-by: Zhiteng Li <55398076+ZHITENGLI@users.noreply.github.com>
Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>
Co-authored-by: rudongyu <ru_dongyu@outlook.com>
Co-authored-by: Quan Gan <coin2028@hotmail.com>

* Move mock version of dgl_sparse library to DGL main repo (#4524)

* init

* Add api doc for sparse library

* support ops between matrices with different sparsity

* Fixed docstring

* addresses comments

* lint check

* change keyword format to fmt
Co-authored-by: Israt Nisa <nisisrat@amazon.com>

* [DistPart] expose timeout config for process group (#4532)

* [DistPart] expose timeout config for process group

* refine code

* Update tools/distpartitioning/data_proc_pipeline.py
Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>
Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>

* [Feature] Import PyTorch's CUDA stream management (#4503)

* add set_stream

* add .record_stream for NDArray and HeteroGraph

* refactor dgl stream Python APIs

* test record_stream

* add unit test for record stream

* use pytorch's stream

* fix lint

* fix cpu build

* address comments

* address comments

* add record stream tests for dgl.graph

* record frames and update dataloder

* add docstring

* update frame

* add backend check for record_stream

* remove CUDAThreadEntry::stream

* record stream for newly created formats

* fix bug

* fix cpp test

* fix None c_void_p to c_handle

* [Examples] Reduce memory consumption (#4558)

* [Examples] Reduce memory consumption

* refine help message

* refine

* [Feature][REVIEW] Enable DGL cugraph nightly CI (#4525)

* Added cugraph nightly scripts

* Removed nvcr.io//nvidia/pytorch:22.04-py3 reference
Co-authored-by: Rhett Ying <85214957+Rhett-Ying@users.noreply.github.com>

* Revert "[Feature][REVIEW] Enable DGL cugraph nightly CI (#4525)" (#4563)

This reverts commit ec171c64.

* [Misc] Add flake8 lint workflow. (#4566)

* Add pyproject.toml for autopep8.

* Add pyproject.toml for autopep8.

* Add flake8 annotation in workflow.

* remove

* add

* clean up
Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>

* [Misc] Try use official pylint workflow. (#4568)

* polish update_version

* update pylint workflow.

* add

* revert.
Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>

* [CI] refine stage logic (#4565)

* [CI] refine stage logic

* refine

* refine

* remove (#4570)
Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>

* Add Pylint workflow for flake8. (#4571)

* remove

* Add pylint.
Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>

* [Misc] Update the python version in Pylint workflow for flake8. (#4572)

* remove

* Add pylint.

* Change the python version for pylint.
Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>

* Update pylint. (#4574)
Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>

* [Misc] Use another workflow. (#4575)

* Update pylint.

* Use another workflow.
Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>

* Update pylint. (#4576)
Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>

* Update pylint.yml

* Update pylint.yml

* Delete pylint.yml

* [Misc]Add pyproject.toml for autopep8 & black. (#4543)

* Add pyproject.toml for autopep8.

* Add pyproject.toml for autopep8.
Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>

* [Feature] Bump DLPack to v0.7 and decouple DLPack from the core library (#4454)

* rename `DLContext` to `DGLContext`

* rename `kDLGPU` to `kDLCUDA`

* replace DLTensor with DGLArray

* fix linting

* Unify DGLType and DLDataType to DGLDataType

* Fix FFI

* rename DLDeviceType to DGLDeviceType

* decouple dlpack from the core library

* fix bug

* fix lint

* fix merge

* fix build

* address comments

* rename dl_converter to dlpack_convert

* remove redundant comments
Co-authored-by: Chang Liu <chang.liu@utexas.edu>
Co-authored-by: nv-dlasalle <63612878+nv-dlasalle@users.noreply.github.com>
Co-authored-by: Xin Yao <xiny@nvidia.com>
Co-authored-by: Xin Yao <yaox12@outlook.com>
Co-authored-by: Israt Nisa <neesha295@gmail.com>
Co-authored-by: Israt Nisa <nisisrat@amazon.com>
Co-authored-by: peizhou001 <110809584+peizhou001@users.noreply.github.com>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-19-194.ap-northeast-1.compute.internal>
Co-authored-by: ndickson-nvidia <99772994+ndickson-nvidia@users.noreply.github.com>
Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>
Co-authored-by: Rhett Ying <85214957+Rhett-Ying@users.noreply.github.com>
Co-authored-by: Hongzhi (Steve), Chen <chenhongzhi.nkcs@gmail.com>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-9-26.ap-northeast-1.compute.internal>
Co-authored-by: Zhiteng Li <55398076+ZHITENGLI@users.noreply.github.com>
Co-authored-by: rudongyu <ru_dongyu@outlook.com>
Co-authored-by: Quan Gan <coin2028@hotmail.com>
Co-authored-by: Vibhu Jawa <vibhujawa@gmail.com>

* [Deprecation] Dataset Attributes (#4546)

* Update

* CI

* CI

* Update
Co-authored-by: Ubuntu <ubuntu@ip-172-31-9-26.ap-northeast-1.compute.internal>

* [Example] Bug Fix (#4665)

* Update

* CI

* CI

* Update

* Update
Co-authored-by: Ubuntu <ubuntu@ip-172-31-9-26.ap-northeast-1.compute.internal>

* Update

* Update (#4724)
Co-authored-by: Ubuntu <ubuntu@ip-172-31-9-26.ap-northeast-1.compute.internal>
Co-authored-by: Chang Liu <chang.liu@utexas.edu>
Co-authored-by: nv-dlasalle <63612878+nv-dlasalle@users.noreply.github.com>
Co-authored-by: Xin Yao <xiny@nvidia.com>
Co-authored-by: Xin Yao <yaox12@outlook.com>
Co-authored-by: Israt Nisa <neesha295@gmail.com>
Co-authored-by: Israt Nisa <nisisrat@amazon.com>
Co-authored-by: peizhou001 <110809584+peizhou001@users.noreply.github.com>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-19-194.ap-northeast-1.compute.internal>
Co-authored-by: ndickson-nvidia <99772994+ndickson-nvidia@users.noreply.github.com>
Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>
Co-authored-by: Rhett Ying <85214957+Rhett-Ying@users.noreply.github.com>
Co-authored-by: Hongzhi (Steve), Chen <chenhongzhi.nkcs@gmail.com>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-9-26.ap-northeast-1.compute.internal>
Co-authored-by: Zhiteng Li <55398076+ZHITENGLI@users.noreply.github.com>
Co-authored-by: rudongyu <ru_dongyu@outlook.com>
Co-authored-by: Quan Gan <coin2028@hotmail.com>
Co-authored-by: Vibhu Jawa <vibhujawa@gmail.com>
# Graph Convolutional Matrix Completion
Paper link: [https://arxiv.org/abs/1706.02263](https://arxiv.org/abs/1706.02263)
Author's code: [https://github.com/riannevdberg/gc-mc](https://github.com/riannevdberg/gc-mc)
The implementation does not handle side-channel features and mini-epoching and thus achieves
slightly worse performance when using node features.
Credit: Jiani Zhang ([@jennyzhang0215](https://github.com/jennyzhang0215))
## Dependencies
* MXNet 1.5.0+
* pandas
* gluonnlp
## Data
Supported datasets: ml-100k, ml-1m, ml-10m
## How to run
ml-100k, no feature
```bash
DGLBACKEND=mxnet python train.py --data_name=ml-100k --use_one_hot_fea --gcn_agg_accum=stack
```
Results: RMSE=0.9077 (0.910 reported)
Speed: 0.0246s/epoch (vanilla implementation: 0.1008s/epoch)
ml-100k, with feature
```bash
DGLBACKEND=mxnet python train.py --data_name=ml-100k --gcn_agg_accum=stack
```
Results: RMSE=0.9495 (0.905 reported)
ml-1m, no feature
```bash
DGLBACKEND=mxnet python train.py --data_name=ml-1m --gcn_agg_accum=sum --use_one_hot_fea
```
Results: RMSE=0.8377 (0.832 reported)
Speed: 0.0695s/epoch (vanilla implementation: 1.538s/epoch)
ml-10m, no feature
```bash
DGLBACKEND=mxnet python train.py --data_name=ml-10m --gcn_agg_accum=stack --gcn_dropout=0.3 \
--train_lr=0.001 --train_min_lr=0.0001 --train_max_iter=15000 \
--use_one_hot_fea --gen_r_num_basis_func=4
```
Results: RMSE=0.7875 (0.777 reported)
Speed: 0.6480s/epoch (vanilla implementation: OOM)
Testbed: EC2 p3.2xlarge
"""MovieLens dataset"""
import numpy as np
import os
import re
import pandas as pd
import scipy.sparse as sp
import gluonnlp as nlp
import mxnet as mx
import dgl
from dgl.data.utils import download, extract_archive, get_download_dir
_urls = {
'ml-100k' : 'http://files.grouplens.org/datasets/movielens/ml-100k.zip',
'ml-1m' : 'http://files.grouplens.org/datasets/movielens/ml-1m.zip',
'ml-10m' : 'http://files.grouplens.org/datasets/movielens/ml-10m.zip',
}
READ_DATASET_PATH = get_download_dir()
GENRES_ML_100K =\
['unknown', 'Action', 'Adventure', 'Animation',
'Children', 'Comedy', 'Crime', 'Documentary', 'Drama', 'Fantasy',
'Film-Noir', 'Horror', 'Musical', 'Mystery', 'Romance', 'Sci-Fi',
'Thriller', 'War', 'Western']
GENRES_ML_1M = GENRES_ML_100K[1:]
GENRES_ML_10M = GENRES_ML_100K + ['IMAX']
class MovieLens(object):
"""MovieLens dataset used by GCMC model
TODO(minjie): make this dataset more general
The dataset stores MovieLens ratings in two types of graphs. The encoder graph
contains rating value information in the form of edge types. The decoder graph
stores plain user-movie pairs in the form of a bipartite graph with no rating
information. All graphs have two types of nodes: "user" and "movie".
The training, validation and test set can be summarized as follows:
training_enc_graph : training user-movie pairs + rating info
training_dec_graph : training user-movie pairs
valid_enc_graph : training user-movie pairs + rating info
valid_dec_graph : validation user-movie pairs
test_enc_graph : training user-movie pairs + validation user-movie pairs + rating info
test_dec_graph : test user-movie pairs
Attributes
----------
train_enc_graph : dgl.DGLHeteroGraph
Encoder graph for training.
train_dec_graph : dgl.DGLHeteroGraph
Decoder graph for training.
train_labels : mx.nd.NDArray
The categorical label of each user-movie pair
train_truths : mx.nd.NDArray
The actual rating values of each user-movie pair
valid_enc_graph : dgl.DGLHeteroGraph
Encoder graph for validation.
valid_dec_graph : dgl.DGLHeteroGraph
Decoder graph for validation.
valid_labels : mx.nd.NDArray
The categorical label of each user-movie pair
valid_truths : mx.nd.NDArray
The actual rating values of each user-movie pair
test_enc_graph : dgl.DGLHeteroGraph
Encoder graph for test.
test_dec_graph : dgl.DGLHeteroGraph
Decoder graph for test.
test_labels : mx.nd.NDArray
The categorical label of each user-movie pair
test_truths : mx.nd.NDArray
The actual rating values of each user-movie pair
user_feature : mx.nd.NDArray
User feature tensor. If None, an identity matrix is used.
movie_feature : mx.nd.NDArray
Movie feature tensor. If None, an identity matrix is used.
possible_rating_values : np.ndarray
Available rating values in the dataset
Parameters
----------
name : str
Dataset name. Could be "ml-100k", "ml-1m", "ml-10m"
ctx : mx.context.Context
Device context
use_one_hot_fea : bool, optional
If true, the ``user_feature`` attribute is None, representing a one-hot identity
matrix. (Default: False)
symm : bool, optional
If true, use the symmetric normalization constant. Otherwise, use the left
normalization constant. (Default: True)
test_ratio : float, optional
Ratio of test data
valid_ratio : float, optional
Ratio of validation data
"""
def __init__(self, name, ctx, use_one_hot_fea=False, symm=True,
test_ratio=0.1, valid_ratio=0.1):
self._name = name
self._ctx = ctx
self._symm = symm
self._test_ratio = test_ratio
self._valid_ratio = valid_ratio
# download and extract
download_dir = get_download_dir()
zip_file_path = '{}/{}.zip'.format(download_dir, name)
download(_urls[name], path=zip_file_path)
extract_archive(zip_file_path, '{}/{}'.format(download_dir, name))
if name == 'ml-10m':
root_folder = 'ml-10M100K'
else:
root_folder = name
self._dir = os.path.join(download_dir, name, root_folder)
print("Starting processing {} ...".format(self._name))
self._load_raw_user_info()
self._load_raw_movie_info()
print('......')
if self._name == 'ml-100k':
self.all_train_rating_info = self._load_raw_rates(os.path.join(self._dir, 'u1.base'), '\t')
self.test_rating_info = self._load_raw_rates(os.path.join(self._dir, 'u1.test'), '\t')
self.all_rating_info = pd.concat([self.all_train_rating_info, self.test_rating_info])
elif self._name == 'ml-1m' or self._name == 'ml-10m':
self.all_rating_info = self._load_raw_rates(os.path.join(self._dir, 'ratings.dat'), '::')
num_test = int(np.ceil(self.all_rating_info.shape[0] * self._test_ratio))
shuffled_idx = np.random.permutation(self.all_rating_info.shape[0])
self.test_rating_info = self.all_rating_info.iloc[shuffled_idx[: num_test]]
self.all_train_rating_info = self.all_rating_info.iloc[shuffled_idx[num_test: ]]
else:
raise NotImplementedError
print('......')
num_valid = int(np.ceil(self.all_train_rating_info.shape[0] * self._valid_ratio))
shuffled_idx = np.random.permutation(self.all_train_rating_info.shape[0])
self.valid_rating_info = self.all_train_rating_info.iloc[shuffled_idx[: num_valid]]
self.train_rating_info = self.all_train_rating_info.iloc[shuffled_idx[num_valid: ]]
self.possible_rating_values = np.unique(self.train_rating_info["rating"].values)
print("All rating pairs : {}".format(self.all_rating_info.shape[0]))
print("\tAll train rating pairs : {}".format(self.all_train_rating_info.shape[0]))
print("\t\tTrain rating pairs : {}".format(self.train_rating_info.shape[0]))
print("\t\tValid rating pairs : {}".format(self.valid_rating_info.shape[0]))
print("\tTest rating pairs : {}".format(self.test_rating_info.shape[0]))
self.user_info = self._drop_unseen_nodes(orign_info=self.user_info,
cmp_col_name="id",
reserved_ids_set=set(self.all_rating_info["user_id"].values),
label="user")
self.movie_info = self._drop_unseen_nodes(orign_info=self.movie_info,
cmp_col_name="id",
reserved_ids_set=set(self.all_rating_info["movie_id"].values),
label="movie")
# Map user/movie to the global id
self.global_user_id_map = {ele: i for i, ele in enumerate(self.user_info['id'])}
self.global_movie_id_map = {ele: i for i, ele in enumerate(self.movie_info['id'])}
print('Total user number = {}, movie number = {}'.format(len(self.global_user_id_map),
len(self.global_movie_id_map)))
self._num_user = len(self.global_user_id_map)
self._num_movie = len(self.global_movie_id_map)
### Generate features
if use_one_hot_fea:
self.user_feature = None
self.movie_feature = None
else:
self.user_feature = mx.nd.array(self._process_user_fea(), ctx=ctx, dtype=np.float32)
self.movie_feature = mx.nd.array(self._process_movie_fea(), ctx=ctx, dtype=np.float32)
if self.user_feature is None:
self.user_feature_shape = (self.num_user, self.num_user)
self.movie_feature_shape = (self.num_movie, self.num_movie)
else:
self.user_feature_shape = self.user_feature.shape
self.movie_feature_shape = self.movie_feature.shape
info_line = "Feature dim: "
info_line += "\nuser: {}".format(self.user_feature_shape)
info_line += "\nmovie: {}".format(self.movie_feature_shape)
print(info_line)
all_train_rating_pairs, all_train_rating_values = self._generate_pair_value(self.all_train_rating_info)
train_rating_pairs, train_rating_values = self._generate_pair_value(self.train_rating_info)
valid_rating_pairs, valid_rating_values = self._generate_pair_value(self.valid_rating_info)
test_rating_pairs, test_rating_values = self._generate_pair_value(self.test_rating_info)
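# _make_labels converts raw rating values into categorical class indices; this relies on
# self.possible_rating_values being sorted, which np.unique above guarantees.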
def _make_labels(ratings):
labels = mx.nd.array(np.searchsorted(self.possible_rating_values, ratings),
ctx=ctx, dtype=np.int32)
return labels
self.train_enc_graph = self._generate_enc_graph(train_rating_pairs, train_rating_values, add_support=True)
self.train_dec_graph = self._generate_dec_graph(train_rating_pairs)
self.train_labels = _make_labels(train_rating_values)
self.train_truths = mx.nd.array(train_rating_values, ctx=ctx, dtype=np.float32)
self.valid_enc_graph = self.train_enc_graph
self.valid_dec_graph = self._generate_dec_graph(valid_rating_pairs)
self.valid_labels = _make_labels(valid_rating_values)
self.valid_truths = mx.nd.array(valid_rating_values, ctx=ctx, dtype=np.float32)
self.test_enc_graph = self._generate_enc_graph(all_train_rating_pairs, all_train_rating_values, add_support=True)
self.test_dec_graph = self._generate_dec_graph(test_rating_pairs)
self.test_labels = _make_labels(test_rating_values)
self.test_truths = mx.nd.array(test_rating_values, ctx=ctx, dtype=np.float32)
def _npairs(graph):
rst = 0
for r in self.possible_rating_values:
rst += graph.number_of_edges(str(r))
return rst
print("Train enc graph: \t#user:{}\t#movie:{}\t#pairs:{}".format(
self.train_enc_graph.number_of_nodes('user'), self.train_enc_graph.number_of_nodes('movie'),
_npairs(self.train_enc_graph)))
print("Train dec graph: \t#user:{}\t#movie:{}\t#pairs:{}".format(
self.train_dec_graph.number_of_nodes('user'), self.train_dec_graph.number_of_nodes('movie'),
self.train_dec_graph.number_of_edges()))
print("Valid enc graph: \t#user:{}\t#movie:{}\t#pairs:{}".format(
self.valid_enc_graph.number_of_nodes('user'), self.valid_enc_graph.number_of_nodes('movie'),
_npairs(self.valid_enc_graph)))
print("Valid dec graph: \t#user:{}\t#movie:{}\t#pairs:{}".format(
self.valid_dec_graph.number_of_nodes('user'), self.valid_dec_graph.number_of_nodes('movie'),
self.valid_dec_graph.number_of_edges()))
print("Test enc graph: \t#user:{}\t#movie:{}\t#pairs:{}".format(
self.test_enc_graph.number_of_nodes('user'), self.test_enc_graph.number_of_nodes('movie'),
_npairs(self.test_enc_graph)))
print("Test dec graph: \t#user:{}\t#movie:{}\t#pairs:{}".format(
self.test_dec_graph.number_of_nodes('user'), self.test_dec_graph.number_of_nodes('movie'),
self.test_dec_graph.number_of_edges()))
def _generate_pair_value(self, rating_info):
rating_pairs = (np.array([self.global_user_id_map[ele] for ele in rating_info["user_id"]],
dtype=np.int64),
np.array([self.global_movie_id_map[ele] for ele in rating_info["movie_id"]],
dtype=np.int64))
rating_values = rating_info["rating"].values.astype(np.float32)
return rating_pairs, rating_values
def _generate_enc_graph(self, rating_pairs, rating_values, add_support=False):
user_movie_R = np.zeros((self._num_user, self._num_movie), dtype=np.float32)
user_movie_R[rating_pairs] = rating_values
data_dict = dict()
num_nodes_dict = {'user': self._num_user, 'movie': self._num_movie}
rating_row, rating_col = rating_pairs
for rating in self.possible_rating_values:
ridx = np.where(rating_values == rating)
rrow = rating_row[ridx]
rcol = rating_col[ridx]
data_dict.update({
('user', str(rating), 'movie'): (rrow, rcol),
('movie', 'rev-%s' % str(rating), 'user'): (rcol, rrow)
})
graph = dgl.heterograph(data_dict, num_nodes_dict=num_nodes_dict)
# sanity check
assert len(rating_pairs[0]) == sum([graph.number_of_edges(et) for et in graph.etypes]) // 2
if add_support:
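# The 'ci'/'cj' values computed below are the 1/sqrt(degree) normalization constants c_i and c_j
# of GCMC, accumulated over all rating types; with symm=False only the in-degree ('ci')
# normalization is effectively used.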
def _calc_norm(x):
x = x.asnumpy().astype('float32')
x[x == 0.] = np.inf
x = mx.nd.array(1. / np.sqrt(x))
return x.expand_dims(1)
user_ci = []
user_cj = []
movie_ci = []
movie_cj = []
for r in self.possible_rating_values:
r = str(r)
user_ci.append(graph['rev-%s' % r].in_degrees())
movie_ci.append(graph[r].in_degrees())
if self._symm:
user_cj.append(graph[r].out_degrees())
movie_cj.append(graph['rev-%s' % r].out_degrees())
else:
user_cj.append(mx.nd.zeros((self.num_user,)))
movie_cj.append(mx.nd.zeros((self.num_movie,)))
user_ci = _calc_norm(mx.nd.add_n(*user_ci))
movie_ci = _calc_norm(mx.nd.add_n(*movie_ci))
if self._symm:
user_cj = _calc_norm(mx.nd.add_n(*user_cj))
movie_cj = _calc_norm(mx.nd.add_n(*movie_cj))
else:
user_cj = mx.nd.ones((self.num_user,))
movie_cj = mx.nd.ones((self.num_movie,))
graph.nodes['user'].data.update({'ci' : user_ci, 'cj' : user_cj})
graph.nodes['movie'].data.update({'ci' : movie_ci, 'cj' : movie_cj})
return graph
def _generate_dec_graph(self, rating_pairs):
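# The decoder graph is a plain bipartite user-movie graph with a single 'rate' edge type:
# one edge per rating pair, with no rating value stored on the edges.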
ones = np.ones_like(rating_pairs[0])
user_movie_ratings_coo = sp.coo_matrix(
(ones, rating_pairs),
shape=(self.num_user, self.num_movie), dtype=np.float32)
g = dgl.bipartite_from_scipy(user_movie_ratings_coo, utype='_U', etype='_E', vtype='_V')
return dgl.heterograph({('user', 'rate', 'movie'): g.edges()},
num_nodes_dict={'user': self.num_user, 'movie': self.num_movie})
@property
def num_links(self):
return self.possible_rating_values.size
@property
def num_user(self):
return self._num_user
@property
def num_movie(self):
return self._num_movie
def _drop_unseen_nodes(self, orign_info, cmp_col_name, reserved_ids_set, label):
# print(" -----------------")
# print("{}: {}(reserved) v.s. {}(from info)".format(label, len(reserved_ids_set),
# len(set(orign_info[cmp_col_name].values))))
if reserved_ids_set != set(orign_info[cmp_col_name].values):
pd_rating_ids = pd.DataFrame(list(reserved_ids_set), columns=["id_graph"])
# print("\torign_info: ({}, {})".format(orign_info.shape[0], orign_info.shape[1]))
data_info = orign_info.merge(pd_rating_ids, left_on=cmp_col_name, right_on='id_graph', how='outer')
data_info = data_info.dropna(subset=[cmp_col_name, 'id_graph'])
data_info = data_info.drop(columns=["id_graph"])
data_info = data_info.reset_index(drop=True)
# print("\tAfter dropping, data shape: ({}, {})".format(data_info.shape[0], data_info.shape[1]))
return data_info
else:
orign_info = orign_info.reset_index(drop=True)
return orign_info
def _load_raw_rates(self, file_path, sep):
"""In MovieLens, the rates have the following format
ml-100k
user id \t movie id \t rating \t timestamp
ml-1m/10m
UserID::MovieID::Rating::Timestamp
timestamp is unix timestamp and can be converted by pd.to_datetime(X, unit='s')
Parameters
----------
file_path : str
Returns
-------
rating_info : pd.DataFrame
"""
rating_info = pd.read_csv(
file_path, sep=sep, header=None,
names=['user_id', 'movie_id', 'rating', 'timestamp'],
dtype={'user_id': np.int32, 'movie_id' : np.int32,
'rating': np.float32, 'timestamp': np.int64}, engine='python')
return rating_info
def _load_raw_user_info(self):
"""In MovieLens, the user attributes file have the following formats:
ml-100k:
user id | age | gender | occupation | zip code
ml-1m:
UserID::Gender::Age::Occupation::Zip-code
For ml-10m, there is no user information. We read the user id from the rating file.
Parameters
----------
name : str
Returns
-------
user_info : pd.DataFrame
"""
if self._name == 'ml-100k':
self.user_info = pd.read_csv(os.path.join(self._dir, 'u.user'), sep='|', header=None,
names=['id', 'age', 'gender', 'occupation', 'zip_code'], engine='python')
elif self._name == 'ml-1m':
self.user_info = pd.read_csv(os.path.join(self._dir, 'users.dat'), sep='::', header=None,
names=['id', 'gender', 'age', 'occupation', 'zip_code'], engine='python')
elif self._name == 'ml-10m':
rating_info = pd.read_csv(
os.path.join(self._dir, 'ratings.dat'), sep='::', header=None,
names=['user_id', 'movie_id', 'rating', 'timestamp'],
dtype={'user_id': np.int32, 'movie_id': np.int32, 'rating': np.float32,
'timestamp': np.int64}, engine='python')
self.user_info = pd.DataFrame(np.unique(rating_info['user_id'].values.astype(np.int32)),
columns=['id'])
else:
raise NotImplementedError
def _process_user_fea(self):
"""
Parameters
----------
user_info : pd.DataFrame
name : str
For ml-100k and ml-1m, the column name is ['id', 'gender', 'age', 'occupation', 'zip_code'].
We take the age, gender, and the one-hot encoding of the occupation as the user features.
For ml-10m, there is no user feature and we set the feature to be a single zero.
Returns
-------
user_features : np.ndarray
"""
if self._name == 'ml-100k' or self._name == 'ml-1m':
ages = self.user_info['age'].values.astype(np.float32)
gender = (self.user_info['gender'] == 'F').values.astype(np.float32)
all_occupations = set(self.user_info['occupation'])
occupation_map = {ele: i for i, ele in enumerate(all_occupations)}
occupation_one_hot = np.zeros(shape=(self.user_info.shape[0], len(all_occupations)),
dtype=np.float32)
occupation_one_hot[np.arange(self.user_info.shape[0]),
np.array([occupation_map[ele] for ele in self.user_info['occupation']])] = 1
user_features = np.concatenate([ages.reshape((self.user_info.shape[0], 1)) / 50.0,
gender.reshape((self.user_info.shape[0], 1)),
occupation_one_hot], axis=1)
elif self._name == 'ml-10m':
user_features = np.zeros(shape=(self.user_info.shape[0], 1), dtype=np.float32)
else:
raise NotImplementedError
return user_features
def _load_raw_movie_info(self):
"""In MovieLens, the movie attributes may have the following formats:
In ml_100k:
movie id | movie title | release date | video release date | IMDb URL | [genres]
In ml_1m, ml_10m:
MovieID::Title (Release Year)::Genres
Also, Genres are separated by |, e.g., Adventure|Animation|Children|Comedy|Fantasy
Parameters
----------
name : str
Returns
-------
movie_info : pd.DataFrame
For ml-100k, the column names are ['id', 'title', 'release_date', 'video_release_date', 'url'] + [GENRES (19)]
For ml-1m and ml-10m, the column names are ['id', 'title'] + [GENRES (18/20)]
"""
if self._name == 'ml-100k':
GENRES = GENRES_ML_100K
elif self._name == 'ml-1m':
GENRES = GENRES_ML_1M
elif self._name == 'ml-10m':
GENRES = GENRES_ML_10M
else:
raise NotImplementedError
if self._name == 'ml-100k':
file_path = os.path.join(self._dir, 'u.item')
self.movie_info = pd.read_csv(file_path, sep='|', header=None,
names=['id', 'title', 'release_date', 'video_release_date', 'url'] + GENRES,
engine='python')
elif self._name == 'ml-1m' or self._name == 'ml-10m':
file_path = os.path.join(self._dir, 'movies.dat')
movie_info = pd.read_csv(file_path, sep='::', header=None,
names=['id', 'title', 'genres'], engine='python')
genre_map = {ele: i for i, ele in enumerate(GENRES)}
genre_map['Children\'s'] = genre_map['Children']
genre_map['Childrens'] = genre_map['Children']
movie_genres = np.zeros(shape=(movie_info.shape[0], len(GENRES)), dtype=np.float32)
for i, genres in enumerate(movie_info['genres']):
for ele in genres.split('|'):
if ele in genre_map:
movie_genres[i, genre_map[ele]] = 1.0
else:
print('genres not found, filled with unknown: {}'.format(genres))
movie_genres[i, genre_map['unknown']] = 1.0
for idx, genre_name in enumerate(GENRES):
assert idx == genre_map[genre_name]
movie_info[genre_name] = movie_genres[:, idx]
self.movie_info = movie_info.drop(columns=["genres"])
else:
raise NotImplementedError
def _process_movie_fea(self):
"""
Parameters
----------
movie_info : pd.DataFrame
name : str
Returns
-------
movie_features : np.ndarray
Movie features generated by concatenating the title embedding, the normalized release year and the genre indicators
"""
if self._name == 'ml-100k':
GENRES = GENRES_ML_100K
elif self._name == 'ml-1m':
GENRES = GENRES_ML_1M
elif self._name == 'ml-10m':
GENRES = GENRES_ML_10M
else:
raise NotImplementedError
word_embedding = nlp.embedding.GloVe('glove.840B.300d')
tokenizer = nlp.data.transforms.SpacyTokenizer()
title_embedding = np.zeros(shape=(self.movie_info.shape[0], 300), dtype=np.float32)
release_years = np.zeros(shape=(self.movie_info.shape[0], 1), dtype=np.float32)
p = re.compile(r'(.+)\s*\((\d+)\)')
for i, title in enumerate(self.movie_info['title']):
match_res = p.match(title)
if match_res is None:
print('{} cannot be matched, index={}, name={}'.format(title, i, self._name))
title_context, year = title, 1950
else:
title_context, year = match_res.groups()
# We use the average of the GloVe word vectors of the title tokens
title_embedding[i, :] = word_embedding[tokenizer(title_context)].asnumpy().mean(axis=0)
release_years[i] = float(year)
movie_features = np.concatenate((title_embedding,
(release_years - 1950.0) / 100.0,
self.movie_info[GENRES]),
axis=1)
return movie_features
if __name__ == '__main__':
MovieLens("ml-100k", ctx=mx.cpu(), symm=True)
"""NN modules"""
import math
import numpy as np
import mxnet as mx
import mxnet.ndarray as F
from mxnet.gluon import nn, Block
import dgl.function as fn
from utils import get_activation
class GCMCLayer(Block):
r"""GCMC layer
.. math::
z_i^{(l+1)} = \sigma_{agg}\left[\mathrm{agg}\left(
\sum_{j\in\mathcal{N}_{i,1}}\frac{1}{c_{ij}}W_1h_j, \ldots,
\sum_{j\in\mathcal{N}_{i,R}}\frac{1}{c_{ij}}W_Rh_j
\right)\right]
After that, apply an extra output projection:
.. math::
h_i^{(l+1)} = \sigma_{out}\left(W_o z_i^{(l+1)}\right)
The equation is applied to both user nodes and movie nodes and the parameters
are not shared unless ``share_user_item_param`` is true.
Parameters
----------
rating_vals : list of int or float
Possible rating values.
user_in_units : int
Size of user input feature
movie_in_units : int
Size of movie input feature
msg_units : int
Size of message :math:`W_rh_j`
out_units : int
Size of the final output user and movie features
dropout_rate : float, optional
Dropout rate (Default: 0.0)
agg : str, optional
Function to aggregate messages of different ratings.
Could be any of the supported cross type reducers:
"sum", "max", "min", "mean", "stack".
(Default: "stack")
agg_act : callable, str, optional
Activation function :math:`\sigma_{agg}`. (Default: None)
out_act : callable, str, optional
Activation function :math:`\sigma_{out}`. (Default: None)
share_user_item_param : bool, optional
If true, user node and movie node share the same set of parameters.
Require ``user_in_units`` and ``movie_in_units`` to be the same.
(Default: False)
"""
def __init__(self,
rating_vals,
user_in_units,
movie_in_units,
msg_units,
out_units,
dropout_rate=0.0,
agg='stack', # or 'sum'
agg_act=None,
out_act=None,
share_user_item_param=False):
super(GCMCLayer, self).__init__()
self.rating_vals = rating_vals
self.agg = agg
self.share_user_item_param = share_user_item_param
if agg == 'stack':
# divide the original msg unit size by number of ratings to keep
# the dimensionality
assert msg_units % len(rating_vals) == 0
msg_units = msg_units // len(rating_vals)
with self.name_scope():
self.dropout = nn.Dropout(dropout_rate)
self.W_r = {}
for rating in rating_vals:
rating = str(rating)
if share_user_item_param and user_in_units == movie_in_units:
self.W_r[rating] = self.params.get(
'W_r_%s' % rating, shape=(user_in_units, msg_units),
dtype=np.float32, allow_deferred_init=True)
self.W_r['rev-%s' % rating] = self.W_r[rating]
else:
self.W_r[rating] = self.params.get(
'W_r_%s' % rating, shape=(user_in_units, msg_units),
dtype=np.float32, allow_deferred_init=True)
self.W_r['rev-%s' % rating] = self.params.get(
'revW_r_%s' % rating, shape=(movie_in_units, msg_units),
dtype=np.float32, allow_deferred_init=True)
self.ufc = nn.Dense(out_units)
if share_user_item_param:
self.ifc = self.ufc
else:
self.ifc = nn.Dense(out_units)
self.agg_act = get_activation(agg_act)
self.out_act = get_activation(out_act)
def forward(self, graph, ufeat=None, ifeat=None):
"""Forward function
Normalizer constant :math:`c_{ij}` is stored as two node data "ci"
and "cj".
Parameters
----------
graph : DGLHeteroGraph
User-movie rating graph. It should contain two node types: "user"
and "movie", and one edge type per rating value.
ufeat : mx.nd.NDArray, optional
User features. If None, an identity matrix is used.
ifeat : mx.nd.NDArray, optional
Movie features. If None, an identity matrix is used.
Returns
-------
new_ufeat : mx.nd.NDArray
New user features
new_ifeat : mx.nd.NDArray
New movie features
"""
num_u = graph.number_of_nodes('user')
num_i = graph.number_of_nodes('movie')
funcs = {}
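# Build one (message, reduce) pair per rating relation and its reverse; multi_update_all below
# then combines the per-relation results with the cross-type reducer self.agg ('stack' or 'sum').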
for i, rating in enumerate(self.rating_vals):
rating = str(rating)
# W_r * x
x_u = dot_or_identity(ufeat, self.W_r[rating].data())
x_i = dot_or_identity(ifeat, self.W_r['rev-%s' % rating].data())
# left norm and dropout
x_u = x_u * self.dropout(graph.nodes['user'].data['cj'])
x_i = x_i * self.dropout(graph.nodes['movie'].data['cj'])
graph.nodes['user'].data['h%d' % i] = x_u
graph.nodes['movie'].data['h%d' % i] = x_i
funcs[rating] = (fn.copy_u('h%d' % i, 'm'), fn.sum('m', 'h'))
funcs['rev-%s' % rating] = (fn.copy_u('h%d' % i, 'm'), fn.sum('m', 'h'))
# message passing
graph.multi_update_all(funcs, self.agg)
ufeat = graph.nodes['user'].data.pop('h').reshape((num_u, -1))
ifeat = graph.nodes['movie'].data.pop('h').reshape((num_i, -1))
# right norm
ufeat = ufeat * graph.nodes['user'].data['ci']
ifeat = ifeat * graph.nodes['movie'].data['ci']
# fc and non-linear
ufeat = self.agg_act(ufeat)
ifeat = self.agg_act(ifeat)
ufeat = self.dropout(ufeat)
ifeat = self.dropout(ifeat)
ufeat = self.ufc(ufeat)
ifeat = self.ifc(ifeat)
return self.out_act(ufeat), self.out_act(ifeat)
class BiDecoder(Block):
r"""Bilinear decoder.
.. math::
p(M_{ij}=r) = \text{softmax}(u_i^TQ_rv_j)
The trainable parameter :math:`Q_r` is further decomposed to a linear
combination of basis weight matrices :math:`P_s`:
.. math::
Q_r = \sum_{s=1}^{b} a_{rs}P_s
Parameters
----------
rating_vals : list of int or float
Possible rating values.
in_units : int
Size of input user and movie features
num_basis_functions : int, optional
Number of basis. (Default: 2)
dropout_rate : float, optional
Dropout rate (Default: 0.0)
"""
def __init__(self,
rating_vals,
in_units,
num_basis_functions=2,
dropout_rate=0.0):
super(BiDecoder, self).__init__()
self.rating_vals = rating_vals
self._num_basis_functions = num_basis_functions
self.dropout = nn.Dropout(dropout_rate)
self.Ps = []
with self.name_scope():
for i in range(num_basis_functions):
self.Ps.append(self.params.get(
'Ps_%d' % i, shape=(in_units, in_units),
#init=mx.initializer.Orthogonal(scale=1.1, rand_type='normal'),
init=mx.initializer.Xavier(magnitude=math.sqrt(2.0)),
allow_deferred_init=True))
self.rate_out = nn.Dense(units=len(rating_vals), flatten=False, use_bias=False)
def forward(self, graph, ufeat, ifeat):
"""Forward function.
Parameters
----------
graph : DGLHeteroGraph
"Flattened" user-movie graph with only one edge type.
ufeat : mx.nd.NDArray
User embeddings. Shape: (|V_u|, D)
ifeat : mx.nd.NDArray
Movie embeddings. Shape: (|V_m|, D)
Returns
-------
mx.nd.NDArray
Predicted scores for each user-movie edge.
"""
graph = graph.local_var()
ufeat = self.dropout(ufeat)
ifeat = self.dropout(ifeat)
graph.nodes['movie'].data['h'] = ifeat
basis_out = []
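# For each basis matrix P_s, compute the per-edge score u_i^T P_s v_j via u_dot_v;
# the Dense layer rate_out then mixes the basis scores, playing the role of the a_{rs} coefficients.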
for i in range(self._num_basis_functions):
graph.nodes['user'].data['h'] = F.dot(ufeat, self.Ps[i].data())
graph.apply_edges(fn.u_dot_v('h', 'h', 'sr'))
basis_out.append(graph.edata['sr'])
out = F.concat(*basis_out, dim=1)
out = self.rate_out(out)
return out
def dot_or_identity(A, B):
# if A is None, treat as identity matrix
if A is None:
return B
else:
return mx.nd.dot(A, B)
"""Training script"""
import os, time
import argparse
import logging
import random
import string
import numpy as np
import mxnet as mx
from mxnet import gluon
from data import MovieLens
from model import GCMCLayer, BiDecoder
from utils import get_activation, parse_ctx, gluon_net_info, gluon_total_param_num, \
params_clip_global_norm, MetricLogger
from mxnet.gluon import Block
class Net(Block):
def __init__(self, args, **kwargs):
super(Net, self).__init__(**kwargs)
self._act = get_activation(args.model_activation)
with self.name_scope():
self.encoder = GCMCLayer(args.rating_vals,
args.src_in_units,
args.dst_in_units,
args.gcn_agg_units,
args.gcn_out_units,
args.gcn_dropout,
args.gcn_agg_accum,
agg_act=self._act,
share_user_item_param=args.share_param)
self.decoder = BiDecoder(args.rating_vals,
in_units=args.gcn_out_units,
num_basis_functions=args.gen_r_num_basis_func)
def forward(self, enc_graph, dec_graph, ufeat, ifeat):
user_out, movie_out = self.encoder(
enc_graph,
ufeat,
ifeat)
pred_ratings = self.decoder(dec_graph, user_out, movie_out)
return pred_ratings
def evaluate(args, net, dataset, segment='valid'):
possible_rating_values = dataset.possible_rating_values
nd_possible_rating_values = mx.nd.array(possible_rating_values, ctx=args.ctx, dtype=np.float32)
if segment == "valid":
rating_values = dataset.valid_truths
enc_graph = dataset.valid_enc_graph
dec_graph = dataset.valid_dec_graph
elif segment == "test":
rating_values = dataset.test_truths
enc_graph = dataset.test_enc_graph
dec_graph = dataset.test_dec_graph
else:
raise NotImplementedError
# Evaluate RMSE
with mx.autograd.predict_mode():
pred_ratings = net(enc_graph, dec_graph,
dataset.user_feature, dataset.movie_feature)
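# Turn the per-class logits into an expected rating: softmax over the rating classes,
# then a weighted sum of the possible rating values.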
real_pred_ratings = (mx.nd.softmax(pred_ratings, axis=1) *
nd_possible_rating_values.reshape((1, -1))).sum(axis=1)
rmse = mx.nd.square(real_pred_ratings - rating_values).mean().asscalar()
rmse = np.sqrt(rmse)
return rmse
def train(args):
print(args)
dataset = MovieLens(args.data_name, args.ctx, use_one_hot_fea=args.use_one_hot_fea, symm=args.gcn_agg_norm_symm,
test_ratio=args.data_test_ratio, valid_ratio=args.data_valid_ratio)
print("Loading data finished ...\n")
args.src_in_units = dataset.user_feature_shape[1]
args.dst_in_units = dataset.movie_feature_shape[1]
args.rating_vals = dataset.possible_rating_values
### build the net
net = Net(args=args)
net.initialize(init=mx.init.Xavier(factor_type='in'), ctx=args.ctx)
net.hybridize()
nd_possible_rating_values = mx.nd.array(dataset.possible_rating_values, ctx=args.ctx, dtype=np.float32)
rating_loss_net = gluon.loss.SoftmaxCELoss()
rating_loss_net.hybridize()
trainer = gluon.Trainer(net.collect_params(), args.train_optimizer, {'learning_rate': args.train_lr})
print("Loading network finished ...\n")
### prepare training data
train_gt_labels = dataset.train_labels
train_gt_ratings = dataset.train_truths
### prepare the logger
train_loss_logger = MetricLogger(['iter', 'loss', 'rmse'], ['%d', '%.4f', '%.4f'],
os.path.join(args.save_dir, 'train_loss%d.csv' % args.save_id))
valid_loss_logger = MetricLogger(['iter', 'rmse'], ['%d', '%.4f'],
os.path.join(args.save_dir, 'valid_loss%d.csv' % args.save_id))
test_loss_logger = MetricLogger(['iter', 'rmse'], ['%d', '%.4f'],
os.path.join(args.save_dir, 'test_loss%d.csv' % args.save_id))
### declare the loss information
best_valid_rmse = np.inf
no_better_valid = 0
best_iter = -1
avg_gnorm = 0
count_rmse = 0
count_num = 0
count_loss = 0
dataset.train_enc_graph = dataset.train_enc_graph.to(args.ctx)
dataset.train_dec_graph = dataset.train_dec_graph.to(args.ctx)
dataset.valid_enc_graph = dataset.train_enc_graph
dataset.valid_dec_graph = dataset.valid_dec_graph.to(args.ctx)
dataset.test_enc_graph = dataset.test_enc_graph.to(args.ctx)
dataset.test_dec_graph = dataset.test_dec_graph.to(args.ctx)
print("Start training ...")
dur = []
for iter_idx in range(1, args.train_max_iter):
if iter_idx > 3:
t0 = time.time()
with mx.autograd.record():
pred_ratings = net(dataset.train_enc_graph, dataset.train_dec_graph,
dataset.user_feature, dataset.movie_feature)
loss = rating_loss_net(pred_ratings, train_gt_labels).mean()
loss.backward()
count_loss += loss.asscalar()
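# Clip the global gradient norm across all parameters before the optimizer step
# (see params_clip_global_norm in utils.py).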
gnorm = params_clip_global_norm(net.collect_params(), args.train_grad_clip, args.ctx)
avg_gnorm += gnorm
trainer.step(1.0)
if iter_idx > 3:
dur.append(time.time() - t0)
if iter_idx == 1:
print("Total #Param of net: %d" % (gluon_total_param_num(net)))
print(gluon_net_info(net, save_path=os.path.join(args.save_dir, 'net%d.txt' % args.save_id)))
real_pred_ratings = (mx.nd.softmax(pred_ratings, axis=1) *
nd_possible_rating_values.reshape((1, -1))).sum(axis=1)
rmse = mx.nd.square(real_pred_ratings - train_gt_ratings).sum()
count_rmse += rmse.asscalar()
count_num += pred_ratings.shape[0]
if iter_idx % args.train_log_interval == 0:
train_loss_logger.log(iter=iter_idx,
loss=count_loss/(iter_idx+1), rmse=count_rmse/count_num)
logging_str = "Iter={}, gnorm={:.3f}, loss={:.4f}, rmse={:.4f}, time={:.4f}".format(
iter_idx, avg_gnorm/args.train_log_interval,
count_loss/iter_idx, count_rmse/count_num,
np.average(dur))
avg_gnorm = 0
count_rmse = 0
count_num = 0
if iter_idx % args.train_valid_interval == 0:
valid_rmse = evaluate(args=args, net=net, dataset=dataset, segment='valid')
valid_loss_logger.log(iter = iter_idx, rmse = valid_rmse)
logging_str += ',\tVal RMSE={:.4f}'.format(valid_rmse)
if valid_rmse < best_valid_rmse:
best_valid_rmse = valid_rmse
no_better_valid = 0
best_iter = iter_idx
#net.save_parameters(filename=os.path.join(args.save_dir, 'best_valid_net{}.params'.format(args.save_id)))
test_rmse = evaluate(args=args, net=net, dataset=dataset, segment='test')
best_test_rmse = test_rmse
test_loss_logger.log(iter=iter_idx, rmse=test_rmse)
logging_str += ', Test RMSE={:.4f}'.format(test_rmse)
else:
no_better_valid += 1
if no_better_valid > args.train_early_stopping_patience\
and trainer.learning_rate <= args.train_min_lr:
logging.info("Early stopping threshold reached. Stop training.")
break
if no_better_valid > args.train_decay_patience:
new_lr = max(trainer.learning_rate * args.train_lr_decay_factor, args.train_min_lr)
if new_lr < trainer.learning_rate:
logging.info("\tChange the LR to %g" % new_lr)
trainer.set_learning_rate(new_lr)
no_better_valid = 0
if iter_idx % args.train_log_interval == 0:
print(logging_str)
print('Best Iter Idx={}, Best Valid RMSE={:.4f}, Best Test RMSE={:.4f}'.format(
best_iter, best_valid_rmse, best_test_rmse))
train_loss_logger.close()
valid_loss_logger.close()
test_loss_logger.close()
def config():
parser = argparse.ArgumentParser(description='Run the baseline method.')
parser.add_argument('--seed', default=123, type=int)
parser.add_argument('--ctx', dest='ctx', default='gpu0', type=str,
help='Running Context. E.g `--ctx gpu` or `--ctx gpu0,gpu1` or `--ctx cpu`')
parser.add_argument('--save_dir', type=str, help='The saving directory')
parser.add_argument('--save_id', type=int, help='The saving log id')
parser.add_argument('--silent', action='store_true')
parser.add_argument('--data_name', default='ml-1m', type=str,
help='The dataset name: ml-100k, ml-1m, ml-10m')
parser.add_argument('--data_test_ratio', type=float, default=0.1) ## for ml-100k the test ratio is 0.2
parser.add_argument('--data_valid_ratio', type=float, default=0.1)
parser.add_argument('--use_one_hot_fea', action='store_true', default=False)
#parser.add_argument('--model_remove_rating', type=bool, default=False)
parser.add_argument('--model_activation', type=str, default="leaky")
parser.add_argument('--gcn_dropout', type=float, default=0.7)
parser.add_argument('--gcn_agg_norm_symm', type=bool, default=True)
parser.add_argument('--gcn_agg_units', type=int, default=500)
parser.add_argument('--gcn_agg_accum', type=str, default="sum")
parser.add_argument('--gcn_out_units', type=int, default=75)
parser.add_argument('--gen_r_num_basis_func', type=int, default=2)
# parser.add_argument('--train_rating_batch_size', type=int, default=10000)
parser.add_argument('--train_max_iter', type=int, default=2000)
parser.add_argument('--train_log_interval', type=int, default=1)
parser.add_argument('--train_valid_interval', type=int, default=1)
parser.add_argument('--train_optimizer', type=str, default="adam")
parser.add_argument('--train_grad_clip', type=float, default=1.0)
parser.add_argument('--train_lr', type=float, default=0.01)
parser.add_argument('--train_min_lr', type=float, default=0.001)
parser.add_argument('--train_lr_decay_factor', type=float, default=0.5)
parser.add_argument('--train_decay_patience', type=int, default=50)
parser.add_argument('--train_early_stopping_patience', type=int, default=100)
parser.add_argument('--share_param', default=False, action='store_true')
args = parser.parse_args()
args.ctx = parse_ctx(args.ctx)[0]
### configure save_dir to save all the info
if args.save_dir is None:
args.save_dir = args.data_name+"_" + ''.join(random.choices(string.ascii_uppercase + string.digits, k=2))
if args.save_id is None:
args.save_id = np.random.randint(20)
args.save_dir = os.path.join("log", args.save_dir)
if not os.path.isdir(args.save_dir):
os.makedirs(args.save_dir)
return args
if __name__ == '__main__':
args = config()
np.random.seed(args.seed)
mx.random.seed(args.seed, args.ctx)
train(args)
import ast
import os
import csv
import inspect
import logging
import re
import mxnet.ndarray as nd
from mxnet import gluon
from mxnet.gluon import nn
import mxnet as mx
import numpy as np
from collections import OrderedDict
class MetricLogger(object):
def __init__(self, attr_names, parse_formats, save_path):
self._attr_format_dict = OrderedDict(zip(attr_names, parse_formats))
self._file = open(save_path, 'w')
self._csv = csv.writer(self._file)
self._csv.writerow(attr_names)
self._file.flush()
def log(self, **kwargs):
self._csv.writerow([parse_format % kwargs[attr_name]
for attr_name, parse_format in self._attr_format_dict.items()])
self._file.flush()
def close(self):
self._file.close()
def parse_ctx(ctx_args):
ctx = re.findall(r'([a-z]+)(\d*)', ctx_args)
ctx = [(device, int(num)) if len(num) > 0 else (device, 0) for device, num in ctx]
ctx = [mx.Context(*ele) for ele in ctx]
return ctx
def gluon_total_param_num(net):
return sum([np.prod(v.shape) for v in net.collect_params().values()])
def gluon_net_info(net, save_path=None):
info_str = 'Total Param Number: {}\n'.format(gluon_total_param_num(net)) +\
'Params:\n'
for k, v in net.collect_params().items():
info_str += '\t{}: {}, {}\n'.format(k, v.shape, np.prod(v.shape))
info_str += str(net)
if save_path is not None:
with open(save_path, 'w') as f:
f.write(info_str)
return info_str
def params_clip_global_norm(param_dict, clip, ctx):
grads = [p.grad(ctx) for p in param_dict.values()]
gnorm = gluon.utils.clip_global_norm(grads, clip)
return gnorm
def get_activation(act):
"""Get the activation based on the act string
Parameters
----------
act: str or HybridBlock
Returns
-------
ret: HybridBlock
"""
if act is None:
return lambda x: x
if isinstance(act, str):
if act == 'leaky':
return nn.LeakyReLU(0.1)
elif act in ['relu', 'sigmoid', 'tanh', 'softrelu', 'softsign']:
return nn.Activation(act)
else:
raise NotImplementedError
else:
return act
# Stochastic Training for Graph Convolutional Networks
* Paper: [Control Variate](https://arxiv.org/abs/1710.10568)
* Paper: [Skip Connection](https://arxiv.org/abs/1809.05343)
* Author's code: [https://github.com/thu-ml/stochastic_gcn](https://github.com/thu-ml/stochastic_gcn)
### Dependencies
- MXNet nightly build
```bash
pip install mxnet --pre
```
### Neighbor Sampling & Skip Connection
cora: test accuracy ~83% with `--num-neighbors 2`, ~84% by training on the full graph
```
DGLBACKEND=mxnet python3 train.py --model gcn_ns --dataset cora --self-loop --num-neighbors 2 --batch-size 1000 --test-batch-size 5000
```
citeseer: test accuracy ~69% with `--num-neighbors 2`, ~70% by training on the full graph
```
DGLBACKEND=mxnet python3 train.py --model gcn_ns --dataset citeseer --self-loop --num-neighbors 2 --batch-size 1000 --test-batch-size 5000
```
pubmed: test accuracy ~78% with `--num-neighbors 3`, ~77% by training on the full graph
```
DGLBACKEND=mxnet python3 train.py --model gcn_ns --dataset pubmed --self-loop --num-neighbors 3 --batch-size 1000 --test-batch-size 5000
```
reddit: test accuracy ~91% with `--num-neighbors 3` and `--batch-size 1000`, ~93% by training on the full graph
```
DGLBACKEND=mxnet python3 train.py --model gcn_ns --dataset reddit-self-loop --num-neighbors 3 --batch-size 1000 --test-batch-size 5000 --n-hidden 64
```
### Control Variate & Skip Connection
cora: test accuracy ~84% with `--num-neighbors 1`, ~84% by training on the full graph
```
DGLBACKEND=mxnet python3 train.py --model gcn_cv --dataset cora --self-loop --num-neighbors 1 --batch-size 1000000 --test-batch-size 1000000
```
citeseer: test accuracy ~69% with `--num-neighbors 1`, ~70% by training on the full graph
```
DGLBACKEND=mxnet python3 train.py --model gcn_cv --dataset citeseer --self-loop --num-neighbors 1 --batch-size 1000000 --test-batch-size 1000000
```
pubmed: test accuracy ~79% with `--num-neighbors 1`, ~77% by training on the full graph
```
DGLBACKEND=mxnet python3 train.py --model gcn_cv --dataset pubmed --self-loop --num-neighbors 1 --batch-size 1000000 --test-batch-size 1000000
```
reddit: test accuracy ~93% with `--num-neighbors 1` and `--batch-size 1000`, ~93% by training on the full graph
```
DGLBACKEND=mxnet python3 train.py --model gcn_cv --dataset reddit-self-loop --num-neighbors 1 --batch-size 10000 --test-batch-size 5000 --n-hidden 64
```
### Control Variate & GraphSAGE-mean
Following [Control Variate](https://arxiv.org/abs/1710.10568), we use the mean-pooling architecture GraphSAGE-mean, with two linear layers and layer normalization per graph convolution layer (a rough sketch of such a layer follows the run command below).
reddit: test accuracy 96.1% with `--num-neighbors 1` and `--batch-size 1000`, ~96.2% in [Control Variate](https://arxiv.org/abs/1710.10568) with `--num-neighbors 2` and `--batch-size 1000`
```
DGLBACKEND=mxnet python3 train.py --model graphsage_cv --batch-size 1000 --test-batch-size 5000 --n-epochs 50 --dataset reddit --num-neighbors 1 --n-hidden 128 --dropout 0.2 --weight-decay 0
```
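As a rough illustration of the block described above (and only an illustration: the actual example additionally uses neighbor sampling and control variates, and the class/variable names here are hypothetical), a full-graph GraphSAGE-mean layer might look like:

```python
from mxnet.gluon import nn, Block
import dgl.function as fn

class SAGEMeanLayer(Block):
    """Sketch of a GraphSAGE-mean layer: mean-pool neighbor features,
    apply two linear layers (self and neighbor), then layer normalization."""
    def __init__(self, in_feats, out_feats, **kwargs):
        super(SAGEMeanLayer, self).__init__(**kwargs)
        with self.name_scope():
            self.fc_self = nn.Dense(out_feats, in_units=in_feats)
            self.fc_neigh = nn.Dense(out_feats, in_units=in_feats)
            self.norm = nn.LayerNorm()

    def forward(self, g, h):
        g = g.local_var()
        g.ndata['h'] = h
        # mean aggregation over incoming neighbors
        g.update_all(fn.copy_u('h', 'm'), fn.mean('m', 'neigh'))
        return self.norm(self.fc_self(h) + self.fc_neigh(g.ndata['neigh']))
```

Stacking a few such layers with a nonlinearity in between gives the full-graph counterpart of the sampled model trained by the command above.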
### Run multi-processing training
When training a GNN model with multiple processes, there are two steps.
Step 1: run a graph store server separately that loads the reddit dataset with four workers.
```
python3 examples/mxnet/sampling/run_store_server.py --dataset reddit --num-workers 4
```
Step 2: run four workers to train GraphSage on the reddit dataset.
```
python3 ../incubator-mxnet/tools/launch.py -n 4 -s 1 --launcher local python3 multi_process_train.py --model graphsage_cv --batch-size 2500 --test-batch-size 5000 --n-epochs 1 --graph-name reddit --num-neighbors 1 --n-hidden 128 --dropout 0.2 --weight-decay 0
```
# Stochastic Training for Graph Convolutional Networks Using Distributed Sampler
* Paper: [Control Variate](https://arxiv.org/abs/1710.10568)
* Paper: [Skip Connection](https://arxiv.org/abs/1809.05343)
* Author's code: [https://github.com/thu-ml/stochastic_gcn](https://github.com/thu-ml/stochastic_gcn)
### Dependencies
- MXNet nightly build
```bash
pip install mxnet --pre
```
### Usage Guide
Assume that the user has already launched two instances (`instance_0` and `instance_1`) on AWS EC2, and that the two instances can reach each other over TCP/IP. We treat `instance_0` as the `Trainer` and `instance_1` as the `Sampler`. The user can then start the trainer process and the sampler process on the two instances separately. We provide a set of scripts to start the trainer and sampler processes; users only need to change `--ip` to their own IP address.
Once the trainer process starts, users will see logging output like the following:
```
[04:48:20] .../socket_communicator.cc:68: Bind to 127.0.0.1:2049
[04:48:20] .../socket_communicator.cc:74: Listen on 127.0.0.1:2049, wait sender connect ...
```
After that, the user can start the sampler process. On the sampler instance, the `--num-sampler` option sets the number of sampler processes: `sampler.py` starts `--num-sampler` processes concurrently to maximize system utilization. Users can also launch many samplers in parallel across a set of machines. For example, with `10` sampler instances each started with `--num-sampler 2`, the trainer instance must set `--num-sampler` to `20`, the total number of sampler processes. A minimal sketch of the trainer/sampler pairing is shown below.
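This sketch is illustrative only: the address, the toy graph, and the sampling parameters are placeholders, and the real `sampler.py` and `train.py` scripts add the model-specific preprocessing and training loop.
```python
import dgl
import networkx as nx
from dgl import DGLGraph

addr = '127.0.0.1:50051'                         # placeholder, matches --ip
g = DGLGraph(nx.path_graph(100), readonly=True)  # toy stand-in for the dataset

def run_sampler():
    # Sampler side: one SamplerSender per sampler process; the namebook maps
    # receiver (trainer) ids to their addresses.
    sender = dgl.contrib.sampling.SamplerSender(namebook={0: addr})
    for nf in dgl.contrib.sampling.NeighborSampler(g, 10, 2,
                                                   neighbor_type='in',
                                                   num_hops=2):
        sender.send(nf, 0)    # ship each sampled NodeFlow to trainer 0
    sender.signal(0)          # tell the trainer the epoch is finished

def run_trainer():
    # Trainer side: a SamplerReceiver yields the NodeFlows sent by num_sender
    # sampler processes; the training loop consumes them like a local sampler.
    receiver = dgl.contrib.sampling.SamplerReceiver(graph=g, addr=addr,
                                                    num_sender=1)
    for nf in receiver:
        pass                  # forward/backward pass goes here
```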
### Neighbor Sampling & Skip Connection
#### cora
Test accuracy ~83% with `--num-neighbors 2`, ~84% by training on the full graph
Trainer side:
```
DGLBACKEND=mxnet python3 train.py --model gcn_ns --dataset cora --self-loop --num-neighbors 2 --batch-size 1000 --test-batch-size 5000 --ip 127.0.0.1:50051 --num-sampler 1
```
Sampler side:
```
OMP_NUM_THREADS=1 DGLBACKEND=mxnet python3 sampler.py --model gcn_ns --dataset cora --self-loop --num-neighbors 2 --batch-size 1000 --ip 127.0.0.1:50051 --num-sampler 1
```
#### citeseer
Test accuracy ~69% with `--num-neighbors 2`, ~70% by training on the full graph
Trainer side:
```
DGLBACKEND=mxnet python3 train.py --model gcn_ns --dataset citeseer --self-loop --num-neighbors 2 --batch-size 1000 --test-batch-size 5000 --ip 127.0.0.1:50051 --num-sampler 1
```
Sampler side:
```
OMP_NUM_THREADS=1 DGLBACKEND=mxnet python3 sampler.py --model gcn_ns --dataset citeseer --self-loop --num-neighbors 2 --batch-size 1000 --ip 127.0.0.1:50051 --num-sampler 1
```
#### pubmed
Test accuracy ~78% with `--num-neighbors 3`, ~77% by training on the full graph
Trainer side:
```
DGLBACKEND=mxnet python3 train.py --model gcn_ns --dataset pubmed --self-loop --num-neighbors 3 --batch-size 1000 --test-batch-size 5000 --ip 127.0.0.1:50051 --num-sampler 1
```
Sampler side:
```
OMP_NUM_THREADS=1 DGLBACKEND=mxnet python3 sampler.py --model gcn_ns --dataset pubmed --self-loop --num-neighbors 3 --batch-size 1000 --ip 127.0.0.1:50051 --num-sampler 1
```
#### reddit
Test accuracy ~91% with `--num-neighbors 2` and `--batch-size 1000`, ~93% by training on the full graph
Trainer side:
```
DGLBACKEND=mxnet python3 train.py --model gcn_ns --dataset reddit-self-loop --num-neighbors 2 --batch-size 1000 --test-batch-size 5000 --n-hidden 64 --ip 127.0.0.1:2049 --num-sampler 1
```
Sampler side:
```
OMP_NUM_THREADS=1 DGLBACKEND=mxnet python3 sampler.py --model gcn_ns --dataset reddit-self-loop --num-neighbors 2 --batch-size 1000 --ip 127.0.0.1:2049 --num-sampler 1
```
### Control Variate & Skip Connection
#### cora
Test accuracy ~84% with `--num-neighbors 1`, ~84% by training on the full graph
Trainer side:
```
DGLBACKEND=mxnet python3 train.py --model gcn_cv --dataset cora --self-loop --num-neighbors 1 --batch-size 1000000 --test-batch-size 1000000 --ip 127.0.0.1:50051 --num-sampler 1
```
Sampler side:
```
OMP_NUM_THREADS=1 DGLBACKEND=mxnet python3 sampler.py --model gcn_cv --dataset cora --self-loop --num-neighbors 1 --batch-size 1000000 --ip 127.0.0.1:50051 --num-sampler 1
```
#### citeseer
Test accuracy ~69% with `--num-neighbors 1`, ~70% by training on the full graph
Trainer side:
```
DGLBACKEND=mxnet python3 train.py --model gcn_cv --dataset citeseer --self-loop --num-neighbors 1 --batch-size 1000000 --test-batch-size 1000000 --ip 127.0.0.1:50051 --num-sampler 1
```
Sampler side:
```
OMP_NUM_THREADS=1 DGLBACKEND=mxnet python3 sampler.py --model gcn_cv --dataset citeseer --self-loop --num-neighbors 1 --batch-size 1000000 --ip 127.0.0.1:50051 --num-sampler 1
```
#### pubmed
Trainer side:
```
DGLBACKEND=mxnet python3 train.py --model gcn_cv --dataset pubmed --self-loop --num-neighbors 1 --batch-size 1000000 --test-batch-size 1000000 --ip 127.0.0.1:50051 --num-sampler 1
```
Sampler side:
```
OMP_NUM_THREADS=1 DGLBACKEND=mxnet python3 sampler.py --model gcn_cv --dataset pubmed --self-loop --num-neighbors 1 --batch-size 1000000 --ip 127.0.0.1:50051 --num-sampler 1
```
#### reddit
Test accuracy ~93% with `--num-neighbors 1` and `--batch-size 1000`, ~93% by training on the full graph
Trainer side:
```
DGLBACKEND=mxnet python3 train.py --model gcn_cv --dataset reddit-self-loop --num-neighbors 1 --batch-size 10000 --test-batch-size 5000 --n-hidden 64 --ip 127.0.0.1:50051 --num-sampler 1
```
Sampler side:
```
OMP_NUM_THREADS=1 DGLBACKEND=mxnet python3 sampler.py --model gcn_cv --dataset reddit-self-loop --num-neighbors 1 --batch-size 10000 --ip 127.0.0.1:50051 --num-sampler 1
```
### Control Variate & GraphSAGE-mean
Following [Control Variate](https://arxiv.org/abs/1710.10568), we use the GraphSAGE-mean architecture, with two linear layers and layer normalization per graph convolution layer.
#### reddit
Test accuracy 96.1% with `--num-neighbors 1` and `--batch-size 1000`, ~96.2% in [Control Variate](https://arxiv.org/abs/1710.10568) with `--num-neighbors 2` and `--batch-size 1000`
Trainer side:
```
DGLBACKEND=mxnet python3 train.py --model graphsage_cv --batch-size 1000 --test-batch-size 5000 --n-epochs 50 --dataset reddit --num-neighbors 1 --n-hidden 128 --dropout 0.2 --weight-decay 0 --ip 127.0.0.1:50051 --num-sampler 1
```
Sampler side:
```
OMP_NUM_THREADS=1 DGLBACKEND=mxnet python3 sampler.py --model graphsage_cv --batch-size 1000 --dataset reddit --num-neighbors 1 --ip 127.0.0.1:50051 --num-sampler 1
```
import os, sys
import argparse, time, math
import numpy as np
import mxnet as mx
from mxnet import gluon
import dgl
import dgl.function as fn
from dgl import DGLGraph
from dgl.data import register_data_args, load_data
parentdir=os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
sys.path.insert(0, parentdir)
from gcn_cv_sc import NodeUpdate, GCNSampling, GCNInfer
def gcn_cv_train(g, ctx, args, n_classes, train_nid, test_nid, n_test_samples, distributed):
n0_feats = g.nodes[0].data['features']
num_nodes = g.number_of_nodes()
in_feats = n0_feats.shape[1]
g_ctx = n0_feats.context
norm = mx.nd.expand_dims(1./g.in_degrees().astype('float32'), 1)
g.set_n_repr({'norm': norm.as_in_context(g_ctx)})
degs = g.in_degrees().astype('float32').asnumpy()
degs[degs > args.num_neighbors] = args.num_neighbors
g.set_n_repr({'subg_norm': mx.nd.expand_dims(mx.nd.array(1./degs, ctx=g_ctx), 1)})
n_layers = args.n_layers
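# Precompute the first-hop aggregation once on the full graph: 'preprocess'
# stores, for each node, the degree-normalized sum of its neighbors' input
# features, so mini-batches can reuse it instead of re-aggregating raw features.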
g.update_all(fn.copy_src(src='features', out='m'),
fn.sum(msg='m', out='preprocess'),
lambda node : {'preprocess': node.data['preprocess'] * node.data['norm']})
for i in range(n_layers - 1):
g.init_ndata('h_{}'.format(i), (num_nodes, args.n_hidden), 'float32')
g.init_ndata('agg_h_{}'.format(i), (num_nodes, args.n_hidden), 'float32')
g.init_ndata('h_{}'.format(n_layers-1), (num_nodes, 2*args.n_hidden), 'float32')
g.init_ndata('agg_h_{}'.format(n_layers-1), (num_nodes, 2*args.n_hidden), 'float32')
model = GCNSampling(in_feats,
args.n_hidden,
n_classes,
n_layers,
mx.nd.relu,
args.dropout,
prefix='GCN')
model.initialize(ctx=ctx)
loss_fcn = gluon.loss.SoftmaxCELoss()
infer_model = GCNInfer(in_feats,
args.n_hidden,
n_classes,
n_layers,
mx.nd.relu,
prefix='GCN')
infer_model.initialize(ctx=ctx)
# use optimizer
print(model.collect_params())
kv_type = 'dist_sync' if distributed else 'local'
trainer = gluon.Trainer(model.collect_params(), 'adam',
{'learning_rate': args.lr, 'wd': args.weight_decay},
kvstore=mx.kv.create(kv_type))
# Create sampler receiver
sampler = dgl.contrib.sampling.SamplerReceiver(graph=g, addr=args.ip, num_sender=args.num_sampler)
# initialize graph
dur = []
adj = g.adjacency_matrix(transpose=False).as_in_context(g_ctx)
for epoch in range(args.n_epochs):
start = time.time()
if distributed:
msg_head = "Worker {:d}, epoch {:d}".format(g.worker_id, epoch)
else:
msg_head = "epoch {:d}".format(epoch)
for nf in sampler:
for i in range(n_layers):
agg_history_str = 'agg_h_{}'.format(i)
dests = nf.layer_parent_nid(i+1).as_in_context(g_ctx)
# TODO we could use DGLGraph.pull to implement this, but the current
# implementation of pull is very slow. Let's manually do it for now.
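# Refresh the cached neighborhood aggregates for this layer: take the adjacency
# rows of the destination nodes and multiply by the stored history h_i, giving
# agg_h_i, the full-neighborhood term used by the control-variate update.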
agg = mx.nd.dot(mx.nd.take(adj, dests), g.nodes[:].data['h_{}'.format(i)])
g.set_n_repr({agg_history_str: agg}, dests)
node_embed_names = [['preprocess', 'h_0']]
for i in range(1, n_layers):
node_embed_names.append(['h_{}'.format(i), 'agg_h_{}'.format(i-1), 'subg_norm', 'norm'])
node_embed_names.append(['agg_h_{}'.format(n_layers-1), 'subg_norm', 'norm'])
nf.copy_from_parent(node_embed_names=node_embed_names, ctx=ctx)
# forward
with mx.autograd.record():
pred = model(nf)
batch_nids = nf.layer_parent_nid(-1)
batch_labels = g.nodes[batch_nids].data['labels'].as_in_context(ctx)
loss = loss_fcn(pred, batch_labels)
loss = loss.sum() / len(batch_nids)
loss.backward()
trainer.step(batch_size=1)
node_embed_names = [['h_{}'.format(i)] for i in range(n_layers)]
node_embed_names.append([])
nf.copy_to_parent(node_embed_names=node_embed_names)
mx.nd.waitall()
print(msg_head + ': training takes ' + str(time.time() - start))
infer_params = infer_model.collect_params()
for key in infer_params:
idx = trainer._param2idx[key]
trainer._kvstore.pull(idx, out=infer_params[key].data())
num_acc = 0.
num_tests = 0
if not distributed or g.worker_id == 0:
for nf in dgl.contrib.sampling.NeighborSampler(g, args.test_batch_size,
g.number_of_nodes(),
neighbor_type='in',
num_hops=n_layers,
seed_nodes=test_nid):
node_embed_names = [['preprocess']]
for i in range(n_layers):
node_embed_names.append(['norm'])
nf.copy_from_parent(node_embed_names=node_embed_names, ctx=ctx)
pred = infer_model(nf)
batch_nids = nf.layer_parent_nid(-1)
batch_labels = g.nodes[batch_nids].data['labels'].as_in_context(ctx)
num_acc += (pred.argmax(axis=1) == batch_labels).sum().asscalar()
num_tests += nf.layer_size(-1)
if distributed:
g._sync_barrier()
print("Test Accuracy {:.4f}". format(num_acc/num_tests))
break
elif distributed:
g._sync_barrier()
import os, sys
import argparse, time, math
import numpy as np
import mxnet as mx
from mxnet import gluon
from functools import partial
import dgl
import dgl.function as fn
from dgl import DGLGraph
from dgl.data import register_data_args, load_data
parentdir=os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
sys.path.insert(0, parentdir)
from gcn_ns_sc import NodeUpdate, GCNSampling, GCNInfer
def gcn_ns_train(g, ctx, args, n_classes, train_nid, test_nid, n_test_samples):
n0_feats = g.nodes[0].data['features']
in_feats = n0_feats.shape[1]
g_ctx = n0_feats.context
degs = g.in_degrees().astype('float32').as_in_context(g_ctx)
norm = mx.nd.expand_dims(1./degs, 1)
g.set_n_repr({'norm': norm})
model = GCNSampling(in_feats,
args.n_hidden,
n_classes,
args.n_layers,
mx.nd.relu,
args.dropout,
prefix='GCN')
model.initialize(ctx=ctx)
loss_fcn = gluon.loss.SoftmaxCELoss()
infer_model = GCNInfer(in_feats,
args.n_hidden,
n_classes,
args.n_layers,
mx.nd.relu,
prefix='GCN')
infer_model.initialize(ctx=ctx)
# use optimizer
print(model.collect_params())
trainer = gluon.Trainer(model.collect_params(), 'adam',
{'learning_rate': args.lr, 'wd': args.weight_decay},
kvstore=mx.kv.create('local'))
# Create sampler receiver
sampler = dgl.contrib.sampling.SamplerReceiver(graph=g, addr=args.ip, num_sender=args.num_sampler)
# initialize graph
dur = []
for epoch in range(args.n_epochs):
for nf in sampler:
nf.copy_from_parent(ctx=ctx)
# forward
with mx.autograd.record():
pred = model(nf)
batch_nids = nf.layer_parent_nid(-1)
batch_labels = g.nodes[batch_nids].data['labels'].as_in_context(ctx)
loss = loss_fcn(pred, batch_labels)
loss = loss.sum() / len(batch_nids)
loss.backward()
trainer.step(batch_size=1)
infer_params = infer_model.collect_params()
for key in infer_params:
idx = trainer._param2idx[key]
trainer._kvstore.pull(idx, out=infer_params[key].data())
num_acc = 0.
num_tests = 0
for nf in dgl.contrib.sampling.NeighborSampler(g, args.test_batch_size,
g.number_of_nodes(),
neighbor_type='in',
num_hops=args.n_layers+1,
seed_nodes=test_nid):
nf.copy_from_parent(ctx=ctx)
pred = infer_model(nf)
batch_nids = nf.layer_parent_nid(-1)
batch_labels = g.nodes[batch_nids].data['labels'].as_in_context(ctx)
num_acc += (pred.argmax(axis=1) == batch_labels).sum().asscalar()
num_tests += nf.layer_size(-1)
break
print("Test Accuracy {:.4f}". format(num_acc/num_tests))
import os, sys
import argparse, time, math
import numpy as np
import mxnet as mx
from mxnet import gluon
import dgl
import dgl.function as fn
from dgl import DGLGraph
from dgl.data import register_data_args, load_data
parentdir=os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
sys.path.insert(0, parentdir)
from graphsage_cv import GraphSAGELayer, NodeUpdate
from graphsage_cv import GraphSAGETrain, GraphSAGEInfer
def graphsage_cv_train(g, ctx, args, n_classes, train_nid, test_nid, n_test_samples, distributed):
n0_feats = g.nodes[0].data['features']
num_nodes = g.number_of_nodes()
in_feats = n0_feats.shape[1]
g_ctx = n0_feats.context
norm = mx.nd.expand_dims(1./g.in_degrees().astype('float32'), 1)
g.set_n_repr({'norm': norm.as_in_context(g_ctx)})
degs = g.in_degrees().astype('float32').asnumpy()
degs[degs > args.num_neighbors] = args.num_neighbors
g.set_n_repr({'subg_norm': mx.nd.expand_dims(mx.nd.array(1./degs, ctx=g_ctx), 1)})
n_layers = args.n_layers
g.update_all(fn.copy_src(src='features', out='m'),
fn.sum(msg='m', out='preprocess'),
lambda node : {'preprocess': node.data['preprocess'] * node.data['norm']})
for i in range(n_layers):
g.init_ndata('h_{}'.format(i), (num_nodes, args.n_hidden), 'float32')
g.init_ndata('agg_h_{}'.format(i), (num_nodes, args.n_hidden), 'float32')
model = GraphSAGETrain(in_feats,
args.n_hidden,
n_classes,
n_layers,
args.dropout,
prefix='GraphSAGE')
model.initialize(ctx=ctx)
loss_fcn = gluon.loss.SoftmaxCELoss()
infer_model = GraphSAGEInfer(in_feats,
args.n_hidden,
n_classes,
n_layers,
prefix='GraphSAGE')
infer_model.initialize(ctx=ctx)
# use optimizer
print(model.collect_params())
kv_type = 'dist_sync' if distributed else 'local'
trainer = gluon.Trainer(model.collect_params(), 'adam',
{'learning_rate': args.lr, 'wd': args.weight_decay},
kvstore=mx.kv.create(kv_type))
# Create sampler receiver
sampler = dgl.contrib.sampling.SamplerReceiver(graph=g, addr=args.ip, num_sender=args.num_sampler)
# initialize graph
dur = []
adj = g.adjacency_matrix(transpose=False).as_in_context(g_ctx)
for epoch in range(args.n_epochs):
start = time.time()
if distributed:
msg_head = "Worker {:d}, epoch {:d}".format(g.worker_id, epoch)
else:
msg_head = "epoch {:d}".format(epoch)
for nf in sampler:
for i in range(n_layers):
agg_history_str = 'agg_h_{}'.format(i)
dests = nf.layer_parent_nid(i+1).as_in_context(g_ctx)
# TODO we could use DGLGraph.pull to implement this, but the current
# implementation of pull is very slow. Let's manually do it for now.
agg = mx.nd.dot(mx.nd.take(adj, dests), g.nodes[:].data['h_{}'.format(i)])
g.set_n_repr({agg_history_str: agg}, dests)
node_embed_names = [['preprocess', 'features', 'h_0']]
for i in range(1, n_layers):
node_embed_names.append(['h_{}'.format(i), 'agg_h_{}'.format(i-1), 'subg_norm', 'norm'])
node_embed_names.append(['agg_h_{}'.format(n_layers-1), 'subg_norm', 'norm'])
nf.copy_from_parent(node_embed_names=node_embed_names, ctx=ctx)
# forward
with mx.autograd.record():
pred = model(nf)
batch_nids = nf.layer_parent_nid(-1)
batch_labels = g.nodes[batch_nids].data['labels'].as_in_context(ctx)
loss = loss_fcn(pred, batch_labels)
if distributed:
loss = loss.sum() / (len(batch_nids) * g.num_workers)
else:
loss = loss.sum() / (len(batch_nids))
loss.backward()
trainer.step(batch_size=1)
node_embed_names = [['h_{}'.format(i)] for i in range(n_layers)]
node_embed_names.append([])
nf.copy_to_parent(node_embed_names=node_embed_names)
mx.nd.waitall()
print(msg_head + ': training takes ' + str(time.time() - start))
infer_params = infer_model.collect_params()
for key in infer_params:
idx = trainer._param2idx[key]
trainer._kvstore.pull(idx, out=infer_params[key].data())
num_acc = 0.
num_tests = 0
if not distributed or g.worker_id == 0:
for nf in dgl.contrib.sampling.NeighborSampler(g, args.test_batch_size,
g.number_of_nodes(),
neighbor_type='in',
num_hops=n_layers,
seed_nodes=test_nid,
add_self_loop=True):
node_embed_names = [['preprocess', 'features']]
for i in range(n_layers):
node_embed_names.append(['norm', 'subg_norm'])
nf.copy_from_parent(node_embed_names=node_embed_names, ctx=ctx)
pred = infer_model(nf)
batch_nids = nf.layer_parent_nid(-1)
batch_labels = g.nodes[batch_nids].data['labels'].as_in_context(ctx)
num_acc += (pred.argmax(axis=1) == batch_labels).sum().asscalar()
num_tests += nf.layer_size(-1)
if distributed:
g._sync_barrier()
print(msg_head + ": Test Accuracy {:.4f}". format(num_acc/num_tests))
break
elif distributed:
g._sync_barrier()
import argparse, time, math
import numpy as np
import mxnet as mx
from mxnet import gluon
from functools import partial
import dgl
import dgl.function as fn
from dgl import DGLGraph
from dgl.data import register_data_args, load_data
from dgl.contrib.sampling import SamplerPool
class MySamplerPool(SamplerPool):
def worker(self, args):
"""User-defined worker function
"""
is_shuffle = True
self_loop = False
number_hops = 1
if args.model == "gcn_ns":
number_hops = args.n_layers + 1
elif args.model == "gcn_cv":
number_hops = args.n_layers
elif args.model == "graphsage_cv":
number_hops = args.n_layers
self_loop = True
else:
print("unknown model. Please choose from gcn_ns, gcn_cv, graphsage_cv")
# Start sender
namebook = { 0:args.ip }
sender = dgl.contrib.sampling.SamplerSender(namebook)
# load and preprocess dataset
data = load_data(args)
ctx = mx.cpu()
if args.self_loop and not args.dataset.startswith('reddit'):
data.graph.add_edges_from([(i,i) for i in range(len(data.graph))])
train_nid = mx.nd.array(np.nonzero(data.train_mask)[0]).astype(np.int64).as_in_context(ctx)
test_nid = mx.nd.array(np.nonzero(data.test_mask)[0]).astype(np.int64).as_in_context(ctx)
# create GCN model
g = DGLGraph(data.graph, readonly=True)
while True:
idx = 0
for nf in dgl.contrib.sampling.NeighborSampler(g, args.batch_size,
args.num_neighbors,
neighbor_type='in',
shuffle=is_shuffle,
num_workers=32,
num_hops=number_hops,
add_self_loop=self_loop,
seed_nodes=train_nid):
print("send train nodeflow: %d" %(idx))
sender.send(nf, 0)
idx += 1
sender.signal(0)
def main(args):
pool = MySamplerPool()
pool.start(args.num_sampler, args)
if __name__ == '__main__':
parser = argparse.ArgumentParser(description='GCN')
register_data_args(parser)
parser.add_argument("--model", type=str,
help="select a model. Valid models: gcn_ns, gcn_cv, graphsage_cv")
parser.add_argument("--batch-size", type=int, default=1000,
help="batch size")
parser.add_argument("--num-neighbors", type=int, default=3,
help="number of neighbors to be sampled")
parser.add_argument("--self-loop", action='store_true',
help="graph self-loop (default=False)")
parser.add_argument("--n-layers", type=int, default=1,
help="number of hidden gcn layers")
parser.add_argument("--ip", type=str, default='127.0.0.1:50051',
help="IP address")
parser.add_argument("--num-sampler", type=int, default=1,
help="number of sampler")
args = parser.parse_args()
print(args)
main(args)
import argparse, time, math
import numpy as np
import mxnet as mx
from mxnet import gluon
from functools import partial
import dgl
import dgl.function as fn
from dgl import DGLGraph
from dgl.data import register_data_args, load_data
from dis_gcn_ns_sc import gcn_ns_train
from dis_gcn_cv_sc import gcn_cv_train
from dis_graphsage_cv import graphsage_cv_train
def main(args):
# load and preprocess dataset
data = load_data(args)
if args.gpu >= 0:
ctx = mx.gpu(args.gpu)
else:
ctx = mx.cpu()
if args.self_loop and not args.dataset.startswith('reddit'):
data.graph.add_edges_from([(i,i) for i in range(len(data.graph))])
train_nid = mx.nd.array(np.nonzero(data.train_mask)[0]).astype(np.int64)
test_nid = mx.nd.array(np.nonzero(data.test_mask)[0]).astype(np.int64)
features = mx.nd.array(data.features)
labels = mx.nd.array(data.labels)
train_mask = mx.nd.array(data.train_mask)
val_mask = mx.nd.array(data.val_mask)
test_mask = mx.nd.array(data.test_mask)
in_feats = features.shape[1]
n_classes = data.num_labels
n_edges = data.graph.number_of_edges()
n_train_samples = train_mask.sum().asscalar()
n_val_samples = val_mask.sum().asscalar()
n_test_samples = test_mask.sum().asscalar()
print("""----Data statistics------'
#Edges %d
#Classes %d
#Train samples %d
#Val samples %d
#Test samples %d""" %
(n_edges, n_classes,
n_train_samples,
n_val_samples,
n_test_samples))
# create GCN model
g = DGLGraph(data.graph, readonly=True)
g.ndata['features'] = features
g.ndata['labels'] = labels
if args.model == "gcn_ns":
gcn_ns_train(g, ctx, args, n_classes, train_nid, test_nid, n_test_samples)
elif args.model == "gcn_cv":
gcn_cv_train(g, ctx, args, n_classes, train_nid, test_nid, n_test_samples, False)
elif args.model == "graphsage_cv":
graphsage_cv_train(g, ctx, args, n_classes, train_nid, test_nid, n_test_samples, False)
else:
print("unknown model. Please choose from gcn_ns, gcn_cv, graphsage_cv")
if __name__ == '__main__':
parser = argparse.ArgumentParser(description='GCN')
register_data_args(parser)
parser.add_argument("--model", type=str,
help="select a model. Valid models: gcn_ns, gcn_cv, graphsage_cv")
parser.add_argument("--dropout", type=float, default=0.5,
help="dropout probability")
parser.add_argument("--gpu", type=int, default=-1,
help="gpu")
parser.add_argument("--lr", type=float, default=3e-2,
help="learning rate")
parser.add_argument("--n-epochs", type=int, default=200,
help="number of training epochs")
parser.add_argument("--batch-size", type=int, default=1000,
help="batch size")
parser.add_argument("--test-batch-size", type=int, default=1000,
help="test batch size")
parser.add_argument("--num-neighbors", type=int, default=3,
help="number of neighbors to be sampled")
parser.add_argument("--n-hidden", type=int, default=16,
help="number of hidden gcn units")
parser.add_argument("--n-layers", type=int, default=1,
help="number of hidden gcn layers")
parser.add_argument("--self-loop", action='store_true',
help="graph self-loop (default=False)")
parser.add_argument("--weight-decay", type=float, default=5e-4,
help="Weight for L2 loss")
parser.add_argument("--nworkers", type=int, default=1,
help="number of workers")
parser.add_argument("--ip", type=str, default='127.0.0.1:50051',
help="IP address")
parser.add_argument("--num-sampler", type=int, default=1,
help="number of sampler")
args = parser.parse_args()
print(args)
main(args)
import argparse, time, math
import numpy as np
import mxnet as mx
from mxnet import gluon
import dgl
import dgl.function as fn
from dgl import DGLGraph
from dgl.data import register_data_args, load_data
class NodeUpdate(gluon.Block):
def __init__(self, layer_id, in_feats, out_feats, dropout, activation=None, test=False, concat=False):
super(NodeUpdate, self).__init__()
self.layer_id = layer_id
self.dropout = dropout
self.test = test
self.concat = concat
with self.name_scope():
self.dense = gluon.nn.Dense(out_feats, in_units=in_feats)
self.activation = activation
def forward(self, node):
h = node.data['h']
norm = node.data['norm']
if self.test:
h = h * norm
else:
agg_history_str = 'agg_h_{}'.format(self.layer_id-1)
agg_history = node.data[agg_history_str]
subg_norm = node.data['subg_norm']
# control variate
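# 'h' holds the sum of (h - history) over the sampled neighbors only, so it is
# rescaled by the sampled-subgraph normalizer and combined with the cached
# full-neighborhood aggregate 'agg_history' scaled by the exact normalizer.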
h = h * subg_norm + agg_history * norm
if self.dropout:
h = mx.nd.Dropout(h, p=self.dropout)
h = self.dense(h)
if self.concat:
h = mx.nd.concat(h, self.activation(h))
elif self.activation:
h = self.activation(h)
return {'activation': h}
class GCNSampling(gluon.Block):
def __init__(self,
in_feats,
n_hidden,
n_classes,
n_layers,
activation,
dropout,
**kwargs):
super(GCNSampling, self).__init__(**kwargs)
self.dropout = dropout
self.n_layers = n_layers
with self.name_scope():
self.layers = gluon.nn.Sequential()
# input layer
self.dense = gluon.nn.Dense(n_hidden, in_units=in_feats)
self.activation = activation
# hidden layers
for i in range(1, n_layers):
skip_start = (i == self.n_layers-1)
self.layers.add(NodeUpdate(i, n_hidden, n_hidden, dropout, activation, concat=skip_start))
# output layer
self.layers.add(NodeUpdate(n_layers, 2*n_hidden, n_classes, dropout))
def forward(self, nf):
h = nf.layers[0].data['preprocess']
if self.dropout:
h = mx.nd.Dropout(h, p=self.dropout)
h = self.dense(h)
skip_start = (0 == self.n_layers-1)
if skip_start:
h = mx.nd.concat(h, self.activation(h))
else:
h = self.activation(h)
for i, layer in enumerate(self.layers):
new_history = h.copy().detach()
history_str = 'h_{}'.format(i)
history = nf.layers[i].data[history_str]
h = h - history
nf.layers[i].data['h'] = h
nf.block_compute(i,
fn.copy_src(src='h', out='m'),
fn.sum(msg='m', out='h'),
layer)
h = nf.layers[i+1].data.pop('activation')
# update history
if i < nf.num_layers-1:
nf.layers[i].data[history_str] = new_history
return h
class GCNInfer(gluon.Block):
def __init__(self,
in_feats,
n_hidden,
n_classes,
n_layers,
activation,
**kwargs):
super(GCNInfer, self).__init__(**kwargs)
self.n_layers = n_layers
with self.name_scope():
self.layers = gluon.nn.Sequential()
# input layer
self.dense = gluon.nn.Dense(n_hidden, in_units=in_feats)
self.activation = activation
# hidden layers
for i in range(1, n_layers):
skip_start = (i == self.n_layers-1)
self.layers.add(NodeUpdate(i, n_hidden, n_hidden, 0, activation, True, concat=skip_start))
# output layer
self.layers.add(NodeUpdate(n_layers, 2*n_hidden, n_classes, 0, None, True))
def forward(self, nf):
h = nf.layers[0].data['preprocess']
h = self.dense(h)
skip_start = (0 == self.n_layers-1)
if skip_start:
h = mx.nd.concat(h, self.activation(h))
else:
h = self.activation(h)
for i, layer in enumerate(self.layers):
nf.layers[i].data['h'] = h
nf.block_compute(i,
fn.copy_src(src='h', out='m'),
fn.sum(msg='m', out='h'),
layer)
h = nf.layers[i+1].data.pop('activation')
return h
def gcn_cv_train(g, ctx, args, n_classes, train_nid, test_nid, n_test_samples, distributed):
n0_feats = g.nodes[0].data['features']
num_nodes = g.number_of_nodes()
in_feats = n0_feats.shape[1]
g_ctx = n0_feats.context
norm = mx.nd.expand_dims(1./g.in_degrees().astype('float32'), 1)
g.set_n_repr({'norm': norm.as_in_context(g_ctx)})
degs = g.in_degrees().astype('float32').asnumpy()
degs[degs > args.num_neighbors] = args.num_neighbors
g.set_n_repr({'subg_norm': mx.nd.expand_dims(mx.nd.array(1./degs, ctx=g_ctx), 1)})
n_layers = args.n_layers
g.update_all(fn.copy_src(src='features', out='m'),
fn.sum(msg='m', out='preprocess'),
lambda node : {'preprocess': node.data['preprocess'] * node.data['norm']})
for i in range(n_layers - 1):
g.init_ndata('h_{}'.format(i), (num_nodes, args.n_hidden), 'float32')
g.init_ndata('agg_h_{}'.format(i), (num_nodes, args.n_hidden), 'float32')
g.init_ndata('h_{}'.format(n_layers-1), (num_nodes, 2*args.n_hidden), 'float32')
g.init_ndata('agg_h_{}'.format(n_layers-1), (num_nodes, 2*args.n_hidden), 'float32')
model = GCNSampling(in_feats,
args.n_hidden,
n_classes,
n_layers,
mx.nd.relu,
args.dropout,
prefix='GCN')
model.initialize(ctx=ctx)
loss_fcn = gluon.loss.SoftmaxCELoss()
infer_model = GCNInfer(in_feats,
args.n_hidden,
n_classes,
n_layers,
mx.nd.relu,
prefix='GCN')
infer_model.initialize(ctx=ctx)
# use optimizer
print(model.collect_params())
kv_type = 'dist_sync' if distributed else 'local'
trainer = gluon.Trainer(model.collect_params(), 'adam',
{'learning_rate': args.lr, 'wd': args.weight_decay},
kvstore=mx.kv.create(kv_type))
# initialize graph
dur = []
adj = g.adjacency_matrix(transpose=False).as_in_context(g_ctx)
for epoch in range(args.n_epochs):
start = time.time()
if distributed:
msg_head = "Worker {:d}, epoch {:d}".format(g.worker_id, epoch)
else:
msg_head = "epoch {:d}".format(epoch)
for nf in dgl.contrib.sampling.NeighborSampler(g, args.batch_size,
args.num_neighbors,
neighbor_type='in',
num_workers=32,
shuffle=True,
num_hops=n_layers,
seed_nodes=train_nid):
for i in range(n_layers):
agg_history_str = 'agg_h_{}'.format(i)
dests = nf.layer_parent_nid(i+1).as_in_context(g_ctx)
# TODO we could use DGLGraph.pull to implement this, but the current
# implementation of pull is very slow. Let's manually do it for now.
agg = mx.nd.dot(mx.nd.take(adj, dests), g.nodes[:].data['h_{}'.format(i)])
g.set_n_repr({agg_history_str: agg}, dests)
node_embed_names = [['preprocess', 'h_0']]
for i in range(1, n_layers):
node_embed_names.append(['h_{}'.format(i), 'agg_h_{}'.format(i-1), 'subg_norm', 'norm'])
node_embed_names.append(['agg_h_{}'.format(n_layers-1), 'subg_norm', 'norm'])
nf.copy_from_parent(node_embed_names=node_embed_names, ctx=ctx)
# forward
with mx.autograd.record():
pred = model(nf)
batch_nids = nf.layer_parent_nid(-1)
batch_labels = g.nodes[batch_nids].data['labels'].as_in_context(ctx)
loss = loss_fcn(pred, batch_labels)
loss = loss.sum() / len(batch_nids)
loss.backward()
trainer.step(batch_size=1)
node_embed_names = [['h_{}'.format(i)] for i in range(n_layers)]
node_embed_names.append([])
nf.copy_to_parent(node_embed_names=node_embed_names)
mx.nd.waitall()
print(msg_head + ': training takes ' + str(time.time() - start))
infer_params = infer_model.collect_params()
for key in infer_params:
idx = trainer._param2idx[key]
trainer._kvstore.pull(idx, out=infer_params[key].data())
num_acc = 0.
num_tests = 0
if not distributed or g.worker_id == 0:
for nf in dgl.contrib.sampling.NeighborSampler(g, args.test_batch_size,
g.number_of_nodes(),
neighbor_type='in',
num_hops=n_layers,
seed_nodes=test_nid):
node_embed_names = [['preprocess']]
for i in range(n_layers):
node_embed_names.append(['norm'])
nf.copy_from_parent(node_embed_names=node_embed_names, ctx=ctx)
pred = infer_model(nf)
batch_nids = nf.layer_parent_nid(-1)
batch_labels = g.nodes[batch_nids].data['labels'].as_in_context(ctx)
num_acc += (pred.argmax(axis=1) == batch_labels).sum().asscalar()
num_tests += nf.layer_size(-1)
if distributed:
g._sync_barrier()
print("Test Accuracy {:.4f}". format(num_acc/num_tests))
break
elif distributed:
g._sync_barrier()
import argparse, time, math
import numpy as np
import mxnet as mx
from mxnet import gluon
from functools import partial
import dgl
import dgl.function as fn
from dgl import DGLGraph
from dgl.data import register_data_args, load_data
class NodeUpdate(gluon.Block):
def __init__(self, in_feats, out_feats, activation=None, concat=False):
super(NodeUpdate, self).__init__()
self.dense = gluon.nn.Dense(out_feats, in_units=in_feats)
self.activation = activation
self.concat = concat
def forward(self, node):
h = node.data['h']
h = h * node.data['norm']
h = self.dense(h)
# skip connection
if self.concat:
h = mx.nd.concat(h, self.activation(h))
elif self.activation:
h = self.activation(h)
return {'activation': h}
class GCNSampling(gluon.Block):
def __init__(self,
in_feats,
n_hidden,
n_classes,
n_layers,
activation,
dropout,
**kwargs):
super(GCNSampling, self).__init__(**kwargs)
self.dropout = dropout
self.n_layers = n_layers
with self.name_scope():
self.layers = gluon.nn.Sequential()
# input layer
skip_start = (0 == n_layers-1)
self.layers.add(NodeUpdate(in_feats, n_hidden, activation, concat=skip_start))
# hidden layers
for i in range(1, n_layers):
skip_start = (i == n_layers-1)
self.layers.add(NodeUpdate(n_hidden, n_hidden, activation, concat=skip_start))
# output layer
self.layers.add(NodeUpdate(2*n_hidden, n_classes))
def forward(self, nf):
nf.layers[0].data['activation'] = nf.layers[0].data['features']
for i, layer in enumerate(self.layers):
h = nf.layers[i].data.pop('activation')
if self.dropout:
h = mx.nd.Dropout(h, p=self.dropout)
nf.layers[i].data['h'] = h
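# Normalize messages by each destination node's in-degree within the sampled
# block; the NodeUpdate block applies this 'norm' after the neighbor summation.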
degs = nf.layer_in_degree(i + 1).astype('float32').as_in_context(h.context)
nf.layers[i + 1].data['norm'] = mx.nd.expand_dims(1./degs, 1)
nf.block_compute(i,
fn.copy_src(src='h', out='m'),
fn.sum(msg='m', out='h'),
layer)
h = nf.layers[-1].data.pop('activation')
return h
class GCNInfer(gluon.Block):
def __init__(self,
in_feats,
n_hidden,
n_classes,
n_layers,
activation,
**kwargs):
super(GCNInfer, self).__init__(**kwargs)
self.n_layers = n_layers
with self.name_scope():
self.layers = gluon.nn.Sequential()
# input layer
skip_start = (0 == n_layers-1)
self.layers.add(NodeUpdate(in_feats, n_hidden, activation, concat=skip_start))
# hidden layers
for i in range(1, n_layers):
skip_start = (i == n_layers-1)
self.layers.add(NodeUpdate(n_hidden, n_hidden, activation, concat=skip_start))
# output layer
self.layers.add(NodeUpdate(2*n_hidden, n_classes))
def forward(self, nf):
nf.layers[0].data['activation'] = nf.layers[0].data['features']
for i, layer in enumerate(self.layers):
h = nf.layers[i].data.pop('activation')
nf.layers[i].data['h'] = h
nf.block_compute(i,
fn.copy_src(src='h', out='m'),
fn.sum(msg='m', out='h'),
layer)
return nf.layers[-1].data.pop('activation')
def gcn_ns_train(g, ctx, args, n_classes, train_nid, test_nid, n_test_samples):
n0_feats = g.nodes[0].data['features']
in_feats = n0_feats.shape[1]
g_ctx = n0_feats.context
degs = g.in_degrees().astype('float32').as_in_context(g_ctx)
norm = mx.nd.expand_dims(1./degs, 1)
g.set_n_repr({'norm': norm})
model = GCNSampling(in_feats,
args.n_hidden,
n_classes,
args.n_layers,
mx.nd.relu,
args.dropout,
prefix='GCN')
model.initialize(ctx=ctx)
loss_fcn = gluon.loss.SoftmaxCELoss()
infer_model = GCNInfer(in_feats,
args.n_hidden,
n_classes,
args.n_layers,
mx.nd.relu,
prefix='GCN')
infer_model.initialize(ctx=ctx)
# use optimizer
print(model.collect_params())
trainer = gluon.Trainer(model.collect_params(), 'adam',
{'learning_rate': args.lr, 'wd': args.weight_decay},
kvstore=mx.kv.create('local'))
# initialize graph
dur = []
for epoch in range(args.n_epochs):
for nf in dgl.contrib.sampling.NeighborSampler(g, args.batch_size,
args.num_neighbors,
neighbor_type='in',
shuffle=True,
num_workers=32,
num_hops=args.n_layers+1,
seed_nodes=train_nid):
nf.copy_from_parent(ctx=ctx)
# forward
with mx.autograd.record():
pred = model(nf)
batch_nids = nf.layer_parent_nid(-1)
batch_labels = g.nodes[batch_nids].data['labels'].as_in_context(ctx)
loss = loss_fcn(pred, batch_labels)
loss = loss.sum() / len(batch_nids)
loss.backward()
trainer.step(batch_size=1)
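# At the end of each epoch, pull the trainer's current parameters out of the
# KVStore into the dropout-free inference model before evaluating.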
infer_params = infer_model.collect_params()
for key in infer_params:
idx = trainer._param2idx[key]
trainer._kvstore.pull(idx, out=infer_params[key].data())
num_acc = 0.
num_tests = 0
for nf in dgl.contrib.sampling.NeighborSampler(g, args.test_batch_size,
g.number_of_nodes(),
neighbor_type='in',
num_hops=args.n_layers+1,
seed_nodes=test_nid):
nf.copy_from_parent(ctx=ctx)
pred = infer_model(nf)
batch_nids = nf.layer_parent_nid(-1)
batch_labels = g.nodes[batch_nids].data['labels'].as_in_context(ctx)
num_acc += (pred.argmax(axis=1) == batch_labels).sum().asscalar()
num_tests += nf.layer_size(-1)
break
print("Test Accuracy {:.4f}". format(num_acc/num_tests))
import argparse, time, math
import numpy as np
import mxnet as mx
from mxnet import gluon
import dgl
import dgl.function as fn
from dgl import DGLGraph
from dgl.data import register_data_args, load_data
class GraphSAGELayer(gluon.Block):
def __init__(self,
in_feats,
hidden,
out_feats,
dropout,
last=False,
**kwargs):
super(GraphSAGELayer, self).__init__(**kwargs)
self.last = last
self.dropout = dropout
with self.name_scope():
self.dense1 = gluon.nn.Dense(hidden, in_units=in_feats)
self.layer_norm1 = gluon.nn.LayerNorm(in_channels=hidden)
self.dense2 = gluon.nn.Dense(out_feats, in_units=hidden)
if not self.last:
self.layer_norm2 = gluon.nn.LayerNorm(in_channels=out_feats)
def forward(self, h):
h = self.dense1(h)
h = self.layer_norm1(h)
h = mx.nd.relu(h)
if self.dropout:
h = mx.nd.Dropout(h, p=self.dropout)
h = self.dense2(h)
if not self.last:
h = self.layer_norm2(h)
h = mx.nd.relu(h)
return h
class NodeUpdate(gluon.Block):
def __init__(self, layer_id, in_feats, out_feats, hidden, dropout,
test=False, last=False):
super(NodeUpdate, self).__init__()
self.layer_id = layer_id
self.dropout = dropout
self.test = test
self.last = last
with self.name_scope():
self.layer = GraphSAGELayer(in_feats, hidden, out_feats, dropout, last)
def forward(self, node):
h = node.data['h']
norm = node.data['norm']
# activation from previous layer of myself
self_h = node.data['self_h']
if self.test:
h = (h - self_h) * norm
# graphsage
h = mx.nd.concat(h, self_h)
else:
agg_history_str = 'agg_h_{}'.format(self.layer_id-1)
agg_history = node.data[agg_history_str]
# normalization constant
subg_norm = node.data['subg_norm']
# delta_h (h - history) from previous layer of myself
self_delta_h = node.data['self_delta_h']
# control variate
h = (h - self_delta_h) * subg_norm + agg_history * norm
# graphsage
h = mx.nd.concat(h, self_h)
if self.dropout:
h = mx.nd.Dropout(h, p=self.dropout)
h = self.layer(h)
return {'activation': h}
class GraphSAGETrain(gluon.Block):
def __init__(self,
in_feats,
n_hidden,
n_classes,
n_layers,
dropout,
**kwargs):
super(GraphSAGETrain, self).__init__(**kwargs)
self.dropout = dropout
with self.name_scope():
self.layers = gluon.nn.Sequential()
# input layer
self.input_layer = GraphSAGELayer(2*in_feats, n_hidden, n_hidden, dropout)
# hidden layers
for i in range(1, n_layers):
self.layers.add(NodeUpdate(i, 2*n_hidden, n_hidden, n_hidden, dropout))
# output layer
self.layers.add(NodeUpdate(n_layers, 2*n_hidden, n_classes, n_hidden, dropout, last=True))
def forward(self, nf):
h = nf.layers[0].data['preprocess']
features = nf.layers[0].data['features']
h = mx.nd.concat(h, features)
if self.dropout:
h = mx.nd.Dropout(h, p=self.dropout)
h = self.input_layer(h)
for i, layer in enumerate(self.layers):
parent_nid = dgl.utils.toindex(nf.layer_parent_nid(i+1))
layer_nid = nf.map_from_parent_nid(i, parent_nid,
remap_local=True).as_in_context(h.context)
self_h = h[layer_nid]
# activation from previous layer of myself, used in graphSAGE
nf.layers[i+1].data['self_h'] = self_h
new_history = h.copy().detach()
history_str = 'h_{}'.format(i)
history = nf.layers[i].data[history_str]
# delta_h used in control variate
delta_h = h - history
# delta_h from previous layer of the nodes in (i+1)-th layer, used in control variate
nf.layers[i+1].data['self_delta_h'] = delta_h[layer_nid]
nf.layers[i].data['h'] = delta_h
nf.block_compute(i,
fn.copy_src(src='h', out='m'),
fn.sum(msg='m', out='h'),
layer)
h = nf.layers[i+1].data.pop('activation')
# update history
if i < nf.num_layers-1:
nf.layers[i].data[history_str] = new_history
return h
class GraphSAGEInfer(gluon.Block):
def __init__(self,
in_feats,
n_hidden,
n_classes,
n_layers,
**kwargs):
super(GraphSAGEInfer, self).__init__(**kwargs)
with self.name_scope():
self.layers = gluon.nn.Sequential()
# input layer
self.input_layer = GraphSAGELayer(2*in_feats, n_hidden, n_hidden, 0)
# hidden layers
for i in range(1, n_layers):
self.layers.add(NodeUpdate(i, 2*n_hidden, n_hidden, n_hidden, 0, True))
# output layer
self.layers.add(NodeUpdate(n_layers, 2*n_hidden, n_classes, n_hidden, 0, True, last=True))
def forward(self, nf):
h = nf.layers[0].data['preprocess']
features = nf.layers[0].data['features']
h = mx.nd.concat(h, features)
h = self.input_layer(h)
for i, layer in enumerate(self.layers):
nf.layers[i].data['h'] = h
parent_nid = dgl.utils.toindex(nf.layer_parent_nid(i+1))
layer_nid = nf.map_from_parent_nid(i, parent_nid,
remap_local=True).as_in_context(h.context)
# activation from previous layer of the nodes in (i+1)-th layer, used in graphSAGE
self_h = h[layer_nid]
nf.layers[i+1].data['self_h'] = self_h
nf.block_compute(i,
fn.copy_src(src='h', out='m'),
fn.sum(msg='m', out='h'),
layer)
h = nf.layers[i+1].data.pop('activation')
return h
def graphsage_cv_train(g, ctx, args, n_classes, train_nid, test_nid, n_test_samples, distributed):
n0_feats = g.nodes[0].data['features']
num_nodes = g.number_of_nodes()
in_feats = n0_feats.shape[1]
g_ctx = n0_feats.context
norm = mx.nd.expand_dims(1./g.in_degrees().astype('float32'), 1)
g.set_n_repr({'norm': norm.as_in_context(g_ctx)})
degs = g.in_degrees().astype('float32').asnumpy()
degs[degs > args.num_neighbors] = args.num_neighbors
g.set_n_repr({'subg_norm': mx.nd.expand_dims(mx.nd.array(1./degs, ctx=g_ctx), 1)})
n_layers = args.n_layers
g.update_all(fn.copy_src(src='features', out='m'),
fn.sum(msg='m', out='preprocess'),
lambda node : {'preprocess': node.data['preprocess'] * node.data['norm']})
for i in range(n_layers):
g.init_ndata('h_{}'.format(i), (num_nodes, args.n_hidden), 'float32')
g.init_ndata('agg_h_{}'.format(i), (num_nodes, args.n_hidden), 'float32')
model = GraphSAGETrain(in_feats,
args.n_hidden,
n_classes,
n_layers,
args.dropout,
prefix='GraphSAGE')
model.initialize(ctx=ctx)
loss_fcn = gluon.loss.SoftmaxCELoss()
infer_model = GraphSAGEInfer(in_feats,
args.n_hidden,
n_classes,
n_layers,
prefix='GraphSAGE')
infer_model.initialize(ctx=ctx)
# use optimizer
print(model.collect_params())
kv_type = 'dist_sync' if distributed else 'local'
trainer = gluon.Trainer(model.collect_params(), 'adam',
{'learning_rate': args.lr, 'wd': args.weight_decay},
kvstore=mx.kv.create(kv_type))
# initialize graph
dur = []
adj = g.adjacency_matrix(transpose=False).as_in_context(g_ctx)
for epoch in range(args.n_epochs):
start = time.time()
if distributed:
msg_head = "Worker {:d}, epoch {:d}".format(g.worker_id, epoch)
else:
msg_head = "epoch {:d}".format(epoch)
for nf in dgl.contrib.sampling.NeighborSampler(g, args.batch_size,
args.num_neighbors,
neighbor_type='in',
shuffle=True,
num_workers=32,
num_hops=n_layers,
add_self_loop=True,
seed_nodes=train_nid):
for i in range(n_layers):
agg_history_str = 'agg_h_{}'.format(i)
dests = nf.layer_parent_nid(i+1).as_in_context(g_ctx)
# TODO we could use DGLGraph.pull to implement this, but the current
# implementation of pull is very slow. Let's manually do it for now.
agg = mx.nd.dot(mx.nd.take(adj, dests), g.nodes[:].data['h_{}'.format(i)])
g.set_n_repr({agg_history_str: agg}, dests)
node_embed_names = [['preprocess', 'features', 'h_0']]
for i in range(1, n_layers):
node_embed_names.append(['h_{}'.format(i), 'agg_h_{}'.format(i-1), 'subg_norm', 'norm'])
node_embed_names.append(['agg_h_{}'.format(n_layers-1), 'subg_norm', 'norm'])
nf.copy_from_parent(node_embed_names=node_embed_names, ctx=ctx)
# forward
with mx.autograd.record():
pred = model(nf)
batch_nids = nf.layer_parent_nid(-1)
batch_labels = g.nodes[batch_nids].data['labels'].as_in_context(ctx)
loss = loss_fcn(pred, batch_labels)
if distributed:
loss = loss.sum() / (len(batch_nids) * g.num_workers)
else:
loss = loss.sum() / (len(batch_nids))
loss.backward()
trainer.step(batch_size=1)
node_embed_names = [['h_{}'.format(i)] for i in range(n_layers)]
node_embed_names.append([])
nf.copy_to_parent(node_embed_names=node_embed_names)
mx.nd.waitall()
print(msg_head + ': training takes ' + str(time.time() - start))
infer_params = infer_model.collect_params()
for key in infer_params:
idx = trainer._param2idx[key]
trainer._kvstore.pull(idx, out=infer_params[key].data())
num_acc = 0.
num_tests = 0
if not distributed or g.worker_id == 0:
for nf in dgl.contrib.sampling.NeighborSampler(g, args.test_batch_size,
g.number_of_nodes(),
neighbor_type='in',
num_hops=n_layers,
seed_nodes=test_nid,
add_self_loop=True):
node_embed_names = [['preprocess', 'features']]
for i in range(n_layers):
node_embed_names.append(['norm', 'subg_norm'])
nf.copy_from_parent(node_embed_names=node_embed_names, ctx=ctx)
pred = infer_model(nf)
batch_nids = nf.layer_parent_nid(-1)
batch_labels = g.nodes[batch_nids].data['labels'].as_in_context(ctx)
num_acc += (pred.argmax(axis=1) == batch_labels).sum().asscalar()
num_tests += nf.layer_size(-1)
if distributed:
g._sync_barrier()
print(msg_head + ": Test Accuracy {:.4f}". format(num_acc/num_tests))
break
elif distributed:
g._sync_barrier()
from multiprocessing import Process
import argparse, time, math
import numpy as np
import os
os.environ['OMP_NUM_THREADS'] = '16'
import mxnet as mx
from mxnet import gluon
import dgl
from dgl import DGLGraph
from dgl.data import register_data_args, load_data
from gcn_ns_sc import gcn_ns_train
from gcn_cv_sc import gcn_cv_train
from graphsage_cv import graphsage_cv_train
def main(args):
g = dgl.contrib.graph_store.create_graph_from_store(args.graph_name, "shared_mem")
# We need to set random seed here. Otherwise, all processes have the same mini-batches.
mx.random.seed(g.worker_id)
features = g.nodes[:].data['features']
labels = g.nodes[:].data['labels']
train_mask = g.nodes[:].data['train_mask']
val_mask = g.nodes[:].data['val_mask']
test_mask = g.nodes[:].data['test_mask']
if args.num_gpus > 0:
ctx = mx.gpu(g.worker_id % args.num_gpus)
else:
ctx = mx.cpu()
train_nid = mx.nd.array(np.nonzero(train_mask.asnumpy())[0]).astype(np.int64)
test_nid = mx.nd.array(np.nonzero(test_mask.asnumpy())[0]).astype(np.int64)
n_classes = len(np.unique(labels.asnumpy()))
n_train_samples = train_mask.sum().asscalar()
n_val_samples = val_mask.sum().asscalar()
n_test_samples = test_mask.sum().asscalar()
if args.model == "gcn_ns":
gcn_ns_train(g, ctx, args, n_classes, train_nid, test_nid, n_test_samples)
elif args.model == "gcn_cv":
gcn_cv_train(g, ctx, args, n_classes, train_nid, test_nid, n_test_samples, True)
elif args.model == "graphsage_cv":
graphsage_cv_train(g, ctx, args, n_classes, train_nid, test_nid, n_test_samples, True)
else:
print("unknown model. Please choose from gcn_ns, gcn_cv, graphsage_cv")
print("parent ends")
if __name__ == '__main__':
parser = argparse.ArgumentParser(description='GCN')
register_data_args(parser)
parser.add_argument("--model", type=str,
help="select a model. Valid models: gcn_ns, gcn_cv, graphsage_cv")
parser.add_argument("--graph-name", type=str, default="",
help="graph name")
parser.add_argument("--num-feats", type=int, default=100,
help="the number of features")
parser.add_argument("--dropout", type=float, default=0.5,
help="dropout probability")
parser.add_argument("--num-gpus", type=int, default=0,
help="the number of GPUs to train")
parser.add_argument("--lr", type=float, default=3e-2,
help="learning rate")
parser.add_argument("--n-epochs", type=int, default=200,
help="number of training epochs")
parser.add_argument("--batch-size", type=int, default=1000,
help="batch size")
parser.add_argument("--test-batch-size", type=int, default=1000,
help="test batch size")
parser.add_argument("--num-neighbors", type=int, default=3,
help="number of neighbors to be sampled")
parser.add_argument("--n-hidden", type=int, default=16,
help="number of hidden gcn units")
parser.add_argument("--n-layers", type=int, default=1,
help="number of hidden gcn layers")
parser.add_argument("--self-loop", action='store_true',
help="graph self-loop (default=False)")
parser.add_argument("--weight-decay", type=float, default=5e-4,
help="Weight for L2 loss")
args = parser.parse_args()
print(args)
main(args)
import os
import argparse, time, math
import numpy as np
from scipy import sparse as spsp
import mxnet as mx
import dgl
from dgl import DGLGraph
from dgl.data import register_data_args, load_data
class GraphData:
def __init__(self, csr, num_feats, graph_name):
num_nodes = csr.shape[0]
self.graph = dgl.graph_index.from_csr(csr.indptr, csr.indices, False, 'in')
self.graph = self.graph.copyto_shared_mem(dgl.contrib.graph_store._get_graph_path(graph_name))
self.features = mx.nd.random.normal(shape=(csr.shape[0], num_feats))
self.num_labels = 10
self.labels = mx.nd.floor(mx.nd.random.uniform(low=0, high=self.num_labels,
shape=(csr.shape[0])))
self.train_mask = np.zeros((num_nodes,))
self.train_mask[np.arange(0, int(num_nodes/2), dtype=np.int64)] = 1
self.val_mask = np.zeros((num_nodes,))
self.val_mask[np.arange(int(num_nodes/2), int(num_nodes/4*3), dtype=np.int64)] = 1
self.test_mask = np.zeros((num_nodes,))
self.test_mask[np.arange(int(num_nodes/4*3), int(num_nodes), dtype=np.int64)] = 1
def main(args):
# load and preprocess dataset
if args.graph_file != '':
csr = mx.nd.load(args.graph_file)[0]
n_edges = csr.indices.shape[0]  # nnz of the CSR, i.e. the number of edges
graph_name = os.path.basename(args.graph_file)
data = GraphData(csr, args.num_feats, graph_name)
csr = None
else:
data = load_data(args)
n_edges = data.graph.number_of_edges()
graph_name = args.dataset
if args.self_loop and not args.dataset.startswith('reddit'):
data.graph.add_edges_from([(i,i) for i in range(len(data.graph))])
mem_ctx = mx.cpu()
features = mx.nd.array(data.features, ctx=mem_ctx)
labels = mx.nd.array(data.labels, ctx=mem_ctx)
train_mask = mx.nd.array(data.train_mask, ctx=mem_ctx)
val_mask = mx.nd.array(data.val_mask, ctx=mem_ctx)
test_mask = mx.nd.array(data.test_mask, ctx=mem_ctx)
n_classes = data.num_labels
n_train_samples = train_mask.sum().asscalar()
n_val_samples = val_mask.sum().asscalar()
n_test_samples = test_mask.sum().asscalar()
print("""----Data statistics------'
#Edges %d
#Classes %d
#Train samples %d
#Val samples %d
#Test samples %d""" %
(n_edges, n_classes,
n_train_samples,
n_val_samples,
n_test_samples))
# create GCN model
print('graph name: ' + graph_name)
g = dgl.contrib.graph_store.create_graph_store_server(data.graph, graph_name, "shared_mem",
args.num_workers, False, edge_dir='in')
g.ndata['features'] = features
g.ndata['labels'] = labels
g.ndata['train_mask'] = train_mask
g.ndata['val_mask'] = val_mask
g.ndata['test_mask'] = test_mask
g.run()
if __name__ == '__main__':
parser = argparse.ArgumentParser(description='GCN')
register_data_args(parser)
parser.add_argument("--graph-file", type=str, default="",
help="graph file")
parser.add_argument("--num-feats", type=int, default=100,
help="the number of features")
parser.add_argument("--self-loop", action='store_true',
help="graph self-loop (default=False)")
parser.add_argument("--num-workers", type=int, default=1,
help="the number of workers")
args = parser.parse_args()
main(args)
import argparse, time, math
import numpy as np
import mxnet as mx
from mxnet import gluon
from functools import partial
import dgl
import dgl.function as fn
from dgl import DGLGraph
from dgl.data import register_data_args, load_data
from gcn_ns_sc import gcn_ns_train
from gcn_cv_sc import gcn_cv_train
from graphsage_cv import graphsage_cv_train
def main(args):
# load and preprocess dataset
data = load_data(args)
if args.gpu >= 0:
ctx = mx.gpu(args.gpu)
else:
ctx = mx.cpu()
if args.self_loop and not args.dataset.startswith('reddit'):
data.graph.add_edges_from([(i,i) for i in range(len(data.graph))])
train_nid = mx.nd.array(np.nonzero(data.train_mask)[0]).astype(np.int64)
test_nid = mx.nd.array(np.nonzero(data.test_mask)[0]).astype(np.int64)
features = mx.nd.array(data.features)
labels = mx.nd.array(data.labels)
train_mask = mx.nd.array(data.train_mask)
val_mask = mx.nd.array(data.val_mask)
test_mask = mx.nd.array(data.test_mask)
in_feats = features.shape[1]
n_classes = data.num_labels
n_edges = data.graph.number_of_edges()
n_train_samples = train_mask.sum().asscalar()
n_val_samples = val_mask.sum().asscalar()
n_test_samples = test_mask.sum().asscalar()
print("""----Data statistics------'
#Edges %d
#Classes %d
#Train samples %d
#Val samples %d
#Test samples %d""" %
(n_edges, n_classes,
n_train_samples,
n_val_samples,
n_test_samples))
# create GCN model
g = dgl.DGLGraph(data.graph, readonly=True)
g.ndata['features'] = features
g.ndata['labels'] = labels
if args.model == "gcn_ns":
gcn_ns_train(g, ctx, args, n_classes, train_nid, test_nid, n_test_samples)
elif args.model == "gcn_cv":
gcn_cv_train(g, ctx, args, n_classes, train_nid, test_nid, n_test_samples, False)
elif args.model == "graphsage_cv":
graphsage_cv_train(g, ctx, args, n_classes, train_nid, test_nid, n_test_samples, False)
else:
print("unknown model. Please choose from gcn_ns, gcn_cv, graphsage_cv")
if __name__ == '__main__':
parser = argparse.ArgumentParser(description='GCN')
register_data_args(parser)
parser.add_argument("--model", type=str,
help="select a model. Valid models: gcn_ns, gcn_cv, graphsage_cv")
parser.add_argument("--dropout", type=float, default=0.5,
help="dropout probability")
parser.add_argument("--gpu", type=int, default=-1,
help="gpu")
parser.add_argument("--lr", type=float, default=3e-2,
help="learning rate")
parser.add_argument("--n-epochs", type=int, default=200,
help="number of training epochs")
parser.add_argument("--batch-size", type=int, default=1000,
help="batch size")
parser.add_argument("--test-batch-size", type=int, default=1000,
help="test batch size")
parser.add_argument("--num-neighbors", type=int, default=3,
help="number of neighbors to be sampled")
parser.add_argument("--n-hidden", type=int, default=16,
help="number of hidden gcn units")
parser.add_argument("--n-layers", type=int, default=1,
help="number of hidden gcn layers")
parser.add_argument("--self-loop", action='store_true',
help="graph self-loop (default=False)")
parser.add_argument("--weight-decay", type=float, default=5e-4,
help="Weight for L2 loss")
parser.add_argument("--nworkers", type=int, default=1,
help="number of workers")
args = parser.parse_args()
print(args)
main(args)
# Adaptive sampling for graph representation learning
This is a DGL implementation of [Adaptive Sampling Towards Fast Graph Representation Learning](https://arxiv.org/abs/1809.05343).
The authors' implementation can be found [here](https://github.com/huangwb/AS-GCNN).
## Performance
Test accuracy on the cora dataset reaches 0.84 at around 250 epochs when the sample size is set to 256 for each layer.
## Usage
`python adaptive_sampling.py --batch_size 20 --node_per_layer 40`