Commits · 2190c39d674f76c65db9ee8da7b43d3021f19c29 · OpenDAS / dgl

03 May, 2020 1 commit

[Feature] Distributed graph store (#1383) · 2190c39d

Da Zheng authored May 02, 2020

* initial version from distributed training.

This is copied from multiprocessing training.

* modify for distributed training.

* it's runnable now.

* measure time in neighbor sampling.

* simplify neighbor sampling.

* fix a bug in distributed neighbor sampling.

* allow single-machine training.

* fix a bug.

* fix a bug.

* fix openmp.

* make some improvement.

* fix.

* add prepare in the sampler.

* prepare nodeflow async.

* fix a bug.

* get id.

* simplify the code.

* improve.

* fix partition.py

* fix the example.

* add more features.

* fix the example.

* allow one partition

* use distributed kvstore.

* do g2l map manually.

* fix commandline.

* a temp script to save reddit.

* fix pull_handler.

* add pytorch version.

* estimate the time for copying data.

* delete unused code.

* fix a bug.

* print id.

* fix a bug

* fix a bug

* fix a bug.

* remove redundant code.

* revert modify in sampler.

* fix temp script.

* remove pytorch version.

* fix.

* distributed training with pytorch.

* add distributed graph store.

* fix.

* add metis_partition_assignment.

* fix a few bugs in distributed graph store.

* fix test.

* fix bugs in distributed graph store.

* fix tests.

* remove code of defining DistGraphStore.

* fix partition.

* fix example.

* update run.sh.

* only read necessary node data.

* batching data fetch of multiple NodeFlows.

* simplify gcn.

* remove unnecessary code.

* use the new copy_from_kvstore.

* update training script.

* print time in graphsage.

* make distributed training runnable.

* use val_nid.

* fix train_sampling.

* add distributed training.

* add run.sh

* add more timing.

* fix a bug.

* save graph metadata when partition.

* create ndata and edata in distributed graph store.

* add timing in minibatch training of GraphSage.

* use pytorch distributed.

* add checks.

* fix a bug in global vs. local ids.

* remove fast pull

* fix a compile error.

* update and add new APIs.

* implement more methods in DistGraphStore.

* update more APIs.

* rename it to DistGraph.

* rename to DistTensor

* remove some unnecessary API.

* remove unnecessary files.

* revert changes in sampler.

* Revert "simplify gcn."

This reverts commit 0ed3a34ca714203a5b45240af71555d4227ce452.

* Revert "simplify neighbor sampling."

This reverts commit 551c72d20f05a029360ba97f312c7a7a578aacec.

* Revert "measure time in neighbor sampling."

This reverts commit 63ae80c7b402bb626e24acbbc8fdfe9fffd0bc64.

* Revert "add timing in minibatch training of GraphSage."

This reverts commit e59dc8957a414c7df5c316f51d78bce822bdef5e.

* Revert "fix train_sampling."

This reverts commit ea6aea9a4aabb8ba0ff63070aa51e7ca81536ad9.

* fix lint.

* add comments and small update.

* add more comments.

* add more unit tests and fix bugs.

* check the existence of shared-mem graph index.

* use new partitioned graph storage.

* fix bugs.

* print error in fast pull.

* fix lint

* fix a compile error.

* save absolute path after partitioning.

* small fixes in the example

* Revert "[kvstore] support any data type for init_data() (#1465)"

This reverts commit 87b6997b.

* fix a bug.

* disable evaluation.

* Revert "Revert "[kvstore] support any data type for init_data() (#1465)""

This reverts commit f5b8039c6326eb73bad8287db3d30d93175e5bee.

* support set and init data.

* support set and init data.

* Revert "Revert "[kvstore] support any data type for init_data() (#1465)""

This reverts commit f5b8039c6326eb73bad8287db3d30d93175e5bee.

* fix bugs.

* fix unit test.

* move to dgl.distributed.

* fix lint.

* fix lint.

* remove local_nids.

* fix lint.

* fix test.

* remove train_dist.

* revert train_sampling.

* rename funcs.

* address comments.

* address comments.

Use NodeDataView/EdgeDataView to keep track of data.

* address comments.

* address comments.

* revert.

* save data with DGL serializer.

* use the right way of getting shape.

* fix lint.

* address comments.

* address comments.

* fix an error in mxnet.

* address comments.

* add edge_map.

* add more test and fix bugs.
Co-authored-by: Zheng <dzzhen@186590dc80ff.ant.amazon.com>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-6-131.us-east-2.compute.internal>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-26-167.us-east-2.compute.internal>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-16-150.us-west-2.compute.internal>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-16-250.us-west-2.compute.internal>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-30-135.us-west-2.compute.internal>

2190c39d

28 Apr, 2020 2 commits

[Model] Shared memory history tensor for multi-gpu training of control variate methods (#1479) · 05b0f1ea

Quan (Andy) Gan authored Apr 28, 2020

* shared memory history for multi-gpu training

* reorg
Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>

05b0f1ea

[Model] Unsupervised learning with GraphSAGE (#1440) · dc0432e7

Quan (Andy) Gan authored Apr 28, 2020



* unsupervised graphsage first commit

* fix

* disable remove_edges and still got 0.90 performance

* optimize edgeids with multimap

* change hyperparams

* update README
Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>

dc0432e7

26 Apr, 2020 1 commit

[Model] GraphSAGE with control variate sampling on new sampler (#1355) · 97bb85d5

Quan (Andy) Gan authored Apr 26, 2020



* control variate first commit

* bug fixes

* split to single and multi GPU

* update readme

* bugfix

* bugfix

* remove push

* bugfix on multi gpu

* update README
Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>

97bb85d5

18 Mar, 2020 1 commit

[Bug] Fix dsttype in GraphSAGE minibatch model (#1371) · 0a51dc54

Quan (Andy) Gan authored Mar 18, 2020

* fix for new ntype API for blocks

* adding two new interfaces

0a51dc54

15 Mar, 2020 1 commit

[Model][Perf] Improve sage sampling performance (#1364) · d876680a

Minjie Wang authored Mar 15, 2020

* improve speed

* fix bugs

* upd reg test

d876680a

10 Mar, 2020 1 commit

rewrite to use dataloader (#1333) · 20e1bb45

Quan (Andy) Gan authored Mar 10, 2020

Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>

20e1bb45

07 Mar, 2020 1 commit

[Model][Sampler] GraphSAGE model, bipartite graph conversion & remove edges API (#1297) · a9520f71

Quan (Andy) Gan authored Mar 08, 2020

* remove edge and to bipartite and graphsage with sampling

* fixes

* fixes

* fixes

* reenable multigpu training

* fixes

* compatibility in DGLGraph

* rename to compact_as_bipartite

* bugfix

* lint

* add offline inference

* skip GPU tests

* fix

* address comments

* fix

* fix

* fix

* more tests

* more docs and unit tests

* workaround for empty slice on empty data

a9520f71

04 Nov, 2019 1 commit

hotfix (#971) · fdd0fe65

Zihao Ye authored Nov 04, 2019

fdd0fe65

03 Nov, 2019 1 commit

[NN] nn modules & examples update (#890) · 9a0511c8

Zihao Ye authored Nov 04, 2019

* upd

* damn it

* fuck

* fuck pylint

* fudge

* remove some comments about MXNet

* upd

* upd

* damn it

* damn it

* fuck

* fuck

* upd

* upd

* pylint bastard

* upd

* upd

* upd

* upd

* upd

* upd

* upd

* upd

* upd

9a0511c8

30 Oct, 2019 1 commit

[Bug Fix] Fix package reliability bug of networkx (#949) · 82499e60

xiang song(charlie.song) authored Oct 30, 2019

* upd

* fix edgebatch edges

* add test

* trigger

* Update README.md for pytorch PinSage example.

Add a note that the PinSage model example under
example/pytorch/recommendation only works with Python 3.6+,
as its dataset loader depends on the stanfordnlp package,
which works only with Python 3.6+.

* Provide a frame agnostic API to test nn modules on both CPU and CUDA side.

1. make dgl.nn.xxx frame agnostic
2. make test.backend include dgl.nn modules
3. modify test_edge_softmax of test/mxnet/test_nn.py and
    test/pytorch/test_nn.py to work on both CPU and GPU

* Fix style

* Delete unused code

* Make agnostic test only related to tests/backend

1. clear all agnostic related code in dgl.nn
2. make test_graph_conv agnostic to cpu/gpu

* Fix code style

* fix

* doc

* Make all test code under tests.mxnet/pytorch.test_nn.py
work on both CPU and GPU.

* Fix syntax

* Remove rand

* Add TAGCN nn.module and example

* Now tagcn can run on CPU.

* Add unit test for TGConv

* Fix style

* For pubmed dataset, using --lr=0.005 can achieve better acc

* Fix style

* Fix some descriptions

* trigger

* Fix doc

* Add nn.TGConv and example

* Fix bug

* Update data in mxnet.tagcn test acc.

* Fix some comments and code

* delete useless code

* Fix naming

* Fix bug

* Fix bug

* Add test for mxnet TAGCov

* Add test code for mxnet TAGCov

* Update some docs

* Fix some code

* Update docs dgl.nn.mxnet

* Update weight init

* Fix

* reproduce the bug

* Fix concurrency bug reported at #755.
Also make test_shared_mem_store.py more deterministic.

* Update test_shared_mem_store.py

* Update dmlc/core

* networkx >= 2.4 will break our examples

* Update tutorials/requirements

* fix selfloop edges

* upd version

82499e60

29 Oct, 2019 1 commit

[Cleanup] Change Byte to Bool for training masks (#954) · 98c1448b

Jacob Stevens authored Oct 29, 2019

* Change Byte to Bool for training masks

* Check if module has Bool, otherwise use Byte

98c1448b

27 Aug, 2019 2 commits

[Refactor] Interface of nn modules (#798) · 9314aabd

Zihao Ye authored Aug 27, 2019

* refactor

* upd mpnn

9314aabd

[NN] Add commonly used GNN models from examples to dgl.nn modules. (#748) · 650f6ee1

Zihao Ye authored Aug 27, 2019

* gat

* upd

* upd sage

* upd

* upd

* upd

* upd

* upd

* add gmmconv

* upd ggnn

* upd

* upd

* upd

* upd

* add citation examples

* add README

* fix cheb

* improve doc

* formula

* upd

* trigger

* lint

* lint

* upd

* add test for transform

* add test

* check

* upd

* improve doc

* shape check

* upd

* densechebconv, currently not correct (?)

* fix cheb

* fix

* upd

* upd sgc-reddit

* upd

* trigger

650f6ee1

23 May, 2019 1 commit

all demo use python-3 (#555) · f99725ad

Chao Ma authored May 23, 2019

f99725ad

22 Feb, 2019 1 commit

[Model][Pytorch] GraphSAGE (#403) · 8c750170

hbsun2113 authored Feb 23, 2019

* simple implementation of GraphSAGE

* update the abstract method and make more use of `torch.nn` modules

8c750170