- 31 Jul, 2020 1 commit
Da Zheng authored
* fix bugs.
* eval on both validation and testing.
* add script.
* update.
* update launch.
* make train_dist.py independent.
* update readme.
* generate undirected graph.
* rename conf_file to part_config.
* use rsync.
* make train_dist independent.

Co-authored-by: Ubuntu <ubuntu@ip-172-31-19-1.us-west-2.compute.internal>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-19-115.us-west-2.compute.internal>
Co-authored-by: xiang song(charlie.song) <classicxsong@gmail.com>
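The rename of `conf_file` to `part_config` refers to the JSON metadata file produced by graph partitioning, which each trainer loads to locate its partition. A hedged sketch of what such a file might contain — the field names, graph name, and paths below are illustrative assumptions, not taken from the repository:

```json
{
  "graph_name": "ogbn-products",
  "part_method": "metis",
  "num_parts": 2,
  "node_map": "data/node_map.npy",
  "edge_map": "data/edge_map.npy",
  "part-0": {
    "node_feats": "part0/node_feat.dgl",
    "edge_feats": "part0/edge_feat.dgl",
    "part_graph": "part0/graph.dgl"
  },
  "part-1": {
    "node_feats": "part1/node_feat.dgl",
    "edge_feats": "part1/edge_feat.dgl",
    "part_graph": "part1/graph.dgl"
  }
}
```

The same `part_config` path would then be passed to both the servers and the trainers so that all processes agree on the partition layout.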
- 27 Jul, 2020 1 commit
Chao Ma authored
* update
- 16 Jul, 2020 1 commit
Chao Ma authored
* update
* fix launch script.

Co-authored-by: Da Zheng <zhengda1936@gmail.com>
- 15 Jul, 2020 1 commit
Da Zheng authored
* add standalone mode.
* add comments.
* add tests for sampling.
* fix.
* make the code run in standalone mode.
* fix.
* fix readme.
* fix test.

Co-authored-by: Chao Ma <mctt90@gmail.com>
- 14 Jul, 2020 1 commit
Da Zheng authored
* run dist server in dgl.
* fix bugs.
* fix example.
* check environment variables and fix lint.
* fix lint.
- 28 Jun, 2020 1 commit
Da Zheng authored
* add train_dist.
* fix sampling example.
* use distributed sampler.
* fix a bug in DistTensor.
* fix distributed training example.
* add graph partition.
* add command.
* disable pytorch parallel.
* shutdown correctly.
* load diff graphs.
* add ip_config.txt.
* record timing for each step.
* use ogb.
* add profiler.
* fix a bug.
* add IPs of the cluster.
* fix exit.
* support multiple clients.
* balance node types and edges.
* move code.
* remove run.sh.
* Revert "support multiple clients."
* fix.
* update train_sampling.
* fix.
* remove run.sh.
* update readme.
* use pytorch distributed.
* ensure all trainers run the same number of steps.
* Update README.md.

Co-authored-by: Ubuntu <ubuntu@ip-172-31-16-250.us-west-2.compute.internal>
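The ip_config.txt added here lists the machines in the cluster, one per line. A minimal sketch, assuming the two-column "IP port" format used by DGL's distributed launch scripts around this time — the addresses and port below are placeholders, not values from the repository:

```
172.31.19.1 30050
172.31.19.115 30050
```

Every machine in the file runs a server process, and the launch script uses the same file to tell trainers where to connect.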