"src/vscode:/vscode.git/clone" did not exist on "2cf4bd0acf479d8d51347b6b524aebd3fdcc8d9f"
- 06 Jun, 2023 1 commit
-
-
Rhett Ying authored
-
- 01 Jun, 2023 1 commit
-
-
Rhett Ying authored
-
- 17 Nov, 2022 1 commit
-
-
Rhett Ying authored
* [Dist][Examples] refactor dist graphsage examples * refine train_dist.py * update train_dist_unsupervised.py * fix debug info * update train_dist_transductive * update unsupervised_transductive * remove distgnn * fix join() in standalone mode * change batch_labels to long() for ogbn-papers100M * free unnecessary mem * lint * fix lint * refine * fix lint * fix incorrect args * refine
-
- 28 Sep, 2022 1 commit
-
-
Hongzhi (Steve), Chen authored
Co-authored-by:Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>
-
- 15 Sep, 2022 1 commit
-
-
Rhett Ying authored
* [examples] reduce memory consumption * refine help message * refine
-
- 27 Apr, 2022 1 commit
-
-
Rhett Ying authored
* [Feature] enable socket net_type for rpc * fix lint * fix lint * fix build issue on windows * fix test failure on windows * fix test failure * fix cpp unit test failure * net_type blocking max_try_times * fix other comments * fix lint * fix comment * fix lint * fix cpp
-
- 25 Mar, 2022 1 commit
-
-
Quan (Andy) Gan authored
* fix distributed multi-GPU example device * try Join * update version requirement in README * use model.join * fix docs Co-authored-by:Jinjing Zhou <VoVAllen@users.noreply.github.com>
-
- 27 Feb, 2022 1 commit
-
-
Quan (Andy) Gan authored
* huuuuge update * remove * lint * lint * fix * what happened to nccl * update multi-gpu unsupervised graphsage example * replace most of the dgl.mp.process with torch.mp.spawn * update if condition for use_uva case * update user guide * address comments * incorporating suggestions from @jermainewang * oops * fix tutorial to pass CI * oops * fix again Co-authored-by:Xin Yao <xiny@nvidia.com>
-
- 07 Feb, 2022 1 commit
-
-
Jinjing Zhou authored
-
- 24 Dec, 2021 1 commit
-
-
xcwan authored
* Add nccl backend and fix pad_data function cuda bug * Update train_dist.py * Update train_dist.py Co-authored-by:Jinjing Zhou <VoVAllen@users.noreply.github.com>
-
- 20 Dec, 2021 1 commit
-
-
Jinjing Zhou authored
Co-authored-by:Quan (Andy) Gan <coin2028@hotmail.com>
-
- 06 Dec, 2021 1 commit
-
-
Jinjing Zhou authored
* tmp fix * add description
-
- 02 Sep, 2021 1 commit
-
-
xiang song(charlie.song) authored
Co-authored-by:Ubuntu <ubuntu@ip-172-31-2-66.ec2.internal>
-
- 16 Jun, 2021 1 commit
-
-
Da Zheng authored
* add. * fix. * fix. * fix. * fix. * add tests. * support node split and edge split. * support 1 partition. * add tests. * fix. * fix test. * use hierarchical partition. * add check.
Co-authored-by:Zheng <dzzhen@3c22fba32af5.ant.amazon.com>
Co-authored-by:Ubuntu <ubuntu@ip-172-31-22-57.us-west-2.compute.internal>
Co-authored-by:Ubuntu <ubuntu@ip-172-31-71-112.ec2.internal>
-
- 03 May, 2021 1 commit
-
-
xiang song(charlie.song) authored
* Draft for sparse emb * add some notes * Fix * Add sparse optim for dist pytorch * Update test * Fix * upd * upd * Fix * Fix * Fix bug * add transductive example * Fix example * Some fix * Upd * Fix lint * lint * lint * lint * upd * Fix lint * lint * upd * remove dead import * update * lint * update unit test * update example * Add adam optimizer * Add unit test and update data * upd * upd * upd * Fix docstring and fix some bug in example code * Update rgcn readme
Co-authored-by:Ubuntu <ubuntu@ip-172-31-57-25.ec2.internal>
Co-authored-by:Ubuntu <ubuntu@ip-172-31-24-210.ec2.internal>
Co-authored-by:Ubuntu <ubuntu@ip-172-31-2-66.ec2.internal>
-
- 30 Mar, 2021 1 commit
-
-
Da Zheng authored
* remove num_workers. * remove num_workers. * remove num_workers. * remove num-servers. * update error message. * update docstring. * fix docs. * fix tests. * fix test. * fix. * print messages in test. * fix. * fix test. * fix. Co-authored-by:Ubuntu <ubuntu@ip-172-31-9-132.us-west-1.compute.internal>
-
- 22 Mar, 2021 1 commit
-
-
Da Zheng authored
Co-authored-by:xiang song(charlie.song) <classicxsong@gmail.com>
-
- 29 Oct, 2020 1 commit
-
-
maqy1995 authored
-
- 16 Sep, 2020 1 commit
-
-
Chao Ma authored
* update * update * update
-
- 15 Sep, 2020 2 commits
-
-
Chao Ma authored
* update * update
-
Qidong Su authored
Co-authored-by:Ubuntu <ubuntu@ip-172-31-10-127.us-west-2.compute.internal>
Co-authored-by:Chao Ma <mctt90@gmail.com>
-
- 14 Sep, 2020 1 commit
-
-
Chao Ma authored
-
- 27 Aug, 2020 1 commit
-
-
Chao Ma authored
* check num_workers * update * update * update * update * update * update
-
- 25 Aug, 2020 1 commit
-
-
Chao Ma authored
* fix issues on GPU * update * update * update * update * update * update * update * update * update Co-authored-by:Ma <manchao@38f9d3587685.ant.amazon.com>
-
- 12 Aug, 2020 2 commits
- 11 Aug, 2020 2 commits
-
-
Da Zheng authored
* move server start code to initialize. * fix. * fix lint. * fix examples. * add more checks.
-
Chao Ma authored
* remove server_count from ip_config.txt * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * lint * update * update * update * update * update * update * update * update * update * update * update * update * update * Update dist_context.py * fix lint. * make it work for multiple spaces. * update ip_config.txt. * fix examples. * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update
Co-authored-by:Da Zheng <zhengda1936@gmail.com>
Co-authored-by:Ubuntu <ubuntu@ip-172-31-19-1.us-west-2.compute.internal>
-
- 10 Aug, 2020 1 commit
-
-
Da Zheng authored
* fix tests. * fix. * remove a test. * make code work in the standalone mode. * fix example. * more fix. * make DistDataloader work with num_workers=0 * fix DistDataloader tests. * fix. * fix lint. * fix cleanup. * fix test * remove unnecessary code. * remove tests. * fix. * fix. * fix. * fix example * fix. * fix. * fix launch script.
Co-authored-by:Jinjing Zhou <VoVAllen@users.noreply.github.com>
Co-authored-by:Ubuntu <ubuntu@ip-172-31-19-1.us-west-2.compute.internal>
-
- 08 Aug, 2020 1 commit
-
-
Da Zheng authored
* fix example. * move feature copy to sampler. Co-authored-by:Ubuntu <ubuntu@ip-172-31-19-1.us-west-2.compute.internal>
-
- 05 Aug, 2020 1 commit
-
-
Jinjing Zhou authored
* 111 * 111 * fix * 111 * fix * 11 * fix * lint * Update __init__.py * lint * fix * lint * fix * fix * fix * fix * fix * try fix * try fix * fix * Revert "fix" This reverts commit a0b954fd4e99b7df92b53db8334dcb583d6e1551. * fixes. * fix. * fix test. * fix exit. * fix. * fix * fix * lint * lint * lint * fix * Update .gitignore * 111 * fix * 111 * 111 * fff * 1111 * 111 * 1325315 * ffff * f??? * fff * 1111 * 111 * fix * 111 * asda * 1111 * 11 * 123 * ahhhhhhhhhhhhhhhh * spawn * 1231231 * up * 111 * fix * fix * Revert "fix" This reverts commit 7373f95312fdcaa36d2fc330bf242339e89c045d. * fix * fix * 1111 * fix * fix tests * start kvclient as early as possible. * lint * fix test * lint * 1111 * fix * fix * 111 * fix * fix * 1 * fix * fix * lint * fix * lint * lint * remove quit * fix * lint * fix * fix several * lint * fix minor * fix * lint
Co-authored-by:Da Zheng <zhengda1936@gmail.com>
Co-authored-by:xiang song(charlie.song) <classicxsong@gmail.com>
-
- 31 Jul, 2020 1 commit
-
-
Da Zheng authored
* fix bugs. * eval on both validation and testing. * add script. * update. * update launch. * make train_dist.py independent. * update readme. * update readme. * update readme. * update readme. * generate undirected graph. * rename conf_file to part_config * use rsync * make train_dist independent.
Co-authored-by:Ubuntu <ubuntu@ip-172-31-19-1.us-west-2.compute.internal>
Co-authored-by:Ubuntu <ubuntu@ip-172-31-19-115.us-west-2.compute.internal>
Co-authored-by:xiang song(charlie.song) <classicxsong@gmail.com>
-
- 22 Jul, 2020 1 commit
-
-
Da Zheng authored
* add eval. * extend DistTensor. * fix. * add barrier. * add more print. * add more checks in kvstore. * fix lint. * get all neighbors for eval. * reorganize. * fix. * fix. * fix. * fix test. * add reuse_if_exist. * add test for reuse_if_exist. * fix lint. * fix bugs. * fix. * print errors of tcp socket. * support delete tensors. * fix lint. * fix * fix example Co-authored-by:Ubuntu <ubuntu@ip-172-31-19-1.us-west-2.compute.internal>
-
- 20 Jul, 2020 1 commit
-
-
Chao Ma authored
* exit client * update * update * update * update * update * update * update * update test * update * update * update * update * update * update * update * update * update
-
- 15 Jul, 2020 1 commit
-
-
Da Zheng authored
* add standalone mode * add comments. * add tests for sampling. * fix. * make the code to run the standalone mode * fix * fix * fix readme. * fix. * fix test Co-authored-by:Chao Ma <mctt90@gmail.com>
-
- 14 Jul, 2020 1 commit
-
-
Da Zheng authored
* run dist server in dgl. * fix bugs. * fix example. * check environment variables and fix lint. * fix lint
-
- 02 Jul, 2020 1 commit
-
-
Quan (Andy) Gan authored
* neighbor sampler data loader first commit * more commit * nodedataloader * fix * update RGCN example * update OGB * fixes * fix minibatch RGCN crashing with self loop * reverting gatconv test code * fix * change to new solution that doesn't require tf dataloader * fix * lint * fix * fixes * change doc * fix docstring * docstring fixes * return seeds and input nodes from data loader * fixes * fix test * fix windows build problem * add pytorch wrapper * fixes * add pytorch wrapper * add unit test * add -1 support to sample_neighbors & fix docstrings * docstring fix * lint * add minibatch rgcn evaluations
Co-authored-by:xiang song(charlie.song) <classicxsong@gmail.com>
Co-authored-by:Tong He <hetong007@gmail.com>
-
- 01 Jul, 2020 1 commit
-
-
Da Zheng authored
* fix. * fix tests. * fix * add tests. * fix. * have default rank. * add comment. * fix test. * remove check * simplify code. * add test. * split data evenly. * simplify the distributed training code. * add comments. * add comments.
-
- 28 Jun, 2020 1 commit
-
-
Da Zheng authored
* add train_dist. * Fix sampling example. * use distributed sampler. * fix a bug in DistTensor. * fix distributed training example. * add graph partition. * add command * disable pytorch parallel. * shutdown correctly. * load diff graphs. * add ip_config.txt. * record timing for each step. * use ogb * add profiler. * fix a bug. * add train_dist. * Fix sampling example. * use distributed sampler. * fix a bug in DistTensor. * fix distributed training example. * add graph partition. * add command * disable pytorch parallel. * shutdown correctly. * load diff graphs. * add ip_config.txt. * record timing for each step. * use ogb * add profiler. * add Ips of the cluster. * fix exit. * support multiple clients. * balance node types and edges. * move code. * remove run.sh * Revert "support multiple clients." * fix. * update train_sampling. * fix. * fix * remove run.sh * update readme. * update readme. * use pytorch distributed. * ensure all trainers run the same number of steps. * Update README.md Co-authored-by:Ubuntu <ubuntu@ip-172-31-16-250.us-west-2.compute.internal>
-