- 08 Apr, 2021 1 commit
-
-
Da Zheng authored
Co-authored-by:
Ubuntu <ubuntu@ip-172-31-73-81.ec2.internal> Co-authored-by:
Jinjing Zhou <VoVAllen@users.noreply.github.com>
-
- 04 Apr, 2021 1 commit
-
-
Da Zheng authored
* set omp thread. * add comment. * fix.
-
- 30 Mar, 2021 1 commit
-
-
Da Zheng authored
* remove num_workers. * remove num_workers. * remove num_workers. * remove num-servers. * update error message. * update docstring. * fix docs. * fix tests. * fix test. * fix. * print messages in test. * fix. * fix test. * fix. Co-authored-by:Ubuntu <ubuntu@ip-172-31-9-132.us-west-1.compute.internal>
-
- 22 Mar, 2021 1 commit
-
-
Da Zheng authored
Co-authored-by:xiang song(charlie.song) <classicxsong@gmail.com>
-
- 25 Feb, 2021 1 commit
-
-
Da Zheng authored
Co-authored-by:Ubuntu <ubuntu@ip-172-31-9-132.us-west-1.compute.internal>
-
- 09 Feb, 2021 1 commit
-
-
Da Zheng authored
* add convert. * fix. * add write_mag. * fix convert_partition.py * write data. * use pyarrow to read. * update write_mag.py * fix convert_partition.py. * load node/edge features when necessary. * reshuffle nodes. * write mag correctly. * fix a bug: inner nodes in a partition might be empty. * fix bugs. * add verify code. * insert reverse edges. * fix a bug. * add get node/edge data. * add instructions. * remove unnecessary argument. * update distributed preprocessing. * fix readme. * fix. * fix. * fix. * fix readme. * fix doc. * fix. * update readme * update doc. * update readme. Co-authored-by:
Ubuntu <ubuntu@ip-172-31-9-132.us-west-1.compute.internal> Co-authored-by:
Ubuntu <ubuntu@ip-172-31-2-202.us-west-1.compute.internal>
-
- 15 Sep, 2020 1 commit
-
-
Chao Ma authored
* update * update
-
- 27 Aug, 2020 1 commit
-
-
Chao Ma authored
* check num_workers * update * update * update * update * update * update
-
- 13 Aug, 2020 1 commit
-
-
Chao Ma authored
* update * update * update * update * update * update * update * update * update * update * update * update * update * update
-
- 12 Aug, 2020 2 commits
- 11 Aug, 2020 2 commits
-
-
Da Zheng authored
* move server start code to initialize. * fix. * fix lint. * fix examples. * add more checks.
-
Chao Ma authored
* remove server_count from ip_config.txt * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * lint * update * update * update * update * update * update * update * update * update * update * update * update * update * Update dist_context.py * fix lint. * make it work for multiple spaces. * update ip_config.txt. * fix examples. * update * update * update * update * update * update * update * update * update * update * update * update * update * update * udpate * update * update * update * update * update Co-authored-by:
Da Zheng <zhengda1936@gmail.com> Co-authored-by:
Ubuntu <ubuntu@ip-172-31-19-1.us-west-2.compute.internal>
-
- 10 Aug, 2020 1 commit
-
-
Da Zheng authored
* fix tests. * fix. * remove a test. * make code work in the standalone mode. * fix example. * more fix. * make DistDataloader work with num_workers=0 * fix DistDataloader tests. * fix. * fix lint. * fix cleanup. * fix test * remove unnecessary code. * remove tests. * fix. * fix. * fix. * fix example * fix. * fix. * fix launch script. Co-authored-by:
Jinjing Zhou <VoVAllen@users.noreply.github.com> Co-authored-by:
Ubuntu <ubuntu@ip-172-31-19-1.us-west-2.compute.internal>
-
- 09 Aug, 2020 1 commit
-
-
Da Zheng authored
* temp fix omp. * set server threads. * add CAPI to set up OMP threads. * fix. * fix. * update namesapce. * set cpi properly. * allow to config num worker threads. * set #threads. * fix.
-
- 08 Aug, 2020 1 commit
-
-
Da Zheng authored
* update launch script * check the correctness of launch script. * fix.
-
- 31 Jul, 2020 1 commit
-
-
Da Zheng authored
* fix bugs. * eval on both vaidation and testing. * add script. * update. * update launch. * make train_dist.py independent. * update readme. * update readme. * update readme. * update readme. * generate undirected graph. * rename conf_file to part_config * use rsync * make train_dist independent. Co-authored-by:
Ubuntu <ubuntu@ip-172-31-19-1.us-west-2.compute.internal> Co-authored-by:
Ubuntu <ubuntu@ip-172-31-19-115.us-west-2.compute.internal> Co-authored-by:
xiang song(charlie.song) <classicxsong@gmail.com>
-
- 27 Jul, 2020 1 commit
-
-
Chao Ma authored
* update * update * update * update
-
- 17 Jul, 2020 1 commit
-
-
Da Zheng authored
Co-authored-by:Ubuntu <ubuntu@ip-172-31-19-1.us-west-2.compute.internal>
-
- 16 Jul, 2020 1 commit
-
-
Chao Ma authored
* update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * fix launch script. Co-authored-by:Da Zheng <zhengda1936@gmail.com>
-
- 03 May, 2020 1 commit
-
-
Da Zheng authored
* initial version from distributed training. This is copied from multiprocessing training. * modify for distributed training. * it's runnable now. * measure time in neighbor sampling. * simplify neighbor sampling. * fix a bug in distributed neighbor sampling. * allow single-machine training. * fix a bug. * fix a bug. * fix openmp. * make some improvement. * fix. * add prepare in the sampler. * prepare nodeflow async. * fix a bug. * get id. * simplify the code. * improve. * fix partition.py * fix the example. * add more features. * fix the example. * allow one partition * use distributed kvstore. * do g2l map manually. * fix commandline. * a temp script to save reddit. * fix pull_handler. * add pytorch version. * estimate the time for copying data. * delete unused code. * fix a bug. * print id. * fix a bug * fix a bug * fix a bug. * remove redundent code. * revert modify in sampler. * fix temp script. * remove pytorch version. * fix. * distributed training with pytorch. * add distributed graph store. * fix. * add metis_partition_assignment. * fix a few bugs in distributed graph store. * fix test. * fix bugs in distributed graph store. * fix tests. * remove code of defining DistGraphStore. * fix partition. * fix example. * update run.sh. * only read necessary node data. * batching data fetch of multiple NodeFlows. * simplify gcn. * remove unnecessary code. * use the new copy_from_kvstore. * update training script. * print time in graphsage. * make distributed training runnable. * use val_nid. * fix train_sampling. * add distributed training. * add run.sh * add more timing. * fix a bug. * save graph metadata when partition. * create ndata and edata in distributed graph store. * add timing in minibatch training of GraphSage. * use pytorch distributed. * add checks. * fix a bug in global vs. local ids. * remove fast pull * fix a compile error. * update and add new APIs. * implement more methods in DistGraphStore. * update more APIs. * rename it to DistGraph. * rename to DistTensor * remove some unnecessary API. * remove unnecessary files. * revert changes in sampler. * Revert "simplify gcn." This reverts commit 0ed3a34ca714203a5b45240af71555d4227ce452. * Revert "simplify neighbor sampling." This reverts commit 551c72d20f05a029360ba97f312c7a7a578aacec. * Revert "measure time in neighbor sampling." This reverts commit 63ae80c7b402bb626e24acbbc8fdfe9fffd0bc64. * Revert "add timing in minibatch training of GraphSage." This reverts commit e59dc8957a414c7df5c316f51d78bce822bdef5e. * Revert "fix train_sampling." This reverts commit ea6aea9a4aabb8ba0ff63070aa51e7ca81536ad9. * fix lint. * add comments and small update. * add more comments. * add more unit tests and fix bugs. * check the existence of shared-mem graph index. * use new partitioned graph storage. * fix bugs. * print error in fast pull. * fix lint * fix a compile error. * save absolute path after partitioning. * small fixes in the example * Revert "[kvstore] support any data type for init_data() (#1465)" This reverts commit 87b6997b . * fix a bug. * disable evaluation. * Revert "Revert "[kvstore] support any data type for init_data() (#1465)"" This reverts commit f5b8039c6326eb73bad8287db3d30d93175e5bee. * support set and init data. * support set and init data. * Revert "Revert "[kvstore] support any data type for init_data() (#1465)"" This reverts commit f5b8039c6326eb73bad8287db3d30d93175e5bee. * fix bugs. * fix unit test. * move to dgl.distributed. * fix lint. * fix lint. * remove local_nids. * fix lint. * fix test. * remove train_dist. * revert train_sampling. * rename funcs. * address comments. * address comments. Use NodeDataView/EdgeDataView to keep track of data. * address comments. * address comments. * revert. * save data with DGL serializer. * use the right way of getting shape. * fix lint. * address comments. * address comments. * fix an error in mxnet. * address comments. * add edge_map. * add more test and fix bugs. Co-authored-by:
Zheng <dzzhen@186590dc80ff.ant.amazon.com> Co-authored-by:
Ubuntu <ubuntu@ip-172-31-6-131.us-east-2.compute.internal> Co-authored-by:
Ubuntu <ubuntu@ip-172-31-26-167.us-east-2.compute.internal> Co-authored-by:
Ubuntu <ubuntu@ip-172-31-16-150.us-west-2.compute.internal> Co-authored-by:
Ubuntu <ubuntu@ip-172-31-16-250.us-west-2.compute.internal> Co-authored-by:
Ubuntu <ubuntu@ip-172-31-30-135.us-west-2.compute.internal>
-
- 08 Mar, 2020 1 commit
-
-
Da Zheng authored
* add metis. * add test. * construct partition id. * link to METIS github repo. * update metis. * add a tool for partitioning a graph. * update metis. * update. * update. * fix metis. * fix lint * fix indent. * another way of building metis. * disable metis in windows. * test windows * fix. * disable metis for windows properly. * fix for tensorflow. * skip test for gpu. * make graph symmetric * address comments. * more comments. * fix compile * fix a bug. * add test. * change the default #hops of HALO nodes. Co-authored-by:Ubuntu <ubuntu@ip-172-31-26-167.us-east-2.compute.internal>
-