1. 25 Jan, 2021 1 commit
    • [Distributed] Heterogeneous graph support (#2457) · 25ac3344
      Da Zheng authored
      * Distributed heterograph (#3)
      
      * heterogeneous graph partition.
      
      * fix graph partition book for heterograph.
      
      * load heterograph partitions.
      
      * update DistGraphServer to support heterograph.
      
      * make DistGraph runnable for heterograph.
      
      * partition a graph and store parts with homogeneous graph structure.
      
      * update DistGraph server&client to use homogeneous graph.
      
      * shuffle node Ids based on node types.
      
      * load mag in heterograph.
      
      * fix per-node-type mapping.
      
      * balance node types.
      
      * fix for homogeneous graph
      
      * store etype for now.
      
      * fix data name.
      
      * fix a bug in example.
      
      * add profiler in rgcn.
      
      * heterogeneous RGCN.
      
      * map homogeneous node ids to hetero node ids.
      
      * fix graph partition book.
      
      * fix DistGraph.
      
      * shuffle eids.
      
      * verify eids and their mappings when loading a partition.
      
      * Id map from homogeneous Ids to per-type Ids.
      
      * verify partitioned results.
      
      * add test for distributed sampler....
  2. 02 Sep, 2020 1 commit
  3. 25 Aug, 2020 1 commit
  4. 13 Aug, 2020 1 commit
  5. 12 Aug, 2020 2 commits
  6. 11 Aug, 2020 1 commit
    • [Distributed] Remove server_count from ip_config.txt (#1985) · d340ea3a
      Chao Ma authored
      
      
      * remove server_count from ip_config.txt
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * lint
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * Update dist_context.py
      
      * fix lint.
      
      * make it work for multiple spaces.
      
      * update ip_config.txt.
      
      * fix examples.
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      Co-authored-by: Da Zheng <zhengda1936@gmail.com>
      Co-authored-by: Ubuntu <ubuntu@ip-172-31-19-1.us-west-2.compute.internal>
  7. 10 Aug, 2020 1 commit
  8. 03 Aug, 2020 1 commit
    • [Distributed] Support multiple servers (#1886) · a4c931a9
      Da Zheng authored
      
      
      * client init graph on the backup servers.
      
      * fix.
      
      * test multi-server.
      
      * fix anonymous dist tensors.
      
      * check #parts.
      
      * fix init_data
      
      * add multi-server multi-client tests.
      
      * update tests in kvstore.
      
      * fix.
      
      * verify the loaded partition.
      
      * fix a bug.
      
      * fix lint.
      
      * fix.
      
      * fix example.
      
      * fix rpc.
      
      * fix pull/push handler for backup kvstore
      
      * fix example readme.
      
      * change ip.
      
      * update docstring.
      Co-authored-by: Ubuntu <ubuntu@ip-172-31-19-1.us-west-2.compute.internal>
  9. 01 Aug, 2020 1 commit
  10. 31 Jul, 2020 1 commit
  11. 27 Jul, 2020 1 commit
  12. 16 Jul, 2020 1 commit
    • [Distributed] Distributed launching script (#1772) · ca9d3216
      Chao Ma authored
      
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * fix launch script.
      Co-authored-by: Da Zheng <zhengda1936@gmail.com>
  13. 15 Jul, 2020 1 commit
  14. 14 Jul, 2020 1 commit
  15. 28 Jun, 2020 1 commit
    • [Distributed] Pytorch example of distributed GraphSage. (#1495) · 02d31974
      Da Zheng authored
      
      
      * add train_dist.
      
      * Fix sampling example.
      
      * use distributed sampler.
      
      * fix a bug in DistTensor.
      
      * fix distributed training example.
      
      * add graph partition.
      
      * add command
      
      * disable pytorch parallel.
      
      * shutdown correctly.
      
      * load diff graphs.
      
      * add ip_config.txt.
      
      * record timing for each step.
      
      * use ogb
      
      * add profiler.
      
      * fix a bug.
      
      * add train_dist.
      
      * Fix sampling example.
      
      * use distributed sampler.
      
      * fix a bug in DistTensor.
      
      * fix distributed training example.
      
      * add graph partition.
      
      * add command
      
      * disable pytorch parallel.
      
      * shutdown correctly.
      
      * load diff graphs.
      
      * add ip_config.txt.
      
      * record timing for each step.
      
      * use ogb
      
      * add profiler.
      
      * add Ips of the cluster.
      
      * fix exit.
      
      * support multiple clients.
      
      * balance node types and edges.
      
      * move code.
      
      * remove run.sh
      
      * Revert "support multiple clients."
      
      * fix.
      
      * update train_sampling.
      
      * fix.
      
      * fix
      
      * remove run.sh
      
      * update readme.
      
      * update readme.
      
      * use pytorch distributed.
      
      * ensure all trainers run the same number of steps.
      
      * Update README.md
      Co-authored-by: Ubuntu <ubuntu@ip-172-31-16-250.us-west-2.compute.internal>