    [Distributed] Pytorch example of distributed GraphSage. (#1495) · 02d31974
    Da Zheng authored
    
    
    * add train_dist.
    
    * Fix sampling example.
    
    * use distributed sampler.
    
    * fix a bug in DistTensor.
    
    * fix distributed training example.
    
    * add graph partition.
    
    * add command
    
    * disable pytorch parallel.
    
    * shutdown correctly.
    
    * load diff graphs.
    
    * add ip_config.txt.
    
    * record timing for each step.
    
    * use ogb
    
    * add profiler.
    
    * fix a bug.
    
    * add IPs of the cluster.
    
    * fix exit.
    
    * support multiple clients.
    
    * balance node types and edges.
    
    * move code.
    
    * remove run.sh
    
    * Revert "support multiple clients."
    
    * fix.
    
    * update train_sampling.
    
    * fix.
    
    * fix
    
    * remove run.sh
    
    * update readme.
    
    * update readme.
    
    * use pytorch distributed.
    
    * ensure all trainers run the same number of steps.
    
    * Update README.md
    Co-authored-by: Ubuntu <ubuntu@ip-172-31-16-250.us-west-2.compute.internal>
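
    The change "ensure all trainers run the same number of steps" addresses a classic distributed-training pitfall: when graph partitions have unequal numbers of seed nodes, the trainer with the smallest shard finishes its epoch early, while the others block forever in the gradient all-reduce. A minimal sketch of the idea (the helper names here are illustrative, not DGL's actual API; in a real job the minimum would be computed with a `torch.distributed.all_reduce` with the `MIN` op rather than locally):

    ```python
    import math

    def steps_per_epoch(num_local_seeds, batch_size):
        """Minibatches this trainer would run on its own shard."""
        return math.ceil(num_local_seeds / batch_size)

    def synchronized_steps(shard_sizes, batch_size):
        """Number of steps every trainer should run per epoch.

        All trainers must execute the same number of steps; otherwise the
        trainer with the smallest shard exits the loop early and the
        collective operations in the remaining trainers deadlock. Taking
        the global minimum truncates each epoch to the smallest shard.
        """
        return min(steps_per_epoch(n, batch_size) for n in shard_sizes)
    ```

    For example, with shards of 1000, 950, and 1024 seed nodes and a batch size of 100, every trainer runs 10 steps per epoch, and the extra batch on the largest shard is dropped.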
train_sampling.py 9.91 KB
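
The `ip_config.txt` added in the commits above tells the distributed launcher which machines make up the cluster. A hypothetical example, with placeholder private IPs and one machine per line (the exact format, including whether a port or server-count column is required, varies across DGL versions, so check the version you are running):

```
172.31.0.1 30050
172.31.0.2 30050
172.31.0.3 30050
172.31.0.4 30050
```

Each trainer and server process reads this file to discover its peers, so it must be identical on every machine in the cluster.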