[DEMO] Update demo of distributed sampler (#564)

* update
* update
* update demo

Unverified commit 9aa5ffca, authored May 26, 2019 by Chao Ma, committed by GitHub May 26, 2019
3 changed files
--- a/examples/mxnet/sampling/README.md
+++ b/examples/mxnet/sampling/README.md
@@ -15,44 +15,44 @@ pip install mxnet --pre
 ### Neighbor Sampling & Skip Connection
 cora: test accuracy ~83% with `--num-neighbors 2`, ~84% by training on the full graph
 ```
-DGLBACKEND=mxnet python3 examples/mxnet/sampling/train.py --model gcn_ns --dataset cora --self-loop --num-neighbors 2 --batch-size 1000 --test-batch-size 5000
+DGLBACKEND=mxnet python3 train.py --model gcn_ns --dataset cora --self-loop --num-neighbors 2 --batch-size 1000 --test-batch-size 5000
 ```

 citeseer: test accuracy ~69% with `--num-neighbors 2`, ~70% by training on the full graph
 ```
-DGLBACKEND=mxnet python3 examples/mxnet/sampling/train.py --model gcn_ns --dataset citeseer --self-loop --num-neighbors 2 --batch-size 1000 --test-batch-size 5000
+DGLBACKEND=mxnet python3 train.py --model gcn_ns --dataset citeseer --self-loop --num-neighbors 2 --batch-size 1000 --test-batch-size 5000
 ```

 pubmed: test accuracy ~78% with `--num-neighbors 3`, ~77% by training on the full graph
 ```
-DGLBACKEND=mxnet python3 examples/mxnet/sampling/train.py --model gcn_ns --dataset pubmed --self-loop --num-neighbors 3 --batch-size 1000 --test-batch-size 5000
+DGLBACKEND=mxnet python3 train.py --model gcn_ns --dataset pubmed --self-loop --num-neighbors 3 --batch-size 1000 --test-batch-size 5000
 ```

 reddit: test accuracy ~91% with `--num-neighbors 3` and `--batch-size 1000`, ~93% by training on the full graph
 ```
-DGLBACKEND=mxnet python3 examples/mxnet/sampling/train.py --model gcn_ns --dataset reddit-self-loop --num-neighbors 3 --batch-size 1000 --test-batch-size 5000 --n-hidden 64
+DGLBACKEND=mxnet python3 train.py --model gcn_ns --dataset reddit-self-loop --num-neighbors 3 --batch-size 1000 --test-batch-size 5000 --n-hidden 64
 ```


 ### Control Variate & Skip Connection
 cora: test accuracy ~84% with `--num-neighbors 1`, ~84% by training on the full graph
 ```
-DGLBACKEND=mxnet python3 examples/mxnet/sampling/train.py --model gcn_cv --dataset cora --self-loop --num-neighbors 1 --batch-size 1000000 --test-batch-size 1000000
+DGLBACKEND=mxnet python3 train.py --model gcn_cv --dataset cora --self-loop --num-neighbors 1 --batch-size 1000000 --test-batch-size 1000000
 ```

 citeseer: test accuracy ~69% with `--num-neighbors 1`, ~70% by training on the full graph
 ```
-DGLBACKEND=mxnet python3 examples/mxnet/sampling/train.py --model gcn_cv --dataset citeseer --self-loop --num-neighbors 1 --batch-size 1000000 --test-batch-size 1000000
+DGLBACKEND=mxnet python3 train.py --model gcn_cv --dataset citeseer --self-loop --num-neighbors 1 --batch-size 1000000 --test-batch-size 1000000
 ```

 pubmed: test accuracy ~79% with `--num-neighbors 1`, ~77% by training on the full graph
 ```
-DGLBACKEND=mxnet python3 examples/mxnet/sampling/train.py --model gcn_cv --dataset pubmed --self-loop --num-neighbors 1 --batch-size 1000000 --test-batch-size 1000000
+DGLBACKEND=mxnet python3 train.py --model gcn_cv --dataset pubmed --self-loop --num-neighbors 1 --batch-size 1000000 --test-batch-size 1000000
 ```

 reddit: test accuracy ~93% with `--num-neighbors 1` and `--batch-size 1000`, ~93% by training on the full graph
 ```
-DGLBACKEND=mxnet python3 examples/mxnet/sampling/train.py --model gcn_cv --dataset reddit-self-loop --num-neighbors 1 --batch-size 10000 --test-batch-size 5000 --n-hidden 64
+DGLBACKEND=mxnet python3 train.py --model gcn_cv --dataset reddit-self-loop --num-neighbors 1 --batch-size 10000 --test-batch-size 5000 --n-hidden 64
 ```

 ### Control Variate & GraphSAGE-mean
@@ -61,7 +61,7 @@ Following [Control Variate](https://arxiv.org/abs/1710.10568), we use the mean p

 reddit: test accuracy 96.1% with `--num-neighbors 1` and `--batch-size 1000`, ~96.2% in [Control Variate](https://arxiv.org/abs/1710.10568) with `--num-neighbors 2` and `--batch-size 1000`
 ```
-DGLBACKEND=mxnet python3 examples/mxnet/sampling/train.py --model graphsage_cv --batch-size 1000 --test-batch-size 5000 --n-epochs 50 --dataset reddit --num-neighbors 1 --n-hidden 128 --dropout 0.2 --weight-decay 0
+DGLBACKEND=mxnet python3 train.py --model graphsage_cv --batch-size 1000 --test-batch-size 5000 --n-epochs 50 --dataset reddit --num-neighbors 1 --n-hidden 128 --dropout 0.2 --weight-decay 0
 ```

 ### Run multi-processing training
@@ -73,5 +73,5 @@ python3 examples/mxnet/sampling/run_store_server.py --dataset reddit --num-worke

 Run four workers to train GraphSage on the reddit dataset.
 ```
-python3 ../incubator-mxnet/tools/launch.py -n 4 -s 1 --launcher local python3 examples/mxnet/sampling/multi_process_train.py --model graphsage_cv --batch-size 1000 --test-batch-size 5000 --n-epochs 1 --graph-name reddit --num-neighbors 1 --n-hidden 128 --dropout 0.2 --weight-decay 0
+python3 ../incubator-mxnet/tools/launch.py -n 4 -s 1 --launcher local python3 multi_process_train.py --model graphsage_cv --batch-size 1000 --test-batch-size 5000 --n-epochs 1 --graph-name reddit --num-neighbors 1 --n-hidden 128 --dropout 0.2 --weight-decay 0
 ```
--- a/examples/mxnet/sampling/dis_sampling/README.md
+++ b/examples/mxnet/sampling/dis_sampling/README.md
@@ -13,6 +13,10 @@
 pip install mxnet --pre
 ```

+### Usage
+
+To run the following demos, start the trainer and sampler processes; to run them on different machines, change the `--ip` option. Use the `--num-sampler` option to change the number of samplers.
+
 ### Neighbor Sampling & Skip Connection

 #### cora
@@ -21,12 +25,12 @@ Test accuracy ~83% with `--num-neighbors 2`, ~84% by training on the full graph

 Trainer side:
 ```
-DGLBACKEND=mxnet python3 examples/mxnet/sampling/dis_sampling/train.py --model gcn_ns --dataset cora --self-loop --num-neighbors 2 --batch-size 1000 --test-batch-size 5000 --ip 127.0.0.1:50051 --num-sampler 1
+DGLBACKEND=mxnet python3 train.py --model gcn_ns --dataset cora --self-loop --num-neighbors 2 --batch-size 1000 --test-batch-size 5000 --ip 127.0.0.1:50051 --num-sampler 1
 ```

 Sampler side:
 ```
-DGLBACKEND=mxnet python3 examples/mxnet/sampling/dis_sampling/sampler.py --model gcn_ns --dataset cora --self-loop --num-neighbors 2 --batch-size 1000 --ip 127.0.0.1:50051
+DGLBACKEND=mxnet python3 sampler.py --model gcn_ns --dataset cora --self-loop --num-neighbors 2 --batch-size 1000 --ip 127.0.0.1:50051 --num-sampler 1
 ```

 #### citeseer 
@@ -35,12 +39,12 @@ Test accuracy ~69% with `--num-neighbors 2`, ~70% by training on the full graph

 Trainer side:
 ```
-DGLBACKEND=mxnet python3 examples/mxnet/sampling/dis_sampling/train.py --model gcn_ns --dataset citeseer --self-loop --num-neighbors 2 --batch-size 1000 --test-batch-size 5000 --ip 127.0.0.1:50051 --num-sampler 1
+DGLBACKEND=mxnet python3 train.py --model gcn_ns --dataset citeseer --self-loop --num-neighbors 2 --batch-size 1000 --test-batch-size 5000 --ip 127.0.0.1:50051 --num-sampler 1
 ```

 Sampler side:
 ```
-DGLBACKEND=mxnet python3 examples/mxnet/sampling/dis_sampling/sampler.py --model gcn_ns --dataset citeseer --self-loop --num-neighbors 2 --batch-size 1000 --ip 127.0.0.1:50051
+DGLBACKEND=mxnet python3 sampler.py --model gcn_ns --dataset citeseer --self-loop --num-neighbors 2 --batch-size 1000 --ip 127.0.0.1:50051 --num-sampler 1
 ```

 #### pubmed
@@ -49,12 +53,12 @@ Test accuracy ~78% with `--num-neighbors 3`, ~77% by training on the full graph

 Trainer side:
 ```
-DGLBACKEND=mxnet python3 examples/mxnet/sampling/dis_sampling/train.py --model gcn_ns --dataset pubmed --self-loop --num-neighbors 3 --batch-size 1000 --test-batch-size 5000 --ip 127.0.0.1:50051 --num-sampler 1
+DGLBACKEND=mxnet python3 train.py --model gcn_ns --dataset pubmed --self-loop --num-neighbors 3 --batch-size 1000 --test-batch-size 5000 --ip 127.0.0.1:50051 --num-sampler 1
 ```

 Sampler side:
 ```
-DGLBACKEND=mxnet python3 examples/mxnet/sampling/dis_sampling/sampler.py --model gcn_ns --dataset pubmed --self-loop --num-neighbors 3 --batch-size 1000 --ip 127.0.0.1:50051
+DGLBACKEND=mxnet python3 sampler.py --model gcn_ns --dataset pubmed --self-loop --num-neighbors 3 --batch-size 1000 --ip 127.0.0.1:50051 --num-sampler 1
 ```

 #### reddit
@@ -63,12 +67,12 @@ Test accuracy ~91% with `--num-neighbors 2` and `--batch-size 1000`, ~93% by tra

 Trainer side:
 ```
-DGLBACKEND=mxnet python3 examples/mxnet/sampling/dis_sampling/train.py --model gcn_ns --dataset reddit-self-loop --num-neighbors 2 --batch-size 1000 --test-batch-size 5000 --n-hidden 64 --ip 127.0.0.1:2049 --num-sampler 1
+DGLBACKEND=mxnet python3 train.py --model gcn_ns --dataset reddit-self-loop --num-neighbors 2 --batch-size 1000 --test-batch-size 5000 --n-hidden 64 --ip 127.0.0.1:2049 --num-sampler 1
 ```

 Sampler side:
 ```
-DGLBACKEND=mxnet python3 examples/mxnet/sampling/dis_sampling/sampler.py --model gcn_ns --dataset reddit-self-loop --num-neighbors 2 --batch-size 1000 --ip 127.0.0.1:2049
+DGLBACKEND=mxnet python3 sampler.py --model gcn_ns --dataset reddit-self-loop --num-neighbors 2 --batch-size 1000 --ip 127.0.0.1:2049 --num-sampler 1
 ```

 ### Control Variate & Skip Connection
@@ -79,12 +83,12 @@ Test accuracy ~84% with `--num-neighbors 1`, ~84% by training on the full graph

 Trainer side:
 ```
-DGLBACKEND=mxnet python3 examples/mxnet/sampling/dis_sampling/train.py --model gcn_cv --dataset cora --self-loop --num-neighbors 1 --batch-size 1000000 --test-batch-size 1000000 --ip 127.0.0.1:50051 --num-sampler 1
+DGLBACKEND=mxnet python3 train.py --model gcn_cv --dataset cora --self-loop --num-neighbors 1 --batch-size 1000000 --test-batch-size 1000000 --ip 127.0.0.1:50051 --num-sampler 1
 ```

 Sampler side:
 ```
-DGLBACKEND=mxnet python3 examples/mxnet/sampling/dis_sampling/sampler.py --model gcn_cv --dataset cora --self-loop --num-neighbors 1 --batch-size 1000000 --ip 127.0.0.1:50051
+DGLBACKEND=mxnet python3 sampler.py --model gcn_cv --dataset cora --self-loop --num-neighbors 1 --batch-size 1000000 --ip 127.0.0.1:50051 --num-sampler 1
 ```

 #### citeseer
@@ -93,24 +97,24 @@ Test accuracy ~69% with `--num-neighbors 1`, ~70% by training on the full graph

 Trainer Side:
 ```
-DGLBACKEND=mxnet python3 examples/mxnet/sampling/dis_sampling/train.py --model gcn_cv --dataset citeseer --self-loop --num-neighbors 1 --batch-size 1000000 --test-batch-size 1000000 --ip 127.0.0.1:50051 --num-sampler 1
+DGLBACKEND=mxnet python3 train.py --model gcn_cv --dataset citeseer --self-loop --num-neighbors 1 --batch-size 1000000 --test-batch-size 1000000 --ip 127.0.0.1:50051 --num-sampler 1
 ```

 Sampler Side:
 ```
-DGLBACKEND=mxnet python3 examples/mxnet/sampling/dis_sampling/sampler.py --model gcn_cv --dataset citeseer --self-loop --num-neighbors 1 --batch-size 1000000 --ip 127.0.0.1:50051
+DGLBACKEND=mxnet python3 sampler.py --model gcn_cv --dataset citeseer --self-loop --num-neighbors 1 --batch-size 1000000 --ip 127.0.0.1:50051 --num-sampler 1
 ```

 #### pubmed

 Trainer Side:
 ```
-DGLBACKEND=mxnet python3 examples/mxnet/sampling/dis_sampling/train.py --model gcn_cv --dataset pubmed --self-loop --num-neighbors 1 --batch-size 1000000 --test-batch-size 1000000 --ip 127.0.0.1:50051 --num-sampler 1
+DGLBACKEND=mxnet python3 train.py --model gcn_cv --dataset pubmed --self-loop --num-neighbors 1 --batch-size 1000000 --test-batch-size 1000000 --ip 127.0.0.1:50051 --num-sampler 1
 ```

 Sampler Side:
 ```
-DGLBACKEND=mxnet python3 examples/mxnet/sampling/dis_sampling/sampler.py --model gcn_cv --dataset pubmed --self-loop --num-neighbors 1 --batch-size 1000000 --ip 127.0.0.1:50051
+DGLBACKEND=mxnet python3 sampler.py --model gcn_cv --dataset pubmed --self-loop --num-neighbors 1 --batch-size 1000000 --ip 127.0.0.1:50051 --num-sampler 1
 ```

 #### reddit
@@ -119,12 +123,12 @@ Test accuracy ~93% with `--num-neighbors 1` and `--batch-size 1000`, ~93% by tra

 Trainer Side:
 ```
-DGLBACKEND=mxnet python3 examples/mxnet/sampling/dis_sampling/train.py --model gcn_cv --dataset reddit-self-loop --num-neighbors 1 --batch-size 10000 --test-batch-size 5000 --n-hidden 64 --ip 127.0.0.1:50051 --num-sampler 1
+DGLBACKEND=mxnet python3 train.py --model gcn_cv --dataset reddit-self-loop --num-neighbors 1 --batch-size 10000 --test-batch-size 5000 --n-hidden 64 --ip 127.0.0.1:50051 --num-sampler 1
 ```

 Sampler Side:
 ```
-DGLBACKEND=mxnet python3 examples/mxnet/sampling/dis_sampling/sampler.py --model gcn_cv --dataset reddit-self-loop --num-neighbors 1 --batch-size 10000 --ip 127.0.0.1:50051
+DGLBACKEND=mxnet python3 sampler.py --model gcn_cv --dataset reddit-self-loop --num-neighbors 1 --batch-size 10000 --ip 127.0.0.1:50051 --num-sampler 1
 ```

 ### Control Variate & GraphSAGE-mean
@@ -137,10 +141,10 @@ Test accuracy 96.1% with `--num-neighbors 1` and `--batch-size 1000`, ~96.2% in

 Trainer side:
 ```
-DGLBACKEND=mxnet python3 examples/mxnet/sampling/dis_sampling/train.py --model graphsage_cv --batch-size 1000 --test-batch-size 5000 --n-epochs 50 --dataset reddit --num-neighbors 1 --n-hidden 128 --dropout 0.2 --weight-decay 0 --ip 127.0.0.1:50051 --num-sampler 1
+DGLBACKEND=mxnet python3 train.py --model graphsage_cv --batch-size 1000 --test-batch-size 5000 --n-epochs 50 --dataset reddit --num-neighbors 1 --n-hidden 128 --dropout 0.2 --weight-decay 0 --ip 127.0.0.1:50051 --num-sampler 1
 ```

 Sampler side:
 ```
-DGLBACKEND=mxnet python3 examples/mxnet/sampling/dis_sampling/sampler.py --model graphsage_cv --batch-size 1000 --dataset reddit --num-neighbors 1 --ip 127.0.0.1:50051
+DGLBACKEND=mxnet python3 sampler.py --model graphsage_cv --batch-size 1000 --dataset reddit --num-neighbors 1 --ip 127.0.0.1:50051 --num-sampler 1
 ```
--- a/examples/pytorch/sampling/dis_sampling/README.md
+++ b/examples/pytorch/sampling/dis_sampling/README.md
@@ -13,6 +13,10 @@ Dependencies
 pip install torch requests
 ```

+### Usage
+
+To run the following demos, start the trainer and sampler processes; to run them on different machines, change the `--ip` option. Use the `--num-sampler` option to change the number of samplers.
+
 ### Neighbor Sampling & Skip Connection

 #### cora
@@ -26,7 +30,7 @@ DGLBACKEND=pytorch python3 gcn_ns_sc_train.py --dataset cora --self-loop --num-n

 Sampler side:
 ```
-DGLBACKEND=pytorch python3 sampler.py --model gcn_ns --dataset cora --self-loop --num-neighbors 2 --batch-size 1000000 --ip 127.0.0.1:50051
+DGLBACKEND=pytorch python3 sampler.py --model gcn_ns --dataset cora --self-loop --num-neighbors 2 --batch-size 1000000 --ip 127.0.0.1:50051 --num-sampler 1
 ```

 #### citeseer 
@@ -40,7 +44,7 @@ DGLBACKEND=pytorch python3 gcn_ns_sc_train.py --dataset citeseer --self-loop --n

 Sampler side:
 ```
-DGLBACKEND=pytorch python3 sampler.py --model gcn_ns --dataset citeseer --self-loop --num-neighbors 2 --batch-size 1000000 --ip 127.0.0.1:50051
+DGLBACKEND=pytorch python3 sampler.py --model gcn_ns --dataset citeseer --self-loop --num-neighbors 2 --batch-size 1000000 --ip 127.0.0.1:50051 --num-sampler 1
 ```

 #### pubmed 
@@ -54,7 +58,7 @@ DGLBACKEND=pytorch python3 gcn_ns_sc_train.py --dataset pubmed --self-loop --num

 Sampler side:
 ```
-DGLBACKEND=pytorch python3 sampler.py --model gcn_ns --dataset pubmed --self-loop --num-neighbors 3 --batch-size 1000000 --ip 127.0.0.1:50051
+DGLBACKEND=pytorch python3 sampler.py --model gcn_ns --dataset pubmed --self-loop --num-neighbors 3 --batch-size 1000000 --ip 127.0.0.1:50051 --num-sampler 1
 ```

 ### Control Variate & Skip Connection
@@ -70,7 +74,7 @@ DGLBACKEND=pytorch python3 gcn_cv_sc_train.py --dataset cora --self-loop --num-n

 Sampler side:
 ```
-DGLBACKEND=pytorch python3 sampler.py --model gcn_cv --dataset cora --self-loop --num-neighbors 1 --batch-size 1000000 --ip 127.0.0.1:50051
+DGLBACKEND=pytorch python3 sampler.py --model gcn_cv --dataset cora --self-loop --num-neighbors 1 --batch-size 1000000 --ip 127.0.0.1:50051 --num-sampler 1
 ```

 #### citeseer
@@ -84,7 +88,7 @@ DGLBACKEND=pytorch python3 gcn_cv_sc_train.py --dataset citeseer --self-loop --n

 Sampler side:
 ```
-DGLBACKEND=pytorch python3 sampler.py --model gcn_cv --dataset citeseer --self-loop --num-neighbors 1 --batch-size 1000000 --ip 127.0.0.1:50051
+DGLBACKEND=pytorch python3 sampler.py --model gcn_cv --dataset citeseer --self-loop --num-neighbors 1 --batch-size 1000000 --ip 127.0.0.1:50051 --num-sampler 1
 ```

 #### pubmed
@@ -98,6 +102,5 @@ DGLBACKEND=pytorch python3 gcn_cv_sc_train.py --dataset pubmed --self-loop --num

 Sampler side:
 ```
-DGLBACKEND=pytorch python3 sampler.py --model gcn_cv --dataset pubmed --self-loop --num-neighbors 1 --batch-size 1000000 --ip 127.0.0.1:50051
+DGLBACKEND=pytorch python3 sampler.py --model gcn_cv --dataset pubmed --self-loop --num-neighbors 1 --batch-size 1000000 --ip 127.0.0.1:50051 --num-sampler 1
 ```
-
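
The updated READMEs pair every trainer command with a sampler command that carries the same `--ip` and `--num-sampler` values. As a rough illustration of keeping the two in sync, the dry-run sketch below only prints a matching pair of commands and launches nothing. The `build_cmds` helper is hypothetical and not part of the DGL examples. It also assumes the scripts accept their flags in any order and that the trainer takes `--test-batch-size 5000`, as in the cora commands in the diff.

```shell
#!/usr/bin/env bash
# Dry-run sketch: print a trainer command and a sampler command that share
# one address and one sampler count. Nothing is executed.
set -euo pipefail

# Hypothetical helper (not part of the DGL examples).
# Usage: build_cmds BACKEND ADDR NUM_SAMPLER [flags common to both sides...]
build_cmds() {
    local backend="$1" addr="$2" num_sampler="$3"
    shift 3
    # Flags passed to both the trainer and the sampler.
    local common="$* --ip ${addr} --num-sampler ${num_sampler}"
    # Assumption: only the trainer takes --test-batch-size, as in the READMEs.
    echo "DGLBACKEND=${backend} python3 train.py ${common} --test-batch-size 5000"
    echo "DGLBACKEND=${backend} python3 sampler.py ${common}"
}

# Example: the MXNet cora neighbor-sampling demo with one sampler.
build_cmds mxnet 127.0.0.1:50051 1 \
    --model gcn_ns --dataset cora --self-loop --num-neighbors 2 --batch-size 1000
```

To target a trainer on another machine, replace `127.0.0.1:50051` with that machine's address. Each printed line would then be run in its own terminal from the example directory.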