Unverified Commit 56ffb650 authored by peizhou001's avatar peizhou001 Committed by GitHub

[API Deprecation]Deprecate contrib module (#5114)

parent 436de3d1
# DGL - Knowledge Graph Embedding
**Note: DGL-KE is moved to [here](https://github.com/awslabs/dgl-ke). DGL-KE in this folder is deprecated.**
## Introduction
DGL-KE is a DGL-based package for computing node embeddings and relation embeddings of
knowledge graphs efficiently. This package is adapted from
[KnowledgeGraphEmbedding](https://github.com/DeepGraphLearning/KnowledgeGraphEmbedding).
We enable fast and scalable training of knowledge graph embedding,
while still keeping the package as extensible as
[KnowledgeGraphEmbedding](https://github.com/DeepGraphLearning/KnowledgeGraphEmbedding).
On a single machine, training takes only a few minutes for medium-sized knowledge graphs,
such as FB15k and wn18, and a couple of hours on Freebase, which has hundreds of millions of edges.
DGL-KE includes the following knowledge graph embedding models:
- TransE (TransE_l1 with L1 distance and TransE_l2 with L2 distance)
- DistMult
- ComplEx
- RESCAL
- TransR
- RotatE
More popular models will be added in the future.
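To make the model list concrete, the sketch below (an illustration only, not DGL-KE's internal API) shows how two of these models score a triple (head, rel, tail) from its embeddings; `gamma` is the margin hyper-parameter and the embeddings are random placeholders.
```python
import numpy as np

def transe_l2_score(h, r, t, gamma=12.0):
    # TransE_l2: gamma - ||h + r - t||_2; higher means more plausible.
    return gamma - np.linalg.norm(h + r - t, ord=2)

def distmult_score(h, r, t):
    # DistMult: sum_i h_i * r_i * t_i
    return np.sum(h * r * t)

dim = 4
h, r, t = np.random.rand(dim), np.random.rand(dim), np.random.rand(dim)
print(transe_l2_score(h, r, t), distmult_score(h, r, t))
```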
DGL-KE supports multiple training modes:
- CPU training
- GPU training
- Joint CPU & GPU training
- Multiprocessing training on CPUs
For joint CPU & GPU training, node embeddings are stored on the CPU and mini-batches are trained on the GPU (see the sketch below). This mode is designed for training KGE models on large knowledge graphs.
For multiprocessing training, each process trains mini-batches independently and uses shared memory for communication between processes. This mode is designed for training KGE models on large knowledge graphs with many CPU cores.
We will support multi-GPU training and distributed training in the near future.
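The sketch below illustrates the joint CPU & GPU idea in plain PyTorch (a hypothetical toy example, not DGL-KE's implementation): the embedding table stays in CPU memory, each mini-batch gathers only the rows it needs onto the GPU, and the updated rows are written back.
```python
import torch

num_entities, dim, lr = 100000, 400, 0.1       # toy sizes, not tuned values
emb = torch.randn(num_entities, dim)           # full embedding table lives on the CPU
device = 'cuda' if torch.cuda.is_available() else 'cpu'

def train_step(head_ids, tail_ids):
    # Gather only the rows touched by this mini-batch and move them to the GPU.
    h = emb[head_ids].to(device).requires_grad_()
    t = emb[tail_ids].to(device).requires_grad_()
    loss = ((h - t) ** 2).sum()                # placeholder loss, not a real KGE score
    loss.backward()
    with torch.no_grad():                      # write the updated rows back to the CPU table
        emb[head_ids] -= lr * h.grad.cpu()
        emb[tail_ids] -= lr * t.grad.cpu()

train_step(torch.randint(0, num_entities, (1024,)),
           torch.randint(0, num_entities, (1024,)))
```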
## Requirements
The package runs with both PyTorch and MXNet. For PyTorch, it requires v1.2 or newer.
For MXNet, it requires v1.5 or newer.
## Built-in Datasets
DGL-KE provides five built-in knowledge graphs:
| Dataset | #nodes | #edges | #relations |
|---------|--------|--------|------------|
| [FB15k](https://data.dgl.ai/dataset/FB15k.zip) | 14951 | 592213 | 1345 |
| [FB15k-237](https://data.dgl.ai/dataset/FB15k-237.zip) | 14541 | 310116 | 237 |
| [wn18](https://data.dgl.ai/dataset/wn18.zip) | 40943 | 151442 | 18 |
| [wn18rr](https://data.dgl.ai/dataset/wn18rr.zip) | 40943 | 93003 | 11 |
| [Freebase](https://data.dgl.ai/dataset/Freebase.zip) | 86054151 | 338586276 | 14824 |
Users can specify one of the datasets with `--dataset` in `train.py` and `eval.py`.
## Performance
The 1-GPU speed is measured with 8 CPU cores and one Nvidia V100 GPU (AWS P3.2xlarge).
The 8-GPU speed is measured with 64 CPU cores and eight Nvidia V100 GPUs (AWS P3.16xlarge).
The speed on FB15k (1 GPU)
| Models | TransE_l1 | TransE_l2 | DistMult | ComplEx | RESCAL | TransR | RotatE |
|---------|-----------|-----------|----------|---------|--------|--------|--------|
|MAX_STEPS| 48000 | 32000 | 40000 | 100000 | 32000 | 32000 | 20000 |
|TIME | 370s | 270s | 312s | 282s | 2095s | 1556s | 1861s |
The accuracy on FB15k
| Models | MR | MRR | HITS@1 | HITS@3 | HITS@10 |
|-----------|-------|-------|--------|--------|---------|
| TransE_l1 | 44.18 | 0.675 | 0.551 | 0.774 | 0.861 |
| TransE_l2 | 46.71 | 0.665 | 0.551 | 0.804 | 0.846 |
| DistMult | 61.04 | 0.725 | 0.625 | 0.837 | 0.883 |
| ComplEx | 64.59 | 0.785 | 0.718 | 0.835 | 0.889 |
| RESCAL | 122.3 | 0.669 | 0.598 | 0.711 | 0.793 |
| TransR | 59.86 | 0.676 | 0.591 | 0.735 | 0.814 |
| RotatE | 43.66 | 0.728 | 0.632 | 0.801 | 0.874 |
The speed on FB15k (8 GPU)
| Models | TransE_l1 | TransE_l2 | DistMult | ComplEx | RESCAL | TransR | RotatE |
|---------|-----------|-----------|----------|---------|--------|--------|--------|
|MAX_STEPS| 6000 | 4000 | 5000 | 4000 | 4000 | 4000 | 2500 |
|TIME | 88.93s | 62.99s | 72.74s | 68.37s | 245.9s | 203.9s | 126.7s |
The accuracy on FB15k
| Models | MR | MRR | HITS@1 | HITS@3 | HITS@10 |
|-----------|-------|-------|--------|--------|---------|
| TransE_l1 | 44.25 | 0.672 | 0.547 | 0.774 | 0.860 |
| TransE_l2 | 46.13 | 0.658 | 0.539 | 0.748 | 0.845 |
| DistMult | 61.72 | 0.723 | 0.626 | 0.798 | 0.881 |
| ComplEx | 65.84 | 0.754 | 0.676 | 0.813 | 0.880 |
| RESCAL | 135.6 | 0.652 | 0.580 | 0.693 | 0.779 |
| TransR | 65.27 | 0.676 | 0.591 | 0.736 | 0.811 |
| RotatE | 49.59 | 0.683 | 0.581 | 0.759 | 0.848 |
In comparison, GraphVite uses 4 GPUs and takes 14 minutes. Thus, DGL-KE with 8 GPUs trains TransE on FB15k roughly 9.5x faster than GraphVite. More performance information on GraphVite can be found [here](https://github.com/DeepGraphLearning/graphvite).
The speed on wn18 (1 GPU)
| Models | TransE_l1 | TransE_l2 | DistMult | ComplEx | RESCAL | TransR | RotatE |
|---------|-----------|-----------|----------|---------|--------|--------|--------|
|MAX_STEPS| 32000 | 32000 | 20000 | 20000 | 20000 | 30000 | 24000 |
|TIME | 531.5s | 406.6s | 284.1s | 282.3s | 443.6s | 766.2s | 829.4s |
The accuracy on wn18
| Models | MR | MRR | HITS@1 | HITS@3 | HITS@10 |
|-----------|-------|-------|--------|--------|---------|
| TransE_l1 | 318.4 | 0.764 | 0.602 | 0.929 | 0.949 |
| TransE_l2 | 206.2 | 0.561 | 0.306 | 0.800 | 0.944 |
| DistMult | 486.0 | 0.818 | 0.711 | 0.921 | 0.948 |
| ComplEx | 268.6 | 0.933 | 0.916 | 0.949 | 0.961 |
| RESCAL | 536.6 | 0.848 | 0.790 | 0.900 | 0.927 |
| TransR | 452.4 | 0.620 | 0.461 | 0.758 | 0.856 |
| RotatE | 487.9 | 0.944 | 0.940 | 0.947 | 0.952 |
The speed on wn18 (8 GPU)
| Models | TransE_l1 | TransE_l2 | DistMult | ComplEx | RESCAL | TransR | RotatE |
|---------|-----------|-----------|----------|---------|--------|--------|--------|
|MAX_STEPS| 4000 | 4000 | 2500 | 2500 | 2500 | 2500 | 3000 |
|TIME | 119.3s | 81.1s | 76.0s | 58.0s | 594.1s | 1168s | 139.8s |
The accuracy on wn18
| Models | MR | MRR | HITS@1 | HITS@3 | HITS@10 |
|-----------|-------|-------|--------|--------|---------|
| TransE_l1 | 360.3 | 0.745 | 0.562 | 0.930 | 0.951 |
| TransE_l2 | 193.8 | 0.557 | 0.301 | 0.799 | 0.942 |
| DistMult | 499.9 | 0.807 | 0.692 | 0.917 | 0.945 |
| ComplEx | 476.7 | 0.935 | 0.926 | 0.943 | 0.949 |
| RESCAL | 618.8 | 0.848 | 0.791 | 0.897 | 0.927 |
| TransR | 513.1 | 0.659 | 0.491 | 0.821 | 0.871 |
| RotatE | 466.2 | 0.944 | 0.940 | 0.945 | 0.951 |
The speed on Freebase (8 GPU)
| Models | TransE_l2 | DistMult | ComplEx | TransR | RotatE |
|---------|-----------|----------|---------|--------|--------|
|MAX_STEPS| 320000 | 300000 | 360000 | 300000 | 300000 |
|TIME | 7908s | 7425s | 8946s | 16816s | 12817s |
The accuracy on Freebase (tested with 1000 negative edges sampled for each positive edge).
| Models | MR | MRR | HITS@1 | HITS@3 | HITS@10 |
|-----------|--------|-------|--------|--------|---------|
| TransE_l2 | 22.4 | 0.756 | 0.688 | 0.800 | 0.882 |
| DistMult | 45.4 | 0.833 | 0.812 | 0.843 | 0.872 |
| ComplEx | 48.0 | 0.830 | 0.812 | 0.838 | 0.864 |
| TransR | 51.2 | 0.697 | 0.656 | 0.716 | 0.771 |
| RotatE | 93.3 | 0.770 | 0.749 | 0.780 | 0.805 |
The speed on Freebase (48 CPU)
This is measured with 48 CPU cores on an AWS r5dn.24xlarge instance.
| Models | TransE_l2 | DistMult | ComplEx |
|---------|-----------|----------|---------|
|MAX_STEPS| 50000 | 50000 | 50000 |
|TIME | 7002s | 6340s | 8133s |
The accuracy on Freebase (tested with 1000 negative edges sampled for each positive edge).
| Models | MR | MRR | HITS@1 | HITS@3 | HITS@10 |
|-----------|--------|-------|--------|--------|---------|
| TransE_l2 | 30.8 | 0.814 | 0.764 | 0.848 | 0.902 |
| DistMult | 45.1 | 0.834 | 0.815 | 0.843 | 0.871 |
| ComplEx | 44.9 | 0.837 | 0.819 | 0.845 | 0.870 |
The configuration for reproducing the performance results can be found [here](https://github.com/dmlc/dgl/blob/master/apps/kg/config/best_config.sh).
## Usage
DGL-KE doesn't require installation. The package contains two scripts `train.py` and `eval.py`.
* `train.py` trains knowledge graph embeddings and outputs the trained node embeddings
and relation embeddings.
* `eval.py` reads the pre-trained node embeddings and relation embeddings and evaluates
how accurately the model predicts the tail node given (head, rel, ?) and the head node
given (?, rel, tail).
### Input formats:
DGL-KE supports two knowledge graph input formats for user-defined datasets.
Format 1:
- raw_udd_[h|r|t], raw user-defined dataset. In this format, users only need to provide the triples and let the dataloader generate and manage the id mappings (a small sketch of this mapping step follows the format list below). The dataloader generates two files: entities.tsv for the entity id mapping and relations.tsv for the relation id mapping. The order of head, relation and tail entities is described by [h|r|t]; for example, raw_udd_trh means the triples are stored in the order of tail, relation and head. The dataset should contain three files:
  - *train* stores the triples in the training set. Each line is a triple, e.g., [src_name, rel_name, dst_name], following the order specified in [h|r|t].
  - *valid* stores the triples in the validation set. Each line is a triple, e.g., [src_name, rel_name, dst_name], following the order specified in [h|r|t].
  - *test* stores the triples in the test set. Each line is a triple, e.g., [src_name, rel_name, dst_name], following the order specified in [h|r|t].
Format 2:
- udd_[h|r|t], user-defined dataset. In this format, users should provide the id mappings for entities and relations. The order of head, relation and tail entities is described by [h|r|t]; for example, udd_trh means the triples are stored in the order of tail, relation and head. The dataset should contain five files:
  - *entities* stores the mapping between entity name and entity Id.
  - *relations* stores the mapping between relation name and relation Id.
  - *train* stores the triples in the training set. Each line is a triple, e.g., [src_id, rel_id, dst_id], following the order specified in [h|r|t].
  - *valid* stores the triples in the validation set. Each line is a triple, e.g., [src_id, rel_id, dst_id], following the order specified in [h|r|t].
  - *test* stores the triples in the test set. Each line is a triple, e.g., [src_id, rel_id, dst_id], following the order specified in [h|r|t].
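For the raw_udd format, the training file is just one tab-separated name triple per line; the sketch below (with made-up triples, ids assigned in order of first appearance) mimics how the dataloader turns such a file into entities.tsv and relations.tsv.
```python
# Hypothetical raw_udd_hrt triples: head<TAB>relation<TAB>tail, one per line.
raw_triples = [("Paris", "capital_of", "France"),
               ("Berlin", "capital_of", "Germany")]

entity2id, relation2id = {}, {}
def get_id(mapping, key):
    # Assign a new id the first time a name is seen.
    return mapping.setdefault(key, len(mapping))

for h, r, t in raw_triples:
    get_id(entity2id, h); get_id(relation2id, r); get_id(entity2id, t)

# The dataloader writes the generated mappings as "name<TAB>id" lines.
with open("entities.tsv", "w") as f:
    f.writelines("{}\t{}\n".format(name, i) for name, i in entity2id.items())
with open("relations.tsv", "w") as f:
    f.writelines("{}\t{}\n".format(name, i) for name, i in relation2id.items())
```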
### Output formats:
To save the trained embeddings, users have to provide the path with `--save_emb` when running
`train.py`. The saved embeddings are stored as numpy ndarrays.
* The node embedding is saved as `XXX_YYY_entity.npy`.
* The relation embedding is saved as `XXX_YYY_relation.npy`.
`XXX` is the dataset name and `YYY` is the model name.
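For example, if training saved embeddings under `--save_emb DistMult_FB15k_emb`, they can be loaded back with NumPy for quick link prediction. The file names below are an assumption based on the naming pattern described here.
```python
import numpy as np

entity_emb = np.load('DistMult_FB15k_emb/FB15k_DistMult_entity.npy')      # (n_entities, dim)
relation_emb = np.load('DistMult_FB15k_emb/FB15k_DistMult_relation.npy')  # (n_relations, dim)

def top_k_tails(head_id, rel_id, k=10):
    # DistMult scores (head, rel, t) for every candidate tail t in one matrix product.
    scores = entity_emb @ (entity_emb[head_id] * relation_emb[rel_id])
    return np.argsort(-scores)[:k]   # entity ids of the k highest-scoring tails

print(top_k_tails(head_id=0, rel_id=0))
```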
### Command line parameters
Here are some examples of using the training script.
Train KGE models with GPU.
```bash
python3 train.py --model DistMult --dataset FB15k --batch_size 1024 --neg_sample_size 256 \
--hidden_dim 400 --gamma 143.0 --lr 0.08 --batch_size_eval 16 --valid --test -adv \
--gpu 0 --max_step 40000
```
Train KGE models with mixed multiple GPUs.
```bash
python3 train.py --model DistMult --dataset FB15k --batch_size 1024 --neg_sample_size 256 \
--hidden_dim 400 --gamma 143.0 --lr 0.08 --batch_size_eval 16 --valid --test -adv \
--max_step 5000 --mix_cpu_gpu --num_proc 8 --gpu 0 1 2 3 4 5 6 7 --async_update \
--soft_rel_part --force_sync_interval 1000
```
Train embeddings and verify it later.
```bash
python3 train.py --model DistMult --dataset FB15k --batch_size 1024 --neg_sample_size 256 \
--hidden_dim 400 --gamma 143.0 --lr 0.08 --batch_size_eval 16 --valid --test -adv \
--gpu 0 --max_step 40000 --save_emb DistMult_FB15k_emb
python3 eval.py --model_name DistMult --dataset FB15k --hidden_dim 400 \
--gamma 143.0 --batch_size 16 --gpu 0 --model_path DistMult_FB15k_emb/
```
Train embeddings with multi-processing. This currently doesn't work in MXNet.
```bash
python3 train.py --model TransE_l2 --dataset Freebase --batch_size 1000 \
--neg_sample_size 200 --hidden_dim 400 --gamma 10 --lr 0.1 --max_step 50000 \
--log_interval 100 --batch_size_eval 1000 --neg_sample_size_eval 1000 --test \
-adv --regularization_coef 1e-9 --num_thread 1 --num_proc 48
```
# To reproduce the results reported in the README, run the models with the following commands:
# for FB15k
# DistMult 1GPU
DGLBACKEND=pytorch python3 train.py --model DistMult --dataset FB15k --batch_size 1024 \
--neg_sample_size 256 --hidden_dim 400 --gamma 143.0 --lr 0.08 --batch_size_eval 16 \
--valid --test -adv --gpu 0 --max_step 40000
# DistMult 8GPU
DGLBACKEND=pytorch python3 train.py --model DistMult --dataset FB15k --batch_size 1024 \
--neg_sample_size 256 --hidden_dim 400 --gamma 143.0 --lr 0.08 --batch_size_eval 16 \
--valid --test -adv --max_step 5000 --mix_cpu_gpu --num_proc 8 \
--gpu 0 1 2 3 4 5 6 7 --async_update --soft_rel_part --force_sync_interval 1000
# ComplEx 1GPU
DGLBACKEND=pytorch python3 train.py --model ComplEx --dataset FB15k --batch_size 1024 \
--neg_sample_size 1024 --hidden_dim 400 --gamma 143.0 --lr 0.1 \
--regularization_coef 2.00E-06 --batch_size_eval 16 --valid --test -adv --gpu 0 \
--max_step 32000
# ComplEx 8GPU
DGLBACKEND=pytorch python3 train.py --model ComplEx --dataset FB15k --batch_size 1024 \
--neg_sample_size 1024 --hidden_dim 400 --gamma 143.0 --lr 0.1 \
--regularization_coef 2.00E-06 --batch_size_eval 16 --valid --test -adv \
--max_step 4000 --mix_cpu_gpu --num_proc 8 --gpu 0 1 2 3 4 5 6 7 --async_update \
--soft_rel_part --force_sync_interval 1000
# TransE_l1 1GPU
DGLBACKEND=pytorch python3 train.py --model TransE_l1 --dataset FB15k --batch_size 1024 \
--neg_sample_size 64 --regularization_coef 1e-07 --hidden_dim 400 --gamma 16.0 \
--lr 0.01 --batch_size_eval 16 --valid --test -adv --gpu 0 --max_step 48000
# TransE_l1 8GPU
DGLBACKEND=pytorch python3 train.py --model TransE_l1 --dataset FB15k --batch_size 1024 \
--neg_sample_size 64 --regularization_coef 1e-07 --hidden_dim 400 --gamma 16.0 \
--lr 0.01 --batch_size_eval 16 --valid --test -adv --max_step 6000 --mix_cpu_gpu \
--num_proc 8 --gpu 0 1 2 3 4 5 6 7 --async_update --soft_rel_part \
--force_sync_interval 1000
# TransE_l2 1GPU
DGLBACKEND=pytorch python3 train.py --model TransE_l2 --dataset FB15k --batch_size 1024 \
--neg_sample_size 256 --regularization_coef=1e-9 --hidden_dim 400 --gamma 19.9 \
--lr 0.25 --batch_size_eval 16 --valid --test -adv --gpu 0 --max_step 32000
# TransE_l2 8GPU
DGLBACKEND=pytorch python3 train.py --model TransE_l2 --dataset FB15k --batch_size 1024 \
--neg_sample_size 256 --regularization_coef=1e-9 --hidden_dim 400 --gamma 19.9 \
--lr 0.25 --batch_size_eval 16 --valid --test -adv --gpu 0 --max_step 4000 \
--mix_cpu_gpu --num_proc 8 --gpu 0 1 2 3 4 5 6 7 --async_update --soft_rel_part \
--force_sync_interval 1000
# RESCAL 1GPU
DGLBACKEND=pytorch python3 train.py --model RESCAL --dataset FB15k --batch_size 1024 \
--neg_sample_size 256 --hidden_dim 500 --gamma 24.0 --lr 0.03 --batch_size_eval 16 \
--gpu 0 --valid --test -adv --max_step 30000
# RESCAL 8GPU
DGLBACKEND=pytorch python3 train.py --model RESCAL --dataset FB15k --batch_size 1024 \
--neg_sample_size 256 --hidden_dim 500 --gamma 24.0 --lr 0.03 --batch_size_eval 16 \
--valid --test -adv --max_step 4000 --mix_cpu_gpu --num_proc 8 \
--gpu 0 1 2 3 4 5 6 7 --async_update --soft_rel_part --force_sync_interval 1000
# TransR 1GPU
DGLBACKEND=pytorch python3 train.py --model TransR --dataset FB15k --batch_size 1024 \
--neg_sample_size 256 --regularization_coef 5e-8 --hidden_dim 200 --gamma 8.0 \
--lr 0.015 --batch_size_eval 16 --valid --test -adv --gpu 0 --max_step 32000
# TransR 8GPU
DGLBACKEND=pytorch python3 train.py --model TransR --dataset FB15k --batch_size 1024 \
--neg_sample_size 256 --regularization_coef 5e-8 --hidden_dim 200 --gamma 8.0 \
--lr 0.015 --batch_size_eval 16 --valid --test -adv --max_step 4000 --mix_cpu_gpu \
--num_proc 8 --gpu 0 1 2 3 4 5 6 7 --async_update --soft_rel_part \
--force_sync_interval 1000
# RotatE 1GPU
DGLBACKEND=pytorch python3 train.py --model RotatE --dataset FB15k --batch_size 2048 \
--neg_sample_size 256 --regularization_coef 1e-07 --hidden_dim 200 --gamma 12.0 \
--lr 0.009 --batch_size_eval 16 --valid --test -adv -de --max_step 20000 \
--neg_deg_sample --gpu 0
# RotatE 8GPU
DGLBACKEND=pytorch python3 train.py --model RotatE --dataset FB15k --batch_size 1024 \
--neg_sample_size 256 --regularization_coef 1e-07 --hidden_dim 200 --gamma 12.0 \
--lr 0.009 --batch_size_eval 16 --valid --test -adv -de --max_step 2500 \
--neg_deg_sample --mix_cpu_gpu --num_proc 8 --gpu 0 1 2 3 4 5 6 7 --async_update \
--soft_rel_part --force_sync_interval 1000
# for wn18
# DistMult 1GPU
DGLBACKEND=pytorch python3 train.py --model DistMult --dataset wn18 --batch_size 2048 \
--neg_sample_size 128 --regularization_coef 1e-06 --hidden_dim 512 --gamma 20.0 \
--lr 0.14 --batch_size_eval 16 --valid --test -adv --gpu 0 --max_step 20000
# DistMult 8GPU
DGLBACKEND=pytorch python3 train.py --model DistMult --dataset wn18 --batch_size 2048 \
--neg_sample_size 128 --regularization_coef 1e-06 --hidden_dim 512 --gamma 20.0 \
--lr 0.14 --batch_size_eval 16 --valid --test -adv --gpu 0 --max_step 2500 \
--mix_cpu_gpu --num_proc 8 --gpu 0 1 2 3 4 5 6 7 --async_update \
--force_sync_interval 1000
# ComplEx 1GPU
DGLBACKEND=pytorch python3 train.py --model ComplEx --dataset wn18 --batch_size 1024 \
--neg_sample_size 1024 --regularization_coef 0.00001 --hidden_dim 512 --gamma 200.0 \
--lr 0.1 --batch_size_eval 16 --valid --test -adv --gpu 0 --max_step 20000
# ComplEx 8GPU
DGLBACKEND=pytorch python3 train.py --model ComplEx --dataset wn18 --batch_size 1024 \
--neg_sample_size 1024 --regularization_coef 0.00001 --hidden_dim 512 --gamma 200.0 \
--lr 0.1 --batch_size_eval 16 --valid --test -adv --gpu 0 --max_step 2500 \
--mix_cpu_gpu --num_proc 8 --gpu 0 1 2 3 4 5 6 7 --async_update \
--force_sync_interval 1000
# TransE_l1 1GPU
DGLBACKEND=pytorch python3 train.py --model TransE_l1 --dataset wn18 --batch_size 2048 \
--neg_sample_size 128 --regularization_coef 2e-07 --hidden_dim 512 --gamma 12.0 \
--lr 0.007 --batch_size_eval 16 --valid --test -adv --gpu 0 --max_step 32000
# TransE_l1 8GPU
DGLBACKEND=pytorch python3 train.py --model TransE_l1 --dataset wn18 --batch_size 2048 \
--neg_sample_size 128 --regularization_coef 2e-07 --hidden_dim 512 --gamma 12.0 \
--lr 0.007 --batch_size_eval 16 --valid --test -adv --gpu 0 --max_step 4000 \
--mix_cpu_gpu --num_proc 8 --gpu 0 1 2 3 4 5 6 7 --async_update \
--force_sync_interval 1000
# TransE_l2 1GPU
DGLBACKEND=pytorch python3 train.py --model TransE_l2 --dataset wn18 --batch_size 1024 \
--neg_sample_size 256 --regularization_coef 0.0000001 --hidden_dim 512 --gamma 6.0 \
--lr 0.1 --batch_size_eval 16 --valid --test -adv --gpu 0 --max_step 32000
# TransE_l2 8GPU
DGLBACKEND=pytorch python3 train.py --model TransE_l2 --dataset wn18 --batch_size 1024 \
--neg_sample_size 256 --regularization_coef 0.0000001 --hidden_dim 512 --gamma 6.0 \
--lr 0.1 --batch_size_eval 16 --valid --test -adv --gpu 0 --max_step 4000 \
--mix_cpu_gpu --num_proc 8 --gpu 0 1 2 3 4 5 6 7 --async_update \
--force_sync_interval 1000
# RESCAL 1GPU
DGLBACKEND=pytorch python3 train.py --model RESCAL --dataset wn18 --batch_size 1024 \
--neg_sample_size 256 --hidden_dim 250 --gamma 24.0 --lr 0.03 --batch_size_eval 16 \
--valid --test -adv --gpu 0 --max_step 20000
# RESCAL 8GPU
DGLBACKEND=pytorch python3 train.py --model RESCAL --dataset wn18 --batch_size 1024 \
--neg_sample_size 256 --hidden_dim 250 --gamma 24.0 --lr 0.03 --batch_size_eval 16 \
--valid --test -adv --gpu 0 --max_step 2500 --mix_cpu_gpu --num_proc 8 \
--gpu 0 1 2 3 4 5 6 7 --async_update --force_sync_interval 1000 --soft_rel_part
# TransR 1GPU
DGLBACKEND=pytorch python3 train.py --model TransR --dataset wn18 --batch_size 1024 \
--neg_sample_size 256 --hidden_dim 250 --gamma 16.0 --lr 0.1 --batch_size_eval 16 \
--valid --test -adv --gpu 0 --max_step 30000
# TransR 8GPU
DGLBACKEND=pytorch python3 train.py --model TransR --dataset wn18 --batch_size 1024 \
--neg_sample_size 256 --hidden_dim 250 --gamma 16.0 --lr 0.1 --batch_size_eval 16 \
--valid --test -adv --max_step 2500 --mix_cpu_gpu --num_proc 8 \
--gpu 0 1 2 3 4 5 6 7 --async_update --force_sync_interval 1000 --soft_rel_part
# RotatE 1GPU
DGLBACKEND=pytorch python3 train.py --model RotatE --dataset wn18 --batch_size 2048 \
--neg_sample_size 64 --regularization_coef 2e-07 --hidden_dim 256 --gamma 9.0 \
--lr 0.0025 -de --batch_size_eval 16 --neg_deg_sample --valid --test -adv --gpu 0 \
--max_step 24000
# RotatE 8GPU
DGLBACKEND=pytorch python3 train.py --model RotatE --dataset wn18 --batch_size 2048 \
--neg_sample_size 64 --regularization_coef 2e-07 --hidden_dim 256 --gamma 9.0 \
--lr 0.0025 -de --batch_size_eval 16 --neg_deg_sample --valid --test -adv \
--max_step 3000 --mix_cpu_gpu --num_proc 8 --gpu 0 1 2 3 4 5 6 7 --async_update \
--force_sync_interval 1000
# for Freebase multi-process-cpu
# TransE_l2
DGLBACKEND=pytorch python3 train.py --model TransE_l2 --dataset Freebase --batch_size 1000 \
--neg_sample_size 200 --hidden_dim 400 --gamma 10 --lr 0.1 --max_step 50000 \
--log_interval 100 --batch_size_eval 1000 --neg_sample_size_eval 1000 --test -adv \
--regularization_coef 1e-9 --num_thread 1 --num_proc 48
# DistMult
DGLBACKEND=pytorch python3 train.py --model DistMult --dataset Freebase --batch_size 1024 \
--neg_sample_size 256 --hidden_dim 400 --gamma 143.0 --lr 0.08 --max_step 50000 \
--log_interval 100 --batch_size_eval 1000 --neg_sample_size_eval 1000 --test -adv \
--num_thread 1 --num_proc 48
# ComplEx
DGLBACKEND=pytorch python3 train.py --model ComplEx --dataset Freebase --batch_size 1024 \
--neg_sample_size 256 --hidden_dim 400 --gamma 143.0 --lr 0.1 --max_step 50000 \
--log_interval 100 --batch_size_eval 1000 --neg_sample_size_eval 1000 --test -adv \
--num_thread 1 --num_proc 48
# Freebase multi-gpu
# TransE_l2 8GPU
DGLBACKEND=pytorch python3 train.py --model TransE_l2 --dataset Freebase --batch_size 1000 \
--neg_sample_size 200 --hidden_dim 400 --gamma 10 --lr 0.1 --regularization_coef 1e-9 \
--batch_size_eval 1000 --valid --test -adv --mix_cpu_gpu --num_proc 8 \
--gpu 0 1 2 3 4 5 6 7 --max_step 320000 --neg_sample_size_eval 1000 --eval_interval \
100000 --log_interval 10000 --async_update --soft_rel_part --force_sync_interval 10000
# DistMult 8GPU
DGLBACKEND=pytorch python3 train.py --model DistMult --dataset Freebase --batch_size 1024 \
--neg_sample_size 256 --hidden_dim 400 --gamma 143.0 --lr 0.08 --batch_size_eval 1000 \
--valid --test -adv --mix_cpu_gpu --num_proc 8 --gpu 0 1 2 3 4 5 6 7 --max_step 300000 \
--neg_sample_size_eval 1000 --eval_interval 100000 --log_interval 10000 --async_update \
--soft_rel_part --force_sync_interval 10000
# ComplEx 8GPU
DGLBACKEND=pytorch python3 train.py --model ComplEx --dataset Freebase --batch_size 1024 \
--neg_sample_size 256 --hidden_dim 400 --gamma 143 --lr 0.1 \
--regularization_coef 2.00E-06 --batch_size_eval 1000 --valid --test -adv \
--mix_cpu_gpu --num_proc 8 --gpu 0 1 2 3 4 5 6 7 --max_step 360000 \
--neg_sample_size_eval 1000 --eval_interval 100000 --log_interval 10000 \
--async_update --soft_rel_part --force_sync_interval 10000
# TransR 8GPU
DGLBACKEND=pytorch python3 train.py --model TransR --dataset Freebase --batch_size 1024 \
--neg_sample_size 256 --regularization_coef 5e-8 --hidden_dim 200 --gamma 8.0 \
--lr 0.015 --batch_size_eval 1000 --valid --test -adv --mix_cpu_gpu --num_proc 8 \
--gpu 0 1 2 3 4 5 6 7 --max_step 300000 --neg_sample_size_eval 1000 \
--eval_interval 100000 --log_interval 10000 --async_update --soft_rel_part \
--force_sync_interval 10000
# RotatE 8GPU
DGLBACKEND=pytorch python3 train.py --model RotatE --dataset Freebase --batch_size 1024 \
--neg_sample_size 256 -de --hidden_dim 200 --gamma 12.0 --lr 0.01 \
--regularization_coef 1e-7 --batch_size_eval 1000 --valid --test -adv --mix_cpu_gpu \
--num_proc 8 --gpu 0 1 2 3 4 5 6 7 --max_step 300000 --neg_sample_size_eval 1000 \
--eval_interval 100000 --log_interval 10000 --async_update --soft_rel_part \
--force_sync_interval 10000
# -*- coding: utf-8 -*-
#
# setup.py
#
# Copyright 2018 Amazon.com, Inc. or its affiliates. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
import os
import numpy as np
def _download_and_extract(url, path, filename):
import shutil, zipfile
import requests
fn = os.path.join(path, filename)
while True:
try:
with zipfile.ZipFile(fn) as zf:
zf.extractall(path)
print('Unzip finished.')
break
except Exception:
os.makedirs(path, exist_ok=True)
f_remote = requests.get(url, stream=True)
sz = f_remote.headers.get('content-length')
assert f_remote.status_code == 200, 'fail to open {}'.format(url)
with open(fn, 'wb') as writer:
for chunk in f_remote.iter_content(chunk_size=1024*1024):
writer.write(chunk)
print('Download finished. Unzipping the file...')
def _get_id(dict, key):
id = dict.get(key, None)
if id is None:
id = len(dict)
dict[key] = id
return id
def _parse_srd_format(format):
if format == "hrt":
return [0, 1, 2]
if format == "htr":
return [0, 2, 1]
if format == "rht":
return [1, 0, 2]
if format == "rth":
return [2, 0, 1]
if format == "thr":
return [1, 2, 0]
if format == "trh":
return [2, 1, 0]
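# For example, for the 'trh' column order the head name is in column 2, the relation in
# column 1 and the tail in column 0 of each line, so _parse_srd_format('trh') returns
# [2, 1, 0]. Unknown format strings fall through and return None.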
def _file_line(path):
with open(path) as f:
for i, l in enumerate(f):
pass
return i + 1
class KGDataset:
'''Load a knowledge graph
The folder with a knowledge graph has five files:
* entities stores the mapping between entity Id and entity name.
* relations stores the mapping between relation Id and relation name.
* train stores the triples in the training set.
* valid stores the triples in the validation set.
* test stores the triples in the test set.
The mapping between entity (relation) Id and entity (relation) name is stored as 'id\tname'.
The triples are stored as 'head_name\trelation_name\ttail_name'.
'''
def __init__(self, entity_path, relation_path, train_path,
valid_path=None, test_path=None, format=[0,1,2], skip_first_line=False):
self.entity2id, self.n_entities = self.read_entity(entity_path)
self.relation2id, self.n_relations = self.read_relation(relation_path)
self.train = self.read_triple(train_path, "train", skip_first_line, format)
if valid_path is not None:
self.valid = self.read_triple(valid_path, "valid", skip_first_line, format)
if test_path is not None:
self.test = self.read_triple(test_path, "test", skip_first_line, format)
def read_entity(self, entity_path):
with open(entity_path) as f:
entity2id = {}
for line in f:
eid, entity = line.strip().split('\t')
entity2id[entity] = int(eid)
return entity2id, len(entity2id)
def read_relation(self, relation_path):
with open(relation_path) as f:
relation2id = {}
for line in f:
rid, relation = line.strip().split('\t')
relation2id[relation] = int(rid)
return relation2id, len(relation2id)
def read_triple(self, path, mode, skip_first_line=False, format=[0,1,2]):
# mode: train/valid/test
if path is None:
return None
print('Reading {} triples....'.format(mode))
heads = []
tails = []
rels = []
with open(path) as f:
if skip_first_line:
_ = f.readline()
for line in f:
triple = line.strip().split('\t')
h, r, t = triple[format[0]], triple[format[1]], triple[format[2]]
heads.append(self.entity2id[h])
rels.append(self.relation2id[r])
tails.append(self.entity2id[t])
heads = np.array(heads, dtype=np.int64)
tails = np.array(tails, dtype=np.int64)
rels = np.array(rels, dtype=np.int64)
print('Finished. Read {} {} triples.'.format(len(heads), mode))
return (heads, rels, tails)
class PartitionKGDataset():
'''Load a partitioned knowledge graph
The folder with a partitioned knowledge graph has four files:
* relations stores the mapping between relation Id and relation name.
* train stores the triples in the training set.
* local_to_global stores the mapping of local id and global id
* partition_book stores the machine id of each entity
The triples are stored as 'head_id\trelation_id\ttail_id'.
'''
def __init__(self, relation_path, train_path, local2global_path,
read_triple=True, skip_first_line=False):
self.n_entities = _file_line(local2global_path)
if skip_first_line == False:
self.n_relations = _file_line(relation_path)
else:
self.n_relations = _file_line(relation_path) - 1
if read_triple == True:
self.train = self.read_triple(train_path, "train")
def read_triple(self, path, mode):
heads = []
tails = []
rels = []
print('Reading {} triples....'.format(mode))
with open(path) as f:
for line in f:
h, r, t = line.strip().split('\t')
heads.append(int(h))
rels.append(int(r))
tails.append(int(t))
heads = np.array(heads, dtype=np.int64)
tails = np.array(tails, dtype=np.int64)
rels = np.array(rels, dtype=np.int64)
print('Finished. Read {} {} triples.'.format(len(heads), mode))
return (heads, rels, tails)
class KGDatasetFB15k(KGDataset):
'''Load a knowledge graph FB15k
The FB15k dataset has five files:
* entities.dict stores the mapping between entity Id and entity name.
* relations.dict stores the mapping between relation Id and relation name.
* train.txt stores the triples in the training set.
* valid.txt stores the triples in the validation set.
* test.txt stores the triples in the test set.
The mapping between entity (relation) name and entity (relation) Id is stored as 'name\tid'.
The triples are stored as 'head_nid\trelation_id\ttail_nid'.
'''
def __init__(self, path, name='FB15k'):
self.name = name
url = 'https://data.dgl.ai/dataset/{}.zip'.format(name)
if not os.path.exists(os.path.join(path, name)):
print('File not found. Downloading from', url)
_download_and_extract(url, path, name + '.zip')
self.path = os.path.join(path, name)
super(KGDatasetFB15k, self).__init__(os.path.join(self.path, 'entities.dict'),
os.path.join(self.path, 'relations.dict'),
os.path.join(self.path, 'train.txt'),
os.path.join(self.path, 'valid.txt'),
os.path.join(self.path, 'test.txt'))
class KGDatasetFB15k237(KGDataset):
'''Load a knowledge graph FB15k-237
The FB15k-237 dataset has five files:
* entities.dict stores the mapping between entity Id and entity name.
* relations.dict stores the mapping between relation Id and relation name.
* train.txt stores the triples in the training set.
* valid.txt stores the triples in the validation set.
* test.txt stores the triples in the test set.
The mapping between entity (relation) name and entity (relation) Id is stored as 'name\tid'.
The triples are stored as 'head_nid\trelation_id\ttail_nid'.
'''
def __init__(self, path, name='FB15k-237'):
self.name = name
url = 'https://data.dgl.ai/dataset/{}.zip'.format(name)
if not os.path.exists(os.path.join(path, name)):
print('File not found. Downloading from', url)
_download_and_extract(url, path, name + '.zip')
self.path = os.path.join(path, name)
super(KGDatasetFB15k237, self).__init__(os.path.join(self.path, 'entities.dict'),
os.path.join(self.path, 'relations.dict'),
os.path.join(self.path, 'train.txt'),
os.path.join(self.path, 'valid.txt'),
os.path.join(self.path, 'test.txt'))
class KGDatasetWN18(KGDataset):
'''Load a knowledge graph wn18
The wn18 dataset has five files:
* entities.dict stores the mapping between entity Id and entity name.
* relations.dict stores the mapping between relation Id and relation name.
* train.txt stores the triples in the training set.
* valid.txt stores the triples in the validation set.
* test.txt stores the triples in the test set.
The mapping between entity (relation) name and entity (relation) Id is stored as 'name\tid'.
The triples are stored as 'head_nid\trelation_id\ttail_nid'.
'''
def __init__(self, path, name='wn18'):
self.name = name
url = 'https://data.dgl.ai/dataset/{}.zip'.format(name)
if not os.path.exists(os.path.join(path, name)):
print('File not found. Downloading from', url)
_download_and_extract(url, path, name + '.zip')
self.path = os.path.join(path, name)
super(KGDatasetWN18, self).__init__(os.path.join(self.path, 'entities.dict'),
os.path.join(self.path, 'relations.dict'),
os.path.join(self.path, 'train.txt'),
os.path.join(self.path, 'valid.txt'),
os.path.join(self.path, 'test.txt'))
class KGDatasetWN18rr(KGDataset):
'''Load a knowledge graph wn18rr
The wn18rr dataset has five files:
* entities.dict stores the mapping between entity Id and entity name.
* relations.dict stores the mapping between relation Id and relation name.
* train.txt stores the triples in the training set.
* valid.txt stores the triples in the validation set.
* test.txt stores the triples in the test set.
The mapping between entity (relation) name and entity (relation) Id is stored as 'name\tid'.
The triples are stored as 'head_nid\trelation_id\ttail_nid'.
'''
def __init__(self, path, name='wn18rr'):
self.name = name
url = 'https://data.dgl.ai/dataset/{}.zip'.format(name)
if not os.path.exists(os.path.join(path, name)):
print('File not found. Downloading from', url)
_download_and_extract(url, path, name + '.zip')
self.path = os.path.join(path, name)
super(KGDatasetWN18rr, self).__init__(os.path.join(self.path, 'entities.dict'),
os.path.join(self.path, 'relations.dict'),
os.path.join(self.path, 'train.txt'),
os.path.join(self.path, 'valid.txt'),
os.path.join(self.path, 'test.txt'))
class KGDatasetFreebase(KGDataset):
'''Load a knowledge graph Full Freebase
The Freebase dataset has five files:
* entity2id.txt stores the mapping between entity name and entity Id.
* relation2id.txt stores the mapping between relation name relation Id.
* train.txt stores the triples in the training set.
* valid.txt stores the triples in the validation set.
* test.txt stores the triples in the test set.
The mapping between entity (relation) name and entity (relation) Id is stored as 'name\tid'.
The triples are stored as 'head_nid\trelation_id\ttail_nid'.
'''
def __init__(self, path, name='Freebase'):
self.name = name
url = 'https://data.dgl.ai/dataset/{}.zip'.format(name)
if not os.path.exists(os.path.join(path, name)):
print('File not found. Downloading from', url)
_download_and_extract(url, path, '{}.zip'.format(name))
self.path = os.path.join(path, name)
super(KGDatasetFreebase, self).__init__(os.path.join(self.path, 'entity2id.txt'),
os.path.join(self.path, 'relation2id.txt'),
os.path.join(self.path, 'train.txt'),
os.path.join(self.path, 'valid.txt'),
os.path.join(self.path, 'test.txt'))
def read_entity(self, entity_path):
with open(entity_path) as f_ent:
n_entities = int(f_ent.readline()[:-1])
return None, n_entities
def read_relation(self, relation_path):
with open(relation_path) as f_rel:
n_relations = int(f_rel.readline()[:-1])
return None, n_relations
def read_triple(self, path, mode, skip_first_line=False, format=None):
heads = []
tails = []
rels = []
print('Reading {} triples....'.format(mode))
with open(path) as f:
if skip_first_line:
_ = f.readline()
for line in f:
h, t, r = line.strip().split('\t')
heads.append(int(h))
tails.append(int(t))
rels.append(int(r))
heads = np.array(heads, dtype=np.int64)
tails = np.array(tails, dtype=np.int64)
rels = np.array(rels, dtype=np.int64)
print('Finished. Read {} {} triples.'.format(len(heads), mode))
return (heads, rels, tails)
class KGDatasetUDDRaw(KGDataset):
'''Load a knowledge graph user defined dataset
The user defined dataset has five files:
* entities stores the mapping between entity name and entity Id.
* relations stores the mapping between relation name relation Id.
* train stores the triples in the training set. In format [src_name, rel_name, dst_name]
* valid stores the triples in the validation set. In format [src_name, rel_name, dst_name]
* test stores the triples in the test set. In format [src_name, rel_name, dst_name]
The mapping between entity (relation) name and entity (relation) Id is stored as 'name\tid'.
The triples are stored as 'head_nid\trelation_id\ttail_nid'.
'''
def __init__(self, path, name, files, format):
self.name = name
for f in files:
assert os.path.exists(os.path.join(path, f)), \
'File {} does not exist in {}'.format(f, path)
assert len(format) == 3
format = _parse_srd_format(format)
self.load_entity_relation(path, files, format)
# Only train set is provided
if len(files) == 1:
super(KGDatasetUDDRaw, self).__init__("entities.tsv",
"relation.tsv",
os.path.join(path, files[0]),
format=format)
# Train, validation and test set are provided
if len(files) == 3:
super(KGDatasetUDDRaw, self).__init__("entities.tsv",
"relation.tsv",
os.path.join(path, files[0]),
os.path.join(path, files[1]),
os.path.join(path, files[2]),
format=format)
def load_entity_relation(self, path, files, format):
entity_map = {}
rel_map = {}
for fi in files:
with open(os.path.join(path, fi)) as f:
for line in f:
triple = line.strip().split('\t')
src, rel, dst = triple[format[0]], triple[format[1]], triple[format[2]]
src_id = _get_id(entity_map, src)
dst_id = _get_id(entity_map, dst)
rel_id = _get_id(rel_map, rel)
entities = ["{}\t{}\n".format(key, val) for key, val in entity_map.items()]
with open(os.path.join(path, "entities.tsv"), "w+") as f:
f.writelines(entities)
self.entity2id = entity_map
self.n_entities = len(entities)
relations = ["{}\t{}\n".format(key, val) for key, val in rel_map.items()]
with open(os.path.join(path, "relations.tsv"), "w+") as f:
f.writelines(relations)
self.relation2id = rel_map
self.n_relations = len(relations)
def read_entity(self, entity_path):
return self.entity2id, self.n_entities
def read_relation(self, relation_path):
return self.relation2id, self.n_relations
class KGDatasetUDD(KGDataset):
'''Load a knowledge graph user defined dataset
The user defined dataset has five files:
* entities stores the mapping between entity name and entity Id.
* relations stores the mapping between relation name relation Id.
* train stores the triples in the training set. In format [src_id, rel_id, dst_id]
* valid stores the triples in the validation set. In format [src_id, rel_id, dst_id]
* test stores the triples in the test set. In format [src_id, rel_id, dst_id]
The mapping between entity (relation) name and entity (relation) Id is stored as 'name\tid'.
The triples are stored as 'head_nid\trelation_id\ttail_nid'.
'''
def __init__(self, path, name, files, format):
self.name = name
for f in files:
assert os.path.exists(os.path.join(path, f)), \
'File {} does not exist in {}'.format(f, path)
format = _parse_srd_format(format)
if len(files) == 3:
super(KGDatasetUDD, self).__init__(os.path.join(path, files[0]),
os.path.join(path, files[1]),
os.path.join(path, files[2]),
None, None,
format=format)
if len(files) == 5:
super(KGDatasetUDD, self).__init__(os.path.join(path, files[0]),
os.path.join(path, files[1]),
os.path.join(path, files[2]),
os.path.join(path, files[3]),
os.path.join(path, files[4]),
format=format)
def read_entity(self, entity_path):
n_entities = 0
with open(entity_path) as f_ent:
for line in f_ent:
n_entities += 1
return None, n_entities
def read_relation(self, relation_path):
n_relations = 0
with open(relation_path) as f_rel:
for line in f_rel:
n_relations += 1
return None, n_relations
def read_triple(self, path, mode, skip_first_line=False, format=[0,1,2]):
heads = []
tails = []
rels = []
print('Reading {} triples....'.format(mode))
with open(path) as f:
if skip_first_line:
_ = f.readline()
for line in f:
triple = line.strip().split('\t')
h, r, t = triple[format[0]], triple[format[1]], triple[format[2]]
heads.append(int(h))
tails.append(int(t))
rels.append(int(r))
heads = np.array(heads, dtype=np.int64)
tails = np.array(tails, dtype=np.int64)
rels = np.array(rels, dtype=np.int64)
print('Finished. Read {} {} triples.'.format(len(heads), mode))
return (heads, rels, tails)
def get_dataset(data_path, data_name, format_str, files=None):
if format_str == 'built_in':
if data_name == 'Freebase':
dataset = KGDatasetFreebase(data_path)
elif data_name == 'FB15k':
dataset = KGDatasetFB15k(data_path)
elif data_name == 'FB15k-237':
dataset = KGDatasetFB15k237(data_path)
elif data_name == 'wn18':
dataset = KGDatasetWN18(data_path)
elif data_name == 'wn18rr':
dataset = KGDatasetWN18rr(data_path)
else:
assert False, "Unknown dataset {}".format(data_name)
elif format_str.startswith('raw_udd'):
# user defined dataset
format = format_str[8:]
dataset = KGDatasetUDDRaw(data_path, data_name, files, format)
elif format_str.startswith('udd'):
# user defined dataset
format = format_str[4:]
dataset = KGDatasetUDD(data_path, data_name, files, format)
else:
assert False, "Unknown format {}".format(format_str)
return dataset
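# Example usage (hypothetical paths and file names):
#   get_dataset('./data', 'FB15k', 'built_in') downloads FB15k into ./data if it is not
#   there yet, while get_dataset('./data', 'mykg', 'raw_udd_hrt', files=['train.tsv'])
#   builds a user-defined dataset from the raw name triples in ./data/train.tsv.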
def get_partition_dataset(data_path, data_name, part_id):
part_name = os.path.join(data_name, 'partition_'+str(part_id))
path = os.path.join(data_path, part_name)
if not os.path.exists(path):
print('Partition file not found.')
exit()
train_path = os.path.join(path, 'train.txt')
local2global_path = os.path.join(path, 'local_to_global.txt')
partition_book_path = os.path.join(path, 'partition_book.txt')
if data_name == 'Freebase':
relation_path = os.path.join(path, 'relation2id.txt')
skip_first_line = True
elif data_name in ['FB15k', 'FB15k-237', 'wn18', 'wn18rr']:
relation_path = os.path.join(path, 'relations.dict')
skip_first_line = False
else:
relation_path = os.path.join(path, 'relation.tsv')
skip_first_line = False
dataset = PartitionKGDataset(relation_path,
train_path,
local2global_path,
read_triple=True,
skip_first_line=skip_first_line)
partition_book = []
with open(partition_book_path) as f:
for line in f:
partition_book.append(int(line))
local_to_global = []
with open(local2global_path) as f:
for line in f:
local_to_global.append(int(line))
return dataset, partition_book, local_to_global
def get_server_partition_dataset(data_path, data_name, part_id):
part_name = os.path.join(data_name, 'partition_'+str(part_id))
path = os.path.join(data_path, part_name)
if not os.path.exists(path):
print('Partition file not found.')
exit()
train_path = os.path.join(path, 'train.txt')
local2global_path = os.path.join(path, 'local_to_global.txt')
if data_name == 'Freebase':
relation_path = os.path.join(path, 'relation2id.txt')
skip_first_line = True
elif data_name in ['FB15k', 'FB15k-237', 'wn18', 'wn18rr']:
relation_path = os.path.join(path, 'relations.dict')
skip_first_line = False
else:
relation_path = os.path.join(path, 'relation.tsv')
skip_first_line = False
dataset = PartitionKGDataset(relation_path,
train_path,
local2global_path,
read_triple=False,
skip_first_line=skip_first_line)
n_entities = _file_line(os.path.join(path, 'partition_book.txt'))
local_to_global = []
with open(local2global_path) as f:
for line in f:
local_to_global.append(int(line))
global_to_local = [0] * n_entities
for i in range(len(local_to_global)):
global_id = local_to_global[i]
global_to_local[global_id] = i
local_to_global = None
return global_to_local, dataset
# -*- coding: utf-8 -*-
#
# setup.py
#
# Copyright 2018 Amazon.com, Inc. or its affiliates. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
from .KGDataset import *
from .sampler import *
# -*- coding: utf-8 -*-
#
# setup.py
#
# Copyright 2018 Amazon.com, Inc. or its affiliates. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
import math
import numpy as np
import scipy as sp
import dgl.backend as F
import dgl
import os
import sys
import pickle
import time
from dgl.base import NID, EID
def SoftRelationPartition(edges, n, threshold=0.05):
"""This partitions a list of edges to n partitions according to their
relation types. For any relation with number of edges larger than the
threshold, its edges will be evenly distributed into all partitions.
For any relation with number of edges smaller than the threshold, its
edges will be put into one single partition.
Algo:
For r in relations:
if r.size() > threshold
Evenly divide the edges of r into n parts and put one part into each partition.
else
Find partition with fewest edges, and put edges of r into
this partition.
Parameters
----------
edges : (heads, rels, tails) triple
Edge list to partition
n : int
Number of partitions
threshold : float
The threshold of whether a relation is LARGE or SMALL
Default: 5%
Returns
-------
List of np.array
Edges of each partition
List of np.array
Edge types of each partition
bool
Whether some relations belong to multiple partitions
"""
heads, rels, tails = edges
print('relation partition {} edges into {} parts'.format(len(heads), n))
uniq, cnts = np.unique(rels, return_counts=True)
idx = np.flip(np.argsort(cnts))
cnts = cnts[idx]
uniq = uniq[idx]
assert cnts[0] > cnts[-1]
edge_cnts = np.zeros(shape=(n,), dtype=np.int64)
rel_cnts = np.zeros(shape=(n,), dtype=np.int64)
rel_dict = {}
rel_parts = []
cross_rel_part = []
for _ in range(n):
rel_parts.append([])
large_threshold = int(len(rels) * threshold)
capacity_per_partition = int(len(rels) / n)
# ensure any relation larger than the partition capacity will be split
large_threshold = capacity_per_partition if capacity_per_partition < large_threshold \
else large_threshold
num_cross_part = 0
for i in range(len(cnts)):
cnt = cnts[i]
r = uniq[i]
r_parts = []
if cnt > large_threshold:
avg_part_cnt = (cnt // n) + 1
num_cross_part += 1
for j in range(n):
part_cnt = avg_part_cnt if cnt > avg_part_cnt else cnt
r_parts.append([j, part_cnt])
rel_parts[j].append(r)
edge_cnts[j] += part_cnt
rel_cnts[j] += 1
cnt -= part_cnt
cross_rel_part.append(r)
else:
idx = np.argmin(edge_cnts)
r_parts.append([idx, cnt])
rel_parts[idx].append(r)
edge_cnts[idx] += cnt
rel_cnts[idx] += 1
rel_dict[r] = r_parts
for i, edge_cnt in enumerate(edge_cnts):
print('part {} has {} edges and {} relations'.format(i, edge_cnt, rel_cnts[i]))
print('{}/{} duplicated relation across partitions'.format(num_cross_part, len(cnts)))
parts = []
for i in range(n):
parts.append([])
rel_parts[i] = np.array(rel_parts[i])
for i, r in enumerate(rels):
r_part = rel_dict[r][0]
part_idx = r_part[0]
cnt = r_part[1]
parts[part_idx].append(i)
cnt -= 1
if cnt == 0:
rel_dict[r].pop(0)
else:
rel_dict[r][0][1] = cnt
for i, part in enumerate(parts):
parts[i] = np.array(part, dtype=np.int64)
shuffle_idx = np.concatenate(parts)
heads[:] = heads[shuffle_idx]
rels[:] = rels[shuffle_idx]
tails[:] = tails[shuffle_idx]
off = 0
for i, part in enumerate(parts):
parts[i] = np.arange(off, off + len(part))
off += len(part)
cross_rel_part = np.array(cross_rel_part)
return parts, rel_parts, num_cross_part > 0, cross_rel_part
def BalancedRelationPartition(edges, n):
"""This partitions a list of edges based on relations to make sure
each partition has roughly the same number of edges and relations.
Algo:
For r in relations:
Find partition with fewest edges
if r.size() > num_of empty_slot
put edges of r into this partition to fill the partition,
find next partition with fewest edges to put r in.
else
put edges of r into this partition.
Parameters
----------
edges : (heads, rels, tails) triple
Edge list to partition
n : int
number of partitions
Returns
-------
List of np.array
Edges of each partition
List of np.array
Edge types of each partition
bool
Whether some relations belong to multiple partitions
"""
heads, rels, tails = edges
print('relation partition {} edges into {} parts'.format(len(heads), n))
uniq, cnts = np.unique(rels, return_counts=True)
idx = np.flip(np.argsort(cnts))
cnts = cnts[idx]
uniq = uniq[idx]
assert cnts[0] > cnts[-1]
edge_cnts = np.zeros(shape=(n,), dtype=np.int64)
rel_cnts = np.zeros(shape=(n,), dtype=np.int64)
rel_dict = {}
rel_parts = []
for _ in range(n):
rel_parts.append([])
max_edges = (len(rels) // n) + 1
num_cross_part = 0
for i in range(len(cnts)):
cnt = cnts[i]
r = uniq[i]
r_parts = []
while cnt > 0:
idx = np.argmin(edge_cnts)
if edge_cnts[idx] + cnt <= max_edges:
r_parts.append([idx, cnt])
rel_parts[idx].append(r)
edge_cnts[idx] += cnt
rel_cnts[idx] += 1
cnt = 0
else:
cur_cnt = max_edges - edge_cnts[idx]
r_parts.append([idx, cur_cnt])
rel_parts[idx].append(r)
edge_cnts[idx] += cur_cnt
rel_cnts[idx] += 1
num_cross_part += 1
cnt -= cur_cnt
rel_dict[r] = r_parts
for i, edge_cnt in enumerate(edge_cnts):
print('part {} has {} edges and {} relations'.format(i, edge_cnt, rel_cnts[i]))
print('{}/{} duplicated relation across partitions'.format(num_cross_part, len(cnts)))
parts = []
for i in range(n):
parts.append([])
rel_parts[i] = np.array(rel_parts[i])
for i, r in enumerate(rels):
r_part = rel_dict[r][0]
part_idx = r_part[0]
cnt = r_part[1]
parts[part_idx].append(i)
cnt -= 1
if cnt == 0:
rel_dict[r].pop(0)
else:
rel_dict[r][0][1] = cnt
for i, part in enumerate(parts):
parts[i] = np.array(part, dtype=np.int64)
shuffle_idx = np.concatenate(parts)
heads[:] = heads[shuffle_idx]
rels[:] = rels[shuffle_idx]
tails[:] = tails[shuffle_idx]
off = 0
for i, part in enumerate(parts):
parts[i] = np.arange(off, off + len(part))
off += len(part)
return parts, rel_parts, num_cross_part > 0
def RandomPartition(edges, n):
"""This partitions a list of edges randomly across n partitions
Parameters
----------
edges : (heads, rels, tails) triple
Edge list to partition
n : int
number of partitions
Returns
-------
List of np.array
Edges of each partition
"""
heads, rels, tails = edges
print('random partition {} edges into {} parts'.format(len(heads), n))
idx = np.random.permutation(len(heads))
heads[:] = heads[idx]
rels[:] = rels[idx]
tails[:] = tails[idx]
part_size = int(math.ceil(len(idx) / n))
parts = []
for i in range(n):
start = part_size * i
end = min(part_size * (i + 1), len(idx))
parts.append(idx[start:end])
print('part {} has {} edges'.format(i, len(parts[-1])))
return parts
def ConstructGraph(edges, n_entities, args):
"""Construct Graph for training
Parameters
----------
edges : (heads, rels, tails) triple
Edge list
n_entities : int
number of entities
args :
Global configs.
"""
pickle_name = 'graph_train.pickle'
if args.pickle_graph and os.path.exists(os.path.join(args.data_path, args.dataset, pickle_name)):
with open(os.path.join(args.data_path, args.dataset, pickle_name), 'rb') as graph_file:
g = pickle.load(graph_file)
print('Load pickled graph.')
else:
src, etype_id, dst = edges
coo = sp.sparse.coo_matrix((np.ones(len(src)), (src, dst)), shape=[n_entities, n_entities])
g = dgl.DGLGraph(coo, readonly=True, multigraph=True, sort_csr=True)
g.edata['tid'] = F.tensor(etype_id, F.int64)
if args.pickle_graph:
with open(os.path.join(args.data_path, args.dataset, pickle_name), 'wb') as graph_file:
pickle.dump(g, graph_file)
return g
class TrainDataset(object):
"""Dataset for training
Parameters
----------
dataset : KGDataset
Original dataset.
args :
Global configs.
ranks:
Number of partitions.
"""
def __init__(self, dataset, args, ranks=64):
triples = dataset.train
num_train = len(triples[0])
print('|Train|:', num_train)
if ranks > 1 and args.soft_rel_part:
self.edge_parts, self.rel_parts, self.cross_part, self.cross_rels = \
SoftRelationPartition(triples, ranks)
elif ranks > 1 and args.rel_part:
self.edge_parts, self.rel_parts, self.cross_part = \
BalancedRelationPartition(triples, ranks)
elif ranks > 1:
self.edge_parts = RandomPartition(triples, ranks)
self.cross_part = True
else:
self.edge_parts = [np.arange(num_train)]
self.rel_parts = [np.arange(dataset.n_relations)]
self.cross_part = False
self.g = ConstructGraph(triples, dataset.n_entities, args)
def create_sampler(self, batch_size, neg_sample_size=2, neg_chunk_size=None, mode='head', num_workers=32,
shuffle=True, exclude_positive=False, rank=0):
"""Create sampler for training
Parameters
----------
batch_size : int
Batch size of each mini batch.
neg_sample_size : int
How many negative edges sampled for each node.
neg_chunk_size : int
How many edges in one chunk. We split one batch into chunks.
mode : str
Sampling mode.
num_workers : int
Number of workers used in parallel for this sampler
shuffle : bool
If True, shuffle the seed edges.
If False, do not shuffle the seed edges.
Default: True
exclude_positive : bool
If True, exclude true positive edges from the sampled negative edges.
If False, return all sampled negative edges even if there are positive edges among them.
Default: False
rank : int
Which partition to sample.
Returns
-------
dgl.contrib.sampling.EdgeSampler
Edge sampler
"""
EdgeSampler = getattr(dgl.contrib.sampling, 'EdgeSampler')
assert batch_size % neg_sample_size == 0, 'batch_size should be divisible by B'
return EdgeSampler(self.g,
seed_edges=F.tensor(self.edge_parts[rank]),
batch_size=batch_size,
neg_sample_size=int(neg_sample_size/neg_chunk_size),
chunk_size=neg_chunk_size,
negative_mode=mode,
num_workers=num_workers,
shuffle=shuffle,
exclude_positive=exclude_positive,
return_false_neg=False)
class ChunkNegEdgeSubgraph(dgl.DGLGraph):
"""Wrapper for negative graph
Parameters
----------
neg_g : DGLGraph
Graph holding negative edges.
num_chunks : int
Number of chunks in sampled graph.
chunk_size : int
Info of chunk_size.
neg_sample_size : int
Info of neg_sample_size.
neg_head : bool
If True, negative_mode is 'head'
If False, negative_mode is 'tail'
"""
def __init__(self, subg, num_chunks, chunk_size,
neg_sample_size, neg_head):
super(ChunkNegEdgeSubgraph, self).__init__(graph_data=subg.sgi.graph,
readonly=True,
parent=subg._parent)
self.ndata[NID] = subg.sgi.induced_nodes.tousertensor()
self.edata[EID] = subg.sgi.induced_edges.tousertensor()
self.subg = subg
self.num_chunks = num_chunks
self.chunk_size = chunk_size
self.neg_sample_size = neg_sample_size
self.neg_head = neg_head
@property
def head_nid(self):
return self.subg.head_nid
@property
def tail_nid(self):
return self.subg.tail_nid
def create_neg_subgraph(pos_g, neg_g, chunk_size, neg_sample_size, is_chunked,
neg_head, num_nodes):
"""KG models need to know the number of chunks, the chunk size and negative sample size
of a negative subgraph to perform the computation more efficiently.
This function tries to infer all of this information for the negative subgraph
and create a wrapper class that contains all of the information.
Parameters
----------
pos_g : DGLGraph
Graph holding positive edges.
neg_g : DGLGraph
Graph holding negative edges.
chunk_size : int
Chunk size of the negative subgraph.
neg_sample_size : int
Negative sample size of the negative subgraph.
is_chunked : bool
If True, the sampled batch is chunked.
neg_head : bool
If True, negative_mode is 'head'
If False, negative_mode is 'tail'
num_nodes: int
Total number of nodes in the whole graph.
Returns
-------
ChunkNegEdgeSubgraph
Negative graph wrapper
"""
assert neg_g.number_of_edges() % pos_g.number_of_edges() == 0
# We use all nodes to create negative edges. Regardless of the sampling algorithm,
# we can always view the subgraph with one chunk.
if (neg_head and len(neg_g.head_nid) == num_nodes) \
or (not neg_head and len(neg_g.tail_nid) == num_nodes):
num_chunks = 1
chunk_size = pos_g.number_of_edges()
elif is_chunked:
# This is probably for evaluation.
if pos_g.number_of_edges() < chunk_size \
and neg_g.number_of_edges() % neg_sample_size == 0:
num_chunks = 1
chunk_size = pos_g.number_of_edges()
# This is probably the last batch in the training. Let's ignore it.
elif pos_g.number_of_edges() % chunk_size > 0:
return None
else:
num_chunks = int(pos_g.number_of_edges() / chunk_size)
assert num_chunks * chunk_size == pos_g.number_of_edges()
else:
num_chunks = pos_g.number_of_edges()
chunk_size = 1
return ChunkNegEdgeSubgraph(neg_g, num_chunks, chunk_size,
neg_sample_size, neg_head)
class EvalSampler(object):
"""Sampler for validation and testing
Parameters
----------
g : DGLGraph
Graph containing KG graph
edges : tensor
Seed edges
batch_size : int
Batch size of each mini batch.
neg_sample_size : int
How many negative edges sampled for each node.
neg_chunk_size : int
How many edges in one chunk. We split one batch into chunks.
mode : str
Sampling mode.
num_workers : int
Number of workers used in parallel for this sampler
filter_false_neg : bool
If True, exclude true positive edges from the sampled negative edges.
If False, return all sampled negative edges even if there are positive edges among them.
Default: True
"""
def __init__(self, g, edges, batch_size, neg_sample_size, neg_chunk_size, mode, num_workers=32,
filter_false_neg=True):
EdgeSampler = getattr(dgl.contrib.sampling, 'EdgeSampler')
self.sampler = EdgeSampler(g,
batch_size=batch_size,
seed_edges=edges,
neg_sample_size=neg_sample_size,
chunk_size=neg_chunk_size,
negative_mode=mode,
num_workers=num_workers,
shuffle=False,
exclude_positive=False,
relations=g.edata['tid'],
return_false_neg=filter_false_neg)
self.sampler_iter = iter(self.sampler)
self.mode = mode
self.neg_head = 'head' in mode
self.g = g
self.filter_false_neg = filter_false_neg
self.neg_chunk_size = neg_chunk_size
self.neg_sample_size = neg_sample_size
def __iter__(self):
return self
def __next__(self):
"""Get next batch
Returns
-------
DGLGraph
Sampled positive graph
ChunkNegEdgeSubgraph
Negative graph wrapper
"""
while True:
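            # Keep sampling until create_neg_subgraph returns a usable negative graph;
            # it returns None for batches whose size is not a multiple of the chunk size.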
pos_g, neg_g = next(self.sampler_iter)
if self.filter_false_neg:
neg_positive = neg_g.edata['false_neg']
neg_g = create_neg_subgraph(pos_g, neg_g,
self.neg_chunk_size,
self.neg_sample_size,
'chunk' in self.mode,
self.neg_head,
self.g.number_of_nodes())
if neg_g is not None:
break
pos_g.ndata['id'] = pos_g.parent_nid
neg_g.ndata['id'] = neg_g.parent_nid
pos_g.edata['id'] = pos_g._parent.edata['tid'][pos_g.parent_eid]
if self.filter_false_neg:
neg_g.edata['bias'] = F.astype(-neg_positive, F.float32)
return pos_g, neg_g
def reset(self):
"""Reset the sampler
"""
self.sampler_iter = iter(self.sampler)
return self
class EvalDataset(object):
"""Dataset for validation or testing
Parameters
----------
dataset : KGDataset
Original dataset.
args :
Global configs.
"""
def __init__(self, dataset, args):
pickle_name = 'graph_all.pickle'
if args.pickle_graph and os.path.exists(os.path.join(args.data_path, args.dataset, pickle_name)):
with open(os.path.join(args.data_path, args.dataset, pickle_name), 'rb') as graph_file:
g = pickle.load(graph_file)
print('Load pickled graph.')
else:
src = np.concatenate((dataset.train[0], dataset.valid[0], dataset.test[0]))
etype_id = np.concatenate((dataset.train[1], dataset.valid[1], dataset.test[1]))
dst = np.concatenate((dataset.train[2], dataset.valid[2], dataset.test[2]))
coo = sp.sparse.coo_matrix((np.ones(len(src)), (src, dst)),
shape=[dataset.n_entities, dataset.n_entities])
g = dgl.DGLGraph(coo, readonly=True, multigraph=True, sort_csr=True)
g.edata['tid'] = F.tensor(etype_id, F.int64)
if args.pickle_graph:
with open(os.path.join(args.data_path, args.dataset, pickle_name), 'wb') as graph_file:
pickle.dump(g, graph_file)
self.g = g
self.num_train = len(dataset.train[0])
self.num_valid = len(dataset.valid[0])
self.num_test = len(dataset.test[0])
if args.eval_percent < 1:
self.valid = np.random.randint(0, self.num_valid,
size=(int(self.num_valid * args.eval_percent),)) + self.num_train
else:
self.valid = np.arange(self.num_train, self.num_train + self.num_valid)
print('|valid|:', len(self.valid))
if args.eval_percent < 1:
self.test = np.random.randint(0, self.num_test,
                                           size=(int(self.num_test * args.eval_percent),))
self.test += self.num_train + self.num_valid
else:
self.test = np.arange(self.num_train + self.num_valid, self.g.number_of_edges())
print('|test|:', len(self.test))
def get_edges(self, eval_type):
""" Get all edges in this dataset
Parameters
----------
eval_type : str
Sampling type, 'valid' for validation and 'test' for testing
Returns
-------
np.array
Edges
"""
if eval_type == 'valid':
return self.valid
elif eval_type == 'test':
return self.test
else:
            raise Exception('Got invalid eval type: ' + eval_type)
def create_sampler(self, eval_type, batch_size, neg_sample_size, neg_chunk_size,
filter_false_neg, mode='head', num_workers=32, rank=0, ranks=1):
"""Create sampler for validation or testing
Parameters
----------
eval_type : str
Sampling type, 'valid' for validation and 'test' for testing
batch_size : int
Batch size of each mini batch.
neg_sample_size : int
            How many negative edges are sampled for each node.
        neg_chunk_size : int
            How many edges are in one chunk. We split one batch into chunks.
filter_false_neg : bool
            If True, exclude true positive edges from the sampled negative edges.
            If False, return all sampled negative edges even if some of them are positive edges.
mode : str
Sampling mode.
        num_workers : int
Number of workers used in parallel for this sampler
rank : int
Which partition to sample.
ranks : int
Total number of partitions.
Returns
-------
dgl.contrib.sampling.EdgeSampler
Edge sampler
"""
edges = self.get_edges(eval_type)
beg = edges.shape[0] * rank // ranks
end = min(edges.shape[0] * (rank + 1) // ranks, edges.shape[0])
edges = edges[beg: end]
return EvalSampler(self.g, edges, batch_size, neg_sample_size, neg_chunk_size,
mode, num_workers, filter_false_neg)
class NewBidirectionalOneShotIterator:
"""Grouped samper iterator
Parameters
----------
dataloader_head : dgl.contrib.sampling.EdgeSampler
EdgeSampler in head mode
dataloader_tail : dgl.contrib.sampling.EdgeSampler
EdgeSampler in tail mode
neg_chunk_size : int
        How many edges are in one chunk. We split one batch into chunks.
    neg_sample_size : int
        How many negative edges are sampled for each node.
is_chunked : bool
If True, the sampled batch is chunked.
num_nodes : int
Total number of nodes in the whole graph.
"""
def __init__(self, dataloader_head, dataloader_tail, neg_chunk_size, neg_sample_size,
is_chunked, num_nodes):
self.sampler_head = dataloader_head
self.sampler_tail = dataloader_tail
self.iterator_head = self.one_shot_iterator(dataloader_head, neg_chunk_size,
neg_sample_size, is_chunked,
True, num_nodes)
self.iterator_tail = self.one_shot_iterator(dataloader_tail, neg_chunk_size,
neg_sample_size, is_chunked,
False, num_nodes)
self.step = 0
def __next__(self):
self.step += 1
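        # Alternate between head-corrupted and tail-corrupted negative batches.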
if self.step % 2 == 0:
pos_g, neg_g = next(self.iterator_head)
else:
pos_g, neg_g = next(self.iterator_tail)
return pos_g, neg_g
@staticmethod
def one_shot_iterator(dataloader, neg_chunk_size, neg_sample_size, is_chunked,
neg_head, num_nodes):
while True:
for pos_g, neg_g in dataloader:
neg_g = create_neg_subgraph(pos_g, neg_g, neg_chunk_size, neg_sample_size,
is_chunked, neg_head, num_nodes)
if neg_g is None:
continue
pos_g.ndata['id'] = pos_g.parent_nid
neg_g.ndata['id'] = neg_g.parent_nid
pos_g.edata['id'] = pos_g._parent.edata['tid'][pos_g.parent_eid]
yield pos_g, neg_g
## Training Scripts for distributed training
1. Partition data
Partition FB15k:
```bash
./partition.sh ../data FB15k 4
```
Partition Freebase:
```bash
./partition.sh ../data Freebase 4
```
2. Modify `ip_config.txt` for your cluster (one line per machine; see the example config after step 3) and copy dgl-ke to all the machines
3. Run
```bash
./launch.sh \
~/dgl/apps/kg/distributed \
./fb15k_transe_l2.sh \
ubuntu ~/mykey.pem
```
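For step 2 above, each line of `ip_config.txt` describes one machine in the form `<ip> <port> <server_count>`: `launch.sh` reads the server count from the third column of the first line and takes the number of lines as the machine count. A sketch for a four-machine cluster (the addresses below are placeholders):
```
172.31.0.1 30050 8
172.31.0.2 30050 8
172.31.0.3 30050 8
172.31.0.4 30050 8
```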
#!/bin/bash
##################################################################################
# This script runs the TransE_l2 model on the FB15k dataset in a distributed setting.
# You can change the hyper-parameters in this file, but DO NOT run this script manually.
##################################################################################
machine_id=$1
server_count=$2
machine_count=$3
# Delete the temp file
rm *-shape
##################################################################################
# Start kvserver
##################################################################################
SERVER_ID_LOW=$((machine_id*server_count))
SERVER_ID_HIGH=$(((machine_id+1)*server_count))
while [ $SERVER_ID_LOW -lt $SERVER_ID_HIGH ]
do
MKL_NUM_THREADS=1 OMP_NUM_THREADS=1 DGLBACKEND=pytorch python3 ../kvserver.py --model TransE_l2 --dataset FB15k \
--hidden_dim 400 --gamma 19.9 --lr 0.25 --total_client 64 --server_id $SERVER_ID_LOW &
let SERVER_ID_LOW+=1
done
##################################################################################
# Start kvclient
##################################################################################
MKL_NUM_THREADS=1 OMP_NUM_THREADS=1 DGLBACKEND=pytorch python3 ../kvclient.py --model TransE_l2 --dataset FB15k \
--batch_size 1000 --neg_sample_size 200 --hidden_dim 400 --gamma 19.9 --lr 0.25 --max_step 500 --log_interval 100 --num_thread 1 \
--batch_size_eval 16 --test -adv --regularization_coef 1e-9 --total_machine $machine_count --num_client 16
#!/bin/bash
##################################################################################
# This script runs the ComplEx model on the Freebase dataset in a distributed setting.
# You can change the hyper-parameters in this file, but DO NOT run this script manually.
##################################################################################
machine_id=$1
server_count=$2
machine_count=$3
# Delete the temp file
rm *-shape
##################################################################################
# Start kvserver
##################################################################################
SERVER_ID_LOW=$((machine_id*server_count))
SERVER_ID_HIGH=$(((machine_id+1)*server_count))
while [ $SERVER_ID_LOW -lt $SERVER_ID_HIGH ]
do
MKL_NUM_THREADS=1 OMP_NUM_THREADS=1 DGLBACKEND=pytorch python3 ../kvserver.py --model ComplEx --dataset Freebase \
--hidden_dim 400 --gamma 143.0 --lr 0.1 --total_client 160 --server_id $SERVER_ID_LOW &
let SERVER_ID_LOW+=1
done
##################################################################################
# Start kvclient
##################################################################################
MKL_NUM_THREADS=1 OMP_NUM_THREADS=1 DGLBACKEND=pytorch python3 ../kvclient.py --model ComplEx --dataset Freebase \
--batch_size 1024 --neg_sample_size 256 --hidden_dim 400 --gamma 143.0 --lr 0.1 --max_step 12500 --log_interval 100 \
--batch_size_eval 1000 --neg_sample_size_eval 1000 --test -adv --total_machine $machine_count --num_thread 1 --num_client 40
#!/bin/bash
##################################################################################
# This script runs the DistMult model on the Freebase dataset in a distributed setting.
# You can change the hyper-parameters in this file, but DO NOT run this script manually.
##################################################################################
machine_id=$1
server_count=$2
machine_count=$3
# Delete the temp file
rm *-shape
##################################################################################
# Start kvserver
##################################################################################
SERVER_ID_LOW=$((machine_id*server_count))
SERVER_ID_HIGH=$(((machine_id+1)*server_count))
while [ $SERVER_ID_LOW -lt $SERVER_ID_HIGH ]
do
MKL_NUM_THREADS=1 OMP_NUM_THREADS=1 DGLBACKEND=pytorch python3 ../kvserver.py --model DistMult --dataset Freebase \
--hidden_dim 400 --gamma 143.0 --lr 0.08 --total_client 160 --server_id $SERVER_ID_LOW &
let SERVER_ID_LOW+=1
done
##################################################################################
# Start kvclient
##################################################################################
MKL_NUM_THREADS=1 OMP_NUM_THREADS=1 DGLBACKEND=pytorch python3 ../kvclient.py --model DistMult --dataset Freebase \
--batch_size 1024 --neg_sample_size 256 --hidden_dim 400 --gamma 143.0 --lr 0.08 --max_step 12500 --log_interval 100 \
--batch_size_eval 1000 --neg_sample_size_eval 1000 --test -adv --total_machine $machine_count --num_thread 1 --num_client 40
#!/bin/bash
##################################################################################
# This script runs the TransE_l2 model on the Freebase dataset in a distributed setting.
# You can change the hyper-parameters in this file, but DO NOT run this script manually.
##################################################################################
machine_id=$1
server_count=$2
machine_count=$3
# Delete the temp file
rm *-shape
##################################################################################
# Start kvserver
##################################################################################
SERVER_ID_LOW=$((machine_id*server_count))
SERVER_ID_HIGH=$(((machine_id+1)*server_count))
while [ $SERVER_ID_LOW -lt $SERVER_ID_HIGH ]
do
MKL_NUM_THREADS=1 OMP_NUM_THREADS=1 DGLBACKEND=pytorch python3 ../kvserver.py --model TransE_l2 --dataset Freebase \
--hidden_dim 400 --gamma 10 --lr 0.1 --total_client 160 --server_id $SERVER_ID_LOW &
let SERVER_ID_LOW+=1
done
##################################################################################
# Start kvclient
##################################################################################
MKL_NUM_THREADS=1 OMP_NUM_THREADS=1 DGLBACKEND=pytorch python3 ../kvclient.py --model TransE_l2 --dataset Freebase \
--batch_size 1000 --neg_sample_size 200 --hidden_dim 400 --gamma 10 --lr 0.1 --max_step 12500 --log_interval 100 --num_thread 1 \
--batch_size_eval 1000 --neg_sample_size_eval 1000 --test -adv --regularization_coef 1e-9 --total_machine $machine_count --num_client 40
127.0.0.1 30050 8
127.0.0.1 30050 8
127.0.0.1 30050 8
127.0.0.1 30050 8
#!/bin/bash
##################################################################################
# User runs this script to launch distributed jobs on a cluster
##################################################################################
script_path=$1
script_file=$2
user_name=$3
ssh_key=$4
server_count=$(awk 'NR==1 {print $3}' ip_config.txt)
machine_count=$(awk 'END{print NR}' ip_config.txt)
# run command on remote machine
LINE_LOW=2
LINE_HIGH=$(awk 'END{print NR}' ip_config.txt)
let LINE_HIGH+=1
s_id=0
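# Remote machines (lines 2 onward of ip_config.txt) receive machine ids 1..N-1 via ssh;
# the local machine runs the same script at the end with machine id 0.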
while [ $LINE_LOW -lt $LINE_HIGH ]
do
ip=$(awk 'NR=='$LINE_LOW' {print $1}' ip_config.txt)
let LINE_LOW+=1
let s_id+=1
if test -z "$ssh_key"
then
ssh $user_name@$ip 'cd '$script_path'; '$script_file' '$s_id' '$server_count' '$machine_count'' &
else
ssh -i $ssh_key $user_name@$ip 'cd '$script_path'; '$script_file' '$s_id' '$server_count' '$machine_count'' &
fi
done
# run command on local machine
$script_file 0 $server_count $machine_count
#!/bin/bash
##################################################################################
# User runs this script to partition a graph using METIS
##################################################################################
DATA_PATH=$1
DATA_SET=$2
K=$3
# partition graph
python3 ../partition.py --dataset $DATA_SET -k $K --data_path $DATA_PATH
# copy related file to partition
PART_ID=0
while [ $PART_ID -lt $K ]
do
cp $DATA_PATH/$DATA_SET/relation* $DATA_PATH/$DATA_SET/partition_$PART_ID
let PART_ID+=1
done
# -*- coding: utf-8 -*-
#
# setup.py
#
# Copyright 2018 Amazon.com, Inc. or its affiliates. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
from dataloader import EvalDataset, TrainDataset
from dataloader import get_dataset
import argparse
import os
import logging
import time
import pickle
from utils import get_compatible_batch_size
backend = os.environ.get('DGLBACKEND', 'pytorch')
if backend.lower() == 'mxnet':
import multiprocessing as mp
from train_mxnet import load_model_from_checkpoint
from train_mxnet import test
else:
import torch.multiprocessing as mp
from train_pytorch import load_model_from_checkpoint
from train_pytorch import test, test_mp
class ArgParser(argparse.ArgumentParser):
def __init__(self):
super(ArgParser, self).__init__()
self.add_argument('--model_name', default='TransE',
choices=['TransE', 'TransE_l1', 'TransE_l2', 'TransR',
'RESCAL', 'DistMult', 'ComplEx', 'RotatE'],
help='model to use')
self.add_argument('--data_path', type=str, default='data',
help='root path of all dataset')
self.add_argument('--dataset', type=str, default='FB15k',
help='dataset name, under data_path')
self.add_argument('--format', type=str, default='built_in',
help='the format of the dataset, it can be built_in,'\
'raw_udd_{htr} and udd_{htr}')
self.add_argument('--data_files', type=str, default=None, nargs='+',
help='a list of data files, e.g. entity relation train valid test')
self.add_argument('--model_path', type=str, default='ckpts',
help='the place where models are saved')
self.add_argument('--batch_size_eval', type=int, default=8,
help='batch size used for eval and test')
self.add_argument('--neg_sample_size_eval', type=int, default=-1,
help='negative sampling size for testing')
self.add_argument('--neg_deg_sample_eval', action='store_true',
help='negative sampling proportional to vertex degree for testing')
self.add_argument('--hidden_dim', type=int, default=256,
help='hidden dim used by relation and entity')
self.add_argument('-g', '--gamma', type=float, default=12.0,
help='margin value')
self.add_argument('--eval_percent', type=float, default=1,
help='sample some percentage for evaluation.')
self.add_argument('--no_eval_filter', action='store_true',
help='do not filter positive edges among negative edges for evaluation')
self.add_argument('--gpu', type=int, default=[-1], nargs='+',
help='a list of active gpu ids, e.g. 0')
self.add_argument('--mix_cpu_gpu', action='store_true',
help='mix CPU and GPU training')
self.add_argument('-de', '--double_ent', action='store_true',
                          help='double entity dim for complex number')
self.add_argument('-dr', '--double_rel', action='store_true',
help='double relation dim for complex number')
self.add_argument('--num_proc', type=int, default=1,
help='number of process used')
self.add_argument('--num_thread', type=int, default=1,
help='number of thread used')
def parse_args(self):
args = super().parse_args()
return args
def get_logger(args):
if not os.path.exists(args.model_path):
raise Exception('No existing model_path: ' + args.model_path)
log_file = os.path.join(args.model_path, 'eval.log')
logging.basicConfig(
format='%(asctime)s %(levelname)-8s %(message)s',
level=logging.INFO,
datefmt='%Y-%m-%d %H:%M:%S',
filename=log_file,
filemode='w'
)
logger = logging.getLogger(__name__)
print("Logs are being recorded at: {}".format(log_file))
return logger
def main(args):
args.eval_filter = not args.no_eval_filter
if args.neg_deg_sample_eval:
assert not args.eval_filter, "if negative sampling based on degree, we can't filter positive edges."
# load dataset and samplers
dataset = get_dataset(args.data_path, args.dataset, args.format, args.data_files)
args.pickle_graph = False
args.train = False
args.valid = False
args.test = True
args.strict_rel_part = False
args.soft_rel_part = False
args.async_update = False
logger = get_logger(args)
    # Here we want to use the regular negative sampler because we need to ensure that
    # all positive edges are excluded.
eval_dataset = EvalDataset(dataset, args)
if args.neg_sample_size_eval < 0:
args.neg_sample_size_eval = args.neg_sample_size = eval_dataset.g.number_of_nodes()
args.batch_size_eval = get_compatible_batch_size(args.batch_size_eval, args.neg_sample_size_eval)
args.num_workers = 8 # fix num_workers to 8
if args.num_proc > 1:
test_sampler_tails = []
test_sampler_heads = []
for i in range(args.num_proc):
test_sampler_head = eval_dataset.create_sampler('test', args.batch_size_eval,
args.neg_sample_size_eval,
args.neg_sample_size_eval,
args.eval_filter,
mode='chunk-head',
num_workers=args.num_workers,
rank=i, ranks=args.num_proc)
test_sampler_tail = eval_dataset.create_sampler('test', args.batch_size_eval,
args.neg_sample_size_eval,
args.neg_sample_size_eval,
args.eval_filter,
mode='chunk-tail',
num_workers=args.num_workers,
rank=i, ranks=args.num_proc)
test_sampler_heads.append(test_sampler_head)
test_sampler_tails.append(test_sampler_tail)
else:
test_sampler_head = eval_dataset.create_sampler('test', args.batch_size_eval,
args.neg_sample_size_eval,
args.neg_sample_size_eval,
args.eval_filter,
mode='chunk-head',
num_workers=args.num_workers,
rank=0, ranks=1)
test_sampler_tail = eval_dataset.create_sampler('test', args.batch_size_eval,
args.neg_sample_size_eval,
args.neg_sample_size_eval,
args.eval_filter,
mode='chunk-tail',
num_workers=args.num_workers,
rank=0, ranks=1)
# load model
n_entities = dataset.n_entities
n_relations = dataset.n_relations
ckpt_path = args.model_path
model = load_model_from_checkpoint(logger, args, n_entities, n_relations, ckpt_path)
if args.num_proc > 1:
model.share_memory()
# test
args.step = 0
args.max_step = 0
start = time.time()
if args.num_proc > 1:
queue = mp.Queue(args.num_proc)
procs = []
for i in range(args.num_proc):
proc = mp.Process(target=test_mp, args=(args,
model,
[test_sampler_heads[i], test_sampler_tails[i]],
i,
'Test',
queue))
procs.append(proc)
proc.start()
total_metrics = {}
metrics = {}
logs = []
for i in range(args.num_proc):
log = queue.get()
logs = logs + log
for metric in logs[0].keys():
metrics[metric] = sum([log[metric] for log in logs]) / len(logs)
for k, v in metrics.items():
print('Test average {} at [{}/{}]: {}'.format(k, args.step, args.max_step, v))
for proc in procs:
proc.join()
else:
test(args, model, [test_sampler_head, test_sampler_tail])
print('Test takes {:.3f} seconds'.format(time.time() - start))
if __name__ == '__main__':
args = ArgParser().parse_args()
main(args)
# -*- coding: utf-8 -*-
#
# setup.py
#
# Copyright 2018 Amazon.com, Inc. or its affiliates. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
import os
import argparse
import time
import logging
import socket
if os.name != 'nt':
import fcntl
import struct
import torch.multiprocessing as mp
from train_pytorch import load_model, dist_train_test
from utils import get_compatible_batch_size
from train import get_logger
from dataloader import TrainDataset, NewBidirectionalOneShotIterator
from dataloader import get_dataset, get_partition_dataset
import dgl
import dgl.backend as F
WAIT_TIME = 10
class ArgParser(argparse.ArgumentParser):
def __init__(self):
super(ArgParser, self).__init__()
self.add_argument('--model_name', default='TransE',
choices=['TransE', 'TransE_l1', 'TransE_l2', 'TransR',
'RESCAL', 'DistMult', 'ComplEx', 'RotatE'],
help='model to use')
self.add_argument('--data_path', type=str, default='../data',
help='root path of all dataset')
self.add_argument('--dataset', type=str, default='FB15k',
help='dataset name, under data_path')
self.add_argument('--format', type=str, default='built_in',
help='the format of the dataset, it can be built_in,'\
'raw_udd_{htr} and udd_{htr}')
self.add_argument('--save_path', type=str, default='../ckpts',
help='place to save models and logs')
self.add_argument('--save_emb', type=str, default=None,
help='save the embeddings in the specific location.')
self.add_argument('--max_step', type=int, default=80000,
                          help='maximum number of training steps')
self.add_argument('--batch_size', type=int, default=1024,
help='batch size')
self.add_argument('--batch_size_eval', type=int, default=8,
help='batch size used for eval and test')
self.add_argument('--neg_sample_size', type=int, default=128,
help='negative sampling size')
self.add_argument('--neg_deg_sample', action='store_true',
help='negative sample proportional to vertex degree in the training')
self.add_argument('--neg_deg_sample_eval', action='store_true',
help='negative sampling proportional to vertex degree in the evaluation')
self.add_argument('--neg_sample_size_eval', type=int, default=-1,
help='negative sampling size for evaluation')
self.add_argument('--hidden_dim', type=int, default=256,
help='hidden dim used by relation and entity')
self.add_argument('--lr', type=float, default=0.0001,
help='learning rate')
self.add_argument('-g', '--gamma', type=float, default=12.0,
help='margin value')
self.add_argument('--no_eval_filter', action='store_true',
help='do not filter positive edges among negative edges for evaluation')
self.add_argument('--gpu', type=int, default=[-1], nargs='+',
help='a list of active gpu ids, e.g. 0 1 2 4')
self.add_argument('--mix_cpu_gpu', action='store_true',
help='mix CPU and GPU training')
self.add_argument('-de', '--double_ent', action='store_true',
                          help='double entity dim for complex number')
self.add_argument('-dr', '--double_rel', action='store_true',
help='double relation dim for complex number')
self.add_argument('-log', '--log_interval', type=int, default=1000,
                          help='print training log after every x steps')
self.add_argument('--eval_interval', type=int, default=10000,
help='do evaluation after every x steps')
self.add_argument('--eval_percent', type=float, default=1,
help='sample some percentage for evaluation.')
self.add_argument('-adv', '--neg_adversarial_sampling', action='store_true',
help='if use negative adversarial sampling')
self.add_argument('-a', '--adversarial_temperature', default=1.0, type=float,
help='adversarial_temperature')
self.add_argument('--valid', action='store_true',
                          help='whether to evaluate on the validation set')
        self.add_argument('--test', action='store_true',
                          help='whether to evaluate on the test set')
self.add_argument('-rc', '--regularization_coef', type=float, default=0.000002,
help='set value > 0.0 if regularization is used')
self.add_argument('-rn', '--regularization_norm', type=int, default=3,
help='norm used in regularization')
self.add_argument('--non_uni_weight', action='store_true',
                          help='use non-uniform (subsampling) weight when computing loss')
self.add_argument('--pickle_graph', action='store_true',
help='pickle built graph, building a huge graph is slow.')
self.add_argument('--num_proc', type=int, default=1,
help='number of process used')
self.add_argument('--num_thread', type=int, default=1,
help='number of thread used')
self.add_argument('--rel_part', action='store_true',
help='enable relation partitioning')
self.add_argument('--soft_rel_part', action='store_true',
help='enable soft relation partition')
self.add_argument('--async_update', action='store_true',
help='allow async_update on node embedding')
self.add_argument('--force_sync_interval', type=int, default=-1,
help='We force a synchronization between processes every x steps')
self.add_argument('--machine_id', type=int, default=0,
help='Unique ID of current machine.')
self.add_argument('--total_machine', type=int, default=1,
help='Total number of machine.')
self.add_argument('--ip_config', type=str, default='ip_config.txt',
help='IP configuration file of kvstore')
self.add_argument('--num_client', type=int, default=1,
help='Number of client on each machine.')
def get_long_tail_partition(n_relations, n_machine):
"""Relation types has a long tail distribution for many dataset.
So we need to average shuffle the data before we partition it.
"""
assert n_relations > 0, 'n_relations must be a positive number.'
assert n_machine > 0, 'n_machine must be a positive number.'
partition_book = [0] * n_relations
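    # Assign relation types to machines in round-robin order to spread them evenly.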
part_id = 0
for i in range(n_relations):
partition_book[i] = part_id
part_id += 1
if part_id == n_machine:
part_id = 0
return partition_book
def local_ip4_addr_list():
"""Return a set of IPv4 address
"""
nic = set()
for ix in socket.if_nameindex():
name = ix[1]
s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
ip = socket.inet_ntoa(fcntl.ioctl(
s.fileno(),
0x8915, # SIOCGIFADDR
struct.pack('256s', name[:15].encode("UTF-8")))[20:24])
nic.add(ip)
return nic
def get_local_machine_id(server_namebook):
"""Get machine ID via server_namebook
"""
assert len(server_namebook) > 0, 'server_namebook cannot be empty.'
res = 0
for ID, data in server_namebook.items():
machine_id = data[0]
ip = data[1]
if ip in local_ip4_addr_list():
res = machine_id
break
return res
def start_worker(args, logger):
"""Start kvclient for training
"""
init_time_start = time.time()
time.sleep(WAIT_TIME) # wait for launch script
server_namebook = dgl.contrib.read_ip_config(filename=args.ip_config)
args.machine_id = get_local_machine_id(server_namebook)
dataset, entity_partition_book, local2global = get_partition_dataset(
args.data_path,
args.dataset,
args.machine_id)
n_entities = dataset.n_entities
n_relations = dataset.n_relations
print('Partition %d n_entities: %d' % (args.machine_id, n_entities))
print("Partition %d n_relations: %d" % (args.machine_id, n_relations))
entity_partition_book = F.tensor(entity_partition_book)
relation_partition_book = get_long_tail_partition(dataset.n_relations, args.total_machine)
relation_partition_book = F.tensor(relation_partition_book)
local2global = F.tensor(local2global)
relation_partition_book.share_memory_()
entity_partition_book.share_memory_()
local2global.share_memory_()
train_data = TrainDataset(dataset, args, ranks=args.num_client)
    # if there is no cross-partition relation, we fall back to strict_rel_part
args.strict_rel_part = args.mix_cpu_gpu and (train_data.cross_part == False)
args.soft_rel_part = args.mix_cpu_gpu and args.soft_rel_part and train_data.cross_part
if args.neg_sample_size_eval < 0:
args.neg_sample_size_eval = dataset.n_entities
args.batch_size = get_compatible_batch_size(args.batch_size, args.neg_sample_size)
args.batch_size_eval = get_compatible_batch_size(args.batch_size_eval, args.neg_sample_size_eval)
args.num_workers = 8 # fix num_workers to 8
train_samplers = []
for i in range(args.num_client):
train_sampler_head = train_data.create_sampler(args.batch_size,
args.neg_sample_size,
args.neg_sample_size,
mode='head',
num_workers=args.num_workers,
shuffle=True,
exclude_positive=False,
rank=i)
train_sampler_tail = train_data.create_sampler(args.batch_size,
args.neg_sample_size,
args.neg_sample_size,
mode='tail',
num_workers=args.num_workers,
shuffle=True,
exclude_positive=False,
rank=i)
train_samplers.append(NewBidirectionalOneShotIterator(train_sampler_head, train_sampler_tail,
args.neg_sample_size, args.neg_sample_size,
True, n_entities))
dataset = None
model = load_model(logger, args, n_entities, n_relations)
model.share_memory()
print('Total initialize time {:.3f} seconds'.format(time.time() - init_time_start))
rel_parts = train_data.rel_parts if args.strict_rel_part or args.soft_rel_part else None
cross_rels = train_data.cross_rels if args.soft_rel_part else None
procs = []
barrier = mp.Barrier(args.num_client)
for i in range(args.num_client):
proc = mp.Process(target=dist_train_test, args=(args,
model,
train_samplers[i],
entity_partition_book,
relation_partition_book,
local2global,
i,
rel_parts,
cross_rels,
barrier))
procs.append(proc)
proc.start()
for proc in procs:
proc.join()
if __name__ == '__main__':
args = ArgParser().parse_args()
logger = get_logger(args)
start_worker(args, logger)
# -*- coding: utf-8 -*-
#
# setup.py
#
# Copyright 2018 Amazon.com, Inc. or its affiliates. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
import os
import argparse
import time
import dgl
from dgl.contrib import KVServer
import torch as th
from train_pytorch import load_model
from dataloader import get_server_partition_dataset
NUM_THREAD = 1 # Fix the number of threads to 1 on kvstore
class KGEServer(KVServer):
"""User-defined kvstore for DGL-KGE
"""
def _push_handler(self, name, ID, data, target):
"""Row-Sparse Adagrad updater
"""
original_name = name[0:-6]
state_sum = target[original_name+'_state-data-']
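        # Adagrad: accumulate the row-wise mean of squared gradients into the state,
        # then scale the step by -lr / (sqrt(state) + 1e-10) for the touched rows only.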
grad_sum = (data * data).mean(1)
state_sum.index_add_(0, ID, grad_sum)
std = state_sum[ID] # _sparse_mask
std_values = std.sqrt_().add_(1e-10).unsqueeze(1)
tmp = (-self.clr * data / std_values)
target[name].index_add_(0, ID, tmp)
def set_clr(self, learning_rate):
"""Set learning rate for Row-Sparse Adagrad updater
"""
self.clr = learning_rate
# Note: Most of the args are unnecessary for KVStore, will remove them later
class ArgParser(argparse.ArgumentParser):
def __init__(self):
super(ArgParser, self).__init__()
self.add_argument('--model_name', default='TransE',
choices=['TransE', 'TransE_l1', 'TransE_l2', 'TransR',
'RESCAL', 'DistMult', 'ComplEx', 'RotatE'],
help='model to use')
self.add_argument('--data_path', type=str, default='../data',
help='root path of all dataset')
self.add_argument('--dataset', type=str, default='FB15k',
help='dataset name, under data_path')
self.add_argument('--format', type=str, default='built_in',
help='the format of the dataset, it can be built_in,'\
'raw_udd_{htr} and udd_{htr}')
self.add_argument('--hidden_dim', type=int, default=256,
help='hidden dim used by relation and entity')
self.add_argument('--lr', type=float, default=0.0001,
help='learning rate')
self.add_argument('-g', '--gamma', type=float, default=12.0,
help='margin value')
self.add_argument('--gpu', type=int, default=[-1], nargs='+',
help='a list of active gpu ids, e.g. 0')
self.add_argument('--mix_cpu_gpu', action='store_true',
help='mix CPU and GPU training')
self.add_argument('-de', '--double_ent', action='store_true',
                          help='double entity dim for complex number')
self.add_argument('-dr', '--double_rel', action='store_true',
help='double relation dim for complex number')
self.add_argument('--num_thread', type=int, default=1,
help='number of thread used')
self.add_argument('--server_id', type=int, default=0,
help='Unique ID of each server')
self.add_argument('--ip_config', type=str, default='ip_config.txt',
help='IP configuration file of kvstore')
self.add_argument('--total_client', type=int, default=1,
help='Total number of client worker nodes')
def get_server_data(args, machine_id):
"""Get data from data_path/dataset/part_machine_id
    Return: global2local,
            entity_emb,
            entity_emb_state,
            relation_emb,
            relation_emb_state
"""
g2l, dataset = get_server_partition_dataset(
args.data_path,
args.dataset,
machine_id)
    # Note that the dataset doesn't contain the triples
print('n_entities: ' + str(dataset.n_entities))
print('n_relations: ' + str(dataset.n_relations))
args.soft_rel_part = False
args.strict_rel_part = False
model = load_model(None, args, dataset.n_entities, dataset.n_relations)
return g2l, model.entity_emb.emb, model.entity_emb.state_sum, model.relation_emb.emb, model.relation_emb.state_sum
def start_server(args):
"""Start kvstore service
"""
th.set_num_threads(NUM_THREAD)
server_namebook = dgl.contrib.read_ip_config(filename=args.ip_config)
my_server = KGEServer(server_id=args.server_id,
server_namebook=server_namebook,
num_client=args.total_client)
my_server.set_clr(args.lr)
if my_server.get_id() % my_server.get_group_count() == 0: # master server
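        # Only the master server on each machine loads the partitioned embedding data;
        # backup servers register the same tensor names without providing data.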
g2l, entity_emb, entity_emb_state, relation_emb, relation_emb_state = get_server_data(args, my_server.get_machine_id())
my_server.set_global2local(name='entity_emb', global2local=g2l)
my_server.init_data(name='relation_emb', data_tensor=relation_emb)
my_server.init_data(name='relation_emb_state', data_tensor=relation_emb_state)
my_server.init_data(name='entity_emb', data_tensor=entity_emb)
my_server.init_data(name='entity_emb_state', data_tensor=entity_emb_state)
else: # backup server
my_server.set_global2local(name='entity_emb')
my_server.init_data(name='relation_emb')
my_server.init_data(name='relation_emb_state')
my_server.init_data(name='entity_emb')
my_server.init_data(name='entity_emb_state')
print('KVServer %d listen for requests ...' % my_server.get_id())
my_server.start()
if __name__ == '__main__':
args = ArgParser().parse_args()
start_server(args)
# -*- coding: utf-8 -*-
#
# setup.py
#
# Copyright 2018 Amazon.com, Inc. or its affiliates. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
from .general_models import KEModel
# -*- coding: utf-8 -*-
#
# setup.py
#
# Copyright 2018 Amazon.com, Inc. or its affiliates. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
"""
Graph Embedding Model
1. TransE
2. TransR
3. RESCAL
4. DistMult
5. ComplEx
6. RotatE
"""
import os
import numpy as np
import dgl.backend as F
backend = os.environ.get('DGLBACKEND', 'pytorch')
if backend.lower() == 'mxnet':
from .mxnet.tensor_models import logsigmoid
from .mxnet.tensor_models import get_device
from .mxnet.tensor_models import norm
from .mxnet.tensor_models import get_scalar
from .mxnet.tensor_models import reshape
from .mxnet.tensor_models import cuda
from .mxnet.tensor_models import ExternalEmbedding
from .mxnet.score_fun import *
else:
from .pytorch.tensor_models import logsigmoid
from .pytorch.tensor_models import get_device
from .pytorch.tensor_models import norm
from .pytorch.tensor_models import get_scalar
from .pytorch.tensor_models import reshape
from .pytorch.tensor_models import cuda
from .pytorch.tensor_models import ExternalEmbedding
from .pytorch.score_fun import *
class KEModel(object):
""" DGL Knowledge Embedding Model.
Parameters
----------
args:
Global configs.
model_name : str
Which KG model to use, including 'TransE_l1', 'TransE_l2', 'TransR',
'RESCAL', 'DistMult', 'ComplEx', 'RotatE'
n_entities : int
Num of entities.
n_relations : int
Num of relations.
hidden_dim : int
        Dimension size of the embedding.
gamma : float
Gamma for score function.
double_entity_emb : bool
If True, entity embedding size will be 2 * hidden_dim.
Default: False
double_relation_emb : bool
If True, relation embedding size will be 2 * hidden_dim.
Default: False
"""
def __init__(self, args, model_name, n_entities, n_relations, hidden_dim, gamma,
double_entity_emb=False, double_relation_emb=False):
super(KEModel, self).__init__()
self.args = args
self.n_entities = n_entities
self.n_relations = n_relations
self.model_name = model_name
self.hidden_dim = hidden_dim
self.eps = 2.0
self.emb_init = (gamma + self.eps) / hidden_dim
entity_dim = 2 * hidden_dim if double_entity_emb else hidden_dim
relation_dim = 2 * hidden_dim if double_relation_emb else hidden_dim
device = get_device(args)
self.entity_emb = ExternalEmbedding(args, n_entities, entity_dim,
F.cpu() if args.mix_cpu_gpu else device)
# For RESCAL, relation_emb = relation_dim * entity_dim
if model_name == 'RESCAL':
rel_dim = relation_dim * entity_dim
else:
rel_dim = relation_dim
self.rel_dim = rel_dim
self.entity_dim = entity_dim
self.strict_rel_part = args.strict_rel_part
self.soft_rel_part = args.soft_rel_part
if not self.strict_rel_part and not self.soft_rel_part:
self.relation_emb = ExternalEmbedding(args, n_relations, rel_dim,
F.cpu() if args.mix_cpu_gpu else device)
else:
self.global_relation_emb = ExternalEmbedding(args, n_relations, rel_dim, F.cpu())
if model_name == 'TransE' or model_name == 'TransE_l2':
self.score_func = TransEScore(gamma, 'l2')
elif model_name == 'TransE_l1':
self.score_func = TransEScore(gamma, 'l1')
elif model_name == 'TransR':
projection_emb = ExternalEmbedding(args,
n_relations,
entity_dim * relation_dim,
F.cpu() if args.mix_cpu_gpu else device)
self.score_func = TransRScore(gamma, projection_emb, relation_dim, entity_dim)
elif model_name == 'DistMult':
self.score_func = DistMultScore()
elif model_name == 'ComplEx':
self.score_func = ComplExScore()
elif model_name == 'RESCAL':
self.score_func = RESCALScore(relation_dim, entity_dim)
elif model_name == 'RotatE':
self.score_func = RotatEScore(gamma, self.emb_init)
self.model_name = model_name
self.head_neg_score = self.score_func.create_neg(True)
self.tail_neg_score = self.score_func.create_neg(False)
self.head_neg_prepare = self.score_func.create_neg_prepare(True)
self.tail_neg_prepare = self.score_func.create_neg_prepare(False)
self.reset_parameters()
def share_memory(self):
"""Use torch.tensor.share_memory_() to allow cross process embeddings access.
"""
self.entity_emb.share_memory()
if self.strict_rel_part or self.soft_rel_part:
self.global_relation_emb.share_memory()
else:
self.relation_emb.share_memory()
if self.model_name == 'TransR':
self.score_func.share_memory()
def save_emb(self, path, dataset):
"""Save the model.
Parameters
----------
path : str
Directory to save the model.
dataset : str
Dataset name as prefix to the saved embeddings.
"""
self.entity_emb.save(path, dataset+'_'+self.model_name+'_entity')
if self.strict_rel_part or self.soft_rel_part:
self.global_relation_emb.save(path, dataset+'_'+self.model_name+'_relation')
else:
self.relation_emb.save(path, dataset+'_'+self.model_name+'_relation')
self.score_func.save(path, dataset+'_'+self.model_name)
def load_emb(self, path, dataset):
"""Load the model.
Parameters
----------
path : str
Directory to load the model.
dataset : str
Dataset name as prefix to the saved embeddings.
"""
self.entity_emb.load(path, dataset+'_'+self.model_name+'_entity')
self.relation_emb.load(path, dataset+'_'+self.model_name+'_relation')
self.score_func.load(path, dataset+'_'+self.model_name)
def reset_parameters(self):
"""Re-initialize the model.
"""
self.entity_emb.init(self.emb_init)
self.score_func.reset_parameters()
if (not self.strict_rel_part) and (not self.soft_rel_part):
self.relation_emb.init(self.emb_init)
else:
self.global_relation_emb.init(self.emb_init)
def predict_score(self, g):
"""Predict the positive score.
Parameters
----------
g : DGLGraph
Graph holding positive edges.
Returns
-------
tensor
The positive score
"""
self.score_func(g)
return g.edata['score']
def predict_neg_score(self, pos_g, neg_g, to_device=None, gpu_id=-1, trace=False,
neg_deg_sample=False):
"""Calculate the negative score.
Parameters
----------
pos_g : DGLGraph
Graph holding positive edges.
neg_g : DGLGraph
Graph holding negative edges.
to_device : func
Function to move data into device.
gpu_id : int
Which gpu to move data to.
trace : bool
If True, trace the computation. This is required in training.
If False, do not trace the computation.
Default: False
neg_deg_sample : bool
If True, we use the head and tail nodes of the positive edges to
construct negative edges.
Default: False
Returns
-------
tensor
The negative score
"""
num_chunks = neg_g.num_chunks
chunk_size = neg_g.chunk_size
neg_sample_size = neg_g.neg_sample_size
mask = F.ones((num_chunks, chunk_size * (neg_sample_size + chunk_size)),
dtype=F.float32, ctx=F.context(pos_g.ndata['emb']))
if neg_g.neg_head:
neg_head_ids = neg_g.ndata['id'][neg_g.head_nid]
neg_head = self.entity_emb(neg_head_ids, gpu_id, trace)
head_ids, tail_ids = pos_g.all_edges(order='eid')
if to_device is not None and gpu_id >= 0:
tail_ids = to_device(tail_ids, gpu_id)
tail = pos_g.ndata['emb'][tail_ids]
rel = pos_g.edata['emb']
# When we train a batch, we could use the head nodes of the positive edges to
# construct negative edges. We construct a negative edge between a positive head
# node and every positive tail node.
# When we construct negative edges like this, we know there is one positive
# edge for a positive head node among the negative edges. We need to mask
# them.
if neg_deg_sample:
head = pos_g.ndata['emb'][head_ids]
head = head.reshape(num_chunks, chunk_size, -1)
neg_head = neg_head.reshape(num_chunks, neg_sample_size, -1)
neg_head = F.cat([head, neg_head], 1)
neg_sample_size = chunk_size + neg_sample_size
mask[:,0::(neg_sample_size + 1)] = 0
neg_head = neg_head.reshape(num_chunks * neg_sample_size, -1)
neg_head, tail = self.head_neg_prepare(pos_g.edata['id'], num_chunks, neg_head, tail, gpu_id, trace)
neg_score = self.head_neg_score(neg_head, rel, tail,
num_chunks, chunk_size, neg_sample_size)
else:
neg_tail_ids = neg_g.ndata['id'][neg_g.tail_nid]
neg_tail = self.entity_emb(neg_tail_ids, gpu_id, trace)
head_ids, tail_ids = pos_g.all_edges(order='eid')
if to_device is not None and gpu_id >= 0:
head_ids = to_device(head_ids, gpu_id)
head = pos_g.ndata['emb'][head_ids]
rel = pos_g.edata['emb']
# This is negative edge construction similar to the above.
if neg_deg_sample:
tail = pos_g.ndata['emb'][tail_ids]
tail = tail.reshape(num_chunks, chunk_size, -1)
neg_tail = neg_tail.reshape(num_chunks, neg_sample_size, -1)
neg_tail = F.cat([tail, neg_tail], 1)
neg_sample_size = chunk_size + neg_sample_size
mask[:,0::(neg_sample_size + 1)] = 0
neg_tail = neg_tail.reshape(num_chunks * neg_sample_size, -1)
head, neg_tail = self.tail_neg_prepare(pos_g.edata['id'], num_chunks, head, neg_tail, gpu_id, trace)
neg_score = self.tail_neg_score(head, rel, neg_tail,
num_chunks, chunk_size, neg_sample_size)
if neg_deg_sample:
neg_g.neg_sample_size = neg_sample_size
mask = mask.reshape(num_chunks, chunk_size, neg_sample_size)
return neg_score * mask
else:
return neg_score
def forward_test(self, pos_g, neg_g, logs, gpu_id=-1):
"""Do the forward and generate ranking results.
Parameters
----------
pos_g : DGLGraph
Graph holding positive edges.
neg_g : DGLGraph
Graph holding negative edges.
logs : List
            List in which the ranking results are stored.
gpu_id : int
Which gpu to accelerate the calculation. if -1 is provided, cpu is used.
"""
pos_g.ndata['emb'] = self.entity_emb(pos_g.ndata['id'], gpu_id, False)
pos_g.edata['emb'] = self.relation_emb(pos_g.edata['id'], gpu_id, False)
self.score_func.prepare(pos_g, gpu_id, False)
batch_size = pos_g.number_of_edges()
pos_scores = self.predict_score(pos_g)
pos_scores = reshape(logsigmoid(pos_scores), batch_size, -1)
neg_scores = self.predict_neg_score(pos_g, neg_g, to_device=cuda,
gpu_id=gpu_id, trace=False,
neg_deg_sample=self.args.neg_deg_sample_eval)
neg_scores = reshape(logsigmoid(neg_scores), batch_size, -1)
# We need to filter the positive edges in the negative graph.
if self.args.eval_filter:
filter_bias = reshape(neg_g.edata['bias'], batch_size, -1)
if gpu_id >= 0:
filter_bias = cuda(filter_bias, gpu_id)
neg_scores += filter_bias
# To compute the rank of a positive edge among all negative edges,
# we need to know how many negative edges have higher scores than
# the positive edge.
rankings = F.sum(neg_scores >= pos_scores, dim=1) + 1
rankings = F.asnumpy(rankings)
for i in range(batch_size):
ranking = rankings[i]
logs.append({
'MRR': 1.0 / ranking,
'MR': float(ranking),
'HITS@1': 1.0 if ranking <= 1 else 0.0,
'HITS@3': 1.0 if ranking <= 3 else 0.0,
'HITS@10': 1.0 if ranking <= 10 else 0.0
})
# @profile
def forward(self, pos_g, neg_g, gpu_id=-1):
"""Do the forward.
Parameters
----------
pos_g : DGLGraph
Graph holding positive edges.
neg_g : DGLGraph
Graph holding negative edges.
gpu_id : int
Which gpu to accelerate the calculation. if -1 is provided, cpu is used.
Returns
-------
tensor
loss value
dict
loss info
"""
pos_g.ndata['emb'] = self.entity_emb(pos_g.ndata['id'], gpu_id, True)
pos_g.edata['emb'] = self.relation_emb(pos_g.edata['id'], gpu_id, True)
self.score_func.prepare(pos_g, gpu_id, True)
pos_score = self.predict_score(pos_g)
pos_score = logsigmoid(pos_score)
if gpu_id >= 0:
neg_score = self.predict_neg_score(pos_g, neg_g, to_device=cuda,
gpu_id=gpu_id, trace=True,
neg_deg_sample=self.args.neg_deg_sample)
else:
neg_score = self.predict_neg_score(pos_g, neg_g, trace=True,
neg_deg_sample=self.args.neg_deg_sample)
neg_score = reshape(neg_score, -1, neg_g.neg_sample_size)
# Adversarial sampling
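        # Self-adversarial weighting: each negative score is weighted by the softmax of
        # the negative scores (detached, so the weights themselves receive no gradient).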
if self.args.neg_adversarial_sampling:
neg_score = F.sum(F.softmax(neg_score * self.args.adversarial_temperature, dim=1).detach()
* logsigmoid(-neg_score), dim=1)
else:
neg_score = F.mean(logsigmoid(-neg_score), dim=1)
# subsampling weight
# TODO: add subsampling to new sampler
if self.args.non_uni_weight:
subsampling_weight = pos_g.edata['weight']
pos_score = (pos_score * subsampling_weight).sum() / subsampling_weight.sum()
neg_score = (neg_score * subsampling_weight).sum() / subsampling_weight.sum()
else:
pos_score = pos_score.mean()
neg_score = neg_score.mean()
# compute loss
loss = -(pos_score + neg_score) / 2
log = {'pos_loss': - get_scalar(pos_score),
'neg_loss': - get_scalar(neg_score),
'loss': get_scalar(loss)}
# regularization: TODO(zihao)
#TODO: only reg ent&rel embeddings. other params to be added.
if self.args.regularization_coef > 0.0 and self.args.regularization_norm > 0:
coef, nm = self.args.regularization_coef, self.args.regularization_norm
reg = coef * (norm(self.entity_emb.curr_emb(), nm) + norm(self.relation_emb.curr_emb(), nm))
log['regularization'] = get_scalar(reg)
loss = loss + reg
return loss, log
def update(self, gpu_id=-1):
""" Update the embeddings in the model
gpu_id : int
Which gpu to accelerate the calculation. if -1 is provided, cpu is used.
"""
self.entity_emb.update(gpu_id)
self.relation_emb.update(gpu_id)
self.score_func.update(gpu_id)
def prepare_relation(self, device=None):
""" Prepare relation embeddings in multi-process multi-gpu training model.
device : th.device
Which device (GPU) to put relation embeddings in.
"""
self.relation_emb = ExternalEmbedding(self.args, self.n_relations, self.rel_dim, device)
self.relation_emb.init(self.emb_init)
if self.model_name == 'TransR':
local_projection_emb = ExternalEmbedding(self.args, self.n_relations,
self.entity_dim * self.rel_dim, device)
self.score_func.prepare_local_emb(local_projection_emb)
self.score_func.reset_parameters()
def prepare_cross_rels(self, cross_rels):
self.relation_emb.setup_cross_rels(cross_rels, self.global_relation_emb)
if self.model_name == 'TransR':
self.score_func.prepare_cross_rels(cross_rels)
def writeback_relation(self, rank=0, rel_parts=None):
""" Writeback relation embeddings in a specific process to global relation embedding.
Used in multi-process multi-gpu training model.
rank : int
Process id.
rel_parts : List of tensor
            List of tensors storing the edge types of each partition.
"""
idx = rel_parts[rank]
if self.soft_rel_part:
idx = self.relation_emb.get_noncross_idx(idx)
self.global_relation_emb.emb[idx] = F.copy_to(self.relation_emb.emb, F.cpu())[idx]
if self.model_name == 'TransR':
self.score_func.writeback_local_emb(idx)
def load_relation(self, device=None):
""" Sync global relation embeddings into local relation embeddings.
        Used in multi-process multi-gpu training mode.
device : th.device
Which device (GPU) to put relation embeddings in.
"""
self.relation_emb = ExternalEmbedding(self.args, self.n_relations, self.rel_dim, device)
self.relation_emb.emb = F.copy_to(self.global_relation_emb.emb, device)
if self.model_name == 'TransR':
local_projection_emb = ExternalEmbedding(self.args, self.n_relations,
self.entity_dim * self.rel_dim, device)
self.score_func.load_local_emb(local_projection_emb)
def create_async_update(self):
"""Set up the async update for entity embedding.
"""
self.entity_emb.create_async_update()
def finish_async_update(self):
"""Terminate the async update for entity embedding.
"""
self.entity_emb.finish_async_update()
def pull_model(self, client, pos_g, neg_g):
with th.no_grad():
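            # Pull only the embedding rows touched by this mini-batch from the
            # distributed kvstore into the local embedding buffers.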
entity_id = F.cat(seq=[pos_g.ndata['id'], neg_g.ndata['id']], dim=0)
relation_id = pos_g.edata['id']
entity_id = F.tensor(np.unique(F.asnumpy(entity_id)))
relation_id = F.tensor(np.unique(F.asnumpy(relation_id)))
l2g = client.get_local2global()
global_entity_id = l2g[entity_id]
entity_data = client.pull(name='entity_emb', id_tensor=global_entity_id)
relation_data = client.pull(name='relation_emb', id_tensor=relation_id)
self.entity_emb.emb[entity_id] = entity_data
self.relation_emb.emb[relation_id] = relation_data
def push_gradient(self, client):
with th.no_grad():
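            # Push the sparse gradients of the rows touched in this mini-batch back to
            # the kvstore; entity ids are mapped to global ids first.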
l2g = client.get_local2global()
for entity_id, entity_data in self.entity_emb.trace:
grad = entity_data.grad.data
global_entity_id =l2g[entity_id]
client.push(name='entity_emb', id_tensor=global_entity_id, data_tensor=grad)
for relation_id, relation_data in self.relation_emb.trace:
grad = relation_data.grad.data
client.push(name='relation_emb', id_tensor=relation_id, data_tensor=grad)
self.entity_emb.trace = []
self.relation_emb.trace = []
# -*- coding: utf-8 -*-
#
# setup.py
#
# Copyright 2018 Amazon.com, Inc. or its affiliates. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
# -*- coding: utf-8 -*-
#
# setup.py
#
# Copyright 2018 Amazon.com, Inc. or its affiliates. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
import numpy as np
import mxnet as mx
from mxnet import gluon
from mxnet.gluon import nn
from mxnet import ndarray as nd
def batched_l2_dist(a, b):
a_squared = nd.power(nd.norm(a, axis=-1), 2)
b_squared = nd.power(nd.norm(b, axis=-1), 2)
squared_res = nd.add(nd.linalg_gemm(
a, nd.transpose(b, axes=(0, 2, 1)), nd.broadcast_axes(nd.expand_dims(b_squared, axis=-2), axis=1, size=a.shape[1]), alpha=-2
), nd.expand_dims(a_squared, axis=-1))
res = nd.sqrt(nd.clip(squared_res, 1e-30, np.finfo(np.float32).max))
return res
def batched_l1_dist(a, b):
a = nd.expand_dims(a, axis=-2)
b = nd.expand_dims(b, axis=-3)
res = nd.norm(a - b, ord=1, axis=-1)
return res
class TransEScore(nn.Block):
""" TransE score function
Paper link: https://papers.nips.cc/paper/5071-translating-embeddings-for-modeling-multi-relational-data
"""
def __init__(self, gamma, dist_func='l2'):
super(TransEScore, self).__init__()
self.gamma = gamma
if dist_func == 'l1':
self.neg_dist_func = batched_l1_dist
self.dist_ord = 1
else: # default use l2
self.neg_dist_func = batched_l2_dist
self.dist_ord = 2
def edge_func(self, edges):
head = edges.src['emb']
tail = edges.dst['emb']
rel = edges.data['emb']
score = head + rel - tail
return {'score': self.gamma - nd.norm(score, ord=self.dist_ord, axis=-1)}
def prepare(self, g, gpu_id, trace=False):
pass
def create_neg_prepare(self, neg_head):
def fn(rel_id, num_chunks, head, tail, gpu_id, trace=False):
return head, tail
return fn
def update(self, gpu_id=-1):
pass
def reset_parameters(self):
pass
def save(self, path, name):
pass
def load(self, path, name):
pass
def forward(self, g):
g.apply_edges(lambda edges: self.edge_func(edges))
def create_neg(self, neg_head):
gamma = self.gamma
if neg_head:
def fn(heads, relations, tails, num_chunks, chunk_size, neg_sample_size):
hidden_dim = heads.shape[1]
heads = heads.reshape(num_chunks, neg_sample_size, hidden_dim)
tails = tails - relations
tails = tails.reshape(num_chunks, chunk_size, hidden_dim)
return gamma - self.neg_dist_func(tails, heads)
return fn
else:
def fn(heads, relations, tails, num_chunks, chunk_size, neg_sample_size):
hidden_dim = heads.shape[1]
heads = heads + relations
heads = heads.reshape(num_chunks, chunk_size, hidden_dim)
tails = tails.reshape(num_chunks, neg_sample_size, hidden_dim)
return gamma - self.neg_dist_func(heads, tails)
return fn
class TransRScore(nn.Block):
"""TransR score function
Paper link: https://www.aaai.org/ocs/index.php/AAAI/AAAI15/paper/download/9571/9523
"""
def __init__(self, gamma, projection_emb, relation_dim, entity_dim):
super(TransRScore, self).__init__()
self.gamma = gamma
self.projection_emb = projection_emb
self.relation_dim = relation_dim
self.entity_dim = entity_dim
def edge_func(self, edges):
head = edges.data['head_emb']
tail = edges.data['tail_emb']
rel = edges.data['emb']
score = head + rel - tail
return {'score': self.gamma - nd.norm(score, ord=1, axis=-1)}
def prepare(self, g, gpu_id, trace=False):
head_ids, tail_ids = g.all_edges(order='eid')
projection = self.projection_emb(g.edata['id'], gpu_id, trace)
projection = projection.reshape(-1, self.entity_dim, self.relation_dim)
head_emb = g.ndata['emb'][head_ids.as_in_context(g.ndata['emb'].context)].expand_dims(axis=-2)
tail_emb = g.ndata['emb'][tail_ids.as_in_context(g.ndata['emb'].context)].expand_dims(axis=-2)
g.edata['head_emb'] = nd.batch_dot(head_emb, projection).squeeze()
g.edata['tail_emb'] = nd.batch_dot(tail_emb, projection).squeeze()
def create_neg_prepare(self, neg_head):
if neg_head:
def fn(rel_id, num_chunks, head, tail, gpu_id, trace=False):
                # positive nodes: project each onto its own relation space
projection = self.projection_emb(rel_id, gpu_id, trace)
projection = projection.reshape(-1, self.entity_dim, self.relation_dim)
tail = tail.reshape(-1, 1, self.entity_dim)
tail = nd.batch_dot(tail, projection)
tail = tail.reshape(num_chunks, -1, self.relation_dim)
                # negative nodes: project each onto every relation space in the chunk
projection = projection.reshape(num_chunks, -1, self.entity_dim, self.relation_dim)
head = head.reshape(num_chunks, -1, 1, self.entity_dim)
num_rels = projection.shape[1]
num_nnodes = head.shape[1]
heads = []
for i in range(num_chunks):
head_negs = []
for j in range(num_nnodes):
head_neg = head[i][j]
head_neg = head_neg.reshape(1, 1, self.entity_dim)
head_neg = nd.broadcast_axis(head_neg, axis=0, size=num_rels)
head_neg = nd.batch_dot(head_neg, projection[i])
head_neg = head_neg.squeeze(axis=1)
head_negs.append(head_neg)
head_negs = nd.stack(*head_negs, axis=1)
heads.append(head_negs)
head = nd.stack(*heads)
return head, tail
return fn
else:
def fn(rel_id, num_chunks, head, tail, gpu_id, trace=False):
                # positive nodes: project each onto its own relation space
projection = self.projection_emb(rel_id, gpu_id, trace)
projection = projection.reshape(-1, self.entity_dim, self.relation_dim)
head = head.reshape(-1, 1, self.entity_dim)
head = nd.batch_dot(head, projection).squeeze()
head = head.reshape(num_chunks, -1, self.relation_dim)
projection = projection.reshape(num_chunks, -1, self.entity_dim, self.relation_dim)
tail = tail.reshape(num_chunks, -1, 1, self.entity_dim)
num_rels = projection.shape[1]
num_nnodes = tail.shape[1]
tails = []
for i in range(num_chunks):
tail_negs = []
for j in range(num_nnodes):
tail_neg = tail[i][j]
tail_neg = tail_neg.reshape(1, 1, self.entity_dim)
tail_neg = nd.broadcast_axis(tail_neg, axis=0, size=num_rels)
tail_neg = nd.batch_dot(tail_neg, projection[i])
tail_neg = tail_neg.squeeze(axis=1)
tail_negs.append(tail_neg)
tail_negs = nd.stack(*tail_negs, axis=1)
tails.append(tail_negs)
tail = nd.stack(*tails)
return head, tail
return fn
def forward(self, g):
g.apply_edges(lambda edges: self.edge_func(edges))
def reset_parameters(self):
self.projection_emb.init(1.0)
def update(self, gpu_id=-1):
self.projection_emb.update(gpu_id)
def save(self, path, name):
self.projection_emb.save(path, name+'projection')
def load(self, path, name):
self.projection_emb.load(path, name+'projection')
def prepare_local_emb(self, projection_emb):
self.global_projection_emb = self.projection_emb
self.projection_emb = projection_emb
def writeback_local_emb(self, idx):
self.global_projection_emb.emb[idx] = self.projection_emb.emb.as_in_context(mx.cpu())[idx]
def load_local_emb(self, projection_emb):
context = projection_emb.emb.context
projection_emb.emb = self.projection_emb.emb.as_in_context(context)
self.projection_emb = projection_emb
def create_neg(self, neg_head):
gamma = self.gamma
if neg_head:
def fn(heads, relations, tails, num_chunks, chunk_size, neg_sample_size):
relations = relations.reshape(num_chunks, -1, self.relation_dim)
tails = tails - relations
tails = tails.reshape(num_chunks, -1, 1, self.relation_dim)
score = heads - tails
return gamma - nd.norm(score, ord=1, axis=-1)
return fn
else:
def fn(heads, relations, tails, num_chunks, chunk_size, neg_sample_size):
relations = relations.reshape(num_chunks, -1, self.relation_dim)
heads = heads - relations
heads = heads.reshape(num_chunks, -1, 1, self.relation_dim)
score = heads - tails
return gamma - nd.norm(score, ord=1, axis=-1)
return fn
class DistMultScore(nn.Block):
"""DistMult score function
Paper link: https://arxiv.org/abs/1412.6575
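
    The score is the tri-linear product sum(head * rel * tail) taken over the
    hidden dimension.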
"""
def __init__(self):
super(DistMultScore, self).__init__()
def edge_func(self, edges):
head = edges.src['emb']
tail = edges.dst['emb']
rel = edges.data['emb']
score = head * rel * tail
        # TODO(jin): check whether a minus sign is needed and whether gamma should be applied here
return {'score': nd.sum(score, axis=-1)}
def prepare(self, g, gpu_id, trace=False):
pass
def create_neg_prepare(self, neg_head):
def fn(rel_id, num_chunks, head, tail, gpu_id, trace=False):
return head, tail
return fn
def update(self, gpu_id=-1):
pass
def reset_parameters(self):
pass
def save(self, path, name):
pass
def load(self, path, name):
pass
def forward(self, g):
g.apply_edges(lambda edges: self.edge_func(edges))
def create_neg(self, neg_head):
if neg_head:
def fn(heads, relations, tails, num_chunks, chunk_size, neg_sample_size):
hidden_dim = heads.shape[1]
heads = heads.reshape(num_chunks, neg_sample_size, hidden_dim)
heads = nd.transpose(heads, axes=(0, 2, 1))
tmp = (tails * relations).reshape(num_chunks, chunk_size, hidden_dim)
return nd.linalg_gemm2(tmp, heads)
return fn
else:
def fn(heads, relations, tails, num_chunks, chunk_size, neg_sample_size):
hidden_dim = heads.shape[1]
tails = tails.reshape(num_chunks, neg_sample_size, hidden_dim)
tails = nd.transpose(tails, axes=(0, 2, 1))
tmp = (heads * relations).reshape(num_chunks, chunk_size, hidden_dim)
return nd.linalg_gemm2(tmp, tails)
return fn
class ComplExScore(nn.Block):
"""ComplEx score function
Paper link: https://arxiv.org/abs/1606.06357
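
    Embeddings are split into real and imaginary halves; the score is
    Re(< head, rel, conj(tail) >), i.e. the real part of the complex
    tri-linear product.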
"""
def __init__(self):
super(ComplExScore, self).__init__()
def edge_func(self, edges):
real_head, img_head = nd.split(edges.src['emb'], num_outputs=2, axis=-1)
real_tail, img_tail = nd.split(edges.dst['emb'], num_outputs=2, axis=-1)
real_rel, img_rel = nd.split(edges.data['emb'], num_outputs=2, axis=-1)
score = real_head * real_tail * real_rel \
+ img_head * img_tail * real_rel \
+ real_head * img_tail * img_rel \
- img_head * real_tail * img_rel
        # TODO(jin): check whether a minus sign is needed and whether gamma should be applied here
return {'score': nd.sum(score, -1)}
def prepare(self, g, gpu_id, trace=False):
pass
def create_neg_prepare(self, neg_head):
def fn(rel_id, num_chunks, head, tail, gpu_id, trace=False):
return head, tail
return fn
def update(self, gpu_id=-1):
pass
def reset_parameters(self):
pass
def save(self, path, name):
pass
def load(self, path, name):
pass
def forward(self, g):
g.apply_edges(lambda edges: self.edge_func(edges))
def create_neg(self, neg_head):
if neg_head:
def fn(heads, relations, tails, num_chunks, chunk_size, neg_sample_size):
hidden_dim = heads.shape[1]
emb_real, emb_img = nd.split(tails, num_outputs=2, axis=-1)
rel_real, rel_img = nd.split(relations, num_outputs=2, axis=-1)
real = emb_real * rel_real + emb_img * rel_img
img = -emb_real * rel_img + emb_img * rel_real
emb_complex = nd.concat(real, img, dim=-1)
tmp = emb_complex.reshape(num_chunks, chunk_size, hidden_dim)
heads = heads.reshape(num_chunks, neg_sample_size, hidden_dim)
heads = nd.transpose(heads, axes=(0, 2, 1))
return nd.linalg_gemm2(tmp, heads)
return fn
else:
def fn(heads, relations, tails, num_chunks, chunk_size, neg_sample_size):
hidden_dim = heads.shape[1]
emb_real, emb_img = nd.split(heads, num_outputs=2, axis=-1)
rel_real, rel_img = nd.split(relations, num_outputs=2, axis=-1)
real = emb_real * rel_real - emb_img * rel_img
img = emb_real * rel_img + emb_img * rel_real
emb_complex = nd.concat(real, img, dim=-1)
tmp = emb_complex.reshape(num_chunks, chunk_size, hidden_dim)
tails = tails.reshape(num_chunks, neg_sample_size, hidden_dim)
tails = nd.transpose(tails, axes=(0, 2, 1))
return nd.linalg_gemm2(tmp, tails)
return fn
class RESCALScore(nn.Block):
"""RESCAL score function
Paper link: http://www.icml-2011.org/papers/438_icmlpaper.pdf
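
    The relation embedding is reshaped into a matrix M_r of shape
    (relation_dim, entity_dim); the score is head^T (M_r tail).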
"""
def __init__(self, relation_dim, entity_dim):
super(RESCALScore, self).__init__()
self.relation_dim = relation_dim
self.entity_dim = entity_dim
def edge_func(self, edges):
head = edges.src['emb']
tail = edges.dst['emb'].expand_dims(2)
rel = edges.data['emb']
rel = rel.reshape(-1, self.relation_dim, self.entity_dim)
score = head * mx.nd.batch_dot(rel, tail).squeeze()
        # TODO: check whether self.gamma should be applied here
        return {'score': mx.nd.sum(score, -1)}
        # Margin-based alternative (PyTorch-style), left for reference:
        # return {'score': self.gamma - th.norm(score, p=1, dim=-1)}
def prepare(self, g, gpu_id, trace=False):
pass
def create_neg_prepare(self, neg_head):
def fn(rel_id, num_chunks, head, tail, gpu_id, trace=False):
return head, tail
return fn
def update(self, gpu_id=-1):
pass
def reset_parameters(self):
pass
def save(self, path, name):
pass
def load(self, path, name):
pass
def forward(self, g):
g.apply_edges(lambda edges: self.edge_func(edges))
def create_neg(self, neg_head):
if neg_head:
def fn(heads, relations, tails, num_chunks, chunk_size, neg_sample_size):
hidden_dim = heads.shape[1]
heads = heads.reshape(num_chunks, neg_sample_size, hidden_dim)
heads = mx.nd.transpose(heads, axes=(0,2,1))
tails = tails.expand_dims(2)
relations = relations.reshape(-1, self.relation_dim, self.entity_dim)
tmp = mx.nd.batch_dot(relations, tails).squeeze()
tmp = tmp.reshape(num_chunks, chunk_size, hidden_dim)
return nd.linalg_gemm2(tmp, heads)
return fn
else:
def fn(heads, relations, tails, num_chunks, chunk_size, neg_sample_size):
hidden_dim = heads.shape[1]
tails = tails.reshape(num_chunks, neg_sample_size, hidden_dim)
tails = mx.nd.transpose(tails, axes=(0,2,1))
heads = heads.expand_dims(2)
relations = relations.reshape(-1, self.relation_dim, self.entity_dim)
tmp = mx.nd.batch_dot(relations, heads).squeeze()
tmp = tmp.reshape(num_chunks, chunk_size, hidden_dim)
return nd.linalg_gemm2(tmp, tails)
return fn
class RotatEScore(nn.Block):
"""RotatE score function
Paper link: https://arxiv.org/abs/1902.10197
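
    Relation embeddings store rotation phases (the raw embedding divided by
    emb_init / pi); each complex dimension of the head is rotated by
    exp(i * phase) and the score is gamma - sum_k |rot(head)_k - tail_k|.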
"""
def __init__(self, gamma, emb_init, eps=1e-10):
super(RotatEScore, self).__init__()
self.gamma = gamma
self.emb_init = emb_init
self.eps = eps
def edge_func(self, edges):
real_head, img_head = nd.split(edges.src['emb'], num_outputs=2, axis=-1)
real_tail, img_tail = nd.split(edges.dst['emb'], num_outputs=2, axis=-1)
phase_rel = edges.data['emb'] / (self.emb_init / np.pi)
re_rel, im_rel = nd.cos(phase_rel), nd.sin(phase_rel)
real_score = real_head * re_rel - img_head * im_rel
img_score = real_head * im_rel + img_head * re_rel
real_score = real_score - real_tail
img_score = img_score - img_tail
        # per-element sqrt(x*x + eps), then summed over the hidden dimension
score = mx.nd.sqrt(real_score * real_score + img_score * img_score + self.eps).sum(-1)
return {'score': self.gamma - score}
def prepare(self, g, gpu_id, trace=False):
pass
def create_neg_prepare(self, neg_head):
def fn(rel_id, num_chunks, head, tail, gpu_id, trace=False):
return head, tail
return fn
def update(self, gpu_id=-1):
pass
def reset_parameters(self):
pass
def save(self, path, name):
pass
def load(self, path, name):
pass
def forward(self, g):
g.apply_edges(lambda edges: self.edge_func(edges))
def create_neg(self, neg_head):
gamma = self.gamma
emb_init = self.emb_init
eps = self.eps
if neg_head:
def fn(heads, relations, tails, num_chunks, chunk_size, neg_sample_size):
hidden_dim = heads.shape[1]
emb_real, emb_img = nd.split(tails, num_outputs=2, axis=-1)
phase_rel = relations / (emb_init / np.pi)
rel_real, rel_img = nd.cos(phase_rel), nd.sin(phase_rel)
real = emb_real * rel_real + emb_img * rel_img
img = -emb_real * rel_img + emb_img * rel_real
emb_complex = nd.concat(real, img, dim=-1)
tmp = emb_complex.reshape(num_chunks, chunk_size, 1, hidden_dim)
heads = heads.reshape(num_chunks, 1, neg_sample_size, hidden_dim)
score = tmp - heads
score_real, score_img = nd.split(score, num_outputs=2, axis=-1)
score = mx.nd.sqrt(score_real * score_real + score_img * score_img + self.eps).sum(-1)
return gamma - score
return fn
else:
def fn(heads, relations, tails, num_chunks, chunk_size, neg_sample_size):
hidden_dim = heads.shape[1]
emb_real, emb_img = nd.split(heads, num_outputs=2, axis=-1)
phase_rel = relations / (emb_init / np.pi)
rel_real, rel_img = nd.cos(phase_rel), nd.sin(phase_rel)
real = emb_real * rel_real - emb_img * rel_img
img = emb_real * rel_img + emb_img * rel_real
emb_complex = nd.concat(real, img, dim=-1)
tmp = emb_complex.reshape(num_chunks, chunk_size, 1, hidden_dim)
tails = tails.reshape(num_chunks, 1, neg_sample_size, hidden_dim)
score = tmp - tails
score_real, score_img = nd.split(score, num_outputs=2, axis=-1)
score = mx.nd.sqrt(score_real * score_real + score_img * score_img + self.eps).sum(-1)
return gamma - score
return fn
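

# Minimal self-check sketch: compare the gemm-based batched_l2_dist against a
# naive broadcast computation on small random inputs (the shapes below are
# illustrative assumptions only).
if __name__ == '__main__':
    a = nd.random.normal(shape=(2, 3, 4))   # (num_chunks, chunk_size, hidden_dim)
    b = nd.random.normal(shape=(2, 5, 4))   # (num_chunks, neg_sample_size, hidden_dim)
    fast = batched_l2_dist(a, b)
    naive = nd.norm(nd.expand_dims(a, axis=-2) - nd.expand_dims(b, axis=-3), axis=-1)
    assert np.allclose(fast.asnumpy(), naive.asnumpy(), rtol=1e-4, atol=1e-4)
    print('batched_l2_dist OK, output shape:', fast.shape)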