# DGL - Knowledge Graph Embedding

## Introduction

DGL-KE is a DGL-based package for efficiently computing node embeddings and relation embeddings of knowledge graphs. This package is adapted from [KnowledgeGraphEmbedding](https://github.com/DeepGraphLearning/KnowledgeGraphEmbedding). We enable fast and scalable training of knowledge graph embeddings, while keeping the package as extensible as [KnowledgeGraphEmbedding](https://github.com/DeepGraphLearning/KnowledgeGraphEmbedding). On a single machine, training takes only a few minutes for medium-size knowledge graphs, such as FB15k and wn18, and a couple of hours on Freebase, which has hundreds of millions of edges.

DGL-KE includes the following knowledge graph embedding models:

- TransE (TransE_l1 with L1 distance and TransE_l2 with L2 distance)
- DistMult
- ComplEx
- RESCAL
- TransR
- RotatE

We will add other popular models in the future.

DGL-KE supports multiple training modes:

- CPU training
- GPU training
- Joint CPU & GPU training
- Multiprocessing training on CPUs

For joint CPU & GPU training, node embeddings are stored on CPU and mini-batches are trained on GPU. This mode is designed for training KGE models on large knowledge graphs.

For multiprocessing training, each process trains mini-batches independently and uses shared memory for communication between processes. This mode is designed to train KGE models on large knowledge graphs with many CPU cores.

We will support multi-GPU training and distributed training in the near future.

## Requirements

The package can run with both PyTorch and MXNet. For PyTorch, it works with PyTorch v1.2 or newer. For MXNet, it works with MXNet 1.5 or newer.

## Built-in Datasets

DGL-KE provides five built-in knowledge graphs:

| Dataset | #nodes | #edges | #relations |
|---------|--------|--------|------------|
| [FB15k](https://data.dgl.ai/dataset/FB15k.zip) | 14951 | 592213 | 1345 |
| [FB15k-237](https://data.dgl.ai/dataset/FB15k-237.zip) | 14541 | 310116 | 237 |
| [wn18](https://data.dgl.ai/dataset/wn18.zip) | 40943 | 151442 | 18 |
| [wn18rr](https://data.dgl.ai/dataset/wn18rr.zip) | 40943 | 93003 | 11 |
| [Freebase](https://data.dgl.ai/dataset/Freebase.zip) | 86054151 | 338586276 | 14824 |

Users can specify one of the datasets with `--dataset` in `train.py` and `eval.py`.
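For example, switching to another built-in graph only changes the `--dataset` flag. A minimal sketch: the remaining hyperparameters below are copied from the FB15k DistMult example in the Command line parameters section and are not tuned for wn18.

```bash
# Train DistMult on the built-in wn18 dataset. Only --dataset differs from
# the FB15k example later in this README; the hyperparameters here are
# illustrative, not tuned for wn18.
python3 train.py --model DistMult --dataset wn18 --batch_size 1024 \
--neg_sample_size 256 --hidden_dim 400 --gamma 143.0 --lr 0.08 \
--batch_size_eval 16 --valid --test -adv --gpu 0 --max_step 20000
```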
## Performance

The 1-GPU speed is measured with 8 CPU cores and one Nvidia V100 GPU (AWS p3.2xlarge). The 8-GPU speed is measured with 64 CPU cores and eight Nvidia V100 GPUs (AWS p3.16xlarge).

The speed on FB15k (1 GPU)

| Models  | TransE_l1 | TransE_l2 | DistMult | ComplEx | RESCAL | TransR | RotatE |
|---------|-----------|-----------|----------|---------|--------|--------|--------|
|MAX_STEPS| 48000 | 32000 | 40000 | 100000 | 32000 | 32000 | 20000 |
|TIME | 370s | 270s | 312s | 282s | 2095s | 1556s | 1861s |

The accuracy on FB15k (1 GPU)

| Models    | MR    | MRR   | HITS@1 | HITS@3 | HITS@10 |
|-----------|-------|-------|--------|--------|---------|
| TransE_l1 | 44.18 | 0.675 | 0.551 | 0.774 | 0.861 |
| TransE_l2 | 46.71 | 0.665 | 0.551 | 0.804 | 0.846 |
| DistMult  | 61.04 | 0.725 | 0.625 | 0.837 | 0.883 |
| ComplEx   | 64.59 | 0.785 | 0.718 | 0.835 | 0.889 |
| RESCAL    | 122.3 | 0.669 | 0.598 | 0.711 | 0.793 |
| TransR    | 59.86 | 0.676 | 0.591 | 0.735 | 0.814 |
| RotatE    | 43.66 | 0.728 | 0.632 | 0.801 | 0.874 |

The speed on FB15k (8 GPU)

| Models  | TransE_l1 | TransE_l2 | DistMult | ComplEx | RESCAL | TransR | RotatE |
|---------|-----------|-----------|----------|---------|--------|--------|--------|
|MAX_STEPS| 6000 | 4000 | 5000 | 4000 | 4000 | 4000 | 2500 |
|TIME | 88.93s | 62.99s | 72.74s | 68.37s | 245.9s | 203.9s | 126.7s |

The accuracy on FB15k (8 GPU)

| Models    | MR    | MRR   | HITS@1 | HITS@3 | HITS@10 |
|-----------|-------|-------|--------|--------|---------|
| TransE_l1 | 44.25 | 0.672 | 0.547 | 0.774 | 0.860 |
| TransE_l2 | 46.13 | 0.658 | 0.539 | 0.748 | 0.845 |
| DistMult  | 61.72 | 0.723 | 0.626 | 0.798 | 0.881 |
| ComplEx   | 65.84 | 0.754 | 0.676 | 0.813 | 0.880 |
| RESCAL    | 135.6 | 0.652 | 0.580 | 0.693 | 0.779 |
| TransR    | 65.27 | 0.676 | 0.591 | 0.736 | 0.811 |
| RotatE    | 49.59 | 0.683 | 0.581 | 0.759 | 0.848 |

In comparison, GraphVite takes 14 minutes to train TransE on FB15k with 4 GPUs, so with 8 GPUs DGL-KE trains TransE on FB15k about 9.5X as fast. More performance information on GraphVite can be found [here](https://github.com/DeepGraphLearning/graphvite).
The speed on wn18 (1 GPU)

| Models  | TransE_l1 | TransE_l2 | DistMult | ComplEx | RESCAL | TransR | RotatE |
|---------|-----------|-----------|----------|---------|--------|--------|--------|
|MAX_STEPS| 32000 | 32000 | 20000 | 20000 | 20000 | 30000 | 24000 |
|TIME | 531.5s | 406.6s | 284.1s | 282.3s | 443.6s | 766.2s | 829.4s |

The accuracy on wn18 (1 GPU)

| Models    | MR    | MRR   | HITS@1 | HITS@3 | HITS@10 |
|-----------|-------|-------|--------|--------|---------|
| TransE_l1 | 318.4 | 0.764 | 0.602 | 0.929 | 0.949 |
| TransE_l2 | 206.2 | 0.561 | 0.306 | 0.800 | 0.944 |
| DistMult  | 486.0 | 0.818 | 0.711 | 0.921 | 0.948 |
| ComplEx   | 268.6 | 0.933 | 0.916 | 0.949 | 0.961 |
| RESCAL    | 536.6 | 0.848 | 0.790 | 0.900 | 0.927 |
| TransR    | 452.4 | 0.620 | 0.461 | 0.758 | 0.856 |
| RotatE    | 487.9 | 0.944 | 0.940 | 0.947 | 0.952 |

The speed on wn18 (8 GPU)

| Models  | TransE_l1 | TransE_l2 | DistMult | ComplEx | RESCAL | TransR | RotatE |
|---------|-----------|-----------|----------|---------|--------|--------|--------|
|MAX_STEPS| 4000 | 4000 | 2500 | 2500 | 2500 | 2500 | 3000 |
|TIME | 119.3s | 81.1s | 76.0s | 58.0s | 594.1s | 1168s | 139.8s |

The accuracy on wn18 (8 GPU)

| Models    | MR    | MRR   | HITS@1 | HITS@3 | HITS@10 |
|-----------|-------|-------|--------|--------|---------|
| TransE_l1 | 360.3 | 0.745 | 0.562 | 0.930 | 0.951 |
| TransE_l2 | 193.8 | 0.557 | 0.301 | 0.799 | 0.942 |
| DistMult  | 499.9 | 0.807 | 0.692 | 0.917 | 0.945 |
| ComplEx   | 476.7 | 0.935 | 0.926 | 0.943 | 0.949 |
| RESCAL    | 618.8 | 0.848 | 0.791 | 0.897 | 0.927 |
| TransR    | 513.1 | 0.659 | 0.491 | 0.821 | 0.871 |
| RotatE    | 466.2 | 0.944 | 0.940 | 0.945 | 0.951 |

The speed on Freebase (8 GPU)

| Models  | TransE_l2 | DistMult | ComplEx | TransR | RotatE |
|---------|-----------|----------|---------|--------|--------|
|MAX_STEPS| 320000 | 300000 | 360000 | 300000 | 300000 |
|TIME | 7908s | 7425s | 8946s | 16816s | 12817s |

The accuracy on Freebase, 8 GPU (tested with 1000 negative edges sampled for each positive edge)

| Models    | MR   | MRR   | HITS@1 | HITS@3 | HITS@10 |
|-----------|------|-------|--------|--------|---------|
| TransE_l2 | 22.4 | 0.756 | 0.688 | 0.800 | 0.882 |
| DistMult  | 45.4 | 0.833 | 0.812 | 0.843 | 0.872 |
| ComplEx   | 48.0 | 0.830 | 0.812 | 0.838 | 0.864 |
| TransR    | 51.2 | 0.697 | 0.656 | 0.716 | 0.771 |
| RotatE    | 93.3 | 0.770 | 0.749 | 0.780 | 0.805 |

The speed on Freebase (48 CPU). This is measured with 48 CPU cores on an AWS r5dn.24xlarge.

| Models  | TransE_l2 | DistMult | ComplEx |
|---------|-----------|----------|---------|
|MAX_STEPS| 50000 | 50000 | 50000 |
|TIME | 7002s | 6340s | 8133s |

The accuracy on Freebase, 48 CPU (tested with 1000 negative edges sampled for each positive edge)

| Models    | MR   | MRR   | HITS@1 | HITS@3 | HITS@10 |
|-----------|------|-------|--------|--------|---------|
| TransE_l2 | 30.8 | 0.814 | 0.764 | 0.848 | 0.902 |
| DistMult  | 45.1 | 0.834 | 0.815 | 0.843 | 0.871 |
| ComplEx   | 44.9 | 0.837 | 0.819 | 0.845 | 0.870 |

The configuration for reproducing the performance results can be found [here](https://github.com/dmlc/dgl/blob/master/apps/kg/config/best_config.sh).

## Usage

DGL-KE doesn't require installation. The package contains two scripts, `train.py` and `eval.py`.

* `train.py` trains knowledge graph embeddings and outputs the trained node embeddings and relation embeddings.
* `eval.py` reads the pre-trained node embeddings and relation embeddings and evaluates how accurately they predict the tail node given (head, rel, ?) and the head node given (?, rel, tail).

### Input formats:

DGL-KE supports two knowledge graph input formats for user-defined datasets.

Format 1: raw_udd_[h|r|t], raw user-defined dataset. In this format, users only need to provide the triples and let the dataloader generate and manage the id mapping. The dataloader will generate two files: entities.tsv for the entity id mapping and relations.tsv for the relation id mapping. The order of head, relation and tail entities is described by [h|r|t]; for example, raw_udd_trh means the triples are stored in the order of tail, relation and head (see the sketch after this section). The dataset should contain three files:

- *train* stores the triples in the training set. Each triple, e.g., [src_name, rel_name, dst_name], should follow the order specified in [h|r|t].
- *valid* stores the triples in the validation set. Each triple, e.g., [src_name, rel_name, dst_name], should follow the order specified in [h|r|t].
- *test* stores the triples in the test set. Each triple, e.g., [src_name, rel_name, dst_name], should follow the order specified in [h|r|t].

Format 2: udd_[h|r|t], user-defined dataset. In this format, users should provide the id mapping for entities and relations themselves. The order of head, relation and tail entities is described by [h|r|t]; for example, udd_trh means the triples are stored in the order of tail, relation and head. The dataset should contain five files:

- *entities* stores the mapping between entity name and entity Id.
- *relations* stores the mapping between relation name and relation Id.
- *train* stores the triples in the training set. Each triple, e.g., [src_id, rel_id, dst_id], should follow the order specified in [h|r|t].
- *valid* stores the triples in the validation set. Each triple, e.g., [src_id, rel_id, dst_id], should follow the order specified in [h|r|t].
- *test* stores the triples in the test set. Each triple, e.g., [src_id, rel_id, dst_id], should follow the order specified in [h|r|t].
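As a concrete illustration, the snippet below lays out a tiny hypothetical dataset in raw_udd_hrt order. The directory name, the triples, and the tab delimiter are all assumptions made for this example (the `.tsv` mapping files the dataloader emits suggest tab-separated fields); how the format name is passed to `train.py` depends on its dataset options, so check the script's help.

```bash
# Hypothetical raw user-defined dataset in raw_udd_hrt order: one triple per
# line as head, relation, tail. The names and the tab delimiter are invented
# for illustration; entities.tsv and relations.tsv are generated by the
# dataloader, not written by the user.
mkdir -p my_kg
printf 'Albert_Einstein\tborn_in\tUlm\n'          >  my_kg/train
printf 'Ulm\tlocated_in\tGermany\n'               >> my_kg/train
printf 'Albert_Einstein\tcitizen_of\tGermany\n'   >  my_kg/valid
printf 'Albert_Einstein\toccupation\tPhysicist\n' >  my_kg/test
```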
### Output formats:

To save the trained embeddings, users have to provide the path with `--save_emb` when running `train.py`. The saved embeddings are stored as numpy ndarrays.

* The node embedding is saved as `XXX_YYY_entity.npy`.
* The relation embedding is saved as `XXX_YYY_relation.npy`.

`XXX` is the dataset name and `YYY` is the model name.
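Because the embeddings are plain numpy arrays, they can be inspected directly. A minimal sketch, assuming DistMult was trained on FB15k and saved with `--save_emb DistMult_FB15k_emb` as in the example below, so the files follow the `XXX_YYY` naming convention above:

```bash
# Load the saved embeddings with numpy. Per the naming convention above,
# XXX = dataset (FB15k) and YYY = model (DistMult); the output directory
# matches the --save_emb example in the next section.
python3 - <<'EOF'
import numpy as np

entity = np.load('DistMult_FB15k_emb/FB15k_DistMult_entity.npy')
relation = np.load('DistMult_FB15k_emb/FB15k_DistMult_relation.npy')

# Each array is expected to have one row per id and one column per
# embedding dimension.
print('entity embeddings:  ', entity.shape)
print('relation embeddings:', relation.shape)
EOF
```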
### Command line parameters

Here are some examples of using the training script.

Train KGE models with a GPU:

```bash
python3 train.py --model DistMult --dataset FB15k --batch_size 1024 --neg_sample_size 256 \
--hidden_dim 400 --gamma 143.0 --lr 0.08 --batch_size_eval 16 --valid --test -adv \
--gpu 0 --max_step 40000
```

Train KGE models with mixed CPUs and multiple GPUs:

```bash
python3 train.py --model DistMult --dataset FB15k --batch_size 1024 --neg_sample_size 256 \
--hidden_dim 400 --gamma 143.0 --lr 0.08 --batch_size_eval 16 --valid --test -adv \
--max_step 5000 --mix_cpu_gpu --num_proc 8 --gpu 0 1 2 3 4 5 6 7 --async_update \
--soft_rel_part --force_sync_interval 1000
```

Train embeddings and verify them later:

```bash
python3 train.py --model DistMult --dataset FB15k --batch_size 1024 --neg_sample_size 256 \
--hidden_dim 400 --gamma 143.0 --lr 0.08 --batch_size_eval 16 --valid --test -adv \
--gpu 0 --max_step 40000 --save_emb DistMult_FB15k_emb

python3 eval.py --model_name DistMult --dataset FB15k --hidden_dim 400 \
--gamma 143.0 --batch_size 16 --gpu 0 --model_path DistMult_FB15k_emb/
```

Train embeddings with multiprocessing (this currently doesn't work with MXNet):

```bash
python3 train.py --model TransE_l2 --dataset Freebase --batch_size 1000 \
--neg_sample_size 200 --hidden_dim 400 --gamma 10 --lr 0.1 --max_step 50000 \
--log_interval 100 --batch_size_eval 1000 --neg_sample_size_eval 1000 --test \
-adv --regularization_coef 1e-9 --num_thread 1 --num_proc 48
```