Commit 15b951d4, authored by Da Zheng, committed by GitHub

[KG][Model] Knowledge graph embeddings (#888)

* upd

* fig edgebatch edges

* add test

* trigger

* Update README.md for pytorch PinSage example.

Add a note that the PinSage model example under
example/pytorch/recommendation only works with Python 3.6+,
as its dataset loader depends on the stanfordnlp package,
which works only with Python 3.6+.

* Provide a framework-agnostic API to test nn modules on both CPU and CUDA side.

1. make dgl.nn.xxx framework agnostic
2. make test.backend include dgl.nn modules
3. modify test_edge_softmax of test/mxnet/test_nn.py and
    test/pytorch/test_nn.py to work on both CPU and GPU

* Fix style

* Delete unused code

* Make agnostic test only related to tests/backend

1. clear all agnostic related code in dgl.nn
2. make test_graph_conv agnostic to cpu/gpu

* Fix code style

* fix

* doc

* Make all test code under tests.mxnet/pytorch.test_nn.py
work on both CPU and GPU.

* Fix syntax

* Remove rand

* Add TAGCN nn.module and example

* Now tagcn can run on CPU.

* Add unit test for TGConv

* Fix style

* For pubmed dataset, using --lr=0.005 can achieve better acc

* Fix style

* Fix some descriptions

* trigger

* Fix doc

* Add nn.TGConv and example

* Fix bug

* Update data in mxnet.tagcn test acc.

* Fix some comments and code

* delete useless code

* Fix naming

* Fix bug

* Fix bug

* Add test for mxnet TAGConv

* Add test code for mxnet TAGConv

* Update some docs

* Fix some code

* Update docs dgl.nn.mxnet

* Update weight init

* Fix

* init version.

* change default value of regularization.

* avoid specifying adversarial_temperature

* use default eval_interval.

* remove original model.

* remove optimizer.

* set default value of num_proc

* set default value of log_interval.

* don't need to set neg_sample_size_valid.

* remove unused code.

* use uni_weight by default.

* unify model.

* rename model.

* remove unnecessary data sampler.

* remove the code for checkpoint.

* fix eval.

* raise exception in invalid arguments.

* remove RowAdagrad.

* remove unsupported score function for now.

* Fix bugs of kg
Update README

* Update Readme for mxnet distmult

* Update README.md

* Update README.md

* revert changes on dmlc

* add tests.

* update CI.

* add tests script.

* reorder tests in CI.

* measure performance.

* add results on wn18

* remove some code.

* rename the training script.

* new results on TransE.

* remove --train.

* add format.

* fix.

* use EdgeSubgraph.

* create PBGNegEdgeSubgraph to simplify the code.

* fix test

* fix CI.

* run nose for unit tests.

* remove unused code in dataset.

* change argument to save embeddings.

* test training and eval scripts in CI.

* check Pytorch version.

* fix a minor problem in config.

* fix a minor bug.

* fix readme.

* Update README.md

* Update README.md

* Update README.md
parent 1c00f3a8
@@ -582,6 +582,9 @@ class EdgeSampler(object):
         assert g.number_of_edges() == len(relations)
         self._relations = relations
+        if batch_size < 0 or neg_sample_size < 0:
+            raise Exception('Invalid arguments')
         self._return_false_neg = return_false_neg
         self._batch_size = int(batch_size)
......
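The guard added in the hunk above can be exercised in isolation. The following is a minimal sketch, not DGL code: `check_sampler_args` and `try_args` are hypothetical helpers that copy only the condition and exception from the diff, and converting `neg_sample_size` with `int()` is an assumption (the diff shows that conversion for `batch_size` only).

```python
def check_sampler_args(batch_size, neg_sample_size):
    """Hypothetical stand-in for the check added to EdgeSampler.__init__."""
    # Same condition as the diff: only negative values are rejected,
    # so zero still passes the check.
    if batch_size < 0 or neg_sample_size < 0:
        raise Exception('Invalid arguments')
    # The diff stores int(batch_size); doing the same for
    # neg_sample_size is an assumption made for this sketch.
    return int(batch_size), int(neg_sample_size)


def try_args(batch_size, neg_sample_size):
    """Return the validated pair, or the error message if validation fails."""
    try:
        return check_sampler_args(batch_size, neg_sample_size)
    except Exception as err:
        return str(err)


if __name__ == "__main__":
    print(try_args(128, 16))
    print(try_args(-1, 16))
```

Because the check runs before any value is stored, a bad argument fails at construction time rather than later during sampling.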
#!/bin/bash

KG_DIR="./apps/kg/"

# print an error message and abort with a non-zero exit status
function fail {
    echo "FAIL: $*"
    exit 1
}

function usage {
    echo "Usage: $0 backend device"
}

# check arguments
if [ $# -ne 2 ]; then
    usage
    fail "Error: must specify backend and device"
fi

if [ "$2" == "cpu" ]; then
    dev=-1
elif [ "$2" == "gpu" ]; then
    export CUDA_VISIBLE_DEVICES=0
    dev=0
else
    usage
    fail "Unknown device $2"
fi

export DGLBACKEND=$1
export DGL_LIBRARY_PATH=${PWD}/build
export PYTHONPATH=${PWD}/python:$KG_DIR:$PYTHONPATH
export DGL_DOWNLOAD_DIR=${PWD}

# test
pushd "$KG_DIR" > /dev/null

python3 -m nose -v --with-xunit tests/test_score.py || fail "run test_score.py on $1"

if [ "$2" == "cpu" ]; then
    # verify CPU training
    python3 train.py --model DistMult --dataset FB15k --batch_size 128 \
        --neg_sample_size 16 --hidden_dim 100 --gamma 500.0 --lr 0.1 --max_step 100 \
        --batch_size_eval 16 --valid --test -adv --eval_interval 30 --eval_percent 0.01
elif [ "$2" == "gpu" ]; then
    # verify GPU training
    python3 train.py --model DistMult --dataset FB15k --batch_size 128 \
        --neg_sample_size 16 --hidden_dim 100 --gamma 500.0 --lr 0.1 --max_step 100 \
        --batch_size_eval 16 --gpu 0 --valid --test -adv --eval_interval 30 --eval_percent 0.01

    # verify mixed CPU GPU training
    python3 train.py --model DistMult --dataset FB15k --batch_size 128 \
        --neg_sample_size 16 --hidden_dim 100 --gamma 500.0 --lr 0.1 --max_step 100 \
        --batch_size_eval 16 --gpu 0 --valid --test -adv --mix_cpu_gpu --eval_percent 0.01 \
        --save_emb DistMult_FB15k_emb

    # verify saving training result
    python3 eval.py --model_name DistMult --dataset FB15k --hidden_dim 2000 \
        --gamma 500.0 --batch_size 16 --gpu 0 --model_path DistMult_FB15k_emb/ --eval_percent 0.01
fi

popd > /dev/null