"...git@developer.sourcefind.cn:renzhc/diffusers_dcu.git" did not exist on "1926331eaf59dae54aeb97cde19dae16c2fdaa48"
Unverified Commit 15b951d4 authored by Da Zheng, committed by GitHub

[KG][Model] Knowledge graph embeddings (#888)

* upd

* fix edgebatch edges

* add test

* trigger

* Update README.md for pytorch PinSage example.

Add a note that the PinSage model example under
example/pytorch/recommendation only works with Python 3.6+,
as its dataset loader depends on the stanfordnlp package,
which works only with Python 3.6+.

* Provide a framework-agnostic API to test nn modules on both CPU and CUDA.

1. make dgl.nn.xxx framework agnostic
2. make test.backend include dgl.nn modules
3. modify test_edge_softmax of test/mxnet/test_nn.py and
    test/pytorch/test_nn.py to work on both CPU and GPU

* Fix style

* Delete unused code

* Make agnostic test only related to tests/backend

1. clear all agnostic related code in dgl.nn
2. make test_graph_conv agnostic to cpu/gpu

* Fix code style

* fix

* doc

* Make all test code under tests.mxnet/pytorch.test_nn.py
work on both CPU and GPU.

* Fix syntax

* Remove rand

* Add TAGCN nn.module and example

* Now tagcn can run on CPU.

* Add unit test for TGConv

* Fix style

* For pubmed dataset, using --lr=0.005 can achieve better acc

* Fix style

* Fix some descriptions

* trigger

* Fix doc

* Add nn.TGConv and example

* Fix bug

* Update data in mxnet.tagcn test acc.

* Fix some comments and code

* delete useless code

* Fix naming

* Fix bug

* Fix bug

* Add test for mxnet TAGConv

* Add test code for mxnet TAGConv

* Update some docs

* Fix some code

* Update docs dgl.nn.mxnet

* Update weight init

* Fix

* init version.

* change default value of regularization.

* avoid specifying adversarial_temperature

* use default eval_interval.

* remove original model.

* remove optimizer.

* set default value of num_proc

* set default value of log_interval.

* don't need to set neg_sample_size_valid.

* remove unused code.

* use uni_weight by default.

* unify model.

* rename model.

* remove unnecessary data sampler.

* remove the code for checkpoint.

* fix eval.

* raise exception in invalid arguments.

* remove RowAdagrad.

* remove unsupported score function for now.

* Fix bugs of kg
Update README

* Update Readme for mxnet distmult

* Update README.md

* Update README.md

* revert changes on dmlc

* add tests.

* update CI.

* add tests script.

* reorder tests in CI.

* measure performance.

* add results on wn18

* remove some code.

* rename the training script.

* new results on TransE.

* remove --train.

* add format.

* fix.

* use EdgeSubgraph.

* create PBGNegEdgeSubgraph to simplify the code.

* fix test

* fix CI.

* run nose for unit tests.

* remove unused code in dataset.

* change argument to save embeddings.

* test training and eval scripts in CI.

* check Pytorch version.

* fix a minor problem in config.

* fix a minor bug.

* fix readme.

* Update README.md

* Update README.md

* Update README.md
parent 1c00f3a8
@@ -69,6 +69,14 @@ def unit_test_win64(backend, dev) {
    }
}

def kg_test_linux(backend, dev) {
    init_git()
    unpack_lib("dgl-${dev}-linux", dgl_linux_libs)
    timeout(time: 20, unit: 'MINUTES') {
        sh "bash tests/scripts/task_kg_test.sh ${backend} ${dev}"
    }
}

def example_test_linux(backend, dev) {
    init_git()
    unpack_lib("dgl-${dev}-linux", dgl_linux_libs)
@@ -196,6 +204,11 @@ pipeline {
                tutorial_test_linux("pytorch")
            }
        }
        stage("Knowledge Graph test") {
            steps {
                kg_test_linux("pytorch", "cpu")
            }
        }
    }
    post {
        always {
@@ -257,6 +270,11 @@ pipeline {
                unit_test_linux("mxnet", "cpu")
            }
        }
        stage("Knowledge Graph test") {
            steps {
                kg_test_linux("mxnet", "cpu")
            }
        }
        //stage("Tutorial test") {
        //    steps {
        //        tutorial_test_linux("mxnet")
...
# DGL - Knowledge Graph Embedding
## Introduction
DGL-KE aims to compute knowledge graph embeddings efficiently on very large knowledge graphs.
It can train embeddings on knowledge graphs such as FB15k and wn18 within a few minutes, and on
Freebase, which has hundreds of millions of edges, within a couple of hours.
For now, it supports the following knowledge graph embedding models:
- TransE
- DistMult
- ComplEx
More models will be supported in the near future.
DGL-KE supports multiple training modes:
- CPU & GPU training
- Mixed CPU & GPU training: node embeddings are stored on CPU and mini-batches are trained on GPU. This is designed for training KGE models on large knowledge graphs.
- Multiprocessing training on CPUs: this is designed to train KGE models on large knowledge graphs with many CPU cores.
We will support multi-GPU training and distributed training in the near future.
## Requirements
The package runs with both PyTorch and MXNet. For PyTorch, it requires PyTorch v1.2 or newer.
For MXNet, it requires MXNet 1.5 or newer.
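The backend is selected with the `DGLBACKEND` environment variable. A minimal sketch (the flags here are illustrative; see the Usage section below for complete commands):

```bash
# Run the training script with the MXNet backend (illustrative flags only).
DGLBACKEND=mxnet python3 train.py --model DistMult --dataset FB15k --gpu 0
# Run with the PyTorch backend, which the examples below use.
DGLBACKEND=pytorch python3 train.py --model DistMult --dataset FB15k --gpu 0
```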
## Datasets
DGL-KE provides five knowledge graphs:
- FB15k
- FB15k-237
- wn18
- wn18rr
- Freebase
Users can specify one of the datasets with `--dataset` in `train.py` and `eval.py`.
## Performance
The speed is measured on an EC2 P3 instance with an NVIDIA V100 GPU.

The speed on FB15k:

| Models   | TransE | DistMult | ComplEx |
|----------|--------|----------|---------|
| MAX_STEPS| 20000  | 100000   | 100000  |
| TIME     | 411s   | 690s     | 806s    |

The accuracy on FB15k:

| Models   | MR    | MRR   | HITS@1 | HITS@3 | HITS@10 |
|----------|-------|-------|--------|--------|---------|
| TransE   | 69.12 | 0.656 | 0.567  | 0.718  | 0.802   |
| DistMult | 43.35 | 0.783 | 0.713  | 0.837  | 0.897   |
| ComplEx  | 51.99 | 0.785 | 0.720  | 0.832  | 0.889   |

The speed on wn18:

| Models   | TransE | DistMult | ComplEx |
|----------|--------|----------|---------|
| MAX_STEPS| 40000  | 10000    | 20000   |
| TIME     | 719s   | 126s     | 266s    |

The accuracy on wn18:

| Models   | MR     | MRR   | HITS@1 | HITS@3 | HITS@10 |
|----------|--------|-------|--------|--------|---------|
| TransE   | 321.35 | 0.760 | 0.652  | 0.850  | 0.940   |
| DistMult | 271.09 | 0.769 | 0.639  | 0.892  | 0.949   |
| ComplEx  | 276.37 | 0.935 | 0.916  | 0.950  | 0.960   |
## Usage
The package supports two data formats for a knowledge graph; an illustrative example of the files follows the two lists below.
Format 1:
- entities.dict maps entity Id to entity name.
- relations.dict maps relation Id to relation name.
- train.txt stores the triples (head, rel, tail) in the training set.
- valid.txt stores the triples (head, rel, tail) in the validation set.
- test.txt stores the triples (head, rel, tail) in the test set.
Format 2:
- entity2id.txt maps entity name to entity Id.
- relation2id.txt maps relation name to relation Id.
- train.txt stores the triples (head, tail, rel) in the training set.
- valid.txt stores the triples (head, tail, rel) in the validation set.
- test.txt stores the triples (head, tail, rel) in the test set.
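For illustration, format 1 files might look like the following (the entity and relation names below are made up; fields are tab-separated). In format 2 the mapping files store `name\tid` pairs instead, with the first line of `entity2id.txt`/`relation2id.txt` giving the total count, and the triple files store integer Ids in the order (head, tail, rel).

```
# entities.dict: one "id<TAB>entity_name" pair per line
0	/m/example_entity_a
1	/m/example_entity_b

# relations.dict: one "id<TAB>relation_name" pair per line
0	/example/located_in

# train.txt, valid.txt, test.txt: one "head<TAB>relation<TAB>tail" triple per line
/m/example_entity_a	/example/located_in	/m/example_entity_b
```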
Here are some examples of using the training script.
Train KGE models with GPU.
```bash
python3 train.py --model DistMult --dataset FB15k --batch_size 1024 \
--neg_sample_size 256 --hidden_dim 2000 --gamma 500.0 --lr 0.1 --max_step 100000 \
--batch_size_eval 16 --gpu 0 --valid --test -adv
```
Train KGE models with mixed CPUs and GPUs.
```bash
python3 train.py --model DistMult --dataset FB15k --batch_size 1024 \
--neg_sample_size 256 --hidden_dim 2000 --gamma 500.0 --lr 0.1 --max_step 100000 \
--batch_size_eval 16 --gpu 0 --valid --test -adv --mix_cpu_gpu
```
Train embeddings and evaluate them later.
```bash
python3 train.py --model DistMult --dataset FB15k --batch_size 1024 \
--neg_sample_size 256 --hidden_dim 2000 --gamma 500.0 --lr 0.1 --max_step 100000 \
--batch_size_eval 16 --gpu 0 --valid -adv --save_emb DistMult_FB15k_emb
python3 eval.py --model_name DistMult --dataset FB15k --hidden_dim 2000 \
--gamma 500.0 --batch_size 16 --gpu 0 --model_path DistMult_FB15k_emb/
```
Train embeddings with multi-processing. This currently doesn't work in MXNet.
```bash
python3 train.py --model DistMult --dataset FB15k --batch_size 1024 \
--neg_sample_size 256 --hidden_dim 2000 --gamma 500.0 --lr 0.07 --max_step 3000 \
--batch_size_eval 16 --regularization_coef 0.000001 --valid --test -adv --num_proc 8
```
## Freebase
Train embeddings on Freebase with multiprocessing on an X1 instance.
```bash
DGLBACKEND=pytorch python3 train.py --model ComplEx --dataset Freebase --batch_size 1024 \
--neg_sample_size 256 --hidden_dim 400 --gamma 500.0 \
--lr 0.1 --max_step 50000 --batch_size_eval 128 --test -adv --eval_interval 300000 \
--neg_sample_size_test 10000 --eval_percent 0.2 --num_proc 64

# Sample output:
Test average MR at [0/50000]: 754.5566055566055
Test average MRR at [0/50000]: 0.7333319016877765
Test average HITS@1 at [0/50000]: 0.7182952182952183
Test average HITS@3 at [0/50000]: 0.7409752409752409
Test average HITS@10 at [0/50000]: 0.7587412587412588
```
# To reproduce the reported results in this README, run the models with the following commands:
# for FB15k
DGLBACKEND=pytorch python3 train.py --model DistMult --dataset FB15k --batch_size 1024 \
--neg_sample_size 256 --hidden_dim 2000 --gamma 500.0 --lr 0.1 --max_step 100000 \
--batch_size_eval 16 --gpu 0 --valid --test -adv
DGLBACKEND=pytorch python3 train.py --model ComplEx --dataset FB15k --batch_size 1024 \
--neg_sample_size 256 --hidden_dim 2000 --gamma 500.0 --lr 0.2 --max_step 100000 \
--batch_size_eval 16 --gpu 1 --valid --test -adv
DGLBACKEND=pytorch python3 train.py --model TransE --dataset FB15k --batch_size 1024 \
--neg_sample_size 256 --hidden_dim 2000 --gamma 24.0 --lr 0.01 --max_step 20000 \
--batch_size_eval 16 --gpu 0 --valid --test -adv
# for wn18
DGLBACKEND=pytorch python3 train.py --model TransE --dataset wn18 --batch_size 1024 \
--neg_sample_size 512 --hidden_dim 500 --gamma 12.0 --adversarial_temperature 0.5 \
--lr 0.01 --max_step 40000 --batch_size_eval 16 --gpu 0 --valid --test -adv \
--regularization_coef 0.00001
DGLBACKEND=pytorch python3 train.py --model DistMult --dataset wn18 --batch_size 1024 \
--neg_sample_size 1024 --hidden_dim 1000 --gamma 200.0 --lr 0.1 --max_step 10000 \
--batch_size_eval 16 --gpu 0 --valid --test -adv --regularization_coef 0.00001
DGLBACKEND=pytorch python3 train.py --model ComplEx --dataset wn18 --batch_size 1024 \
--neg_sample_size 1024 --hidden_dim 500 --gamma 200.0 --lr 0.1 --max_step 20000 \
--batch_size_eval 16 --gpu 0 --valid --test -adv --regularization_coef 0.00001
import os
def _download_and_extract(url, path, filename):
import shutil, zipfile
from tqdm import tqdm
import requests
fn = os.path.join(path, filename)
while True:
try:
with zipfile.ZipFile(fn) as zf:
zf.extractall(path)
print('Unzip finished.')
break
except Exception:
os.makedirs(path, exist_ok=True)
f_remote = requests.get(url, stream=True)
sz = f_remote.headers.get('content-length')
assert f_remote.status_code == 200, 'fail to open {}'.format(url)
with open(fn, 'wb') as writer:
for chunk in tqdm(f_remote.iter_content(chunk_size=1024*1024)):
writer.write(chunk)
print('Download finished. Unzipping the file...')
class KGDataset1:
'''Load a knowledge graph with format 1
In this format, the folder with a knowledge graph has five files:
* entities.dict stores the mapping between entity Id and entity name.
* relations.dict stores the mapping between relation Id and relation name.
* train.txt stores the triples in the training set.
* valid.txt stores the triples in the validation set.
* test.txt stores the triples in the test set.
The mapping between entity (relation) Id and entity (relation) name is stored as 'id\tname'.
The triples are stored as 'head_name\trelation_name\ttail_name'.
'''
def __init__(self, path, name):
url = 'https://s3.us-east-2.amazonaws.com/dgl.ai/dataset/{}.zip'.format(name)
if not os.path.exists(os.path.join(path, name)):
print('File not found. Downloading from', url)
_download_and_extract(url, path, name + '.zip')
path = os.path.join(path, name)
with open(os.path.join(path, 'entities.dict')) as f:
entity2id = {}
for line in f:
eid, entity = line.strip().split('\t')
entity2id[entity] = int(eid)
self.entity2id = entity2id
with open(os.path.join(path, 'relations.dict')) as f:
relation2id = {}
for line in f:
rid, relation = line.strip().split('\t')
relation2id[relation] = int(rid)
self.relation2id = relation2id
# TODO: deal with the countries dataset.
self.n_entities = len(self.entity2id)
self.n_relations = len(self.relation2id)
self.train = self.read_triple(path, 'train')
self.valid = self.read_triple(path, 'valid')
self.test = self.read_triple(path, 'test')
def read_triple(self, path, mode):
# mode: train/valid/test
triples = []
with open(os.path.join(path, '{}.txt'.format(mode))) as f:
for line in f:
h, r, t = line.strip().split('\t')
triples.append((self.entity2id[h], self.relation2id[r], self.entity2id[t]))
return triples
class KGDataset2:
'''Load a knowledge graph with format 2
In this format, the folder with a knowledge graph has five files:
* entity2id.txt stores the mapping between entity name and entity Id.
* relation2id.txt stores the mapping between relation name and relation Id.
* train.txt stores the triples in the training set.
* valid.txt stores the triples in the validation set.
* test.txt stores the triples in the test set.
The mapping between entity (relation) name and entity (relation) Id is stored as 'name\tid'.
The triples are stored as 'head_nid\ttail_nid\trelation_id'.
'''
def __init__(self, path, name):
url = 'https://s3.us-east-2.amazonaws.com/dgl.ai/dataset/{}.zip'.format(name)
if not os.path.exists(os.path.join(path, name)):
print('File not found. Downloading from', url)
_download_and_extract(url, path, '{}.zip'.format(name))
self.path = os.path.join(path, name)
f_ent2id = os.path.join(self.path, 'entity2id.txt')
f_rel2id = os.path.join(self.path, 'relation2id.txt')
with open(f_ent2id) as f_ent:
self.n_entities = int(f_ent.readline()[:-1])
with open(f_rel2id) as f_rel:
self.n_relations = int(f_rel.readline()[:-1])
self.train = self.read_triple(self.path, 'train')
self.valid = self.read_triple(self.path, 'valid')
self.test = self.read_triple(self.path, 'test')
def read_triple(self, path, mode, skip_first_line=False):
triples = []
print('Reading {} triples....'.format(mode))
with open(os.path.join(path, '{}.txt'.format(mode))) as f:
if skip_first_line:
_ = f.readline()
for line in f:
h, t, r = line.strip().split('\t')
triples.append((int(h), int(r), int(t)))
print('Finished. Read {} {} triples.'.format(len(triples), mode))
return triples
def get_dataset(data_path, data_name, format_str):
if data_name == 'Freebase':
dataset = KGDataset2(data_path, data_name)
elif format_str == '1':
dataset = KGDataset1(data_path, data_name)
else:
dataset = KGDataset2(data_path, data_name)
return dataset
from .KGDataset import *
from .sampler import *
import math
import numpy as np
import scipy as sp
import scipy.sparse
import dgl.backend as F
import dgl
import os
import pickle
import time
# This partitions a list of edges based on relations to make sure
# each partition has roughly the same number of edges and relations.
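# Greedy balancing: relations are sorted by frequency (most frequent first) and each
# relation is assigned to the partition that currently holds the fewest edges, so all
# edges of a relation land in the same partition while edge counts stay roughly even.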
def RelationPartition(edges, n):
print('relation partition {} edges into {} parts'.format(len(edges), n))
rel = np.array([r for h, r, t in edges])
uniq, cnts = np.unique(rel, return_counts=True)
idx = np.flip(np.argsort(cnts))
cnts = cnts[idx]
uniq = uniq[idx]
assert cnts[0] > cnts[-1]
edge_cnts = np.zeros(shape=(n,), dtype=np.int64)
rel_cnts = np.zeros(shape=(n,), dtype=np.int64)
rel_dict = {}
for i in range(len(cnts)):
cnt = cnts[i]
r = uniq[i]
idx = np.argmin(edge_cnts)
rel_dict[r] = idx
edge_cnts[idx] += cnt
rel_cnts[idx] += 1
for i, edge_cnt in enumerate(edge_cnts):
print('part {} has {} edges and {} relations'.format(i, edge_cnt, rel_cnts[i]))
parts = []
for _ in range(n):
parts.append([])
for h, r, t in edges:
idx = rel_dict[r]
parts[idx].append((h, r, t))
return parts
def RandomPartition(edges, n):
print('random partition {} edges into {} parts'.format(len(edges), n))
idx = np.random.permutation(len(edges))
part_size = int(math.ceil(len(idx) / n))
parts = []
for i in range(n):
start = part_size * i
end = min(part_size * (i + 1), len(idx))
parts.append([edges[i] for i in idx[start:end]])
return parts
def ConstructGraph(edges, n_entities, i, args):
pickle_name = 'graph_train_{}.pickle'.format(i)
if args.pickle_graph and os.path.exists(os.path.join(args.data_path, args.dataset, pickle_name)):
with open(os.path.join(args.data_path, args.dataset, pickle_name), 'rb') as graph_file:
g = pickle.load(graph_file)
print('Load pickled graph.')
else:
src = [t[0] for t in edges]
etype_id = [t[1] for t in edges]
dst = [t[2] for t in edges]
coo = sp.sparse.coo_matrix((np.ones(len(src)), (src, dst)), shape=[n_entities, n_entities])
g = dgl.DGLGraph(coo, readonly=True, sort_csr=True)
g.ndata['id'] = F.arange(0, g.number_of_nodes())
g.edata['id'] = F.tensor(etype_id, F.int64)
if args.pickle_graph:
with open(os.path.join(args.data_path, args.dataset, pickle_name), 'wb') as graph_file:
pickle.dump(g, graph_file)
return g
class TrainDataset(object):
def __init__(self, dataset, args, weighting=False, ranks=64):
triples = dataset.train
print('|Train|:', len(triples))
if ranks > 1 and args.rel_part:
triples_list = RelationPartition(triples, ranks)
elif ranks > 1:
triples_list = RandomPartition(triples, ranks)
else:
triples_list = [triples]
self.graphs = []
for i, triples in enumerate(triples_list):
g = ConstructGraph(triples, dataset.n_entities, i, args)
if weighting:
# TODO: weight to be added
count = self.count_freq(triples)
subsampling_weight = np.vectorize(
lambda h, r, t: np.sqrt(1 / (count[(h, r)] + count[(t, -r - 1)]))
)
# derive head/relation/tail lists for this partition's triples
src = [t[0] for t in triples]
etype_id = [t[1] for t in triples]
dst = [t[2] for t in triples]
weight = subsampling_weight(src, etype_id, dst)
g.edata['weight'] = F.zerocopy_from_numpy(weight)
# to be added
self.graphs.append(g)
def count_freq(self, triples, start=4):
count = {}
for head, rel, tail in triples:
if (head, rel) not in count:
count[(head, rel)] = start
else:
count[(head, rel)] += 1
if (tail, -rel - 1) not in count:
count[(tail, -rel - 1)] = start
else:
count[(tail, -rel - 1)] += 1
return count
def create_sampler(self, batch_size, neg_sample_size=2, mode='head', num_workers=5,
shuffle=True, exclude_positive=False, rank=0):
EdgeSampler = getattr(dgl.contrib.sampling, 'EdgeSampler')
return EdgeSampler(self.graphs[rank],
batch_size=batch_size,
neg_sample_size=neg_sample_size,
negative_mode=mode,
num_workers=num_workers,
shuffle=shuffle,
exclude_positive=exclude_positive,
return_false_neg=False)
class PBGNegEdgeSubgraph(dgl.subgraph.DGLSubGraph):
def __init__(self, subg, num_chunks, chunk_size,
neg_sample_size, neg_head):
super(PBGNegEdgeSubgraph, self).__init__(subg._parent, subg.sgi)
self.subg = subg
self.num_chunks = num_chunks
self.chunk_size = chunk_size
self.neg_sample_size = neg_sample_size
self.neg_head = neg_head
@property
def head_nid(self):
return self.subg.head_nid
@property
def tail_nid(self):
return self.subg.tail_nid
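# Chunked ("PBG-style") negative sampling: the positive edges are grouped into num_chunks
# chunks of chunk_size edges each, and every positive edge in a chunk is scored against
# the same neg_sample_size candidate head (or tail) nodes stored in the negative graph.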
def create_neg_subgraph(pos_g, neg_g, is_pbg, neg_head, num_nodes):
assert neg_g.number_of_edges() % pos_g.number_of_edges() == 0
neg_sample_size = int(neg_g.number_of_edges() / pos_g.number_of_edges())
# We use all nodes to create negative edges. Regardless of the sampling algorithm,
# we can always view the subgraph with one chunk.
if (neg_head and len(neg_g.head_nid) == num_nodes) \
or (not neg_head and len(neg_g.tail_nid) == num_nodes):
num_chunks = 1
chunk_size = pos_g.number_of_edges()
elif is_pbg:
if pos_g.number_of_edges() < neg_sample_size:
num_chunks = 1
chunk_size = pos_g.number_of_edges()
else:
# This is probably the last batch. Let's ignore it.
if pos_g.number_of_edges() % neg_sample_size > 0:
return None
num_chunks = int(pos_g.number_of_edges()/ neg_sample_size)
chunk_size = neg_sample_size
else:
num_chunks = pos_g.number_of_edges()
chunk_size = 1
return PBGNegEdgeSubgraph(neg_g, num_chunks, chunk_size,
neg_sample_size, neg_head)
class EvalSampler(object):
def __init__(self, g, edges, batch_size, neg_sample_size, mode, num_workers):
EdgeSampler = getattr(dgl.contrib.sampling, 'EdgeSampler')
self.sampler = EdgeSampler(g,
batch_size=batch_size,
seed_edges=edges,
neg_sample_size=neg_sample_size,
negative_mode=mode,
num_workers=num_workers,
shuffle=False,
exclude_positive=False,
relations=g.edata['id'],
return_false_neg=True)
self.sampler_iter = iter(self.sampler)
self.mode = mode
self.neg_head = 'head' in mode
self.g = g
def __iter__(self):
return self
def __next__(self):
while True:
pos_g, neg_g = next(self.sampler_iter)
neg_positive = neg_g.edata['false_neg']
neg_g = create_neg_subgraph(pos_g, neg_g, 'PBG' in self.mode,
self.neg_head, self.g.number_of_nodes())
if neg_g is not None:
break
pos_g.copy_from_parent()
neg_g.copy_from_parent()
neg_g.edata['bias'] = F.astype(-neg_positive, F.float32)
return pos_g, neg_g
def reset(self):
self.sampler_iter = iter(self.sampler)
return self
class EvalDataset(object):
def __init__(self, dataset, args):
triples = dataset.train + dataset.valid + dataset.test
pickle_name = 'graph_all.pickle'
if args.pickle_graph and os.path.exists(os.path.join(args.data_path, args.dataset, pickle_name)):
with open(os.path.join(args.data_path, args.dataset, pickle_name), 'rb') as graph_file:
g = pickle.load(graph_file)
print('Load pickled graph.')
else:
src = [t[0] for t in triples]
etype_id = [t[1] for t in triples]
dst = [t[2] for t in triples]
coo = sp.sparse.coo_matrix((np.ones(len(src)), (src, dst)), shape=[dataset.n_entities, dataset.n_entities])
g = dgl.DGLGraph(coo, readonly=True, sort_csr=True)
g.ndata['id'] = F.arange(0, g.number_of_nodes())
g.edata['id'] = F.tensor(etype_id, F.int64)
if args.pickle_graph:
with open(os.path.join(args.data_path, args.dataset, pickle_name), 'wb') as graph_file:
pickle.dump(g, graph_file)
self.g = g
self.num_train = len(dataset.train)
self.num_valid = len(dataset.valid)
self.num_test = len(dataset.test)
if args.eval_percent < 1:
self.valid = np.random.randint(0, self.num_valid,
size=(int(self.num_valid * args.eval_percent),)) + self.num_train
else:
self.valid = np.arange(self.num_train, self.num_train + self.num_valid)
print('|valid|:', len(self.valid))
if args.eval_percent < 1:
self.test = np.random.randint(0, self.num_test,
size=(int(self.num_test * args.eval_percent),))
self.test += self.num_train + self.num_valid
else:
self.test = np.arange(self.num_train + self.num_valid, self.g.number_of_edges())
print('|test|:', len(self.test))
self.num_valid = len(self.valid)
self.num_test = len(self.test)
def get_edges(self, eval_type):
if eval_type == 'valid':
return self.valid
elif eval_type == 'test':
return self.test
else:
raise Exception('get invalid type: ' + eval_type)
def check(self, eval_type):
edges = self.get_edges(eval_type)
subg = self.g.edge_subgraph(edges)
if eval_type == 'valid':
data = self.valid
elif eval_type == 'test':
data = self.test
subg.copy_from_parent()
src, dst, eid = subg.all_edges('all', order='eid')
src_id = subg.ndata['id'][src]
dst_id = subg.ndata['id'][dst]
etype = subg.edata['id'][eid]
orig_src = np.array([t[0] for t in data])
orig_etype = np.array([t[1] for t in data])
orig_dst = np.array([t[2] for t in data])
np.testing.assert_equal(F.asnumpy(src_id), orig_src)
np.testing.assert_equal(F.asnumpy(dst_id), orig_dst)
np.testing.assert_equal(F.asnumpy(etype), orig_etype)
def create_sampler(self, eval_type, batch_size, neg_sample_size, mode='head',
num_workers=5, rank=0, ranks=1):
edges = self.get_edges(eval_type)
beg = edges.shape[0] * rank // ranks
end = min(edges.shape[0] * (rank + 1) // ranks, edges.shape[0])
edges = edges[beg: end]
print("eval on {} edges".format(len(edges)))
return EvalSampler(self.g, edges, batch_size, neg_sample_size, mode, num_workers)
class NewBidirectionalOneShotIterator:
def __init__(self, dataloader_head, dataloader_tail, is_pbg, num_nodes):
self.sampler_head = dataloader_head
self.sampler_tail = dataloader_tail
self.iterator_head = self.one_shot_iterator(dataloader_head, is_pbg,
True, num_nodes)
self.iterator_tail = self.one_shot_iterator(dataloader_tail, is_pbg,
False, num_nodes)
self.step = 0
def __next__(self):
self.step += 1
if self.step % 2 == 0:
pos_g, neg_g = next(self.iterator_head)
else:
pos_g, neg_g = next(self.iterator_tail)
return pos_g, neg_g
@staticmethod
def one_shot_iterator(dataloader, is_pbg, neg_head, num_nodes):
while True:
for pos_g, neg_g in dataloader:
neg_g = create_neg_subgraph(pos_g, neg_g, is_pbg, neg_head, num_nodes)
if neg_g is None:
continue
pos_g.copy_from_parent()
neg_g.copy_from_parent()
yield pos_g, neg_g
from dataloader import EvalDataset, TrainDataset
from dataloader import get_dataset
import argparse
import torch.multiprocessing as mp
import os
import logging
import time
import pickle
import line_profiler
backend = os.environ.get('DGLBACKEND', 'pytorch')
if backend.lower() == 'mxnet':
from train_mxnet import load_model_from_checkpoint
from train_mxnet import test
else:
from train_pytorch import load_model_from_checkpoint
from train_pytorch import test
class ArgParser(argparse.ArgumentParser):
def __init__(self):
super(ArgParser, self).__init__()
self.add_argument('--model_name', default='TransE',
choices=['TransE', 'TransH', 'TransR', 'TransD',
'RESCAL', 'DistMult', 'ComplEx', 'RotatE', 'pRotatE'],
help='model to use')
self.add_argument('--data_path', type=str, default='data',
help='root path of all dataset')
self.add_argument('--dataset', type=str, default='FB15k',
help='dataset name, under data_path')
self.add_argument('--format', type=str, default='1',
help='the format of the dataset.')
self.add_argument('--model_path', type=str, default='ckpts',
help='the place where models are saved')
self.add_argument('--batch_size', type=int, default=8,
help='batch size used for eval and test')
self.add_argument('--neg_sample_size', type=int, default=-1,
help='negative sampling size for testing')
self.add_argument('--hidden_dim', type=int, default=256,
help='hidden dim used by relation and entity')
self.add_argument('-g', '--gamma', type=float, default=12.0,
help='margin value')
self.add_argument('--eval_percent', type=float, default=1,
help='sample some percentage for evaluation.')
self.add_argument('--gpu', type=int, default=-1,
help='GPU device id to use (-1 means CPU)')
self.add_argument('--mix_cpu_gpu', action='store_true',
help='mix CPU and GPU training')
self.add_argument('-de', '--double_ent', action='store_true',
help='double entity dim for complex number')
self.add_argument('-dr', '--double_rel', action='store_true',
help='double relation dim for complex number')
self.add_argument('--seed', type=int, default=0,
help='set random seed for reproducibility')
self.add_argument('--num_worker', type=int, default=16,
help='number of workers used for loading data')
self.add_argument('--num_proc', type=int, default=1,
help='number of process used')
def parse_args(self):
args = super().parse_args()
return args
def get_logger(args):
if not os.path.exists(args.model_path):
raise Exception('No existing model_path: ' + args.model_path)
log_file = os.path.join(args.model_path, 'eval.log')
logging.basicConfig(
format='%(asctime)s %(levelname)-8s %(message)s',
level=logging.INFO,
datefmt='%Y-%m-%d %H:%M:%S',
filename=log_file,
filemode='w'
)
logger = logging.getLogger(__name__)
print("Logs are being recorded at: {}".format(log_file))
return logger
def main(args):
# load dataset and samplers
dataset = get_dataset(args.data_path, args.dataset, args.format)
args.pickle_graph = False
args.train = False
args.valid = False
args.test = True
args.batch_size_eval = args.batch_size
logger = get_logger(args)
# Here we want to use the regular negative sampler because we need to ensure that
# all positive edges are excluded.
eval_dataset = EvalDataset(dataset, args)
args.neg_sample_size_test = args.neg_sample_size
if args.neg_sample_size < 0:
args.neg_sample_size_test = args.neg_sample_size = eval_dataset.g.number_of_nodes()
if args.num_proc > 1:
test_sampler_tails = []
test_sampler_heads = []
for i in range(args.num_proc):
test_sampler_head = eval_dataset.create_sampler('test', args.batch_size,
args.neg_sample_size,
mode='head',
num_workers=args.num_worker,
rank=i, ranks=args.num_proc)
test_sampler_tail = eval_dataset.create_sampler('test', args.batch_size,
args.neg_sample_size,
mode='tail',
num_workers=args.num_worker,
rank=i, ranks=args.num_proc)
test_sampler_heads.append(test_sampler_head)
test_sampler_tails.append(test_sampler_tail)
else:
test_sampler_head = eval_dataset.create_sampler('test', args.batch_size,
args.neg_sample_size,
mode='head',
num_workers=args.num_worker,
rank=0, ranks=1)
test_sampler_tail = eval_dataset.create_sampler('test', args.batch_size,
args.neg_sample_size,
mode='tail',
num_workers=args.num_worker,
rank=0, ranks=1)
# load model
n_entities = dataset.n_entities
n_relations = dataset.n_relations
ckpt_path = args.model_path
model = load_model_from_checkpoint(logger, args, n_entities, n_relations, ckpt_path)
if args.num_proc > 1:
model.share_memory()
# test
args.step = 0
args.max_step = 0
if args.num_proc > 1:
procs = []
for i in range(args.num_proc):
proc = mp.Process(target=test, args=(args, model, [test_sampler_heads[i], test_sampler_tails[i]]))
procs.append(proc)
proc.start()
for proc in procs:
proc.join()
else:
test(args, model, [test_sampler_head, test_sampler_tail])
if __name__ == '__main__':
args = ArgParser().parse_args()
main(args)
from .general_models import KEModel
import os
import numpy as np
import dgl.backend as F
backend = os.environ.get('DGLBACKEND', 'pytorch')
if backend.lower() == 'mxnet':
from .mxnet.tensor_models import logsigmoid
from .mxnet.tensor_models import get_device
from .mxnet.tensor_models import norm
from .mxnet.tensor_models import get_scalar
from .mxnet.tensor_models import reshape
from .mxnet.tensor_models import cuda
from .mxnet.tensor_models import ExternalEmbedding
from .mxnet.score_fun import *
else:
from .pytorch.tensor_models import logsigmoid
from .pytorch.tensor_models import get_device
from .pytorch.tensor_models import norm
from .pytorch.tensor_models import get_scalar
from .pytorch.tensor_models import reshape
from .pytorch.tensor_models import cuda
from .pytorch.tensor_models import ExternalEmbedding
from .pytorch.score_fun import *
class KEModel(object):
def __init__(self, args, model_name, n_entities, n_relations, hidden_dim, gamma,
double_entity_emb=False, double_relation_emb=False):
super(KEModel, self).__init__()
self.args = args
self.n_entities = n_entities
self.model_name = model_name
self.hidden_dim = hidden_dim
self.eps = 2.0
self.emb_init = (gamma + self.eps) / hidden_dim
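# Embeddings are initialized uniformly in [-(gamma + eps) / dim, (gamma + eps) / dim]
# (the scheme used in the RotatE reference implementation), so that initial scores
# fall roughly within the margin gamma.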
entity_dim = 2 * hidden_dim if double_entity_emb else hidden_dim
relation_dim = 2 * hidden_dim if double_relation_emb else hidden_dim
device = get_device(args)
self.entity_emb = ExternalEmbedding(args, n_entities, entity_dim,
F.cpu() if args.mix_cpu_gpu else device)
# For RESCAL, relation_emb = relation_dim * entity_dim
if model_name == 'RESCAL':
rel_dim = relation_dim * entity_dim
else:
rel_dim = relation_dim
self.relation_emb = ExternalEmbedding(args, n_relations, rel_dim, device)
if model_name == 'TransE':
self.score_func = TransEScore(gamma)
elif model_name == 'DistMult':
self.score_func = DistMultScore()
elif model_name == 'ComplEx':
self.score_func = ComplExScore()
self.head_neg_score = self.score_func.create_neg(True)
self.tail_neg_score = self.score_func.create_neg(False)
self.reset_parameters()
def share_memory(self):
# TODO(zhengda) we should make it work for parameters in score func
self.entity_emb.share_memory()
self.relation_emb.share_memory()
def save_emb(self, path, dataset):
self.entity_emb.save(path, dataset+'_'+self.model_name+'_entity')
self.relation_emb.save(path, dataset+'_'+self.model_name+'_relation')
self.score_func.save(path, dataset)
def load_emb(self, path, dataset):
self.entity_emb.load(path, dataset+'_'+self.model_name+'_entity')
self.relation_emb.load(path, dataset+'_'+self.model_name+'_relation')
self.score_func.load(path, dataset)
def reset_parameters(self):
self.entity_emb.init(self.emb_init)
self.relation_emb.init(self.emb_init)
self.score_func.reset_parameters()
def predict_score(self, g):
self.score_func(g)
return g.edata['score']
def predict_neg_score(self, pos_g, neg_g, to_device=None, gpu_id=-1, trace=False):
num_chunks = neg_g.num_chunks
chunk_size = neg_g.chunk_size
neg_sample_size = neg_g.neg_sample_size
if neg_g.neg_head:
neg_head_ids = neg_g.ndata['id'][neg_g.head_nid]
neg_head = self.entity_emb(neg_head_ids, gpu_id, trace)
_, tail_ids = pos_g.all_edges(order='eid')
if to_device is not None and gpu_id >= 0:
tail_ids = to_device(tail_ids, gpu_id)
tail = pos_g.ndata['emb'][tail_ids]
rel = pos_g.edata['emb']
neg_score = self.head_neg_score(neg_head, rel, tail,
num_chunks, chunk_size, neg_sample_size)
else:
neg_tail_ids = neg_g.ndata['id'][neg_g.tail_nid]
neg_tail = self.entity_emb(neg_tail_ids, gpu_id, trace)
head_ids, _ = pos_g.all_edges(order='eid')
if to_device is not None and gpu_id >= 0:
head_ids = to_device(head_ids, gpu_id)
head = pos_g.ndata['emb'][head_ids]
rel = pos_g.edata['emb']
neg_score = self.tail_neg_score(head, rel, neg_tail,
num_chunks, chunk_size, neg_sample_size)
return neg_score
def forward_test(self, pos_g, neg_g, logs, gpu_id=-1):
pos_g.ndata['emb'] = self.entity_emb(pos_g.ndata['id'], gpu_id, False)
pos_g.edata['emb'] = self.relation_emb(pos_g.edata['id'], gpu_id, False)
batch_size = pos_g.number_of_edges()
pos_scores = self.predict_score(pos_g)
pos_scores = reshape(logsigmoid(pos_scores), batch_size, -1)
neg_scores = self.predict_neg_score(pos_g, neg_g, to_device=cuda,
gpu_id=gpu_id, trace=False)
neg_scores = reshape(logsigmoid(neg_scores), batch_size, -1)
# We need to filter the positive edges in the negative graph.
filter_bias = reshape(neg_g.edata['bias'], batch_size, -1)
if self.args.gpu >= 0:
filter_bias = cuda(filter_bias, self.args.gpu)
neg_scores += filter_bias
# To compute the rank of a positive edge among all negative edges,
# we need to know how many negative edges have higher scores than
# the positive edge.
rankings = F.sum(neg_scores > pos_scores, dim=1) + 1
rankings = F.asnumpy(rankings)
for i in range(batch_size):
ranking = rankings[i]
logs.append({
'MRR': 1.0 / ranking,
'MR': float(ranking),
'HITS@1': 1.0 if ranking <= 1 else 0.0,
'HITS@3': 1.0 if ranking <= 3 else 0.0,
'HITS@10': 1.0 if ranking <= 10 else 0.0
})
# @profile
def forward(self, pos_g, neg_g, gpu_id=-1):
pos_g.ndata['emb'] = self.entity_emb(pos_g.ndata['id'], gpu_id, True)
pos_g.edata['emb'] = self.relation_emb(pos_g.edata['id'], gpu_id, True)
pos_score = self.predict_score(pos_g)
pos_score = logsigmoid(pos_score)
if gpu_id >= 0:
neg_score = self.predict_neg_score(pos_g, neg_g, to_device=cuda,
gpu_id=gpu_id, trace=True)
else:
neg_score = self.predict_neg_score(pos_g, neg_g, trace=True)
neg_score = reshape(neg_score, -1, neg_g.neg_sample_size)
# Adversarial sampling
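# Self-adversarial negative sampling: negatives are weighted by a softmax over their
# scores scaled by adversarial_temperature; detach() keeps gradients from flowing
# through the weights themselves.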
if self.args.neg_adversarial_sampling:
neg_score = F.sum(F.softmax(neg_score * self.args.adversarial_temperature, dim=1).detach()
* logsigmoid(-neg_score), dim=1)
else:
neg_score = F.mean(logsigmoid(-neg_score), dim=1)
# subsampling weight
# TODO: add subsampling to new sampler
if self.args.non_uni_weight:
subsampling_weight = pos_g.edata['weight']
pos_score = (pos_score * subsampling_weight).sum() / subsampling_weight.sum()
neg_score = (neg_score * subsampling_weight).sum() / subsampling_weight.sum()
else:
pos_score = pos_score.mean()
neg_score = neg_score.mean()
# compute loss
loss = -(pos_score + neg_score) / 2
log = {'pos_loss': - get_scalar(pos_score),
'neg_loss': - get_scalar(neg_score),
'loss': get_scalar(loss)}
# regularization: TODO(zihao)
#TODO: only reg ent&rel embeddings. other params to be added.
if self.args.regularization_coef > 0.0 and self.args.regularization_norm > 0:
coef, nm = self.args.regularization_coef, self.args.regularization_norm
reg = coef * (norm(self.entity_emb.curr_emb(), nm) + norm(self.relation_emb.curr_emb(), nm))
log['regularization'] = get_scalar(reg)
loss = loss + reg
return loss, log
def update(self):
self.entity_emb.update()
self.relation_emb.update()
import mxnet as mx
from mxnet import gluon
from mxnet.gluon import nn
from mxnet import ndarray as nd
class TransEScore(nn.Block):
def __init__(self, gamma):
super(TransEScore, self).__init__()
self.gamma = gamma
def edge_func(self, edges):
head = edges.src['emb']
tail = edges.dst['emb']
rel = edges.data['emb']
score = head + rel - tail
return {'score': self.gamma - nd.norm(score, ord=1, axis=-1)}
def reset_parameters(self):
pass
def save(self, path, name):
pass
def load(self, path, name):
pass
def forward(self, g):
g.apply_edges(lambda edges: self.edge_func(edges))
def create_neg(self, neg_head):
gamma = self.gamma
if neg_head:
def fn(heads, relations, tails, num_chunks, chunk_size, neg_sample_size):
hidden_dim = heads.shape[1]
heads = heads.reshape(num_chunks, 1, neg_sample_size, hidden_dim)
tails = tails - relations
tails = tails.reshape(num_chunks,chunk_size, 1, hidden_dim)
return gamma - nd.norm(heads - tails, ord=1, axis=-1)
return fn
else:
def fn(heads, relations, tails, num_chunks, chunk_size, neg_sample_size):
hidden_dim = heads.shape[1]
heads = heads + relations
heads = heads.reshape(num_chunks, chunk_size, 1, hidden_dim)
tails = tails.reshape(num_chunks, 1, neg_sample_size, hidden_dim)
return gamma - nd.norm(heads - tails, ord=1, axis=-1)
return fn
class DistMultScore(nn.Block):
def __init__(self):
super(DistMultScore, self).__init__()
def edge_func(self, edges):
head = edges.src['emb']
tail = edges.dst['emb']
rel = edges.data['emb']
score = head * rel * tail
# TODO: check if there exists minus sign and if gamma should be used here(jin)
return {'score': nd.sum(score, axis=-1)}
def reset_parameters(self):
pass
def save(self, path, name):
pass
def load(self, path, name):
pass
def forward(self, g):
g.apply_edges(lambda edges: self.edge_func(edges))
def create_neg(self, neg_head):
if neg_head:
def fn(heads, relations, tails, num_chunks, chunk_size, neg_sample_size):
hidden_dim = heads.shape[1]
heads = heads.reshape(num_chunks, neg_sample_size, hidden_dim)
heads = nd.transpose(heads, axes=(0, 2, 1))
tmp = (tails * relations).reshape(num_chunks, chunk_size, hidden_dim)
return nd.linalg_gemm2(tmp, heads)
return fn
else:
def fn(heads, relations, tails, num_chunks, chunk_size, neg_sample_size):
hidden_dim = heads.shape[1]
tails = tails.reshape(num_chunks, neg_sample_size, hidden_dim)
tails = nd.transpose(tails, axes=(0, 2, 1))
tmp = (heads * relations).reshape(num_chunks, chunk_size, hidden_dim)
return nd.linalg_gemm2(tmp, tails)
return fn
import os
import numpy as np
import mxnet as mx
from mxnet import gluon
from mxnet import ndarray as nd
from .score_fun import *
from .. import *
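# Numerically stable log(sigmoid(x)) = -log(1 + exp(-x)): the max-trick below avoids
# overflow in exp() for large negative inputs.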
def logsigmoid(val):
max_elem = nd.maximum(0., -val)
z = nd.exp(-max_elem) + nd.exp(-val - max_elem)
return -(max_elem + nd.log(z))
get_device = lambda args : mx.gpu(args.gpu) if args.gpu >= 0 else mx.cpu()
norm = lambda x, p: nd.sum(nd.abs(x) ** p)
get_scalar = lambda x: x.detach().asscalar()
reshape = lambda arr, x, y: arr.reshape(x, y)
cuda = lambda arr, gpu: arr.as_in_context(mx.gpu(gpu))
class ExternalEmbedding:
def __init__(self, args, num, dim, ctx):
self.gpu = args.gpu
self.args = args
self.trace = []
self.emb = nd.empty((num, dim), dtype=np.float32, ctx=ctx)
self.state_sum = nd.zeros((self.emb.shape[0]), dtype=np.float32, ctx=ctx)
self.state_step = 0
def init(self, emb_init):
nd.random.uniform(-emb_init, emb_init,
shape=self.emb.shape, dtype=self.emb.dtype,
ctx=self.emb.context, out=self.emb)
def share_memory(self):
# TODO(zhengda) fix this later
pass
def __call__(self, idx, gpu_id=-1, trace=True):
if self.emb.context != idx.context:
idx = idx.as_in_context(self.emb.context)
data = nd.take(self.emb, idx)
if self.gpu >= 0:
data = data.as_in_context(mx.gpu(self.gpu))
data.attach_grad()
if trace:
self.trace.append((idx, data))
return data
def update(self):
self.state_step += 1
for idx, data in self.trace:
grad = data.grad
clr = self.args.lr
#clr = self.args.lr / (1 + (self.state_step - 1) * group['lr_decay'])
# the update is non-linear so indices must be unique
grad_indices = idx
grad_values = grad
grad_sum = (grad_values * grad_values).mean(1)
ctx = self.state_sum.context
if ctx != grad_indices.context:
grad_indices = grad_indices.as_in_context(ctx)
if ctx != grad_sum.context:
grad_sum = grad_sum.as_in_context(ctx)
self.state_sum[grad_indices] += grad_sum
std = self.state_sum[grad_indices] # _sparse_mask
std_values = nd.expand_dims(nd.sqrt(std) + 1e-10, 1)
if self.gpu >= 0:
std_values = std_values.as_in_context(mx.gpu(self.args.gpu))
tmp = (-clr * grad_values / std_values)
if tmp.context != ctx:
tmp = tmp.as_in_context(ctx)
# TODO(zhengda) the overhead is here.
self.emb[grad_indices] = mx.nd.take(self.emb, grad_indices) + tmp
self.trace = []
def curr_emb(self):
data = [data for _, data in self.trace]
return nd.concat(*data, dim=0)
def save(self, path, name):
emb_fname = os.path.join(path, name+'.emb')
nd.save(emb_fname, self.emb)
def load(self, path, name):
emb_fname = os.path.join(path, name+'.emb')
self.emb = nd.load(emb_fname)[0]
import torch as th
import torch.nn as nn
import torch.nn.functional as functional
import torch.nn.init as INIT
class TransEScore(nn.Module):
def __init__(self, gamma):
super(TransEScore, self).__init__()
self.gamma = gamma
def edge_func(self, edges):
head = edges.src['emb']
tail = edges.dst['emb']
rel = edges.data['emb']
score = head + rel - tail
return {'score': self.gamma - th.norm(score, p=1, dim=-1)}
def forward(self, g):
g.apply_edges(lambda edges: self.edge_func(edges))
def reset_parameters(self):
pass
def save(self, path, name):
pass
def load(self, path, name):
pass
def create_neg(self, neg_head):
gamma = self.gamma
if neg_head:
def fn(heads, relations, tails, num_chunks, chunk_size, neg_sample_size):
hidden_dim = heads.shape[1]
heads = heads.reshape(num_chunks, neg_sample_size, hidden_dim)
tails = tails - relations
tails = tails.reshape(num_chunks, chunk_size, hidden_dim)
return gamma - th.cdist(tails, heads, p=1)
return fn
else:
def fn(heads, relations, tails, num_chunks, chunk_size, neg_sample_size):
hidden_dim = heads.shape[1]
heads = heads + relations
heads = heads.reshape(num_chunks, chunk_size, hidden_dim)
tails = tails.reshape(num_chunks, neg_sample_size, hidden_dim)
return gamma - th.cdist(heads, tails, p=1)
return fn
class DistMultScore(nn.Module):
def __init__(self):
super(DistMultScore, self).__init__()
def edge_func(self, edges):
head = edges.src['emb']
tail = edges.dst['emb']
rel = edges.data['emb']
score = head * rel * tail
# TODO: check if there exists minus sign and if gamma should be used here(jin)
return {'score': th.sum(score, dim=-1)}
def reset_parameters(self):
pass
def save(self, path, name):
pass
def load(self, path, name):
pass
def forward(self, g):
g.apply_edges(lambda edges: self.edge_func(edges))
def create_neg(self, neg_head):
if neg_head:
def fn(heads, relations, tails, num_chunks, chunk_size, neg_sample_size):
hidden_dim = heads.shape[1]
heads = heads.reshape(num_chunks, neg_sample_size, hidden_dim)
heads = th.transpose(heads, 1, 2)
tmp = (tails * relations).reshape(num_chunks, chunk_size, hidden_dim)
return th.bmm(tmp, heads)
return fn
else:
def fn(heads, relations, tails, num_chunks, chunk_size, neg_sample_size):
hidden_dim = tails.shape[1]
tails = tails.reshape(num_chunks, neg_sample_size, hidden_dim)
tails = th.transpose(tails, 1, 2)
tmp = (heads * relations).reshape(num_chunks, chunk_size, hidden_dim)
return th.bmm(tmp, tails)
return fn
class ComplExScore(nn.Module):
def __init__(self):
super(ComplExScore, self).__init__()
def edge_func(self, edges):
real_head, img_head = th.chunk(edges.src['emb'], 2, dim=-1)
real_tail, img_tail = th.chunk(edges.dst['emb'], 2, dim=-1)
real_rel, img_rel = th.chunk(edges.data['emb'], 2, dim=-1)
score = real_head * real_tail * real_rel \
+ img_head * img_tail * real_rel \
+ real_head * img_tail * img_rel \
- img_head * real_tail * img_rel
# TODO: check if there exists minus sign and if gamma should be used here(jin)
return {'score': th.sum(score, -1)}
def reset_parameters(self):
pass
def save(self, path, name):
pass
def load(self, path, name):
pass
def forward(self, g):
g.apply_edges(lambda edges: self.edge_func(edges))
def create_neg(self, neg_head):
if neg_head:
def fn(heads, relations, tails, num_chunks, chunk_size, neg_sample_size):
hidden_dim = heads.shape[1]
emb_real = tails[..., :hidden_dim // 2]
emb_imag = tails[..., hidden_dim // 2:]
rel_real = relations[..., :hidden_dim // 2]
rel_imag = relations[..., hidden_dim // 2:]
real = emb_real * rel_real + emb_imag * rel_imag
imag = -emb_real * rel_imag + emb_imag * rel_real
emb_complex = th.cat((real, imag), dim=-1)
tmp = emb_complex.reshape(num_chunks, chunk_size, hidden_dim)
heads = heads.reshape(num_chunks, neg_sample_size, hidden_dim)
heads = th.transpose(heads, 1, 2)
return th.bmm(tmp, heads)
return fn
else:
def fn(heads, relations, tails, num_chunks, chunk_size, neg_sample_size):
hidden_dim = heads.shape[1]
emb_real = heads[..., :hidden_dim // 2]
emb_imag = heads[..., hidden_dim // 2:]
rel_real = relations[..., :hidden_dim // 2]
rel_imag = relations[..., hidden_dim // 2:]
real = emb_real * rel_real - emb_imag * rel_imag
imag = emb_real * rel_imag + emb_imag * rel_real
emb_complex = th.cat((real, imag), dim=-1)
tmp = emb_complex.reshape(num_chunks, chunk_size, hidden_dim)
tails = tails.reshape(num_chunks, neg_sample_size, hidden_dim)
tails = th.transpose(tails, 1, 2)
return th.bmm(tmp, tails)
return fn
class RESCALScore(nn.Module):
def __init__(self, relation_dim, entity_dim):
super(RESCALScore, self).__init__()
self.relation_dim = relation_dim
self.entity_dim = entity_dim
def edge_func(self, edges):
head = edges.src['emb']
tail = edges.dst['emb'].unsqueeze(-1)
rel = edges.data['emb']
rel = rel.view(-1, self.relation_dim, self.entity_dim)
score = head * th.matmul(rel, tail).squeeze(-1)
# TODO: check if use self.gamma
return {'score': th.sum(score, dim=-1)}
# return {'score': self.gamma - th.norm(score, p=1, dim=-1)}
def reset_parameters(self):
pass
def save(self, path, name):
pass
def load(self, path, name):
pass
def forward(self, g):
g.apply_edges(lambda edges: self.edge_func(edges))
"""
Knowledge Graph Embedding Models.
1. TransE
2. DistMult
3. ComplEx
4. RotatE
5. pRotatE
6. TransH
7. TransR
8. TransD
9. RESCAL
"""
import os
import numpy as np
import torch as th
import torch.nn as nn
import torch.nn.functional as functional
import torch.nn.init as INIT
from .. import *
logsigmoid = functional.logsigmoid
def get_device(args):
return th.device('cpu') if args.gpu < 0 else th.device('cuda:' + str(args.gpu))
norm = lambda x, p: x.norm(p=p)**p
get_scalar = lambda x: x.detach().item()
reshape = lambda arr, x, y: arr.view(x, y)
cuda = lambda arr, gpu: arr.cuda(gpu)
class ExternalEmbedding:
def __init__(self, args, num, dim, device):
self.gpu = args.gpu
self.args = args
self.trace = []
self.emb = th.empty(num, dim, dtype=th.float32, device=device)
self.state_sum = self.emb.new().resize_(self.emb.size(0)).zero_()
self.state_step = 0
def init(self, emb_init):
INIT.uniform_(self.emb, -emb_init, emb_init)
INIT.zeros_(self.state_sum)
def share_memory(self):
self.emb.share_memory_()
self.state_sum.share_memory_()
def __call__(self, idx, gpu_id=-1, trace=True):
s = self.emb[idx]
if self.gpu >= 0:
s = s.cuda(self.gpu)
data = s.clone().detach().requires_grad_(True)
if trace:
self.trace.append((idx, data))
return data
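# update() performs a row-sparse Adagrad step: only the embedding rows touched in the
# last forward pass (recorded in self.trace) are updated, using per-row accumulated
# squared-gradient statistics kept in self.state_sum.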
def update(self):
self.state_step += 1
with th.no_grad():
for idx, data in self.trace:
grad = data.grad.data
clr = self.args.lr
#clr = self.args.lr / (1 + (self.state_step - 1) * group['lr_decay'])
# the update is non-linear so indices must be unique
grad_indices = idx
grad_values = grad
grad_sum = (grad_values * grad_values).mean(1)
device = self.state_sum.device
if device != grad_indices.device:
grad_indices = grad_indices.to(device)
if device != grad_sum.device:
grad_sum = grad_sum.to(device)
self.state_sum.index_add_(0, grad_indices, grad_sum)
std = self.state_sum[grad_indices] # _sparse_mask
std_values = std.sqrt_().add_(1e-10).unsqueeze(1)
if self.gpu >= 0:
std_values = std_values.cuda(self.args.gpu)
tmp = (-clr * grad_values / std_values)
if tmp.device != device:
tmp = tmp.to(device)
# TODO(zhengda) the overhead is here.
self.emb.index_add_(0, grad_indices, tmp)
self.trace = []
def curr_emb(self):
data = [data for _, data in self.trace]
return th.cat(data, 0)
def save(self, path, name):
file_name = os.path.join(path, name)
np.save(file_name, self.emb.cpu().detach().numpy())
def load(self, path, name):
file_name = os.path.join(path, name+'.npy')
self.emb = th.Tensor(np.load(file_name))
import os
import scipy as sp
import scipy.sparse
import numpy as np
import dgl
import dgl.backend as F
backend = os.environ.get('DGLBACKEND', 'pytorch')
if backend.lower() == 'mxnet':
from models.mxnet.score_fun import *
else:
from models.pytorch.score_fun import *
from models.general_models import KEModel
from dataloader.sampler import create_neg_subgraph
def generate_rand_graph(n):
arr = (sp.sparse.random(n, n, density=0.1, format='coo') != 0).astype(np.int64)
g = dgl.DGLGraph(arr, readonly=True)
num_rels = 10
entity_emb = F.uniform((g.number_of_nodes(), 10), F.float32, F.cpu(), 0, 1)
rel_emb = F.uniform((num_rels, 10), F.float32, F.cpu(), 0, 1)
g.ndata['id'] = F.arange(0, g.number_of_nodes())
rel_ids = np.random.randint(0, num_rels, g.number_of_edges(), dtype=np.int64)
g.edata['id'] = F.tensor(rel_ids, F.int64)
return g, entity_emb, rel_emb
ke_score_funcs = {'TransE': TransEScore(12.0),
'DistMult': DistMultScore()}
class BaseKEModel:
def __init__(self, score_func, entity_emb, rel_emb):
self.score_func = score_func
self.head_neg_score = self.score_func.create_neg(True)
self.tail_neg_score = self.score_func.create_neg(False)
self.entity_emb = entity_emb
self.rel_emb = rel_emb
def predict_score(self, g):
g.ndata['emb'] = self.entity_emb[g.ndata['id']]
g.edata['emb'] = self.rel_emb[g.edata['id']]
self.score_func(g)
return g.edata['score']
def predict_neg_score(self, pos_g, neg_g):
pos_g.ndata['emb'] = self.entity_emb[pos_g.ndata['id']]
pos_g.edata['emb'] = self.rel_emb[pos_g.edata['id']]
neg_g.ndata['emb'] = self.entity_emb[neg_g.ndata['id']]
neg_g.edata['emb'] = self.rel_emb[neg_g.edata['id']]
num_chunks = neg_g.num_chunks
chunk_size = neg_g.chunk_size
neg_sample_size = neg_g.neg_sample_size
if neg_g.neg_head:
neg_head_ids = neg_g.ndata['id'][neg_g.head_nid]
neg_head = self.entity_emb[neg_head_ids]
_, tail_ids = pos_g.all_edges(order='eid')
tail = pos_g.ndata['emb'][tail_ids]
rel = pos_g.edata['emb']
neg_score = self.head_neg_score(neg_head, rel, tail,
num_chunks, chunk_size, neg_sample_size)
else:
neg_tail_ids = neg_g.ndata['id'][neg_g.tail_nid]
neg_tail = self.entity_emb[neg_tail_ids]
head_ids, _ = pos_g.all_edges(order='eid')
head = pos_g.ndata['emb'][head_ids]
rel = pos_g.edata['emb']
neg_score = self.tail_neg_score(head, rel, neg_tail,
num_chunks, chunk_size, neg_sample_size)
return neg_score
def check_score_func(func_name):
batch_size = 10
neg_sample_size = 10
g, entity_emb, rel_emb = generate_rand_graph(100)
hidden_dim = entity_emb.shape[1]
ke_score_func = ke_score_funcs[func_name]
model = BaseKEModel(ke_score_func, entity_emb, rel_emb)
EdgeSampler = getattr(dgl.contrib.sampling, 'EdgeSampler')
sampler = EdgeSampler(g, batch_size=batch_size,
neg_sample_size=neg_sample_size,
negative_mode='PBG-head',
num_workers=1,
shuffle=False,
exclude_positive=False,
return_false_neg=False)
for pos_g, neg_g in sampler:
neg_g = create_neg_subgraph(pos_g, neg_g, True, True, g.number_of_nodes())
pos_g.copy_from_parent()
neg_g.copy_from_parent()
score1 = F.reshape(model.predict_score(neg_g), (batch_size, -1))
score2 = model.predict_neg_score(pos_g, neg_g)
score2 = F.reshape(score2, (batch_size, -1))
np.testing.assert_allclose(F.asnumpy(score1), F.asnumpy(score2),
rtol=1e-5, atol=1e-5)
def test_score_func():
for key in ke_score_funcs:
check_score_func(key)
if __name__ == '__main__':
test_score_func()
from dataloader import EvalDataset, TrainDataset, NewBidirectionalOneShotIterator
from dataloader import get_dataset
import torch.multiprocessing as mp
import argparse
import os
import logging
import time
backend = os.environ.get('DGLBACKEND', 'pytorch')
if backend.lower() == 'mxnet':
from train_mxnet import load_model
from train_mxnet import train
from train_mxnet import test
else:
from train_pytorch import load_model
from train_pytorch import train
from train_pytorch import test
class ArgParser(argparse.ArgumentParser):
def __init__(self):
super(ArgParser, self).__init__()
self.add_argument('--model_name', default='TransE',
choices=['TransE', 'TransH', 'TransR', 'TransD',
'RESCAL', 'DistMult', 'ComplEx', 'RotatE', 'pRotatE'],
help='model to use')
self.add_argument('--data_path', type=str, default='data',
help='root path of all dataset')
self.add_argument('--dataset', type=str, default='FB15k',
help='dataset name, under data_path')
self.add_argument('--format', type=str, default='1',
help='the format of the dataset.')
self.add_argument('--save_path', type=str, default='ckpts',
help='place to save models and logs')
self.add_argument('--save_emb', type=str, default=None,
help='save the embeddings in the specific location.')
self.add_argument('--max_step', type=int, default=80000,
help='number of training steps')
self.add_argument('--warm_up_step', type=int, default=None,
help='for learning rate decay')
self.add_argument('--batch_size', type=int, default=1024,
help='batch size')
self.add_argument('--batch_size_eval', type=int, default=8,
help='batch size used for eval and test')
self.add_argument('--neg_sample_size', type=int, default=128,
help='negative sampling size')
self.add_argument('--neg_sample_size_valid', type=int, default=1000,
help='negative sampling size for validation')
self.add_argument('--neg_sample_size_test', type=int, default=-1,
help='negative sampling size for testing')
self.add_argument('--hidden_dim', type=int, default=256,
help='hidden dim used by relation and entity')
self.add_argument('--lr', type=float, default=0.0001,
help='learning rate')
self.add_argument('-g', '--gamma', type=float, default=12.0,
help='margin value')
self.add_argument('--eval_percent', type=float, default=1,
help='sample some percentage for evaluation.')
self.add_argument('--gpu', type=int, default=-1,
help='GPU device id to use (-1 means CPU)')
self.add_argument('--mix_cpu_gpu', action='store_true',
help='mix CPU and GPU training')
self.add_argument('-de', '--double_ent', action='store_true',
help='double entity dim for complex number')
self.add_argument('-dr', '--double_rel', action='store_true',
help='double relation dim for complex number')
self.add_argument('--seed', type=int, default=0,
help='set random seed for reproducibility')
self.add_argument('-log', '--log_interval', type=int, default=1000,
help='print training logs every x steps')
self.add_argument('--eval_interval', type=int, default=10000,
help='do evaluation after every x steps')
self.add_argument('-adv', '--neg_adversarial_sampling', action='store_true',
help='if use negative adversarial sampling')
self.add_argument('-a', '--adversarial_temperature', default=1.0, type=float)
self.add_argument('--valid', action='store_true',
help='if valid a model')
self.add_argument('--test', action='store_true',
help='if test a model')
self.add_argument('-rc', '--regularization_coef', type=float, default=0.000002,
help='set value > 0.0 if regularization is used')
self.add_argument('-rn', '--regularization_norm', type=int, default=3,
help='norm used in regularization')
self.add_argument('--num_worker', type=int, default=16,
help='number of workers used for loading data')
self.add_argument('--non_uni_weight', action='store_true',
help='use non-uniform (subsampling) weight when computing loss')
self.add_argument('--init_step', type=int, default=0,
help='DONT SET MANUALLY, used for resume')
self.add_argument('--step', type=int, default=0,
help='DONT SET MANUALLY, track current step')
self.add_argument('--pickle_graph', action='store_true',
help='pickle built graph, building a huge graph is slow.')
self.add_argument('--num_proc', type=int, default=1,
help='number of process used')
self.add_argument('--rel_part', action='store_true',
help='enable relation partitioning')
def get_logger(args):
if not os.path.exists(args.save_path):
os.mkdir(args.save_path)
folder = '{}_{}_'.format(args.model_name, args.dataset)
n = len([x for x in os.listdir(args.save_path) if x.startswith(folder)])
folder += str(n)
args.save_path = os.path.join(args.save_path, folder)
if not os.path.exists(args.save_path):
os.makedirs(args.save_path)
log_file = os.path.join(args.save_path, 'train.log')
logging.basicConfig(
format='%(asctime)s %(levelname)-8s %(message)s',
level=logging.INFO,
datefmt='%Y-%m-%d %H:%M:%S',
filename=log_file,
filemode='w'
)
logger = logging.getLogger(__name__)
print("Logs are being recorded at: {}".format(log_file))
return logger
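# Example of the resulting layout (names are placeholders): with model_name
# 'TransE', dataset 'FB15k' and save_path 'ckpts', the first run logs to
# ckpts/TransE_FB15k_0/train.log; the trailing index counts existing folders with
# the same prefix, so later runs get TransE_FB15k_1, TransE_FB15k_2, and so on.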
def run(args, logger):
# load dataset and samplers
dataset = get_dataset(args.data_path, args.dataset, args.format)
n_entities = dataset.n_entities
n_relations = dataset.n_relations
if args.neg_sample_size_test < 0:
args.neg_sample_size_test = n_entities
train_data = TrainDataset(dataset, args, ranks=args.num_proc)
if args.num_proc > 1:
train_samplers = []
for i in range(args.num_proc):
train_sampler_head = train_data.create_sampler(args.batch_size, args.neg_sample_size,
mode='PBG-head',
num_workers=args.num_worker,
shuffle=True,
exclude_positive=True,
rank=i)
train_sampler_tail = train_data.create_sampler(args.batch_size, args.neg_sample_size,
mode='PBG-tail',
num_workers=args.num_worker,
shuffle=True,
exclude_positive=True,
rank=i)
train_samplers.append(NewBidirectionalOneShotIterator(train_sampler_head, train_sampler_tail,
True, n_entities))
else:
train_sampler_head = train_data.create_sampler(args.batch_size, args.neg_sample_size,
mode='PBG-head',
num_workers=args.num_worker,
shuffle=True,
exclude_positive=True)
train_sampler_tail = train_data.create_sampler(args.batch_size, args.neg_sample_size,
mode='PBG-tail',
num_workers=args.num_worker,
shuffle=True,
exclude_positive=True)
train_sampler = NewBidirectionalOneShotIterator(train_sampler_head, train_sampler_tail,
True, n_entities)
if args.valid or args.test:
eval_dataset = EvalDataset(dataset, args)
if args.valid:
# Here we want to use the regular negative sampler because we need to ensure that
# all positive edges are excluded.
if args.num_proc > 1:
valid_sampler_heads = []
valid_sampler_tails = []
for i in range(args.num_proc):
valid_sampler_head = eval_dataset.create_sampler('valid', args.batch_size_eval,
args.neg_sample_size_valid,
mode='PBG-head',
num_workers=args.num_worker,
rank=i, ranks=args.num_proc)
valid_sampler_tail = eval_dataset.create_sampler('valid', args.batch_size_eval,
args.neg_sample_size_valid,
mode='PBG-tail',
num_workers=args.num_worker,
rank=i, ranks=args.num_proc)
valid_sampler_heads.append(valid_sampler_head)
valid_sampler_tails.append(valid_sampler_tail)
else:
valid_sampler_head = eval_dataset.create_sampler('valid', args.batch_size_eval,
args.neg_sample_size_valid,
mode='PBG-head',
num_workers=args.num_worker,
rank=0, ranks=1)
valid_sampler_tail = eval_dataset.create_sampler('valid', args.batch_size_eval,
args.neg_sample_size_valid,
mode='PBG-tail',
num_workers=args.num_worker,
rank=0, ranks=1)
if args.test:
# Here we want to use the regular negative sampler because we need to ensure that
# all positive edges are excluded.
if args.num_proc > 1:
test_sampler_tails = []
test_sampler_heads = []
for i in range(args.num_proc):
test_sampler_head = eval_dataset.create_sampler('test', args.batch_size_eval,
args.neg_sample_size_test,
mode='head',
num_workers=args.num_worker,
rank=i, ranks=args.num_proc)
test_sampler_tail = eval_dataset.create_sampler('test', args.batch_size_eval,
args.neg_sample_size_test,
mode='tail',
num_workers=args.num_worker,
rank=i, ranks=args.num_proc)
test_sampler_heads.append(test_sampler_head)
test_sampler_tails.append(test_sampler_tail)
else:
test_sampler_head = eval_dataset.create_sampler('test', args.batch_size_eval,
args.neg_sample_size_test,
mode='head',
num_workers=args.num_worker,
rank=0, ranks=1)
test_sampler_tail = eval_dataset.create_sampler('test', args.batch_size_eval,
args.neg_sample_size_test,
mode='tail',
num_workers=args.num_worker,
rank=0, ranks=1)
# We need to free all memory referenced by dataset.
eval_dataset = None
dataset = None
# load model
model = load_model(logger, args, n_entities, n_relations)
if args.num_proc > 1:
model.share_memory()
# train
start = time.time()
if args.num_proc > 1:
procs = []
for i in range(args.num_proc):
valid_samplers = [valid_sampler_heads[i], valid_sampler_tails[i]] if args.valid else None
proc = mp.Process(target=train, args=(args, model, train_samplers[i], valid_samplers))
procs.append(proc)
proc.start()
for proc in procs:
proc.join()
else:
valid_samplers = [valid_sampler_head, valid_sampler_tail] if args.valid else None
train(args, model, train_sampler, valid_samplers)
print('training takes {} seconds'.format(time.time() - start))
if args.save_emb is not None:
if not os.path.exists(args.save_emb):
os.mkdir(args.save_emb)
model.save_emb(args.save_emb, args.dataset)
# test
if args.test:
if args.num_proc > 1:
procs = []
for i in range(args.num_proc):
proc = mp.Process(target=test, args=(args, model, [test_sampler_heads[i], test_sampler_tails[i]]))
procs.append(proc)
proc.start()
for proc in procs:
proc.join()
else:
test(args, model, [test_sampler_head, test_sampler_tail])
if __name__ == '__main__':
args = ArgParser().parse_args()
logger = get_logger(args)
run(args, logger)
from models import KEModel
import mxnet as mx
from mxnet import gluon
from mxnet import ndarray as nd
import os
import logging
import time
import json
def load_model(logger, args, n_entities, n_relations, ckpt=None):
model = KEModel(args, args.model_name, n_entities, n_relations,
args.hidden_dim, args.gamma,
double_entity_emb=args.double_ent, double_relation_emb=args.double_rel)
if ckpt is not None:
# TODO: loading model embeddings only works for the general Embedding, not for ExternalEmbedding
if args.gpu >= 0:
model.load_parameters(ckpt, ctx=mx.gpu(args.gpu))
else:
model.load_parameters(ckpt, ctx=mx.cpu())
logger.info('Load model {}'.format(args.model_name))
return model
def load_model_from_checkpoint(logger, args, n_entities, n_relations, ckpt_path):
model = load_model(logger, args, n_entities, n_relations)
model.load_emb(ckpt_path, args.dataset)
return model
def train(args, model, train_sampler, valid_samplers=None):
if args.num_proc > 1:
os.environ['OMP_NUM_THREADS'] = '1'
logs = []
for arg in vars(args):
logging.info('{:20}:{}'.format(arg, getattr(args, arg)))
start = time.time()
for step in range(args.init_step, args.max_step):
pos_g, neg_g = next(train_sampler)
args.step = step
with mx.autograd.record():
loss, log = model.forward(pos_g, neg_g, args.gpu)
loss.backward()
logs.append(log)
model.update()
if step % args.log_interval == 0:
for k in logs[0].keys():
v = sum(l[k] for l in logs) / len(logs)
print('[Train]({}/{}) average {}: {}'.format(step, args.max_step, k, v))
logs = []
print(time.time() - start)
start = time.time()
if args.valid and step % args.eval_interval == 0 and step > 1 and valid_samplers is not None:
start = time.time()
test(args, model, valid_samplers, mode='Valid')
print('test:', time.time() - start)
# clear cache
logs = []
def test(args, model, test_samplers, mode='Test'):
logs = []
for sampler in test_samplers:
#print('Number of tests: {}'.format(len(sampler)))
count = 0
for pos_g, neg_g in sampler:
model.forward_test(pos_g, neg_g, logs, args.gpu)
metrics = {}
if len(logs) > 0:
for metric in logs[0].keys():
metrics[metric] = sum([log[metric] for log in logs]) / len(logs)
for k, v in metrics.items():
print('{} average {} at [{}/{}]: {}'.format(mode, k, args.step, args.max_step, v))
for i in range(len(test_samplers)):
test_samplers[i] = test_samplers[i].reset()
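# Usage sketch, mirroring how run() calls this function: evaluation always passes
# one head-corruption sampler and one tail-corruption sampler, e.g.
#   test(args, model, [test_sampler_head, test_sampler_tail])
# and each printed metric is the average over the logs collected from both samplers.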
from models import KEModel
from torch.utils.data import DataLoader
import torch.optim as optim
import torch as th
import torch.multiprocessing as mp
from distutils.version import LooseVersion
TH_VERSION = LooseVersion(th.__version__)
if TH_VERSION < LooseVersion("1.2"):
    raise Exception("DGL-KE requires PyTorch version >= 1.2")
import os
import logging
import time
def load_model(logger, args, n_entities, n_relations, ckpt=None):
model = KEModel(args, args.model_name, n_entities, n_relations,
args.hidden_dim, args.gamma,
double_entity_emb=args.double_ent, double_relation_emb=args.double_rel)
if ckpt is not None:
# TODO: loading model embeddings only works for the general Embedding, not for ExternalEmbedding
model.load_state_dict(ckpt['model_state_dict'])
return model
def load_model_from_checkpoint(logger, args, n_entities, n_relations, ckpt_path):
model = load_model(logger, args, n_entities, n_relations)
model.load_emb(ckpt_path, args.dataset)
return model
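# Usage sketch only (the checkpoint path is a placeholder): restoring embeddings
# that were previously written by model.save_emb() in run(), e.g.
#   model = load_model_from_checkpoint(logger, args, n_entities, n_relations,
#                                      'ckpts/TransE_FB15k_0')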
def train(args, model, train_sampler, valid_samplers=None):
if args.num_proc > 1:
th.set_num_threads(1)
logs = []
for arg in vars(args):
logging.info('{:20}:{}'.format(arg, getattr(args, arg)))
start = time.time()
update_time = 0
forward_time = 0
backward_time = 0
for step in range(args.init_step, args.max_step):
pos_g, neg_g = next(train_sampler)
args.step = step
start1 = time.time()
loss, log = model.forward(pos_g, neg_g)
forward_time += time.time() - start1
start1 = time.time()
loss.backward()
backward_time += time.time() - start1
start1 = time.time()
model.update()
update_time += time.time() - start1
logs.append(log)
if step % args.log_interval == 0:
for k in logs[0].keys():
v = sum(l[k] for l in logs) / len(logs)
print('[Train]({}/{}) average {}: {}'.format(step, args.max_step, k, v))
logs = []
print('[Train] {} steps take {:.3f} seconds'.format(args.log_interval,
time.time() - start))
print('forward: {:.3f}, backward: {:.3f}, update: {:.3f}'.format(forward_time,
backward_time,
update_time))
update_time = 0
forward_time = 0
backward_time = 0
start = time.time()
if args.valid and step % args.eval_interval == 0 and step > 1 and valid_samplers is not None:
start = time.time()
test(args, model, valid_samplers, mode='Valid')
print('test:', time.time() - start)
def test(args, model, test_samplers, mode='Test'):
if args.num_proc > 1:
th.set_num_threads(1)
start = time.time()
with th.no_grad():
logs = []
for sampler in test_samplers:
count = 0
for pos_g, neg_g in sampler:
with th.no_grad():
model.forward_test(pos_g, neg_g, logs, args.gpu)
metrics = {}
if len(logs) > 0:
for metric in logs[0].keys():
metrics[metric] = sum([log[metric] for log in logs]) / len(logs)
for k, v in metrics.items():
print('{} average {} at [{}/{}]: {}'.format(mode, k, args.step, args.max_step, v))
print('test:', time.time() - start)
test_samplers[0] = test_samplers[0].reset()
test_samplers[1] = test_samplers[1].reset()
...@@ -39,7 +39,7 @@ TAGConv
:show-inheritance:
Global Pooling Layers
----------------------------------------
.. automodule:: dgl.nn.mxnet.glob