1. 04 Jun, 2020 1 commit
    • [KVStore] Re-write kvstore using DGL RPC infrastructure (#1569) · 64f49703
      Chao Ma authored
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update init_data
      
      * update server_state
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * debug init_data
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * test get_meta_data
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * debug push
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * use F.reverse_data_type_dict
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * fix lint
      
      * update
      
      * fix lint
      
      * update
      
      * fix lint
      
      * update
      
      * update
      
      * update
      
      * update
      
      * fix test
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * set random seed
      
      * update
  2. 06 May, 2020 1 commit
    • [Test] Add tests for TensorFlow (#1501) · 16561a2e
      Da Zheng authored
      
      
      * add test.
      
      * move test code.
      
      * remove unnecessary test.
      
      * fix.
      
      * turn on tests for TF.
      
      * Revert "move test code."
      
      This reverts commit e7b4f36395b2121a7be030bd4364a704d0e357bf.
      
      * fix.
      
      * fix.
      
      * skip test for tensorflow.
      Co-authored-by: Chao Ma <mctt90@gmail.com>
  3. 05 Apr, 2020 1 commit
  4. 28 Mar, 2020 1 commit
  5. 08 Oct, 2019 1 commit
    • [Bug fix] Fix concurrency bug reported at issue#755 (#823) · fd1b1474
      xiang song(charlie.song) authored
      * upd
      
      * fix edgebatch edges
      
      * add test
      
      * trigger
      
      * Update README.md for pytorch PinSage example.
      
      Add a note that the PinSage model example under
      example/pytorch/recommendation only works with Python 3.6+,
      as its dataset loader depends on the stanfordnlp package,
      which works only with Python 3.6+.
      
      * Provide a framework-agnostic API to test nn modules on both CPU and CUDA.
      
      1. make dgl.nn.xxx framework agnostic
      2. make test.backend include dgl.nn modules
      3. modify test_edge_softmax of test/mxnet/test_nn.py and
          test/pytorch/test_nn.py to work on both CPU and GPU
      
      * Fix style
      
      * Delete unused code
      
      * Make agnostic test only related to tests/backend
      
      1. clear all agnostic related code in dgl.nn
      2. make test_graph_conv agnostic to cpu/gpu
      
      * Fix code style
      
      * fix
      
      * doc
      
      * Make all test code under tests.mxnet/pytorch.test_nn.py
      work on both CPU and GPU.
      
      * Fix syntax
      
      * Remove rand
      
      * Add TAGCN nn.module and example
      
      * Now tagcn can run on CPU.
      
      * Add unit test for TGConv
      
      * Fix style
      
      * For pubmed dataset, using --lr=0.005 can achieve better acc
      
      * Fix style
      
      * Fix some descriptions
      
      * trigger
      
      * Fix doc
      
      * Add nn.TGConv and example
      
      * Fix bug
      
      * Update data in mxnet.tagcn test acc.
      
      * Fix some comments and code
      
      * delete useless code
      
      * Fix naming
      
      * Fix bug
      
      * Fix bug
      
      * Add test for mxnet TAGConv
      
      * Add test code for mxnet TAGConv
      
      * Update some docs
      
      * Fix some code
      
      * Update docs dgl.nn.mxnet
      
      * Update weight init
      
      * Fix
      
      * reproduce the bug
      
      * Fix concurrency bug reported at #755.
      Also make test_shared_mem_store.py more deterministic.
      
      * Update test_shared_mem_store.py
      
      * Update dmlc/core
  6. 30 Aug, 2019 1 commit
  7. 12 Jun, 2019 1 commit
  8. 09 Jun, 2019 1 commit
    • [BUGFIX] Fix bugs in shared mem graph store. (#630) · 2e9949d2
      Da Zheng authored
      * fix graph store for Pytorch.
      
      * add test.
      
      * fix dtype error in test
      
      * disable test on GPU.
      
      * test avoid windows.
      
      * fix shared-memory test.
      
      * use script to control testing environment.
      
      * update test.
      
      * enable all tests.
      
      * fix test script.
  9. 06 Jun, 2019 1 commit
    • [Feature][Kernel] DGL kernel support (#596) · 653428bd
      Lingfan Yu authored
      * [Kernel] Minigun integration and fused kernel support (#519)
      
      * kernel interface
      
      * add minigun
      
      * Add cuda build
      
      * functors
      
      * working on binary elewise
      
      * binary reduce
      
      * change kernel interface
      
      * WIP
      
      * wip
      
      * fix minigun
      
      * compile
      
      * binary reduce kernels
      
      * compile
      
      * simple test passed
      
      * more reducers
      
      * fix thrust problem
      
      * fix cmake
      
      * fix cmake; add proper guard for atomic
      
      * WIP: bcast
      
      * WIP
      
      * bcast kernels
      
      * update to new minigun pass-by-value practice
      
      * broadcasting dim
      
      * add copy src and copy edge
      
      * fix linking
      
      * fix none array problem
      
      * fix copy edge
      
      * add device_type and device_id to backend operator
      
      * cache csr adj, remove cache for adjmat and incmat
      
      * custom ops in backend and pytorch impl
      
      * change dgl-mg kernel python interface
      
      * add id_mapping var
      
      * clean up plus v2e spmv schedule
      
      * spmv schedule & clean up fall back
      
      * symbolic message and reduce func, remove bundle func
      
      * new executors
      
      * new backend interface for dgl kernels and pytorch impl
      
      * minor fix
      
      * fix
      
      * fix docstring, comments, func names
      
      * nodeflow
      
      * fix message id mapping and bugs...
      
      * pytorch test case & fix
      
      * backward binary reduce
      
      * fix bug
      
      * WIP: cusparse
      
      * change to int32 csr for cusparse workaround
      
      * disable cusparse
      
      * change back to int64
      
      * broadcasting backward
      
      * cusparse; WIP: add rev_csr
      
      * unit test for kernels
      
      * pytorch backward with dgl kernel
      
      * edge softmax
      
      * fix backward
      
      * improve softmax
      
      * cache edge on device
      
      * cache mappings on device
      
      * fix partial forward code
      
      * cusparse done
      
      * copy_src_sum with cusparse
      
      * rm id getter
      
      * reduce grad for broadcast
      
      * copy edge reduce backward
      
      * kernel unit test for broadcasting
      
      * full kernel unit test
      
      * add cpu kernels
      
      * edge softmax unit test
      
      * missing ref
      
      * fix compile and small bugs
      
      * fix bug in bcast
      
      * Add backward both
      
      * fix torch utests
      
      * expose infershape
      
      * create out tensor in python
      
      * fix c++ lint
      
      * [Kernel] Add GPU utest and kernel utest (#524)
      
      * fix gpu utest
      
      * cuda utest runnable
      
      * temp disable test nodeflow; unified test for kernel
      
      * cuda test kernel done
      
      * [Kernel] Update kernel branch (#550)
      
      * [Model] add multiprocessing training with sampling. (#484)
      
      * reorganize sampling code.
      
      * add multi-process training.
      
      * speed up gcn_cv
      
      * fix graphsage_cv.
      
      * add new API in graph store.
      
      * update barrier impl.
      
      * support both local and distributed training.
      
      * fix multiprocess train.
      
      * fix.
      
      * fix barrier.
      
      * add script for loading data.
      
      * multiprocessing sampling.
      
      * accel training.
      
      * replace pull with spmv for speedup.
      
      * nodeflow copy from parent with context.
      
      * enable GPU.
      
      * fix a bug in graph store.
      
      * enable multi-GPU training.
      
      * fix lint.
      
      * add comments.
      
      * rename to run_store_server.py
      
      * fix gcn_cv.
      
      * fix a minor bug in sampler.
      
      * handle error better in graph store.
      
      * improve graphsage_cv for distributed mode.
      
      * update README.
      
      * fix.
      
      * update.
      
      * [Tutorial] add sampling tutorial. (#522)
      
      * add sampling tutorial.
      
      * add readme
      
      * update author list.
      
      * fix indent in the code.
      
      * rename the file.
      
      * update tutorial.
      
      * fix the last API.
      
      * update image.
      
      * [BUGFIX] fix the problems in the sampling tutorial. (#523)
      
      * add index.
      
      * update.
      
      * update tutorial.
      
      * fix gpu utest
      
      * cuda utest runnable
      
      * temp disable test nodeflow; unified test for kernel
      
      * cuda test kernel done
      
      * Fixing typo in JTNN after interface change (#536)
      
      * [BugFix] Fix getting src and dst id of ALL edges in NodeFlow.apply_block (#515)
      
      * [Bug Fix] Fix inplace op at backend (#546)
      
      * Fix inplace operation
      
      * fix line separator
      
      * [Feature] Add batch and unbatch for immutable graph (#539)
      
      * Add batch and unbatch for immutable graph
      
      * fix line separator
      
      * fix lintr
      
      * remove unnecessary include
      
      * fix code review
      
      * [BUGFix] Improve multi-processing training (#526)
      
      * fix.
      
      * add comment.
      
      * remove.
      
      * temp fix.
      
      * initialize for shared memory.
      
      * fix graphsage.
      
      * fix gcn.
      
      * add more unit tests.
      
      * add more tests.
      
      * avoid creating shared-memory exclusively.
      
      * redefine remote initializer.
      
      * improve initializer.
      
      * fix unit test.
      
      * fix lint.
      
      * fix lint.
      
      * initialize data in the graph store server properly.
      
      * fix test.
      
      * fix test.
      
      * fix test.
      
      * small fix.
      
      * add comments.
      
      * cleanup server.
      
      * test graph store with a random port.
      
      * print.
      
      * print to stderr.
      
      * test1
      
      * test2
      
      * remove comment.
      
      * adjust the initializer signature.
      
      * [API] update graph store API. (#549)
      
      * add init_ndata and init_edata in DGLGraph.
      
      * adjust SharedMemoryGraph API.
      
      * print warning.
      
      * fix comment.
      
      * update example
      
      * fix.
      
      * fix examples.
      
      * add unit tests.
      
      * add comments.
      
      * [Refactor] Immutable graph index (#543)
      
      * WIP
      
      * header
      
      * WIP .cc
      
      * WIP
      
      * transpose
      
      * wip
      
      * immutable graph .h and .cc
      
      * WIP: nodeflow.cc
      
      * compile
      
      * remove all tmp dl managed ctx; they caused refcount issue
      
      * one simple test
      
      * WIP: testing
      
      * test_graph
      
      * fix graph index
      
      * fix bug in sampler; pass pytorch utest
      
      * WIP on mxnet
      
      * fix lint
      
      * fix mxnet unittest w/ unfortunate workaround
      
      * fix msvc
      
      * fix lint
      
      * SliceRows and test_nodeflow
      
      * resolve reviews
      
      * resolve reviews
      
      * try fix win ci
      
      * try fix win ci
      
      * poke win ci again
      
      * poke
      
      * lazy multigraph flag; stackoverflow error
      
      * revert node subgraph test
      
      * lazy object
      
      * try fix win build
      
      * try fix win build
      
      * poke ci
      
      * fix build script
      
      * fix compile
      
      * add a todo
      
      * fix reviews
      
      * fix compile
      
      * [Kernel] Update kernel branch (#576)
      
      * [Model] add multiprocessing training with sampling. (#484)
      
      * reorganize sampling code.
      
      * add multi-process training.
      
      * speed up gcn_cv
      
      * fix graphsage_cv.
      
      * add new API in graph store.
      
      * update barrier impl.
      
      * support both local and distributed training.
      
      * fix multiprocess train.
      
      * fix.
      
      * fix barrier.
      
      * add script for loading data.
      
      * multiprocessing sampling.
      
      * accel training.
      
      * replace pull with spmv for speedup.
      
      * nodeflow copy from parent with context.
      
      * enable GPU.
      
      * fix a bug in graph store.
      
      * enable multi-GPU training.
      
      * fix lint.
      
      * add comments.
      
      * rename to run_store_server.py
      
      * fix gcn_cv.
      
      * fix a minor bug in sampler.
      
      * handle error better in graph store.
      
      * improve graphsage_cv for distributed mode.
      
      * update README.
      
      * fix.
      
      * update.
      
      * [Tutorial] add sampling tutorial. (#522)
      
      * add sampling tutorial.
      
      * add readme
      
      * update author list.
      
      * fix indent in the code.
      
      * rename the file.
      
      * update tutorial.
      
      * fix the last API.
      
      * update image.
      
      * [BUGFIX] fix the problems in the sampling tutorial. (#523)
      
      * add index.
      
      * update.
      
      * update tutorial.
      
      * fix gpu utest
      
      * cuda utest runnable
      
      * temp disable test nodeflow; unified test for kernel
      
      * cuda test kernel done
      
      * Fixing typo in JTNN after interface change (#536)
      
      * [BugFix] Fix getting src and dst id of ALL edges in NodeFlow.apply_block (#515)
      
      * [Bug Fix] Fix inplace op at backend (#546)
      
      * Fix inplace operation
      
      * fix line separator
      
      * [Feature] Add batch and unbatch for immutable graph (#539)
      
      * Add batch and unbatch for immutable graph
      
      * fix line separator
      
      * fix lintr
      
      * remove unnecessary include
      
      * fix code review
      
      * [BUGFix] Improve multi-processing training (#526)
      
      * fix.
      
      * add comment.
      
      * remove.
      
      * temp fix.
      
      * initialize for shared memory.
      
      * fix graphsage.
      
      * fix gcn.
      
      * add more unit tests.
      
      * add more tests.
      
      * avoid creating shared-memory exclusively.
      
      * redefine remote initializer.
      
      * improve initializer.
      
      * fix unit test.
      
      * fix lint.
      
      * fix lint.
      
      * initialize data in the graph store server properly.
      
      * fix test.
      
      * fix test.
      
      * fix test.
      
      * small fix.
      
      * add comments.
      
      * cleanup server.
      
      * test graph store with a random port.
      
      * print.
      
      * print to stderr.
      
      * test1
      
      * test2
      
      * remove comment.
      
      * adjust the initializer signature.
      
      * [API] update graph store API. (#549)
      
      * add init_ndata and init_edata in DGLGraph.
      
      * adjust SharedMemoryGraph API.
      
      * print warning.
      
      * fix comment.
      
      * update example
      
      * fix.
      
      * fix examples.
      
      * add unit tests.
      
      * add comments.
      
      * [Refactor] Immutable graph index (#543)
      
      * WIP
      
      * header
      
      * WIP .cc
      
      * WIP
      
      * transpose
      
      * wip
      
      * immutable graph .h and .cc
      
      * WIP: nodeflow.cc
      
      * compile
      
      * remove all tmp dl managed ctx; they caused refcount issue
      
      * one simple test
      
      * WIP: testing
      
      * test_graph
      
      * fix graph index
      
      * fix bug in sampler; pass pytorch utest
      
      * WIP on mxnet
      
      * fix lint
      
      * fix mxnet unittest w/ unfortunate workaround
      
      * fix msvc
      
      * fix lint
      
      * SliceRows and test_nodeflow
      
      * resolve reviews
      
      * resolve reviews
      
      * try fix win ci
      
      * try fix win ci
      
      * poke win ci again
      
      * poke
      
      * lazy multigraph flag; stackoverflow error
      
      * revert node subgraph test
      
      * lazy object
      
      * try fix win build
      
      * try fix win build
      
      * poke ci
      
      * fix build script
      
      * fix compile
      
      * add a todo
      
      * fix reviews
      
      * fix compile
      
      * all demo use python-3 (#555)
      
      * [DEMO] Reproduce numbers of distributed training in AMLC giant graph paper (#556)
      
      * update
      
      * update
      
      * update
      
      * update num_hops
      
      * fix bug
      
      * update
      
      * report numbers of distributed training in AMLC giant graph paper
      
      * [DEMO] Remove duplicate code for sampling (#557)
      
      * update
      
      * update
      
      * re-use single-machine code
      
      * update
      
      * use relative path
      
      * update
      
      * update
      
      * update
      
      * add __init__.py
      
      * add __init__.py
      
      * import sys, os
      
      * fix typo
      
      * update
      
      * [Perf] Improve performance of graph store. (#554)
      
      * fix.
      
      * use inplace.
      
      * move to shared memory graph store.
      
      * fix.
      
      * add more unit tests.
      
      * fix.
      
      * fix test.
      
      * fix test.
      
      * disable test.
      
      * fix.
      
      * [BUGFIX] fix a bug in edge_ids (#560)
      
      * add test.
      
      * fix compute.
      
      * fix test.
      
      * turn on test.
      
      * fix a bug.
      
      * add test.
      
      * fix.
      
      * disable test.
      
      * [DEMO] Add Pytorch demo for distributed sampler (#562)
      
      * update
      
      * update
      
      * update
      
      * add sender
      
      * update
      
      * remove duplicate code
      
      * [Test] Add gtest to project (#547)
      
      * add gtest module
      
      * add gtest
      
      * fix
      
      * Update CMakeLists.txt
      
      * Update README.md
      
      * [Perf] lazily create msg_index. (#563)
      
      * lazily create msg_index.
      
      * update test.
      
      * [BUGFIX] fix bugs for running GCN on giant graphs. (#561)
      
      * load mxnet csr.
      
      * enable load large csr.
      
      * fix
      
      * fix.
      
      * fix int overflow.
      
      * fix test.
      
      * [BugFix] Fix error when bfs_level = 0 in Entity Classification with RGCN (#559)
      
      * [DEMO] Update demo of distributed sampler (#564)
      
      * update
      
      * update
      
      * update demo
      
      * add network cpp test (#565)
      
      * Add unittest for C++ RPC (#566)
      
      * [CI] Fix CI for cpp test (#570)
      
      * fix CI for cpp test
      
      * update port number
      
      * [Docker] update docker image (#575)
      
      * update docker image
      
      * specify lint version
      
      * rm torch import from unified tests
      
      * [Kernel][Scheduler][MXNet] Scheduler for DGL kernels and MXNet backend support (#541)
      
      * [Model] add multiprocessing training with sampling. (#484)
      
      * reorganize sampling code.
      
      * add multi-process training.
      
      * speed up gcn_cv
      
      * fix graphsage_cv.
      
      * add new API in graph store.
      
      * update barrier impl.
      
      * support both local and distributed training.
      
      * fix multiprocess train.
      
      * fix.
      
      * fix barrier.
      
      * add script for loading data.
      
      * multiprocessing sampling.
      
      * accel training.
      
      * replace pull with spmv for speedup.
      
      * nodeflow copy from parent with context.
      
      * enable GPU.
      
      * fix a bug in graph store.
      
      * enable multi-GPU training.
      
      * fix lint.
      
      * add comments.
      
      * rename to run_store_server.py
      
      * fix gcn_cv.
      
      * fix a minor bug in sampler.
      
      * handle error better in graph store.
      
      * improve graphsage_cv for distributed mode.
      
      * update README.
      
      * fix.
      
      * update.
      
      * [Tutorial] add sampling tutorial. (#522)
      
      * add sampling tutorial.
      
      * add readme
      
      * update author list.
      
      * fix indent in the code.
      
      * rename the file.
      
      * update tutorial.
      
      * fix the last API.
      
      * update image.
      
      * [BUGFIX] fix the problems in the sampling tutorial. (#523)
      
      * add index.
      
      * update.
      
      * update tutorial.
      
      * fix gpu utest
      
      * cuda utest runnable
      
      * temp disable test nodeflow; unified test for kernel
      
      * cuda test kernel done
      
      * edge softmax module
      
      * WIP
      
      * Fixing typo in JTNN after interface change (#536)
      
      * mxnet backend support
      
      * improve reduce grad
      
      * add max to unittest backend
      
      * fix kernel unittest
      
      * [BugFix] Fix getting src and dst id of ALL edges in NodeFlow.apply_block (#515)
      
      * lint
      
      * lint
      
      * win build
      
      * [Bug Fix] Fix inplace op at backend (#546)
      
      * Fix inplace operation
      
      * fix line separator
      
      * [Feature] Add batch and unbatch for immutable graph (#539)
      
      * Add batch and unbatch for immutable graph
      
      * fix line separator
      
      * fix lintr
      
      * remove unnecessary include
      
      * fix code review
      
      * [BUGFix] Improve multi-processing training (#526)
      
      * fix.
      
      * add comment.
      
      * remove.
      
      * temp fix.
      
      * initialize for shared memory.
      
      * fix graphsage.
      
      * fix gcn.
      
      * add more unit tests.
      
      * add more tests.
      
      * avoid creating shared-memory exclusively.
      
      * redefine remote initializer.
      
      * improve initializer.
      
      * fix unit test.
      
      * fix lint.
      
      * fix lint.
      
      * initialize data in the graph store server properly.
      
      * fix test.
      
      * fix test.
      
      * fix test.
      
      * small fix.
      
      * add comments.
      
      * cleanup server.
      
      * test graph store with a random port.
      
      * print.
      
      * print to stderr.
      
      * test1
      
      * test2
      
      * remove comment.
      
      * adjust the initializer signature.
      
      * try
      
      * fix
      
      * fix
      
      * fix
      
      * fix
      
      * fix
      
      * try
      
      * test
      
      * test
      
      * test
      
      * try
      
      * try
      
      * try
      
      * test
      
      * fix
      
      * try gen_target
      
      * fix gen_target
      
      * fix msvc var_args expand issue
      
      * fix
      
      * [API] update graph store API. (#549)
      
      * add init_ndata and init_edata in DGLGraph.
      
      * adjust SharedMemoryGraph API.
      
      * print warning.
      
      * fix comment.
      
      * update example
      
      * fix.
      
      * fix examples.
      
      * add unit tests.
      
      * add comments.
      
      * [Refactor] Immutable graph index (#543)
      
      * WIP
      
      * header
      
      * WIP .cc
      
      * WIP
      
      * transpose
      
      * wip
      
      * immutable graph .h and .cc
      
      * WIP: nodeflow.cc
      
      * compile
      
      * remove all tmp dl managed ctx; they caused refcount issue
      
      * one simple test
      
      * WIP: testing
      
      * test_graph
      
      * fix graph index
      
      * fix bug in sampler; pass pytorch utest
      
      * WIP on mxnet
      
      * fix lint
      
      * fix mxnet unittest w/ unfortunate workaround
      
      * fix msvc
      
      * fix lint
      
      * SliceRows and test_nodeflow
      
      * resolve reviews
      
      * resolve reviews
      
      * try fix win ci
      
      * try fix win ci
      
      * poke win ci again
      
      * poke
      
      * lazy multigraph flag; stackoverflow error
      
      * revert node subgraph test
      
      * lazy object
      
      * try fix win build
      
      * try fix win build
      
      * poke ci
      
      * fix build script
      
      * fix compile
      
      * add a todo
      
      * fix reviews
      
      * fix compile
      
      * WIP
      
      * WIP
      
      * all demo use python-3 (#555)
      
      * ToImmutable and CopyTo
      
      * [DEMO] Reproduce numbers of distributed training in AMLC giant graph paper (#556)
      
      * update
      
      * update
      
      * update
      
      * update num_hops
      
      * fix bug
      
      * update
      
      * report numbers of distributed training in AMLC giant graph paper
      
      * [DEMO] Remove duplicate code for sampling (#557)
      
      * update
      
      * update
      
      * re-use single-machine code
      
      * update
      
      * use relative path
      
      * update
      
      * update
      
      * update
      
      * add __init__.py
      
      * add __init__.py
      
      * import sys, os
      
      * fix typo
      
      * update
      
      * [Perf] Improve performance of graph store. (#554)
      
      * fix.
      
      * use inplace.
      
      * move to shared memory graph store.
      
      * fix.
      
      * add more unit tests.
      
      * fix.
      
      * fix test.
      
      * fix test.
      
      * disable test.
      
      * fix.
      
      * [BUGFIX] fix a bug in edge_ids (#560)
      
      * add test.
      
      * fix compute.
      
      * fix test.
      
      * turn on test.
      
      * fix a bug.
      
      * add test.
      
      * fix.
      
      * disable test.
      
      * DGLRetValue DGLContext conversion
      
      * [DEMO] Add Pytorch demo for distributed sampler (#562)
      
      * update
      
      * update
      
      * update
      
      * add sender
      
      * update
      
      * remove duplicate code
      
      * [Test] Add gtest to project (#547)
      
      * add gtest module
      
      * add gtest
      
      * fix
      
      * Update CMakeLists.txt
      
      * Update README.md
      
      * Add support to convert immutable graph to 32 bits
      
      * [Perf] lazily create msg_index. (#563)
      
      * lazily create msg_index.
      
      * update test.
      
      * fix binary reduce following new minigun template
      
      * enable both int64 and int32 kernels
      
      * [BUGFIX] fix bugs for running GCN on giant graphs. (#561)
      
      * load mxnet csr.
      
      * enable load large csr.
      
      * fix
      
      * fix.
      
      * fix int overflow.
      
      * fix test.
      
      * new kernel interface done for CPU
      
      * docstring
      
      * rename & docstring
      
      * copy reduce and backward
      
      * [BugFix] Fix error when bfs_level = 0 in Entity Classification with RGCN (#559)
      
      * [DEMO] Update demo of distributed sampler (#564)
      
      * update
      
      * update
      
      * update demo
      
      * adapt cuda kernels to the new interface
      
      * add network cpp test (#565)
      
      * fix bug
      
      * Add unittest for C++ RPC (#566)
      
      * [CI] Fix CI for cpp test (#570)
      
      * fix CI for cpp test
      
      * update port number
      
      * [Docker] update docker image (#575)
      
      * update docker image
      
      * specify lint version
      
      * rm torch import from unified tests
      
      * remove pytorch-specific test_function
      
      * fix unittest
      
      * fix
      
      * fix unittest backend bug in converting tensor to numpy array
      
      * fix
      
      * mxnet version
      
      * [BUGFIX] fix for MXNet 1.5. (#552)
      
      * remove clone.
      
      * turn on numpy compatible.
      
      * Revert "remove clone."
      
      This reverts commit 17bbf76ed72ff178df6b3f35addc428048672457.
      
      * revert format changes
      
      * fix mxnet api name
      
      * revert mistakes in previous revert
      
      * roll back CI to 20190523 build
      
      * fix unittest
      
      * disable test_shared_mem_store.py for now
      
      * remove mxnet/test_specialization.py
      
      * sync win64 test script
      
      * fix lowercase
      
      * missing backend in gpu unit test
      
      * transpose to get forward graph
      
      * pass update all
      
      * add sanity check
      
      * passing test_specialization.py
      
      * fix and pass test_function
      
      * fix check
      
      * fix pytorch softmax
      
      * mxnet kernels
      
      * c++ lint
      
      * pylint
      
      * try
      
      * win build
      
      * fix
      
      * win
      
      * ci enable gpu build
      
      * init submodule recursively
      
      * backend docstring
      
      * try
      
      * test win dev
      
      * doc string
      
      * disable pytorch test_nn
      
      * try to fix windows issue
      
      * bug fixed, revert changes
      
      * [Test] fix CI. (#586)
      
      * disable unit test in mxnet tutorial.
      
      * retry socket connection.
      
      * roll back to set_np_compat
      
      * try to fix multi-processing test hangs when it fails.
      
      * fix test.
      
      * fix.
      
      * doc string
      
      * doc string and clean up
      
      * missing field in ctypes
      
      * fix node flow schedule and unit test
      
      * rename
      
      * pylint
      
      * copy from parent default context
      
      * fix unit test script
      
      * fix
      
      * demo bug in nodeflow gpu test
      
      * [Kernel][Bugfix] fix nodeflow bug (#604)
      
      * fix nodeflow bug
      
      * remove debug code
      
      * add build gtest option
      
      * fix cmake; fix graph index bug in spmv.py
      
      * remove clone
      
      * fix div rhs grad bug
      
      * [Kernel] Support full builtin method, edge softmax and unit tests (#605)
      
      * add full builtin support
      
      * unit test
      
      * unit test backend
      
      * edge softmax
      
      * apply edge with builtin
      
      * fix kernel unit test
      
      * disable mxnet test_shared_mem_store
      
      * gen builtin reduce
      
      * enable mxnet gpu unittest
      
      * revert some changes
      
      * docstring
      
      * add note for the hack
      
      * [Kernel][Unittest][CI] Fix MXNet GPU CI (#607)
      
      * update docker image for MXNet GPU CI
      
      * force all dgl graph input and output on CPU
      
      * fix gpu unittest
      
      * speedup compilation
      
      * add some comments
      
      * lint
      
      * add more comments
      
      * fix as requested
      
      * add some comments
      
      * comment
      
      * lint
      
      * lint
      
      * update pylint
      
      * fix as requested
      
      * lint
      
      * lint
      
      * lint
      
      * docstrings of python DGL kernel entries
      
      * disable lint warnings on arguments in kernel.py
      
      * fix docstring in scheduler
      
      * fix some bug in unittest; try again
      
      * Revert "Merge branch 'kernel' of github.com:zzhang-cn/dgl into kernel"
      
      This reverts commit 1d2299e68b004182ea6130b088de1f1122b18a49, reversing
      changes made to ddc97fbf1bec2b7815c0da7c74f7ecb2f428889b.
      
      * Revert "fix some bug in unittest; try again"
      
      This reverts commit ddc97fbf1bec2b7815c0da7c74f7ecb2f428889b.
      
      * more comprehensive kernel test
      
      * remove shape check in test_specialization
  10. 02 Jun, 2019 1 commit
    • [Test] Fix tests in test_shared_mem_store. (#588) · 372203f0
      Da Zheng authored
      * fix test.
      
      * better assert.
      
      * more asserts.
      
      * print to stderr.
      
      * destroy g.
      
      * fix tests.
      
      * add timeout in sync_barrier.
      
      * test _sync_barrier.
      
      * fix.
      
      * avoid printing messages.
      
      * fix test.
      
      * fix test.
      
      * fix.
  11. 30 May, 2019 1 commit
    • [Test] fix CI. (#586) · 40dc1859
      Da Zheng authored
      * disable unit test in mxnet tutorial.
      
      * retry socket connection.
      
      * roll back to set_np_compat
      
      * try to fix multi-processing test hangs when it fails.
      
      * fix test.
      
      * fix.
  12. 23 May, 2019 2 commits
  13. 21 May, 2019 1 commit
    • [API] update graph store API. (#549) · b2b8be25
      Da Zheng authored
      * add init_ndata and init_edata in DGLGraph.
      
      * adjust SharedMemoryGraph API.
      
      * print warning.
      
      * fix comment.
      
      * update example
      
      * fix.
      
      * fix examples.
      
      * add unit tests.
      
      * add comments.
  14. 20 May, 2019 1 commit
    • [BUGFix] Improve multi-processing training (#526) · cdfca992
      Da Zheng authored
      * fix.
      
      * add comment.
      
      * remove.
      
      * temp fix.
      
      * initialize for shared memory.
      
      * fix graphsage.
      
      * fix gcn.
      
      * add more unit tests.
      
      * add more tests.
      
      * avoid creating shared-memory exclusively.
      
      * redefine remote initializer.
      
      * improve initializer.
      
      * fix unit test.
      
      * fix lint.
      
      * fix lint.
      
      * initialize data in the graph store server properly.
      
      * fix test.
      
      * fix test.
      
      * fix test.
      
      * small fix.
      
      * add comments.
      
      * cleanup server.
      
      * test graph store with a random port.
      
      * print.
      
      * print to stderr.
      
      * test1
      
      * test2
      
      * remove comment.
      
      * adjust the initializer signature.
  15. 07 May, 2019 1 commit
    • [Model] add multiprocessing training with sampling. (#484) · 3a1392e6
      Da Zheng authored
      * reorganize sampling code.
      
      * add multi-process training.
      
      * speed up gcn_cv
      
      * fix graphsage_cv.
      
      * add new API in graph store.
      
      * update barrier impl.
      
      * support both local and distributed training.
      
      * fix multiprocess train.
      
      * fix.
      
      * fix barrier.
      
      * add script for loading data.
      
      * multiprocessing sampling.
      
      * accel training.
      
      * replace pull with spmv for speedup.
      
      * nodeflow copy from parent with context.
      
      * enable GPU.
      
      * fix a bug in graph store.
      
      * enable multi-GPU training.
      
      * fix lint.
      
      * add comments.
      
      * rename to run_store_server.py
      
      * fix gcn_cv.
      
      * fix a minor bug in sampler.
      
      * handle error better in graph store.
      
      * improve graphsage_cv for distributed mode.
      
      * update README.
      
      * fix.
      
      * update.
  16. 16 Apr, 2019 1 commit
  17. 08 Apr, 2019 1 commit
    • [Feature] Create shared memory graph store. (#468) · bfdd1eaa
      Da Zheng authored
      * accelerate gcn_ns.
      
      * add timing.
      
      * run infer with whole graph.
      
      * distributed gcn_ns.
      
      * reconstruct gcn_ns.
      
      * minor fix.
      
      * change graphsage_cv for numa.
      
      * fix #OMP threads.
      
      * accelerate graphsage_cv.
      
      * fix a weird bug.
      
      * add profiler in graphsage_cv.
      
      * accelerate graphsage_cv.
      
      manually aggregate neighbors' embeddings with pull.
      
      * load csr directly in gcn_ns_sc.
      
      * parallel sort for graph index.
      
      * Revert "parallel sort for graph index."
      
      This reverts commit 86fe2c7117fe5e56b0d481b39849c258b166945b.
      
      * run gcn_ns_sc on GPUs.
      
      * acc gcn_cv_sc.
      
      * change gcn_cv for numa.
      
      * fix gcn_cv to use numa and gpu.
      
      * improve graphsage_cv to use numa and gpu.
      
      * improve gcn_ns.
      
      * improve graphsage_cv.
      
      * init shared memory graph store.
      
      * fix.
      
      * enable init ndata.
      
      * improve tests.
      
      * add bidirectional communication.
      
      * link to rt.
      
      * fix compilation error.
      
      * fix shared memory init.
      
      * use MessageQueue for inter-process communication.
      
      * reconstruct immutable graph csr.
      
      * fix gcn.
      
      * load csr to shared memory.
      
      * fix minor bugs.
      
      * add comments.
      
      * refactor SharedMemory.
      
      * fix bugs in ImmutableGraph.
      
      * create CSR graph from shared memory.
      
      * add more test for loading a csr graph.
      
      * terminate graph store properly.
      
      * allow initializing ndata in the graph store server.
      
      * use RPC for inter-process communication.
      
      * a script for loading a graph.
      
      * allow customizing port.
      
      * list all ndata and edata.
      
      * support dtype.
      
      * reorganize SharedMemoryGraphStore.
      
      * fix ndata shape.
      
      * reconstruct gcn_ns.
      
      * print info.
      
      * set omp in gcn_ns.
      
      * reset sampling examples.
      
      * fix lint.
      
      * fix lint.
      
      * reset gcn.
      
      * disable shared memory in windows.
      
      * fix.
      
      * fix.
      
      * reset changes.
      
      * revert nodeflow changes.
      
      * fix cmake.
      
      * fix test.
      
      * fix test.
      
      * fix test.
      
      * fix test.
      
      * add comments.
      
      * fix test.
      
      * move vector out.
      
      * fix lint.
      
      * fix lint.
      
      * move SharedMemory.
      
      * update cmake.
      
      * update comment.
      
      * fix comments.
      
      * Revert "update cmake."
      
      This reverts commit 592445e37077f70a6e3f2e5245f9a3d086b04f3b.
      
      * update cmake.
      
      * add comments.
      
      * rename.
      
      * change the comment.
      
      * fix a bug.
      
      * rename.
      
      * add comments.
      
      * add comments.
      
      * add init_edata.
      
      * rewrite memory alloc.
      
      * move vector to CSR.
      
      * fix.
      
      * init data.
      
      * Revert "init data."
      
      This reverts commit 2b217b9553911b7dd84a9f1d9b68430b5aa18e23.
      
      * init data.
      
      * init new columns correctly.