- 03 May, 2020 1 commit
Da Zheng authored
* initial version from distributed training. This is copied from multiprocessing training. * modify for distributed training. * it's runnable now. * measure time in neighbor sampling. * simplify neighbor sampling. * fix a bug in distributed neighbor sampling. * allow single-machine training. * fix a bug. * fix a bug. * fix openmp. * make some improvement. * fix. * add prepare in the sampler. * prepare nodeflow async. * fix a bug. * get id. * simplify the code. * improve. * fix partition.py * fix the example. * add more features. * fix the example. * allow one partition * use distributed kvstore. * do g2l map manually. * fix commandline. * a temp script to save reddit. * fix pull_handler. * add pytorch version. * estimate the time for copying data. * delete unused code. * fix a bug. * print id. * fix a bug * fix a bug * fix a bug. * remove redundent code. * revert modify in sampler. * fix temp script. * remove pytorch version. * fix. * distributed training with pytorch. * add distributed graph store. * fix. * add metis_partition_assignment. * fix a few bugs in distributed graph store. * fix test. * fix bugs in distributed graph store. * fix tests. * remove code of defining DistGraphStore. * fix partition. * fix example. * update run.sh. * only read necessary node data. * batching data fetch of multiple NodeFlows. * simplify gcn. * remove unnecessary code. * use the new copy_from_kvstore. * update training script. * print time in graphsage. * make distributed training runnable. * use val_nid. * fix train_sampling. * add distributed training. * add run.sh * add more timing. * fix a bug. * save graph metadata when partition. * create ndata and edata in distributed graph store. * add timing in minibatch training of GraphSage. * use pytorch distributed. * add checks. * fix a bug in global vs. local ids. * remove fast pull * fix a compile error. * update and add new APIs. * implement more methods in DistGraphStore. * update more APIs. * rename it to DistGraph. 
* rename to DistTensor * remove some unnecessary API. * remove unnecessary files. * revert changes in sampler. * Revert "simplify gcn." This reverts commit 0ed3a34ca714203a5b45240af71555d4227ce452. * Revert "simplify neighbor sampling." This reverts commit 551c72d20f05a029360ba97f312c7a7a578aacec. * Revert "measure time in neighbor sampling." This reverts commit 63ae80c7b402bb626e24acbbc8fdfe9fffd0bc64. * Revert "add timing in minibatch training of GraphSage." This reverts commit e59dc8957a414c7df5c316f51d78bce822bdef5e. * Revert "fix train_sampling." This reverts commit ea6aea9a4aabb8ba0ff63070aa51e7ca81536ad9. * fix lint. * add comments and small update. * add more comments. * add more unit tests and fix bugs. * check the existence of shared-mem graph index. * use new partitioned graph storage. * fix bugs. * print error in fast pull. * fix lint * fix a compile error. * save absolute path after partitioning. * small fixes in the example * Revert "[kvstore] support any data type for init_data() (#1465)" This reverts commit 87b6997b. * fix a bug. * disable evaluation. * Revert "Revert "[kvstore] support any data type for init_data() (#1465)"" This reverts commit f5b8039c6326eb73bad8287db3d30d93175e5bee. * support set and init data. * support set and init data. * Revert "Revert "[kvstore] support any data type for init_data() (#1465)"" This reverts commit f5b8039c6326eb73bad8287db3d30d93175e5bee. * fix bugs. * fix unit test. * move to dgl.distributed. * fix lint. * fix lint. * remove local_nids. * fix lint. * fix test. * remove train_dist. * revert train_sampling. * rename funcs. * address comments. * address comments. Use NodeDataView/EdgeDataView to keep track of data. * address comments. * address comments. * revert. * save data with DGL serializer. * use the right way of getting shape. * fix lint. * address comments. * address comments. * fix an error in mxnet. * address comments. * add edge_map. * add more test and fix bugs.
Co-authored-by: Zheng <dzzhen@186590dc80ff.ant.amazon.com>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-6-131.us-east-2.compute.internal>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-26-167.us-east-2.compute.internal>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-16-150.us-west-2.compute.internal>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-16-250.us-west-2.compute.internal>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-30-135.us-west-2.compute.internal>
- 30 Mar, 2020 1 commit
Jinjing Zhou authored
* TF backend fix and new logic to choose backend * fix * fix * fix * fix * fix backend * fix * dlpack alignment * add flag * flag * lint * lint * remove unused * several fixes
Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>
- 07 Mar, 2020 1 commit
Quan (Andy) Gan authored
* add num nodes in ctors * fix * lint * addresses comments * replace with constexpr * remove function with rvalue reference * address comments
- 31 Jan, 2020 1 commit
Quan (Andy) Gan authored
* trying to refactor IndexSelect * partial implementation * add index select and assign for floats as well * move to random choice source * more updates * fixes * fixes * more fixes * adding python impl * fixes * unit test * lint * lint x2 * lint x3 * update metapath2vec * debugging performance * still debugging for performance * tuning * switching to succvec * redo * revert non-uniform sampler to use vector * still not fast * why does this crash with OpenMP??? * because there was a data race!!! * add documentations and remove assign op * lint * lint x2 * lol what have i done * lint x3 * fix and disable gpu testing * bugfix * generic random walk * reorg the random walk source code * Update randomwalks.h * Update randomwalks_cpu.cc * rename file * move internal function to anonymous ns * reorg & docstrings * constant restart probability * docstring fix * more commit * random walk with restart, tested * some fixes * switch to using NDArray for choice * massive fix & docstring * lint x? * lint x?? * fix * export symbols * skip gpu test * addresses comments * replaces another VecToIdArray * add randomwalks.h to include * replace void * with template
- 21 May, 2019 1 commit
Minjie Wang authored
* WIP * header * WIP .cc * WIP * transpose * wip * immutable graph .h and .cc * WIP: nodeflow.cc * compile * remove all tmp dl managed ctx; they caused refcount issue * one simple test * WIP: testing * test_graph * fix graph index * fix bug in sampler; pass pytorch utest * WIP on mxnet * fix lint * fix mxnet unittest w/ unfortunate workaround * fix msvc * fix lint * SliceRows and test_nodeflow * resolve reviews * resolve reviews * try fix win ci * try fix win ci * poke win ci again * poke * lazy multigraph flag; stackoverflow error * revert node subgraph test * lazy object * try fix win build * try fix win build * poke ci * fix build script * fix compile * add a todo * fix reviews * fix compile
- 08 Apr, 2019 1 commit
Da Zheng authored
* accelerate gcn_ns. * add timing. * run infer with whole graph. * distributed gcn_ns. * reconstruct gcn_ns. * minor fix. * change graphsage_cv for numa. * fix #OMP threads. * accelerate graphsage_cv. * fix a weird bug. * add profiler in graphsage_cv. * accelerate graphsage_cv. manually aggregate neighbors' embeddings with pull. * load csr directly in gcn_ns_sc. * parallel sort for graph index. * Revert "parallel sort for graph index." This reverts commit 86fe2c7117fe5e56b0d481b39849c258b166945b. * run gcn_ns_sc on GPUs. * acc gcn_cv_sc. * change gcn_cv for numa. * fix gcn_cv to use numa and gpu. * improve graphsage_cv to use numa and gpu. * improve gcn_ns. * improve graphsage_cv. * init shared memory graph store. * fix. * enable init ndata. * improve tests. * add bidirectional communication. * link to rt. * fix compilation error. * fix shared memory init. * use MessageQueue for inter-process communication. * reconstruct immutable graph csr. * fix gcn. * load csr to shared memory. * fix minor bugs. * add comments. * refactor SharedMemory. * fix bugs in ImmutableGraph. * create CSR graph from shared memory. * add more test for loading a csr graph. * terminate graph store properly. * allow initializing ndata in the graph store server. * use RPC for inter-process communication. * a script for loading a graph. * allow customizing port. * list all ndata and edata. * support dtype. * reorganize SharedMemoryGraphStore. * fix ndata shape. * reconstruct gcn_ns. * print info. * set omp in gcn_ns. * reset sampling examples. * fix lint. * fix lint. * reset gcn. * disable shared memory in windows. * fix. * fix. * reset changes. * revert nodeflow changes. * fix cmake. * fix test. * fix test. * fix test. * fix test. * add comments. * fix test. * move vector out. * fix lint. * fix lint. * move SharedMemory. * update cmake. * update comment. * fix comments. * Revert "update cmake." This reverts commit 592445e37077f70a6e3f2e5245f9a3d086b04f3b. * update cmake. * add comments. 
* rename. * change the comment. * fix a bug. * rename. * add comments. * add comments. * add init_edata. * rewrite memory alloc. * move vector to CSR. * fix. * init data. * Revert "init data." This reverts commit 2b217b9553911b7dd84a9f1d9b68430b5aa18e23. * init data. * init new columns correctly.
- 05 Dec, 2018 1 commit
Lingfan Yu authored
* include/dgl/runtime * include * src/runtime * src/graph * src/scheduler * src * clean up CMakeLists * further clean up in cmake * install commands * python/dgl/_ffi/_cython * python/dgl/_ffi/_ctypes * python/dgl/_ffi * python/dgl * some fix * copy right
- 18 Oct, 2018 1 commit
Gan Quan authored
* multigraph support on graph index * more tests * multigraph flag, bugfix on clear & copy * networkx interfaces * including graph index tests in Jenkins * node subgraph test * edge subgraphs * removing duplicates in pred/succ * more explicit test and doc * query source and destination from edge id * subgraphindex * renaming has_edge to has_edge_between, apply_edges adding eid * send_on and send_and_recv_on * DGLGraph edge subgraph * merged send_on and send_and_recv_on * change request * removing hashmap * creating multigraph by flag; mingw support * changes per request * reverting networkx auto multigraph discovery * notes on send/send_and_recv on multigraphs * changing test reducer from sum to max * added a fixme note in spmv scheduler
- 09 Oct, 2018 1 commit
Minjie Wang authored
- 05 Sep, 2018 1 commit
Minjie Wang authored