- 06 Jan, 2023 2 commits
-
-
peizhou001 authored
-
peizhou001 authored
-
- 21 Dec, 2022 1 commit
-
-
Serge Panev authored
Signed-off-by: Serge Panev <spanev@nvidia.com>
-
- 01 Dec, 2022 1 commit
-
-
peizhou001 authored
-
- 21 Nov, 2022 1 commit
-
-
Rhett Ying authored
* [Dist] instantiate NodeDataView in lazy mode * fix test failure * init node/edge data store at the very beginning * fix test failures * refine comment * add more tests
-
- 17 Nov, 2022 1 commit
-
-
Rhett Ying authored
* [Dist] fix is_node check according to policy * add more tests
-
- 07 Nov, 2022 1 commit
-
-
Rhett Ying authored
* [Dist] enable access DistGraph.edges via canonical etype * refine code * refine test * refine code
-
- 04 Nov, 2022 1 commit
-
-
Rhett Ying authored
* [Dist] deprecate etype and always use canonical etype for partition and load * enable canonical etypes in dist part pipeline * resolve rebase conflicts * fix lint * fix test failure * throw exception if outdated part config is loaded * refine * refine * revert unnecessary change * fix typo
-
- 01 Nov, 2022 1 commit
-
-
peizhou001 authored
* add save/load for distributed optimizer Co-authored-by: Ubuntu <ubuntu@ip-172-31-16-19.ap-northeast-1.compute.internal>
-
- 29 Oct, 2022 1 commit
-
-
Quan (Andy) Gan authored
* sample neighbors with masks * oops * refactor again * remove * remove debug code * rename macro * address comments * more stuff * remove * fix * try fix unit test * oops * fix test * oops * change name * rename a lot of stuff * oops * ugh * misc fixes * lint * address a lot of comments * lint * lint * fix * that was silly * fix * fix * fix * oops
-
- 26 Oct, 2022 1 commit
-
-
Rhett Ying authored
* [Dist] reduce startup overhead: enable to save in specified formats * [Dist] reduce startup overhead: sort partitions when generating * sort csc/csr only when multiple etypes * refine
-
- 17 Oct, 2022 1 commit
-
-
Rhett Ying authored
* [Dist] Reduce peak memory in DistDGL: avoid validation, release memory once loaded * remove orig_id from ndata/edata for partition_graph() * delete orig_id from ndata/edata in dist part pipeline * reduce dtype size and format before saving graphs * fix lint * ETYPE requires to be int32/64 for CSRSortByTag * fix test failure * refine
-
- 12 Oct, 2022 1 commit
-
-
Rhett Ying authored
* [Dist] enable iterate multiple dist dataloaders simultaneously * format file * add support for any number of dataloaders * fix lint * refine code
-
- 10 Oct, 2022 1 commit
-
-
Hongzhi (Steve), Chen authored
Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>
-
- 30 Sep, 2022 1 commit
-
-
Quan (Andy) Gan authored
* first commit * add test * fixes * ah this is how you skip setup * fix * ugh * address comments * i like black
-
- 16 Aug, 2022 1 commit
-
-
Rhett Ying authored
* [Feature] enable graph partition book support canonical etypes * fix lint * fix lint * add todo * refine according to review comments * fix lint * refine naming * revert PartitionPolicy __init__ * refine docstring * fix doc string
-
- 03 Aug, 2022 1 commit
-
-
Rhett Ying authored
-
- 01 Aug, 2022 1 commit
-
-
Rhett Ying authored
-
- 28 Jul, 2022 1 commit
-
-
Rhett Ying authored
* [DistTest] fix incorrect shell if statement * fix incorrect use of dist.initialize()
-
- 11 Jul, 2022 2 commits
-
-
Rhett Ying authored
* [Dist] enable to specify sort_etype for sample_etype_neighbours * fix lint * pass argument instead of env * fix lint and doc string * refine args * remove unnecessary lines * debug only * debug add sort time log * change interface * fix typo Co-authored-by: Xin Yao <xiny@nvidia.com>
-
Rhett Ying authored
* [Dist] format dtypes when loading graph in server * add test * refine * add comments
-
- 20 Jun, 2022 1 commit
-
-
Rhett Ying authored
* [Dist] defer to load node/edge feats * fix lint * Update python/dgl/distributed/partition.py Co-authored-by: Minjie Wang <minjie.wang@nyu.edu> * Update python/dgl/distributed/partition.py Co-authored-by: Minjie Wang <minjie.wang@nyu.edu> * fix lint Co-authored-by: Minjie Wang <minjie.wang@nyu.edu>
-
- 16 Jun, 2022 1 commit
-
-
Rhett Ying authored
* [Dist] set socket as default backend for RPC * add tests both for socket and tensorpipe
-
- 09 Jun, 2022 1 commit
-
-
Rhett Ying authored
-
- 08 Jun, 2022 1 commit
-
-
Rhett Ying authored
* [Dist] enable time out when fetching msg * fix lint error * minor refinements * improve minor log * fix dist test * fix timeout issue in tensorpipe
-
- 18 May, 2022 1 commit
-
-
Rhett Ying authored
* [Dist][BugFix] enable sampling on bipartite * add comments for tests
-
- 11 May, 2022 1 commit
-
-
Rhett Ying authored
* [Dist] Enable maximum try times for socket backend via DGL_DIST_MAX_TRY_TIMES * reset env before/after test * print log for info when trying to connect * fix * print log in python instead of cpp
-
- 27 Apr, 2022 1 commit
-
-
Rhett Ying authored
* [Feature] enable socket net_type for rpc * fix lint * fix lint * fix build issue on windows * fix test failure on windows * fix test failure * fix cpp unit test failure * net_type blocking max_try_times * fix other comments * fix lint * fix comment * fix lint * fix cpp
-
- 24 Mar, 2022 1 commit
-
-
Rhett Ying authored
Co-authored-by: Jinjing Zhou <VoVAllen@users.noreply.github.com>
-
- 14 Mar, 2022 1 commit
-
-
Rhett Ying authored
* [BugFix] pass ntype/etype into partition book when node/edge_split * fix test failure * fix test failure on mxnet * fix test failure
-
- 02 Mar, 2022 1 commit
-
-
Rhett Ying authored
-
- 30 Jan, 2022 2 commits
-
-
Rhett Ying authored
* [Fix] sleep for a while when launching clients which will connect to multiple servers * pre-allocate more ports * no multiple partitions on single machine
-
Quan (Andy) Gan authored
* initial update * more * more * multi-gpu example * cluster gcn, finalize homogeneous * more explanation * fix * bunch of fixes * fix * RGAT example and more fixes * shadow-gnn sampler and some changes in unit test * fix * wth * more fixes * remove shadow+node/edge dataloader tests for possible ux changes * lints * add legacy dataloading import just in case * fix * update pylint for f-strings * fix * lint * lint * lint again * cherry-picking commit fa9f494 * oops * fix * add sample_neighbors in dist_graph * fix * lint * fix * fix * fix * fix tutorial * fix * fix * fix * fix warning * remove debug * add get_foo_storage apis * lint
-
- 28 Jan, 2022 2 commits
-
-
Quan (Andy) Gan authored
* migrate to pylint 2.6.0 * fix * fix? * ??? * oops
-
Rhett Ying authored
-
- 26 Jan, 2022 1 commit
-
-
Rhett Ying authored
* [Feature] long live server for multiple client groups * generate globally unique name for DistTensor within DGL automatically
-
- 19 Jan, 2022 2 commits
-
-
Jinjing Zhou authored
-
Rhett Ying authored
* [Fix] reduce error msg, refine fetch logic of available ports * un-initialize client before sending shutdown request * fix import error * print connect failure log only in debug mode * enable DMLC_LOG_DEBUG=1 in CI
-
- 11 Jan, 2022 1 commit
-
-
Rhett Ying authored
* [Feature] enable TP::Receiver wait for any numbers of senders * fix random unit test failure * avoid endless future wait * fix unit test failure * fix seg fault when finalize wait in receiver * [Feature] refactor sender connect logic and remove unnecessary sleeps in unit tests * fix lint * release RPCContext resources before process exits * [Debug] TPReceiver wait start log * [Debug] add log in get port * [Debug] add log * [ReDebug] revert time sleep in unit tests * [Debug] remove sleep for test_distri,test_mp * [debug] add more log * [debug] add listen_booted_ flag * [debug] restore commented code for queue * [debug] sleep more in rpc_client * restore change in tests * Revert "restore change in tests" This reverts commit 41a18926d181ec2517069389bfc41de2cc949280. * Revert "[debug] sleep more in rpc_client" This reverts commit a908e758eabca0a6ce62eb2e59baea02a840ac67. * Revert "[debug] restore commented code for queue" This reverts commit d3f993b3746e6bb6e2cc2f90204dd7e9461c6301. * Revert "[debug] add listen_booted_ flag" This reverts commit 244b2167d94942ff2a0acec8823b974975e52580. * Revert "[debug] add more log" This reverts commit 4b78447b0a575a824821dc7e25cca2246e6e30e2. * Revert "[Debug] remove sleep for test_distri,test_mp" This reverts commit e1df1aadcc8b1c2a0013ed77322ac391a8807612. * remove debug code * revert unnecessary change * revert unnecessary changes * always reset RPCContext when get started and reset all data * remove time.sleep in dist tests * fix lint * reset envs before each dist test * reset env properly * add time sleep when start each server * sleep for a while when boot server * replace wait_thread with callback * fix lint * add dglconnect handshake check Co-authored-by: Jinjing Zhou <VoVAllen@users.noreply.github.com>
-
- 06 Dec, 2021 1 commit
-
-
Jinjing Zhou authored
* doesn't know whether works * add change * fix * fix * fix * remove * revert * lint * lint * fix * revert * lint * fix * only build rpc on linux * lint * lint * fix build on windows * fix windows * remove old test * fix cmake * Revert "remove old test" This reverts commit f1ea75c777c34cdc1f08c0589676ba6aee1feb29. * fix windows * fix * fix * fix indent * fix indent * address comment * fix * fix * fix * fix * fix * lint * fix indent * fix lint * add introduction * fix * lint * lint * add more logs * fix * update xbyak for C++14 with gcc5 * Remove channels * fix * add test script * fix * remove unused file * fix lint * add timeout
-