- 07 Nov, 2022 1 commit
-
-
Hongzhi (Steve), Chen authored
* replace * blabla * balbla * blabla Co-authored-by:Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>
-
- 06 Nov, 2022 1 commit
-
-
Hongzhi (Steve), Chen authored
* param * brief * note * return * tparam * brief2 * file * return2 * return * blabla * all Co-authored-by:Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>
-
- 04 Nov, 2022 3 commits
-
-
Hongzhi (Steve), Chen authored
* [Misc] clang-format auto fix. * manual * manual Co-authored-by:Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>
-
Hongzhi (Steve), Chen authored
* [Misc] clang-format auto fix. * fix * manual Co-authored-by:Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>
-
Hongzhi (Steve), Chen authored
Co-authored-by:Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>
-
- 21 Sep, 2022 1 commit
-
-
Xin Yao authored
* disable warning for tensorpipe * fix warning * enable lint check for cuh files * resolve comments
-
- 19 Sep, 2022 1 commit
-
-
Xin Yao authored
* rename `DLContext` to `DGLContext` * rename `kDLGPU` to `kDLCUDA` * replace DLTensor with DGLArray * fix linting * Unify DGLType and DLDataType to DGLDataType * Fix FFI * rename DLDeviceType to DGLDeviceType * decouple dlpack from the core library * fix bug * fix lint * fix merge * fix build * address comments * rename dl_converter to dlpack_convert * remove redundant comments
-
- 27 Jul, 2022 1 commit
-
-
Rhett Ying authored
* [Log] fix confusing error log in TCPSocket::Bind() * fix lint
-
- 20 Jun, 2022 1 commit
-
-
Rhett Ying authored
-
- 08 Jun, 2022 1 commit
-
-
Rhett Ying authored
* [Dist] enable time out when fetching msg * fix lint error * minor refinements * improve minor log * fix dist test * fix timeout issue in tensorpipe
-
- 11 May, 2022 1 commit
-
-
Rhett Ying authored
* [Dist] Enable maximum try times for socket backend via DGL_DIST_MAX_TRY_TIMES * reset env before/after test * print log for info when trying to connect * fix * print log in python instead of cpp
-
- 27 Apr, 2022 1 commit
-
-
Rhett Ying authored
* [Feature] enable socket net_type for rpc * fix lint * fix lint * fix build issue on windows * fix test failure on windows * fix test failure * fix cpp unit test failure * net_type blocking max_try_times * fix other comments * fix lint * fix comment * fix lint * fix cpp
-
- 24 Mar, 2022 1 commit
-
-
Rhett Ying authored
Co-authored-by:Jinjing Zhou <VoVAllen@users.noreply.github.com>
-
- 26 Jan, 2022 1 commit
-
-
Rhett Ying authored
* [Feature] long live server for multiple client groups * generate globally unique name for DistTensor within DGL automatically
-
- 19 Jan, 2022 1 commit
-
-
Rhett Ying authored
* [Fix] reduce error msg, refine fetch logic of available ports * un-initialize client before sending shutdown request * fix import error * print connect failure log only in debug mode * enable DMLC_LOG_DEBUG=1 in CI
-
- 11 Jan, 2022 1 commit
-
-
Rhett Ying authored
* [Feature] enable TP::Receiver wait for any numbers of senders * fix random unit test failure * avoid endless future wait * fix unit test failure * fix seg fault when finalize wait in receiver * [Feature] refactor sender connect logic and remove unnecessary sleeps in unit tests * fix lint * release RPCContext resources before process exits * [Debug] TPReceiver wait start log * [Debug] add log in get port * [Debug] add log * [ReDebug] revert time sleep in unit tests * [Debug] remove sleep for test_distri,test_mp * [debug] add more log * [debug] add listen_booted_ flag * [debug] restore commented code for queue * [debug] sleep more in rpc_client * restore change in tests * Revert "restore change in tests" This reverts commit 41a18926d181ec2517069389bfc41de2cc949280. * Revert "[debug] sleep more in rpc_client" This reverts commit a908e758eabca0a6ce62eb2e59baea02a840ac67. * Revert "[debug] restore commented code for queue" This reverts commit d3f993b3746e6bb6e2cc2f90204dd7e9461c6301. * Revert "[debug] add listen_booted_ flag" This reverts commit 244b2167d94942ff2a0acec8823b974975e52580. * Revert "[debug] add more log" This reverts commit 4b78447b0a575a824821dc7e25cca2246e6e30e2. * Revert "[Debug] remove sleep for test_distri,test_mp" This reverts commit e1df1aadcc8b1c2a0013ed77322ac391a8807612. * remove debug code * revert unnecessary change * revert unnecessary changes * always reset RPCContext when get started and reset all data * remove time.sleep in dist tests * fix lint * reset envs before each dist test * reset env properly * add time sleep when start each server * sleep for a while when boot server * replace wait_thread with callback * fix lint * add dglconnect handshake check Co-authored-by:Jinjing Zhou <VoVAllen@users.noreply.github.com>
-
- 06 Dec, 2021 1 commit
-
-
Jinjing Zhou authored
* doesn't know whether works * add change * fix * fix * fix * remove * revert * lint * lint * fix * revert * lint * fix * only build rpc on linux * lint * lint * fix build on windows * fix windows * remove old test * fix cmake * Revert "remove old test" This reverts commit f1ea75c777c34cdc1f08c0589676ba6aee1feb29. * fix windows * fix * fix * fix indent * fix indent * address comment * fix * fix * fix * fix * fix * lint * fix indent * fix lint * add introduction * fix * lint * lint * add more logs * fix * update xbyak for C++14 with gcc5 * Remove channels * fix * add test script * fix * remove unused file * fix lint * add timeout
-
- 28 Sep, 2021 1 commit
-
-
Jingcheng Yu authored
Co-authored-by:JingchengYu94 <jingchengyu94@gmail.com>
-
- 02 Sep, 2021 1 commit
-
-
Tomasz Patejko authored
* [CPU, Parallel] Rewriting omp pragmas with parallel_for * [CPU, Parallel] Decrease number of calls to task function * [CPU, Parallel] Modify calls to new interface of parallel_for
-
- 25 Jul, 2021 1 commit
-
-
Jingcheng Yu authored
Co-authored-by: JingchengYu94 <jingchengyu94@gmail.com> Co-authored-by: xiang song(charlie.song) <classicxsong@gmail.com>
-
- 22 Mar, 2021 1 commit
-
-
Da Zheng authored
* print error messages. * fix.
-
- 26 Dec, 2020 1 commit
-
-
Da Zheng authored
* delete shared memory when receive signal. * rename. * fix lint. * fix lint. * fix compile. * Fix. * we need to report error if the shared memory exist. * disable tensorflow test for shared memory. * revert. Co-authored-by: Ubuntu <ubuntu@ip-172-31-2-202.us-west-1.compute.internal> Co-authored-by: Jinjing Zhou <VoVAllen@users.noreply.github.com>
-
- 15 Dec, 2020 1 commit
-
-
Da Zheng authored
* reuse. * fix compile. Co-authored-by:Ubuntu <ubuntu@ip-172-31-2-202.us-west-1.compute.internal>
-
- 09 Dec, 2020 1 commit
-
-
Chao Ma authored
-
- 25 Sep, 2020 1 commit
-
-
Chao Ma authored
-
- 19 Aug, 2020 1 commit
-
-
Chao Ma authored
* small fix * update
-
- 14 Aug, 2020 1 commit
-
-
Jinjing Zhou authored
* aaa * revert * 111 * fix * fff * 111 * try * debug * fff * 111 * 1111 * fix * try * 111 * 111 * print * f * 111 * revert * fix * revert * add timeout * Revert "add timeout" This reverts commit fb34f67f11ad6556824daa6efa29a6ec2c7c1d3e.
-
- 13 Aug, 2020 1 commit
-
-
Chao Ma authored
* Fix memory leak * update * update * update * update * update * update * update
-
- 05 Aug, 2020 1 commit
-
-
Jinjing Zhou authored
* 111 * 111 * fix * 111 * fix * 11 * fix * lint * Update __init__.py * lint * fix * lint * fix * fix * fix * fix * fix * try fix * try fix * fix * Revert "fix" This reverts commit a0b954fd4e99b7df92b53db8334dcb583d6e1551. * fixes. * fix. * fix test. * fix exit. * fix. * fix * fix * lint * lint * lint * fix * Update .gitignore * 111 * fix * 111 * 111 * fff * 1111 * 111 * 1325315 * ffff * f??? * fff * 1111 * 111 * fix * 111 * asda * 1111 * 11 * 123 * 啊啊啊啊啊啊啊啊啊啊啊啊啊啊啊啊 * spawn * 1231231 * up * 111 * fix * fix * Revert "fix" This reverts commit 7373f95312fdcaa36d2fc330bf242339e89c045d. * fix * fix * 1111 * fix * fix tests * start kvclient as early as possible. * lint * fix test * lint * 1111 * fix * fix * 111 * fix * fix * 1 * fix * fix * lint * fix * lint * lint * remove quit * fix * lint * fix * fix several * lint * fix minor * fix * lint Co-authored-by:
Da Zheng <zhengda1936@gmail.com> Co-authored-by: xiang song(charlie.song) <classicxsong@gmail.com>
-
- 30 Jul, 2020 1 commit
-
-
Chao Ma authored
* udpate * update * update * update * update * update * update * update * fix lint * update * update * update * update * udpate * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update Co-authored-by:Da Zheng <zhengda1936@gmail.com>
-
- 23 Jul, 2020 1 commit
-
-
Chao Ma authored
* update * update * update * update * update
-
- 22 Jul, 2020 1 commit
-
-
Da Zheng authored
* add eval. * extend DistTensor. * fix. * add barrier. * add more print. * add more checks in kvstore. * fix lint. * get all neighbors for eval. * reorganize. * fix. * fix. * fix. * fix test. * add reuse_if_exist. * add test for reuse_if_exist. * fix lint. * fix bugs. * fix. * print errors of tcp socket. * support delete tensors. * fix lint. * fix * fix example Co-authored-by:Ubuntu <ubuntu@ip-172-31-19-1.us-west-2.compute.internal>
-
- 21 Jul, 2020 1 commit
-
-
Chao Ma authored
-
- 09 Jul, 2020 1 commit
-
-
mszarma authored
Added handling of the EINTR signal by retrying when the returned value and errno indicate an EINTR interrupt error, which may appear during TCP function execution. Added information about the EAGAIN timeout failure in the accept stage. Added printing of errno in logs. Co-authored-by: Chao Ma <mctt90@gmail.com>
-
- 07 Jul, 2020 1 commit
-
-
Chao Ma authored
* update * update * update
-
- 03 Jul, 2020 1 commit
-
-
mszarma authored
Refactored the code to use a timeval struct to set the SO_RCVTIMEO timeout value for sockets on Linux. Changed the SetTimeout API's timeout input to seconds. Co-authored-by: Chao Ma <mctt90@gmail.com>
-
- 29 Jun, 2020 1 commit
-
-
Chao Ma authored
* add num_clients to kvstore * update * update * update * update * update * update * update * fix lint * update * update * update * add test * update * update * update * update * fix test * update * update * update * update * update * update * update Co-authored-by:Da Zheng <zhengda1936@gmail.com>
-
- 28 Jun, 2020 1 commit
-
-
Minjie Wang authored
* add cub; array cumsum * CSRSliceRows * fix warning * operator << for ndarray; CSRSliceRows * add CSRIsSorted * add csr_sort * inplace coosort and outplace csrsort * WIP: coo is sorted * mv cuda_utils * add AllTrue utility * csr sort * coo sort * coo2csr for sorted coo arrays * CSRToCOO from sorted * pass tests for the new kernel changes * cannot use inplace sort * lint * try fix msvc error * Fix g.copy_to and g.asnumbits; ToBlock no longer uses CSC * stash * revert some hack * revert some changes * address comments * fix * fix to_block unittest * add todo note
-
- 17 Jun, 2020 2 commits