- 06 Aug, 2022 1 commit
-
-
kylasa authored
* Alltoall fix to bypass the gloo alltoallv bug, which is preventing further testing
  1. Replaced the alltoallv gloo wrapper call with an alltoall message.
  2. All messages are padded to the same length.
  3. The receiving side unpads the messages and continues processing.
* Code changes to address CI comments
  1. Removed unused functions from gloo_wrapper.py.
  2. Changed the function signature of alltoallv_cpu_data as suggested.
  3. Added a docstring that describes the functionality inside alltoallv_cpu_data in more detail. Included more asserts to validate the assumptions.
* Changed the function name appropriately
  Changed the function name from "alltoallv_cpu_data" to "alltoallv_cpu", which I believe is appropriate because the underlying functionality provides alltoallv, which is basically alltoall_cpu plus padding.
* Added code and text to address the review comments
  1. Changed the function name to indicate the local use of this function.
  2. Changed the docstring to indicate the assumptions made by the alltoallv_cpu function.
* Removed unused function from import statement
  Removed the unused/removed function from the import statement.
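The padding approach described in this commit can be sketched without any distributed backend. The following is a minimal illustration, not the code from the commit: `simulated_alltoall`, `alltoallv_via_padding`, and the `PAD` value are hypothetical names, the fixed-size exchange is simulated in-process as a transpose, and the preliminary exchange of true message lengths is an assumption about how the receiver knows where to unpad.

```python
from typing import List

PAD = 0  # hypothetical filler value; it is never read back after unpadding


def simulated_alltoall(send: List[list]) -> List[list]:
    """Stand-in for a fixed-size alltoall: rank dst receives send[src][dst] from every src."""
    n = len(send)
    return [[send[src][dst] for src in range(n)] for dst in range(n)]


def alltoallv_via_padding(send: List[List[List[int]]]) -> List[List[List[int]]]:
    """Emulate alltoallv (variable-length messages) with equal-length alltoall calls.

    send[src][dst] is the message that rank src sends to rank dst.
    """
    n = len(send)
    # Assumption: true lengths are exchanged first, as fixed-size integers.
    send_lens = [[len(msg) for msg in row] for row in send]
    recv_lens = simulated_alltoall(send_lens)
    # Pad every message to the longest message in the whole exchange.
    max_len = max(max(row) for row in send_lens)
    padded = [[list(msg) + [PAD] * (max_len - len(msg)) for msg in row] for row in send]
    assert all(len(msg) == max_len for row in padded for msg in row)
    # All messages now have the same length, so a plain alltoall suffices.
    recv_padded = simulated_alltoall(padded)
    # The receiving side strips the padding using the true lengths.
    return [
        [recv_padded[dst][src][: recv_lens[dst][src]] for src in range(n)]
        for dst in range(n)
    ]


# Two ranks: rank 0 sends [1] to itself and [2, 3] to rank 1;
# rank 1 sends [4, 5, 6] to rank 0 and an empty message to itself.
send = [[[1], [2, 3]], [[4, 5, 6], []]]
recv = alltoallv_via_padding(send)
```

Padding to the global maximum trades extra bandwidth for the ability to use the simpler equal-size collective; the cost grows when message sizes are highly skewed.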
-
- 23 Jul, 2022 1 commit
-
-
kylasa authored
* Code changes to address the updated file format support for massively large graphs
  1. Updated the docstring of the starting function "gen_dist_partitions" to describe the newly proposed file format for the input dataset.
  2. Code that depended on the structure of the old metadata JSON object now reads from the newly proposed metadata file.
  3. Fixed some errors where functions were invoked and the calling function expects return values from the invoked function.
  4. The modified code has been tested on the "mag" dataset using 4-way partitions, and the results were verified.
* Code changes to address the CI review comments
  1. Improved docstrings for some functions.
  2. Added a new function in utils.py to compute the id ranges; it is used in multiple places.
* Added TODO to indicate the redundant data structure
  Because of the new file format changes, one of the dictionaries (node_feature_tids, node_tids) will be redundant. Added TODO text so that it is removed in the next iteration of code changes.
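The commit does not show the id-range helper itself. As a rough sketch of what such a helper might look like, assuming the ranges are contiguous half-open intervals derived from per-type (or per-partition) counts, one could write the following; `id_ranges` is a hypothetical name and not the function added to utils.py.

```python
from itertools import accumulate
from typing import List, Sequence, Tuple


def id_ranges(counts: Sequence[int]) -> List[Tuple[int, int]]:
    """Turn per-group counts into contiguous half-open [start, end) id ranges."""
    ends = list(accumulate(counts))   # running totals give the end of each range
    starts = [0] + ends[:-1]          # each range starts where the previous one ended
    return list(zip(starts, ends))


# Three groups holding 3, 0 and 2 ids respectively.
ranges = id_ranges([3, 0, 2])
```

Keeping this in a single helper means every caller agrees on the same boundaries, which is presumably why the commit notes that it is used in multiple places.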
-
- 13 Jul, 2022 1 commit
-
-
kylasa authored
* Code changes for the following
  1. Generating node data at each process
  2. Reading csv files using pyarrow
  3. Feature-complete code
* Removed some typos that were causing unit tests to fail
  1. Changed the file name to the correct one when loading edges from file.
  2. When storing node features after shuffling, use the correct key to store the global-nids of node features that are received after transmission.
* Code changes to address CI comments by reviewers
  1. Removed some redundant code and added text to the docstrings describing the functionality of some functions.
  2. Function signatures and invocations now match w.r.t. the argument list.
  3. Added a detailed description of the metadata JSON structure so that users understand the type of information present in this file and how it is used throughout the code.
* Addressing code review comments
  1. Addressed all the CI comments; the changes include simplifying the code that concatenates lists and enhancing the docstrings of the functions changed in the process.
* Updated the docstrings of two functions in response to code review comments
  Removed "todo" from the docstring of the gen_nodedata function. Added "todo" to the gen_dist_partitions function where the node-id to partition-id mappings are read for the first time. Removed 'num-node-weights' from the docstring of the get_dataset function and added a schema_map docstring to the argument list.
-
- 05 Jul, 2022 3 commits
-
-
kylasa authored
* Added code to support the multiple-file-support feature and removed single-file-support code
  1. Added code to read the dataset in multiple-file format.
  2. Removed code for single-file format.
* Added files missing in the previous commit
  This commit includes dataset_utils.py, which reads the dataset in multiple-file format, gloo_wrapper function calls to support exchanging dictionaries as objects, and helper functions in utils.py.
* Update convert_partition.py
  Updated the call to the "create_metadata_json" function to include partition_id, so that each rank creates only its own metadata object; these are later accumulated on rank-0 to create the graph-level metadata JSON file.
* Addressing code review comments during the CI process
  Code changes resulting from the code review comments received during the CI process.
* Code reorganization
  Addressing CI comments and code reorganization for easier understanding.
* Removed commented out line
  Removed commented out line.
-
Da Zheng authored
-
- 02 Jul, 2022 1 commit
-
-
Chang Liu authored
-
- 01 Jul, 2022 3 commits
-
-
Rhett Ying authored
-
Chang Liu authored
* minor update on golden example
* update
* update
* Update README

Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>
-
Rhett Ying authored
* [Feature] extend sort_csr/csc_by_tag to edge
* fix test failure in tensorflow
* refine sorting by edges
* fix docstring
* remove unnecessary mem

Co-authored-by: Xin Yao <xiny@nvidia.com>
-
- 30 Jun, 2022 5 commits
-
-
Chang Liu authored
* Regolden graphsage example to guide others
* update golden
* update
* Update example and propagate to original folder
* Update to remove ^M (windows DOS) character
* update
* Merge file changes and update README
* Minor comment update

Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>
Co-authored-by: Mufei Li <mufeili1996@gmail.com>
-
Chang Liu authored
Co-authored-by: Xin Yao <xiny@nvidia.com>
-
Minjie Wang authored
* try optimize CI
* fix go test; adjust timing report
* disable certain tests for mx/tf backends
* fix ut
* add pydantic
-
Quan (Andy) Gan authored
Co-authored-by: Xin Yao <xiny@nvidia.com>
-
nv-dlasalle authored
* Workaround for graph data saving/loading compatibility problem in Column class. There may be more places in DGL with the same issue, due to using Python serialization, instead of a more cohesive, comprehensive strategy. This is just a local fix.
* Add checking for non-empty states
* Add unit test
* Handle the case of columns without storage

Co-authored-by: ndickson <ndickson@nvidia.com>
Co-authored-by: Xin Yao <xiny@nvidia.com>
-
- 29 Jun, 2022 6 commits
-
-
kylasa authored
* Code changes for bug fixes identified while processing the mag_lsc dataset
  1. Changed the call from torch.Tensor() to torch.from_numpy() to address memory corruption issues when creating large tensors. The tricky part is that the old call works correctly for small tensors.
  2. Changed the dgl.graph() function call to include the "num_nodes" argument, to explicitly specify all the nodes in a graph partition.
* Update convert_partition.py
  Moved the changes to the "create_metadata_json" function into the "multiple-file-format" support, where they are more appropriate, since multiple-machine testing was done with these code changes.
* Addressing review comments
  Removed the space at the end of the line, as suggested.
-
Mufei Li authored
-
Xin Yao authored
-
Xin Yao authored
* fix using alternative streams
* use an alternative stream for subgraph transferring
* fix StreamContext when stream is None
-
Rhett Ying authored
Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>
-
nv-dlasalle authored
* Update nccl communicator for when NCCL is missing
* Use static_cast
* Add doc string
* Fix whitespace
* Restrict unit test to GPU runs

Co-authored-by: Xin Yao <xiny@nvidia.com>
-
- 28 Jun, 2022 4 commits
-
-
Mufei Li authored
Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>
Co-authored-by: Xin Yao <xiny@nvidia.com>
-
Rhett Ying authored
* [BugFix] fix build issue on mac OS
* refine
-
Mufei Li authored
-
Mufei Li authored
* Update * Update * Update * Update * Update * Update * Update * Update * Update * Update
-
- 27 Jun, 2022 4 commits
-
-
ndickson-nvidia authored
* Added missing specializations for `__half` of `DLDataTypeTraits`, `IndexSelect`, `Full`, `Scatter_`, `CSRGetData`, `CSRMM`, `CSRSum`, `IndexSelectCPUFromGPU`
* Fixed casting issue in `_LinearSearchKernel` that was preventing it from supporting `__half`
* Added `#if`'d out specializations of `CSRGEMM`, `CSRGEAM`, and `Xgeam`, which would require functions that aren't currently provided by cublas
* Added more specific error messages for unimplemented FP16 specializations of Xgeam, CSRGEMM, and CSRGEAM
* Added missing instantiation of DLDataTypeTraits<__half>::dtype
* Fixed linter error
* Added clearer comment explaining why the cast to long long is necessary
* Worked around a compile error in some particular setup, where __half can't be constructed on the host side
* Fixed linter formatting errors
* Changes to comments as recommended
* Made recommended changes to logging errors in FP16 specializations
* Also changed the existing Xgeam function for unsupported data types from LOG(INFO) to LOG(FATAL)
-
Xin Yao authored
-
Rhett Ying authored
* [BugFix] fix rpc-related build issue on mac OS
* add warning message
* add warning message
-
Rhett Ying authored
* [Dist] enable USE_EPOLL by default
* fix build issue on windows
* fix build issue on windows
* fix build issue on windows
* fix build issue on windows
* fix build issue on windows
* fix build issue
-
- 24 Jun, 2022 2 commits
-
-
PotatoChipsNinja authored
Co-authored-by: Xin Yao <xiny@nvidia.com>
-
nv-dlasalle authored
* Add uva by default to embedding
* More updates
* Update optimizer
* Add new uva functions
* Expose new pinned memory function
* Add unit tests
* Update formatting
* Fix unit test
* Handle auto UVA case when training is on CPU
* Allow per-embedding decisions for whether to use UVA
* Address sparse_optim.py comments
* Remove unused templates
* Update unit test
* Use dgl allocate memory for pinning
* allow automatically unpin
* workaround for d2h copy with a different dtype
* fix linting
* update error message
* update copyright

Co-authored-by: Xin Yao <xiny@nvidia.com>
Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>
-
- 23 Jun, 2022 5 commits
-
-
Lucas Prieto authored
Co-authored-by: Mufei Li <mufeili1996@gmail.com>
Co-authored-by: Xin Yao <xiny@nvidia.com>
-
Chang Liu authored
* Remove all torchtext legacy-related APIs
* Remove unused BagOfWordsPretrained class, and fix some typos

Co-authored-by: Mufei Li <mufeili1996@gmail.com>
-
Xin Yao authored
* Explicitly unpin tensoradapter allocated arrays
* Undo unrelated change
* Add unit test
* update unit test
* add pinned_by_dgl flag to NDArray::Container
* use dgl.ndarray for holding the pinning status
* update multi-gpu uva inference
* reinterpret cast NDArray::Container* to DLTensor* in MoveAsDLTensor
* update unpin column and examples
* add unit test for unpin column

Co-authored-by: Dominique LaSalle <dlasalle@nvidia.com>
Co-authored-by: nv-dlasalle <63612878+nv-dlasalle@users.noreply.github.com>
-
Triston authored
* Fix a cub compile error for CUDA 11.5
* Fix comparison of integer expressions of different signedness in coo_sort.cu file
* Fix comparison of integer expressions of different signedness in cuda_compact_graph.cu file
* Remove never referenced variable in spmm.cu
* Fix comparison of integer expressions of different signedness in rowwise_pick.h file
* Fix comparison of integer expressions of different signedness in choice.cc file
* Remove never referenced variable col_data in spat_op_impl_coo.cc
* Remove never referenced variable allowed in global_uniform.cc
* Fix comparison of integer expressions of different signedness in graph.cc
* Fix comparison of integer expressions of different signedness in graph_apis.cc
* Fix the un-used ctx variable in ndarray_partition.cc file for cpu only build
* Fix comparison of integer expressions of different signedness in libra_partition.cc
* Fix comparison of integer expressions of different signedness in graph_op.cc

Co-authored-by: Triston Cao <tristonc@nvidia.com>
Co-authored-by: Quan (Andy) Gan <coin2028@hotmail.com>
-
Rhett Ying authored
-
- 22 Jun, 2022 3 commits
-
-
Mufei Li authored
* Update citation_graph.py
* Update
* Update
* Update

Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>
-
Quan (Andy) Gan authored
* fix
* fix
* Update utils.py
-
maqy authored
* fix unstable sort
* add torch version check
* reformat
* split too long comments
* Update dataloader.py

Co-authored-by: Quan (Andy) Gan <coin2028@hotmail.com>
-
- 21 Jun, 2022 1 commit
-
-
Mufei Li authored
* Update * Update * Update * Update * Update * Update * Update * Update * Update * Update * Update * Update * Update * Update * Update * Update * Update * Update * Update * Update * Update * Update * Update * Update * Update * Update * Update * Update * Update * Update * Update * Update * Update * Update * Update * Update * Update * Update * Update * Update
-