- 10 Mar, 2023 1 commit
kylasa authored
* Added testcase for testing distributed lookup service.
* Applying lintrunner patch.
* Fixing CI Test environment failures.
* lintrunner patch.
* lintrunner patch
* Fix CI Failure.
* Fixing CI Test failure cases.
* lintrunner patch.
* lintrunner patch and CI test failure.
* Restore no. of test cases.
* Resolving pythonpath issues.
* lintrunner patch.
* updating PYTHONPATH to resolve lib path
* Resolve merge conflicts
* Resolving issues with PYTHONPATH env variable.
* fix module path
* rename utils script under test to avoid ambiguity
* remove unnecessary pythonpath
* fix lint error
* fix lint error

---------

Co-authored-by: RhettYing <rhett_ying@qq.com>
- 25 Feb, 2023 1 commit
kylasa authored
* Implemented the following changes.
* Remove NUM_NODES_PER_CHUNK
* Remove NUM_EDGES_PER_CHUNK
* Remove the dependency between no. of edge files per edge type and no. of partitions
* Remove the dependency between no. of edge feature files per edge type and no. of partitions
* Remove the dependency between no. of edge feature files and no. of edge files per edge type.
* Remove the dependency between no. of node feature files and no. of partitions
* Add “node_type_counts”. This is a list of integers, each giving the total count of one node type. A node type has the same index in this list as in the “node_type” list.
* Add “edge_type_counts”. This is a list of integers, each giving the total count of one edge type. An edge type has the same index in this list as in the “edge_type” list.
* Applying lintrunner patch.
* Adding missing keys to the metadata in the unit test framework.
* lintrunner patch.
* Resolving CI test failures due to merge conflicts.
* Applying lintrunner patch
* applying lintrunner patch
* Replacing tabs with spaces to satisfy lintrunner
* Fixing the CI Test Failure cases.
* Applying lintrunner patch
* lintrunner complaining about a blank line.
* Resolving issues with print statement for NoneType
* Removed the arbitrary chunks tests, since this functionality is not supported anymore.
* Addressing CI review comments.
* addressing CI review comments
* lintrunner patch
* lintrunner patch.
* Addressing CI review comments.
* lintrunner patch.
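
The index alignment between the type lists and the new count lists can be illustrated with a minimal sketch. The key names come from the commit message; the type names, the counts, and the helper `counts_by_type` are hypothetical and are not taken from the repository.

```python
# Hypothetical metadata fragment: only the four key names come from the
# commit message; every value below is invented for illustration.
metadata = {
    "node_type": ["paper", "author"],
    "node_type_counts": [5, 3],
    "edge_type": ["author:writes:paper", "paper:cites:paper"],
    "edge_type_counts": [7, 4],
}


def counts_by_type(meta, type_key, count_key):
    """Pair each type name with the count stored at the same index."""
    names = meta[type_key]
    counts = meta[count_key]
    if len(names) != len(counts):
        raise ValueError(
            f"{type_key} has {len(names)} entries but "
            f"{count_key} has {len(counts)}"
        )
    return dict(zip(names, counts))


node_counts = counts_by_type(metadata, "node_type", "node_type_counts")
edge_counts = counts_by_type(metadata, "edge_type", "edge_type_counts")
```

Checking the two list lengths up front catches a metadata file in which a type was added without its count.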
- 13 Feb, 2023 1 commit
kylasa authored
The following changes are made in this PR.

1. In dataset_utils.py, edges are read from disk in the order defined by the STR_EDGE_TYPE key in the metadata.json file. This order is implicitly used to assign edgeid to edge types, so the same order is now used when reading edges from disk.
2. The unit test framework now also randomizes the order of edges read from disk.

Co-authored-by: Quan (Andy) Gan <coin2028@hotmail.com>
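
The ordering rule in item 1 can be sketched as follows. This is an illustration under stated assumptions, not the code in dataset_utils.py: the helper names, the edge type names, and the file names are hypothetical, and the metadata is assumed to hold the edge types as a plain list.

```python
import random


def edge_type_ids(edge_type_names):
    """Assign each edge type the ID given by its position in the metadata list."""
    return {name: idx for idx, name in enumerate(edge_type_names)}


def files_in_metadata_order(edge_type_names, files_by_type):
    """Return (edge type, files) pairs in metadata order, ignoring dict order."""
    missing = [name for name in edge_type_names if name not in files_by_type]
    if missing:
        raise KeyError(f"no edge files listed for: {missing}")
    return [(name, files_by_type[name]) for name in edge_type_names]


# Hypothetical inputs: the order of `edge_types` stands in for the metadata order.
edge_types = ["author:writes:paper", "paper:cites:paper"]
discovered = {
    "paper:cites:paper": ["cites0.txt"],
    "author:writes:paper": ["writes0.txt", "writes1.txt"],
}

# Shuffle the discovery order, similar in spirit to the randomized unit tests.
items = list(discovered.items())
random.shuffle(items)
shuffled = dict(items)

type_ids = edge_type_ids(edge_types)
ordered = files_in_metadata_order(edge_types, shuffled)
```

Because both helpers iterate over the metadata list rather than the dictionary, the resulting IDs and read order do not depend on how the files were discovered.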
- 05 Jan, 2023 1 commit
Theodore Vasiloudis authored
* Allow reading and writing single-column vector Parquet files. These files are commonly produced by Spark ML's feature processing code.
* [Dist] Only write single-column vector files for Parquet in tests.
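
A single-column vector file stores each row's whole feature vector in one list-valued cell, rather than one scalar column per dimension. The sketch below illustrates only that layout difference with plain Python lists. It performs no Parquet I/O, and the column name `feature` and both helper names are hypothetical.

```python
def vector_column_to_matrix(rows, column="feature"):
    """Turn rows holding one list-valued column into a dense row-major matrix."""
    matrix = [list(row[column]) for row in rows]
    widths = {len(vec) for vec in matrix}
    if len(widths) > 1:
        raise ValueError(f"inconsistent vector lengths: {sorted(widths)}")
    return matrix


def matrix_to_vector_column(matrix, column="feature"):
    """Inverse step: wrap each matrix row as one list-valued cell."""
    return [{column: list(vec)} for vec in matrix]


# Two rows, each carrying a 3-dimensional feature vector in a single cell.
rows = [
    {"feature": [0.1, 0.2, 0.3]},
    {"feature": [0.4, 0.5, 0.6]},
]

matrix = vector_column_to_matrix(rows)
round_trip = matrix_to_vector_column(matrix)
```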
- 03 Jan, 2023 1 commit
Theodore Vasiloudis authored
[Dist] Add support for Parquet-formatted edges files, remove some assumptions on edge file number. (#5051)

* [Dist] Add support for Parquet-formatted edges files, remove some assumptions on edge file number.
* [Dist] Add parquet edges option to unit tests.

Co-authored-by: xiang song(charlie.song) <classicxsong@gmail.com>
- 15 Dec, 2022 1 commit
Rhett Ying authored
* [Dist] enable chunking node/edge data into an arbitrary number of chunks
* [Dist] enable splitting node/edge data into arbitrary parts
* refine code
* Force-convert boolean to uint8 to avoid dist.scatter failure
* convert boolean to int8 before scatter and revert it after scatter
* refine code
* fix test
* refine code
* move test utilities into utils.py
* update comment
* fix empty data
* update
* update
* fix empty data issue
* release unnecessary mem
* release unnecessary mem
* release unnecessary mem
* release unnecessary mem
* release unnecessary mem
* remove unnecessary shuffle data
* separate array_split into standalone utility
* add example

Co-authored-by: xiang song(charlie.song) <classicxsong@gmail.com>
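
Splitting data into an arbitrary number of parts, including the empty-part case that several of the fixes above mention, can be sketched in plain Python. The function below follows the same size rule as `numpy.array_split` (the first `len(items) % num_parts` parts get one extra element). It is an assumption-laden illustration and not the standalone utility added in this commit.

```python
def array_split(items, num_parts):
    """Split a list into num_parts contiguous parts of near-equal size."""
    if num_parts <= 0:
        raise ValueError("num_parts must be positive")
    base, extra = divmod(len(items), num_parts)
    parts = []
    start = 0
    for index in range(num_parts):
        # The first `extra` parts absorb the remainder, one element each.
        size = base + (1 if index < extra else 0)
        parts.append(items[start:start + size])
        start += size
    return parts


uneven = array_split(list(range(10)), 3)
with_empty = array_split([1, 2], 4)
```

When there are fewer items than parts, the trailing parts are empty lists, so downstream code has to tolerate empty chunks.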