  1. 07 Mar, 2023 3 commits
  2. 06 Mar, 2023 4 commits
    • [DistDGL][UserEx]Sync parmetis_wrapper with changes in metadata.json (#5385) · 7b766393
      kylasa authored
      * Sync parmetis_wrapper with changes in metadata.json
      
      1. In preprocess.py, make sure that num_partitions is defined as an input argument. Also, align 'input_dir' with the input dataset: both schema_file and graph_stats.txt are assumed to be located inside input_dir.
      
      2. Use the DGL_HOME environment variable so that the parmetis_wrapper command can be run from anywhere.
      
      * Fix CI test failure cases.
      
      * Addressing CI review comments.
      
      * Addressing CI test failures.
      
      * Applying lintrunner patch
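The two conventions described in #5385 can be sketched as follows. This is a minimal illustration, not the actual preprocess.py or parmetis_wrapper code. The names `input_dir`, `schema_file`, `num_partitions`, `graph_stats.txt` and `DGL_HOME` come from the commit message; the exact flag spelling, the default schema file name `metadata.json`, and the helper functions are assumptions made for the example.

```python
import argparse
import os


def build_parser():
    # Flag names mirror the commit message; the real CLI may differ.
    parser = argparse.ArgumentParser(description="hypothetical preprocess CLI")
    parser.add_argument("--input_dir", required=True)
    # Assumed default; the commit only says schema_file lives in input_dir.
    parser.add_argument("--schema_file", default="metadata.json")
    parser.add_argument("--num_partitions", type=int, required=True)
    return parser


def resolve_inputs(args):
    # Both files are looked up inside input_dir, as the commit describes.
    schema_path = os.path.join(args.input_dir, args.schema_file)
    stats_path = os.path.join(args.input_dir, "graph_stats.txt")
    return schema_path, stats_path


def dgl_home():
    # Fail early with a clear message if DGL_HOME is not set.
    home = os.environ.get("DGL_HOME")
    if not home:
        raise RuntimeError("DGL_HOME is not set")
    return home
```

Resolving tool paths relative to `dgl_home()` instead of the current working directory is what lets a wrapper command run from any location.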
    • Support for no. of chunks smaller than no. of partitions. (#5390) · 894ad1e3
      kylasa authored
      * Support for no. of chunks smaller than no. of partitions and Adding appropriate test cases.
      
      The following changes are made in this PR:
      1. Code changes for handling no. of chunks smaller than no. of partitions.
      2. Restoring the previously deleted test cases for no. of chunks smaller than no. of partitions.
      3. Adding test cases where multiple partitions are handled by a single process.
      
      * Committing the missing files in this commit.
      
      * lintrunner patch.
      
      * lintrunner check
      
      * lintrunner patch here.
      
      * CI review comments.
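The two situations covered by #5390 (fewer chunks than partitions, and several partitions per process) can be pictured with a simple round-robin assignment. This is only an illustrative sketch: the function name is hypothetical and the actual assignment scheme used by the partitioning pipeline is not stated in the commit message.

```python
def assign_round_robin(num_items, num_workers):
    """Return, for each worker, the list of item indices it owns."""
    owned = [[] for _ in range(num_workers)]
    for item in range(num_items):
        owned[item % num_workers].append(item)
    return owned


# Fewer chunks than partitions: some partitions start with no chunk at all,
# so downstream code has to cope with empty inputs.
chunks_per_partition = assign_round_robin(num_items=2, num_workers=4)

# More partitions than processes: one process handles several partitions.
partitions_per_process = assign_round_robin(num_items=4, num_workers=2)
```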
    • [Bugfix] Fix duplicate worker_init_fn argument when provided in DataLoader (#5420) · 851d66fa
      Quan (Andy) Gan authored
      * fix duplicate worker_init_fn
      
      * lint
      
      * lint again
      
      * uugh
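The class of bug named in #5420 can be reproduced in plain Python. The sketch below is not DGL's code: `make_loader`, `default_init`, `buggy_wrapper` and `fixed_wrapper` are hypothetical stand-ins, and the actual fix may combine the user's function with an internal one rather than choosing between them.

```python
def make_loader(dataset, worker_init_fn=None, **kwargs):
    # Stand-in for the underlying loader constructor.
    return {"dataset": dataset, "worker_init_fn": worker_init_fn, **kwargs}


def default_init(worker_id):
    # Stand-in for an initializer the wrapper wants to install.
    return None


def buggy_wrapper(dataset, **kwargs):
    # If the caller also passes worker_init_fn, the keyword is supplied
    # twice and Python raises TypeError.
    return make_loader(dataset, worker_init_fn=default_init, **kwargs)


def fixed_wrapper(dataset, **kwargs):
    # Pop the user's value first so the keyword is passed exactly once.
    user_fn = kwargs.pop("worker_init_fn", None)
    chosen = user_fn if user_fn is not None else default_init
    return make_loader(dataset, worker_init_fn=chosen, **kwargs)
```

Calling `buggy_wrapper(data, worker_init_fn=my_fn)` fails with a "got multiple values for keyword argument" error, while `fixed_wrapper` forwards the user's function.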
    • [BugFix] fix torch cuda version (#5426) · 26b245a0
      Rhett Ying authored
  3. 04 Mar, 2023 1 commit
  4. 03 Mar, 2023 2 commits
  5. 02 Mar, 2023 1 commit
  6. 01 Mar, 2023 3 commits
  7. 28 Feb, 2023 2 commits
  8. 27 Feb, 2023 3 commits
  9. 25 Feb, 2023 1 commit
    • [DistDGL][Feature_Request]Changes in the metadata.json file for input graph dataset. (#5310) · a14f69c9
      kylasa authored
      * Implemented the following changes.
      
      * Remove NUM_NODES_PER_CHUNK
      * Remove NUM_EDGES_PER_CHUNK
      * Remove the dependency between no. of edge files per edge type and no. of partitions
      * Remove the dependency between no. of edge feature files per edge type and no. of partitions
      * Remove the dependency between no. of edge feature files and no. of edge files per edge type.
      * Remove the dependency between no. of node feature files and no. of partitions
      * Add “node_type_counts”. This will be a list of integers. Each integer will represent total count of a node-type. The index in this list and the index in the “node_type” list will be the same for a given node-type.
      * Add “edge_type_counts”. This will be a list of integers. Each integer will represent total count of an edge-type. The index in this list and the index in the “edge_type” list will be the same for a given edge-type.
      
      * Applying lintrunner patch.
      
      * Adding missing keys to the metadata in the unit test framework.
      
      * lintrunner patch.
      
      * Resolving CI test failures due to merge conflicts.
      
      * Applying lintrunner patch
      
      * applying lintrunner patch
      
      * Replacing tabspace with spaces - to satisfy lintrunner
      
      * Fixing the CI Test Failure cases.
      
      * Applying lintrunner patch
      
      * lintrunner complaining about a blank line.
      
      * Resolving issues with print statement for NoneType
      
      * Removed the arbitrary-chunks tests, since this functionality is no longer supported.
      
      * Addressing CI review comments.
      
      * addressing CI review comments
      
      * lintrunner patch
      
      * lintrunner patch.
      
      * Addressing CI review comments.
      
      * lintrunner patch.
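The index alignment that #5310 describes for “node_type_counts” and “edge_type_counts” can be checked with a few lines of Python. This is a sketch under assumptions: the four key names come from the commit message, but the validation function is hypothetical and the type names and counts in the example are made up.

```python
def check_type_counts(metadata):
    """Check that each *_type_counts list lines up with its type list."""
    for names_key, counts_key in (
        ("node_type", "node_type_counts"),
        ("edge_type", "edge_type_counts"),
    ):
        names = metadata[names_key]
        counts = metadata[counts_key]
        if len(names) != len(counts):
            raise ValueError(f"{counts_key} must have one entry per {names_key}")
        if not all(isinstance(c, int) and c >= 0 for c in counts):
            raise ValueError(f"{counts_key} must hold non-negative integers")
    # Same index in both lists refers to the same node type.
    return dict(zip(metadata["node_type"], metadata["node_type_counts"]))


# Hypothetical metadata fragment; names and numbers are invented.
example = {
    "node_type": ["author", "paper"],
    "node_type_counts": [3, 5],
    "edge_type": ["author:writes:paper"],
    "edge_type_counts": [7],
}
```

With explicit totals per type, the number of files per type no longer has to encode or match the number of partitions.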
  10. 24 Feb, 2023 2 commits
  11. 23 Feb, 2023 6 commits
    • New script for customers to validate partitioned graph objects (#5340) · c42fa8a5
      kylasa authored
      * A new script to validate graph partitioning pipeline
      
      * Addressing CI review comments.
      
      * lintrunner patch.
    • [DistDGL][Robustness]Uneven distribution of input graph files for nodes/edges and features. (#5227) · bbc538d9
      kylasa authored
      * Uneven distribution of nodes/edges/features
      
      To handle unevenly sized node/edge files and node/edge feature files, we have to synchronize before sending a large volume of data (either one large message or a burst of messages).
      
      * Applying lintrunner patch.
      
      * Removing tabspaces for lintrunner.
      
      * lintrunner patch.
      
      * Removed issues introduced by the merge conflicts; a lot of code had been duplicated.
    • [DistDGL][Mem_Optimizations]get_partition_ids, service provided by the distributed lookup service has high memory footprint (#5226) · 61b6edab
      kylasa authored
      * get_partition_ids, service provided by the distributed lookup service has high memory footprint
      
      The 'get_partition_ids' function, which retrieves the owner processes for a given list of global node ids, has a high memory footprint. Currently it is on the order of 8x the size of the input list.
      
      For very large datasets, these memory requirements are unrealistic and may result in OOM. In the case of CoreGraph, when retrieving the owners for an edge list of 6 billion edges, the memory needs can be as high as 8*8*8 = 256 GB.
      
      To limit the memory used by this function, we split the message sent to the distributed lookup service so that each message carries at most 200 million global node ids. This reduces the memory footprint of the entire function to no more than 0.2 * 8 * 8 ≈ 13 GB, which is within reasonable limits.
      
      Because we now send multiple small messages instead of one large message to the distributed lookup service, this may take more wall-clock time than the earlier implementation.
      
      * lintrunner patch.
      
      * using np.ceil() per suggestion.
      
      * converting the output of np.ceil() to ints.
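The batching idea from #5226 can be sketched in a few lines. This is an illustration under assumptions, not the code from the PR: `lookup_in_batches` and `toy_owner` are hypothetical names, and `toy_owner` merely stands in for the distributed lookup service. The commit log mentions np.ceil() with a cast to int; the sketch uses `math.ceil`, which already returns an int.

```python
import math


def lookup_in_batches(global_nids, lookup_fn, max_ids_per_message):
    """Call lookup_fn on slices of at most max_ids_per_message ids."""
    num_messages = math.ceil(len(global_nids) / max_ids_per_message)
    owners = []
    for i in range(num_messages):
        start = i * max_ids_per_message
        end = start + max_ids_per_message
        # Only one slice is in flight at a time, which bounds peak memory.
        owners.extend(lookup_fn(global_nids[start:end]))
    return owners


def toy_owner(nids, num_parts=4):
    # Hypothetical stand-in for the lookup service: owner = id mod num_parts.
    return [nid % num_parts for nid in nids]
```

The trade-off is the one the commit message names: peak memory scales with the batch size rather than the full input, at the cost of more round trips.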
    • [Bugfix] fixed leak in SpMMCreateBlocks (#5210) · 99937422
      Kacper Pietkun authored
      * fixed leak in SpMMCreateBlocks
      
      * clang format
    • [Model] Implemented SubgraphX Explainer for Homogeneous graph (#5315) · 45153fc0
      Kunal Mukherjee authored
      * subgraphx commit
      
      * nits
      
      * newline eof added
      
      * lint fix
      
      * test script updated to use default values
      
      * lint fix
      
      * graphs that are used for test cases are updated to a small graph
      
      * lint formatted
      
      * test parameter adjusted to complete the test under 20s
      
      * lint fixes
      
      ---------
      Co-authored-by: kxm180046 <kxm180046@utdallas.edu>
  12. 22 Feb, 2023 5 commits
  13. 21 Feb, 2023 7 commits