1. 24 Feb, 2023 2 commits
  2. 23 Feb, 2023 6 commits
      New script for customers to validate partitioned graph objects (#5340) · c42fa8a5
      kylasa authored
      * A new script to validate graph partitioning pipeline
      
      * Addressing CI review comments.
      
      * lintrunner patch.
      [DistDGL][Robustness]Uneven distribution of input graph files for nodes/edges and features. (#5227) · bbc538d9
      kylasa authored
      * Uneven distribution of nodes/edges/features
      
      To handle unevenly sized node/edge files and node/edge feature files, we have to synchronize before sending a large volume of messages (either one large message or a burst of messages).
      
      * Applying lintrunner patch.
      
      * Removing tabspaces for lintrunner.
      
      * lintrunner patch.
      
      * removed issues introduced by the merge conflicts. Lots of code was repeated
      [DistDGL][Mem_Optimizations]get_partition_ids, service provided by the... · 61b6edab
      kylasa authored
      [DistDGL][Mem_Optimizations]get_partition_ids, service provided by the distributed lookup service has high memory footprint (#5226)
      
      * get_partition_ids, service provided by the distributed lookup service has high memory footprint
      
      The 'get_partitionid' function, which retrieves the owner processes for a given list of global node ids, has a high memory footprint, currently on the order of 8x the size of the input list.
      
      For very large datasets, these memory needs are unrealistic and may result in OOM. In the case of CoreGraph, when retrieving the owners of an edge list of 6 billion edges, the memory needs can be as high as 8*8*8 = 256 GB.
      
      To limit the memory used by this function, we split the request sent to the distributed lookup service into smaller messages, each carrying at most 200 million global node ids. This reduces the memory footprint of the entire function to no more than 0.2 * 8 * 8 = 13 GB, which is within reasonable limits.
      
      Because we now send multiple small messages instead of one large message to the distributed lookup service, this may take more wall-clock time than the earlier implementation.
      
      * lintrunner patch.
      
      * using np.ceil() per suggestion.
      
      * converting the output of np.ceil() as ints.
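
The chunking described in this commit message can be sketched as follows. This is a minimal illustration, not the code from the commit: the names `split_into_chunks` and `estimated_peak_gb` are hypothetical, and the 8-byte id size and 8x overhead factor are taken from the figures quoted in the message. It uses `np.ceil()` cast to `int` to count the messages, as the last two bullets mention.

```python
import numpy as np


def split_into_chunks(global_nids, max_ids_per_msg=200_000_000):
    """Yield consecutive slices of global_nids with at most max_ids_per_msg ids each.

    Hypothetical helper: the real lookup service call is not shown here.
    """
    total = len(global_nids)
    # np.ceil() returns a float, so it is converted to int before use in range().
    num_chunks = int(np.ceil(total / max_ids_per_msg))
    for i in range(num_chunks):
        start = i * max_ids_per_msg
        yield global_nids[start:start + max_ids_per_msg]


def estimated_peak_gb(num_ids, bytes_per_id=8, overhead_factor=8):
    """Rough peak memory in GB, assuming 8-byte ids and the 8x overhead quoted above."""
    return num_ids * bytes_per_id * overhead_factor / 1e9


if __name__ == "__main__":
    # Small stand-in for a list of global node ids, with a tiny chunk size.
    ids = np.arange(10, dtype=np.int64)
    for chunk in split_into_chunks(ids, max_ids_per_msg=4):
        print(len(chunk))
    print(estimated_peak_gb(200_000_000))
```

Under these assumptions, a 200 million id message costs about 12.8 GB at peak, which matches the roughly 13 GB bound stated in the message. The trade-off is more round trips to the lookup service.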
      [Bugfix] fixed leak in SpMMCreateBlocks (#5210) · 99937422
      Kacper Pietkun authored
      * fixed leak in SpMMCreateBlocks
      
      * clang format
      [Model] Implemented SubgraphX Explainer for Homogeneous graph (#5315) · 45153fc0
      Kunal Mukherjee authored
      
      
      * subgraphx commit
      
      * nits
      
      * newline eof added
      
      * lint fix
      
      * test script updated to use default values
      
      * lint fix
      
      * graphs that are used for test cases are updated to a small graph
      
      * lint formatted
      
      * test parameter adjusted to complete the test under 20s
      
      * lint fixes
      
      ---------
      Co-authored-by: kxm180046 <kxm180046@utdallas.edu>
  3. 22 Feb, 2023 5 commits
  4. 21 Feb, 2023 18 commits
  5. 20 Feb, 2023 6 commits
  6. 19 Feb, 2023 3 commits