    [Sampling] Implement `dgl.to_block()` for the GPU (#2339) · bc3a532f
    nv-dlasalle authored
    
    
    * Add start of to_block gpu implementation
    
    * Pull in more changes from 0.4.2 cuda_to_block
    
    * Move more code to IdArray
    
    * Refactor DeviceNodeMapMaker
    
    * Updates
    
    * get compiling
    
    * Integrate to_block
    
    * Fix ID allocation
    
    * Minor fixes
    
    * Cleanup cuda calls to use cuda_common
    
    * Reduce kernel calls
    
    * Lint cleanup
    
    * Expand documentation
    
    * Remove unused function
    
    * Rename variables for consistency
    
    * Add doxygen comments
    
    * Fix file extension
    
    * Remove raw asynccopy for deviceapi
    
    * Remove unused function
    
    * Fix block/tile configuration
    
    * Add cuda_device_common.cuh
    
    * Add basic hashtable
    
    * Migrate part of hashtable
    
    * Refactor to use external hashtable
    
    * Make functions members
    
    * Format hash table functions
    
    * Migrate duplicate filling
    
    * Move last function over
    
    * Refactor with cu file
    
    * lint c++ code
    
    * Move context check to C++ code
    
    * Use macro switch
    
    * Add missing files
    
    * Update docstring
    
    * update docs
    
    * Move atomic functions
    
    * Refactor hashtable
    
    * Fix linting
    
    * Expand docs
    
    * Fix mismatched argument names
    
    * Switch doxygen comments from using @param to \param
    Co-authored-by: Jinjing Zhou <VoVAllen@users.noreply.github.com>
    Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>
CUDA.cmake 10.6 KB