    elementwise op (#238) · aafc3ac2
    rocking5566 authored
    
    
    * Add elementwise operation kernel and example
    
    * Add comment
    
    * Add template argument for dim. Prepare to support multiple dimensions
    
    * Rename example
    
    * Support 1 dimension
    
    * Add static assert
    
    * Add comment
    
    * Extract pad
    
    * Remove redundant argument
    
    * Support any dimension for elementwise operation
    
    * Remove line
    
    * Make it a multiple of the number of CUs
    
    * Move thread per block to the parameter of constructor
    
    * Rename threadPerBlock to blockSize
    
    * Support double
    
    * Rename kernel function
    
    * Remove redundant header include
    
    * Refine type
    
    * Need to the final dimension
    
    * Refine variable name
    
    * Refine type
    
    * Use index_t instead of int in API
    Co-authored-by: rocking <chunylai@amd.com>
CMakeLists.txt 2.36 KB