1. 12 Dec, 2022 1 commit
    • Gridwise elementwise 2d (#466) · 0e5c264c
      arai713 authored
      
      
      * added 2d gridwise elementwise
      
      * added 2d version of device elementwise
      
      * added example file with updated device elementwise call
      
      * added CMake file
      
      * changed NumDim into 2D
      
      * fixed compiler issues
      
      * fixed indexing for loop step
      
      * fixed NumDim dimension error
      
      * changed blockID to 2D
      
      * updated Grid Desc
      
      * updated kernel call
      
      * fixed 2d thread indexing
      
      * added dimensions for example file
      
      * commented out unused code
      
      * changed vector load
      
      * removed extra code
      
      * temporarily removing vector load on 2nd dim
      
      * changed vector load back, still causing errors
      
      * altered indexing
      
      * changed isSupportedArgument for 2D
      
      * changed indexing + do/while
      
      * fixed isSupportedArgument
      
      * changed dimension for debugging
      
      * fixed
      
      * added testing printouts
      
      * testing change
      
      * added variables to distribute threads through both dimensions
      
      * testing changes
      
      * integrated variable for thread distribution into device elementwise and added as parameter for gridwise elementwise
      
      * removed most of the extraneous code, testing with different dimensions
      
      * testing
      
      * removed debugging print statements
      
      * moved 2d elementwise permute into elementwise permute directory
      
      * fixed formatting
      
      * removed debugging comments from threadwise transfer
      Co-authored-by: Jing Zhang <jizhan@amd.com>
      Co-authored-by: Po Yen Chen <PoYen.Chen@amd.com>
2. 19 Oct, 2022 1 commit
3. 17 Oct, 2022 1 commit
    • adding tensor_permutation example folder (#389) · cee440fe
      arai713 authored
      * adding tensor_permutation example folder
      
      * fixed formatting
      
      * adding tensor_permutation example folder
      
      * fixed formatting
      
      * changed deviceelementwise parameters for outscalar
      
      * removed .swo file
      
      * updated folder/file name
      
      * changed function call in verification for better consistency with hostelementwise parameters
      
      * formatted again
      
      * fixed shape in verification function call
      
      * changed verification function call, added definition for nhwc
      
      * added elementwise permute example
      
      * updated CMakeLists file in folder
      
      * Delete CmakeLists.txt
      
      * Delete tensor_permute.cpp
      
      * first version of 2d gridwise_elementwise kernel
      
      * temporary fix for stride problem
      
      * formatting
      
      * format
      
      * changed directory name
      
      * Delete gridwise_elementwise_2d.hpp
      
      * Delete CMakeLists.txt
      
      * Delete extra file
      
      * delete extra file
      
      * got rid of extraneous code
      
      * added 2d device elementwise file
      
      * deleted accidentally added file
      
      * update
      
      * stride values generalized with equations
      
      * updated stride for output matrix
      
      * Update CMakeLists.txt
      
      * removed extraneous commented code
      
      * removed shape_nchw vector, replaced with GetLength for each dimension
      
      * changed vector load in kernel call
      
      * removed extra space in CMake