"src/include/gridwise_direct_convolution_1.hip.hpp" did not exist on "73480fee3635310aedbbec68b6084c94cfd2457d"
  1. 02 Nov, 2022 1 commit
    • rocking5566's avatar
      Refine layernorm naming and test code (#497) · d4d1147f
      rocking5566 authored
      * Sync the naming
      
      * Sync the test of layernorm with groupnorm
      
      * Sync the naming
      
      * Minor change for comment and log
      
      * [What] Add saveMean and SaveInvVariance in the interface.
      [Why] These can optimize the backward
      d4d1147f
  2. 28 Oct, 2022 1 commit
  3. 13 Oct, 2022 1 commit
  4. 07 Oct, 2022 1 commit
    • Shaojie WANG's avatar
      Optimization for gridwise group norm (#453) · 40942b90
      Shaojie WANG authored
      
      
      * use another instance to check the efficiency
      
      * optimize group layer norm
      
      * 1. coalesce load/store data for gridwise layer norm welford. 2. move a sqrt and divison into a outer static loop
      
      * add more instances to layernorm
      
      * add 2 more test cases
      
      * remove ignore in generating tuple of vector
      Co-authored-by: default avatarChao Liu <chao.liu2@amd.com>
      40942b90
  5. 20 Sep, 2022 1 commit
    • rocking5566's avatar
      Group norm (#417) · 4eba345f
      rocking5566 authored
      
      
      * Add groupnorm example by layernorm
      1.  Reference is not ready
      2. shape of gamma and beta need to be fix
      
      * Let shape of gamma and beta can be same as x
      
      * Modify test, instance and client example
      
      * [What] Fix bug of layernorm for greater than 2 dimension.
      [Why] We need to get upper length from merge transform instead of embed transform.
      
      * Add reference for groupnorm
      
      * Fuse sigmoid after groupnorm
      
      * [What] Rename original layernorm into layernorm2d
      [Why] Prepare to add groupnorm using layernorm5d
      
      * clang-format
      
      * Add groupnorm test
      
      * Refine error message
      
      * Add groupnorm ckProfiler
      
      * Test groupnorm kernel from device_instance
      
      * update example
      
      * upadte profiler
      
      * Fix test naming
      
      * Fix argc number
      
      * Move descriptor and sweeponce to argument for quick debugging
      Co-authored-by: default avatarChao Liu <chao.liu2@amd.com>
      4eba345f