1. 02 Nov, 2022 1 commit
    • rocking5566's avatar
      Refine layernorm naming and test code (#497) · d4d1147f
      rocking5566 authored
      * Sync the naming
      
      * Sync the test of layernorm with groupnorm
      
      * Sync the naming
      
      * Minor change for comment and log
      
      * [What] Add saveMean and SaveInvVariance in the interface.
      [Why] These can optimize the backward
      d4d1147f
  2. 13 Oct, 2022 2 commits
    • Adam Osewski's avatar
      Refactor device op implementations into `impl` subdirectory. (#420) · 30480288
      Adam Osewski authored
      
      
      * Move kernel implementation files under impl directory.
      
      * Update examples paths.
      
      * Update device kernel impl include paths.
      
      * Update tensor operation instances include paths.
      
      * Update profiler and tests include paths.
      
      * Clang-format
      
      * Update include paths for batched gemm reduce
      
      * Refactor UnitTest ConvNDBwdWeight.
      
      * Refactor fwd and bwd data convND UT.
      
      * Fix used test macro.
      
      * Fix include path.
      
      * Fix include paths.
      
      * Fix include paths in profiler and tests.
      
      * Fix include paths.
      Co-authored-by: default avatarAdam Osewski <aosewski@amd.com>
      30480288
    • rocking5566's avatar
      Fix bug of layernorm ckProfiler and refine code (#448) · 1b62bfaa
      rocking5566 authored
      * Fix bug of profiler for layernorm
      
      * 1. Rename layernorm into normalization
      2. Decouple softmax from normalization
      
      * clang-format
      1b62bfaa
  3. 20 Sep, 2022 1 commit
    • rocking5566's avatar
      Group norm (#417) · 4eba345f
      rocking5566 authored
      
      
      * Add groupnorm example by layernorm
      1.  Reference is not ready
      2. shape of gamma and beta need to be fix
      
      * Let shape of gamma and beta can be same as x
      
      * Modify test, instance and client example
      
      * [What] Fix bug of layernorm for greater than 2 dimension.
      [Why] We need to get upper length from merge transform instead of embed transform.
      
      * Add reference for groupnorm
      
      * Fuse sigmoid after groupnorm
      
      * [What] Rename original layernorm into layernorm2d
      [Why] Prepare to add groupnorm using layernorm5d
      
      * clang-format
      
      * Add groupnorm test
      
      * Refine error message
      
      * Add groupnorm ckProfiler
      
      * Test groupnorm kernel from device_instance
      
      * update example
      
      * upadte profiler
      
      * Fix test naming
      
      * Fix argc number
      
      * Move descriptor and sweeponce to argument for quick debugging
      Co-authored-by: default avatarChao Liu <chao.liu2@amd.com>
      4eba345f
  4. 13 Aug, 2022 1 commit
    • rocking5566's avatar
      Layernorm welford (#346) · 0bd6b842
      rocking5566 authored
      
      
      * Add threadwise and blockwise welford
      
      * Rename gridwise op, prepare to add welford version
      
      * implement welford and integrate welford into layernorm
      
      * Take care of tail loop
      
      * Fix buf when ThreadSliceK > 1
      
      * Fix bug of merging of two empty set
      
      * Rename clip to clamp
      
      * 1. Fix type of count
      2. Remove useless static_assert
      
      * Do not inherit Reduction::Argument
      
      * [What] replace __syncthreads() with block_sync_lds()
      [Why] __syncthreads might wait both lgkmcnt(0) and vmcnt(0)
      
      * Add y stride
      
      * Rename.
      DeviceLayernorm -> DeviceLayernormImpl
      DeviceNormalization2 -> DeviceLayernorm
      
      * Move literal ""_uz & ""_zu into namespace 'literals'
      
      * Move namespace 'literals' as 'ck::literals'
      Co-authored-by: default avatarPo-Yen, Chen <PoYen.Chen@amd.com>
      Co-authored-by: default avatarChao Liu <chao.liu2@amd.com>
      0bd6b842
  5. 11 Aug, 2022 1 commit
    • rocking5566's avatar
      ckProfiler for layernorm (#330) · fdfd7eb5
      rocking5566 authored
      * Refine parameter
      
      * Add base class for layernorm
      
      * Add layernorm instance
      
      * Add layernorm to ckProfiler
      
      * Remove redundant
      
      * Add verification
      
      * Fix compile error due to merge
      fdfd7eb5