1. 11 Feb, 2022 1 commit
    • Batched GEMM for fp16 (#79) · b53e9d08
      zjing14 authored
      * prepare host for batched_gemm
      
      * init commit of batched kernels
      
      * fixed
      
      * refine transform with freeze
      
      * m/n padding
      
      * fixed a bug; clean
      
      * add small tiles
      
      * clean
      
      * clean code
      
      * clean code
      
      * add nt, tn, tt layout
      
      * add missing file
      
      * use StaticBufferTupleOfVector instead
      
      * add reference_batched_gemm
      
      * fixed a macro
  2. 07 Feb, 2022 1 commit
    • GEMM+Bias+ReLU+Add (#76) · 823657ed
      Chao Liu authored
      * tweak conv for odd C
      
      * update script
      
      * clean up elementwise op
      
      * fix build
      
      * clean up
      
      * added example for gemm+bias+relu+add
      
      * added example for gemm+bias+relu
      
      * add profiler for gemm_s_shuffle; re-org files
      
      * add profiler
      
      * fix build
      
      * clean up
      
      * clean up
      
      * clean up
      
      * fix build