1. 08 Mar, 2023 1 commit
    • Adam Osewski's avatar
      GroupedGEMM + Gelu client example/instances/profiler (#614) · 9096b1c7
      Adam Osewski authored
      
      
      * Grouped gemm + Gelu instances.
      
      * Device Instance Factory for GroupedGemm+Gelu
      
      * Client example
      
      * Rangify fill helper functions.
      
      * Fix name clash.
      
      * Profiler for grouped_gemm+gelu
      
      * No need to use full namespace name.
      
      * Add check for MRaw divisible by vector load.
      
      * Ugly fix for big errors.
      
      * Add grouped_gemm+gelu to profiler CMakelists.
      
      * Store in argument additional info.
      
      * Information about Mraw, Nraw, Kraw values.
      
      * Use FastGelu instead of Gelu.
      
      * Change client ex to use FastGelu
      
      * Remove relaxed error precision.
      
      * Remove duplicate output elementwise-op
      
      ---------
      Co-authored-by: default avatarAdam Osewski <aosewski@amd.com>
      Co-authored-by: default avatarzjing14 <zhangjing14@gmail.com>
      9096b1c7