  1. 27 Nov, 2024 2 commits
    • Polished Grouped GEMM APIs and new BF16 instances (#1600) · 061ac064
      Adam Osewski authored
      * Few small fixes.
      
      * New GroupedGemm instances (BF16)
      
      * Unify and refactor GroupedGEMM device API.
      
      * Adapt changes to new API.
      
      * Adapt grouped gemm profiler.
      
      * Accept multiple kbatches for grouped gemm profiler.
      
      - delete obsolete two stage as it is now covered by grouped gemm
      
      * Update unit test for grouped gemm.
      
      * Fix thresholds for BF16 and F8. Unblock tests.
      
      * Fix few instances.
      
      * Multiple small fixes.
      
      * Adapt to new API, check dynamic casting.
      
      * Uncomment few data types in grouped gemm profiler.
      
      * Fix call to SetDeviceArgs.
      
      * Fix profile grouped gemm multiply tile loop.
      
      * Fix grouped gemm tile loop kernel args in client examples.
      
      * Review comments.
  2. 26 Nov, 2024 2 commits
  3. 25 Nov, 2024 1 commit
  4. 21 Nov, 2024 1 commit
  5. 18 Nov, 2024 2 commits
  6. 14 Nov, 2024 1 commit
  7. 13 Nov, 2024 3 commits
  8. 07 Nov, 2024 1 commit
  9. 05 Nov, 2024 1 commit
  10. 30 Oct, 2024 1 commit
  11. 29 Oct, 2024 1 commit
  12. 26 Oct, 2024 2 commits
  13. 25 Oct, 2024 1 commit
    • Generic threshold calculation (#1546) · 9385caa3
      aledudek authored
      * Calculate generic relative threshold pool3dfwd
      
      * Calculate absolute error threshold pool3d fwd
      
      * Generic threshold calculation take max input for relative error pool3dfwd
      
      * Remove max possible value for error calculation at runtime
      
      * Remove debug print in pool3dfwd
      
      * Pool3d fwd adjusted types in generic threshold calculation
      
      * Generic threshold calculation take into account number of accumulations and accdatatype
      
      * Generic threshold fix final error formula
      
      * Generic threshold calculation - num of accs fix
      
      * Generic threshold calculation - adjust absolute error
      
      * Generic threshold calculation - OutDataType in absolute error
  14. 22 Oct, 2024 1 commit
  15. 18 Oct, 2024 1 commit
  16. 15 Oct, 2024 1 commit
  17. 14 Oct, 2024 4 commits
  18. 12 Oct, 2024 1 commit
  19. 11 Oct, 2024 1 commit
  20. 09 Oct, 2024 2 commits
  21. 08 Oct, 2024 3 commits
  22. 07 Oct, 2024 1 commit
  23. 04 Oct, 2024 1 commit
  24. 02 Oct, 2024 5 commits