1. 11 May, 2023 5 commits
  2. 10 May, 2023 3 commits
  3. 04 May, 2023 6 commits
    • Optimize bf16 conversion (#664) · b076a02a
      Rostyslav Geyyer authored
      * Add TypeConvert class and start refactoring
      
      * Refactor TypeConvert as a struct
      
      * Get back to template functions type_convert
      
      * Add a type_convert_bf16_rtn, set rtz as default
      
      * Clean up
      
      * Add UnaryConvertPrecision struct for high-precision workloads
      
      * Format
      
      * Update type_convert to UnaryConvert on threadwise level
      
      * Update UnaryConvertPrecision
      
      * Format
      
      * Fix chmod
      
      * Add a flag to pick conversion method
      
      * Format
      
      * Remove the added flag
      
      * Merge elementwise op with type conversion
      
      * Move type_convert to elemwise op, update the op
      
      * Update type_convert_precision -> bf16_convert_rtn
      
      * Clean up
      
      * Update comments
      
      * Update the CK_WORKAROUND_DENORM_FIX flag handling
      
      * Update the unneeded op to work but warn user
      
      * Remove the message
      
      * Use a PassThrough instead of ConvertBF16RTN to calculate reference
      
      * Format
      
      * Add missing include
    • Add more debug log information. · 310bdd5a
      Adam Osewski authored
    • Turn on DEBUG_LOG · 66b3a28d
      Adam Osewski authored
    • Don't pass kbatch to CalculateKPadded. · 1fb6a1e0
      Adam Osewski authored
    • a3e2ddba
    • Add license header. · 88436bd9
      Adam Osewski authored
  4. 03 May, 2023 2 commits
  5. 28 Apr, 2023 1 commit
  6. 26 Apr, 2023 1 commit
  7. 24 Apr, 2023 1 commit
  8. 22 Apr, 2023 1 commit
  9. 16 Apr, 2023 2 commits
  10. 11 Apr, 2023 2 commits
  11. 10 Apr, 2023 1 commit
    • Groupnorm + swish external api (#668) · ed3a2e52
      rocking5566 authored
      * Rename to proper naming
      
      * Add example of groupnorm + swish
      
      * Extract duplicate code in example
      
      * Add groupnorm + swish instances
      
      * Refactor instance generation, split into multiple cpp files
      
      * Add external api and client example
      
      * Refine profiler message
      
      * Use ck math version of exp
      
      * Refine problem size in example
      
      * Add host version of exp
  12. 07 Apr, 2023 1 commit
  13. 30 Mar, 2023 2 commits
  14. 29 Mar, 2023 2 commits
  15. 23 Mar, 2023 1 commit
  16. 22 Mar, 2023 1 commit
  17. 20 Mar, 2023 2 commits
  18. 15 Mar, 2023 5 commits
  19. 10 Mar, 2023 1 commit