  1. 05 Aug, 2024 5 commits
  2. 02 Aug, 2024 5 commits
  3. 01 Aug, 2024 2 commits
  4. 31 Jul, 2024 20 commits
  5. 30 Jul, 2024 2 commits
  6. 26 Jul, 2024 2 commits
  7. 25 Jul, 2024 2 commits
  8. 24 Jul, 2024 2 commits
    • Adding more instances of grouped convolution 3d forward for FP8 with ConvScale+Bias element-wise operation. (#1412) · 4a8a1bef
      Andriy Roshchenko authored
      
      * Add CMakePresets configurations.
      
      * Add binary elementwise ConvScaleAdd and an example.
      
      * Numerical verification of results.
      
      Observed significant irregularities in F8 to F32 type conversions:
      ```log
      ConvScaleAdd: float=145.000000   f8_t=160.000000    e=144.000000
      ConvScaleAdd: float=97.000000   f8_t=96.000000    e=104.000000
      ConvScaleAdd: float=65.000000   f8_t=64.000000    e=72.000000
      ```
      
      * Implemented ConvScaleAdd + Example.
      
      * Add ConvScale+Bias Instances
      
      * Add Client Example for ConvScale+Bias
      
      * Fix number of bytes in an example.
      
      * Cleanup.
    • Add support for half_t and bfloat to reduction operations (#1395) · ffabd70a
      Bartłomiej Kocot authored
      * Add support for half_t and bfloat to reduction operations
      
      * Fix bhalf convert
      
      * Next fix bf16