"vscode:/vscode.git/clone" did not exist on "dff2c686c277f4f8de5f1cab5a486dbad70d7864"
  1. 19 Oct, 2023 1 commit
  2. 17 Oct, 2023 1 commit
  3. 05 Oct, 2023 1 commit
  4. 29 Sep, 2023 1 commit
    • Bartlomiej Wroblewski's avatar
      Add support for mixed precision in contraction scale and bilinear (#936) · f0748506
      Bartlomiej Wroblewski authored
      * Extract common functionality to separate files
      
      * Reference contraction: Remove incorrect consts from type_converts
      
      * Reference contraction: Add missing type_convert for dst value
      
      * Reference contraction: Fix incorrect order of B matrix dimensions
      
      * Add support for mixed precision in contraction scale and bilinear
      
      * Move using statements from instances to a common file
      
      * Move using statements from examples to a common file
      
      * Fix the order of B matrix dimensions across examples and profiler
      
      * Fix the computation of error threshold
      
      * Make ComputeDataType an optional argument
      
      * Include possible DataType -> ComputeDataType casting error in the threshold
      
      * Remove commented code
      f0748506
  5. 27 Sep, 2023 1 commit
    • Bartłomiej Kocot's avatar
      Add column to image kernel (#930) · e2243a4d
      Bartłomiej Kocot authored
      * Add column to image kernel
      
      * Minor fixes for dtypes and client examples
      
      * Disable tests for disabled dtypes
      
      * Disable add instances functions for disabled data types
      
      * Minor stylistic fixes
      
      * Revert "Disable add instances functions for disabled data types"
      
      This reverts commit 728b8695.
      
      * Instances reduction
      
      * Add comments in device_column_to_image_impl
      
      * Update changelog and Copyrights
      
      * Improve changelog
      e2243a4d
  6. 05 Sep, 2023 1 commit
    • Bartłomiej Kocot's avatar
      Add image to column kernel (#867) · 0077eeb3
      Bartłomiej Kocot authored
      * Add image to column kernel
      
      * Add instances, tests, profiler, example
      
      * Add client example
      
      * Several fixes of image to column
      
      * Fix variable name in device_image_to_column_impl
      
      * Several fixes of image to column profiler
      
      * Fix num_btype calculation
      
      * Make new mesaurements for correct bytes calculation
      0077eeb3
  7. 21 Jul, 2023 1 commit
  8. 12 Jul, 2023 1 commit
  9. 21 Jun, 2023 1 commit
  10. 12 Jun, 2023 1 commit
    • Bartłomiej Kocot's avatar
      Add DeviceBatchedGemmMultipleD_Dl (#732) · fc9f9756
      Bartłomiej Kocot authored
      * Add DeviceBatchedGemmMultipleD_Dl
      
      * Fix batched_gemm tests
      
      * Fix comments
      
      * test_batched_gemm_multi_d fixes
      
      * Fix args for isSupported batchedGemmMultipleDDl
      
      * Disable tests for gfx90a
      fc9f9756
  11. 15 May, 2023 1 commit
    • Bartłomiej Kocot's avatar
      Add contraction profiler and tests (#701) · 642d5e91
      Bartłomiej Kocot authored
      * Add contraction profiler and tests
      
      * Build and style fixes
      
      * Allow to use any elementwise operator for ref_contraction
      
      * Introduce profile_contraction_scale and profile_contraction_bilinear
      
      * Make ref_contraction generic and extend interface tests
      
      * Stylistic minor fixes
      
      * Extend test_contraction_interface
      642d5e91
  12. 31 Mar, 2022 1 commit
    • Chao Liu's avatar
      Compile for gfx908 and gfx90a (#130) · cd167e49
      Chao Liu authored
      * adding compilation for multiple targets
      
      * fix build
      
      * clean
      
      * update Jekinsfile
      
      * update readme
      
      * update Jenkins
      
      * use ck::half_t instead of ushort for bf16
      
      * rename enum classes
      
      * clean
      
      * rename
      
      * clean
      cd167e49
  13. 09 Mar, 2022 1 commit
    • Chao Liu's avatar
      Reorganize files, Part 1 (#119) · 5d37d7bf
      Chao Liu authored
      * delete obselete files
      
      * move files
      
      * build
      
      * update cmake
      
      * update cmake
      
      * fix build
      
      * reorg examples
      
      * update cmake for example and test
      5d37d7bf
  14. 14 Nov, 2021 1 commit
    • Chao Liu's avatar
      ckProfiler and device-level XDL GEMM operator (#48) · e823d518
      Chao Liu authored
      * add DeviceGemmXdl
      
      * update script
      
      * fix naming issue
      
      * fix comment
      
      * output HostTensorDescriptor
      
      * rename
      
      * padded GEMM for fwd v4r4r4 nhwc
      
      * refactor
      
      * refactor
      
      * refactor
      
      * adding ckProfiler
      
      * adding ckProfiler
      
      * refactor
      
      * fix tuning parameter bug
      
      * add more gemm instances
      
      * add more fp16 GEMM instances
      
      * fix profiler driver
      
      * fix bug in tuning parameter
      
      * add fp32 gemm instances
      
      * small fix
      
      * refactor
      
      * rename
      
      * refactor gemm profiler; adding DeviceConv and conv profiler
      
      * refactor
      
      * fix
      
      * add conv profiler
      
      * refactor
      
      * adding more GEMM and Conv instance
      
      * Create README.md
      
      Add build instruction for ckProfiler
      
      * Create README.md
      
      Add Readme for gemm_xdl example
      
      * Update README.md
      
      Remove build instruction from top most folder
      
      * Update README.md
      
      * clean up
      e823d518