1. 17 Oct, 2023 1 commit
  2. 04 Oct, 2023 1 commit
    • Add conv bwd weight fp16 comp bf8 fp8 op, instances and example (#945) · 42facfc6
      Rostyslav Geyyer authored
      
      
      * Add f8 bf8 gemm example
      
      * Add element-wise ops
      
      * Add intrinsics
      
      * Update reference calculation
      
      * Add an additional type option for xdlops gemm
      
      * Fix build process
      
      * Add bf8 to buffer addressing
      
      * Update blockwise op, split typeA and typeB
      
      * Update for compatibility
      
      * Update naming to f8->fp8
      
      * Update naming
      
      * Format
      
      * Update naming (#937)
      
      * Add a client example
      
      * Add computetypes to device and gridwise ops
      
      * Add instances, update instance factory
      
      * Format
      
      * Fix a flag
      
      * Add ckProfiler mode
      
      * Fix typos
      
      * Add an example
      
      * Add bf8 generator
      
      * add bf8 mfma; fixed type_convert for bf8
      
      * move verification ahead of timing
      
      * Update reference calculation
      
      * Fix reference
      
      * Narrow down float init range
      
      * Fix bf8 bf8 mfma
      
      * Add bf8 @ fp8 mfma
      
      * Update example
      
      * Update instances
      
      * Update profiler api
      
      * Update for compatibility
      
      * Format
      
      * Remove extra example
      
      * Clean up
      
      * workaround convert
      
      ---------
      Co-authored-by: Jing Zhang <jizha@amd.com>
  3. 31 May, 2023 1 commit
  4. 22 Feb, 2023 1 commit
    • Add Grouped Conv Backward Weight on Navi21 for ResNet50. (#505) · 246ceee4
      Rostyslav Geyyer authored
      
      
      * Add DeviceOp and examples
      
      * Format DeviceOp template arguments
      
      * Remove bf16 example
      
      * Format
      
      * Format
      
      * Update MakeABCGridDescriptor_A_K0_M_K1_B_K0_N_K1_C_M_N
      
      * Refactor argument preparation
      
      * Update conv_bwd_weight_dl to grouped_conv_bwd_weight_dl
      
      * Rename device op file
      
      * Update include directive in the example file
      
      * Update descriptor preparation for grouped op
      
      * Update the argument
      
      * Update batch handling
      
      * Add gridwise gemm supporting batched input
      
      * Update blockwise indexing, working version
      
      * Update copyright year
      
      * Update check if argument is supported
      
      * Refactor and make consistent with xdl examples
      
      * Update check if argument is supported
      
      * Add changelog entry
      
      * Added comments on Dl op split_k>1 support
      
      ---------
      Co-authored-by: Rosty Geyyer <rosty.geyyer@amd.com>
      Co-authored-by: zjing14 <zhangjing14@gmail.com>
  5. 10 Nov, 2022 1 commit
    • Add client example of grouped conv2d backward weight (data type: fp16) (#498) · 38470e04
      Po Yen Chen authored
      * Remove redundant CMake setting
      
      * Extract common code from files
      
      * Rename folder 'convnd' to 'conv'
      
      * Use std::array<> to accept compile-time known number of arguments
      
      * Fix compilation error of tuning parameter
      
      * In example, use same setting as unit-test
      
      * Remove no-longer used include directive
      
      * Add interface for grouped conv bwd weight
      
      * Add group support for conv bwd weight
      
      * Add grouped conv bwd weight example
      
      * Use group parameter in example
      
      * Rename example folder
      
      * Remove non-grouped version example source files
      
      * Rename device op template
      
      * Add group support to convolution backward weight
      
      * Remove debug messages
      
      * Use smaller group size in example
      
      * Use named variable as loop termination condition
      
      * Prettify example output message
      
      * Enlarge used grid size
      
      * Allow real grid size to exceed expected grid size
      
      * Rename interface file
      
      * Add client example for grouped conv2d bwd weight
      
      * Fix wrong include directive
      
      * Rename client example folder
  6. 02 Nov, 2022 1 commit
    • Add client example of grouped conv2d backward data (data type: fp16) (#481) · 9e57a290
      Po Yen Chen authored
      * Improve example reusability
      
      * Remove no-longer used file
      
      * Rename folder of grouped_conv_bwd_data example
      
      * Add normal grouped conv bwd example
      
      * Add interface 'DeviceGroupedConvBwdData'
      
      * Prettify comment of device op type arguments
      
      * Add grouped conv2d/conv3d backward data fp16 instances
      
      * Fix wrong template argument
      
      * Add grouped_conv2d_bwd_data client example
      
      * Use simpler expression to calculate memory size
      
      * Fix formatting
      
      * Remove grouped_conv3d_bw_data instances
      
      Underlying device operator is not ready to handle 3D input
      
      * Remove no-longer necessary include directive
      
      * Add missing include directive
      
      * Use more realistic conv param in example