1. 18 Jul, 2023 1 commit
    • Add mechanism to build CK for select data types, add Navi3x CI. (#790) · 189ea3b9
      Illia Silin authored
      * allow building CK for specific data types
      
      * add CI build and test stage on Navi3x without some int8 instances
      
      * add missing gemm fp16 instances
      
      * add the changes to the missed cmake file
      
      * add empty lines at end of source files
      
      * Do not build quantization client example on navi3 in CI
      
      * disable batched_gemm_multi_d_int8 instances with DTYPES
      
      * disable device_conv2d_bwd_data_instance with DTYPES
      
      * fix ckprofiler for conv_bwd_data for int8
      
      * properly isolate the conv_bwd_data int8 instances
      
      * remove empty line
      189ea3b9
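      A minimal sketch of the data-type selection described above, assuming the requested
      DTYPES list ends up as per-type preprocessor definitions such as CK_ENABLE_FP16 /
      CK_ENABLE_INT8 (macro and function names here are illustrative, not taken from the
      PR; judging by the bullets, the actual selection is done at the CMake level by
      excluding instance sources):

          // Illustrative only: model "build CK for select data types" with guards.
          #include <iostream>

          void add_gemm_fp16_instances() { std::cout << "fp16 GEMM instances registered\n"; }
          void add_gemm_int8_instances() { std::cout << "int8 GEMM instances registered\n"; }

          int main()
          {
          #ifdef CK_ENABLE_FP16 // assumed to be defined when "fp16" is in DTYPES
              add_gemm_fp16_instances();
          #endif
          #ifdef CK_ENABLE_INT8 // assumed to be defined when "int8" is in DTYPES
              add_gemm_int8_instances();
          #endif
              return 0;
          }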
  2. 15 Jun, 2023 1 commit
    • Enable gfx941 and gfx942 architectures. (#752) · 027e46ee
      Illia Silin authored
      * enable gfx941/942 targets
      
      * fix clang format
      
      * fix the cmake logic for multiple targets
      
      * fix cmake syntax for looping over targets
      
      * add gfx941/942 support for gemm_xdl instances
      027e46ee
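      Not part of the PR itself, but a hedged sketch of how an application could confirm at
      runtime that it is running on a gfx94x device (which includes the newly enabled
      gfx941/gfx942) before relying on XDL instances, using the standard HIP
      device-properties API:

          #include <hip/hip_runtime.h>
          #include <cstdio>
          #include <cstring>

          int main()
          {
              hipDeviceProp_t prop;
              if(hipGetDeviceProperties(&prop, 0) != hipSuccess)
              {
                  std::printf("no HIP device found\n");
                  return 1;
              }
              // gcnArchName looks like "gfx941:sramecc+:xnack-"; compare the prefix only.
              const bool is_gfx94x = std::strncmp(prop.gcnArchName, "gfx94", 5) == 0;
              std::printf("device 0: %s, gfx94x family: %s\n",
                          prop.gcnArchName, is_gfx94x ? "yes" : "no");
              return 0;
          }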
  3. 23 May, 2023 1 commit
    • Enable gemm_dl and other kernels on Navi3x. (#714) · d821d1e5
      Illia Silin authored
      * enable dl kernels on navi3
      
      * do not build xdl tests and examples on Navi
      
      * run tests before building everything on jenkins
      
      * disable gemm_bilinear on gfx1030
      
      * add gpu targets to installer on Navi
      
      * put tests in the same order as before
      
      * reduce the number of navi targets in CI
      
      * build CI installer for gfx940 as well
      
      * only build for MI300 during QA runs
      d821d1e5
  4. 15 Mar, 2023 1 commit
    • gemm/Conv xdlops + dlops quantization (#625) · 16dc18e0
      rocking5566 authored
      * Add conv perlayer quantization
      
      * Add gemm_dlops quantization
      
      * Support int8 for innerproduct
      
      * Refine gemm dlops int8 kernel parameter
      
      * Support gfx908(MI100) and gfx90a(MI200)
      
      * clang-format
      
      * Rename example number
      
      * Support different layouts for the D tensor
      
      * Add conv dlops perchannel quantization example
      
      * Move to example 40
      
      * Extract the common code for different platforms (dlops and xdlops)
      
      * Move to subfolder. Prepare to add other quantization ops
      
      * Refine the quantization instance library
      
      * Add conv dl instances and client example
      
      * Remove unnecessary type
      
      * Add gemm quantization instance
      
      * Add external api and client example
      
      * Refine num_bytes
      
      * Separate different layouts into different cpp files
      
      * Add more xdl instances
      
      * Revert "Remove unnecessary type"
      
      This reverts commit 820869182f6a8f62b2c9004101ba6bf76b96be14.
      
      * Remove CShuffleDataType in dlops
      Let acc and CShuffleDataType be the same in xdlops
      
      ---------
      Co-authored-by: zjing14 <zhangjing14@gmail.com>
      16dc18e0
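      As a rough illustration of the per-layer (per-tensor) quantization these examples
      exercise: the int8 kernels accumulate in int32 and then requantize the accumulator
      with a single scale (named requant_scale, following the rename in #503 below). The
      rounding and clamping details here are assumptions, not copied from the instances:

          #include <algorithm>
          #include <cmath>
          #include <cstdint>
          #include <cstdio>

          // Per-layer requantization: one scale for the whole output tensor.
          // acc is the int32 GEMM/conv accumulator; requant_scale folds the input,
          // weight and output scales together (e.g. scale_a * scale_b / scale_out).
          std::int8_t requantize(std::int32_t acc, float requant_scale)
          {
              const float scaled  = static_cast<float>(acc) * requant_scale;
              const long  rounded = std::lround(scaled);
              return static_cast<std::int8_t>(std::clamp<long>(rounded, -128, 127));
          }

          int main()
          {
              std::printf("%d\n", requantize(1000, 0.05f)); // prints 50
              return 0;
          }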
  5. 30 Nov, 2022 1 commit
    • gemm, conv perchannel quantization (#503) · ad541ad6
      rocking5566 authored
      * Use gemm_multiple_D instead
      
      * Add gemm bias relu quantization example
      
      * Add pure gemm quantization example
      
      * Add quantization of perchannel conv + bias + relu example
      
      * Refine the code
      
      * Rename multiplier to requant_scale
      
      * Rename the folder
      
      * Remove redundant comment
      
      * Rename the file. Prepare to add perchannel
      
      * Add conv perchannel instance
      
      * Move to quantization folder
      
      * Add conv perchannel client example
      
      * Apply Rangify constructor of HostTensorDescriptor & Tensor<>
      
      * Fix merge error
      ad541ad6
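      A hedged sketch of the per-channel conv/gemm + bias + relu quantization these
      examples add: requant_scale becomes a per-output-channel vector that can be passed
      to the gemm_multiple_D kernels as an extra D tensor. The exact fusion order used by
      the instances is an assumption here:

          #include <algorithm>
          #include <cmath>
          #include <cstdint>
          #include <cstdio>
          #include <vector>

          // Per-channel requantization epilogue on the int32 accumulator:
          // each output channel n has its own scale, applied after bias and relu.
          std::int8_t requantize_channel(std::int32_t acc, std::int32_t bias, float requant_scale)
          {
              const std::int32_t relu    = std::max(acc + bias, 0); // fused bias + relu
              const long         rounded = std::lround(relu * requant_scale);
              return static_cast<std::int8_t>(std::clamp<long>(rounded, -128, 127));
          }

          int main()
          {
              const std::vector<float>        requant_scale = {0.02f, 0.10f}; // one per channel
              const std::vector<std::int32_t> bias          = {100, -50};
              const std::int32_t              acc[2]        = {900, 400};
              for(int n = 0; n < 2; ++n)
                  std::printf("channel %d -> %d\n", n,
                              requantize_channel(acc[n], bias[n], requant_scale[n]));
              return 0;
          }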