Commit b8898d7e authored by turneram, committed by GitHub

Update CK commit hash and add gfx940 to supported archs (#1842)

* Add initial ck_gemm code

* Format

* Add additional src files

* Format

* Add include

* Simplify fuse_ck

* Format

* Rename var

* Enable pass

* Update ck version

* Fix include

* Add group stride

* Disable warnings for ck headers

* Format

* Add unpack array

* Add interface to enable tuning

* Format

* Update compile_ops to handle tuning config

* Format

* Add some comments

* Move time_op to migraphx_gpu

* Add benchmarking

* Refactor

* Format

* Add lift class macro

* Use device name

* Format

* Generate configs

* Format

* Pass tuning parameter

* Move data type to is_ck_gemm matcher

* Format

* Add problem_cache to avoid retuning same configs

* Format

* Format

* Mark the problems

* Format

* Use is_null

* Format

* Resize vector

* Only tune with exhaustive tuning

* Format

* Use assert

* Format

* Tidy fixes

* More tidy fixes

* Format

* Add license to missing files

* Format

* Use transform

* Format

* Fix tidy

* Format

* Fix cppcheck issues

* Format

* Add static_assert

* Add ops header

* Add assertion in batcher

* Format

* Improve the batch fold check

* Format

* Add where op workaround for CK

* Skip if any input is not a supported ck type

* Format

* Check batch is standard

* Format

* Remove redundant static keyword

* Update commit hash

* Fix error when running without --exhaustive-tune

* Formatting

* Formatting

* Remove fuse_ck_gemm_softmax_gemm

* Update ck hash

* Correct spelling mistake

* Remove commented out logic from fuse_ck

* Remove unused include and add comment

* Formatting

* Remove redundant get_shape and remove ck_gemm from names

* Formatting

* Allow for mixed types with int8 gemms

* Formatting

* Add back find_package from merge

* Update CK commit hash and add gfx940 to fuse_ops supported archs

* Formatting

* Update CK hash
parent 2d635f91
@@ -28,4 +28,4 @@ ROCmSoftwarePlatform/half@rocm-5.4.2
 pybind/pybind11@d159a563383d10c821ba7b2a71905d1207db6de4 --build
 msgpack/msgpack-c@cpp-3.3.0 -DMSGPACK_BUILD_TESTS=Off
 sqlite3@3.17 -DCMAKE_POSITION_INDEPENDENT_CODE=On
-ROCmSoftwarePlatform/composable_kernel@84c5bec1d66a633802fd977bd61e0aada7a6f153 -DCK_BUILD_JIT_LIB=On -DCMAKE_POSITION_INDEPENDENT_CODE=On
+ROCmSoftwarePlatform/composable_kernel@ac580f77a84c705c678816ef7195adfcc02bdda5 -DCK_BUILD_JIT_LIB=On -DCMAKE_POSITION_INDEPENDENT_CODE=On
@@ -165,7 +165,8 @@ struct fusion
 const std::unordered_set<std::string>& get_supported_archs()
 {
-    static std::unordered_set<std::string> supported_archs{"gfx900", "gfx906", "gfx908", "gfx1030"};
+    static std::unordered_set<std::string> supported_archs{
+        "gfx900", "gfx906", "gfx908", "gfx1030", "gfx940"};
     return supported_archs;
 }
......
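As a rough illustration of how a set like the one returned by `get_supported_archs()` might be consulted, here is a minimal sketch. The set contents are copied from the change to `get_supported_archs()`. The helpers `base_arch_name` and `is_supported_arch` are hypothetical names that do not appear in this commit. The sketch also assumes that a device name may carry colon-separated feature suffixes (for example `gfx940:sramecc+:xnack-`), which the commit itself does not state.

```cpp
#include <string>
#include <unordered_set>

// Same set as in the commit, reproduced so the sketch is self-contained.
const std::unordered_set<std::string>& get_supported_archs()
{
    static std::unordered_set<std::string> supported_archs{
        "gfx900", "gfx906", "gfx908", "gfx1030", "gfx940"};
    return supported_archs;
}

// Hypothetical helper (not from the commit): keep only the text before the
// first ':' so that feature suffixes do not affect the lookup. If there is
// no ':' the whole name is returned unchanged.
std::string base_arch_name(const std::string& device_name)
{
    return device_name.substr(0, device_name.find(':'));
}

// Hypothetical helper (not from the commit): true when the base
// architecture name is present in the supported set.
bool is_supported_arch(const std::string& device_name)
{
    return get_supported_archs().count(base_arch_name(device_name)) > 0;
}
```

With this sketch, `is_supported_arch("gfx940")` is true after the change, while an architecture absent from the set, such as `gfx90a`, is still rejected. Returning the set by const reference from a function-local static avoids rebuilding it on every call.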
@@ -428,7 +428,7 @@ struct ck_gemm_compiler : compiler<ck_gemm_compiler>
 {
     std::vector<shape> gemm_shapes{
         shapes[0], shapes[1], shapes.back().with_type(shapes[0].type())};
-    std::cout << "ck_gemm: " << to_json_string(to_value(gemm_shapes))
+    std::cout << "gpu::ck_gemm: " << to_json_string(to_value(gemm_shapes))
               << std::endl;
 }
 m.replace_instruction(ins2, code_object, ins2->inputs());
......