"driver/vscode:/vscode.git/clone" did not exist on "d3df5eb192ef6348360a0e839f10e441c4393663"
- 21 Apr, 2023 1 commit
-
-
Haocong WANG authored
* [Navi3x] Fix Gridwise_multiple_d operation (#649) * Add CMake Option "USE_OPT_NAVI3X" * fix bug * standardize docs (#655) * Separate bibtex requirement from rocm-docs-core (#656) * separate bibtex requirement from rocm-docs-core * point requirements to source rocm-docs-core repo * Add CMake Option "USE_OPT_NAVI3X" (#647) * Add CMake Option "USE_OPT_NAVI3X" * remove navi3x opt compile option from cmake script * Conv + quantization + tanh (#645) * Rename file. Prepare to support another activation * Add comment for quantization * Extract out_elementop * Add tanh example * Add conv + bias + tanh quantization instance * Add missing parameter * Refine cmake * Add external api and client example * Extract variable in example * Fix the comment --------- Co-authored-by:
zjing14 <zhangjing14@gmail.com> * Add a denorm test fix (#603) * Add type_convert implementations for bf16 * Add the fix for conv_fwd * Add the fix for conv_bwd_data * Add the fix for conv_bwd_weight * Format * Format * Another format * Add a macro to use workaround on MI200 only * Format --------- Co-authored-by:
Rosty Geyyer <rosty.geyyer@amd.com> Co-authored-by:
zjing14 <zhangjing14@gmail.com> * simplify karg in device/grid of split-k op (#644) * simplify karg in device/grid split-k op * fix mk_kn_mn instances * add more instances * use name from tensor layout * fix 3rd dword of buffer source descriptor (#659) * add fp64 instances (#658) Co-authored-by:
root <root@ctr-ubbsmc15.amd.com> * Issue #666: Revert "simplify karg in device/grid of split-k op (#644)" (#665) This reverts commit bb5530af . * Groupnorm + swish external api (#668) * Rename to proper naming * Add example of groupnorm + swish * Extract duplicate code in example * Add groupnorm + swish instances * Ractor instance generation, split into multiple cpp file * Add external api and client example * Refine profiler message * Use ck math version of exp * Refine problem size in example * Add host version of exp * add a marco to turn on/off denorm fix (off by default) (#673) * add a marco to turn off denorm fix by default * expose the marco --------- Co-authored-by:
root <root@ctr-ubbsmc15.amd.com> * fixed quant example (#672) Co-authored-by:
root <root@ctr-ubbsmc15.amd.com> * Add dependabot config and pin rocm-docs-core (#663) * [gtest] suppress unsafe buffer warn (#670) ref: https://github.com/ROCmSoftwarePlatform/MIOpen/pull/1912 * Add memory index guard in wmma device ops (#667) * Add more macros to turn on/off denorm fix (#678) Co-authored-by:
Rosty Geyyer <rosty.geyyer@amd.com> * Fix a typo (#676) * Add (#677) * Allow using ROCm release candidate compilers. (#679) * enable use of rocm5.5 release candidate 4 * upgrade to ROCM5.5 RC5 * try fix the PUB_KEY error, remove the cmake-data package * upgrade to latest cmake version * use private dockerhub repo for rocm5.5 rc5 * add missing bracket * Disable SkipLDS & Align AIT api * Update dependabot config (#682) Co-authored-by:
samjwu <samjwu@users.noreply.github.com> * update attn api * solve type_convert bug + enable --------- Co-authored-by:
Sam Wu <sjwu@ualberta.ca> Co-authored-by:
Sam Wu <sam.wu2@amd.com> Co-authored-by:
rocking5566 <ChunYu.Lai@amd.com> Co-authored-by:
zjing14 <zhangjing14@gmail.com> Co-authored-by:
Rostyslav Geyyer <46627076+geyyer@users.noreply.github.com> Co-authored-by:
Rosty Geyyer <rosty.geyyer@amd.com> Co-authored-by:
carlushuang <carlus.huang@amd.com> Co-authored-by:
root <root@ctr-ubbsmc15.amd.com> Co-authored-by:
Jun Liu <Liu.Jun@amd.com> Co-authored-by:
Illia Silin <98187287+illsilin@users.noreply.github.com> Co-authored-by:
samjwu <samjwu@users.noreply.github.com> Co-authored-by:
haocwang <Haocong.WANG@amd.com>
-
- 25 Jun, 2022 2 commits
-
-
Chao Liu authored
-
Chao Liu authored
* ad gelu and fast_gelu * added GeLU and fast GeLU * clean up * add gemm+fastgelu example * add gemm+gelu instances * update profiler * clean up * clean up * adding gemm+bias+activation * clean * adding bias * clean * adding gemm multiple d * debugging * add gemm bias add fastgelu * rename, clean * refactoring; add readme * refactor * refactor * refactor * refactor * refactor * refactor * fix * fix * update example * update example * rename * update example * add ckProfiler * clean * clean * clean * clean * add client app example * update readme * delete obselete files * remove old client app * delete old file * cleaning * clean * remove half * fix header path * fix header path * fix header path * fix header path * fix header path * fix header path for all examples * fix header path * fix header path * fix header path * fix header path * fix header path * fix header path * fix header path * fix header path * fix header path * revert client app example * clean build * fix build * temporary disable client test on Jenkins * clean * clean * clean
-
- 02 Jun, 2022 1 commit
-
-
Qianfeng authored
* Use the unified naming for math functions on host and HIP kernel * Corresponding change/simplification in reduction host/profiler/examples due to unified math functions renaming * Renaming GetReductionZeroVal() to GetIdentityValue() * Tiny renaming in profile_reduce_impl.hpp * More renaming in profile_reduce_impl.hpp * Replace zeroVal by identiyVal * Remove ck_ prefix in the naming of ck::math provided functions
-
- 26 May, 2022 1 commit
-
-
Qianfeng authored
* Add example for computing LayerNorm mean and meansquare * Refactor the pool2d_fwd example and add example for float type testing * Revert "Add example for computing LayerNorm mean and meansquare" This reverts commit df52e6f9d897b00c981baa48f291450bcd60925d. * Tiny fix in pool2d_fwd_common.hpp
-