Commit ebab84b6 authored by Shaojie WANG, committed by GitHub

MNKO padding support on bmm+masking+scale+softmax+bmm+permute (#425)



* add lower triangle bmm

* init code for tile skipping

* functionality right with lower triangle mask

* add decoder lower triangular mask calculation

* use 7*13 group

* fix n2 compute error

* attention with lower triangle mask with tile skipping

* add template to distinguish masking kernel

* rename template and remove default template value

* remove lower triangle gemm reference struct

* add some comments on example

* add 10 instance for masking bmm + scale + softmax + bmm + permute kernels

* add test

* add test file

* add gtest for bmm masking scale softmax bmm permute

* clang-format

* fix compile error

* check left bottom corner for tile skipping

* fix error: check left bottom corner for tile skipping

* add k padding

* add test and instance for MNK padding

* passing a mask struct

* fix instances

* delete used comments

* format

Co-authored-by: danyao12 <yaodan@dc-smc-13.amd.com>
Co-authored-by: Chao Liu <chao.liu2@amd.com>
parent 9f7c1930
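
The "check left bottom corner for tile skipping" step in the commit message can be illustrated with a small host-side sketch. It assumes a causal (lower-triangular) mask in which element (m, n) is masked out when n > m. The names `IsMasked`, `CanSkipTile`, and `IsTileFullyMasked` are hypothetical and do not come from this commit; the actual kernel code may be organized differently.

```cpp
#include <cassert>

// Hypothetical helpers for illustration only; not taken from the commit.

// Assumed causal (lower-triangular) mask: element (m, n) is kept when n <= m.
constexpr bool IsMasked(int m, int n) { return n > m; }

// A tile covers rows [m0, m0 + m_per_block) and columns starting at n0.
// Its bottom-left element has the largest row and the smallest column, so it
// is the last element of the tile to fall under the mask. If even that element
// is masked, every element of the tile is masked and the tile can be skipped.
constexpr bool CanSkipTile(int m0, int n0, int m_per_block)
{
    return IsMasked(m0 + m_per_block - 1, n0);
}

// Brute-force reference: checks every element of the tile individually.
bool IsTileFullyMasked(int m0, int n0, int m_per_block, int n_per_block)
{
    assert(m_per_block > 0 && n_per_block > 0);
    for(int m = m0; m < m0 + m_per_block; ++m)
    {
        for(int n = n0; n < n0 + n_per_block; ++n)
        {
            if(!IsMasked(m, n))
            {
                return false; // at least one element survives the mask
            }
        }
    }
    return true;
}
```

Under these assumptions, a single corner comparison replaces a per-element scan, which is why one check per tile suffices before deciding whether to run the GEMM for that tile.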
@@ -160,7 +160,8 @@ struct DeviceInstanceWrapper_TNTT_FP16_M128_N128_K32_O128
         1,              // CShuffleMXdlPerWavePerShuffle
         2,              // CShuffleNXdlPerWavePerShuffle
         S<1, 32, 1, 8>, // CShuffleBlockTransferClusterLengths_MBlock_MPerBlock_NBlock_NPerBlock
-        8>;             // CShuffleBlockTransferScalarPerVector_NPerBlock
+        8,              // CShuffleBlockTransferScalarPerVector_NPerBlock
+        false>;
     bool IsSupported(int M, int N, int K, int O)
     {
......