gaoqiong / composable_kernel_ROCM
Commit 6c4e10da, authored Dec 23, 2024 by Po Yen Chen
Only check incomplete split in first&last iterations
Parent: e86da0e9
Showing 1 changed file with 23 additions and 19 deletions (whitespace changes hidden)
include/ck_tile/ops/fmha/pipeline/block_fmha_fwd_splitkv_pipeline_qr_ks_vs_async.hpp
@@ -473,21 +473,24 @@ struct BlockFmhaFwdSplitKVPipelineQRKSVSAsync
             }
             move_tile_window(bias_dram_window, {0, kN0});
-            /// TODO: only check in first/last iteration without increasing code size
+            // only check in first/last iterations
             if constexpr(kHasUnevenSplits)
+            {
+                if(1 < num_splits && (i_total_loops == 0 || i_total_loops == num_total_loop - 1))
                 {
                     const auto k_origin = k_page_block_navigator.to_global_window_origin(
                         i_page_block_k, k_dram_block_window.get_window_origin());
                     set_tile_if(
                         s_acc,
                         -numeric<SMPLComputeDataType>::infinity(),
                         [&,
                          physical_seqlen_k_start_ = physical_seqlen_k_start,
                          physical_seqlen_k_end_   = physical_seqlen_k_end](auto tile_idx) {
                             const auto col = k_origin.at(number<0>{}) + tile_idx.at(number<1>{});
                             if constexpr(kIsPagedKV)
                             {
                                 return col < physical_seqlen_k_start_ || physical_seqlen_k_end_ <= col;
                             }
                             else
                             {
...
@@ -495,6 +498,7 @@ struct BlockFmhaFwdSplitKVPipelineQRKSVSAsync
                             }
                         });
                 }
+            }
             if constexpr(kPadSeqLenK || FmhaMask::IsMasking)
             {
...
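The guard added in this commit rests on the idea that out-of-range key columns can only appear in the first and last tiles of a split. The following is a minimal host-side sketch of that idea, not the pipeline code. It assumes the loop walks tile-aligned windows of width `kN0` that just cover the half-open range `[k_start, k_end)`. `SplitRange`, `is_outside_split`, and `masked_columns` are hypothetical names introduced for illustration; only the predicate and the loop guard mirror expressions in the diff.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for physical_seqlen_k_start / physical_seqlen_k_end.
struct SplitRange
{
    std::int32_t k_start; // first valid key column (inclusive)
    std::int32_t k_end;   // one past the last valid key column (exclusive)
};

// Same predicate as the kIsPagedKV branch in the diff: true means "mask this column".
inline bool is_outside_split(std::int32_t col, const SplitRange& r)
{
    return col < r.k_start || r.k_end <= col;
}

// Walk the tile-aligned windows covering [k_start, k_end) and collect the masked columns.
// With boundary_only == true the predicate is evaluated only under the guard from the diff;
// with boundary_only == false it is evaluated in every iteration.
std::vector<std::int32_t> masked_columns(const SplitRange& r,
                                         std::int32_t kN0,
                                         std::int32_t num_splits,
                                         bool boundary_only)
{
    assert(0 <= r.k_start && r.k_start < r.k_end && 0 < kN0);

    std::vector<std::int32_t> masked;
    const std::int32_t first_tile     = r.k_start / kN0;     // tile containing k_start
    const std::int32_t last_tile      = (r.k_end - 1) / kN0; // tile containing k_end - 1
    const std::int32_t num_total_loop = last_tile - first_tile + 1;

    for(std::int32_t i_total_loops = 0; i_total_loops < num_total_loop; ++i_total_loops)
    {
        const bool is_boundary =
            1 < num_splits && (i_total_loops == 0 || i_total_loops == num_total_loop - 1);
        if(boundary_only && !is_boundary)
        {
            continue; // interior tile: skip the per-element check entirely
        }

        const std::int32_t origin = (first_tile + i_total_loops) * kN0;
        for(std::int32_t j = 0; j < kN0; ++j)
        {
            if(is_outside_split(origin + j, r))
            {
                masked.push_back(origin + j);
            }
        }
    }
    return masked;
}
```

Under these assumptions, every interior tile lies wholly inside `[k_start, k_end)`, so skipping the check there does not change which columns get masked. For example, `masked_columns({5, 27}, 8, 2, true)` and `masked_columns({5, 27}, 8, 2, false)` return the same list. The sketch only compares the two modes for `num_splits > 1`; with a single split the guard skips the check altogether.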