gaoqiong / composable_kernel_ROCM
Commit 6ff7fa94, authored Dec 17, 2024 by Po Yen Chen
Update num_splits heuristic
Parent: 337f073d
Showing 2 changed files with 2 additions and 2 deletions:
example/ck_tile/01_fmha/fmha_fwd.cpp (+1, -1)
example/ck_tile/01_fmha/fmha_fwd.hpp (+1, -1)
example/ck_tile/01_fmha/fmha_fwd.cpp
@@ -235,7 +235,7 @@ int override_num_splits_if_necessary(int batch,
     if(num_splits < 1 && p_drop == 0.0f)
     {
         return num_splits_heuristic(
-            batch * nhead * num_m_blocks, props.multiProcessorCount * 2, 32);
+            batch * nhead * num_m_blocks, props.multiProcessorCount * 2, 16);
     }
     return num_splits;
example/ck_tile/01_fmha/fmha_fwd.hpp
@@ -829,7 +829,7 @@ Int num_splits_heuristic(Int batch_nhead_mblocks, Int num_SMs, Int max_splits)
     std::vector<float> efficiency;
     efficiency.reserve(max_splits);
-    for(Int num_splits = 1; num_splits <= max_splits; num_splits++)
+    for(Int num_splits = 1; num_splits <= max_splits; num_splits *= 2)
     {
         float n_blocks = float(batch_nhead_mblocks * num_splits) / num_SMs;
         float eff = n_blocks / std::ceil(n_blocks);
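The effect of the loop change in num_splits_heuristic can be illustrated with a small standalone sketch. This is not code from the commit: split_efficiencies is a hypothetical helper that reproduces only the two visible lines of the loop body (n_blocks and eff) and reports the efficiency of each candidate. How the real function chooses among the candidates is not shown in the diff, so it is not modeled here.

```cpp
#include <cmath>
#include <utility>
#include <vector>

// Hypothetical helper, for illustration only (not part of the commit).
// Returns (num_splits, efficiency) for every candidate the updated loop
// visits: 1, 2, 4, ... up to max_splits.
std::vector<std::pair<int, float>>
split_efficiencies(int batch_nhead_mblocks, int num_SMs, int max_splits)
{
    std::vector<std::pair<int, float>> result;
    // Same loop header as the new revision: candidates are powers of two.
    for(int num_splits = 1; num_splits <= max_splits; num_splits *= 2)
    {
        // Work items per compute unit slot, as in the visible loop body.
        float n_blocks = float(batch_nhead_mblocks * num_splits) / num_SMs;
        // Fraction of the last "wave" that is occupied; 1.0 means no idle slots.
        float eff = n_blocks / std::ceil(n_blocks);
        result.emplace_back(num_splits, eff);
    }
    return result;
}
```

As a worked case, split_efficiencies(48, 128, 16) visits the five candidates 1, 2, 4, 8, 16 with efficiencies 0.375, 0.75, 0.75, 1.0, 1.0. With the previous loop header (num_splits++) and the previous cap of 32 passed from fmha_fwd.cpp, the same loop would have evaluated 32 candidates instead of 5.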