gaoqiong / composable_kernel
Commit edbcaedd, authored Jul 15, 2022 by Anthony Chang

tighten up block sync

parent c097eb34
Showing 1 changed file with 3 additions and 3 deletions

include/ck/tensor_operation/gpu/grid/gridwise_softmax.hpp (+3 -3)
@@ -252,7 +252,8 @@ struct GridwiseSoftmax_mk_to_mk
 static_for<0, MThreadSliceSize, 1>{}([&](auto I) {
     BlockwiseMaxReduce::Reduce(reduce_work_buf, max_value_buf(I));
-    block_sync_lds();
+    if(I < MThreadSliceSize - 1)
+        block_sync_lds(); // wait for reading being complete before writing to LDS
 });
 threadwise_src_load.MoveSrcSliceWindow(in_grid_desc_m_k, in_thread_copy_bwd_step);
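
The first hunk skips the barrier after the last max reduction, so the loop issues `MThreadSliceSize - 1` barriers instead of `MThreadSliceSize`. The following is a minimal host-side sketch of that arithmetic, not the kernel code: `syncs_every_iteration` and `syncs_skip_last` are hypothetical names, and a plain counter stands in for `block_sync_lds()`.

```cpp
// Host-side model only: a counter replaces the device barrier.
// Both function names are hypothetical and do not exist in composable_kernel.

// Loop shape before the change: one barrier after every reduction.
constexpr int syncs_every_iteration(int m_thread_slice_size)
{
    int syncs = 0;
    for(int i = 0; i < m_thread_slice_size; ++i)
    {
        // BlockwiseMaxReduce::Reduce(...) would run here.
        ++syncs; // stands in for block_sync_lds()
    }
    return syncs;
}

// Loop shape after the change: the barrier only separates consecutive
// reductions, so the final iteration does not issue one.
constexpr int syncs_skip_last(int m_thread_slice_size)
{
    int syncs = 0;
    for(int i = 0; i < m_thread_slice_size; ++i)
    {
        // BlockwiseMaxReduce::Reduce(...) would run here.
        if(i < m_thread_slice_size - 1)
            ++syncs; // stands in for block_sync_lds()
    }
    return syncs;
}

// Checked at compile time.
static_assert(syncs_every_iteration(4) == 4, "one barrier per reduction");
static_assert(syncs_skip_last(4) == 3, "last barrier is skipped");
static_assert(syncs_skip_last(1) == 0, "a single reduction needs no barrier");
```

A plausible reading of the change is that the barrier is only needed between two uses of `reduce_work_buf`, so the one after the final reduction can be dropped as long as the next LDS write is guarded by its own barrier. The diff itself does not state this.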
@@ -305,10 +306,9 @@ struct GridwiseSoftmax_mk_to_mk
     reducedTiles++;
 } while(reducedTiles < num_k_block_tile_iteration);
-block_sync_lds(); // wait for reading being complete before writing to LDS
 static_for<0, MThreadSliceSize, 1>{}([&](auto I) {
+    block_sync_lds(); // wait for reading being complete before writing to LDS
     BlockwiseSumReduce::Reduce(reduce_work_buf, accu_value_buf(I));
-    block_sync_lds();
 });
 threadwise_src_load.MoveSrcSliceWindow(in_grid_desc_m_k, in_thread_copy_fwd_step);
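
The second hunk moves the barrier from after each sum reduction (plus one hoisted above the loop) to before each reduction. The sketch below models both orderings as an event trace on the host, assuming `'S'` marks a `block_sync_lds()` call and `'R'` marks a `BlockwiseSumReduce::Reduce` call. `Trace`, `sum_reduce_old`, `sum_reduce_new`, and the helper functions are hypothetical names for illustration only.

```cpp
// Host-side model only: records the order of barrier ('S') and reduce ('R')
// events. None of these names exist in composable_kernel.
struct Trace
{
    char events[64] = {};
    int size        = 0;
    constexpr void push(char e) { events[size++] = e; }
};

// Ordering before the change: one barrier above the loop, then a barrier
// after every reduction.
constexpr Trace sum_reduce_old(int m_thread_slice_size)
{
    Trace t{};
    t.push('S');
    for(int i = 0; i < m_thread_slice_size; ++i)
    {
        t.push('R');
        t.push('S');
    }
    return t;
}

// Ordering after the change: a barrier immediately before every reduction.
constexpr Trace sum_reduce_new(int m_thread_slice_size)
{
    Trace t{};
    for(int i = 0; i < m_thread_slice_size; ++i)
    {
        t.push('S');
        t.push('R');
    }
    return t;
}

// Number of occurrences of event e in the trace.
constexpr int count_events(const Trace& t, char e)
{
    int n = 0;
    for(int i = 0; i < t.size; ++i)
        if(t.events[i] == e)
            ++n;
    return n;
}

// True if each reduction is directly preceded by a barrier.
constexpr bool every_reduce_follows_sync(const Trace& t)
{
    for(int i = 0; i < t.size; ++i)
        if(t.events[i] == 'R' && (i == 0 || t.events[i - 1] != 'S'))
            return false;
    return true;
}

static_assert(count_events(sum_reduce_old(3), 'S') == 4, "old: M + 1 barriers");
static_assert(count_events(sum_reduce_new(3), 'S') == 3, "new: M barriers");
static_assert(every_reduce_follows_sync(sum_reduce_new(3)), "guard is kept");
```

In this model both orderings keep a barrier directly in front of every reduction. The new ordering drops only the trailing barrier after the last one, which matches the two deletions and one addition shown in the hunk.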