Repository: gaoqiong/composable_kernel
Commit b4c2a6bb, authored Jun 19, 2023 by Po-Yen, Chen
Put partial ds_read reading logics in previous iteration
Parent: ecef4987
Showing 1 changed file with 9 additions and 4 deletions
include/ck/tensor_operation/gpu/grid/gridwise_gemm_pipeline_v2.hpp
@@ -77,14 +77,15 @@ struct GridwiseGemmPipeline_v2
 {
     index_t i = 0;
+    block_sync_lds();
+    blockwise_gemm.PrepareRun(a_block_buf);
     do
     {
-        __builtin_amdgcn_iglp_opt(2);
+        // __builtin_amdgcn_iglp_opt(2);
-        block_sync_lds();
         // GEMM i
-        blockwise_gemm.PrepareRun(a_block_buf);
         blockwise_gemm.Run(b_block_buf, c_thread_buf);
         block_sync_lds();
...
@@ -103,6 +104,10 @@ struct GridwiseGemmPipeline_v2
         // global read i + 2
         b_blockwise_copy.RunRead(b_grid_desc, b_grid_buf);
+        block_sync_lds();
+        blockwise_gemm.PrepareRun(a_block_buf);
         ++i;
     } while(i < (num_loop - 2));
 }
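Reading the diff, the change appears to be a loop rotation: `PrepareRun` is issued once before the loop and then again at the end of each iteration, so the staging work for GEMM i + 1 is already done when the next iteration starts. The sketch below illustrates only that reordering on the host. `ToyGemm`, `run_original`, and `run_rotated` are hypothetical names invented for this illustration, not part of composable_kernel, and the sketch does not model LDS synchronization, global reads, or the `num_loop - 2` loop bound.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a blockwise GEMM (not the composable_kernel type):
// PrepareRun stages the A operand, Run consumes the staged value together with B.
struct ToyGemm
{
    int staged_a = 0;
    void PrepareRun(int a) { staged_a = a; }
    void Run(int b, long& c) const { c += static_cast<long>(staged_a) * b; }
};

// Original shape: every iteration stages A and then immediately runs.
long run_original(const std::vector<int>& a, const std::vector<int>& b)
{
    assert(a.size() == b.size());
    ToyGemm gemm;
    long c = 0;
    for(std::size_t i = 0; i < a.size(); ++i)
    {
        gemm.PrepareRun(a[i]);
        gemm.Run(b[i], c);
    }
    return c;
}

// Rotated shape: stage A for iteration 0 before the loop, then stage A for
// iteration i + 1 at the end of iteration i.
long run_rotated(const std::vector<int>& a, const std::vector<int>& b)
{
    assert(a.size() == b.size());
    if(a.empty())
        return 0;

    ToyGemm gemm;
    long c = 0;
    gemm.PrepareRun(a[0]); // prologue: staging for the first iteration
    for(std::size_t i = 0; i < a.size(); ++i)
    {
        gemm.Run(b[i], c);
        if(i + 1 < a.size())
            gemm.PrepareRun(a[i + 1]); // staging for the next iteration
    }
    return c;
}
```

Both functions compute the same result; only the position of the staging step differs. On a GPU, the rotated order may give the scheduler more room to overlap the LDS reads with other work in the iteration, though whether that helps depends on the kernel and the hardware.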