gaoqiong / composable_kernel
Commit 04edd5fa
Authored Aug 22, 2022 by Rostyslav Geyyer; committed by GitHub, Aug 22, 2022

Merge branch 'develop' into lwpck-359_int4

Parents: 1dd03dda, c366de55

Showing 3 changed files with 3 additions and 8 deletions:

example/16_gemm_multi_d_multi_reduces/CMakeLists.txt (+1 -5)
example/16_gemm_multi_d_multi_reduces/gemm_max_xdl_fp16.cpp (+1 -1)
include/ck/tensor_operation/gpu/grid/gridwise_gemm_multiple_d_multiple_r_xdl_cshuffle.hpp (+1 -2)

example/16_gemm_multi_d_multi_reduces/CMakeLists.txt

 add_example_executable(example_gemm_add_add_mean_meansquare_xdl_fp16 gemm_add_add_mean_meansquare_xdl_fp16.cpp)
 add_example_executable(example_gemm_mean_meansquare_xdl_fp16 gemm_mean_meansquare_xdl_fp16.cpp)
-#exclude GEMM+max exampe from testing, since there is random failure on gfx908
-#https://github.com/ROCmSoftwarePlatform/composable_kernel/issues/358
-#TODO: fix the failure and re-enable this test
-add_example_executable_no_testing(example_gemm_max_xdl_fp16 gemm_max_xdl_fp16.cpp)
+add_example_executable(example_gemm_max_xdl_fp16 gemm_max_xdl_fp16.cpp)

example/16_gemm_multi_d_multi_reduces/gemm_max_xdl_fp16.cpp

...
@@ -211,7 +211,7 @@ int main()
         r0_device_buf.FromDevice(r0_m.mData.data());
         pass = ck::utils::check_err(
-            e_m_n.mData, e_m_n_host.mData, "Error: Incorrect results c", 1e-2, 1e-2);
+            e_m_n.mData, e_m_n_host.mData, "Error: Incorrect results e", 1e-2, 1e-2);
         pass &= ck::utils::check_err(r0_m.mData, r0_m_host.mData, "Error: Incorrect results d0", 1e-2, 1e-2);
     }
...

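Both check_err calls in the hunk above pass 1e-2 twice after the message string. As a rough illustration of what a combined relative/absolute tolerance check of that kind can look like, here is a minimal standalone sketch. It is not the ck::utils::check_err implementation. The helper names within_tolerance and all_within_tolerance, the parameter names rtol and atol, and the reading of the two 1e-2 arguments as a relative and an absolute tolerance are all assumptions made for illustration.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of composable_kernel): true when `out` lies
// within atol + rtol * |ref| of the reference value `ref`.
inline bool within_tolerance(double out, double ref, double rtol, double atol)
{
    return std::fabs(out - ref) <= atol + rtol * std::fabs(ref);
}

// Hypothetical helper: element-wise comparison of a device result against a
// host reference. The default tolerances mirror the 1e-2 values in the diff,
// assuming they mean (rtol, atol).
inline bool all_within_tolerance(const std::vector<float>& out,
                                 const std::vector<float>& ref,
                                 double rtol = 1e-2,
                                 double atol = 1e-2)
{
    // A size mismatch is treated as a failure rather than compared partially.
    if(out.size() != ref.size())
        return false;

    for(std::size_t i = 0; i < out.size(); ++i)
    {
        if(!within_tolerance(out[i], ref[i], rtol, atol))
            return false;
    }
    return true;
}
```

Under these assumptions, with both tolerances set to 1e-2, a reference value of 100.0 accepts any output within 1.01 of it, so an output of 102.0 would be reported as a mismatch.
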
include/ck/tensor_operation/gpu/grid/gridwise_gemm_multiple_d_multiple_r_xdl_cshuffle.hpp

...
@@ -776,8 +776,7 @@ struct GridwiseGemmMultipleDMultipleR_k0mk1_k0nk1_mn_xdl_cshuffle_v1
             static_for<0, num_access, 1>{}([&](auto access_id) {
                 // make sure it's safe to read from LDS
-                if constexpr(access_id > 0)
-                    block_sync_lds();
+                block_sync_lds();
                 // each thread shuffle data from VGPR to LDS
                 c_thread_copy_vgpr_to_lds.Run(c_thread_desc_m0_n0_m1_n1_m2_m3_m4_n2,
...