gaoqiong / composable_kernel
Commit 9a25afe4 (Unverified)
Authored Nov 23, 2022 by rocking5566
Committed by GitHub, Nov 23, 2022
Merge branch 'develop' into gemm_layernorm_welford
Parents: c1568902, 43a889b7
Changes: 45
Showing 5 changed files with 449 additions and 5 deletions (+449 -5)
library/src/tensor_operation_instance/gpu/gemm_fastgelu/device_gemm_fastgelu_xdl_c_shuffle_f16_f16_f16_mk_nk_mn_instance.cpp (+100 -0)
profiler/include/profile_batched_gemm_softmax_gemm_permute_impl.hpp (+19 -2)
test/batched_gemm_softmax_gemm_permute/CMakeLists.txt (+4 -1)
test/batched_gemm_softmax_gemm_permute/test_batched_gemm_softmax_gemm_permute_bf16.cpp (+182 -0)
test/batched_gemm_softmax_gemm_permute/test_batched_gemm_softmax_gemm_permute_util.hpp (+144 -2)
library/src/tensor_operation_instance/gpu/gemm_fastgelu/device_gemm_fastgelu_xdl_c_shuffle_f16_f16_f16_mk_nk_mn_instance.cpp
0 → 100644
This diff is collapsed.
profiler/include/profile_batched_gemm_softmax_gemm_permute_impl.hpp
@@ -309,8 +309,25 @@ bool profile_batched_gemm_softmax_gemm_permute_impl(bool do_verification,
         {
             c_device_buf.FromDevice(c_gs_ms_os_device_result.mData.data());
 
-            pass = pass & ck::utils::check_err(c_gs_ms_os_device_result,
-                                               c_gs_ms_os_host_result);
+            // default absolute error and relative error is 0.001
+            double rtol = 1e-3;
+            double atol = 1e-3;
+
+            // when BF16 is taken, set absolute error and relative error to 0.01
+            if(std::is_same_v<ADataType, ck::bhalf_t> &&
+               std::is_same_v<B0DataType, ck::bhalf_t> &&
+               std::is_same_v<B1DataType, ck::bhalf_t> &&
+               std::is_same_v<CDataType, ck::bhalf_t>)
+            {
+                rtol = 1e-2;
+                atol = 1e-2;
+            }
+
+            pass = pass & ck::utils::check_err(c_gs_ms_os_device_result,
+                                               c_gs_ms_os_host_result,
+                                               "Error: Incorrect results!",
+                                               rtol,
+                                               atol);
 
             if(do_log)
             {
...
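The hunk above widens the verification tolerances from 1e-3 to 1e-2 only when all four element types are BF16. The following is a minimal, self-contained sketch of that selection logic, not the profiler's actual code: `bhalf_stub`, `Tolerance`, `select_tolerance`, and `all_close` are hypothetical names introduced here for illustration, and the comparison rule in `all_close` is an assumed `atol + rtol * |ref|` bound rather than the documented behavior of `ck::utils::check_err`.

```cpp
#include <cmath>
#include <cstddef>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for ck::bhalf_t so the sketch compiles without the library.
struct bhalf_stub
{
    unsigned short bits;
};

// Hypothetical pair of relative and absolute tolerances.
struct Tolerance
{
    double rtol;
    double atol;
};

// Choose tolerances from the element types: 1e-3 by default,
// 1e-2 only when A, B0, B1 and C are all the BF16 type.
template <typename A, typename B0, typename B1, typename C, typename BF16 = bhalf_stub>
constexpr Tolerance select_tolerance()
{
    if constexpr(std::is_same_v<A, BF16> && std::is_same_v<B0, BF16> &&
                 std::is_same_v<B1, BF16> && std::is_same_v<C, BF16>)
    {
        return {1e-2, 1e-2};
    }
    else
    {
        return {1e-3, 1e-3};
    }
}

// Simplified element-wise check (assumed rule, NOT ck::utils::check_err):
// an element passes when |out - ref| <= atol + rtol * |ref|.
inline bool all_close(const std::vector<double>& out,
                      const std::vector<double>& ref,
                      Tolerance tol)
{
    if(out.size() != ref.size())
        return false;

    for(std::size_t i = 0; i < out.size(); ++i)
    {
        if(std::abs(out[i] - ref[i]) > tol.atol + tol.rtol * std::abs(ref[i]))
            return false;
    }
    return true;
}
```

Under these assumptions, an error of 0.005 on a reference value near 1.0 passes with the BF16 tolerances but fails with the defaults, which is the kind of difference the looser BF16 bounds are presumably meant to absorb.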
test/batched_gemm_softmax_gemm_permute/CMakeLists.txt
 add_custom_target(test_batched_gemm_softmax_gemm_permute)
 
 add_gtest_executable(test_batched_gemm_softmax_gemm_permute_fp16 test_batched_gemm_softmax_gemm_permute_fp16.cpp)
+add_gtest_executable(test_batched_gemm_softmax_gemm_permute_bf16 test_batched_gemm_softmax_gemm_permute_bf16.cpp)
 target_link_libraries(test_batched_gemm_softmax_gemm_permute_fp16 PRIVATE utility device_batched_gemm_softmax_gemm_permute_instance)
+target_link_libraries(test_batched_gemm_softmax_gemm_permute_bf16 PRIVATE utility device_batched_gemm_softmax_gemm_permute_instance)
 add_dependencies(test_batched_gemm_softmax_gemm_permute test_batched_gemm_softmax_gemm_permute_fp16)
+add_dependencies(test_batched_gemm_softmax_gemm_permute test_batched_gemm_softmax_gemm_permute_bf16)
\ No newline at end of file
test/batched_gemm_softmax_gemm_permute/test_batched_gemm_softmax_gemm_permute_bf16.cpp
0 → 100644
This diff is collapsed.
test/batched_gemm_softmax_gemm_permute/test_batched_gemm_softmax_gemm_permute_util.hpp
This diff is collapsed.