gaoqiong / composable_kernel_ROCM
Commit 8f455e57
Authored Jan 31, 2025 by Andriy Roshchenko
Fix comments.
Parent: 94b8c629
Showing 1 changed file with 6 additions and 8 deletions
include/ck/utility/amd_xdlops.hpp
@@ -482,10 +482,9 @@ struct intrin_mfma_f32_32x32x64f8f6f4;
 /// @brief Performs a matrix fused multiply-accumulate operation on 32x32x64 submatrices for f8, f6,
 /// and f4 data types.
 ///
-/// @note Calls scaled version of the instruction as the original instruction is not supported on
-/// the backend. As per Matthew Arsenault: "Use the scaled versions. It's not a workaround, that is
-/// the intended use. There is a backend optimization to select to the unscaled if you use 0
-/// scales."
+/// @note Calls scaled version of the instruction as the original instruction is not supported in
+/// the backend. That is the intended use. There is a backend optimization to select the unscaled
+/// operation if the scale is 0.
 template <>
 struct intrin_mfma_f32_32x32x64f8f6f4<32, 32>
 {
@@ -590,10 +589,9 @@ struct intrin_mfma_f32_16x16x128f8f6f4;
 /// @brief Performs a matrix fused multiply-accumulate operation on 16x16x128 submatrices for f8f6f4
 /// data types.
 ///
-/// @note Calls scaled version of the instruction as the original instruction is not supported on
-/// the backend. As per Matthew Arsenault: "Use the scaled versions. It's not a workaround, that is
-/// the intended use. There is a backend optimization to select to the unscaled if you use 0
-/// scales."
+/// @note Calls scaled version of the instruction as the original instruction is not supported in
+/// the backend. That is the intended use. There is a backend optimization to select the unscaled
+/// operation if the scale is 0.
 template <>
 struct intrin_mfma_f32_16x16x128f8f6f4<16, 16>
 {
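The `@brief` comments describe a multiply-accumulate over 32x32x64 and 16x16x128 tiles. As a rough illustration of the arithmetic those shapes imply, here is a scalar reference sketch of D = A * B + C for one M x N x K tile. It is not the CK implementation: `reference_mfma` is a hypothetical name, the row-major layout is an assumption, plain `float` stands in for the f8/f6/f4 inputs, and the sketch ignores the scale operands, the fused rounding behaviour, and how the hardware distributes the tile across lanes of a wavefront.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical scalar reference, not the CK intrinsic wrapper.
// Computes D = A * B + C for one M x N x K tile in plain float.
// Assumed layout: A is M x K, B is K x N, C and D are M x N, all row-major.
template <std::size_t M, std::size_t N, std::size_t K>
std::vector<float> reference_mfma(const std::vector<float>& a,
                                  const std::vector<float>& b,
                                  const std::vector<float>& c)
{
    assert(a.size() == M * K && b.size() == K * N && c.size() == M * N);

    std::vector<float> d(c); // start from the accumulator tile
    for(std::size_t m = 0; m < M; ++m)
    {
        for(std::size_t n = 0; n < N; ++n)
        {
            float acc = c[m * N + n];
            // Reduce over the K dimension (64 or 128 for the shapes above).
            for(std::size_t k = 0; k < K; ++k)
            {
                acc += a[m * K + k] * b[k * N + n];
            }
            d[m * N + n] = acc;
        }
    }
    return d;
}
```

Instantiating it as `reference_mfma<32, 32, 64>` or `reference_mfma<16, 16, 128>` mirrors the two tile shapes named in the changed comments. A reference like this is mainly useful for checking results on the host, not as a model of the instruction itself.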