gaoqiong / composable_kernel_ROCM / Commits / a53d4d9e
Commit a53d4d9e, authored Apr 23, 2024 by Adam Osewski
Small refinements.
Parent: 98def248
Showing 1 changed file with 1 addition and 5 deletions
profiler/include/profiler/profile_grouped_gemm_multiple_d_splitk_impl.hpp
...
...
@@ -227,7 +227,6 @@ bool profile_ggemm_multid_splitk(int do_verification,
 auto argument_ptr = gemm_ptr->MakeArgumentPointer(
     p_a, p_b, p_ds, p_c, gemm_descs, a_element_op, b_element_op, c_element_op);
 auto invoker_ptr = gemm_ptr->MakeInvokerPointer();
 DeviceMem gemm_arg_dev_mem(gptr->GetDeviceKernelArgSize(argument_ptr.get()));
...
...
@@ -263,7 +262,7 @@ bool profile_ggemm_multid_splitk(int do_verification,
 for(std::size_t i = 0; i < gemm_descs.size(); i++)
     c_device_buf[i]->SetZero();
-invoker_ptr->Run(argument_ptr.get(), StreamConfig{nullptr, false});
+invoker_ptr->Run(argument_ptr.get(), StreamConfig{nullptr, false, 1});
 if(do_verification)
 {
...
...
@@ -308,12 +307,10 @@ bool profile_ggemm_multid_splitk(int do_verification,
         << std::endl;
     pass = pass && instance_pass;
-    // std::cout << ">>>>>CPU verification end!" << std::endl;
 }
 if(time_kernel)
 {
-    // std::cout << ">>>>>GPU time profiling start!" << std::endl;
     float avg_time = invoker_ptr->Run(
         argument_ptr.get(), StreamConfig{nullptr, time_kernel, 0, warmup_iter, kernel_iter});
...
...
@@ -342,7 +339,6 @@ bool profile_ggemm_multid_splitk(int do_verification,
         best_gb_per_sec = gb_per_sec;
         best_kbatch     = kbatch_curr;
     }
-    // std::cout << ">>>>>GPU time profiling end!" << std::endl;
     }
 }
 else
...
...