gaoqiong / MIGraphX
Commit 4ffaf3c7, authored Mar 08, 2019 by Shucai Xiao
merge changes to solve the slowdown of gpu::gemm
Parents: 770c7d27, afbfbdc0
Showing 1 changed file with 42 additions and 20 deletions
src/targets/gpu/gemm.cpp
@@ -127,7 +127,11 @@ argument miopen_gemm::compute(context& ctx,
         auto alpha_r = to_rocblas_type(as(op.alpha));
         auto beta_r = to_rocblas_type(as(op.beta));
         auto to_pointer = [&](auto&& arg) { return to_rocblas_type(as.from(arg.data())); };
-        generic_rocblas_batched_gemm(as,
+        // call the strided implementation only if there are multiple matrices
+        if(batch_num > 1)
+        {
+            generic_rocblas_batched_gemm(
+                as,
                 ctx.get_stream().get_rocblas(),
                 transb ? rocblas_operation_transpose : rocblas_operation_none,
                 transa ? rocblas_operation_transpose : rocblas_operation_none,
...
@@ -142,11 +146,29 @@ argument miopen_gemm::compute(context& ctx,
                 lda,
                 m * k,
                 &beta_r,
-                is_3inputs ? to_pointer(args[3])
-                           : to_pointer(args[2]),
+                to_pointer(args[2]),
                 ldc,
                 m * n,
                 batch_num);
+        }
+        else
+        {
+            generic_rocblas_gemm(as,
+                ctx.get_stream().get_rocblas(),
+                transb ? rocblas_operation_transpose : rocblas_operation_none,
+                transa ? rocblas_operation_transpose : rocblas_operation_none,
+                n,
+                m,
+                k,
+                &alpha_r,
+                to_pointer(args[1]),
+                ldb,
+                to_pointer(args[0]),
+                lda,
+                &beta_r,
+                to_pointer(args[2]),
+                ldc);
+        }
     });
     return (is_3inputs ? args[3] : args[2]);
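
The dispatch in this change can be illustrated on the CPU. What follows is a minimal sketch under stated assumptions, not the MIGraphX implementation: `ref_gemm`, `ref_strided_batched_gemm`, and `row_major_gemm` are hypothetical stand-ins for the rocBLAS wrappers in the diff, transposes are left out, and the stride `k * n` for the second operand is an assumption because that argument sits in the elided part of the hunk. The sketch shows two things visible in the diff: the strided batched path is taken only when `batch_num > 1`, and the operands are passed in swapped order (B before A, `n` before `m`), presumably so that a column-major GEMM yields a row-major result.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical CPU stand-in for a column-major GEMM without transposes:
// C (m x n) = alpha * A (m x k) * B (k x n) + beta * C.
void ref_gemm(int m, int n, int k, float alpha,
              const float* a, int lda,
              const float* b, int ldb,
              float beta, float* c, int ldc)
{
    for(int j = 0; j < n; ++j)
    {
        for(int i = 0; i < m; ++i)
        {
            float acc = 0.0f;
            for(int p = 0; p < k; ++p)
                acc += a[i + p * lda] * b[p + j * ldb];
            c[i + j * ldc] = alpha * acc + beta * c[i + j * ldc];
        }
    }
}

// Hypothetical stand-in for a strided batched GEMM: the same product is
// repeated batch times, advancing each pointer by its stride.
void ref_strided_batched_gemm(int m, int n, int k, float alpha,
                              const float* a, int lda, std::size_t stride_a,
                              const float* b, int ldb, std::size_t stride_b,
                              float beta, float* c, int ldc, std::size_t stride_c,
                              int batch)
{
    for(int i = 0; i < batch; ++i)
        ref_gemm(m, n, k, alpha,
                 a + i * stride_a, lda,
                 b + i * stride_b, ldb,
                 beta, c + i * stride_c, ldc);
}

// Row-major C (m x n) = alpha * A (m x k) * B (k x n) + beta * C.
// A row-major matrix is the transpose of the same buffer read column-major,
// so the call computes C^T = B^T * A^T by passing B first and swapping m and n.
void row_major_gemm(int m, int n, int k, float alpha,
                    const float* a, const float* b,
                    float beta, float* c, int batch_num)
{
    assert(batch_num >= 1);
    const std::size_t sm = m, sn = n, sk = k;
    if(batch_num > 1)
    {
        // Several matrices: use the strided batched routine.
        ref_strided_batched_gemm(n, m, k, alpha,
                                 b, n, sk * sn, // stride of B is an assumption
                                 a, k, sm * sk,
                                 beta, c, n, sm * sn,
                                 batch_num);
    }
    else
    {
        // Single matrix: call plain GEMM and skip the batching machinery.
        ref_gemm(n, m, k, alpha, b, n, a, k, beta, c, n);
    }
}
```

Both branches should produce the same values for the first matrix of a batch; only the routine that is called differs, which is the point of the change.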