gaoqiong / composable_kernel
Commit a4b52461, authored Jul 13, 2019 by Chao Liu
adding implicit GEMM v4r2
parent e87aa851
Showing 1 changed file with 7 additions and 3 deletions

driver/include/device_convolution_implicit_gemm_v4r2_nchw_kcyx_nkhw.hpp
...
@@ -55,13 +55,13 @@ void device_convolution_implicit_gemm_v4r2_nchw_kcyx_nkhw(InDesc,
 #if 1
     // 1x1 filter, 8x8 image
-    constexpr index_t N1 = 2;
+    constexpr index_t N0 = 1;
     constexpr index_t N2 = 1;
-    constexpr index_t Ho1 = 8;
+    constexpr index_t Ho0 = 1;
     constexpr index_t Ho2 = 1;
-    constexpr index_t Wo1 = 1;
+    constexpr index_t Wo0 = 2;
     constexpr index_t Wo2 = 4;
     constexpr index_t BlockSize = 256;
...
@@ -105,6 +105,10 @@ void device_convolution_implicit_gemm_v4r2_nchw_kcyx_nkhw(InDesc,
     constexpr index_t WeiBlockCopyDstDataPerWrite_K = 1;
 #endif
+    constexpr index_t N1 = N / (N0 * N2);
+    constexpr index_t Ho1 = Ho / (Ho0 * Ho2);
+    constexpr index_t Wo1 = Wo / (Wo0 * Wo2);
     constexpr index_t B = N1 * Ho1 * Wo1;
     constexpr index_t GridSize =
...
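The second hunk derives N1, Ho1 and Wo1 from the full dimensions instead of hard-coding them, so each dimension is split as outer factor × middle factor × inner factor. A minimal standalone sketch of that arithmetic follows. It assumes `index_t` is a plain integer type and uses hypothetical problem sizes (N = 2, Ho = 8, Wo = 8), chosen only because they reproduce the literals N1 = 2, Ho1 = 8, Wo1 = 1 from the first hunk. In the driver the sizes presumably come from the tensor descriptors, and the divisibility checks are an addition of this sketch, not part of the commit.

```cpp
#include <cassert>

using index_t = int; // assumption: stands in for the repository's index_t

// Hypothetical problem sizes, not taken from the commit.
constexpr index_t N  = 2;
constexpr index_t Ho = 8;
constexpr index_t Wo = 8;

// Tuning factors as set in the first hunk ("1x1 filter, 8x8 image").
constexpr index_t N0 = 1, N2 = 1;
constexpr index_t Ho0 = 1, Ho2 = 1;
constexpr index_t Wo0 = 2, Wo2 = 4;

// Integer division truncates silently, so reject factors that do not
// divide the dimension exactly (extra safety check added in this sketch).
static_assert(N % (N0 * N2) == 0, "N must be divisible by N0 * N2");
static_assert(Ho % (Ho0 * Ho2) == 0, "Ho must be divisible by Ho0 * Ho2");
static_assert(Wo % (Wo0 * Wo2) == 0, "Wo must be divisible by Wo0 * Wo2");

// Same expressions as the lines added after #endif in the diff.
constexpr index_t N1  = N / (N0 * N2);
constexpr index_t Ho1 = Ho / (Ho0 * Ho2);
constexpr index_t Wo1 = Wo / (Wo0 * Wo2);

// B is the product of the three middle factors.
constexpr index_t B = N1 * Ho1 * Wo1;

// Runtime variant for sizes that are not known at compile time
// (hypothetical helper, not present in the repository).
inline index_t checked_middle_factor(index_t total, index_t outer, index_t inner)
{
    assert(outer > 0 && inner > 0 && total % (outer * inner) == 0);
    return total / (outer * inner);
}
```

With these assumed sizes the derived values are N1 = 2, Ho1 = 8, Wo1 = 1 and B = 16. Changing any of the `*0` or `*2` factors to a non-divisor makes the sketch fail to compile instead of producing a truncated quotient.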