gaoqiong / composable_kernel
Commit 56f93c6f
Authored Jul 28, 2021 by Chao Liu; committed by GitHub Jul 28, 2021
Update README.md
Parent: f63a23ac
Showing 1 changed file with 26 additions and 27 deletions
README.md (+26 -27)
...
...
@@ -18,9 +18,6 @@ https://www.boost.org/doc/libs/1_66_0/more/getting_started/unix-variants.html#ea
# Build
- Change target ID in source code, example below is gfx908
- https://github.com/asroy/modular_convolution/blob/aafb5eb18781f1ac9e06a17c3e53d968dd53dcc0/composable_kernel/include/utility/config.amd.hpp.in#L16-L23
Add path of Boost
```
export LD_LIBRARY_PATH=/usr/local/lib:$LD_LIBRARY_PATH
...
...
@@ -28,26 +25,28 @@ Add path of Boost
```
mkdir build && cd build
```
- # need to manually set target ID, example below is gfx908
+ cmake cmd. Need to Specify target ID, example below is gfx908
```
cmake \
- -D CMAKE_BUILD_TYPE=Release \
- -D DEVICE_BACKEND=AMD \
- -D CMAKE_CXX_FLAGS="-O3 --amdgpu-target=gfx908 -mllvm --amdgpu-spill-vgpr-to-agpr=0 -gline-tables-only -save-temps=$CWD" \
- -D CMAKE_CXX_COMPILER=/opt/rocm/bin/hipcc \
- -D CMAKE_PREFIX_PATH=/opt/rocm \
- -D CMAKE_VERBOSE_MAKEFILE:BOOL=ON \
+ -D CMAKE_BUILD_TYPE=Release \
+ -D CMAKE_CXX_FLAGS="-DCK_AMD_GPU_GFX908 -O3 --amdgpu-target=gfx908 -mllvm --amdgpu-spill-vgpr-to-agpr=0 -gline-tables-only -save-temps=$PWD" \
+ -D HIP_ONLINE_COMPILER_FLAGS="-DCK_AMD_GPU_GFX908" \
+ -D CMAKE_CXX_COMPILER=/opt/rocm/bin/hipcc \
+ -D CMAKE_PREFIX_PATH=/opt/rocm \
+ -D CMAKE_VERBOSE_MAKEFILE:BOOL=ON \
..
```
Build drivers: \
- ``conv_driver_v2`` is (offline compilation) driver for forward convolution, \
- ``conv_bwd_data_driver_v2`` is (offline compilation) driver for backward-data convolution \
- ``conv_driver_v2_olc`` is (online compilation) driver for forward convolution
+ ``conv_fwd_driver_offline`` is (offline compilation) driver for forward convolution, \
+ ``conv_bwd_driver_offline`` is (offline compilation) driver for backward-data convolution \
+ ``conv_fwd_driver_online`` is (online compilation) driver for forward convolution
```
- make -j conv_driver_v2
- make -j conv_bwd_data_driver_v2
- make -j conv_driver_v2_olc
+ make -j conv_fwd_driver_offline
+ make -j conv_bwd_driver_offline
+ make -j conv_fwd_driver_online
```
# Run
...
...
@@ -60,18 +59,18 @@ Build drivers: \
* log: 0 = no log; 1 = do log
* repeat: number of time kernel being launched
```
- ########################### layout algo verify init log repeat N__ K___ C___ Y X Hi_ Wi__ Strides Dilations LeftPads RightPads
- ./conv_driver_v2 0 6 0 3 0 1 128 256 192 3 3 71 71 2 2 1 1 1 1 1 1
- ./conv_driver_v2 0 6 0 3 0 1 256 1024 256 3 3 14 14 1 1 1 1 1 1 1 1
- ./conv_driver_v2 1 9 0 3 0 1 128 256 192 3 3 71 71 2 2 1 1 1 1 1 1
- ./conv_driver_v2 1 9 0 3 0 1 256 1024 256 3 3 14 14 1 1 1 1 1 1 1 1
- ./conv_bwd_data_driver_v2 1 1 0 3 0 1 256 256 1024 3 3 14 14 1 1 1 1 1 1 1 1
+ ######################################################## layout algo verify init log repeat N__ K___ C___ Y X Hi_ Wi__ Strides Dilations LeftPads RightPads
+ ./host/driver_offline/conv_fwd_driver_offline 0 4 0 0 0 1 128 256 192 3 3 71 71 2 2 1 1 1 1 1 1
+ ./host/driver_offline/conv_fwd_driver_offline 0 4 0 0 0 1 256 1024 256 3 3 14 14 1 1 1 1 1 1 1 1
+ ./host/driver_offline/conv_fwd_driver_offline 1 5 0 0 0 1 128 256 192 3 3 71 71 2 2 1 1 1 1 1 1
+ ./host/driver_offline/conv_fwd_driver_offline 1 5 0 0 0 1 256 1024 256 3 3 14 14 1 1 1 1 1 1 1 1
+ ./host/driver_offline/conv_bwd_driver_offline 1 5 0 0 0 1 256 256 1024 3 3 14 14 1 1 1 1 1 1 1 1
```
# Result
Forward convoltuion, FP16, NCHW
```
- ./conv_driver_v2 0 6 0 3 0 1 128 256 192 3 3 71 71 2 2 1 1 1 1 1 1
+ ./host/driver_offline/conv_fwd_driver_offline 0 4 0 0 0 1 128 256 192 3 3 71 71 2 2 1 1 1 1 1 1
layout: 0
in: dim 4, lengths {128, 192, 71, 71}, strides {967872, 5041, 71, 1}
...
...
@@ -93,7 +92,7 @@ Average time : 1.4155 ms, 103.686 TFlop/s
Forward convoltuion, FP16, NCHW
```
- ./conv_driver_v2 0 6 0 3 0 1 256 1024 256 3 3 14 14 1 1 1 1 1 1 1 1
+ ./host/driver_offline/conv_fwd_driver_offline 0 4 0 0 0 1 256 1024 256 3 3 14 14 1 1 1 1 1 1 1 1
layout: 0
in: dim 4, lengths {256, 256, 14, 14}, strides {50176, 196, 14, 1}
...
...
@@ -115,7 +114,7 @@ Average time : 2.21357 ms, 106.959 TFlop/s
Forward convolution, FP16, NHWC
```
- ./conv_driver_v2 1 9 0 3 0 1 128 256 192 3 3 71 71 2 2 1 1 1 1 1 1
+ ./host/driver_offline/conv_fwd_driver_offline 1 5 0 0 0 1 128 256 192 3 3 71 71 2 2 1 1 1 1 1 1
layout: 1
in: dim 4, lengths {128, 71, 71, 192}, strides {967872, 13632, 192, 1}
...
...
@@ -137,7 +136,7 @@ Average time : 1.12014 ms, 131.025 TFlop/s
Forward convolution, FP16, NHWC
```
- ./conv_driver_v2 1 9 0 3 0 1 256 1024 256 3 3 14 14 1 1 1 1 1 1 1 1
+ ./host/driver_offline/conv_fwd_driver_offline 1 5 0 0 0 1 256 1024 256 3 3 14 14 1 1 1 1 1 1 1 1
layout: 1
in: dim 4, lengths {256, 14, 14, 256}, strides {50176, 3584, 256, 1}
...
...
@@ -159,7 +158,7 @@ Average time : 1.86877 ms, 126.693 TFlop/s
Backward data convolution, FP16, NHWC
```
- ./conv_bwd_data_driver_v2 1 1 0 3 0 1 256 256 1024 3 3 14 14 1 1 1 1 1 1 1 1
+ ./host/driver_offline/conv_bwd_driver_offline 1 1 0 3 0 1 256 256 1024 3 3 14 14 1 1 1 1 1 1 1 1
layout: 1
in: dim 4, lengths {256, 14, 14, 1024}, strides {200704, 14336, 1024, 1}
...
...
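Reviewer note: the driver commands in this README take 21 positional integers in the order given by the header comment (layout, algo, verify, init, log, repeat, N, K, C, Y, X, Hi, Wi, then Strides, Dilations, LeftPads, RightPads as pairs). The sketch below decodes one such command line and estimates throughput from a reported average time. It is only an illustration under stated assumptions: the names `ConvArgs`, `parse_conv_args`, `out_len`, `conv_flops` and `tflops` are hypothetical and not part of the repository, each pair is assumed to be ordered (height, width), and the FLOP count is assumed to be the usual `2 * N * K * C * Y * X * Ho * Wo` for a direct convolution. The driver source was not consulted, but this formula reproduces the four forward-convolution "Average time ... TFlop/s" figures visible in the hunk headers above to within about 0.001 TFlop/s.

```python
from dataclasses import dataclass


# Field names mirror the README's header comment; the class itself is hypothetical.
@dataclass(frozen=True)
class ConvArgs:
    layout: int
    algo: int
    verify: int
    init: int
    log: int
    repeat: int
    n: int
    k: int
    c: int
    y: int
    x: int
    hi: int
    wi: int
    strides: tuple      # assumed order: (height, width)
    dilations: tuple    # assumed order: (height, width)
    left_pads: tuple    # assumed order: (height, width)
    right_pads: tuple   # assumed order: (height, width)


def parse_conv_args(command: str) -> ConvArgs:
    """Split a driver command line into named fields; the program name is dropped."""
    values = [int(tok) for tok in command.split()[1:]]
    if len(values) != 21:
        raise ValueError(f"expected 21 integers, got {len(values)}")
    scalars, pairs = values[:13], values[13:]
    return ConvArgs(
        *scalars,
        tuple(pairs[0:2]),
        tuple(pairs[2:4]),
        tuple(pairs[4:6]),
        tuple(pairs[6:8]),
    )


def out_len(size: int, filt: int, stride: int, dilation: int,
            left_pad: int, right_pad: int) -> int:
    """Output length of one spatial dimension (standard convolution arithmetic)."""
    effective_filter = dilation * (filt - 1) + 1
    return (size + left_pad + right_pad - effective_filter) // stride + 1


def conv_flops(a: ConvArgs) -> int:
    """Assumed FLOP count: one multiply and one add per filter tap per output element."""
    ho = out_len(a.hi, a.y, a.strides[0], a.dilations[0], a.left_pads[0], a.right_pads[0])
    wo = out_len(a.wi, a.x, a.strides[1], a.dilations[1], a.left_pads[1], a.right_pads[1])
    return 2 * a.n * a.k * a.c * a.y * a.x * ho * wo


def tflops(a: ConvArgs, avg_ms: float) -> float:
    """Throughput in TFlop/s given an average kernel time in milliseconds."""
    return conv_flops(a) / (avg_ms * 1e-3) / 1e12


if __name__ == "__main__":
    cmd = ("./host/driver_offline/conv_fwd_driver_offline "
           "0 4 0 0 0 1 128 256 192 3 3 71 71 2 2 1 1 1 1 1 1")
    args = parse_conv_args(cmd)
    print(f"{tflops(args, 1.4155):.3f} TFlop/s")
```

Keeping the parser separate from the arithmetic means the same `ConvArgs` could be reused to check other quantities from the driver log, but whether the backward-data driver counts FLOPs the same way is not established by this commit.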