yangql / composable_kernel-1
Commit 56f93c6f
Authored Jul 28, 2021 by Chao Liu
Committed by GitHub, Jul 28, 2021
Update README.md
Parent: f63a23ac
Showing 1 changed file with 26 additions and 27 deletions
README.md (+26 -27)
@@ -18,9 +18,6 @@ https://www.boost.org/doc/libs/1_66_0/more/getting_started/unix-variants.html#ea
# Build
-Change target ID in source code, example below is gfx908
-https://github.com/asroy/modular_convolution/blob/aafb5eb18781f1ac9e06a17c3e53d968dd53dcc0/composable_kernel/include/utility/config.amd.hpp.in#L16-L23
Add path of Boost
```
export LD_LIBRARY_PATH=/usr/local/lib:$LD_LIBRARY_PATH
@@ -28,26 +25,28 @@ Add path of Boost
```
mkdir build && cd build
+```
-# need to manually set target ID, example below is gfx908
+cmake cmd. Need to Specify target ID, example below is gfx908
+```
cmake \
--D CMAKE_BUILD_TYPE=Release \
--D DEVICE_BACKEND=AMD \
--D CMAKE_CXX_FLAGS="-O3 --amdgpu-target=gfx908 -mllvm --amdgpu-spill-vgpr-to-agpr=0 -gline-tables-only -save-temps=$CWD" \
--D CMAKE_CXX_COMPILER=/opt/rocm/bin/hipcc \
--D CMAKE_PREFIX_PATH=/opt/rocm \
--D CMAKE_VERBOSE_MAKEFILE:BOOL=ON \
+-D CMAKE_BUILD_TYPE=Release \
+-D CMAKE_CXX_FLAGS="-DCK_AMD_GPU_GFX908 -O3 --amdgpu-target=gfx908 -mllvm --amdgpu-spill-vgpr-to-agpr=0 -gline-tables-only -save-temps=$PWD" \
+-D HIP_ONLINE_COMPILER_FLAGS="-DCK_AMD_GPU_GFX908" \
+-D CMAKE_CXX_COMPILER=/opt/rocm/bin/hipcc \
+-D CMAKE_PREFIX_PATH=/opt/rocm \
+-D CMAKE_VERBOSE_MAKEFILE:BOOL=ON \
..
```
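The cmake invocation in this hunk names the target ID in more than one place: as `--amdgpu-target=gfx908` and as the macro `-DCK_AMD_GPU_GFX908`. A minimal sketch of deriving those values from a single variable follows. The helper names `ck_target_macro` and `ck_print_cmake` are hypothetical and not part of the repository, and the rule that the macro suffix is the upper-cased target ID is an assumption that is only confirmed for gfx908 here. The script prints the command rather than running it.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of the repository.

# Assumption: the macro suffix is the upper-cased target ID (gfx908 -> GFX908).
ck_target_macro() {
    printf '%s%s\n' '-DCK_AMD_GPU_' "$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]')"
}

# Print (do not run) a cmake command line built from the flags shown in the diff.
ck_print_cmake() {
    target="$1"
    macro="$(ck_target_macro "$target")"
    cat <<EOF
cmake \\
-D CMAKE_BUILD_TYPE=Release \\
-D CMAKE_CXX_FLAGS="$macro -O3 --amdgpu-target=$target -mllvm --amdgpu-spill-vgpr-to-agpr=0 -gline-tables-only -save-temps=\$PWD" \\
-D HIP_ONLINE_COMPILER_FLAGS="$macro" \\
-D CMAKE_CXX_COMPILER=/opt/rocm/bin/hipcc \\
-D CMAKE_PREFIX_PATH=/opt/rocm \\
-D CMAKE_VERBOSE_MAKEFILE:BOOL=ON \\
..
EOF
}

ck_print_cmake gfx908
```

Keeping the target in one variable avoids the two spellings drifting apart when the command is copied for another GPU.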
Build drivers: \
-``conv_driver_v2`` is (offline compilation) driver for forward convolution, \
-``conv_bwd_data_driver_v2`` is (offline compilation) driver for backward-data convolution \
-``conv_driver_v2_olc`` is (online compilation) driver for forward convolution
+``conv_fwd_driver_offline`` is (offline compilation) driver for forward convolution, \
+``conv_bwd_driver_offline`` is (offline compilation) driver for backward-data convolution \
+``conv_fwd_driver_online`` is (online compilation) driver for forward convolution
```
-make -j conv_driver_v2
-make -j conv_bwd_data_driver_v2
-make -j conv_driver_v2_olc
+make -j conv_fwd_driver_offline
+make -j conv_bwd_driver_offline
+make -j conv_fwd_driver_online
```
# Run
@@ -60,18 +59,18 @@ Build drivers: \
* log: 0 = no log; 1 = do log
* repeat: number of times the kernel is launched
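The drivers take their arguments positionally, in the order given by the header comment of the run block (layout, algo, verify, init, log, repeat, N, K, C, Y, X, Hi, Wi, Strides, Dilations, LeftPads, RightPads). The sketch below labels each value so a command line is easier to read. `describe_conv_args` is a hypothetical helper, not part of the repository, and the height-before-width order inside each two-value group is an assumption.

```shell
#!/bin/sh
# Hypothetical helper: print name=value for the 21 positional driver arguments.
# Assumption: each two-value group (Strides, Dilations, pads) is ordered H then W.
describe_conv_args() {
    if [ "$#" -ne 21 ]; then
        echo "expected 21 arguments, got $#" >&2
        return 1
    fi
    for name in layout algo verify init log repeat N K C Y X Hi Wi \
                StrideH StrideW DilationH DilationW \
                LeftPadH LeftPadW RightPadH RightPadW; do
        printf '%s=%s\n' "$name" "$1"
        shift
    done
}

# Arguments copied from the first new command line in this diff.
describe_conv_args 0 4 0 0 0 1 128 256 192 3 3 71 71 2 2 1 1 1 1 1 1
```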
```
-########################### layout algo verify init log repeat N__ K___ C___ Y X Hi_ Wi__ Strides Dilations LeftPads RightPads
-./conv_driver_v2 0 6 0 3 0 1 128 256 192 3 3 71 71 2 2 1 1 1 1 1 1
-./conv_driver_v2 0 6 0 3 0 1 256 1024 256 3 3 14 14 1 1 1 1 1 1 1 1
-./conv_driver_v2 1 9 0 3 0 1 128 256 192 3 3 71 71 2 2 1 1 1 1 1 1
-./conv_driver_v2 1 9 0 3 0 1 256 1024 256 3 3 14 14 1 1 1 1 1 1 1 1
-./conv_bwd_data_driver_v2 1 1 0 3 0 1 256 256 1024 3 3 14 14 1 1 1 1 1 1 1 1
+######################################################## layout algo verify init log repeat N__ K___ C___ Y X Hi_ Wi__ Strides Dilations LeftPads RightPads
+./host/driver_offline/conv_fwd_driver_offline 0 4 0 0 0 1 128 256 192 3 3 71 71 2 2 1 1 1 1 1 1
+./host/driver_offline/conv_fwd_driver_offline 0 4 0 0 0 1 256 1024 256 3 3 14 14 1 1 1 1 1 1 1 1
+./host/driver_offline/conv_fwd_driver_offline 1 5 0 0 0 1 128 256 192 3 3 71 71 2 2 1 1 1 1 1 1
+./host/driver_offline/conv_fwd_driver_offline 1 5 0 0 0 1 256 1024 256 3 3 14 14 1 1 1 1 1 1 1 1
+./host/driver_offline/conv_bwd_driver_offline 1 5 0 0 0 1 256 256 1024 3 3 14 14 1 1 1 1 1 1 1 1
```
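The hunk headers further down quote figures such as `Average time : 1.4155 ms, 103.686 TFlop/s`. The sketch below shows how such a figure can be checked against the command-line arguments. It uses the standard convolution output-size formula and assumes the driver counts 2·N·K·C·Y·X·Ho·Wo floating-point operations per run. That assumption reproduces the quoted number for the first forward case but is not confirmed by the diff itself. The function names are hypothetical.

```shell
#!/bin/sh
# Output size along one spatial dimension (standard convolution arithmetic).
# Arguments: in filter stride dilation left_pad right_pad
conv_out_size() {
    echo $(( ($1 + $5 + $6 - $4 * ($2 - 1) - 1) / $3 + 1 ))
}

# Throughput in TFlop/s, assuming 2*N*K*C*Y*X*Ho*Wo operations per run.
# Arguments: N K C Y X Ho Wo time_ms
conv_tflops() {
    awk -v n="$1" -v k="$2" -v c="$3" -v y="$4" -v x="$5" \
        -v ho="$6" -v wo="$7" -v ms="$8" \
        'BEGIN { printf "%.3f\n", 2 * n * k * c * y * x * ho * wo / (ms * 1e-3) / 1e12 }'
}

# First forward case: N=128 K=256 C=192, 3x3 filter, 71x71 input,
# stride 2, dilation 1, padding 1 on each side.
ho=$(conv_out_size 71 3 2 1 1 1)   # 36
wo=$(conv_out_size 71 3 2 1 1 1)   # 36
conv_tflops 128 256 192 3 3 "$ho" "$wo" 1.4155   # prints 103.686
```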
# Result
Forward convolution, FP16, NCHW
```
-./conv_driver_v2 0 6 0 3 0 1 128 256 192 3 3 71 71 2 2 1 1 1 1 1 1
+./host/driver_offline/conv_fwd_driver_offline 0 4 0 0 0 1 128 256 192 3 3 71 71 2 2 1 1 1 1 1 1
layout: 0
in: dim 4, lengths {128, 192, 71, 71}, strides {967872, 5041, 71, 1}
@@ -93,7 +92,7 @@ Average time : 1.4155 ms, 103.686 TFlop/s
Forward convolution, FP16, NCHW
```
-./conv_driver_v2 0 6 0 3 0 1 256 1024 256 3 3 14 14 1 1 1 1 1 1 1 1
+./host/driver_offline/conv_fwd_driver_offline 0 4 0 0 0 1 256 1024 256 3 3 14 14 1 1 1 1 1 1 1 1
layout: 0
in: dim 4, lengths {256, 256, 14, 14}, strides {50176, 196, 14, 1}
@@ -115,7 +114,7 @@ Average time : 2.21357 ms, 106.959 TFlop/s
Forward convolution, FP16, NHWC
```
-./conv_driver_v2 1 9 0 3 0 1 128 256 192 3 3 71 71 2 2 1 1 1 1 1 1
+./host/driver_offline/conv_fwd_driver_offline 1 5 0 0 0 1 128 256 192 3 3 71 71 2 2 1 1 1 1 1 1
layout: 1
in: dim 4, lengths {128, 71, 71, 192}, strides {967872, 13632, 192, 1}
@@ -137,7 +136,7 @@ Average time : 1.12014 ms, 131.025 TFlop/s
Forward convolution, FP16, NHWC
```
-./conv_driver_v2 1 9 0 3 0 1 256 1024 256 3 3 14 14 1 1 1 1 1 1 1 1
+./host/driver_offline/conv_fwd_driver_offline 1 5 0 0 0 1 256 1024 256 3 3 14 14 1 1 1 1 1 1 1 1
layout: 1
in: dim 4, lengths {256, 14, 14, 256}, strides {50176, 3584, 256, 1}
@@ -159,7 +158,7 @@ Average time : 1.86877 ms, 126.693 TFlop/s
Backward data convolution, FP16, NHWC
```
-./conv_bwd_data_driver_v2 1 1 0 3 0 1 256 256 1024 3 3 14 14 1 1 1 1 1 1 1 1
+./host/driver_offline/conv_bwd_driver_offline 1 1 0 3 0 1 256 256 1024 3 3 14 14 1 1 1 1 1 1 1 1
layout: 1
in: dim 4, lengths {256, 14, 14, 1024}, strides {200704, 14336, 1024, 1}
...