Commit 234c0580 authored by Po-Yen, Chen

Use faster device op config for example 'HxWx4_fp16'

parent cc8aee3b
@@ -12,7 +12,7 @@ using DevicePermuteInstance = ck::tensor_operation::device::DevicePermute
 // ######| Type| Type| Operation| | Size| Block| Block| Block| LdsExtraW| ThreadClusterLengths| ThreadClusterArrangeOrder| VectorDim| VectorDim| ScalarPerVector| ScalarPerVector|
 // ######| | | | | | | | | | | | | | | |
 // ######| | | | | | | | | | | | | | | |
-< ADataType, BDataType, PassThrough, 3, 256, 4, 16, 16, 0, S<1, 16, 16>, S<0, 1, 2>, 2, 1, 1, 1>;
+< ADataType, BDataType, PassThrough, 3, 256, 1, 32, 32, 5, S<1, 32, 8>, S<0, 1, 2>, 2, 1, 4, 1>;
 // clang-format on
 #define NUM_ELEMS_IN_BUNDLE 4
......
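As a reading aid for the two template argument lists in the diff, the following is a minimal sketch of a sanity check on the numbers. It rests on two assumptions that the commit itself does not state: that the product of `ThreadClusterLengths` should equal the block size (the `Size` column), and that the three `Block` columns are per-block tile extents. The helper names (`product3`, `kBlockSize`, and so on) are hypothetical and not part of Composable Kernel.

```cpp
#include <array>
#include <cstddef>

// Hypothetical helper (not a Composable Kernel API): product of a 3-D shape.
constexpr std::size_t product3(const std::array<std::size_t, 3>& lengths)
{
    return lengths[0] * lengths[1] * lengths[2];
}

// "Size" column: identical (256) in the removed and the added line.
constexpr std::size_t kBlockSize = 256;

// ThreadClusterLengths: S<1, 16, 16> before, S<1, 32, 8> after.
constexpr std::array<std::size_t, 3> kOldCluster{1, 16, 16};
constexpr std::array<std::size_t, 3> kNewCluster{1, 32, 8};

// The three "Block" columns, ASSUMED here to be per-block tile extents.
constexpr std::array<std::size_t, 3> kOldTile{4, 16, 16};
constexpr std::array<std::size_t, 3> kNewTile{1, 32, 32};

// Assumption: every thread in the block belongs to the cluster exactly once.
static_assert(product3(kOldCluster) == kBlockSize, "old cluster != block size");
static_assert(product3(kNewCluster) == kBlockSize, "new cluster != block size");

// Under the tile assumption, elements handled per thread in each config.
constexpr std::size_t kOldElemsPerThread = product3(kOldTile) / kBlockSize;
constexpr std::size_t kNewElemsPerThread = product3(kNewTile) / kBlockSize;
```

Under these assumptions both configurations cover the same number of elements per block and per thread, so the change would lie in the tile shape, the `LdsExtraW` value (0 to 5), and the first `ScalarPerVector` value (1 to 4), not in the amount of work per thread.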