"git@developer.sourcefind.cn:modelzoo/resnet50_tensorflow.git" did not exist on "a52564cba9d49856417f3f0daca602ac7c78cd78"
Unverified commit 1a0cd5d1, authored by Adam Osewski, committed by GitHub

Convolution FWD profiler refactor. (#183)



* Convolution ND

* Code unification across dimensions for generating tensor descriptors.
* Example
* Instances

* Move convnd f32 instance file to comply with repo structure.

* Conv 1D tensor layouts.

* Formatting and use ReferenceConv

* Reference ConvFwd supporting 1D and 2D convolution.

* Debug printing TensorLayout name.

* Conv fwd 1D instance f32

* Refactor conv ND example.

Needed to support various conv dimensions.

* Rename conv nd example directory to prevent conflicts.

* Refactor some common utility to single file.

Plus some tests.

* Refactor GetHostTensorDescriptor + UT.

* Add 1D test case.

* Test reference convolution 1d/2d

* Remove some leftovers.

* Fix convolution example error for 1D

* Refactor test check errors utility function.

* Test Conv2D Fwd XDL

* More UT for 1D case.

* Parameterize input & weight initializers.

* Rename example to prevent conflicts.

* Split convnd instance into separate files for 1d/2d

* Address review comments.

* Fix data type for flops/gbytes calculations.

* Assign example number 11.

* 3D cases for convolution utility functions.

* 3D reference convolution.

* Add support for 3D convolution.

* Check for inputs bigger than 2GB.

* Formatting

* Support for bf16/f16/f32/i8 - conv instances + UT.

* Use check_err from test_util.hpp.

* Split convnd test into separate files for each dim.

* Fix data generation and use proper instances.

* Formatting

* Skip tensor initialization if not necessary.

* Fix CMakefiles.

* Remove redundant conv2d_fwd test.

* Lower problem size for conv3D UT.

* 3D case for convnd example.

* Remove leftovers after merge.

* Add Conv Specialization string to GetTypeString

* Skip instance causing numerical errors.

* Small fixes.

* Remove redundant includes.

* Fix namespace name error.

* Script for automatic testing and logging convolution fwd UTs

* Comment out numactl cmd.

* Refine weights initialization and relax rtol for fp16.

* Move test_util.hpp to check_err.hpp

* Refine weights initialization and relax rtol for fp16.

* Refactor common part of test conv utils.

* Move utility function to single common place.

* Add additional common functions to utility.

* Refactor convnd_fwd_xdl examples.

* Remove redundant files.
* Unify structure.

* Add constructor to ConvParams.

* And add input parameters validation.

* Modify conv examples to use single utility file.

* Remove check_error from host_tensor.hpp

* Get rid of check_indices function.

* Remove bf16_to_f32 function overload for scalars.

* Fix namespace.

* Add half_float::half for check_err.

* Fix conv params size in UT.

* Fix weights initialization for int8.

* Fix weights initialization for int8.

* Add type_convert when store output in ref conv 1D.

* Get back old conv2d_fwd_xdl operation.

* Silence conv debug print.

* format

* clean

* clean

* Fix merge.

* Fix namespace for check_err

* Formatting.

* Fix merge artifacts.

* Remove deleted header.

* Fix some includes and use ck::utils::check_err.

* Remove unused check_indices restored by previous merge.

* Fix namespaces after merge.

* Fix compilation error.

* Small fixes.

* Use common functions.
* Fix filename
* Fix namespaces.

* Fix merge artifact: restore function removed by accident.

* Fix ConvForwardSpecialization.

* Working example of OpInstanceRunEngine for conv2dfwd UT.

* Adhere to coding style rules.

* Formatting and adhere to coding style rules.

* Fix merge artifacts.

* Utility for collecting conv fwd instances.

Plus common part for parsing cmdline params.

* Refactor FillUniform because of segfault for int8_t.

* Naming convention.

* Elegant version of device mem allocation.

* Use OpInstanceRunEngine in conv fwd nd tests (a sketch of this pattern follows the commit list).

* Multiple refinements.

* Conditional init.
* Don't run reference op if not provided.

* Use OpInstanceRunEngine for ckProfiler conv_fwd

* Refactor common tensor fill function to separate file.

* Clean up unused functions.

* Support different init methods.

* Create CMake target for conv_fwd_util.

* Add header for profile_convnd_fwd.cpp

* Fix CMakefiles to link with conv_fwd_util where needed.

* Fix some clutter.
Co-authored-by: Adam Osewski <aosewski@amd.com>
Co-authored-by: Chao Liu <chao.liu2@amd.com>
parent 7353ec0c
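At the core of the refactor, each forward-convolution test builds a device op instance and a host reference once, then hands both to OpInstanceRunEngine, which owns tensor setup, execution, and result verification. Below is a minimal sketch of the pattern, assembled from the 2D test in the diffs that follow; the function name is ours, and problem sizes and tolerances are illustrative only:

    // Sketch of the refactored test flow; see the conv2d_fwd.cpp diff below.
    #include <functional>
    #include <vector>
    #include "conv_fwd_util.hpp"
    #include "conv_util.hpp"

    bool run_conv2d_fwd_check()
    {
        using namespace std::placeholders;
        using namespace ck::utils;

        conv::ConvParams params; // problem description: N, K, C, lengths, strides, pads
        params.num_dim_spatial = 2;

        // Device instances under test.
        std::vector<test::conv::DeviceConvFwdNoOpPtr> conv_ptrs;
        test::conv::get_test_convolution_fwd_instance<2>(conv_ptrs);

        // Op instance plus a host reference bound to the same problem.
        conv::ConvFwdOpInstance<float, float, float> conv_instance(params);
        auto reference_fun = std::bind(
            conv::run_reference_convolution_forward<2, float, float, float>, params, _1, _2, _3);

        // The engine allocates tensors, runs device and reference, and compares.
        OpInstanceRunEngine<float, float, float> run_engine(conv_instance, reference_fun);
        run_engine.SetAtol(1e-5);
        run_engine.SetRtol(1e-4);
        return run_engine.Test(conv_ptrs);
    }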
 add_test_executable(test_conv_util conv_util.cpp)
-target_link_libraries(test_conv_util PRIVATE host_tensor)
+target_link_libraries(test_conv_util PRIVATE host_tensor conv_fwd_util)
@@ -4,5 +4,4 @@ include_directories(BEFORE
 )
 add_test_executable(test_convnd_bwd_data convnd_bwd_data.cpp)
-target_link_libraries(test_convnd_bwd_data PRIVATE host_tensor)
-target_link_libraries(test_convnd_bwd_data PRIVATE device_convnd_bwd_data_instance)
+target_link_libraries(test_convnd_bwd_data PRIVATE host_tensor device_convnd_bwd_data_instance conv_fwd_util)
 add_custom_target(test_convnd_fwd)
 add_test_executable(test_conv1d_fwd conv1d_fwd.cpp)
-target_link_libraries(test_conv1d_fwd PRIVATE host_tensor)
-target_link_libraries(test_conv1d_fwd PRIVATE device_conv1d_fwd_instance)
+target_link_libraries(test_conv1d_fwd PRIVATE host_tensor device_conv1d_fwd_instance conv_fwd_util)
+target_link_libraries(test_conv1d_fwd PRIVATE )
 add_dependencies(test_convnd_fwd test_conv1d_fwd)
 add_test_executable(test_conv2d_fwd conv2d_fwd.cpp)
-target_link_libraries(test_conv2d_fwd PRIVATE host_tensor)
-target_link_libraries(test_conv2d_fwd PRIVATE device_conv2d_fwd_instance)
+target_link_libraries(test_conv2d_fwd PRIVATE host_tensor device_conv2d_fwd_instance conv_fwd_util)
 add_dependencies(test_convnd_fwd test_conv2d_fwd)
 add_test_executable(test_conv3d_fwd conv3d_fwd.cpp)
-target_link_libraries(test_conv3d_fwd PRIVATE host_tensor)
-target_link_libraries(test_conv3d_fwd PRIVATE device_conv3d_fwd_instance)
+target_link_libraries(test_conv3d_fwd PRIVATE host_tensor device_conv3d_fwd_instance conv_fwd_util)
 add_dependencies(test_convnd_fwd test_conv3d_fwd)
@@ -7,37 +7,15 @@
 #include "element_wise_operation.hpp"
 #include "conv_fwd_util.hpp"
 #include "conv_util.hpp"
-#include "host_tensor.hpp"
-#include "tensor_layout.hpp"
-#include "check_err.hpp"
-
-// Forward declarations for conv instances.
-using DeviceConvFwdNoOpPtr =
-    ck::tensor_operation::device::DeviceConvFwdPtr<ck::tensor_operation::element_wise::PassThrough,
-                                                   ck::tensor_operation::element_wise::PassThrough,
-                                                   ck::tensor_operation::element_wise::PassThrough>;
-
-namespace ck {
-namespace tensor_operation {
-namespace device {
-namespace device_conv1d_fwd_instance {
-
-void add_device_conv1d_fwd_xdl_nwc_kxc_nwk_bf16_instances(std::vector<DeviceConvFwdNoOpPtr>&);
-void add_device_conv1d_fwd_xdl_nwc_kxc_nwk_f16_instances(std::vector<DeviceConvFwdNoOpPtr>&);
-void add_device_conv1d_fwd_xdl_nwc_kxc_nwk_f32_instances(std::vector<DeviceConvFwdNoOpPtr>&);
-void add_device_conv1d_fwd_xdl_nwc_kxc_nwk_int8_instances(std::vector<DeviceConvFwdNoOpPtr>&);
-
-} // namespace device_conv1d_fwd_instance
-} // namespace device
-} // namespace tensor_operation
-} // namespace ck
-
 namespace {

 bool test_conv1D_nwc()
 {
-    bool res{true};
+    using namespace std::placeholders;
+    using namespace ck::utils;
+    namespace ctl = ck::tensor_layout::convolution;
+
     ck::utils::conv::ConvParams params;
     params.num_dim_spatial = 1;
     params.N = 2;
@@ -50,30 +28,26 @@ bool test_conv1D_nwc()
     params.input_left_pads = std::vector<ck::index_t>{1};
     params.input_right_pads = std::vector<ck::index_t>{1};

-    auto host_tensors =
-        ck::utils::conv::get_host_tensors<float,
-                                          float,
-                                          float,
-                                          ck::tensor_layout::convolution::NWC,
-                                          ck::tensor_layout::convolution::KXC,
-                                          ck::tensor_layout::convolution::NWK>(params);
-    const Tensor<float>& input = std::get<0>(host_tensors);
-    const Tensor<float>& weights = std::get<1>(host_tensors);
-    Tensor<float>& host_output = std::get<2>(host_tensors);
-    Tensor<float>& device_output = std::get<3>(host_tensors);
-
-    ck::utils::conv::run_reference_convolution_forward<1>(params, input, weights, host_output);
-    test::conv::RunConv<1>(params, input, weights, device_output);
-    res = res &&
-          ck::utils::check_err(
-              device_output.mData, host_output.mData, "Error: incorrect results!", 1e-5f, 1e-4f);
-
-    return res;
+    std::vector<test::conv::DeviceConvFwdNoOpPtr> conv_ptrs;
+    test::conv::get_test_convolution_fwd_instance<1>(conv_ptrs);
+    conv::ConvFwdOpInstance<float, float, float, ctl::NWC, ctl::KCX, ctl::NWK> conv_instance(
+        params);
+
+    auto reference_conv_fwd_fun = std::bind(
+        conv::run_reference_convolution_forward<1, float, float, float>, params, _1, _2, _3);
+    OpInstanceRunEngine<float, float, float> run_engine(conv_instance, reference_conv_fwd_fun);
+    run_engine.SetAtol(1e-5);
+    run_engine.SetRtol(1e-4);
+    return run_engine.Test(conv_ptrs);
 }

 template <typename T>
-bool test_conv1d_nwc_instances(const std::vector<DeviceConvFwdNoOpPtr>& conv_ptrs)
+bool test_conv1d_nwc_instances(const std::vector<test::conv::DeviceConvFwdNoOpPtr>& conv_ptrs)
 {
+    using namespace std::placeholders;
+    using namespace ck::utils;
+    namespace ctl = ck::tensor_layout::convolution;
+
     ck::utils::conv::ConvParams params;
     params.num_dim_spatial = 1;
     params.filter_spatial_lengths = std::vector<ck::index_t>{3};
@@ -83,52 +57,36 @@ bool test_conv1d_nwc_instances(const std::vector<DeviceConvFwdNoOpPtr>& conv_ptr
     params.input_left_pads = std::vector<ck::index_t>{1};
     params.input_right_pads = std::vector<ck::index_t>{1};

-    auto host_tensors =
-        ck::utils::conv::get_host_tensors<T,
-                                          T,
-                                          T,
-                                          ck::tensor_layout::convolution::NWC,
-                                          ck::tensor_layout::convolution::KXC,
-                                          ck::tensor_layout::convolution::NWK>(params);
-    const Tensor<T>& input = std::get<0>(host_tensors);
-    const Tensor<T>& weights = std::get<1>(host_tensors);
-    Tensor<T>& host_output = std::get<2>(host_tensors);
-    Tensor<T>& device_output = std::get<3>(host_tensors);
-
-    ck::utils::conv::run_reference_convolution_forward<1>(params, input, weights, host_output);
-    return ck::utils::conv::run_convolution_forward_instances<1>(
-        params, conv_ptrs, input, weights, device_output, host_output);
+    conv::ConvFwdOpInstance<T, T, T, ctl::NWC, ctl::KCX, ctl::NWK> conv_instance(params);
+
+    auto reference_conv_fwd_fun =
+        std::bind(conv::run_reference_convolution_forward<1, T, T, T>, params, _1, _2, _3);
+    OpInstanceRunEngine<T, T, T> run_engine(conv_instance, reference_conv_fwd_fun);
+    return run_engine.Test(conv_ptrs);
 }

 bool test_conv1d_nwc_bf16_instances()
 {
-    std::vector<DeviceConvFwdNoOpPtr> conv_ptrs;
-    ck::tensor_operation::device::device_conv1d_fwd_instance::
-        add_device_conv1d_fwd_xdl_nwc_kxc_nwk_bf16_instances(conv_ptrs);
-    return test_conv1d_nwc_instances<ck::bhalf_t>(conv_ptrs);
+    return test_conv1d_nwc_instances<ck::bhalf_t>(
+        ck::utils::conv::ConvolutionFwdInstances<ck::bhalf_t, ck::bhalf_t, ck::bhalf_t>::Get<1>());
 }

 bool test_conv1d_nwc_f16_instances()
 {
-    std::vector<DeviceConvFwdNoOpPtr> conv_ptrs;
-    ck::tensor_operation::device::device_conv1d_fwd_instance::
-        add_device_conv1d_fwd_xdl_nwc_kxc_nwk_f16_instances(conv_ptrs);
-    return test_conv1d_nwc_instances<ck::half_t>(conv_ptrs);
+    return test_conv1d_nwc_instances<ck::half_t>(
+        ck::utils::conv::ConvolutionFwdInstances<ck::half_t, ck::half_t, ck::half_t>::Get<1>());
 }

 bool test_conv1d_nwc_f32_instances()
 {
-    std::vector<DeviceConvFwdNoOpPtr> conv_ptrs;
-    ck::tensor_operation::device::device_conv1d_fwd_instance::
-        add_device_conv1d_fwd_xdl_nwc_kxc_nwk_f32_instances(conv_ptrs);
-    return test_conv1d_nwc_instances<float>(conv_ptrs);
+    return test_conv1d_nwc_instances<float>(
+        ck::utils::conv::ConvolutionFwdInstances<float, float, float>::Get<1>());
 }

 bool test_conv1d_nwc_int8_instances()
 {
-    std::vector<DeviceConvFwdNoOpPtr> conv_ptrs;
-    ck::tensor_operation::device::device_conv1d_fwd_instance::
-        add_device_conv1d_fwd_xdl_nwc_kxc_nwk_int8_instances(conv_ptrs);
-    return test_conv1d_nwc_instances<int8_t>(conv_ptrs);
+    return test_conv1d_nwc_instances<int8_t>(
+        ck::utils::conv::ConvolutionFwdInstances<int8_t, int8_t, int8_t>::Get<1>());
 }

 } // anonymous namespace
@@ -149,7 +107,7 @@ int main()
     std::cout << "\ntest_conv1d_nwc_f32_instances ..... " << (res ? "SUCCESS" : "FAILURE")
               << std::endl;
     res = test_conv1d_nwc_int8_instances();
-    std::cout << "\ntes_tconv1_dnw_cint_8instances ..... " << (res ? "SUCCESS" : "FAILURE")
+    std::cout << "\ntest_conv1d_nwc_int8_instances ..... " << (res ? "SUCCESS" : "FAILURE")
               << std::endl;
     return res ? 0 : 1;
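Note how the per-type add_device_conv1d_fwd_xdl_*_instances registration functions disappear from the tests: instance collection is now a single templated factory provided by conv_fwd_util. Usage, as in the new code above (a short sketch; variable names are ours):

    // All registered 1D forward-conv instances for a given data type.
    auto f32_ptrs = ck::utils::conv::ConvolutionFwdInstances<float, float, float>::Get<1>();
    bool ok       = test_conv1d_nwc_instances<float>(f32_ptrs);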
 #include <half.hpp>
 #include <iostream>
-#include <stdexcept>
 #include <tuple>
 #include <vector>
@@ -8,38 +7,14 @@
 #include "element_wise_operation.hpp"
 #include "conv_fwd_util.hpp"
 #include "conv_util.hpp"
-#include "host_tensor.hpp"
-#include "tensor_layout.hpp"
-#include "check_err.hpp"
-
-// Forward declarations for conv instances.
-using DeviceConvFwdNoOpPtr =
-    ck::tensor_operation::device::DeviceConvFwdPtr<ck::tensor_operation::element_wise::PassThrough,
-                                                   ck::tensor_operation::element_wise::PassThrough,
-                                                   ck::tensor_operation::element_wise::PassThrough>;
-
-namespace ck {
-namespace tensor_operation {
-namespace device {
-namespace device_conv2d_fwd_instance {
-
-void add_device_conv2d_fwd_xdl_nhwc_kyxc_nhwk_bf16_instances(std::vector<DeviceConvFwdNoOpPtr>&);
-void add_device_conv2d_fwd_xdl_nhwc_kyxc_nhwk_f16_instances(std::vector<DeviceConvFwdNoOpPtr>&);
-void add_device_conv2d_fwd_xdl_c_shuffle_nhwc_kyxc_nhwk_f16_instances(
-    std::vector<DeviceConvFwdNoOpPtr>&);
-void add_device_conv2d_fwd_xdl_nhwc_kyxc_nhwk_f32_instances(std::vector<DeviceConvFwdNoOpPtr>&);
-void add_device_conv2d_fwd_xdl_nhwc_kyxc_nhwk_int8_instances(std::vector<DeviceConvFwdNoOpPtr>&);
-
-} // namespace device_conv2d_fwd_instance
-} // namespace device
-} // namespace tensor_operation
-} // namespace ck
-
 namespace {

 bool test_conv2d_nhwc()
 {
-    bool res{true};
+    using namespace std::placeholders;
+    using namespace ck::utils;
+
     ck::utils::conv::ConvParams params;
     params.N = 2;
     params.K = 16;
@@ -47,25 +22,25 @@ bool test_conv2d_nhwc()
     params.input_spatial_lengths = std::vector<ck::index_t>{16, 16};
     params.conv_filter_strides = std::vector<ck::index_t>{1, 1};

-    auto host_tensors = ck::utils::conv::get_host_tensors(params);
-    const Tensor<float>& input = std::get<0>(host_tensors);
-    const Tensor<float>& weights = std::get<1>(host_tensors);
-    Tensor<float>& host_output = std::get<2>(host_tensors);
-    Tensor<float>& device_output = std::get<3>(host_tensors);
-
-    ck::utils::conv::run_reference_convolution_forward<2>(params, input, weights, host_output);
-    test::conv::RunConv<2>(params, input, weights, device_output);
-    res = res &&
-          ck::utils::check_err(
-              device_output.mData, host_output.mData, "Error: incorrect results!", 1e-5f, 1e-4f);
-
-    return res;
+    std::vector<test::conv::DeviceConvFwdNoOpPtr> conv_ptrs;
+    test::conv::get_test_convolution_fwd_instance<2>(conv_ptrs);
+    conv::ConvFwdOpInstance<float, float, float> conv_instance(params);
+
+    auto reference_conv_fwd_fun = std::bind(
+        conv::run_reference_convolution_forward<2, float, float, float>, params, _1, _2, _3);
+    OpInstanceRunEngine<float, float, float> run_engine(conv_instance, reference_conv_fwd_fun);
+    run_engine.SetAtol(1e-5);
+    run_engine.SetRtol(1e-4);
+    return run_engine.Test(conv_ptrs);
 }

 template <typename T>
-bool test_conv2d_nhwc_instances(const std::vector<DeviceConvFwdNoOpPtr>& conv_ptrs)
+bool test_conv2d_nhwc_instances(const std::vector<test::conv::DeviceConvFwdNoOpPtr>& conv_ptrs)
 {
-    ck::utils::conv::ConvParams params;
+    using namespace std::placeholders;
+    using namespace ck::utils;
+
+    conv::ConvParams params;
     params.num_dim_spatial = 2;
     params.filter_spatial_lengths = std::vector<ck::index_t>{3, 3};
     params.input_spatial_lengths = std::vector<ck::index_t>{71, 71};
@@ -74,55 +49,36 @@ bool test_conv2d_nhwc_instances(const std::vector<DeviceConvFwdNoOpPtr>& conv_pt
     params.input_left_pads = std::vector<ck::index_t>{1, 1};
     params.input_right_pads = std::vector<ck::index_t>{1, 1};

-    auto host_tensors =
-        ck::utils::conv::get_host_tensors<T,
-                                          T,
-                                          T,
-                                          ck::tensor_layout::convolution::NHWC,
-                                          ck::tensor_layout::convolution::KYXC,
-                                          ck::tensor_layout::convolution::NHWK>(params);
-    const Tensor<T>& input = std::get<0>(host_tensors);
-    const Tensor<T>& weights = std::get<1>(host_tensors);
-    Tensor<T>& host_output = std::get<2>(host_tensors);
-    Tensor<T>& device_output = std::get<3>(host_tensors);
-
-    ck::utils::conv::run_reference_convolution_forward<2>(params, input, weights, host_output);
-    return ck::utils::conv::run_convolution_forward_instances<2>(
-        params, conv_ptrs, input, weights, device_output, host_output);
+    conv::ConvFwdOpInstance<T, T, T> conv_instance(params);
+
+    auto reference_conv_fwd_fun =
+        std::bind(conv::run_reference_convolution_forward<2, T, T, T>, params, _1, _2, _3);
+    OpInstanceRunEngine<T, T, T> run_engine(conv_instance, reference_conv_fwd_fun);
+    return run_engine.Test(conv_ptrs);
 }

 bool test_conv2d_nhwc_bf16_instances()
 {
-    std::vector<DeviceConvFwdNoOpPtr> conv_ptrs;
-    ck::tensor_operation::device::device_conv2d_fwd_instance::
-        add_device_conv2d_fwd_xdl_nhwc_kyxc_nhwk_bf16_instances(conv_ptrs);
-    return test_conv2d_nhwc_instances<ck::bhalf_t>(conv_ptrs);
+    return test_conv2d_nhwc_instances<ck::bhalf_t>(
+        ck::utils::conv::ConvolutionFwdInstances<ck::bhalf_t, ck::bhalf_t, ck::bhalf_t>::Get<2>());
 }

 bool test_conv2d_nhwc_f16_instances()
 {
-    std::vector<DeviceConvFwdNoOpPtr> conv_ptrs;
-    ck::tensor_operation::device::device_conv2d_fwd_instance::
-        add_device_conv2d_fwd_xdl_nhwc_kyxc_nhwk_f16_instances(conv_ptrs);
-    ck::tensor_operation::device::device_conv2d_fwd_instance::
-        add_device_conv2d_fwd_xdl_c_shuffle_nhwc_kyxc_nhwk_f16_instances(conv_ptrs);
-    return test_conv2d_nhwc_instances<ck::half_t>(conv_ptrs);
+    return test_conv2d_nhwc_instances<ck::half_t>(
+        ck::utils::conv::ConvolutionFwdInstances<ck::half_t, ck::half_t, ck::half_t>::Get<2>());
 }

 bool test_conv2d_nhwc_f32_instances()
 {
-    std::vector<DeviceConvFwdNoOpPtr> conv_ptrs;
-    ck::tensor_operation::device::device_conv2d_fwd_instance::
-        add_device_conv2d_fwd_xdl_nhwc_kyxc_nhwk_f32_instances(conv_ptrs);
-    return test_conv2d_nhwc_instances<float>(conv_ptrs);
+    return test_conv2d_nhwc_instances<float>(
+        ck::utils::conv::ConvolutionFwdInstances<float, float, float>::Get<2>());
 }

 bool test_conv2d_nhwc_int8_instances()
 {
-    std::vector<DeviceConvFwdNoOpPtr> conv_ptrs;
-    ck::tensor_operation::device::device_conv2d_fwd_instance::
-        add_device_conv2d_fwd_xdl_nhwc_kyxc_nhwk_int8_instances(conv_ptrs);
-    return test_conv2d_nhwc_instances<int8_t>(conv_ptrs);
+    return test_conv2d_nhwc_instances<int8_t>(
+        ck::utils::conv::ConvolutionFwdInstances<int8_t, int8_t, int8_t>::Get<2>());
 }

 } // anonymous namespace
@@ -8,37 +8,16 @@
 #include "element_wise_operation.hpp"
 #include "conv_fwd_util.hpp"
 #include "conv_util.hpp"
-#include "host_tensor.hpp"
-#include "tensor_layout.hpp"
-#include "check_err.hpp"
-
-// Forward declarations for conv instances.
-using DeviceConvFwdNoOpPtr =
-    ck::tensor_operation::device::DeviceConvFwdPtr<ck::tensor_operation::element_wise::PassThrough,
-                                                   ck::tensor_operation::element_wise::PassThrough,
-                                                   ck::tensor_operation::element_wise::PassThrough>;
-
-namespace ck {
-namespace tensor_operation {
-namespace device {
-namespace device_conv3d_fwd_instance {
-
-void add_device_conv3d_fwd_xdl_ndhwc_kzyxc_ndhwk_bf16_instances(std::vector<DeviceConvFwdNoOpPtr>&);
-void add_device_conv3d_fwd_xdl_ndhwc_kzyxc_ndhwk_f16_instances(std::vector<DeviceConvFwdNoOpPtr>&);
-void add_device_conv3d_fwd_xdl_ndhwc_kzyxc_ndhwk_f32_instances(std::vector<DeviceConvFwdNoOpPtr>&);
-void add_device_conv3d_fwd_xdl_ndhwc_kzyxc_ndhwk_int8_instances(std::vector<DeviceConvFwdNoOpPtr>&);
-
-} // namespace device_conv3d_fwd_instance
-} // namespace device
-} // namespace tensor_operation
-} // namespace ck
-
 namespace {

 bool test_conv3d_ndhwc()
 {
-    bool res{true};
-    ck::utils::conv::ConvParams params;
+    using namespace std::placeholders;
+    using namespace ck::utils;
+    namespace ctl = ck::tensor_layout::convolution;
+
+    conv::ConvParams params;
     params.num_dim_spatial = 3;
     params.N = 2;
     params.K = 16;
@@ -50,31 +29,26 @@ bool test_conv3d_ndhwc()
     params.input_left_pads = std::vector<ck::index_t>{1, 1, 1};
     params.input_right_pads = std::vector<ck::index_t>{1, 1, 1};

-    auto host_tensors =
-        ck::utils::conv::get_host_tensors<float,
-                                          float,
-                                          float,
-                                          ck::tensor_layout::convolution::NDHWC,
-                                          ck::tensor_layout::convolution::KZYXC,
-                                          ck::tensor_layout::convolution::NDHWK>(params);
-    const Tensor<float>& input = std::get<0>(host_tensors);
-    const Tensor<float>& weights = std::get<1>(host_tensors);
-    Tensor<float>& host_output = std::get<2>(host_tensors);
-    Tensor<float>& device_output = std::get<3>(host_tensors);
-
-    ck::utils::conv::run_reference_convolution_forward<3>(params, input, weights, host_output);
-    test::conv::RunConv<3>(params, input, weights, device_output);
-    res = res &&
-          ck::utils::check_err(
-              device_output.mData, host_output.mData, "Error: incorrect results!", 1e-5f, 1e-4f);
-
-    return res;
+    std::vector<test::conv::DeviceConvFwdNoOpPtr> conv_ptrs;
+    test::conv::get_test_convolution_fwd_instance<3>(conv_ptrs);
+    conv::ConvFwdOpInstance<float, float, float, ctl::NDHWC, ctl::KZYXC, ctl::NDHWK> conv_instance(
+        params);
+
+    auto reference_conv_fwd_fun = std::bind(
+        conv::run_reference_convolution_forward<3, float, float, float>, params, _1, _2, _3);
+    OpInstanceRunEngine<float, float, float> run_engine(conv_instance, reference_conv_fwd_fun);
+    run_engine.SetAtol(1e-5);
+    run_engine.SetRtol(1e-4);
+    return run_engine.Test(conv_ptrs);
 }

 bool test_conv3d_ndhwc_2gb_input()
 {
+    using PassThrough = ck::tensor_operation::element_wise::PassThrough;
+    using namespace ck::utils;
+
     // >2GB Input
-    ck::utils::conv::ConvParams params;
+    conv::ConvParams params;
     params.num_dim_spatial = 3;
     params.N = 2;
     params.K = 16;
@@ -86,39 +60,35 @@ bool test_conv3d_ndhwc_2gb_input()
     params.input_left_pads = std::vector<ck::index_t>{1, 1, 1};
     params.input_right_pads = std::vector<ck::index_t>{1, 1, 1};

-    auto host_tensors =
-        ck::utils::conv::get_host_tensors<float,
-                                          float,
-                                          float,
-                                          ck::tensor_layout::convolution::NDHWC,
-                                          ck::tensor_layout::convolution::KZYXC,
-                                          ck::tensor_layout::convolution::NDHWK>(params, false);
-    const Tensor<float>& input = std::get<0>(host_tensors);
-    const Tensor<float>& weights = std::get<1>(host_tensors);
-    Tensor<float>& device_output = std::get<3>(host_tensors);
-
-    try
-    {
-        test::conv::RunConv<3>(params, input, weights, device_output);
-    }
-    catch(const std::runtime_error& err)
-    {
-        std::string err_msg{"Error! device_conv with the specified compilation parameters does "
-                            "not support this Conv problem"};
-        if(err.what() != err_msg)
-        {
-            return false;
-        }
-        return true;
-    }
-    std::cout << "Error: Failure checking oversized tensor!" << std::endl;
-    return false;
+    std::vector<test::conv::DeviceConvFwdNoOpPtr> conv_ptrs;
+    test::conv::get_test_convolution_fwd_instance<3>(conv_ptrs);
+
+    auto arg = conv_ptrs.back()->MakeArgumentPointer(nullptr,
+                                                     nullptr,
+                                                     nullptr,
+                                                     params.N,
+                                                     params.K,
+                                                     params.C,
+                                                     params.input_spatial_lengths,
+                                                     params.filter_spatial_lengths,
+                                                     params.GetOutputSpatialLengths(),
+                                                     params.conv_filter_strides,
+                                                     params.conv_filter_dilations,
+                                                     params.input_left_pads,
+                                                     params.input_right_pads,
+                                                     PassThrough{},
+                                                     PassThrough{},
+                                                     PassThrough{});
+    return !(conv_ptrs.back()->IsSupportedArgument(arg.get()));
 }

 bool test_conv3d_ndhwc_2gb_filters()
 {
+    using PassThrough = ck::tensor_operation::element_wise::PassThrough;
+    using namespace ck::utils;
+
     // >2GB Filters
-    ck::utils::conv::ConvParams params;
+    conv::ConvParams params;
     params.num_dim_spatial = 3;
     params.N = 2;
     params.K = 16;
@@ -130,39 +100,35 @@ bool test_conv3d_ndhwc_2gb_filters()
     params.input_left_pads = std::vector<ck::index_t>{1, 1, 1};
     params.input_right_pads = std::vector<ck::index_t>{1, 1, 1};

-    auto host_tensors =
-        ck::utils::conv::get_host_tensors<float,
-                                          float,
-                                          float,
-                                          ck::tensor_layout::convolution::NDHWC,
-                                          ck::tensor_layout::convolution::KZYXC,
-                                          ck::tensor_layout::convolution::NDHWK>(params, false);
-    const Tensor<float>& input = std::get<0>(host_tensors);
-    const Tensor<float>& weights = std::get<1>(host_tensors);
-    Tensor<float>& device_output = std::get<3>(host_tensors);
-
-    try
-    {
-        test::conv::RunConv<3>(params, input, weights, device_output);
-    }
-    catch(const std::runtime_error& err)
-    {
-        std::string err_msg{"Error! device_conv with the specified compilation parameters does "
-                            "not support this Conv problem"};
-        if(err.what() != err_msg)
-        {
-            return false;
-        }
-        return true;
-    }
-    std::cout << "Error: Failure checking oversized tensor!" << std::endl;
-    return false;
+    std::vector<test::conv::DeviceConvFwdNoOpPtr> conv_ptrs;
+    test::conv::get_test_convolution_fwd_instance<3>(conv_ptrs);
+
+    auto arg = conv_ptrs.back()->MakeArgumentPointer(nullptr,
+                                                     nullptr,
+                                                     nullptr,
+                                                     params.N,
+                                                     params.K,
+                                                     params.C,
+                                                     params.input_spatial_lengths,
+                                                     params.filter_spatial_lengths,
+                                                     params.GetOutputSpatialLengths(),
+                                                     params.conv_filter_strides,
+                                                     params.conv_filter_dilations,
+                                                     params.input_left_pads,
+                                                     params.input_right_pads,
+                                                     PassThrough{},
+                                                     PassThrough{},
+                                                     PassThrough{});
+    return !(conv_ptrs.back()->IsSupportedArgument(arg.get()));
 }

 bool test_conv3d_ndhwc_2gb_output()
 {
+    using PassThrough = ck::tensor_operation::element_wise::PassThrough;
+    using namespace ck::utils;
+
     // >2GB Output
-    ck::utils::conv::ConvParams params;
+    conv::ConvParams params;
     params.num_dim_spatial = 3;
     params.N = 2;
     params.K = 16;
@@ -174,39 +140,35 @@ bool test_conv3d_ndhwc_2gb_output()
     params.input_left_pads = std::vector<ck::index_t>{2, 2, 2};
     params.input_right_pads = std::vector<ck::index_t>{2, 2, 2};

-    auto host_tensors =
-        ck::utils::conv::get_host_tensors<float,
-                                          float,
-                                          float,
-                                          ck::tensor_layout::convolution::NDHWC,
-                                          ck::tensor_layout::convolution::KZYXC,
-                                          ck::tensor_layout::convolution::NDHWK>(params, false);
-    const Tensor<float>& input = std::get<0>(host_tensors);
-    const Tensor<float>& weights = std::get<1>(host_tensors);
-    Tensor<float>& device_output = std::get<3>(host_tensors);
-
-    try
-    {
-        test::conv::RunConv<3>(params, input, weights, device_output);
-    }
-    catch(const std::runtime_error& err)
-    {
-        std::string err_msg{"Error! device_conv with the specified compilation parameters does "
-                            "not support this Conv problem"};
-        if(err.what() != err_msg)
-        {
-            return false;
-        }
-        return true;
-    }
-    std::cout << "Error: Failure checking oversized tensor!" << std::endl;
-    return false;
+    std::vector<test::conv::DeviceConvFwdNoOpPtr> conv_ptrs;
+    test::conv::get_test_convolution_fwd_instance<3>(conv_ptrs);
+
+    auto arg = conv_ptrs.back()->MakeArgumentPointer(nullptr,
+                                                     nullptr,
+                                                     nullptr,
+                                                     params.N,
+                                                     params.K,
+                                                     params.C,
+                                                     params.input_spatial_lengths,
+                                                     params.filter_spatial_lengths,
+                                                     params.GetOutputSpatialLengths(),
+                                                     params.conv_filter_strides,
+                                                     params.conv_filter_dilations,
+                                                     params.input_left_pads,
+                                                     params.input_right_pads,
+                                                     PassThrough{},
+                                                     PassThrough{},
+                                                     PassThrough{});
+    return !(conv_ptrs.back()->IsSupportedArgument(arg.get()));
 }

 template <typename T>
-bool test_conv3d_ndhwc_instances(const std::vector<DeviceConvFwdNoOpPtr>& conv_ptrs)
+bool test_conv3d_ndhwc_instances(const std::vector<test::conv::DeviceConvFwdNoOpPtr>& conv_ptrs)
 {
-    ck::utils::conv::ConvParams params;
+    using namespace std::placeholders;
+    using namespace ck::utils;
+    namespace ctl = ck::tensor_layout::convolution;
+
+    conv::ConvParams params;
     params.N = 64;
     params.num_dim_spatial = 3;
     params.filter_spatial_lengths = std::vector<ck::index_t>{3, 3, 2};
@@ -216,53 +178,36 @@ bool test_conv3d_ndhwc_instances(const std::vector<DeviceConvFwdNoOpPtr>& conv_p
     params.input_left_pads = std::vector<ck::index_t>{1, 1, 1};
     params.input_right_pads = std::vector<ck::index_t>{1, 1, 1};

-    auto host_tensors =
-        ck::utils::conv::get_host_tensors<T,
-                                          T,
-                                          T,
-                                          ck::tensor_layout::convolution::NDHWC,
-                                          ck::tensor_layout::convolution::KZYXC,
-                                          ck::tensor_layout::convolution::NDHWK>(params);
-    const Tensor<T>& input = std::get<0>(host_tensors);
-    const Tensor<T>& weights = std::get<1>(host_tensors);
-    Tensor<T>& host_output = std::get<2>(host_tensors);
-    Tensor<T>& device_output = std::get<3>(host_tensors);
-
-    ck::utils::conv::run_reference_convolution_forward<3>(params, input, weights, host_output);
-    return ck::utils::conv::run_convolution_forward_instances<3>(
-        params, conv_ptrs, input, weights, device_output, host_output);
+    conv::ConvFwdOpInstance<T, T, T, ctl::NDHWC, ctl::KZYXC, ctl::NDHWK> conv_instance(params);
+
+    auto reference_conv_fwd_fun =
+        std::bind(conv::run_reference_convolution_forward<3, T, T, T>, params, _1, _2, _3);
+    OpInstanceRunEngine<T, T, T> run_engine(conv_instance, reference_conv_fwd_fun);
+    return run_engine.Test(conv_ptrs);
 }

 bool test_conv3d_ndhwc_bf16_instances()
 {
-    std::vector<DeviceConvFwdNoOpPtr> conv_ptrs;
-    ck::tensor_operation::device::device_conv3d_fwd_instance::
-        add_device_conv3d_fwd_xdl_ndhwc_kzyxc_ndhwk_bf16_instances(conv_ptrs);
-    return test_conv3d_ndhwc_instances<ck::bhalf_t>(conv_ptrs);
+    return test_conv3d_ndhwc_instances<ck::bhalf_t>(
+        ck::utils::conv::ConvolutionFwdInstances<ck::bhalf_t, ck::bhalf_t, ck::bhalf_t>::Get<3>());
 }

 bool test_conv3d_ndhwc_f16_instances()
 {
-    std::vector<DeviceConvFwdNoOpPtr> conv_ptrs;
-    ck::tensor_operation::device::device_conv3d_fwd_instance::
-        add_device_conv3d_fwd_xdl_ndhwc_kzyxc_ndhwk_f16_instances(conv_ptrs);
-    return test_conv3d_ndhwc_instances<ck::half_t>(conv_ptrs);
+    return test_conv3d_ndhwc_instances<ck::half_t>(
+        ck::utils::conv::ConvolutionFwdInstances<ck::half_t, ck::half_t, ck::half_t>::Get<3>());
 }

 bool test_conv3d_ndhwc_f32_instances()
 {
-    std::vector<DeviceConvFwdNoOpPtr> conv_ptrs;
-    ck::tensor_operation::device::device_conv3d_fwd_instance::
-        add_device_conv3d_fwd_xdl_ndhwc_kzyxc_ndhwk_f32_instances(conv_ptrs);
-    return test_conv3d_ndhwc_instances<float>(conv_ptrs);
+    return test_conv3d_ndhwc_instances<float>(
+        ck::utils::conv::ConvolutionFwdInstances<float, float, float>::Get<3>());
 }

 bool test_conv3d_ndhwc_int8_instances()
 {
-    std::vector<DeviceConvFwdNoOpPtr> conv_ptrs;
-    ck::tensor_operation::device::device_conv3d_fwd_instance::
-        add_device_conv3d_fwd_xdl_ndhwc_kzyxc_ndhwk_int8_instances(conv_ptrs);
-    return test_conv3d_ndhwc_instances<int8_t>(conv_ptrs);
+    return test_conv3d_ndhwc_instances<int8_t>(
+        ck::utils::conv::ConvolutionFwdInstances<int8_t, int8_t, int8_t>::Get<3>());
 }

 } // anonymous namespace
@@ -293,7 +238,7 @@ int main()
     std::cout << "\ntest_conv3d_ndhwc_f32_instances ..... " << (res ? "SUCCESS" : "FAILURE")
               << std::endl;
     res = test_conv3d_ndhwc_int8_instances();
-    std::cout << "\ntest_conv3d_ndhw_cint_8instances ..... " << (res ? "SUCCESS" : "FAILURE")
+    std::cout << "\ntest_conv3d_ndhwc_int8_instances ..... " << (res ? "SUCCESS" : "FAILURE")
               << std::endl;
     return res ? 0 : 1;
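The >2GB tests above no longer allocate tensors or catch exceptions: they build an argument from null data pointers, which is enough for the instance to inspect the problem shape, and pass when IsSupportedArgument rejects it. Condensed into a sketch using the names from the diff (the helper name expect_unsupported is ours):

    // Probe whether an instance accepts an oversized problem, without allocating.
    bool expect_unsupported(const ck::utils::conv::ConvParams& params,
                            const test::conv::DeviceConvFwdNoOpPtr& conv_ptr)
    {
        using PassThrough = ck::tensor_operation::element_wise::PassThrough;
        // Null buffers are fine here: only shapes, strides, and pads are examined.
        auto arg = conv_ptr->MakeArgumentPointer(nullptr, nullptr, nullptr,
                                                 params.N, params.K, params.C,
                                                 params.input_spatial_lengths,
                                                 params.filter_spatial_lengths,
                                                 params.GetOutputSpatialLengths(),
                                                 params.conv_filter_strides,
                                                 params.conv_filter_dilations,
                                                 params.input_left_pads,
                                                 params.input_right_pads,
                                                 PassThrough{}, PassThrough{}, PassThrough{});
        // The test passes exactly when the instance declines the problem.
        return !conv_ptr->IsSupportedArgument(arg.get());
    }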
@@ -10,7 +10,8 @@
 #include "host_tensor.hpp"
 #include "sequence.hpp"

-namespace {
+namespace test {
+namespace conv {

 template <ck::index_t... Is>
 using S = ck::Sequence<Is...>;
@@ -19,6 +20,9 @@ using InElementOp = ck::tensor_operation::element_wise::PassThrough;
 using WeiElementOp = ck::tensor_operation::element_wise::PassThrough;
 using OutElementOp = ck::tensor_operation::element_wise::PassThrough;

+using DeviceConvFwdNoOpPtr =
+    ck::tensor_operation::device::DeviceConvFwdPtr<InElementOp, WeiElementOp, OutElementOp>;
+
 static constexpr auto ConvFwdDefault =
     ck::tensor_operation::device::ConvolutionForwardSpecialization::Default;
@@ -62,26 +66,14 @@ using DeviceConvNDFwdInstance = ck::tensor_operation::device::
         1>; // CThreadTransferDstScalarPerVector
 // clang-format on

-} // namespace
-
-namespace test {
-namespace conv {
-
 template <ck::index_t NDim,
           typename InDataType = float,
           typename WeiDataType = float,
           typename OutDataType = float>
-void RunConv(const ck::utils::conv::ConvParams& params,
-             const Tensor<InDataType>& input,
-             const Tensor<WeiDataType>& weights,
-             Tensor<OutDataType>& output)
+void get_test_convolution_fwd_instance(std::vector<DeviceConvFwdNoOpPtr>& instances)
 {
-    ck::utils::conv::run_convolution_forward<NDim,
-                                             InDataType,
-                                             WeiDataType,
-                                             OutDataType,
-                                             DeviceConvNDFwdInstance>(
-        params, input, weights, output);
+    using ConvInstanceT = DeviceConvNDFwdInstance<NDim, InDataType, WeiDataType, OutDataType>;
+    instances.emplace_back(std::make_unique<ConvInstanceT>());
 }

 } // namespace conv
 add_test_executable(test_reference_conv_fwd reference_conv_fwd.cpp)
-target_link_libraries(test_reference_conv_fwd PRIVATE host_tensor)
+target_link_libraries(test_reference_conv_fwd PRIVATE host_tensor conv_fwd_util)
-#include <algorithm>
 #include <cmath>
 #include <cstdlib>
 #include <half.hpp>
@@ -10,6 +9,7 @@
 #include "config.hpp"
 #include "conv_fwd_util.hpp"
 #include "element_wise_operation.hpp"
+#include "fill.hpp"
 #include "host_tensor.hpp"
 #include "reference_conv_fwd.hpp"
 #include "tensor_layout.hpp"
@@ -19,35 +19,6 @@ using InElementOp = ck::tensor_operation::element_wise::PassThrough;
 using WeiElementOp = ck::tensor_operation::element_wise::PassThrough;
 using OutElementOp = ck::tensor_operation::element_wise::PassThrough;

-template <typename T>
-struct FillMonotonicSeq
-{
-    T m_init_value{0};
-    T m_step{1};
-
-    template <typename ForwardIter>
-    void operator()(ForwardIter first, ForwardIter last) const
-    {
-        std::generate(first, last, [=, n = m_init_value]() mutable {
-            auto tmp = n;
-            n += m_step;
-            return tmp;
-        });
-    }
-};
-
-template <typename T>
-struct FillConstant
-{
-    T m_value{0};
-
-    template <typename ForwardIter>
-    void operator()(ForwardIter first, ForwardIter last) const
-    {
-        std::fill(first, last, m_value);
-    }
-};
-
 template <ck::index_t NDim,
           typename InDataType = float,
           typename WeiDataType = float,
@@ -55,8 +26,8 @@ template <ck::index_t NDim,
           typename InLayout = ck::tensor_layout::convolution::NHWC,
           typename WeiLayout = ck::tensor_layout::convolution::KYXC,
           typename OutLayout = ck::tensor_layout::convolution::NHWK,
-          typename FillInputOp = FillMonotonicSeq<InDataType>,
-          typename FillWeightsOp = FillConstant<WeiDataType>>
+          typename FillInputOp = ck::utils::FillMonotonicSeq<InDataType>,
+          typename FillWeightsOp = ck::utils::FillConstant<WeiDataType>>
 Tensor<OutDataType>
 run_reference_convolution_forward(const ck::utils::conv::ConvParams& params,
                                   const FillInputOp& fill_input_op = FillInputOp{},
@@ -251,7 +222,7 @@ bool test_conv1d_nwc()
                                       ck::tensor_layout::convolution::NWC,
                                       ck::tensor_layout::convolution::KXC,
                                       ck::tensor_layout::convolution::NWK>(
-        params, FillMonotonicSeq<float>{0.f, 0.1f});
+        params, ck::utils::FillMonotonicSeq<float>{0.f, 0.1f});

     ref_dims = std::vector<std::size_t>{2, 16, 16};
     ref_data = std::vector<float>{
@@ -349,7 +320,7 @@ bool test_conv3d_ncdhw()
                                       ck::tensor_layout::convolution::NCDHW,
                                       ck::tensor_layout::convolution::KCZYX,
                                       ck::tensor_layout::convolution::NKDHW>(
-        params, FillMonotonicSeq<float>{0.f, 0.1f});
+        params, ck::utils::FillMonotonicSeq<float>{0.f, 0.1f});

     std::vector<std::size_t> ref_dims{1, 1, 4, 4, 4};
     std::vector<float> ref_data{
         407.7, 410.40002, 413.09998, 415.80002, 423.90002, 426.6, 429.30002, 432.,
@@ -383,7 +354,7 @@ bool test_conv3d_ncdhw()
                                       ck::tensor_layout::convolution::NCDHW,
                                       ck::tensor_layout::convolution::KCZYX,
                                       ck::tensor_layout::convolution::NKDHW>(
-        params, FillMonotonicSeq<float>{0.f, 0.1f});
+        params, ck::utils::FillMonotonicSeq<float>{0.f, 0.1f});

     ref_dims = std::vector<std::size_t>{1, 2, 4, 4, 4};
     ref_data = std::vector<float>{
         2756.7002, 2764.7998, 2772.9001, 2781., 2853.9001, 2862., 2870.1, 2878.2002,
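The FillMonotonicSeq and FillConstant functors removed above now live in ck::utils (fill.hpp); their iterator-based interface is unchanged. A usage sketch, matching the calls in the diff:

    // Fill a buffer with 0.0, 0.1, 0.2, ... (up to float rounding), as above.
    std::vector<float> v(8);
    ck::utils::FillMonotonicSeq<float>{0.f, 0.1f}(v.begin(), v.end());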