[MIOpen Downstream] Fix Reduction Kernel (#34)
* Tiny fix in using data type template parameters in blockwise and direct_threadwise kernels
* Fix with regard to implementing GetZeroVal() in both kernel and host
* Avoid converting to compType from dstDataType before writing the output value
* Add half_t support to NumericLimits and make GetZeroVal() of the binary operators constexpr
* Add CONSTANT decorator for descriptor read buffer
* Use get_thread_local_1d_id() for thread local Id
* Rename GetZeroVal() to GetReductionZeroVal() in the kernels
* Remove constexpr from initialized zeroVal and tiny fix in reduction_operator.hpp
* Occasional tiny simplification and update in the kernel files
* Update to re-order tensor dimensions on the host, split second_call kernel wrapper files and simplify reduce_all kernel wrappers
* Update to remove OpenCL tidy checking failures
* Update for better readability
* Remove unused code and unneeded template parameters in the kernel wrappers
Co-authored-by: Chao Liu <chao.liu2@amd.com>