- 19 Jan, 2021 1 commit
-
-
Shucai Xiao authored
* add the and operator * clang format * add unit tests for the and operator * clang format * change the and name to logical_and and add the logical_or, logical_xor * clang format * add onnx unit tests for or and xor * add more unit tests
-
- 18 Jan, 2021 2 commits
-
-
Paul Fultz II authored
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>
-
kahmed10 authored
* initial testing * initial testing * add dequantize * formatting * add tests * formatting * revert file * add parse files * formatting * add axis tuning and fix tests * formatting * add tests and fix int8 * formatting * fix tidy * test with int32 * add default name and change string to upper * formatting * remove boost call * refactor to use tune_axis * formatting
-
- 08 Jan, 2021 1 commit
-
-
Paul Fultz II authored
* Add build and test github workflow * Fix cget command * Remove def-requirements.txt * Add tmate session to debug workflow * Run tmate session after installing dependencies * Print date periodically * Add clang tidy action * Separate build and run container in two different jobs * Run bash script * Remove interactive flag * Try to mount the files * Try to use the github workspace * Without double braces * Use env variable * Pipe bash script in * Run using hip-clang * Use correct path * Add verbose * Remove j flag * Only run for onnx file to debug * Manually run clang-tidy * Remove quiet flag * Print header file * Printout environment * Remove extra defines * Remove fixits and config flag * Show ldd * Add tmate session * Run onnx protobuf first * Generate proto for tensorflow * Update cppcheck version * Fix some cppcheck issues * Add const * Cppcheck fixes * Formatting * Fix more cppcheck issues * Run two jobs * Cache analysis and run format checking * Fix yaml issues * Fix yaml issues * Fix indentation * Switch to hip-clang for main docker file * Use hip-clang in the readme * Fixes for jenkins * Use ccache to build * Combine file * Set restore keys * Change stage name * Build with ccache * Add missing dependency for ccache * Build debug with codecov * Fix workflow syntax * Fix list * Use quotes * Go to correct build path * Install lcov * Use sudo * Echo all commands * Setup tmate * Add verbose output * Build with cmake directly * Add pthread flag * Remove python config * Continue on error * Use on or off for cmake flag * Use always upload cache * Verbose output * Verbose output from build * Build one target * Reduce debug symbols * Increase garbage collection * Remove dmesg * Increase it to 20 * Update rocm cmake version * Remove jobs from jenkins * Run on all 3 ubuntus * Remove gcc 5 jobs * Dont add flag on 16.04 * Only upload coverage on 18.04 * Dont build for ubuntu 20.04 * Use matrix.os * Use O2 for hip-clang since lower optimizations are broken * Use rocm 3.0 * Pass ccache as cmake variable instead of env variable * Build miopen from source * Show ccache statistics * Print log information * Set compression level * Use hash dir * Set hashdir * Install clang ocl from system * Up compression level * Add locale * Increase cache size to 1G * Lower compression level to 9 * Remove split dwarf * Remove Og * Add back Og * Separate debug and codecov * Add missing backslash * Garbage collect more often * Add missing locales package * Use Os * Install onednn in docker and run tests * Include target headers in tests * Increase timeout * Remove if condition * Make flag public * Suppress memory leaks in onednn * Use equal * Add gh annotations * Update rocm-cmake version * Add ldconfig
Co-authored-by: Shucai Xiao <shucai@gmail.com>
-
- 07 Jan, 2021 1 commit
-
-
Paul Fultz II authored
-
- 06 Jan, 2021 1 commit
-
-
Shucai Xiao authored
* add an api get_main_module * clang format * modify onnx unit test for module * clang format * refactor ops unit test with the get_main_module * clang format * code backup * clang format * refine module c api * add python api for module * clang format * fix a python api issue * clang format * fix cppcheck error * clang format * refine unit tests changes * clang format * code backup * code backup * clang format * defer some changes to later PRs * change return of get_main_module from ref to pointer * clang format * add unit tests for the get_main_module_api * clang format * fix cppcheck error * clang format * fix cppcheck error * clang format * add more unit tests for more code change coverage * clang format * fixed a unit test error * clang format * fix unit test * clang format * code backup * code change for more code coverage * change program to module in various passes and matcher * clang format * modify the pass API * code backup * code backup * clang format * code backup * clang format * Add option to no generate a destroy method * Formatting * fix some review comments * clang format * fix review comments * clang format * clang format * code backup * code backup * clang format * fix cppcheck errors * clang format * clang format * fix build errors * clang format * modify gpu unit tests to using module * clang format * fix cppcheck error * clang format * Add flag to enable cpu backend * Make buffers shared * Enable optimizations * Formatting * fix review comments * code backup * clang format * code backup * clang format * fix a bug related to a unit test * clang format * clang format * fix a build error * remove unnecessary code * remove unnecessary files * code backup * clang format * remove the compile function from the module class * clang format * clang format * remove the context parameter from the from_value method of the module class * code refinement * clang format * merge changes from develop branch * clang format * fix cppcheck error * clang format * 
fix a build error * fixed a merge error * fix cppcheck error * fixed review comments * clang format * fix cppcheck error * fix a cppcheck error * fix cppcheck error * fix build error caused by merge * Add missing has_op function * Formatting * merge changes from develop branch * fix a cppcheck error * fixed some review comments * clang format * remove the begin/end function of the program class * clang format * refine code and fix cppcheck error * clang format * fix review comments * clang format * fix review comments * clang format * add unit tests for more code coverage * clang format * fix review comments * clang format * fix review comments * clang format * fix a build error in debug mode * clang format Co-authored-by:Paul <pfultz2@yahoo.com>
-
- 14 Dec, 2020 1 commit
-
-
Paul Fultz II authored
* Add flag to enable cpu backend * Make buffers shared * Enable optimizations * Add onednn * Formatting * Formatting * Add dnnl header * Formatting * Rewrite rnn first * Formatting * Call reference implementation * Formatting * Make literal data shared * Formatting * Add convolution * Formatting * Compensate for dilation * Formatting * Use name/make_op instead * Formatting * Rename gemm header * Formatting * Add dnnl convolution/gemm operators * Formatting * Add eliminate_contiguous * Add faster pointwise operators * Formatting * Formatting * Formatting * Add dnnl op class * Formatting * Add add op * Formatting * Add concat operator * Formatting * Add more ops * Create descriptor during finalization * Formatting * Dont rewrite pooling * Enable memory coloring * Formatting * Add output aliases * Formatting * Fix errors * Formatting * Convert literals * Add missing file * Remove batch_norm * Formatting * Use strides * Formatting * Add some debug checks * Formatting * Fix bug in adjusting shape for gemm * Formatting * Fix fallback dot operator * Zero initialize buffers * Add support for group convolutions * Formatting * Make adjust allocation target independent * Formatting * Enable adjust_allocation for gpu/cpu * Formatting * Add copy to allocation model * Formatting * Add copy operator * Formatting * Better handling of output parameters in adjust_allocation * Formatting * Build with dnnl * Make dnnl required * Fix compile error * Tidy fixes * Formatting * Tidy fixes * Formatting * Fix more tidy issues * Formatting * Add mul op * Add mul op * Set c compiler to clang as well * Compensate for normalized compute shape * Formatting * Fix cppcheck errors * Formatting * Add onednn library to hcc * Guard clang pragmas * Disable cpu mode for gcc for now * Leave it enabled for gcc 7 * Fix cppcheck suppression * Fix compile error on gcc 5 * Remove unused code
Co-authored-by: Shucai Xiao <shucai.xiao@amd.com>
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>
-
- 26 Nov, 2020 1 commit
-
-
kahmed10 authored
* initial testing * change tolerance * remove extra changes
Co-authored-by: Shucai Xiao <shucai.xiao@amd.com>
Co-authored-by: Paul Fultz II <pfultz2@yahoo.com>
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>
-
- 20 Nov, 2020 1 commit
-
-
Paul Fultz II authored
* Unify the vectorized and non-vectorized path * Formatting * Make fusion easily extendable * Add skip layernorm fusion * Formatting * Call correct layernorm function * Fix compile errors * Add DCE * Add test for skip layernorm * Formatting * Remove unused typedef * Formatting * Fix tidy issues * Formatting
Co-authored-by: Shucai Xiao <shucai.xiao@amd.com>
-
- 16 Nov, 2020 1 commit
-
-
Shucai Xiao authored
* add a pass to normalize ops * clang format * add unit tests * clang format * code backup * clang format * code backup * clang format * add support for slice in the normalize_op function * clang format * add operation method api for whether we need to call normalize_op * clang format * fix review comments * clang format * rename a function name * clang format * change compute_shape to normalize_compute_shape for corresponding operators * clang format * remove unnecessary code * fix various issues * clang format * add attributes to operators having axis attributes * clang format * fixed jenkins build error * clang format * fix a bug related to slice * clang format * code backup * clang format * code backup * clang format * rename a file * fix cppcheck error * some code refinement * clang format * change attributes to enum * clang format * refine the enum * clang format * remove unnecessary code * add unit tests for more code coverage and fixed a bug * clang format * remove unnecessary changes * change normalize_axes to normalize * clang format * revert back the changes in broadcast.hpp * rename normalize_axes to normalize * fix review comments * clang format * Add flag to enable cpu backend * Make buffers shared * Enable optimizations * Formatting * Try to avoid ambiguous assign in value class * fixed a build error * clang format * add the normalize_ops pass to the ref target * refactor program to module to normalize_ops pass
Co-authored-by: Paul <pfultz2@yahoo.com>
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>
-
- 11 Nov, 2020 1 commit
-
-
Shucai Xiao authored
* code backup * clang format * change corresponding tool files * clang format
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>
-
- 10 Nov, 2020 1 commit
-
-
Paul Fultz II authored
* Add flag to enable cpu backend * Make buffers shared * Enable optimizations * Formatting * Enable cpu backend for gcc builds
-
- 09 Nov, 2020 1 commit
-
-
Paul Fultz II authored
* Add compiler flags * Add missing include * Add filesystem header * Formatting * Add tmp_dir to run * Formatting * Kernel compilation and launching * Formatting * Separate pack_args * Formatting * Add alignment tests * Formatting * Add compile test * Formatting * Complete compile test * Formatting * Use is_regular_file free function * Fix is_regular_file call * Fix tidy issues * Fix tidy * Fix tidy issue * Print size in read_buffer to debug issue on jenkins * Add hip flags before src file * Fix reading output files * Fix unused variable warning * Formatting * Formatting * Disable tidy check
Co-authored-by: Shucai Xiao <shucai.xiao@amd.com>
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>
-
- 28 Oct, 2020 1 commit
-
-
Paul Fultz II authored
* Fix fusions in bert model * Formatting * Add unit tests * Formatting * Fix one_half matcher * Workaround ICE on gcc * Formatting * Tidy fixes
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>
-
- 15 Oct, 2020 1 commit
-
-
turneram authored
* Added greater and less operators * Fixed ops_test.cpp * Set commutative to false for less, greater * Refactored parse_equal/less/greater into parse_compare_op * Removed unnecessary function attributes() from greater.hpp/less.hpp * Added op_name arguments * Removed local settings * Formatting * Missing comma * Formatting * Formatting * Formatting * Formatting * Formatting * Missing space
Co-authored-by: Paul Fultz II <pfultz2@yahoo.com>
-
- 09 Oct, 2020 1 commit
-
-
Paul Fultz II authored
* Add initial multi stream analysis * Formatting * Add more tests * Formatting * Remove comment * Analyze streams on the gpu * Formatting * Fix nstream * Formatting * Add test for return * Formatting * Make sure return has a stream assignment * Formatting * Fix asserts and checks * Improve error message for out-of-order sequence * Formatting
-
- 08 Oct, 2020 1 commit
-
-
kahmed10 authored
* add flag * formatting * remove env variable * fix api expression * add api test * add api test * add op test * formatting * fix function name * fix syntax * formatting * modify test * remove test and update doc * move test to new file * formatting * revert test files * rewrite check * New
Co-authored-by: Paul Fultz II <pfultz2@yahoo.com>
-
- 07 Oct, 2020 1 commit
-
-
Paul Fultz II authored
* Enforce op name for check_shapes class * Add test for scalar * Formatting
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>
-
- 30 Sep, 2020 1 commit
-
-
Paul Fultz II authored
* Make global variables const * Tidy fixes * Disable some lints * Formatting * Fix tidy const * Formatting * Add missing const keywords * Formatting * More fixes * Fix remaining tidy issues * Formatting * Fix rocblas function call * Formatting * Fix nodiscard warnings * Formatting * Use named parameters * Remove overload * Add overload * Remove noncps * Use named param for node * Add auto register header * Use named parameters * Refactor jenkinsfile * Fix shadow * Add missing body variable * Add more const methods * Add hip-clang docker builds * Remove comments * Add clang-format * Add more const * Formatting * Rename stage * Disable check * Add another const * Add python 2 dev packages * Add sphinx to dockerfile
-
- 14 Sep, 2020 1 commit
-
-
Paul Fultz II authored
* Fuse gemm in fuse ops * Formatting * Add const ref * Remove assert * Skip already fused gemms * Skip already fused gemm * Formatting * Use float_equal * Avoid non-standard shapes for inputs * Formatting
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>
-
- 10 Sep, 2020 1 commit
-
-
Paul Fultz II authored
* Add save/load functions * Formatting * Add loading and saving to the driver * Formatting * Add return * Serialize the context with the program * Formatting * Add python API * Formatting * Add c/c++ apis * Formatting * Add tests * Formatting * Fix tidy error * Fix python doc * Restore python code * Add function name to errors * Formatting * Use lvalue for writing * Serialize context * Fix convolution and pooling operator for miopen * Formatting * Add const ref * Set target name to gpu * Add target tests * Formatting * Move register target to cpp file * Fix target test * Use make_target in driver * Formatting * Use make_target for the API * Formatting * Add cpu include * Increase timeout * Add more tests * Formatting Co-authored-by:
Shucai Xiao <shucai.xiao@amd.com> Co-authored-by:
mvermeulen <5479696+mvermeulen@users.noreply.github.com>
-
- 31 Aug, 2020 2 commits
-
-
Shucai Xiao authored
* do not reflect activation descriptor for some miopen operators * clang format
-
kahmed10 authored
* fix parsing to kdims * add 5d size * fix assert * add 3d test * formatting
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>
-
- 27 Aug, 2020 2 commits
-
-
Shucai Xiao authored
* Add initial serialization * Formatting * Add unit tests * Formatting * Add tests for serialization * Formatting * Use or not and * Add value test * Formatting * Add more tests * Add shape serialization * Formatting * Add serialization for literal and argument * Formatting * Add from and to value to operation * Formatting * Serialize empty types * Formatting * Tidy fixes * Formatting * Fix tidy issues * Formatting * Reformat value type macro * Formatting * Handle enum types * Formatting * Use const ref * Update * Add tests for to_value/from_value * Formatting * code backup * clang format * code backup * clang format * code backup * clang format * remove the from/to_value method for the generate context struct * clang format * code backup * Dont print literal data in hip_copy_literal * clang format * add unit test to have better coverage * remove unnecessary code * remove unnecessary code * fix review comments * clang format * fix review comments
Co-authored-by: Paul <pfultz2@yahoo.com>
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>
-
Shucai Xiao authored
* add bool type * code backup * code backup * clang format * fix build warnings * clang format * add the equal operator * add the equal operator * clang format * remove unnecessary code * refine unit tests * clang format * fix review comments and a bug * clang format * additional changes * clang format * fix cppcheck error * add bool type in c api * fix cppcheck error * fix review comments * fix cppcheck error * fix a build error related to gcc * fix cppcheck error * fix cppcheck error * added the equal operator to register list * add parsing boolean type * clang format * fix bool type issue for python output * clang format * add support for automatic multibroadcast of the equal operator * additional unit tests for more code coverage * clang format * missing an onnx file Co-authored-by:Paul Fultz II <pfultz2@yahoo.com>
-
- 26 Aug, 2020 1 commit
-
-
Paul Fultz II authored
* Add make_op function * Formatting * Add more values * Formatting * Remove templates parse_conv functions * Formatting * Remove mat_mul template * Formatting * Reduce header includes * Fix compiling for gpu * Formatting * Use make_op in lowering * Formatting * Sort lines * Formatting * Add more tests * Formatting * Fix tidy error * Formatting * Add const refs * Add explicit this * Add more const refs * Sort the program * Remove commented out code * Formatting * Infer gpu prefix * Formatting Co-authored-by:mvermeulen <5479696+mvermeulen@users.noreply.github.com>
-
- 25 Aug, 2020 1 commit
-
-
Paul Fultz II authored
* Use increment instead of division to compute register offset * Formatting * Limit layernorm to 1024 elements * Formatting * Add verification to driver * Formatting * Remove early return * Use block_size 256 * Vectorize the kernel * Formatting * Convert to vector type * Add layernorm tests * Formatting * Formatting * Refactor layernorm to run both algos * Formatting * Fix compile error * Fix tidy warnings * Formatting * Add layernorm function * Formatting
-
- 21 Aug, 2020 1 commit
-
-
kahmed10 authored
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>
-
- 20 Aug, 2020 1 commit
-
-
Paul Fultz II authored
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>
-
- 19 Aug, 2020 1 commit
-
-
Shucai Xiao authored
* move initialization of miopen fusion operators to finalize method * clang format * fix cppcheck error * clang format * fix review comments * clang format * removed an unnecessary assert
-
- 18 Aug, 2020 1 commit
-
-
Paul Fultz II authored
* Register ops for main migraphx * Formatting * Register cpu ops * Formatting * Show list of operators in the driver * Formatting * Simplify register * Try to register gpu ops * Fix compiler errors * Register rest of the gpu operators * Add some tests * Formatting * Fix gcc compiler warnings * Formatting * Fix tidy warnings * Fix compile error * Use correct op name * Register layer norm * Use const ref * Make run const
-
- 14 Aug, 2020 1 commit
-
-
kahmed10 authored
* fix pad calc * bert tf passes correctness * formatting * add test * formatting * remove comment * add inline * formatting * fix order for literal * formatting * test no mul_add * formatting * debug layernorm * debug layernorm * manual merge * more progress * formatting * remove miopen batchnorm * remove headers * Fix compile error with no dpp reductions * fix indices * formatting * change matcher * formatting * remove binds * formatting * disable tf matcher * formatting * use fast div * formatting * fix matcher * formatting * remove comment * move find_matches * add assert * formatting * fix deepcode issue Co-authored-by:
Paul <pfultz2@yahoo.com> Co-authored-by:
Shucai Xiao <shucai.xiao@amd.com> Co-authored-by:
mvermeulen <5479696+mvermeulen@users.noreply.github.com>
-
- 13 Aug, 2020 2 commits
-
-
Shucai Xiao authored
* initial progress * formatting * add pooling changes * formatting * change eliminate_pad * formatting * rename var * fomratting * update op shape test and compute * formatting * revert conv constructor * formatting * change initializer * formatting * fix tidy * change quant conv and shape check * add tests and fixes * formatting * fix type * fix conv test * formatting * add pooling and bn tests * formatting * add inconsistent attr tests * fix padding issue * formatting * progress on 1d to 2d * formatting * change compute and compile functions * formatting * fix duplicate * fix conflict * fix issue with 1d conv * formatting * add check for 3d limit * rename function * formatting * update to MIOPen 2.3 * add support for nd pooling * formatting * test miopen 2.4 * change function name * rename functions * formatting * add op_shape test * add gpu ops tests * formatting * initial progress * formatting * add pkg-config * add to support asymmetric padding of averagepool * clang format * fix bug for average pooling * clang format * fix a bug * add unit tests for the asymmetric padding of averagepool * clang format * change functions * formatting * additional code refinement * clang format * check existing tests * formatting * change to copy_backward * formatting * change for loop to transform * formatting * add tests * formatting * remove comment * add more tests * remove an optimization for pooling * clang format * add and fix unit tests * clang format * update gpu miopen calls * formatting * initial progress * add cpu impl and tests * formatting * add NOLINT * add 3d test * formatting * add more op_shape tests * test diff miopen version * add submodule onnx * add pooling shape tests * fix error msg * add onnx_test_backend * reorganize python code * temp disable test * fix cppcheck error * fix cppcheck error * code backup * add support device choice * refine onnx backend test * revert to miopen 2.4 * fix review comments * fix review comments * clang format * fixed review 
comments * clang format * fix cppcheck error * copy onnx_backend_test to dest when building * add testdata folder * fix bounds * formatting * code backup * code backup * remove unnecessary file * fix various bugs * remove unnecessary changes * remove unnecessary submodule * remove unnecessary lines * fix algorithm * formatting * refine onnx backend unit tests * pin numpy version * fix build issue * fixed a filename to be copied * add the onnx dependency in docker image * ensure results are copied back correctly * specify onnx version * update excluded tests * remove unnecessary log info * turn on more unit tests * restrict onnx backend test to python 3.x * clang format * refine retrieving the input parameters * clang format * fix program input parameter names * clang format * avoid running onnx test in python 2.x * fix cppcheck error * fix python2.7 backend unit tests error * clang format * resolve the issue of ensure data copy to be completed * clang format * fix review comments * fix onnx backend unit test error * another change to make onnx backend test pass * clang format * fix onnx backend test error * clang format * disable onnx backend test to try * build try * update Dockerfile to try onnx backend test * remove unnecessary code * fix a bug in copying program * clang format * update dockerfile to include onnx * fix review comments * add the pytest module to the container * exclude real model to avoid to be downloaded * resolve the sync device for data copy from gpu to cpu * clang format * fix review comments * clang format * move sync_device after memory_coloring Co-authored-by:
Khalique <15948690+kahmed10@users.noreply.github.com> Co-authored-by:
mvermeulen <5479696+mvermeulen@users.noreply.github.com> Co-authored-by:
Paul Fultz II <pfultz2@yahoo.com>
-
Shucai Xiao authored
* code backup * code backup
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>
-
- 12 Aug, 2020 1 commit
-
-
Paul Fultz II authored
* Add reduce dims * Formatting * Reduce dims on the gpu * Formatting * Fix tidy issues * Convert to assert * Reduce dims for contiguous * Formatting * Remove move * Fix arguments used * Formatting * Fix warnings * Formatting
Co-authored-by: Shucai Xiao <shucai.xiao@amd.com>
-
- 28 Jul, 2020 1 commit
-
-
kahmed10 authored
* initial progress * formatting * remove comment * update reflect and error msg * formatting * remove header * move and rename function * formatting * fix tidy and remove extra function
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>
-
- 21 Jul, 2020 2 commits
-
-
kahmed10 authored
* add reflect method * add reflect to cpu_op
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>
-
Paul Fultz II authored
* Fix bug in eliminate_concat with negative axis * Formatting * Fix unused parameter * Formatting
Co-authored-by: Shucai Xiao <shucai.xiao@amd.com>
-
- 10 Jul, 2020 2 commits
-
-
Paul Fultz II authored
* Add initial optimization when using a mul over a sliced convolution * Formatting * Add more tests * Formatting * Convert to an assert * Check if used once * Formatting * Add test with horiz fusion * Formatting * Optimize nested slice * Formatting * Fix test * Add const refs * Remove unnecessary assert Co-authored-by:mvermeulen <5479696+mvermeulen@users.noreply.github.com>
-
Shucai Xiao authored
* Initial cpu conv-nd * Formatting * Make index signed * Formatting * Assert the indices are greater than 0 * Use equal instead of lexicographical_compare * Formatting * change the batchnorm cpu implementation to support multiple input dimensions * clang format * add unit tests for cpu batch_norm nd implementation * clang format * support nd batchnormalization * clang format * add rewrite batch_norm unit tests * clang format * remove a unit test * Fix tidy errors * Formatting * Handle different types * Formatting * Fix nested visits * Formatting * Add 3d conv test * Formatting * revert unnecessary changes * remove a print line * Fix ICE * Formatting * fix the per_activation mode of 2d * clang format * code clean up * clang format * add 1d and 3d gpu unit test * clang format * add unit test for rewrite_batchnorm * clang format * additional refinement * fix review comments * added a unit test to have more code coverage Co-authored-by:
Paul <pfultz2@yahoo.com> Co-authored-by:
mvermeulen <5479696+mvermeulen@users.noreply.github.com>
-