- 08 Jun, 2021 1 commit
-
-
Cagri Eryilmaz authored
* init reverseOp branch: ref op + ref test. WIP * first passing basic test * cleanup * additional axis implementation * additional test * ref op implementation vec to int for axis * ref op test change for axis * initial gpu files and test * updates to implementation and test * fixed some issues * clang format * cleanup * formatting * removing comments * remove local size, back to default * update tests: replace with std functions * multiple axis for reverse op * fix a build error * clang format * more tests * fix a bug for the reverse device function * clang format * fix a bug * clang format * ref test updates, multiaxis * formatting
Co-authored-by: Shucai Xiao <Shucai.Xiao@amd.com>
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>
-
- 03 May, 2021 1 commit
-
-
Paul Fultz II authored
* Remove unused data types * Formatting * Reduce types generated for hip kernels * Formatting * Fix onnx tests * Formatting
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>
-
- 05 Mar, 2021 1 commit
-
-
kahmed10 authored
* fix relu6 * add more transposes * add multi output * formatting * add tests * formatting * fix tests * change to_nchw for outputs * add python api * fix cppcheck * remove variable * fix lambda * add multi_output test * add more tests and merge * fix help message * debugging work * fix valid op string * formatting * manual merge * mark function as const
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>
Co-authored-by: Shucai Xiao <shucai@gmail.com>
-
- 26 Feb, 2021 1 commit
-
-
Cagri Eryilmaz authored
* changes for not operator * changed name of the op from unary_not to not * Added tests for op and onnx parsing * reordering not_test in onnx_test.cpp * not operator -- gpu implementation * added bool test for not operator * Added test and missing links for not operator on GPU * typo fix * adding .onnx test files for not operator * formatting
Co-authored-by: Shucai Xiao <shucai@gmail.com>
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>
-
- 08 Feb, 2021 1 commit
-
-
Paul Fultz II authored
* Add eliminate_data_type pass * Formatting * Auto convert quant ops * Formatting * Flip the order of decompose * Compute max size differently * Formatting * Clamp values in convert * Formatting * Fix loss of precision in reduce * Formatting * Fix bugs in reduction * Fix accumulator type in reference softmax implementation * Formatting * Update convert test * Remove unused variables * Remove unnecessary quant_dot check * Formatting * Add tests * Formatting * Remove unused code * Remove duplicate ops * Remove blaze dependency * Use set since shape::type_t is not hashable on gcc 5 * Formatting
Co-authored-by: Shucai Xiao <shucai@gmail.com>
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>
-
- 19 Jan, 2021 1 commit
-
-
Shucai Xiao authored
* add the and operator * clang format * add unit tests for the and operator * clang format * change the and name to logical_and and add the logical_or, logical_xor * clang format * add onnx unit tests for or and xor * add more unit tests
-
- 08 Jan, 2021 1 commit
-
-
Paul Fultz II authored
* Add build and test github workflow * Fix cget command * Remove def-requirements.txt * Add tmate session to debug workflow * Run tmate session after installing dependencies * Print date periodically * Add clang tidy action * Separate build and run container in two different jobs * Run bash script * Remove interactive flag * Try to mount the files * Try to use the github workspace * Without double braces * Use env variable * Pipe bash script in * Run using hip-clang * Use correct path * Add verbose * Remove j flag * Only run for onnx file to debug * Manually run clang-tidy * Remove quiet flag * Print header file * Printout environment * Remove extra defines * Remove fixits and config flag * Show ldd * Add tmate session * Run onnx protobuf first * Generate proto for tensorflow * Update cppcheck version * Fix some cppcheck issues * Add const * Cppcheck fixes * Formatting * Fix more cppcheck issues * Run two jobs * Cache analysis and run format checking * Fix yaml issues * Fix yaml issues * Fix indentation * Switch to hip-clang for main docker file * Use hip-clang in the readme * Fixes for jenkins * Use ccache to build * Combine file * Set restore keys * Change stage name * Build with ccache * Add missing dependency for ccache * Build debug with codecov * Fix workflow syntax * Fix list * Use quotes * Got to correct build path * Install lcov * Use sudo * Echo all commands * Setup tmate * Add verbose output * Build with cmake directly * Add pthread flag * Remove python config * Continue on error * Use on or off for cmake flag * Use always upload cache * Verbose output * Verbose output from build * Build one target * Reduce debug symbols * Increase garbage collection * Remove dmesg * Increase it to 20 * Update rocm cmake version * Remove jobs from jenkins * Run on all 3 ubuntus * Remove gcc 5 jobs * Dont add flag on 16.04 * Only upload coverage on 18.04 * Dont build for ubuntu 20.04 * Use matrix.os * Use O2 for hip-clang since lower optimizations are broken * Use rocm 3.0 * Pass ccache as cmake variable instead of env variable * Build miopen from source * Show ccache statistics * Print log information * Set compression level * Use hash dir * Set hashdir * Install clang ocl from system * Up compression level * Add locale * Increase cache size to 1G * Lower compression level to 9 * Remove split dwarf * Remove Og * Add back Og * Separate debug and codecov * Add missing backslash * Garbage collect more often * Add missing locales package * Use Os * Install onednn in docker and run tests * Include target headers in tests * Increase timeout * Remove if condition * Make flag public * Suppress memory leaks in onednn * Use equal * Add gh annotations * Update rocm-cmake version * Add ldconfig
Co-authored-by: Shucai Xiao <shucai@gmail.com>
-
- 20 Nov, 2020 1 commit
-
-
Paul Fultz II authored
* Unify the vectorized and non-vectorized path * Formatting * Make fusion easily extendable * Add skip layernorm fusion * Formatting * Call correct layernorm function * Fix compile errors * Add DCE * Add test for skip layernorm * Formatting * Remove unused typedef * Formatting * Fix tidy issues * Formatting
Co-authored-by: Shucai Xiao <shucai.xiao@amd.com>
-
- 16 Nov, 2020 1 commit
-
-
Shucai Xiao authored
* add a pass to normalize ops * clang format * add unit tests * clang format * code backup * clang format * code backup * clang format * add support for slice in the normalize_op function * clang format * add operation method api for whether we need to call normalize_op * clang format * fix review comments * clang format * rename a function name * clang format * change compute_shape to normalize_compute_shape for corresponding operators * clang format * remove unnecessary code * fix various issues * clang format * add attributes to operators having axis attributes * clang format * fixed jenkins build error * clang format * fix a bug related to slice * clang format * code backup * clang format * code backup * clang format * rename a file * fix cppcheck error * some code refinement * clang format * change attributes to enum * clang format * refine the enum * clang format * remove unnecessary code * add unit tests for more code coverage and fixed a bug * clang format * remove unnecessary changes * change normalize_axes to normalize * clang format * revert back the changes in broadcast.hpp * rename normalize_axes to normalize * fix review comments * clang format * Add flag to enable cpu backend * Make buffers shared * Enable optimizations * Formatting * Try to avoid ambiguous assign in value class * fixed a build error * clang format * add the normalize_ops pass to the ref target * refactor program to module to normalize_ops pass
Co-authored-by: Paul <pfultz2@yahoo.com>
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>
-
- 15 Oct, 2020 1 commit
-
-
turneram authored
* Added greater and less operators * Fixed ops_test.cpp * Set commutative to false for less, greater * Refactored parse_equal/less/greater into parse_compare_op * Removed unnecessary function attributes() from greater.hpp/less.hpp * Added op_name arguments * Removed local settings * Formatting * Missing comma * Formatting * Formatting * Formatting * Formatting * Formatting * Missing space
Co-authored-by: Paul Fultz II <pfultz2@yahoo.com>
-
- 30 Sep, 2020 1 commit
-
-
Paul Fultz II authored
* Make global variables const * Tidy fixes * Disable some lints * Formatting * Fix tidy const * Formatting * Add missing const keywords * Formatting * More fixes * Fix remaining tidy issues * Formatting * Fix rocblas function call * Formatting * Fix nodiscard warnings * Formatting * Use named parameters * Remove overload * Add overload * Remove noncps * Use named param for node * Add auto register header * Use named parameters * Refactor jenkinsfile * Fix shadow * Add missing body variable * Add more const methods * Add hip-clang docker builds * Remove comments * Add clang-format * Add more const * Formatting * Rename stage * Disable check * Add another const * Add python 2 dev packages * Add sphinx to dockerfile
-
- 31 Aug, 2020 1 commit
-
-
kahmed10 authored
* fix parsing to kdims * add 5d size * fix assert * add 3d test * formatting
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>
-
- 27 Aug, 2020 1 commit
-
-
Shucai Xiao authored
* add bool type * code backup * code backup * clang format * fix build warnings * clang format * add the equal operator * add the equal operator * clang format * remove unnecessary code * refine unit tests * clang format * fix review comments and a bug * clang format * additional changes * clang format * fix cppcheck error * add bool type in c api * fix cppcheck error * fix review comments * fix cppcheck error * fix a build error related to gcc * fix cppcheck error * fix cppcheck error * added the equal operator to register list * add parsing boolean type * clang format * fix bool type issue for python output * clang format * add support for automatic multibroadcast of the equal operator * additional unit tests for more code coverage * clang format * missing an onnx file
Co-authored-by: Paul Fultz II <pfultz2@yahoo.com>
-
- 25 Aug, 2020 1 commit
-
-
Paul Fultz II authored
* Use increment instead of division to compute register offset * Formatting * Limit layernorm to 1024 elements * Formatting * Add verification to driver * Formatting * Remove early return * Use block_size 256 * Vectorize the kernel * Formatting * Convert to vector type * Add layernorm tests * Formatting * Formatting * Refactor layernorm to run both algos * Formatting * Fix compile error * Fix tidy warnings * Formatting * Add layernorm function * Formatting
-
- 14 Aug, 2020 1 commit
-
-
kahmed10 authored
* fix pad calc * bert tf passes correctness * formatting * add test * formatting * remove comment * add inline * formatting * fix order for literal * formatting * test no mul_add * formatting * debug layernorm * debug layernorm * manual merge * more progress * formatting * remove miopen batchnorm * remove headers * Fix compile error with no dpp reductions * fix indices * formatting * change matcher * formatting * remove binds * formatting * disable tf matcher * formatting * use fast div * formatting * fix matcher * formatting * remove comment * move find_matches * add assert * formatting * fix deepcode issue
Co-authored-by: Paul <pfultz2@yahoo.com>
Co-authored-by: Shucai Xiao <shucai.xiao@amd.com>
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>
-
- 12 Aug, 2020 1 commit
-
-
Paul Fultz II authored
* Add reduce dims * Formatting * Reduce dims on the gpu * Formatting * Fix tidy issues * Convert to assert * Reduce dims for contiguous * Formatting * Remove move * Fix arguments used * Formatting * Fix warnings * Formatting
Co-authored-by: Shucai Xiao <shucai.xiao@amd.com>
-
- 10 Jul, 2020 1 commit
-
-
Paul Fultz II authored
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>
-
- 29 May, 2020 1 commit
-
-
mhbliao authored
-
- 20 May, 2020 1 commit
-
-
Shucai Xiao authored
* code backup * clang format * fix compiling errors * clang format * rename a few files * rename a few files * fix variable bugs * clang format * add an operator to shift input sequences * clang format * fixed a bug * clang format * fixed a bug * clang format * code backup * clang format * code backup * clang format * code backup * clang format * refine code related lstm operator optimization * clang format * fix various bugs * clang format * fixed a bug in rewrite_lstm * clang format * fixed another bug * refine two operator names * clang format * refine file names * fix cppcheck error * clang format * fix cppcheck error * clang format * fix cppcheck error * fixed review comments * clang format * add unit tests * clang format * add unit tests * clang format * refine unit tests for better coverage * clang format * fixed a bug * fix cppcheck error * fix review comments * clang format * rename two operators according to review comments * clang format * fix review comments * clang format * fix review comments * clang format * fix review comments * fix a cppcheck error * clang format * fix review comments * clang format
Co-authored-by: Shucai Xiao <scxiao@prj47-rack-99.local.lan>
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>
-
- 15 May, 2020 1 commit
-
-
kahmed10 authored
* fix pad calc * bert tf passes correctness * formatting * add test * formatting * remove comment * add inline * formatting * fix order for literal * formatting * add test for gelu * formatting * added add_gelu fusion * add files * formatting * remove layernorm opt * revert reduce file * add gelu_fn and tests * formatting * fix matcher, remove extra tests * formatting * fix matcher * add used_once * formatting * start on new gelu * formatting * add matchers in fuse_ops * formatting * add dce to fix add_gelu * add simplify_rsqrt and test * formatting * debugging value for matcher * formatting * add more to matchers * formatting * fix errors * remove onnx gen * add any_arg, change matchers to use either_arg * formatting * formatting * add used_once * formatting
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>
-
- 11 May, 2020 1 commit
-
-
Paul Fultz II authored
* Fix handling of lowest values in pad operator * Formatting * Formatting * Formatting * Add cpu test for lowest padding * Add test for max
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>
-
- 21 Apr, 2020 1 commit
-
-
Yaxun (Sam) Liu authored
-
- 08 Apr, 2020 1 commit
-
-
kahmed10 authored
* add recip gpu and tests * formatting * remove to_hip_type
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>
-
- 29 Mar, 2020 1 commit
-
-
kahmed10 authored
* fix pad calc * modify clip for more args * formatting * add test, flip order, revert to unary * fix error msg * add min and max args to clip * formatting * fixes to quantization * formatting * fix logic and add extra test * formatting * fix logic, add extra test * formatting * fix bug in test
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>
Co-authored-by: Paul Fultz II <pfultz2@yahoo.com>
-
- 07 Mar, 2020 1 commit
-
-
Shucai Xiao authored
* add prelu operator * clang format * add prelu to gpu lowering * add unit tests for the PRelu operator * clang format * add missing onnx file for PRelu operator * update unit tests for prelu operator * clang format
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>
Co-authored-by: Paul Fultz II <pfultz2@yahoo.com>
-
- 12 Feb, 2020 1 commit
-
-
Aaron Enye Shi authored
* Fix HIP-Clang GPU build issues: add missing device attributes for GPU functions. GPU functions must be annotated with __device__ in HIP. * Use HIP device function max and min * Fix clang-format-5.0 issues * Undo change that breaks on HIP-HCC
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>
-
- 10 Feb, 2020 1 commit
-
-
Shucai Xiao authored
* Add initial api * Formatting * Add more api * Formatting * add more operators (asinh, acosh, atanh, MatMulInteger, ConvInteger) * clang format * add unit tests for new operators * clang format * Add auto api generation * Formatting * Fix some compilation errors * Change handle struct * Formatting * Fix remaining compilation errors * Formatting * Simplify using ctype * Formatting * Initial c++ generation * Formatting * Add C++header * Formatting * Add test * Formatting * Add initial tests * Formatting * Try to fix formatting * Cleanup formatting * Formatting * Fix constructors on the same line * Fix tests * Formatting * Fix tidy issues * Fix tidy issues * Fix naming issue * Add onnx API to parse buffer * Formatting * Add arguments api * Formatting * Fix verify parameters * Fix cppcheck issues * Formatting * Add method to get output shapes and bytes * Formatting * Try formatting * Formatting * Improve the test coverage * Formatting * Add print method * Formatting * Fix cppcheck issue * Fix package dependency * Add nolint * Try fix formatting * Formatting * formatting * formatting * Fix formatting
Co-authored-by: Paul Fultz II <pfultz2@yahoo.com>
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>
Co-authored-by: kahmed10 <15948690+kahmed10@users.noreply.github.com>
-
- 17 Jan, 2020 1 commit
-
-
Shucai Xiao authored
* add reduce operators as required by onnxruntime * clang format * remove a test since it can cause overflow * resolve cppcheck error * clang format * fix cppcheck error * clang format
-
- 20 Dec, 2019 1 commit
-
-
Shucai Xiao authored
* improve unsqueeze to support negative axis and parsing scalar * clang format * add a test example for the negative axis of unsqueeze * improve the squeeze operator to support negative axis * clang format * fixed a small bug in the lrn implementation * clang format * support negative axis in argmax and argmin * clang format * improve flatten to support negative axis * clang format * change softmax/logsoftmax to support negative axis * clang format * improve transpose by adding default perm * clang format * add one more dimens for tensor size * add one more dimens for tensor size * disable conv ops fusion for non-symmetric cases * clang format * fixed review comments * move computing axis from the device function to the compute function * clang format * move computing axis from device function to the operator computing function * clang format
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>
-
- 18 Nov, 2019 1 commit
-
-
Shucai Xiao authored
* improve gather implementation to handle negative input indices * clang format * clang format * improve concat to support neg axis input * clang format * fix cppcheck error * clang format * code cleanup * clang format * fix review comments * clang format
-
- 15 Nov, 2019 1 commit
-
-
Paul Fultz II authored
* Add compiler options * Add copy operators * Formatting * Use run_passes in tests * Formatting * Use run_pass in schedule test * Formatting * Add compile_options to get_passes in target * Formatting * Offload copy option * Formatting * Copy using pinned memory * Formatting * Improve performance of gpu copying * Formatting * Dont copy * Formatting * Always make an extra copy * Formatting * Remove unused write op * Add missing include * Remove copy_to_gpu function in python api * Make offload copy disabled by default on C++ * Formatting * Fix tidy issues * Formatting * Fix namespace * Fix python tests * Turn clang format off since its broken * Fix compile error on gcc 5 * Remove commented code
-
- 04 Nov, 2019 1 commit
-
-
Paul Fultz II authored
* Add functions to do multi-index for local strides as well * Formatting * Use same multi-index path for block_reduce * Formatting * Use multi-index calc in reduce * Formatting * Fix warning * Fix compiler warning * Disable some tidy checks
-
- 15 Oct, 2019 1 commit
-
-
Paul Fultz II authored
* use 32bit integers for indices * Formatting * Update more index types * Formatting
-
- 09 Oct, 2019 1 commit
-
-
Paul Fultz II authored
* Fix bug in bert accuracy * Formatting * add another test * Fix add and overflow * Formatting * Fix bug in shape_for_each * Use front instead of iterator * Use result.front() * Split add_unary files * Formatting * Fix incorrect last index * Remove comment * Inline function * Fix carry check * Fix metadata errors * Formatting * Reflow * Reflow
-
- 07 Oct, 2019 1 commit
-
-
Paul Fultz II authored
* Implement fast-div for index calculations * Formatting * Use fast_div for broadcasts * Formatting * Add remainder function * Compute multi-index using lens instead of strides * Formatting * Simplify equation * Formatting
-
- 04 Oct, 2019 1 commit
-
-
kahmed10 authored
* initial testing of add_clip fusion * formatting * clipped relu fusion * formatting * remove some executables, add fusion test * formatting * remove clipped_relu code * fix clang-tidy * revert changes to cmake files * remove fusion from weight map * formatting * fix syntax error * formatting * fix syntax error * fix syntax error * formatting
-
- 03 Oct, 2019 1 commit
-
-
Paul Fultz II authored
* Add env to trace nary device functions * Formatting * Improve contiguous and concat performance * Formatting * Remove unused variable * Formatting * Fix gpu tests * Formatting * Add more test for transposed concat * Formatting * Compute offset and not index * Compute multi-index once * Formatting * Fix transposed inputs * Formatting * Use product order for comparisons of hip_array * Formatting * Add missing s parameter * Formatting * Dont invert permutation * Fix tidy warnings * Formatting * Remove incorrect license * Use a single integer for stride * Formatting * Fix tidy issue
-
- 27 Sep, 2019 1 commit
-
-
Shucai Xiao authored
* add two operators ceil and floor * clang format * add unit test for the ceil and floor operators * remove unintended code
-
- 25 Sep, 2019 1 commit
-
-
Shucai Xiao authored
* first version of refactoring reduce operators. * clang format * refactor the gpu implementation of the reduce_mean operator * clang format * refactor gpu implementation of the reduce_sum operator * fix cpp check error * fix cppcheck error * fix cppcheck error * fix review comments * clang format * fix a jenkins error * fixed review comments * clang format * fix review comments * clang format * fix review comments * clang format * add implementation of reduce_min and reduce_max * clang format * add unit test for reduce_min/max operator * clang format * add more unit tests * clang format * fix review comments
-
- 16 Sep, 2019 1 commit
-
-
kahmed10 authored
* add tests, fix bug in ternary op * formatting * uncomment fusion
-