- 24 Jan, 2020 1 commit
-
-
kahmed10 authored
* initial testing * add shape op * formatting * add env variable for batch sizes * formatting * progress on driver * progress on driver * cleanup * cleanup * add and modified prev tests * formatting * remove comment * add shape op test * formatting * manually insert shape op in test * formatting * create options struct for parsers * formatting * Add documentation for python * Fix C++ documentation * add documentation to parser * formatting * add argmin and tests * fix doc and definitions * formatting * revert test functions * formatting * cpu impl of conv_transpose * more work on conv_transpose * rename files, added extra tests * formatting * add more tests * formatting * changes * fix tests * fix tidy * formatting * fixed function parameter * fix function parameter * add cpu ops test * formatting Co-authored-by: Paul Fultz II <pfultz2@yahoo.com> Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>
-
- 17 Jan, 2020 1 commit
-
-
Shucai Xiao authored
* add reduce operators as required by onnxruntime * clang format * remove a test since it can cause overflow * resolve cppcheck error * clang format * fix cppcheck error * clang format
-
- 20 Dec, 2019 1 commit
-
-
Shucai Xiao authored
* improve unsqueeze to support negative axis and parsing scalar * clang format * add a test example for the negative axis of unsqueeze * improve the squeeze operator to support negative axis * clang format * fixed a small bug in the lrn implementation * clang format * support negative axis in argmax and argmin * clang format * improve flatten to support negative axis * clang format * change softmax/logsoftmax to support negative axis * clang format * improve transpose by adding default perm * clang format * add one more dimens for tensor size * add one more dimens for tensor size * disable conv ops fusion for non-symmetric cases * clang format * fixed review comments * move computing axis from the device function to the compute function * clang format * move computing axis from device function to the operator computing function * clang format Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>
-
- 27 Nov, 2019 1 commit
-
-
Paul Fultz II authored
* Add experimental support for c++ output * Format * Fix syntax errors * Add resnet50 model * Formatting * Add inceptionv3 model * Formatting * Add alexnet * Formatting * Fix name of pooling mode * Formatting * Fix tidy issues * Ignore driver directory * Show acceptable values
-
- 18 Nov, 2019 1 commit
-
-
Shucai Xiao authored
* improve gather implementation to handle negative input indices * clang format * clang format * improve concat to support neg axis input * clang format * fix cppcheck error * clang format * code cleanup * clang format * fix review comments * clang format
-
- 17 Nov, 2019 1 commit
-
-
Paul authored
-
- 15 Nov, 2019 1 commit
-
-
Paul Fultz II authored
* Add compiler options * Add copy operators * Formatting * Use run_passes in tests * Formatting * Use run_pass in schedule test * Formatting * Add compile_options to get_passes in target * Formatting * Offload copy option * Formatting * Copy using pinned memory * Formatting * Improve performance of gpu copying * Formatting * Don't copy * Formatting * Always make an extra copy * Formatting * Remove unused write op * Add missing include * Remove copy_to_gpu function in python api * Make offload copy disabled by default on C++ * Formatting * Fix tidy issues * Formatting * Fix namespace * Fix python tests * Turn clang format off since it's broken * Fix compile error on gcc 5 * Remove commented code
-
- 04 Nov, 2019 2 commits
-
-
Paul Fultz II authored
* Add functions to do multi-index for local strides as well * Formatting * Use same multi-index path for block_reduce * Formatting * Use multi-index calc in reduce * Formatting * Fix warning * Fix compiler warning * Disable some tidy checks
-
Paul Fultz II authored
* Fix bug in eliminate_concat * Formatting * Skip context_free operators * Formatting * Fix unit test * Formatting
-
- 30 Oct, 2019 1 commit
-
-
Paul Fultz II authored
* Enable scheduler for 1 stream * Formatting * Improve performance of sorting * Formatting * Adjust the weight calculation * Formatting * Simplify formula * Formatting * Avoid division by zero * Fix scheduler test * Check for either 1 or 2 * Check for waits when order may change * Formatting
-
- 25 Oct, 2019 1 commit
-
-
Shucai Xiao authored
* simplify cpu implementation of the convolution, softmax, and logsoftmax * clang format * fix cppcheck error * improve code coverage
-
- 15 Oct, 2019 1 commit
-
-
Paul Fultz II authored
* use 32bit integers for indices * Formatting * Update more index types * Formatting
-
- 09 Oct, 2019 1 commit
-
-
Paul Fultz II authored
* Fix bug in bert accuracy * Formatting * add another test * Fix add and overflow * Formatting * Fix bug in shape_for_each * Use front instead of iterator * Use result.front() * Split add_unary files * Formatting * Fix incorrect last index * Remove comment * Inline function * Fix carry check * Fix metadata errors * Formatting * Reflow * Reflow
-
- 07 Oct, 2019 1 commit
-
-
Paul Fultz II authored
* Implement fast-div for index calculations * Formatting * Use fast_div for broadcasts * Formatting * Add remainder function * Compute multi-index using lens instead of strides * Formatting * Simplify equation * Formatting
-
- 04 Oct, 2019 1 commit
-
-
kahmed10 authored
* initial testing of add_clip fusion * formatting * clipped relu fusion * formatting * remove some executables, add fusion test * formatting * remove clipped_relu code * fix clang-tidy * revert changes to cmake files * remove fusion from weight map * formatting * fix syntax error * formatting * fix syntax error * fix syntax error * formatting
-
- 03 Oct, 2019 2 commits
-
-
Shucai Xiao authored
* fixed a bug related to removing gemm copy * clang format * fix review comments * clang format * fix unit test failure * fix review comments * clang format
-
Paul Fultz II authored
* Add env to trace nary device functions * Formatting * Improve contiguous and concat performance * Formatting * Remove unused variable * Formatting * Fix gpu tests * Formatting * Add more tests for transposed concat * Formatting * Compute offset and not index * Compute multi-index once * Formatting * Fix transposed inputs * Formatting * Use product order for comparisons of hip_array * Formatting * Add missing s parameter * Formatting * Don't invert permutation * Fix tidy warnings * Formatting * Remove incorrect license * Use a single integer for stride * Formatting * Fix tidy issue
-
- 27 Sep, 2019 1 commit
-
-
Shucai Xiao authored
* add two operators ceil and floor * clang format * add unit test for the ceil and floor operators * remove unintended code
-
- 25 Sep, 2019 1 commit
-
-
Shucai Xiao authored
* first version of refactoring reduce operators. * clang format * refactor the gpu implementation of the reduce_mean operator * clang format * refactor gpu implementation of the reduce_sum operator * fix cppcheck error * fix cppcheck error * fix cppcheck error * fix review comments * clang format * fix a Jenkins error * fixed review comments * clang format * fix review comments * clang format * fix review comments * clang format * add implementation of reduce_min and reduce_max * clang format * add unit test for reduce_min/max operator * clang format * add more unit tests * clang format * fix review comments
-
- 19 Sep, 2019 1 commit
-
-
Paul Fultz II authored
* Disable fusion when winograd is used except for 3x3 * Formatting
-
- 18 Sep, 2019 1 commit
-
-
Shucai Xiao authored
* Remove extra copy in gemm * combine rocblas gemm call * clang format * fix a bug in calling rocblas function * clang format * backup of temporary changes * clang format * unify the gemm call to avoid multiple gpu implementations * clang format * remove unnecessary code * backup temp changes * clang format * fix cppcheck error * code backup * clang format * remove unnecessary synchronization function * clang format * fix bugs * clang format * more optimization related to gemm * clang format * code cleanup * implementation that can achieve better performance * clang format * temp changes to try performance * clang format * revert to previous commits * fixed review comments * clang format * fix review comments
-
- 16 Sep, 2019 2 commits
-
-
kahmed10 authored
* add tests, fix bug in ternary op * formatting * uncomment fusion
-
Shucai Xiao authored
* first version of refactoring reduce operators. * clang format * refactor the gpu implementation of the reduce_mean operator * clang format * refactor gpu implementation of the reduce_sum operator * fix cppcheck error * fix cppcheck error * fix cppcheck error * fix review comments * clang format * fix a Jenkins error * fixed review comments * clang format * fix review comments * clang format * fix review comments * clang format
-
- 04 Sep, 2019 1 commit
-
-
Paul authored
-
- 28 Aug, 2019 1 commit
-
-
Paul authored
-
- 27 Aug, 2019 2 commits
- 26 Aug, 2019 2 commits
-
-
Shucai Xiao authored
-
Shucai Xiao authored
-
- 23 Aug, 2019 2 commits
-
-
Shucai Xiao authored
-
Shucai Xiao authored
-
- 20 Aug, 2019 3 commits
-
-
Shucai Xiao authored
-
Shucai Xiao authored
-
Paul authored
-
- 16 Aug, 2019 1 commit
-
-
Paul authored
-
- 15 Aug, 2019 3 commits
- 14 Aug, 2019 1 commit
-
-
Shucai Xiao authored
-
- 13 Aug, 2019 1 commit
-
-
Paul authored
-