- 17 Nov, 2019 1 commit
-
-
Paul authored
-
- 15 Nov, 2019 1 commit
-
-
Paul Fultz II authored
* Add compiler options
* Add copy operators
* Formatting
* Use run_passes in tests
* Formatting
* Use run_pass in schedule test
* Formatting
* Add compile_options to get_passes in target
* Formatting
* Offload copy option
* Formatting
* Copy using pinned memory
* Formatting
* Improve performance of gpu copying
* Formatting
* Don't copy
* Formatting
* Always make an extra copy
* Formatting
* Remove unused write op
* Add missing include
* Remove copy_to_gpu function in python api
* Make offload copy disabled by default on C++
* Formatting
* Fix tidy issues
* Formatting
* Fix namespace
* Fix python tests
* Turn clang format off since it's broken
* Fix compile error on gcc 5
* Remove commented code
-
- 04 Nov, 2019 1 commit
-
-
Paul Fultz II authored
* Add functions to do multi-index for local strides as well
* Formatting
* Use same multi-index path for block_reduce
* Formatting
* Use multi-index calc in reduce
* Formatting
* Fix warning
* Fix compiler warning
* Disable some tidy checks
-
- 09 Oct, 2019 1 commit
-
-
Paul Fultz II authored
* Fix bug in bert accuracy
* Formatting
* add another test
* Fix add and overflow
* Formatting
* Fix bug in shape_for_each
* Use front instead of iterator
* Use result.front()
* Split add_unary files
* Formatting
* Fix incorrect last index
* Remove comment
* Inline function
* Fix carry check
* Fix metadata errors
* Formatting
* Reflow
* Reflow
-
- 27 Sep, 2019 1 commit
-
-
Shucai Xiao authored
* add two operators ceil and floor
* clang format
* add unit test for the ceil and floor operators
* remove unintended code
-
- 25 Sep, 2019 1 commit
-
-
Shucai Xiao authored
* first version of refactoring reduce operators.
* clang format
* refactor the gpu implementation of the reduce_mean operator
* clang format
* refactor gpu implementation of the reduce_sum operator
* fix cppcheck error
* fix cppcheck error
* fix cppcheck error
* fix review comments
* clang format
* fix a Jenkins error
* fixed review comments
* clang format
* fix review comments
* clang format
* fix review comments
* clang format
* add implementation of reduce_min and reduce_max
* clang format
* add unit test for reduce_min/max operator
* clang format
* add more unit tests
* clang format
* fix review comments
-
- 18 Sep, 2019 1 commit
-
-
Shucai Xiao authored
* Remove extra copy in gemm
* combine rocblas gemm call
* clang format
* fix a bug in calling rocblas function
* clang format
* backup of temporary changes
* clang format
* unify the gemm call to avoid multiple gpu implementations
* clang format
* remove unnecessary code
* backup temp changes
* clang format
* fix cppcheck error
* code backup
* clang format
* remove unnecessary synchronization function
* clang format
* fix bugs
* clang format
* more optimization related to gemm
* clang format
* code cleanup
* implementation that can achieve better performance
* clang format
* temp changes to try performance
* clang format
* revert to previous commits
* fixed review comments
* clang format
* fix review comments
-
- 16 Sep, 2019 2 commits
-
-
kahmed10 authored
* add tests, fix bug in ternary op
* formatting
* uncomment fusion
-
Shucai Xiao authored
* first version of refactoring reduce operators.
* clang format
* refactor the gpu implementation of the reduce_mean operator
* clang format
* refactor gpu implementation of the reduce_sum operator
* fix cppcheck error
* fix cppcheck error
* fix cppcheck error
* fix review comments
* clang format
* fix a Jenkins error
* fixed review comments
* clang format
* fix review comments
* clang format
* fix review comments
* clang format
-
- 04 Sep, 2019 1 commit
-
-
Paul authored
-
- 14 Aug, 2019 1 commit
-
-
Shucai Xiao authored
-
- 06 Aug, 2019 1 commit
-
-
Shucai Xiao authored
-
- 05 Aug, 2019 1 commit
-
-
Shucai Xiao authored
-
- 03 Aug, 2019 1 commit
-
-
Shucai Xiao authored
-
- 02 Aug, 2019 1 commit
-
-
Shucai Xiao authored
-
- 01 Aug, 2019 1 commit
-
-
Shucai Xiao authored
-
- 24 Jul, 2019 1 commit
-
-
Paul authored
-
- 11 Jul, 2019 1 commit
-
-
Shucai Xiao authored
-
- 10 Jul, 2019 2 commits
- 09 Jul, 2019 2 commits
-
-
Shucai Xiao authored
-
Shucai Xiao authored
-
- 03 Jul, 2019 2 commits
-
-
Shucai Xiao authored
-
Shucai Xiao authored
-
- 02 Jul, 2019 1 commit
-
-
Shucai Xiao authored
-
- 25 Jun, 2019 1 commit
-
-
Paul authored
-
- 23 May, 2019 1 commit
-
-
Khalique authored
-
- 14 May, 2019 1 commit
-
-
Shucai Xiao authored
-
- 09 May, 2019 1 commit
-
-
Shucai Xiao authored
-
- 06 May, 2019 2 commits
-
-
Shucai Xiao authored
-
Shucai Xiao authored
-
- 01 May, 2019 1 commit
-
-
Shucai Xiao authored
-
- 30 Apr, 2019 1 commit
-
-
Khalique authored
-
- 29 Apr, 2019 1 commit
-
-
Shucai Xiao authored
-
- 18 Apr, 2019 2 commits
-
-
Shucai Xiao authored
-
Shucai Xiao authored
-
- 17 Apr, 2019 2 commits
-
-
Shucai Xiao authored
-
Shucai Xiao authored
-
- 16 Apr, 2019 1 commit
-
-
Shucai Xiao authored
Add a pass to resolve the mismatch between the hip_allocation shape and the instruction output shape.
-
- 03 Apr, 2019 1 commit
-
-
Shucai Xiao authored
-