- 14 Feb, 2023 1 commit
-
-
Brian Pickrell authored
First try to compile a call to rocblas-beta API; doesn't compile. Note the update to rocm version in Dockerfile. May require a new Docker image build.
-
- 07 Nov, 2022 1 commit
-
-
arvindcheru authored
-
- 13 Sep, 2022 1 commit
-
-
turneram authored
Improves performance for 4/6 GEMMs used by huggingface BERT models with batch_size>1 by using a non-batched rocBLAS call for GEMMs where the B input has a broadcasted batch dimension. The four verify tests added reflect the actual configurations used by bert-base-cased, with varied batch sizes. Also adds a matcher to simplify_reshapes to move multibroadcasts after concats.
-
- 06 Sep, 2022 1 commit
-
-
Paul Fultz II authored
Using not and or improves readability. The cppcheck rule will help ensure we are doing it consistently.
-
- 27 Aug, 2022 1 commit
-
-
Paul Fultz II authored
This will rewrite dot operators like X(Y + b) to XY + Xb when b is constant as we can fold the add away. This improves handling pointwise with broadcasted operators, this helps improves const propagation. Improve gemm fusion with a mul_add Improve support for broadcast shapes in gemm
-
- 22 Jun, 2022 1 commit
-
-
Ted Themistokleous authored
Updated each source file in the repo with the existing license.
-
- 24 May, 2022 1 commit
-
-
Paul Fultz II authored
* Improve applicable batched gemms for bert
-
- 03 Mar, 2022 1 commit
-
-
kahmed10 authored
better performance doing it this way
-
- 12 Jul, 2021 1 commit
-
-
Shucai Xiao authored
Co-authored-by:mvermeulen <5479696+mvermeulen@users.noreply.github.com>
-
- 15 Jun, 2021 1 commit
-
-
Shucai Xiao authored
* add a flag to indicate int8x4 input format * clang format * code backup * clang format * code backup * clang format * code backup * clang format * code backup * clang format * code backup * clang format * remove log info * remove unnecessary changes * fix cppcheck error * add unit tests to have more code coverage * clang format * add debug info * remove log info * fix cppcheck error * clang format * clang format * add one more unit tests for more scenarios * fix cppcheck error * clang format * fix review comments * clang format * rename p to m * fix review comments * refine unit tests * clang format * refine unit tests and fixed a bug * clang format * fix build error related to rocm4.2 * fix a bug related to alpha and beta * refine two unit tests related to int8_gemm * fix cppcheck error * refine unit test to pass on mi100 * add unit test for packing int8 args * clang format * change unit tests back * disable some unit tests for gpu * clang format * refine unit tests to run on mi100 * clang format * refine unit tests * refine unit tests * clang format * change back a unit test Co-authored-by:mvermeulen <5479696+mvermeulen@users.noreply.github.com>
-
- 27 Apr, 2021 1 commit
-
-
Paul Fultz II authored
* Add definitions for all pointwise operators * Formatting * Add cpp generator class * Formatting * Move compilation to core * Formatting * Add clock to tmp name * Add dynamic loader * Formatting * Add tests for code gen * Formatting * Add test for literals * Formatting * Use with_char * Add missing header * Fix mismerge * Ignore tidy warning * Fxx gcc 5 errors * Apply fixits * Skip signed bitwise of status * Remove unused parameters * Explicitly add c++14 flag * Fix tidy warning * Add tuple type to shape class * Formatting * Make data member private * Formatting * Add sub arguments * Formatting * Trun clang format off * Disable clang-format * Improve visiting tuples * Formatting * Add more argument tests * Formatting * Handle tuple in load * Formatting * Remove .o files * Add tuple type to api * Formatting * Fix tidy warnings * Fix tidy warnings * Add a test for share method * Formatting * Add a test cpp_type * Suppress tidy warning Co-authored-by:Shucai Xiao <Shucai.Xiao@amd.com>
-
- 10 Mar, 2021 1 commit
-
-
Shucai Xiao authored
* fix the flag in rocblas api for int8 data type * used different flag for different rocblas versions * clang format Co-authored-by:mvermeulen <5479696+mvermeulen@users.noreply.github.com>
-
- 30 Sep, 2020 1 commit
-
-
Paul Fultz II authored
* Make global variables const * Tidy fixes * Disable some lints * Formatting * Fix tidy const * Formatting * Add missing const keywords * Formatting * More fixes * Fix remaining tidy issues * Formatting * Fix rocblas function call * Formatting * Fix nodiscard warnings * Formatting * Use named parameters * Remove overload * Add overload * Remove noncps * Use named param for node * Add auto register header * Use named parameters * Refactor jenkinsfile * Fix shadow * Add missing body variable * Add more const methods * Add hip-clang docker builds * Remove comments * Add clang-format * Add more const * Formatting * Rename stage * Disable check * Add another const * Add python 2 dev packages * Add sphinx to dockerfile
-
- 27 Aug, 2020 1 commit
-
-
Shucai Xiao authored
* add bool type * code backup * code backup * clang format * fix build warnings * clang format * add the equal operator * add the equal operator * clang format * remove unnecessary code * refine unit tests * clang format * fix review comments and a bug * clang format * additional changes * clang format * fix cppcheck error * add bool type in c api * fix cppcheck error * fix review comments * fix cppcheck error * fix a build error related to gcc * fix cppcheck error * fix cppcheck error * added the equal operator to register list * add parsing boolean type * clang format * fix bool type issue for python output * clang format * add support for automatic multibroadcast of the equal operator * additional unit tests for more code coverage * clang format * missing an onnx file Co-authored-by:Paul Fultz II <pfultz2@yahoo.com>
-
- 22 May, 2020 1 commit
-
-
Paul Fultz II authored
-
- 18 Sep, 2019 1 commit
-
-
Shucai Xiao authored
* Remove extra copy in gemm * combine rocblas gemm call * clang format * fix a bug in calling rocblas function * clang format' * backup of temporary changes * clang format * unify the gemm call to avoid multiple gpu implemantation * clang format * remove unnecessary code * backup temp changes * clang format * fix cppcheck error * code backup * clang format * remove unnecessary synchronization function * clang format * fix bugs * clang format * more optimization related to gemm * clang format * code cleanup * implementation that can achieves better performance * clang format * temp changes to try performance * clang format * revert to previous commits * fixed review comments * clang format * fix review comments
-
- 23 Aug, 2019 2 commits
-
-
Shucai Xiao authored
-
Shucai Xiao authored
-
- 06 Aug, 2019 3 commits
-
-
Shucai Xiao authored
-
Shucai Xiao authored
-
Shucai Xiao authored
-
- 05 Aug, 2019 2 commits
-
-
Shucai Xiao authored
-
Shucai Xiao authored
-
- 06 Jun, 2019 1 commit
-
-
Shucai Xiao authored
-
- 03 Jun, 2019 3 commits
-
-
Shucai Xiao authored
-
Shucai Xiao authored
-
Shucai Xiao authored
-
- 28 May, 2019 2 commits
-
-
Shucai Xiao authored
-
Shucai Xiao authored
-
- 24 May, 2019 2 commits
-
-
Shucai Xiao authored
-
Shucai Xiao authored
-
- 13 May, 2019 2 commits
-
-
Shucai Xiao authored
-
Shucai Xiao authored
-
- 10 May, 2019 6 commits
-
-
Shucai Xiao authored
-
Shucai Xiao authored
-
Shucai Xiao authored
-
Shucai Xiao authored
-
Shucai Xiao authored
-
Shucai Xiao authored
-
- 09 May, 2019 1 commit
-
-
Shucai Xiao authored
-