Add a gpu gemm reference kernel (#1528)
* Add a gpu gemm reference kernel
* Switch to gpu reference in gemm examples
* Remove redundant arguments
* Update all related examples
* Update more examples
* Try fewer threads per block
* Try even fewer threads per block
* Add support for all matrix layouts
* Increase block size
* Clean up
* Remove hardcoded strides
* Clean up
* Try a column-major case
* Revert back to row-major
* Run both CPU and GPU verification
---------
Co-authored-by: Po Yen Chen <PoYen.Chen@amd.com>