"src/include/blockwise_batched_gemm.hpp" did not exist on "268d1c717c01f070e511bd9a60966117bb60cf41"
Improve buffer address for out of bound check (#21)
* Use buffer load built-in OOB check. Buffer size is limited to 2GB.
* Buffer APIs use combined wave and thread offset.
* Use uint32_t for addr shift in buffer addressing.
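The points above can be illustrated with a small host-side model. This is only a sketch under stated assumptions, not the code from this commit: `emulated_buffer_load`, `guarded_load`, `kMaxBufferBytes`, and `kInvalidShift` are hypothetical names, the shift value `0x7fffffff` is an assumed choice, and summing the wave and thread offsets into one `uint32_t` is one reading of "combined wave and thread offset". The idea is that an invalid access has its offset shifted past the end of the buffer, so the range check returns zero without a separate branch around the load.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical constants. The 2 GB limit comes from the commit message;
// the shift value is an assumption made for this sketch.
constexpr std::uint32_t kMaxBufferBytes = 0x80000000u; // 2 GB
constexpr std::uint32_t kInvalidShift   = 0x7fffffffu;

// Host-side stand-in for a hardware buffer load with a built-in range
// check: a read that does not fit inside [0, num_bytes) yields 0.
float emulated_buffer_load(const float* base,
                           std::uint32_t num_bytes,
                           std::uint32_t byte_offset)
{
    // Widen to 64 bits so the bounds comparison itself cannot wrap.
    if (static_cast<std::uint64_t>(byte_offset) + sizeof(float) > num_bytes)
    {
        return 0.0f;
    }

    float value;
    std::memcpy(&value,
                reinterpret_cast<const unsigned char*>(base) + byte_offset,
                sizeof(float));
    return value;
}

// Combines the wave and thread byte offsets, then shifts the result out of
// range when the caller marks the access as invalid.
float guarded_load(const float* base,
                   std::uint32_t num_bytes,
                   std::uint32_t wave_offset,
                   std::uint32_t thread_offset,
                   bool valid)
{
    assert(num_bytes <= kMaxBufferBytes);

    const std::uint32_t offset = wave_offset + thread_offset;
    const std::uint32_t shift  = valid ? 0u : kInvalidShift;

    // For any offset below 2 GB, shift + offset stays below 2^32, so the
    // unsigned sum does not wrap back into the valid range.
    return emulated_buffer_load(base, num_bytes, shift + offset);
}
```

In this model, keeping the buffer at or below 2 GB is what lets a 32-bit unsigned shift push every in-range offset out of range without overflow. Whether the real kernels rely on exactly this arithmetic is not stated in the commit message.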