".github/git@developer.sourcefind.cn:OpenDAS/ollama.git" did not exist on "2a038c1d7e8351b386dbf6944e63f1053cf9b9b6"
Manual control of MAC cluster for improved interwave performance (#184)
* manual control of MAC cluster for improved 2-wave performance ensure setprio's order; ensure inner loop size >= local read size synchronize when single mac cluster * format * use value field from ck::integral_constant * roll out inter-wave loop scheduler to c-shuffle gemm variants will gradually roll out to other applicable device ops when occasional reg spill is resolved * additional comments * format * fix mismatch between inter-wave pipeline and interwave blockwise gemm * address review feedback * amend
Showing
Please register or sign in to comment