Add cuda kernel support for GGUF inference (#11869)
* add gguf kernel support

Signed-off-by: Isotr0py <2037008807@qq.com>

* fix

Signed-off-by: Isotr0py <2037008807@qq.com>

* optimize

Signed-off-by: Isotr0py <2037008807@qq.com>

* update

* update

* update

* update

* update

---------

Signed-off-by: Isotr0py <2037008807@qq.com>
Co-authored-by: DN6 <dhruv.nair@gmail.com>