feat(src): add kv cache int8 quantization (#22)
* feat(src): add int8 and compile passed
* feat(kernels): fix
* feat(llama): update kernel
* feat(src): add debug
* fix(kernel): k_cache use int8_t pointer
* style(llama): clean code
* feat(deploy.py): revert to enable fmha
* style(LlamaV2): clean code
* feat(deploy.py): add default quant policy