feat(src): add kv cache int8 quantization (#22)
* feat(src): add int8 and compile passed
* feat(kernels): fix
* feat(llama): update kernel
* feat(src): add debug
* fix(kernel): k_cache use int8_t pointer
* style(llama): clean code
* feat(deploy.py): revert to enable fmha
* style(LlamaV2): clean code
* feat(deploy.py): add default quant policy