Disable chunked prefill and/or prefix caching when MLA is enabled (#12642)

From @mgoin in https://github.com/vllm-project/vllm/pull/12638. I cannot push to that branch, so this is a new PR to unblock the release.

---------

Signed-off-by: mgoin <michael@neuralmagic.com>
Signed-off-by: simon-mo <simon.mo@hey.com>
Co-authored-by: mgoin <michael@neuralmagic.com>