Unverified Commit 334c715e authored by Nicolò Lucchesi's avatar Nicolò Lucchesi Committed by GitHub
Browse files

[Docs] Spec decoding docs warning removal (#34439)


Signed-off-by: default avatarNickLucche <nlucches@redhat.com>
parent 7b5a8b4a
# Speculative Decoding
!!! warning
Please note that speculative decoding in vLLM is not yet optimized and does
not usually yield inter-token latency reductions for all prompt datasets or sampling parameters.
The work to optimize it is ongoing and can be followed here: <https://github.com/vllm-project/vllm/issues/4630>
!!! warning
Currently, speculative decoding in vLLM is not compatible with pipeline parallelism.
......
Markdown is supported
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment