Commit a073be6d authored by Chen Zhang, committed by GitHub

[Doc] Update the doc for log probs + prefix caching (#23399)


Signed-off-by: Chen Zhang <zhangch99@outlook.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
parent 695e7adc
@@ -166,7 +166,7 @@ Processed means the values after applying all processors, including temperature
##### Prompt Logprobs with Prefix Caching
-Currently prompt logprobs are only supported when prefix caching is turned off via `--no-enable-prefix-caching`. In a future release, prompt logprobs will be compatible with prefix caching, but a recomputation will be triggered to recover the full prompt logprobs even upon a prefix cache hit. See details in [RFC #13414](gh-issue:13414).
+Logprobs are not cached. For a request requiring prompt logprobs, the engine will ignore the prefix cache and recompute the prefill of full prompt to generate the logprobs.
#### Deprecated Features
......
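
Reviewer note: the updated wording for prompt logprobs and prefix caching can be illustrated with a toy model. The sketch below is not the engine's code; `tokens_to_prefill` and its parameters are hypothetical names chosen only to show the rule the new text states, namely that a request asking for prompt logprobs recomputes the whole prompt even when a cached prefix exists.

```python
def tokens_to_prefill(prompt_len: int,
                      cached_prefix_len: int,
                      wants_prompt_logprobs: bool) -> int:
    """Toy model of the documented behavior (hypothetical helper, not a vLLM API).

    Returns how many prompt tokens would need a prefill pass.
    """
    if not 0 <= cached_prefix_len <= prompt_len:
        raise ValueError("cached_prefix_len must be between 0 and prompt_len")

    if wants_prompt_logprobs:
        # Logprobs are not cached, so the prefix cache is ignored and the
        # full prompt is recomputed to produce them.
        return prompt_len

    # Otherwise only the part of the prompt not covered by the cache is computed.
    return prompt_len - cached_prefix_len


# A 100-token prompt with a 64-token cached prefix:
without_logprobs = tokens_to_prefill(100, 64, wants_prompt_logprobs=False)
with_logprobs = tokens_to_prefill(100, 64, wants_prompt_logprobs=True)
```

Under these assumptions, the request without prompt logprobs prefills only the uncached tail, while the request with prompt logprobs pays for the full prompt. That cost difference is the practical consequence of the new sentence.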