OpenDAS / vllm_cscc
Commit 65e03893 (Unverified)
Authored Sep 05, 2025 by Nicolò Lucchesi; committed by GitHub, Sep 04, 2025
[Frontend] Skip unnecessary detokenization when token_id is requested (#24236)
Signed-off-by: NickLucche <nlucches@redhat.com>
parent 886ccbe5
Showing 1 changed file with 2 additions and 1 deletion
vllm/entrypoints/openai/serving_chat.py
@@ -1419,9 +1419,10 @@ class OpenAIServingChat(OpenAIServing):
             step_top_logprobs = top_logprobs[i]
             if step_top_logprobs is None or step_top_logprobs.get(
                     token_id) is None:
-                token = tokenizer.decode(token_id)
                 if should_return_as_token_id:
                     token = f"token_id:{token_id}"
+                else:
+                    token = tokenizer.decode(token_id)
                 logprobs_content.append(
                     ChatCompletionLogProbsContent(
...
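
The effect of this change can be illustrated with a minimal, standalone sketch. It is not code from vLLM: `CountingTokenizer`, `token_before`, and `token_after` are hypothetical names introduced here only to mirror the two branches of the hunk above, assuming a tokenizer object with a `decode(token_id)` method.

```python
class CountingTokenizer:
    """Hypothetical stand-in for a real tokenizer; it counts decode() calls."""

    def __init__(self, vocab):
        self.vocab = vocab
        self.decode_calls = 0

    def decode(self, token_id):
        self.decode_calls += 1
        return self.vocab[token_id]


def token_before(token_id, tokenizer, should_return_as_token_id):
    # Pre-change logic: always decode, then overwrite the result
    # when the caller asked for the token id form.
    token = tokenizer.decode(token_id)
    if should_return_as_token_id:
        token = f"token_id:{token_id}"
    return token


def token_after(token_id, tokenizer, should_return_as_token_id):
    # Post-change logic: decode only when the text form is needed.
    if should_return_as_token_id:
        token = f"token_id:{token_id}"
    else:
        token = tokenizer.decode(token_id)
    return token


if __name__ == "__main__":
    old_tok = CountingTokenizer({7: "hello"})
    new_tok = CountingTokenizer({7: "hello"})
    print(token_before(7, old_tok, True), old_tok.decode_calls)
    print(token_after(7, new_tok, True), new_tok.decode_calls)
```

Both functions return the same string for the same inputs; the difference is that the post-change version never calls `decode` when the token id form is requested, which is the detokenization the commit title describes as unnecessary.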