OpenDAS / vllm_cscc
Commit e2c8f1ed (Unverified)
Authored Aug 07, 2025 by Andrew Sansom
Committed by GitHub, Aug 07, 2025

[PERF] Use pybase64 to more quickly decode prompt embeddings (#22469)

Signed-off-by: Andrew Sansom <andrew@protopia.ai>

Parent: 1ee5ead5
Changes: 1
Showing 1 changed file with 3 additions and 2 deletions

vllm/entrypoints/openai/serving_engine.py (+3 -2)
 # SPDX-License-Identifier: Apache-2.0
 # SPDX-FileCopyrightText: Copyright contributors to the vLLM project
 import asyncio
-import base64
 import io
 import json
 import sys
 ...
@@ -12,6 +11,7 @@ from http import HTTPStatus
 from typing import (Annotated, Any, Callable, ClassVar, Generic, Optional,
                     TypeVar, Union, cast, overload)

+import pybase64
 import torch
 from fastapi import Request
 from pydantic import BaseModel, ConfigDict, Field
 ...
@@ -1008,7 +1008,8 @@ class OpenAIServing:
     ) -> list[EmbedsPrompt]:

         def _load_and_validate_embed(embed: bytes) -> EmbedsPrompt:
-            tensor = torch.load(io.BytesIO(base64.b64decode(embed)),
+            tensor = torch.load(io.BytesIO(
+                pybase64.b64decode(embed, validate=True)),
                                 weights_only=True)
             assert isinstance(tensor, torch.Tensor) and tensor.dtype in (
                 torch.float32,
 ...
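
The change above swaps the standard-library decoder for `pybase64.b64decode(embed, validate=True)`. A minimal sketch of that decoding step in isolation follows. It assumes `pybase64` is installed and otherwise falls back to the standard `base64` module, which accepts the same `validate` keyword. The helper name `decode_embed_payload` is hypothetical and is not part of vLLM, and `torch.load` is left out so the snippet runs without PyTorch.

```python
import binascii
import io

try:
    import pybase64 as b64  # faster decoder, the one the commit switches to
except ImportError:
    # Assumption for illustration only: fall back to the stdlib module,
    # whose b64decode also accepts validate=True.
    import base64 as b64


def decode_embed_payload(embed: bytes) -> io.BytesIO:
    """Hypothetical helper: strictly decode a base64 payload into a buffer."""
    # validate=True rejects characters outside the base64 alphabet
    # instead of silently discarding them.
    return io.BytesIO(b64.b64decode(embed, validate=True))


# Round-trip some placeholder bytes standing in for a serialized tensor.
payload = b64.b64encode(b"\x00\x01\x02 example tensor bytes")
decoded = decode_embed_payload(payload).read()

# Input containing non-alphabet characters is rejected.
try:
    decode_embed_payload(b"not*valid*base64")
    rejected = False
except binascii.Error:
    rejected = True
```

In the commit itself, the decoded buffer is passed on to `torch.load(..., weights_only=True)`, as shown in the diff.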