OpenDAS / vllm_cscc
Commit 6b6d4961 (Unverified)
Authored May 27, 2025 by chunxiaozheng
Committed by GitHub, May 27, 2025

optimize get_kv_cache_torch_dtype (#18531)

Signed-off-by: idellzheng <idellzheng@tencent.com>

Parent: aaa4ac1c

Showing 1 changed file with 3 additions and 4 deletions

--- a/vllm/utils.py
+++ b/vllm/utils.py
@@ -759,16 +759,15 @@ def get_kv_cache_torch_dtype(
         model_dtype: Optional[Union[str, torch.dtype]] = None) -> torch.dtype:
     if isinstance(cache_dtype, str):
         if cache_dtype == "auto":
-            if isinstance(model_dtype, str):
+            if isinstance(model_dtype,
+                          str) and model_dtype in STR_DTYPE_TO_TORCH_DTYPE:
                 torch_dtype = STR_DTYPE_TO_TORCH_DTYPE[model_dtype]
             elif isinstance(model_dtype, torch.dtype):
                 torch_dtype = model_dtype
             else:
                 raise ValueError(f"Invalid model dtype: {model_dtype}")
-        elif cache_dtype in ["half", "bfloat16", "float"]:
+        elif cache_dtype in STR_DTYPE_TO_TORCH_DTYPE:
             torch_dtype = STR_DTYPE_TO_TORCH_DTYPE[cache_dtype]
-        elif cache_dtype == "fp8":
-            torch_dtype = torch.uint8
         else:
             raise ValueError(f"Invalid kv cache dtype: {cache_dtype}")
     elif isinstance(cache_dtype, torch.dtype):
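
The effect of the change on string inputs can be sketched without PyTorch. The block below is an illustration only, not the code in vllm/utils.py: `STR_DTYPE_TO_NAME` and `resolve_kv_cache_dtype` are hypothetical names, the table's values are plain strings rather than torch.dtype objects, and the presence of an "fp8" entry in the real STR_DTYPE_TO_TORCH_DTYPE is an assumption inferred from the hard-coded "fp8" branch being gone after this commit.

```python
from typing import Optional

# Hypothetical stand-in for vLLM's STR_DTYPE_TO_TORCH_DTYPE. The real table
# maps names to torch.dtype objects; strings are used here so the sketch
# runs without PyTorch. The exact set of keys is an assumption.
STR_DTYPE_TO_NAME = {
    "half": "torch.float16",
    "bfloat16": "torch.bfloat16",
    "float": "torch.float32",
    "fp8": "torch.uint8",
}


def resolve_kv_cache_dtype(cache_dtype: str,
                           model_dtype: Optional[str] = None) -> str:
    """Mirror only the string-handling branches shown in the diff."""
    if cache_dtype == "auto":
        # Membership is checked before indexing, so an unknown model dtype
        # name reaches the ValueError below instead of raising KeyError.
        if isinstance(model_dtype, str) and model_dtype in STR_DTYPE_TO_NAME:
            return STR_DTYPE_TO_NAME[model_dtype]
        raise ValueError(f"Invalid model dtype: {model_dtype}")
    # A single table lookup replaces the hard-coded list and the separate
    # "fp8" branch; any key added to the table is accepted automatically.
    if cache_dtype in STR_DTYPE_TO_NAME:
        return STR_DTYPE_TO_NAME[cache_dtype]
    raise ValueError(f"Invalid kv cache dtype: {cache_dtype}")
```

Under these assumptions, `resolve_kv_cache_dtype("auto", "half")` and `resolve_kv_cache_dtype("fp8")` both resolve through the table, while `resolve_kv_cache_dtype("auto", "not_a_dtype")` raises ValueError. With the pre-change condition `isinstance(model_dtype, str)` alone, the same unknown name would have failed on the dictionary lookup with KeyError.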