OpenDAS / vllm_cscc
Commit a4d28758, authored Feb 11, 2026 by zhuwenwen
update Q/K/V_SCALE_CONSTANT
parent 04343d9d
Showing 1 changed file with 6 additions and 6 deletions: vllm/envs.py (+6 -6)
--- a/vllm/envs.py
+++ b/vllm/envs.py
@@ -118,9 +118,9 @@ if TYPE_CHECKING:
     VLLM_ENABLE_V1_MULTIPROCESSING: bool = True
     VLLM_LOG_BATCHSIZE_INTERVAL: float = -1
     VLLM_DISABLE_COMPILE_CACHE: bool = False
-    Q_SCALE_CONSTANT: int = 200
-    K_SCALE_CONSTANT: int = 200
-    V_SCALE_CONSTANT: int = 100
+    Q_SCALE_CONSTANT: int = 10
+    K_SCALE_CONSTANT: int = 10
+    V_SCALE_CONSTANT: int = 10
     VLLM_SERVER_DEV_MODE: bool = False
     VLLM_V1_OUTPUT_PROC_CHUNK_SIZE: int = 128
     VLLM_MLA_DISABLE: bool = False
@@ -1049,13 +1049,13 @@ environment_variables: dict[str, Callable[[], Any]] = {
     # Divisor for dynamic query scale factor calculation for FP8 KV Cache
     "Q_SCALE_CONSTANT":
-    lambda: int(os.getenv("Q_SCALE_CONSTANT", "200")),
+    lambda: int(os.getenv("Q_SCALE_CONSTANT", "10")),
     # Divisor for dynamic key scale factor calculation for FP8 KV Cache
     "K_SCALE_CONSTANT":
-    lambda: int(os.getenv("K_SCALE_CONSTANT", "200")),
+    lambda: int(os.getenv("K_SCALE_CONSTANT", "10")),
     # Divisor for dynamic value scale factor calculation for FP8 KV Cache
     "V_SCALE_CONSTANT":
-    lambda: int(os.getenv("V_SCALE_CONSTANT", "100")),
+    lambda: int(os.getenv("V_SCALE_CONSTANT", "10")),
     # If set, enable multiprocessing in LLM for the V1 code path.
     "VLLM_ENABLE_V1_MULTIPROCESSING":
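The changed lambdas read each constant from the process environment and fall back to the new default of 10. A minimal sketch of that lookup follows, so the effect of an override is easy to see. `resolve_scale_constants` and `NEW_DEFAULTS` are hypothetical names used only for illustration; they are not part of vllm/envs.py.

```python
import os

# Defaults after this commit, copied from the lambdas in the diff.
# NEW_DEFAULTS and resolve_scale_constants are illustrative names only.
NEW_DEFAULTS = {
    "Q_SCALE_CONSTANT": "10",
    "K_SCALE_CONSTANT": "10",
    "V_SCALE_CONSTANT": "10",
}


def resolve_scale_constants(environ=os.environ):
    """Return each constant as an int, preferring the environment value."""
    return {
        name: int(environ.get(name, default))
        for name, default in NEW_DEFAULTS.items()
    }


# With nothing set, every constant resolves to the new default.
unset = resolve_scale_constants({})

# Setting a variable restores an older value for that constant only.
overridden = resolve_scale_constants({"Q_SCALE_CONSTANT": "200"})
```

Under this reading, a deployment that depends on the previous values could set `Q_SCALE_CONSTANT=200`, `K_SCALE_CONSTANT=200`, and `V_SCALE_CONSTANT=100` in the environment before the process starts.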
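The comments in the diff describe each constant as a divisor for a dynamic scale factor. The commit does not show the call site, so the sketch below assumes the scale is the absolute maximum of the tensor divided by the constant. Both that formula and the `dynamic_scale` helper are assumptions, and the sample values are made up.

```python
def dynamic_scale(values, divisor):
    """Assumed formula: absolute maximum of the values divided by the constant."""
    return max(abs(v) for v in values) / divisor


# Made-up activations standing in for a query tensor.
query = [0.5, -8.0, 3.25]

old_scale = dynamic_scale(query, 200)  # previous Q_SCALE_CONSTANT default
new_scale = dynamic_scale(query, 10)   # default after this commit
```

If the assumption holds, lowering the query and key divisors from 200 to 10 makes the computed scale 20 times larger for the same input, and lowering the value divisor from 100 to 10 makes it 10 times larger.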