sglang
Commit a80bcb5a (Unverified)
Authored Oct 31, 2025 by yinghui
Committed by GitHub, Oct 31, 2025

Add env var to disable FA4 warmup (#12430)

Parent: f7f9e41b
Changes: 2
Showing 2 changed files with 11 additions and 0 deletions (+11 -0)

docs/references/environment_variables.md  +1 -0
sgl-kernel/python/sgl_kernel/_fa4_interface.py  +10 -0
docs/references/environment_variables.md
...
@@ -28,6 +28,7 @@ SGLang supports various environment variables that can be used to configure its
 | `SGLANG_SKIP_P2P_CHECK` | Skip P2P (peer-to-peer) access check | `false` |
 | `SGL_CHUNKED_PREFIX_CACHE_THRESHOLD` | Sets the threshold for enabling chunked prefix caching | `8192` |
 | `SGLANG_FUSED_MLA_ENABLE_ROPE_FUSION` | Enable RoPE fusion in Fused Multi-Layer Attention | `1` |
+| `SGLANG_DISABLE_FA4_WARMUP` | Disable Flash Attention 4 warmup passes (set to `1`, `true`, `yes`, or `on` to disable) | `false` |
 
 ## DeepGEMM Configuration (Advanced Optimization)
...
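The new table row lists four accepted values. The following is a minimal sketch of how such a flag could be interpreted, assuming the same case-insensitive membership test that this commit adds to `_fa4_interface.py`. The helper name `env_flag_enabled` and its `environ` parameter are hypothetical and are not part of SGLang.

```python
import os

# Values treated as "enabled", mirroring the tuple checked in this commit.
_TRUTHY = ("1", "true", "yes", "on")


def env_flag_enabled(name, environ=None):
    """Hypothetical helper: True if `name` is set to one of the truthy values.

    The comparison is case-insensitive. An unset variable, or any other
    value such as "0" or "false", counts as not enabled.
    """
    env = os.environ if environ is None else environ
    return env.get(name, "").lower() in _TRUTHY


# Example with an explicit mapping instead of the real process environment.
fake_env = {"SGLANG_DISABLE_FA4_WARMUP": "TRUE"}
warmup_disabled = env_flag_enabled("SGLANG_DISABLE_FA4_WARMUP", fake_env)
```

As in the committed check, the value is not stripped of whitespace, so a value such as `" 1"` would not match under this sketch.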
sgl-kernel/python/sgl_kernel/_fa4_interface.py
...
@@ -8,6 +8,7 @@ import copy
 import gc
 import logging
 import math
+import os
 from typing import Callable, Optional, Tuple
 
 logger = logging.getLogger(__name__)
...
@@ -416,6 +417,15 @@ def warmup_flash_attn(f):
     - Executes sequentially to minimize peak GPU mem
     - Does not modify user tensors (clones)
     """
+    disable_warmup = os.getenv("SGLANG_DISABLE_FA4_WARMUP", "").lower() in (
+        "1",
+        "true",
+        "yes",
+        "on",
+    )
+    if disable_warmup:
+        return f
+
     done = False
 
     def _clone_args(args, kwargs):
...
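The guard in this hunk sits in the decorator body, before any wrapper is built, so the environment variable is read once when the decorator is applied, not on every call. The sketch below illustrates that pattern in isolation. `warmup_once`, `warmup_runs`, and `add` are hypothetical stand-ins, and the placeholder warmup step is an assumption; the real `warmup_flash_attn` body is elided in the diff and is not reproduced here.

```python
import functools
import os


def warmup_once(f):
    """Hypothetical decorator illustrating the guard; not SGLang's code."""
    disable_warmup = os.getenv("SGLANG_DISABLE_FA4_WARMUP", "").lower() in (
        "1",
        "true",
        "yes",
        "on",
    )
    if disable_warmup:
        # Decoration-time short-circuit: return the original function
        # unwrapped, so no warmup logic is attached to it at all.
        return f

    done = False

    @functools.wraps(f)
    def wrapper(*args, **kwargs):
        nonlocal done
        if not done:
            done = True
            # Placeholder for real warmup work (assumed, not from the source).
            wrapper.warmup_runs += 1
        return f(*args, **kwargs)

    wrapper.warmup_runs = 0
    return wrapper


def add(a, b):
    return a + b


# With the variable set, the decorator hands back the very same function.
os.environ["SGLANG_DISABLE_FA4_WARMUP"] = "1"
add_without_warmup = warmup_once(add)

# With the variable unset, a wrapper that warms up on first call is returned.
del os.environ["SGLANG_DISABLE_FA4_WARMUP"]
add_with_warmup = warmup_once(add)
```

One consequence of this design, if the decorator is applied at module import time as is typical for decorators: the variable has to be set before the module is imported, and changing it afterwards has no effect on functions that were already decorated.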