Commit 971a0dfa (sglang)
Extend cuda graph capture bs for B200 (#6937)
Authored by Baizhou Zhang on Jun 08, 2025. Committed by GitHub on Jun 08, 2025.
Parent: 2fc12995
Showing 1 changed file with 2 additions and 0 deletions.

python/sglang/srt/model_executor/cuda_graph_runner.py (+2, -0)
...
...
@@ -139,6 +139,8 @@ def get_batch_sizes_to_capture(model_runner: ModelRunner):
         gpu_mem = get_device_memory_capacity()
         if gpu_mem is not None and gpu_mem > 96 * 1024:
             capture_bs += list(range(160, 257, 8))
+        if gpu_mem is not None and gpu_mem > 180 * 1000:
+            capture_bs += list(range(256, 513, 16))

     if max(capture_bs) > model_runner.req_to_token_pool.size:
         # In some cases (e.g., with a small GPU or --max-running-requests), the #max-running-requests
...
...
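To make the effect of the two memory thresholds concrete, here is a minimal standalone sketch of the same checks. The helper name `extend_capture_bs`, the base list, and the example memory values are assumptions made for illustration and are not taken from the repository. The sketch also assumes that `gpu_mem` uses the unit reported by `get_device_memory_capacity` (taken to be MiB here), and it adds a de-duplication step that is not part of the diff above.

```python
from typing import List, Optional


def extend_capture_bs(capture_bs: List[int], gpu_mem: Optional[int]) -> List[int]:
    """Hypothetical helper mirroring the two threshold checks shown in the diff."""
    extended = list(capture_bs)  # copy so the caller's list is not mutated
    if gpu_mem is not None and gpu_mem > 96 * 1024:
        # Existing check: larger-memory GPUs also capture batch sizes 160..256, step 8.
        extended += list(range(160, 257, 8))
    if gpu_mem is not None and gpu_mem > 180 * 1000:
        # Check added by this commit: capture batch sizes 256..512, step 16.
        extended += list(range(256, 513, 16))
    # Assumption for this sketch only: 256 appears in both ranges (and 160 may
    # already be in the base list), so de-duplicate and sort before returning.
    return sorted(set(extended))


# Illustrative base list and memory values (assumptions, not repository defaults).
base = [1, 2, 4, 8, 16, 32, 64, 128, 160]
unknown_gpu = extend_capture_bs(base, None)
mid_gpu = extend_capture_bs(base, 141 * 1024)
large_gpu = extend_capture_bs(base, 190_000)

print(max(unknown_gpu), max(mid_gpu), max(large_gpu))
```

With these assumed inputs, only the last case crosses the `180 * 1000` threshold, so only it reaches a maximum captured batch size of 512. Note that the two thresholds use different multipliers (`1024` and `1000`), exactly as written in the diff.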