zhaoyu6 / sglang
Unverified commit 4a102a2b, authored Jun 10, 2025 by Lianmin Zheng; committed by GitHub on Jun 10, 2025
Minor style fix in cuda_graph_runner.py (#7053)
Parent: 6406408a
Showing 1 changed file with 5 additions and 4 deletions

python/sglang/srt/model_executor/cuda_graph_runner.py (+5, -4)
@@ -152,10 +152,11 @@ def get_batch_sizes_to_capture(model_runner: ModelRunner):
             )
 
         gpu_mem = get_device_memory_capacity()
-        if gpu_mem is not None and gpu_mem > 96 * 1024:
-            capture_bs += list(range(160, 257, 8))
-        if gpu_mem is not None and gpu_mem > 180 * 1000:
-            capture_bs += list(range(256, 513, 16))
+        if gpu_mem is not None:
+            if gpu_mem > 90 * 1024:  # H200
+                capture_bs += list(range(160, 257, 8))
+            if gpu_mem > 160 * 1000:  # B200, MI300
+                capture_bs += list(range(256, 513, 16))
 
     if max(capture_bs) > model_runner.req_to_token_pool.size:
         # In some cases (e.g., with a small GPU or --max-running-requests), the #max-running-requests
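For readers skimming the change: the refactor nests the two memory thresholds under a single None check and retunes them (96 * 1024 to 90 * 1024, and 180 * 1000 to 160 * 1000), so larger-memory GPUs such as the H200, B200, and MI300 get extra, coarser batch sizes appended to the CUDA graph capture list. The sketch below restates that logic as a standalone function, assuming get_device_memory_capacity() reports memory in megabytes; the names extend_capture_bs, base_capture_bs, and gpu_mem_mb are hypothetical stand-ins for values computed earlier in get_batch_sizes_to_capture(), not names from the actual file.

# Illustrative sketch only; function and parameter names are hypothetical,
# not part of sglang.
from typing import List, Optional

def extend_capture_bs(base_capture_bs: List[int], gpu_mem_mb: Optional[int]) -> List[int]:
    capture_bs = list(base_capture_bs)
    if gpu_mem_mb is not None:
        if gpu_mem_mb > 90 * 1024:  # e.g. H200 (~141 GB)
            capture_bs += list(range(160, 257, 8))
        if gpu_mem_mb > 160 * 1000:  # e.g. B200, MI300 (~192 GB)
            capture_bs += list(range(256, 513, 16))
    return capture_bs

# An H200-sized device (141 * 1024 MB) gets the 160-256 range only;
# a B200/MI300-sized device also gets the 256-512 range.
print(max(extend_capture_bs([1, 2, 4, 8], 141 * 1024)))  # 256
print(max(extend_capture_bs([1, 2, 4, 8], 192 * 1024)))  # 512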