sglang
Commit 4bec99ec
Authored Aug 02, 2025 by Yusong Gao
Committed by GitHub, Aug 02, 2025
Fix: resolve prefill of retracted request out-of-memory issue when ignore_eos is enabled (#7434)
parent 89caf7a3
Showing 1 changed file with 3 additions and 1 deletion

python/sglang/srt/managers/schedule_policy.py (+3 -1)
...
...
@@ -455,7 +455,9 @@ class PrefillAdder:
         if not self.is_hybrid:
             # Skip this logic for swa. The SWA has different memory management, and
             # this mechanism is underestimating the memory usage.
-            cur_rem_tokens = self.cur_rem_tokens - len(req.origin_input_ids)
+            cur_rem_tokens = self.cur_rem_tokens - self.ceil_paged_tokens(
+                req.extend_input_len
+            )
             tokens_freed = 0
             for i, (tokens_left, tokens_occupied) in enumerate(self.req_states):
                 # tokens_left gives a reservative calculation as the last token is not stored
...
...
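
As a rough illustration of why the changed line matters, the sketch below compares the two budget estimates for a request that was retracted after generating some output. It is not the scheduler's code. The standalone helpers, the page size of 16, and all token counts are hypothetical. It also assumes that a retracted request re-prefills its prompt plus the tokens it had already generated, with no prefix-cache hit, so its `extend_input_len` exceeds the length of `origin_input_ids`.

```python
def ceil_paged_tokens(tokens: int, page_size: int) -> int:
    """Round a token count up to a whole number of pages.

    Hypothetical standalone stand-in for the method called in the diff.
    """
    return -(-tokens // page_size) * page_size


def remaining_old(cur_rem_tokens: int, origin_input_len: int) -> int:
    # Old estimate: only the original prompt length is charged.
    return cur_rem_tokens - origin_input_len


def remaining_new(cur_rem_tokens: int, extend_input_len: int, page_size: int) -> int:
    # New estimate: the full length to be prefilled is charged, page-aligned.
    return cur_rem_tokens - ceil_paged_tokens(extend_input_len, page_size)


# Illustrative numbers only (assumptions, not measurements).
cur_rem_tokens = 1000
origin_input_len = 100           # original prompt length
generated_before_retract = 400   # output produced before the retraction
extend_input_len = origin_input_len + generated_before_retract
page_size = 16

old_estimate = remaining_old(cur_rem_tokens, origin_input_len)
new_estimate = remaining_new(cur_rem_tokens, extend_input_len, page_size)

print(old_estimate)  # 900
print(new_estimate)  # 488
```

Under these assumptions the old estimate reports 412 more free tokens than the request can actually leave behind, which is the kind of overcommitment the commit title describes. With `ignore_eos` enabled, requests keep generating until they hit their length limit, so the gap can become large.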