OpenDAS / vllm_cscc
Commit 8fae5ed7 (Unverified)
Authored Sep 25, 2024 by Woo-Yeon Lee
Committed by GitHub, Sep 25, 2024

[Misc] Fix minor typo in scheduler (#8765)
Parent: 3368c3ab
Changes: 1 changed file with 3 additions and 3 deletions

vllm/core/scheduler.py (+3 -3)
...
@@ -1554,14 +1554,14 @@ class Scheduler:
                 # the number of new tokens that is dividable by the block size
                 # to avoid partial block matching.
                 block_size = self.cache_config.block_size
-                reminder = budget.token_budget % block_size
-                if reminder != 0:
+                remainder = budget.token_budget % block_size
+                if remainder != 0:
                     raise ValueError("When enabling chunked prefill and "
                                      "prefix caching, max_num_batched_tokens "
                                      "(chunk size) must be dividable by "
                                      "block size, but got chunk_size "
                                      f"({budget.token_budget}) % block_size "
-                                     f"({block_size}) = {reminder}")
+                                     f"({block_size}) = {remainder}")
                 if remaining_token_budget < num_new_tokens:
                     num_new_tokens = (remaining_token_budget //
                                       block_size) * block_size
...
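
The check touched by this hunk can be read in isolation. The following is a minimal standalone sketch of the same arithmetic, not the scheduler's actual method: the function name `cap_new_tokens_to_block_multiple` and its flat argument list are hypothetical, standing in for the `budget` and `cache_config` objects used in `vllm/core/scheduler.py`.

```python
def cap_new_tokens_to_block_multiple(
    token_budget: int,
    remaining_token_budget: int,
    num_new_tokens: int,
    block_size: int,
) -> int:
    """Hypothetical helper mirroring the logic shown in the hunk above."""
    # The chunk size must be a whole number of cache blocks.
    remainder = token_budget % block_size
    if remainder != 0:
        raise ValueError(
            f"chunk size ({token_budget}) must be divisible by block size "
            f"({block_size}), but the remainder is {remainder}"
        )
    # If the request does not fit in what is left of the budget, round the
    # allocation down to a multiple of the block size so that no block is
    # only partially filled.
    if remaining_token_budget < num_new_tokens:
        num_new_tokens = (remaining_token_budget // block_size) * block_size
    return num_new_tokens


# 100 tokens remain, 300 requested, 16-token blocks: 6 full blocks fit.
print(cap_new_tokens_to_block_multiple(512, 100, 300, 16))  # 96
```

With a budget of 500 and a block size of 16 the same call raises `ValueError`, since 500 % 16 is 4.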