OpenDAS / vllm_cscc
Commit 3b450752
Authored Oct 18, 2025 by Nick Hill
Committed by GitHub, Oct 18, 2025

[Minor] Add some clarifying comments to recent changes (#27130)

Signed-off-by: Nick Hill <nhill@redhat.com>

Parent: 168e578e

Showing 2 changed files with 10 additions and 2 deletions

vllm/distributed/device_communicators/shm_broadcast.py  +7 -1
vllm/v1/core/sched/output.py  +3 -1

--- a/vllm/distributed/device_communicators/shm_broadcast.py
+++ b/vllm/distributed/device_communicators/shm_broadcast.py
@@ -236,7 +236,9 @@ class MessageQueue:
         n_reader,  # number of all readers
         n_local_reader,  # number of local readers through shared memory
         local_reader_ranks: list[int] | None = None,
-        max_chunk_bytes: int = 1024 * 1024 * 24,  # 24MiB
+        # Default of 24MiB chosen to be large enough to accommodate grammar
+        # bitmask tensors for large batches (1024 requests).
+        max_chunk_bytes: int = 1024 * 1024 * 24,
         max_chunks: int = 10,
         connect_ip: str | None = None,
     ):
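
The new comment ties the 24MiB default to the size of a grammar bitmask for 1024 requests. The following is a rough back-of-the-envelope sketch of that arithmetic, not code from the commit. It assumes the bitmask stores one bit per vocabulary token packed into int32 words, and it uses a hypothetical vocabulary size of 128,000. The helper name `bitmask_bytes` is made up for illustration, and serialization overhead is ignored.

```python
import math

INT32_BYTES = 4
# Assumption: one bit per vocabulary token, packed into 32-bit words.
TOKENS_PER_WORD = 32

# Default value of max_chunk_bytes shown in the hunk.
MAX_CHUNK_BYTES = 1024 * 1024 * 24


def bitmask_bytes(num_rows: int, vocab_size: int) -> int:
    """Raw size in bytes of a packed bitmask with num_rows rows (hypothetical helper)."""
    words_per_row = math.ceil(vocab_size / TOKENS_PER_WORD)
    return num_rows * words_per_row * INT32_BYTES


# Hypothetical vocabulary size, chosen only for illustration.
size = bitmask_bytes(num_rows=1024, vocab_size=128_000)
print(size, size <= MAX_CHUNK_BYTES)
```

Under these assumptions a 1024-row bitmask is well under the default chunk size. Larger vocabularies or extra rows per request would use up more of that headroom.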
@@ -538,6 +540,10 @@ class MessageQueue:
                     buf[0] = 1  # overflow
                 self.local_socket.send_multipart(all_buffers, copy=False)
             else:
+                # Byte 0: 0
+                # Bytes 1-2: Count of buffers
+                # Then each buffer follows, preceded by 4 bytes containing its length:
+                # [4 byte int L][L bytes of buffer content] ...
                 with self.acquire_write(timeout) as buf:
                     buf[0] = 0  # not overflow
                     offset = 3
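
The layout described in the added comment (flag byte, 2-byte buffer count, then length-prefixed buffers) can be sketched as a standalone encoder and decoder. This is an illustrative sketch and not the `MessageQueue` implementation. The function names `pack_chunk` and `unpack_chunk` are hypothetical, and the big-endian unsigned byte order is an assumption, since the comment does not state the byte order.

```python
import struct

# Assumption: big-endian unsigned integers. The diff comment does not say
# which byte order is used, so check the surrounding code before relying on it.
COUNT_FMT = ">H"  # bytes 1-2: count of buffers
LEN_FMT = ">I"    # 4-byte length prefix in front of each buffer


def pack_chunk(buffers: list[bytes]) -> bytes:
    """Encode buffers using the non-overflow layout (hypothetical helper)."""
    out = bytearray([0])  # byte 0: 0 means "not overflow"
    out += struct.pack(COUNT_FMT, len(buffers))
    for b in buffers:
        out += struct.pack(LEN_FMT, len(b))  # [4 byte int L]
        out += b                             # [L bytes of buffer content]
    return bytes(out)


def unpack_chunk(chunk: bytes) -> list[bytes]:
    """Decode a chunk produced by pack_chunk (hypothetical helper)."""
    if chunk[0] != 0:
        raise ValueError("overflow flag set; payload is not in this chunk")
    (count,) = struct.unpack_from(COUNT_FMT, chunk, 1)
    offset = 3  # payload starts after the flag byte and the 2-byte count
    buffers = []
    for _ in range(count):
        (length,) = struct.unpack_from(LEN_FMT, chunk, offset)
        offset += 4
        buffers.append(chunk[offset:offset + length])
        offset += length
    return buffers


payload = [b"ab", b""]
assert unpack_chunk(pack_chunk(payload)) == payload
```

The starting `offset = 3` in the diff matches this layout: one flag byte plus two count bytes precede the first length prefix.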
--- a/vllm/v1/core/sched/output.py
+++ b/vllm/v1/core/sched/output.py
@@ -165,7 +165,9 @@ class SchedulerOutput:
     # freed from the encoder cache.
     free_encoder_mm_hashes: list[str]
 
-    # ids of structured outputs requests included in the bitmask, in order.
+    # ids of structured outputs requests included in the bitmask, in the
+    # same order as the corresponding stacked rows of the bitmask.
+    # There may be more than one row per request in the case of speculative decoding.
     structured_output_request_ids: list[str]
     # the bitmask for the whole batch
     grammar_bitmask: "npt.NDArray[np.int32] | None"
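
The revised comment says the request ids follow the order of the stacked bitmask rows, and that a request may own more than one row under speculative decoding. A minimal sketch of how a consumer could slice the rows back out per request follows. The helper `split_bitmask_rows` and its `rows_per_request` argument are hypothetical: the diff does not show where the per-request row counts come from, so they are passed in explicitly here.

```python
from collections.abc import Sequence


def split_bitmask_rows(
    request_ids: Sequence[str],
    rows_per_request: Sequence[int],
    bitmask: Sequence,
) -> dict[str, Sequence]:
    """Map each request id to its contiguous block of bitmask rows.

    Hypothetical helper: rows_per_request is an assumed input. Slicing works
    the same way for a list of rows or a 2-D NumPy array.
    """
    if len(request_ids) != len(rows_per_request):
        raise ValueError("need exactly one row count per request id")
    if sum(rows_per_request) != len(bitmask):
        raise ValueError("row counts do not add up to the number of bitmask rows")
    rows = {}
    start = 0
    for req_id, n in zip(request_ids, rows_per_request):
        rows[req_id] = bitmask[start:start + n]  # rows are stacked in id order
        start += n
    return rows


# Toy bitmask with 4 rows: "req-a" owns 1 row, "req-b" owns 3 rows.
toy_bitmask = [[0, 1, 2], [3, 4, 5], [6, 7, 8], [9, 10, 11]]
per_request = split_bitmask_rows(["req-a", "req-b"], [1, 3], toy_bitmask)
```

The length checks are there because a mismatch between the id list and the row count would silently assign rows to the wrong request.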