OpenDAS / vllm_cscc
Commit 6557f493 (Unverified)
Authored Mar 30, 2026 by Li, Jiang
Committed by GitHub, Mar 30, 2026

[Bugfix][CPU] Skip set_num_threads after thread binding (#38535)

Signed-off-by: jiang1.li <jiang1.li@intel.com>

Parent: 677424c7
Changes: 2 changed files with 12 additions and 2 deletions

.buildkite/scripts/hardware_ci/run-cpu-distributed-smoke-test.sh  +3 -2
vllm/v1/worker/cpu_worker.py  +9 -0
.buildkite/scripts/hardware_ci/run-cpu-distributed-smoke-test.sh

 #!/bin/bash
 set -euox pipefail
 export VLLM_CPU_CI_ENV=0
+export VLLM_CPU_KVCACHE_SPACE=1 # avoid OOM
 echo "--- PP+TP"
-vllm serve meta-llama/Llama-3.2-3B-Instruct -tp=2 -pp=2 &
+vllm serve meta-llama/Llama-3.2-3B-Instruct -tp=2 -pp=2 --max-model-len=4096 &
 server_pid=$!
 timeout 600 bash -c "until curl localhost:8000/v1/models > /dev/null 2>&1; do sleep 1; done" || exit 1
 vllm bench serve \
 ...
@@ -23,7 +24,7 @@ if [ "$failed_req" -ne 0 ]; then
 fi
 echo "--- DP+TP"
-vllm serve meta-llama/Llama-3.2-3B-Instruct -tp=2 -dp=2 &
+vllm serve meta-llama/Llama-3.2-3B-Instruct -tp=2 -dp=2 --max-model-len=4096 &
 server_pid=$!
 timeout 600 bash -c "until curl localhost:8000/v1/models > /dev/null 2>&1; do sleep 1; done" || exit 1
 vllm bench serve \
 ...
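
The smoke test waits for the server by polling `localhost:8000/v1/models` once per second for up to 600 seconds. A minimal Python sketch of the same wait-with-deadline pattern follows. The names `wait_until` and `server_is_up` are hypothetical and not part of the commit. The probe also counts only a successful HTTP response as ready, which is an assumption and is stricter than the plain `curl` call in the script (without `--fail`, `curl` exits 0 on any HTTP response).

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def wait_until(probe: Callable[[], bool],
               timeout_s: float = 600.0,
               interval_s: float = 1.0) -> bool:
    """Call `probe` repeatedly until it returns True or the deadline passes.

    Returns True if the probe succeeded, False on timeout.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if probe():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


def server_is_up(url: str = "http://localhost:8000/v1/models") -> bool:
    """Hypothetical readiness probe: True only on an HTTP 200 response."""
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or HTTP error status: not ready yet.
        return False


# Example use (commented out so the sketch has no network side effects):
# if not wait_until(server_is_up, timeout_s=600, interval_s=1):
#     raise SystemExit(1)
```

Passing the probe in as a callable keeps the deadline logic independent of the HTTP check, so the loop can be exercised without a running server.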
vllm/v1/worker/cpu_worker.py

 ...
@@ -108,6 +108,15 @@ class CPUWorker(Worker):
         if ret:
             logger.info(ret)
+        # After the thread binding, changing thread num is not allowed
+        def skip_set_num_threads(x: int):
+            logger.warning(
+                "CPU backend doesn't allow to use "
+                "`torch.set_num_threads` after the thread binding, skip it."
+            )
+
+        torch.set_num_threads = skip_set_num_threads
+
         # Note: unique identifier for creating allreduce shared memory
         os.environ["VLLM_DIST_IDENT"] = self.distributed_init_method.split(":")[-1]
         # Initialize the distributed environment.
 ...