OpenDAS / vllm_cscc
Commit d201d419 (unverified)
Authored Nov 12, 2024 by Yuan; committed by GitHub, Nov 12, 2024

[CI][CPU]refactor CPU tests to allow to bind with different cores (#10222)

Signed-off-by: Yuan Zhou <yuan.zhou@intel.com>

Parent: 3a28f18b

Showing 1 changed file with 11 additions and 7 deletions

.buildkite/run-cpu-test.sh (+11 -7)
@@ -4,9 +4,13 @@
 # It serves a sanity check for compilation and basic model usage.
 set -ex

+# allow to bind to different cores
+CORE_RANGE=${CORE_RANGE:-48-95}
+NUMA_NODE=${NUMA_NODE:-1}
+
 # Try building the docker image
-numactl -C 48-95 -N 1 docker build -t cpu-test -f Dockerfile.cpu .
-numactl -C 48-95 -N 1 docker build --build-arg VLLM_CPU_DISABLE_AVX512="true" -t cpu-test-avx2 -f Dockerfile.cpu .
+numactl -C $CORE_RANGE -N $NUMA_NODE docker build -t cpu-test -f Dockerfile.cpu .
+numactl -C $CORE_RANGE -N $NUMA_NODE docker build --build-arg VLLM_CPU_DISABLE_AVX512="true" -t cpu-test-avx2 -f Dockerfile.cpu .

 # Setup cleanup
 remove_docker_container() { docker rm -f cpu-test cpu-test-avx2 || true; }
@@ -14,10 +18,10 @@ trap remove_docker_container EXIT
 remove_docker_container

 # Run the image, setting --shm-size=4g for tensor parallel.
-docker run -itd --entrypoint /bin/bash -v ~/.cache/huggingface:/root/.cache/huggingface --cpuset-cpus=48-95 \
- --cpuset-mems=1 --privileged=true --network host -e HF_TOKEN --env VLLM_CPU_KVCACHE_SPACE=4 --shm-size=4g --name cpu-test cpu-test
-docker run -itd --entrypoint /bin/bash -v ~/.cache/huggingface:/root/.cache/huggingface --cpuset-cpus=48-95 \
- --cpuset-mems=1 --privileged=true --network host -e HF_TOKEN --env VLLM_CPU_KVCACHE_SPACE=4 --shm-size=4g --name cpu-test-avx2 cpu-test-avx2
+docker run -itd --entrypoint /bin/bash -v ~/.cache/huggingface:/root/.cache/huggingface --cpuset-cpus=$CORE_RANGE \
+ --cpuset-mems=$NUMA_NODE --privileged=true --network host -e HF_TOKEN --env VLLM_CPU_KVCACHE_SPACE=4 --shm-size=4g --name cpu-test cpu-test
+docker run -itd --entrypoint /bin/bash -v ~/.cache/huggingface:/root/.cache/huggingface --cpuset-cpus=$CORE_RANGE \
+ --cpuset-mems=$NUMA_NODE --privileged=true --network host -e HF_TOKEN --env VLLM_CPU_KVCACHE_SPACE=4 --shm-size=4g --name cpu-test-avx2 cpu-test-avx2

 function cpu_tests() {
   set -e
@@ -57,7 +61,7 @@ function cpu_tests() {
   docker exec cpu-test bash -c "
     set -e
     export VLLM_CPU_KVCACHE_SPACE=10
-    export VLLM_CPU_OMP_THREADS_BIND=48-92
+    export VLLM_CPU_OMP_THREADS_BIND=$CORE_RANGE
     python3 -m vllm.entrypoints.openai.api_server --model facebook/opt-125m --dtype half &
     timeout 600 bash -c 'until curl localhost:8000/v1/models; do sleep 1; done' || exit 1
     python3 benchmarks/benchmark_serving.py \
...
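
The override mechanism this commit introduces relies on the shell's `${VAR:-default}` expansion, plus the fact that a double-quoted `bash -c "..."` string is expanded by the calling shell before the inner shell runs. The following is a minimal sketch of both behaviours, not part of `run-cpu-test.sh`: the helper names `resolve_binding` and `inner_binding` are hypothetical, and a plain `bash -c` stands in for the `docker exec ... bash -c` call so that the sketch runs without Docker.

```shell
#!/usr/bin/env bash
# Illustrative sketch only; resolve_binding and inner_binding are hypothetical
# helpers that do not exist in .buildkite/run-cpu-test.sh.

# Start from a clean state so the defaults below are actually exercised.
unset CORE_RANGE NUMA_NODE

resolve_binding() {
  # ${VAR:-default} yields the default when VAR is unset or empty.
  local core_range=${CORE_RANGE:-48-95}
  local numa_node=${NUMA_NODE:-1}
  echo "cores=${core_range} node=${numa_node}"
}

inner_binding() {
  local core_range=${CORE_RANGE:-48-95}
  # Double quotes: the calling shell substitutes $core_range before the inner
  # bash starts, so the inner shell only ever sees the literal value.
  bash -c "echo VLLM_CPU_OMP_THREADS_BIND=$core_range"
}

resolve_binding                              # falls back to the defaults
CORE_RANGE=0-47 NUMA_NODE=0 resolve_binding  # per-call override
CORE_RANGE=0-47 inner_binding                # override reaches the inner shell
```

Assuming the script is started from the repository root, a run pinned elsewhere would presumably look like `CORE_RANGE=0-47 NUMA_NODE=0 bash .buildkite/run-cpu-test.sh`. The values `0-47` and `0` are example values only and have to match the core and NUMA layout of the actual host.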