OpenDAS / vllm_cscc
Commit d2aab336 (unverified)
Authored Jul 31, 2025 by Daniele; committed by GitHub on Jul 31, 2025
[CI/Build] get rid of unused VLLM_FA_CMAKE_GPU_ARCHES (#21599)
Signed-off-by: Daniele Trifirò <dtrifiro@redhat.com>
Parent: 9532a6d5
Showing 5 changed files with 2 additions and 11 deletions:
.buildkite/scripts/hardware_ci/run-gh200-test.sh  +1 -2
.github/workflows/scripts/build.sh  +0 -1
docker/Dockerfile  +0 -3
docker/Dockerfile.nightly_torch  +0 -3
docs/deployment/docker.md  +1 -2
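The commit title describes `VLLM_FA_CMAKE_GPU_ARCHES` as unused. One way to confirm that no references remain once the changes are applied is a case-insensitive recursive search, which also catches the lowercase `vllm_fa_cmake_gpu_arches` build argument. The following is a minimal sketch, not part of the commit: `check_unused_var` is a hypothetical helper name, and it assumes a `grep` that supports `--exclude-dir` (GNU and BSD grep do).

```shell
# Hypothetical helper, not part of the repository.
# $1: variable name to search for (matched case-insensitively)
# $2: directory to scan (defaults to the current directory)
# Prints any matching lines; returns 0 if there are none, 1 otherwise.
check_unused_var() {
  if grep -rIni --exclude-dir=.git -- "$1" "${2:-.}"; then
    return 1
  fi
  return 0
}
```

Run from the repository root as `check_unused_var VLLM_FA_CMAKE_GPU_ARCHES .`; any printed line is a leftover reference.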
.buildkite/scripts/hardware_ci/run-gh200-test.sh
@@ -16,8 +16,7 @@ DOCKER_BUILDKIT=1 docker build . \
   --build-arg max_jobs=66 \
   --build-arg nvcc_threads=2 \
   --build-arg RUN_WHEEL_CHECK=false \
-  --build-arg torch_cuda_arch_list="9.0+PTX" \
-  --build-arg vllm_fa_cmake_gpu_arches="90-real"
+  --build-arg torch_cuda_arch_list="9.0+PTX"
 # Setup cleanup
 remove_docker_container() { docker rm -f gh200-test || true; }
.github/workflows/scripts/build.sh
@@ -15,7 +15,6 @@ $python_executable -m pip install -r requirements/build.txt -r requirements/cuda
 export MAX_JOBS=1
 # Make sure release wheels are built for the following architectures
 export TORCH_CUDA_ARCH_LIST="7.0 7.5 8.0 8.6 8.9 9.0+PTX"
-export VLLM_FA_CMAKE_GPU_ARCHES="80-real;90-real"
 bash tools/check_repo.sh
docker/Dockerfile
@@ -164,9 +164,6 @@ RUN --mount=type=cache,target=/root/.cache/uv \
 # see https://github.com/pytorch/pytorch/pull/123243
 ARG torch_cuda_arch_list='7.0 7.5 8.0 8.9 9.0 10.0 12.0'
 ENV TORCH_CUDA_ARCH_LIST=${torch_cuda_arch_list}
-# Override the arch list for flash-attn to reduce the binary size
-ARG vllm_fa_cmake_gpu_arches='80-real;90-real'
-ENV VLLM_FA_CMAKE_GPU_ARCHES=${vllm_fa_cmake_gpu_arches}
 #################### BASE BUILD IMAGE ####################
 #################### WHEEL BUILD IMAGE ####################
docker/Dockerfile.nightly_torch
@@ -114,9 +114,6 @@ RUN cat torch_build_versions.txt
 # explicitly set the list to avoid issues with torch 2.2
 # see https://github.com/pytorch/pytorch/pull/123243
-# Override the arch list for flash-attn to reduce the binary size
-ARG vllm_fa_cmake_gpu_arches='80-real;90-real'
-ENV VLLM_FA_CMAKE_GPU_ARCHES=${vllm_fa_cmake_gpu_arches}
 #################### BASE BUILD IMAGE ####################
 #################### WHEEL BUILD IMAGE ####################
docs/deployment/docker.md
@@ -106,8 +106,7 @@ of PyTorch Nightly and should be considered **experimental**. Using the flag `--
   -t vllm/vllm-gh200-openai:latest \
   --build-arg max_jobs=66 \
   --build-arg nvcc_threads=2 \
-  --build-arg torch_cuda_arch_list="9.0 10.0+PTX" \
-  --build-arg vllm_fa_cmake_gpu_arches="90-real"
+  --build-arg torch_cuda_arch_list="9.0 10.0+PTX"
 ```
 !!! note