OpenDAS / vllm_cscc
Commit 22482e49 (unverified)
Authored Oct 04, 2024 by Lucas Wilkinson
Committed by GitHub, Oct 04, 2024
[Bugfix] Flash attention arches not getting set properly (#9062)
Parent: 3d826d2c
Showing 1 changed file with 11 additions and 0 deletions

CMakeLists.txt (+11, -0)
@@ -482,6 +482,17 @@ if (NOT VLLM_TARGET_DEVICE STREQUAL "cuda")
   return()
 endif()
 
+# vLLM flash attention requires VLLM_GPU_ARCHES to contain the set of target
+# arches in the CMake syntax (75-real, 89-virtual, etc), since we clear the
+# arches in the CUDA case (and instead set the gencodes on a per file basis)
+# we need to manually set VLLM_GPU_ARCHES here.
+if(VLLM_GPU_LANG STREQUAL "CUDA")
+  foreach(_ARCH ${CUDA_ARCHS})
+    string(REPLACE "." "" _ARCH "${_ARCH}")
+    list(APPEND VLLM_GPU_ARCHES "${_ARCH}-real")
+  endforeach()
+endif()
+
 #
 # Build vLLM flash attention from source
 #
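
To see what the added loop produces, the following is a minimal standalone sketch of the same transformation, runnable in CMake script mode. The file name `arches_demo.cmake` and the sample value of `CUDA_ARCHS` are assumptions chosen for illustration; in the real build `CUDA_ARCHS` is computed earlier in `CMakeLists.txt`, and the `VLLM_GPU_LANG` guard is left out here because script mode has no configured language.

```cmake
# Standalone sketch; run with: cmake -P arches_demo.cmake
# CUDA_ARCHS below is a sample value for illustration, not taken from the commit.
set(CUDA_ARCHS "7.5;8.0;8.9")
set(VLLM_GPU_ARCHES "")

foreach(_ARCH ${CUDA_ARCHS})
  # Drop the dot so "7.5" becomes "75", the form CMake uses for CUDA architectures.
  string(REPLACE "." "" _ARCH "${_ARCH}")
  # The "-real" suffix requests code for the real architecture only (no PTX).
  list(APPEND VLLM_GPU_ARCHES "${_ARCH}-real")
endforeach()

# Print the resulting semicolon-separated list.
message(STATUS "VLLM_GPU_ARCHES = ${VLLM_GPU_ARCHES}")
```

With the sample input above, the resulting list should be `75-real;80-real;89-real`, which matches the "75-real" style the commit comment describes.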