Commit 8c1e77fb authored by Woosuk Kwon, committed by GitHub

[Kernel] Update vllm-flash-attn version to reduce CPU overheads (#10742)


Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
parent 5fc5ce0f
@@ -522,7 +522,7 @@ else()
FetchContent_Declare(
vllm-flash-attn
GIT_REPOSITORY https://github.com/vllm-project/flash-attention.git
-GIT_TAG d886f88165702b3c7e7744502772cd98b06be9e1
+GIT_TAG fdf6d72b48aea41f4ae6a89139a453dae554abc8
GIT_PROGRESS TRUE
# Don't share the vllm-flash-attn build between build types
BINARY_DIR ${CMAKE_BINARY_DIR}/vllm-flash-attn