[Kernel] Update vllm-flash-attn version to reduce CPU overheads (#10742)

Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>

Commit 8c1e77fb (unverified), authored Nov 28, 2024 by Woosuk Kwon, committed by GitHub Nov 28, 2024

Showing with 1 addition and 1 deletion

CMakeLists.txt +1 -1

--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -522,7 +522,7 @@ else()
  FetchContent_Declare(
          vllm-flash-attn
          GIT_REPOSITORY https://github.com/vllm-project/flash-attention.git
-          GIT_TAG d886f88165702b3c7e7744502772cd98b06be9e1
+          GIT_TAG fdf6d72b48aea41f4ae6a89139a453dae554abc8
          GIT_PROGRESS TRUE
          # Don't share the vllm-flash-attn build between build types
          BINARY_DIR ${CMAKE_BINARY_DIR}/vllm-flash-attn