chenpangpang / transformers · Commits

Unverified commit a5b226ce
Authored Jul 23, 2024 by Cyril Vallez; committed by GitHub on Jul 23, 2024
Fix flash attention speed issue (#32028)
Add the lru_cache for speed
parent a1844a32

Showing 1 changed file with 1 addition and 0 deletions

src/transformers/utils/import_utils.py (+1, -0)
@@ -820,6 +820,7 @@ def is_flash_attn_greater_or_equal_2_10():
     return version.parse(importlib.metadata.version("flash_attn")) >= version.parse("2.1.0")
 
 
+@lru_cache()
 def is_flash_attn_greater_or_equal(library_version: str):
     if not _is_package_available("flash_attn"):
         return False
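The change wraps `is_flash_attn_greater_or_equal` in `functools.lru_cache`, so the comparatively slow `importlib.metadata.version` lookup and `version.parse` comparison run only once per distinct `library_version` rather than on every call. Below is a minimal sketch of the same caching pattern under that assumption; the helper name `_installed_version_at_least` and the `"2.6.0"` example version are illustrative and not part of the transformers API.

```python
# Minimal sketch of the caching pattern added by this commit; the helper name
# `_installed_version_at_least` is illustrative, not a transformers function.
import importlib.metadata
from functools import lru_cache

from packaging import version


@lru_cache()
def _installed_version_at_least(package: str, minimum: str) -> bool:
    """Cached check that `package` is installed at version `minimum` or newer.

    importlib.metadata.version() and version.parse() are relatively expensive,
    so lru_cache ensures repeated calls with the same arguments only pay that
    cost on the first invocation.
    """
    try:
        installed = importlib.metadata.version(package)
    except importlib.metadata.PackageNotFoundError:
        return False
    return version.parse(installed) >= version.parse(minimum)


# Example usage: the second call with the same arguments is served from the
# cache without touching importlib.metadata again.
print(_installed_version_at_least("flash_attn", "2.6.0"))
print(_installed_version_at_least("flash_attn", "2.6.0"))
```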