Commit 083ff6ec authored by zhuwenwen

when v1 is set to block_size 16, switch to triton implementation

parent a4fc4d7e
@@ -299,6 +299,7 @@ class RocmPlatform(Platform):
     return FLASH_ATTN_V1
 else:
+    os.environ['VLLM_USE_FLASH_ATTN_PA'] = '0'
     logger.info_once("Using Triton backend on V1 engine.")
     return TRITON_ATTN_VLLM_V1
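
The effect of the added line can be sketched in isolation. This is a minimal illustration, not the vLLM code: the function `select_v1_backend`, the string values of the two constants, and the `block_size != 16` condition are all assumptions, since the diff elides the actual `if` condition and only the commit message ties the Triton branch to block_size 16. What the hunk itself shows is that the Triton branch now sets `VLLM_USE_FLASH_ATTN_PA` to `'0'` before returning.

```python
import os
from typing import MutableMapping

# Placeholder values; the real constants live in vLLM and are not shown in the diff.
FLASH_ATTN_V1 = "FLASH_ATTN_V1"
TRITON_ATTN_VLLM_V1 = "TRITON_ATTN_VLLM_V1"


def select_v1_backend(
    block_size: int,
    env: MutableMapping[str, str] = os.environ,
) -> str:
    """Hypothetical stand-in for the V1 branch of the backend selection."""
    # Assumption: the elided condition sends block_size 16 to the else branch.
    if block_size != 16:
        return FLASH_ATTN_V1
    # Mirrors the line added by this commit: set the flag to '0' on the Triton path.
    env["VLLM_USE_FLASH_ATTN_PA"] = "0"
    return TRITON_ATTN_VLLM_V1
```

Passing a plain dict as `env` keeps the sketch free of side effects on the real process environment; with the default argument it writes to `os.environ`, as the commit does.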