    Add FA Unified Attention 2D · eb35ba1b
    fanwl authored
    - Add the VLLM_V1_USE_FA_UNIFIED_ATTN_2D environment variable
    - 0: Triton attention, 1: FA unified attention
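
The 0/1 switch described in the commit message could be read roughly as in the sketch below. This is only an illustration, not the code from envs.py: the helper names `use_fa_unified_attn_2d` and `select_attention_backend`, the returned backend labels, and the default of `0` when the variable is unset are all assumptions.

```python
import os


def use_fa_unified_attn_2d(environ=os.environ) -> bool:
    """Hypothetical helper: True when the flag is set to "1".

    Assumption: an unset variable behaves like "0" (Triton attention).
    The actual default in envs.py is not stated in the commit message.
    """
    return environ.get("VLLM_V1_USE_FA_UNIFIED_ATTN_2D", "0") == "1"


def select_attention_backend(environ=os.environ) -> str:
    """Hypothetical selector mapping the flag to an illustrative label."""
    if use_fa_unified_attn_2d(environ):
        return "fa_unified_attention"  # flag value 1
    return "triton_attention"  # flag value 0 (or unset, by assumption)
```

Passing the environment in as a mapping keeps the sketch easy to exercise without changing the real process environment, for example `select_attention_backend({"VLLM_V1_USE_FA_UNIFIED_ATTN_2D": "1"})`.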
envs.py 93.1 KB