    add VLLM_USE_LIGHTOP_MOE_SUM_MUL_ADD · c2e6f453
    zhuwenwen authored
    support prefix cache on kme
    fix the error in test_moe caused by moe align not supporting 511 and 211
    switch multi-modal to the torch implementation on z100l & k100
utils.py 16.4 KB