"benchmarks/benchmark_text_completion.py" did not exist on "27f1410d065ceca53a07abd2518082eb25228e4f"
zhuwenwen authored
Add VLLM_USE_FUSED_CACHE_QUANT_BMM_MLA to enable fused rmsnorm + contiguous + rope (for dpsk-v3) + concat_and_cache_mla + q quant, and to control bmm (todo) + cat + mla (fp8)
9dd70f0e