add VLLM_USE_FUSED_CACHE_QUANT_BMM_MLA to use fused rmsnorm + contiguous +...
Add VLLM_USE_FUSED_CACHE_QUANT_BMM_MLA to enable the fused rmsnorm + contiguous + rope (for dpsk-v3) + concat_and_cache_mla + q quant path, and to control bmm (TODO) + cat + mla (fp8).
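
For illustration only, here is a minimal sketch of how a boolean flag like this could be read from the environment. The helper name `use_fused_cache_quant_bmm_mla`, the "0"/"1" integer convention, and the default of off are assumptions, not taken from this commit.

```python
import os

FLAG = "VLLM_USE_FUSED_CACHE_QUANT_BMM_MLA"


def use_fused_cache_quant_bmm_mla(environ=None) -> bool:
    """Hypothetical helper: True when the flag is set to a nonzero integer.

    Assumption: an unset flag means the fused path is disabled.
    """
    env = os.environ if environ is None else environ
    return bool(int(env.get(FLAG, "0")))
```

Under these assumptions, exporting `VLLM_USE_FUSED_CACHE_QUANT_BMM_MLA=1` before launching would select the fused path, and leaving it unset would keep the existing one.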