[fix] Resolve the moe_fused_gate compilation error and remove the changes to the MTP part of MLA
- Restore the default setting of disable_cascade_attn
- Add VLLM_USE_OPT_ZEROS to replace triton_ (torch.zeros)
- Set default_max_num_batched_tokens = 10240
- Update the layernorm in qwen3_moe