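Disable padding of sparse_mla's num_head to 64/128, and add an environment variable, FP8_USE_MIXED_BATCH, to control the fp8_use_mixed_batch mode. It defaults to false, which selects separate (non-mixed) mode.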
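
As a rough illustration of how such a switch might be read, here is a minimal sketch. It is not the code from this PR: the helper name `read_fp8_use_mixed_batch` and the set of accepted truthy strings are assumptions made for the example. Only the variable name FP8_USE_MIXED_BATCH and its default of false come from the description above.

```python
import os
from typing import Mapping, Optional

# Hypothetical helper, not taken from the PR: the name and the accepted
# truthy spellings below are assumptions made for illustration only.
_TRUTHY = {"1", "true", "yes", "on"}


def read_fp8_use_mixed_batch(environ: Optional[Mapping[str, str]] = None) -> bool:
    """Return True for mixed-batch mode, False for separate mode.

    An unset variable falls back to "false", matching the default
    stated in the description (separate mode).
    """
    env = os.environ if environ is None else environ
    raw = env.get("FP8_USE_MIXED_BATCH", "false")
    # Compare case-insensitively and ignore surrounding whitespace.
    return raw.strip().lower() in _TRUTHY


if __name__ == "__main__":
    mode = "mixed batch" if read_fp8_use_mixed_batch() else "separate"
    print(f"fp8 mode: {mode}")
```

Under these assumptions, running with `FP8_USE_MIXED_BATCH=true` set in the environment would select mixed-batch mode, while leaving it unset keeps separate mode. Any unrecognized value is treated as false here so that a typo cannot silently enable the non-default path.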