VLLM_USE_LIGHTOP and VLLM_USE_OPT_CAT
Add shared_output and routed_scaling_factor to CompressedTensorsW8A8Int8MoEMethod