perf(qwen3): fuse q/k RMSNorm + RoPE
set fp8_e4m3 (only supported on nmz); support q & kvcache fp8; set VLLM_PCIE_USE_CUSTOM_ALLREDUCE=1