Merge branch 'v0.9.2-dev-ds-wm-1115' into 'v0.9.2-dev-ds'
[feat] 1. Adapt w8a8 Marlin to DeepEP low-latency mode; 2. In non-naive EP mode, remove the redundant DP padding to avoid the allreduce overhead.
See merge request dcutoolkit/deeplearing/vllm!256