- 18 Dec, 2025 1 commit
-
-
zhuwenwen authored
-
- 17 Dec, 2025 1 commit
-
-
zhuwenwen authored
Fix the w4a16 conflict in CompressedTensorsLinearMethod
feat(moe): add Marlin W16A16 fused MoE behind VLLM_USE_MARLIN_W16A16_MOE
Replace the fp8_mqa_logits and fp8_paged_mqa_logits interfaces in deepgemm with mqa_logits and paged_mqa_logits from lightop
-
- 13 Dec, 2025 6 commits
- 05 Dec, 2025 4 commits
- 04 Dec, 2025 7 commits
-
-
zhuwenwen authored
-
zhuwenwen authored
-
zhuwenwen authored
-
zhuwenwen authored
-
zhuwenwen authored
Add new benchmark configurations for gfx936_80cu with E=512,N=64 and E=512,N=128
Qwen3-Next-80B-A3B-Instruct nn tp4 tp8 moe json
See merge request dcutoolkit/deeplearing/vllm!283
-
laibao authored
Add new benchmark configurations for gfx936_80cu with E=512,N=64 and E=512,N=128
Qwen3-Next-80B-A3B-Instruct nn tp4 tp8 moe json
-
- 03 Dec, 2025 5 commits
-
-
zhuwenwen authored
-
zhuwenwen authored
Add VLLM_USE_OPT_RESHAPE_AND_CACHE, VLLM_USE_FUSE_SILU_AND_MUL, and VLLM_USE_TOPK_RENORM for qwen3-30b
-
zhuwenwen authored
-
Arpit Khandelwal authored
Signed-off-by: arpitkh101 <arpit5khandelwal@gmail.com>
Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
(cherry picked from commit d7284a26)
-
Lucas Wilkinson authored
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
(cherry picked from commit 5cdd6645)
-
- 02 Dec, 2025 16 commits
-
-
Isotr0py authored
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Signed-off-by: Isotr0py <2037008807@qq.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
(cherry picked from commit 0ec84221)
-
Julien Denize authored
Signed-off-by: juliendenize <julien.denize@mistral.ai>
(cherry picked from commit 1b1e35aa)
-
Julien Denize authored
Signed-off-by: juliendenize <julien.denize@mistral.ai>
(cherry picked from commit 5e5646e2)
-
Chauncey authored
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
Signed-off-by: Nick Hill <nhill@redhat.com>
Co-authored-by: Nick Hill <nhill@redhat.com>
(cherry picked from commit 0a9caca9)
-
Sage Moore authored
Signed-off-by: Sage Moore <sage@neuralmagic.com>
(cherry picked from commit e6f114ac)
-
jthomson04 authored
Signed-off-by: jthomson04 <jwillthomson19@gmail.com>
(cherry picked from commit 1528e079)
-
Benjamin Bartels authored
Signed-off-by: bbartels <benjamin@bartels.dev>
(cherry picked from commit 2d613de9)
-
Matthew Bonanni authored
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Co-authored-by: Benjamin Chislett <bchislett@nvidia.com>
(cherry picked from commit 51c57b51)
-
Cyrus Leung authored
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
(cherry picked from commit 68ffbca7)
-
Harry Mellor authored
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
(cherry picked from commit 951445a5)
-
Julien Denize authored
Signed-off-by: Julien Denize <julien.denize@mistral.ai>
Signed-off-by: Julien Denize <40604584+juliendenize@users.noreply.github.com>
Signed-off-by: Mickael Seznec <mickael@mistral.ai>
Signed-off-by: Roger Wang <hey@rogerw.io>
Co-authored-by: Roger Wang <hey@rogerw.io>
Co-authored-by: Mickael Seznec <mickael@mistral.ai>
-
Louie Tsai authored
Signed-off-by: Tsai, Louie <louie.tsai@intel.com>
Signed-off-by: Louie Tsai <louie.tsai@intel.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Li, Jiang <bigpyj64@gmail.com>
-
Boyuan Feng authored
Signed-off-by: Boyuan Feng <boyuan@meta.com>
-
杰兮 authored
Signed-off-by: zhyajie <yajizhan@amd.com>
Co-authored-by: zhyajie <yajizhan@amd.com>
-
Boyuan Feng authored
Signed-off-by: Boyuan Feng <boyuan@meta.com>
-
Wushi Dong authored
Signed-off-by: Wushi Dong <dongws@meta.com>
-