Commits · v0.14.1-dev · OpenDAS / vllm_cscc

04 Feb, 2026 2 commits

Merge branch 'v0.14.1-dev_yql_2.33' into 'v0.14.1-dev' · ce63ba55

yangql authored Feb 04, 2026

Fix Triton support for AWQ, an interface bug in MoE models, interface issues in awq_moe_marlin, and some w4a16 accuracy issues

See merge request dcutoolkit/deeplearing/vllm!403

ce63ba55

Merge branch 'v0.14.1-dev' into 'v0.14.1-dev_yql_2.33' · 718d091f
yangql authored Feb 04, 2026
```
# Conflicts:
#   vllm/model_executor/layers/fused_moe/layer.py
```
718d091f

03 Feb, 2026 5 commits
- Merge branch 'v0.14.1-dev' of http://10.16.6.30/dcutoolkit/deeplearing/vllm into v0.14.1-dev · 4e4db0b4
  zhuwenwen authored Feb 03, 2026
  
  4e4db0b4
- __syncwarp isn't defined · f6653ed9
  zhuwenwen authored Feb 03, 2026
  
  f6653ed9
- Fix Triton support for AWQ, an interface bug in MoE models, interface issues in awq_moe_marlin, and some w4a16 accuracy issues · b129793f
  chenyue3 authored Feb 03, 2026
  
  b129793f
- Merge branch 'v0.14.1-dev-wm' into 'v0.14.1-dev' · de21e4d1
  zhuwenwen authored Feb 03, 2026
```
[feat] 1. Adapt ds w8a8 and MTP; 2. Add relaxed MTP; 3. Adapt w8a8 DEEPEP; 4. Fix the ds 671B accuracy anomaly

See merge request dcutoolkit/deeplearing/vllm!400
```
  de21e4d1
- [feat] 1. Adapt ds w8a8 and MTP; 2. Add relaxed MTP; 3. Adapt w8a8 DEEPEP; 4. Fix the ds 671B accuracy anomaly · fd1b3940
  王敏 authored Feb 03, 2026
  
  fd1b3940
02 Feb, 2026 2 commits
- skip aiter · 3280b0a0
  zhuwenwen authored Feb 02, 2026
  
  3280b0a0
- import envs · 6df19df0
  zhuwenwen authored Feb 02, 2026
  
  6df19df0
30 Jan, 2026 3 commits
- fix version · d942fe3e
  zhuwenwen authored Jan 30, 2026
  
  d942fe3e
- set MOE_NN=0, VLLM_USE_FUSED_RMS_ROPE=0, VLLM_USE_FUSE_SILU_AND_MUL=0 and VLLM_W8A8_BACKEND=1 · 3eccb64e
  zhuwenwen authored Jan 30, 2026
  
  3eccb64e
- update version · 39562a7f
  zhuwenwen authored Jan 30, 2026
  
  39562a7f
28 Jan, 2026 4 commits
- Merge branch 'v0.14.1-dev_yql_1.28_2' into 'v0.14.1-dev' · a4df8463
  zhuwenwen authored Jan 28, 2026
```
Fix the ViT attn import issue and the w4a16 GPTQ interface issue

See merge request dcutoolkit/deeplearing/vllm!395
```
  a4df8463
- Fix the ViT attn import issue and the w4a16 GPTQ interface issue · 8900b622
  chenyue3 authored Jan 28, 2026
  
  8900b622
- Merge branch 'v0.14.1-dev_yql_1.28' into 'v0.14.1-dev' · 82277b17
  zhuwenwen authored Jan 28, 2026
```
Fix the AWQ shape bug and add compatibility with lmslim registration imports

See merge request dcutoolkit/deeplearing/vllm!394
```
  82277b17
- Fix the AWQ shape bug and add compatibility with lmslim registration imports · eafda883
  chenyue3 authored Jan 28, 2026
  
  eafda883
26 Jan, 2026 2 commits
- update deps · e29bd946
  zhuwenwen authored Jan 26, 2026
  
  e29bd946
- Merge tag 'v0.14.1' into v0.14.0-dev · f9fc5e42
  zhuwenwen authored Jan 26, 2026
  
  f9fc5e42
23 Jan, 2026 4 commits
- [CI] fix version comparsion and exclusion patterns in upload-release-wheels.sh (#32971) · d7de043d
  Shengqi Chen authored Jan 23, 2026
```
Signed-off-by: Shengqi Chen <harry-chen@outlook.com>
(cherry picked from commit 136c499f)
```
  d7de043d
- [Bugfix] Fix Whisper/encoder-decoder GPU memory leak (#32789) · 4dc11b06
  Nicolò Lucchesi authored Jan 22, 2026
```
Signed-off-by: NickLucche <nlucches@redhat.com>
(cherry picked from commit ea6102b8)
```
  4dc11b06
- [Misc] Bump opencv-python dependecy version to 4.13 (#32668) · 2bd95d80
  Isotr0py authored Jan 22, 2026
```
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
(cherry picked from commit 444e2e7e)
```
  2bd95d80
- [Misc] Replace urllib's `urlparse` with urllib3's `parse_url` (#32746) · f46d576c
  Isotr0py authored Jan 22, 2026
```
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
(cherry picked from commit 8ebf271b)
```
  f46d576c
22 Jan, 2026 2 commits
- fix rn_add_forward_autograd import · 573915c9
  zhuwenwen authored Jan 22, 2026
  
  573915c9
- update use_nn_moe · 9ddd0f97
  zhuwenwen authored Jan 22, 2026
  
  9ddd0f97
21 Jan, 2026 3 commits
- fix fa error and remove layernorm kernel · 90ddfba8
  zhuwenwen authored Jan 21, 2026
  
  90ddfba8
- skip concat_and_cache_mla_rope_fused · 7f7894c0
  zhuwenwen authored Jan 21, 2026
  
  7f7894c0
- Merge tag 'v0.14.0' into v0.14.0-dev · 7e63ef82
  zhuwenwen authored Jan 21, 2026
  
  7e63ef82
18 Jan, 2026 1 commit
- [build] fix cu130 related release pipeline steps and publish as nightly image (#32522) · d6820940
  Shengqi Chen authored Jan 18, 2026
```
Signed-off-by: Shengqi Chen <harry-chen@outlook.com>
(cherry picked from commit 965765ae)
```
  d6820940
17 Jan, 2026 1 commit

[CI] Implement uploading to PyPI and GitHub in the release pipeline, enable... · b17039bc

Shengqi Chen authored Jan 17, 2026

[CI] Implement uploading to PyPI and GitHub in the release pipeline, enable release image building for CUDA 13.0 (#31032)

(cherry picked from commit 8e61425e)

b17039bc

16 Jan, 2026 11 commits
- [Frontend] Standardize use of `create_error_response` (#32319) · 48b67ba7
  Cyrus Leung authored Jan 14, 2026
```
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
```
  48b67ba7
- set VLLM_USE_MARLIN_W16A16_MOE=0 on bw · 8cbcac5d
  zhuwenwen authored Jan 16, 2026
  
  8cbcac5d
- Fix the copy required by custom cudagraph mode; must be used together with DTK. · f1bc9890
  zhuwenwen authored Jan 16, 2026
```
Distinguish between PCIe and hglink usage of custom allreduce
vllm: export VLLM_CUSTOM_CACHE=1
dtk: export HIP_KERNEL_EVENT_SYSTENFENCE=1

set VLLM_USE_FUSED_RMS_ROPE=1
add SUPPORT_MOE_MARLIN_W16A16 to use moe marlin on bw
support fa kvcache fp8 (todo: add VLLM_USE_QUERY_QUANT to not use q quant)
update moe_align_block_size
```
  f1bc9890
- Switch default w8a8 gemm impl to blaslt. · f06d1125
  zhuwenwen authored Jan 16, 2026
```
fix _forward_encoder_attention
remove medusa
set VLLM_PCIE_USE_CUSTOM_ALLREDUCE=1
```
  f06d1125
- [Bugfix] Fix ROCm dockerfiles (#32447) · 09f4264a
  TJian authored Jan 16, 2026
```
Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>
```
  09f4264a
- [CI] Fix LM Eval Large Models (H100) (#32423) · 7f42dc20
  Matthew Bonanni authored Jan 15, 2026
```
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
(cherry picked from commit bcf2333c)
```
  7f42dc20
- Cherry pick [ROCm] [CI] [Release] Rocm wheel pipeline with sccache #32264 · c2a37a3c
  TJian authored Jan 16, 2026
```
Signed-off-by: Kevin H. Luu <khluu000@gmail.com>
```
  c2a37a3c
- [UX] Use kv_offloading_backend=native by default (#32421) · 0e31fc79
  Michael Goin authored Jan 15, 2026
```
Signed-off-by: mgoin <mgoin64@gmail.com>
(cherry picked from commit 1be5a735)
```
  0e31fc79
- [ROCm][Bugfix] Disable hip sampler to fix deepseek's accuracy issue on ROCm (#32413) · 6ac0fcf4
  Pleaplusone authored Jan 16, 2026
```
Signed-off-by: ganyi <ygan@amd.com>
(cherry picked from commit 77c16df3)
```
  6ac0fcf4
- [ROCM] Add ROCm image build to release pipeline (#31995) · b6224972
  Douglas Lehr authored Jan 15, 2026
```
Signed-off-by: Doug Lehr <douglehr@amd.com>
Co-authored-by: Doug Lehr <douglehr@amd.com>
(cherry picked from commit c5891b54)
```
  b6224972
- [Bugfix][ROCm][performance] Resolve the performance regression issue of the... · 1b572752
  vllmellm authored Jan 15, 2026
```
[Bugfix][ROCm][performance] Resolve the performance regression issue of the Qwen3-Next-80B-A3B-Thinking under rocm_atten (#32336)
Signed-off-by: vllmellm <vllm.ellm@embeddedllm.com>
(cherry picked from commit e27078ea)
```
  1b572752