Commits · eafda8836417cc4760fe8d14b90ff992fe978b21 · OpenDAS / vllm_cscc

28 Jan, 2026 1 commit
- Fix the AWQ shape bug and add compatibility for the lmslim registration import case · eafda883
  chenyue3 authored Jan 28, 2026
  
  eafda883
26 Jan, 2026 2 commits
- update deps · e29bd946
  zhuwenwen authored Jan 26, 2026
  
  e29bd946
- Merge tag 'v0.14.1' into v0.14.0-dev · f9fc5e42
  zhuwenwen authored Jan 26, 2026
  
  f9fc5e42
23 Jan, 2026 4 commits
- [CI] fix version comparison and exclusion patterns in upload-release-wheels.sh (#32971) · d7de043d
  Shengqi Chen authored Jan 23, 2026
```
Signed-off-by: Shengqi Chen <harry-chen@outlook.com>
(cherry picked from commit 136c499f)
```
  d7de043d
- [Bugfix] Fix Whisper/encoder-decoder GPU memory leak (#32789) · 4dc11b06
  Nicolò Lucchesi authored Jan 22, 2026
```
Signed-off-by: NickLucche <nlucches@redhat.com>
(cherry picked from commit ea6102b8)
```
  4dc11b06
- [Misc] Bump opencv-python dependency version to 4.13 (#32668) · 2bd95d80
  Isotr0py authored Jan 22, 2026
```
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
(cherry picked from commit 444e2e7e)
```
  2bd95d80
- [Misc] Replace urllib's `urlparse` with urllib3's `parse_url` (#32746) · f46d576c
  Isotr0py authored Jan 22, 2026
```
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
(cherry picked from commit 8ebf271b)
```
  f46d576c
22 Jan, 2026 2 commits
- fix rn_add_forward_autograd import · 573915c9
  zhuwenwen authored Jan 22, 2026
  
  573915c9
- update use_nn_moe · 9ddd0f97
  zhuwenwen authored Jan 22, 2026
  
  9ddd0f97
21 Jan, 2026 3 commits
- fix fa error and remove layernorm kernel · 90ddfba8
  zhuwenwen authored Jan 21, 2026
  
  90ddfba8
- skip concat_and_cache_mla_rope_fused · 7f7894c0
  zhuwenwen authored Jan 21, 2026
  
  7f7894c0
- Merge tag 'v0.14.0' into v0.14.0-dev · 7e63ef82
  zhuwenwen authored Jan 21, 2026
  
  7e63ef82
18 Jan, 2026 1 commit
- [build] fix cu130 related release pipeline steps and publish as nightly image (#32522) · d6820940
  Shengqi Chen authored Jan 18, 2026
```
Signed-off-by: Shengqi Chen <harry-chen@outlook.com>
(cherry picked from commit 965765ae)
```
  d6820940
17 Jan, 2026 1 commit
- [CI] Implement uploading to PyPI and GitHub in the release pipeline, enable release image building for CUDA 13.0 (#31032) · b17039bc
  Shengqi Chen authored Jan 17, 2026
  (cherry picked from commit 8e61425e)
  b17039bc
16 Jan, 2026 11 commits
- [Frontend] Standardize use of `create_error_response` (#32319) · 48b67ba7
  Cyrus Leung authored Jan 14, 2026
```
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
```
  48b67ba7
- set VLLM_USE_MARLIN_W16A16_MOE=0 on bw · 8cbcac5d
  zhuwenwen authored Jan 16, 2026
  
  8cbcac5d
- Fix the copy required in custom cudagraph mode; must be used together with dtk. · f1bc9890
  zhuwenwen authored Jan 16, 2026
```
Distinguish between PCIe and hglink custom allreduce usage
vllm: export VLLM_CUSTOM_CACHE=1
dtk: export HIP_KERNEL_EVENT_SYSTENFENCE=1

set VLLM_USE_FUSED_RMS_ROPE=1
add SUPPORT_MOE_MARLIN_W16A16 to use moe marlin on bw
support fa kvcache fp8 (todo: add VLLM_USE_QUERY_QUANT to not use q quant)
update moe_align_block_size
```
  f1bc9890
- Switch default w8a8 gemm impl to blaslt. · f06d1125
  zhuwenwen authored Jan 16, 2026
```
fix _forward_encoder_attention
remove medusa
set VLLM_PCIE_USE_CUSTOM_ALLREDUCE=1
```
  f06d1125
- [Bugfix] Fix ROCm dockerfiles (#32447) · 09f4264a
  TJian authored Jan 16, 2026
```
Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>
```
  09f4264a
- [CI] Fix LM Eval Large Models (H100) (#32423) · 7f42dc20
  Matthew Bonanni authored Jan 15, 2026
```
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
(cherry picked from commit bcf2333c)
```
  7f42dc20
- Cherry pick [ROCm] [CI] [Release] Rocm wheel pipeline with sccache #32264 · c2a37a3c
  TJian authored Jan 16, 2026
```
Signed-off-by: Kevin H. Luu <khluu000@gmail.com>
```
  c2a37a3c
- [UX] Use kv_offloading_backend=native by default (#32421) · 0e31fc79
  Michael Goin authored Jan 15, 2026
```
Signed-off-by: mgoin <mgoin64@gmail.com>
(cherry picked from commit 1be5a735)
```
  0e31fc79
- [ROCm][Bugfix] Disable hip sampler to fix deepseek's accuracy issue on ROCm (#32413) · 6ac0fcf4
  Pleaplusone authored Jan 16, 2026
```
Signed-off-by: ganyi <ygan@amd.com>
(cherry picked from commit 77c16df3)
```
  6ac0fcf4
- [ROCM] Add ROCm image build to release pipeline (#31995) · b6224972
  Douglas Lehr authored Jan 15, 2026
```
Signed-off-by: Doug Lehr <douglehr@amd.com>
Co-authored-by: Doug Lehr <douglehr@amd.com>
(cherry picked from commit c5891b54)
```
  b6224972
- [Bugfix][ROCm][performance] Resolve the performance regression issue of the... · 1b572752
  vllmellm authored Jan 15, 2026
```
[Bugfix][ROCm][performance] Resolve the performance regression issue of the Qwen3-Next-80B-A3B-Thinking under rocm_atten (#32336)
Signed-off-by: vllmellm <vllm.ellm@embeddedllm.com>
(cherry picked from commit e27078ea)
```
  1b572752
13 Jan, 2026 11 commits
- [BugFix] [KVConnector] Fix KV events for LMCache connector (#32169) · 2c24bc69
  Martin Hickey authored Jan 13, 2026
```
Signed-off-by: Martin Hickey <martin.hickey@ie.ibm.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
```
  2c24bc69
- [Bugfix] Replace `PoolingParams.normalize` with `use_activation` (#32243) · 0aa8c405
  Cyrus Leung authored Jan 13, 2026
```
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
```
  0aa8c405
- [ROCm][Bugfix] Fix Mamba batched decode producing incorrect output (#32099) · 11b6af52
  Andreas Karatzas authored Jan 12, 2026
```
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
```
  11b6af52
- [Perf] Optimize requests abort (#32211) · 2a719e08
  Wentao Ye authored Jan 12, 2026
```
Signed-off-by: yewentao256 <zhyanwentao@126.com>
```
  2a719e08
- Fix various typos found in `docs` (#32212) · f243abc9
  Andrew Bennett authored Jan 12, 2026
```
Signed-off-by: Andrew Bennett <potatosaladx@meta.com>
```
  f243abc9
- [Frontend] Add `reasoning_effort` to `OpenAIServing._preprocess_chat()` (#31956) · 60b77e14
  Sanghoon Yoon authored Jan 13, 2026
```
Signed-off-by: Sanghoon Yoon <seanyoon@kakao.com>
```
  60b77e14
- [Misc] improve warning/assert messages (#32226) · 15b33ff0
  cjackal authored Jan 13, 2026
```
Signed-off-by: cjackal <44624812+cjackal@users.noreply.github.com>
```
  15b33ff0
- [BugFix] Fix engine crash caused by chat tools + response_format (#32127) · c6bb5b56
  Nick Hill authored Jan 12, 2026
```
Signed-off-by: Nick Hill <nickhill123@gmail.com>
```
  c6bb5b56
- [Misc] Allow enabling NCCL for DP sync when async scheduling (#32197) · 9273a427
  Nick Hill authored Jan 12, 2026
```
Signed-off-by: Nick Hill <nickhill123@gmail.com>
```
  9273a427
- [Model] Handle `trust_remote_code` for transformers backend (#32194) · 78d13ea9
  Cyrus Leung authored Jan 13, 2026
```
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
```
  78d13ea9
- [responsesAPI] add unit test for optional function tool call id (#32036) · a307ac07
  Andrew Xia authored Jan 12, 2026
```
Signed-off-by: Andrew Xia <axia@fb.com>
Co-authored-by: Andrew Xia <axia@fb.com>
```
  a307ac07
12 Jan, 2026 4 commits
- [ROCm][CI] Handle pytest status code 5 when a shard isn't allocated any tests (#32040) · a28d9f44
  Divakar Verma authored Jan 12, 2026
```
Signed-off-by: Divakar Verma <divakar.verma@amd.com>
```
  a28d9f44
- [Kernel][MoE] fix computation order of MoE weight multiplication and improve flow (#31962) · 629584bf
  xuebwang-amd authored Jan 13, 2026
```
Signed-off-by: xuebwang-amd <xuebwang@amd.com>
```
  629584bf
- [Model Runner V2] Add support for M-RoPE (#32143) · 0a7dd237
  Woosuk Kwon authored Jan 12, 2026
```
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
```
  0a7dd237
- [Model Runner V2] Minor refactor for logit_bias (#32209) · dec28688
  Woosuk Kwon authored Jan 12, 2026
```
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
```
  dec28688