Commits · 48eb976dd21d6b94d75995934619af8bf7de8dd9 · OpenDAS / vllm_cscc

"examples/offline_inference/rlhf_online_quant.py" did not exist on "151b08e0fea93af4eb128bf09fd3808f38a73319"

11 Oct, 2025 3 commits
- close profile.StartTracer · a92daffa
  maxiao1 authored Oct 11, 2025
  
- fix pd send async performance · 1707bebe
  maxiao1 authored Oct 11, 2025
  
- change tbo about cudagraph · e42a922c
  maxiao1 authored Oct 11, 2025
  
10 Oct, 2025 6 commits
- roll back version.py · 394648be
  lizhigong authored Oct 10, 2025
  
- token split by token adapt to pd separation & p2p can be used async · 1a135a9d
  maxiao1 authored Oct 10, 2025
  
- keep same version · fc5bfc66
  maxiao1 authored Oct 10, 2025
  
- pd separation_tbo · 12291212
  maxiao1 authored Oct 06, 2025
  
- When VLLM_USE_FLASH_ATTN_PA=0, support the use of default block_size · 3daae57c
  zhuwenwen authored Oct 10, 2025
  
- fix deepseek_v2.py · 2e1d8e9a
  zhuwenwen authored Oct 10, 2025
  
09 Oct, 2025 3 commits
- [fix] optimize mori ep · 8b791547
  王敏 authored Oct 09, 2025
  
- Revert "Merge branch 'v0.9.2-dev-lmcache-pd' into 'v0.9.2-dev'" · 7b30ecd8
  zhuwenwen authored Oct 09, 2025
```
This reverts merge request !219
```
- [fix] fix error when ep is not enabled · 7bf7df7f
  王敏 authored Oct 09, 2025
  
06 Oct, 2025 1 commit
- modify engine_id str · 4bb8c6af
  yangshj1 authored Oct 06, 2025
  
05 Oct, 2025 2 commits
- use opt rope · 8c0143db
  zhuwenwen authored Oct 05, 2025
  
- fix rope acc error · b7b4e639
  zhuwenwen authored Oct 05, 2025
  
30 Sep, 2025 4 commits
- fix Z100L&K100 precision error · 9e5ec696
  zhuwenwen authored Sep 30, 2025
  
- fix some code · e0ba23b5
  王敏 authored Sep 30, 2025
  
- fix some code · 46e26bf1
  王敏 authored Sep 30, 2025
  
- [feat] optimize mori compute logic, support cudagraph, truncate the fused_moe input to bs*ep_size, do not tp-split shared experts, remove the final allreduce · d2e57a90
  王敏 authored Sep 30, 2025
  
29 Sep, 2025 4 commits
- update moe_sum interface · 00bbf0bb
  zhuwenwen authored Sep 29, 2025
  
- fix pp error and update min_block_size · 484fcfca
  zhuwenwen authored Sep 29, 2025
  
- when v1 is set to block_size 16, switch to triton implementation · e470f6a1
  zhuwenwen authored Sep 29, 2025
  
- remove redundant shared_output · 50bed026
  zhuwenwen authored Sep 29, 2025
  
28 Sep, 2025 1 commit
- fix bug where cp cannot be enabled with kvcache-fp8-e5m2 · b2d14ba3
  yangql authored Sep 28, 2025
  
26 Sep, 2025 7 commits
- update cat kernel · e0fdf4e8
  zhuwenwen authored Sep 26, 2025
  
- off rotary_embedding_deepseek_fuse · dc675592
  zhuwenwen authored Sep 26, 2025
  
- feat: moe_align_block_size: update the lightop interface, add ep support · 0dcc2e60
  jujl1 authored Sep 26, 2025
  
- update op · 92c6171e
  zhuwenwen authored Sep 26, 2025
  
- add PD P do pp · ecd5815f
  xiabo authored Sep 26, 2025
  
- VLLM_USE_LIGHTOP and VLLM_USE_OPT_CAT · c1ece8c6
  zhuwenwen authored Sep 26, 2025
```
add shared_output and routed_scaling_factor of CompressedTensorsW8A8Int8MoEMethod
```
- [fix] pp+mtp bs 1 correctness · 22d6f9af
  lizhigong authored Sep 26, 2025
  
25 Sep, 2025 4 commits
- add shared_output and routed_scaling_factor of w4a8 · b33ff2d6
  zhuwenwen authored Sep 25, 2025
  
- DeepSeek-R1-Channel-INT8: call the rmsquant fusion · a5043e83
  wujl5 authored Sep 25, 2025
  
- fix connector metadata · 1586d41f
  yangshj1 authored Sep 25, 2025
  
- [kernels] add fused_rms_norm_contiguous and rotary_embedding_deepseek_fuse · 49810c37
  zhuwenwen authored Sep 25, 2025
```
[kernels] update moe_align_block_size and moe_sum interface
```
24 Sep, 2025 5 commits
- update VLLM_USE_OPT_CAT · 4d97c5fc
  zhuwenwen authored Sep 24, 2025
  
- [kernel] add lightop's moe_sum(mul+add) fusion operator for deepseek · 8d2cac26
  zhuwenwen authored Sep 24, 2025
```
[FIX] fix bug where mtp and VLLM_USE_TRITON_CAT cannot be enabled together
```
- fix bug where mtp and VLLM_USE_TRITON_CAT cannot be enabled together · cc2dca96
  SAC_fanth authored Sep 24, 2025
  
- fix scheduler issue in pp + mtp · 2cb921da
  lizhigong authored Sep 24, 2025
  
- add env · c6b7a44b
  yangshj1 authored Sep 24, 2025
  