OpenDAS
vllm_cscc
Commits
4b3e2d5edd5354093d03432086752ba0aa7c9f03
vllm_cscc
vllm
15 Oct, 2025
6 commits
update deepseek_v2.py
· 4b3e2d5e
zhuwenwen
authored
Oct 15, 2025
4b3e2d5e
update deepseek_v2.py
· 4ae3fc04
zhuwenwen
authored
Oct 15, 2025
4ae3fc04
Remove DPSK_FP16_QUICK and add shared_output interfaces for awq and blockwiseint8
· 50cb9270
yangql
authored
Oct 15, 2025
50cb9270
set VLLM_USE_OPT_MOE_SUM=1 and VLLM_USE_LIGHTOP_MOE_SUM=1
· 15ef12c1
zhuwenwen
authored
Oct 15, 2025
15ef12c1
Remove DPSK_FP16_QUICK and add shared_output interfaces for awq and blockwiseint8
· 7f459b46
yangql
authored
Oct 15, 2025
7f459b46
support --no-enable-chunked-prefill of v1
· 2c4b2c80
zhuwenwen
authored
Oct 15, 2025
2c4b2c80
14 Oct, 2025
1 commit
remove redundant maybe_calc_kv_scales
· c3b8a0ae
zhuwenwen
authored
Oct 14, 2025
c3b8a0ae
13 Oct, 2025
9 commits
Remove all2all EP-related code
· 0b467604
王敏
authored
Oct 13, 2025
0b467604
update the envs for moe_sum and moe_align
· 766663e6
zhuwenwen
authored
Oct 13, 2025
766663e6
update moe_sum and moe_align
· 1277ff09
zhuwenwen
authored
Oct 13, 2025
1277ff09
optimize the implementation of moe_sum (lightop)
· 7e68a7fe
zhuwenwen
authored
Oct 13, 2025
7e68a7fe
remove --no-enable-chunked-prefill of v1
· bdaaf39d
zhuwenwen
authored
Oct 13, 2025
bdaaf39d
optimize the implementation of moe_sum
· 06a1bee2
zhuwenwen
authored
Oct 13, 2025
06a1bee2
disable cascade_attn
· b7989b07
zhuwenwen
authored
Oct 13, 2025
b7989b07
support dsv32
· 633f8199
zhuwenwen
authored
Oct 13, 2025
633f8199
Add fp8-e5m2
· ed3cdc81
zhuwenwen
authored
Oct 13, 2025
ed3cdc81
12 Oct, 2025
1 commit
[fix] Fix vmfault reported when DP is enabled and EP is not enabled
· f0f159a4
王敏
authored
Oct 12, 2025
f0f159a4
11 Oct, 2025
3 commits
close profile.StartTracer
· a92daffa
maxiao1
authored
Oct 11, 2025
a92daffa
fix pd send async performance
· 1707bebe
maxiao1
authored
Oct 11, 2025
1707bebe
change tbo about cudagraph
· e42a922c
maxiao1
authored
Oct 11, 2025
e42a922c
10 Oct, 2025
6 commits
roll back version.py
· 394648be
lizhigong
authored
Oct 10, 2025
394648be
token split by token adapt to pd separation & p2p can be used async
· 1a135a9d
maxiao1
authored
Oct 10, 2025
1a135a9d
keep same version
· fc5bfc66
maxiao1
authored
Oct 10, 2025
fc5bfc66
pd separation_tbo
· 12291212
maxiao1
authored
Oct 06, 2025
12291212
When VLLM_USE_FLASH_ATTN_PA=0, support the use of default block_size
· 3daae57c
zhuwenwen
authored
Oct 10, 2025
3daae57c
fix deepseek_v2.py
· 2e1d8e9a
zhuwenwen
authored
Oct 10, 2025
2e1d8e9a
09 Oct, 2025
4 commits
Optimize mori EP
· 5dcc5cb8
王敏
authored
Oct 09, 2025
5dcc5cb8
[fix] Optimize mori EP
· 8b791547
王敏
authored
Oct 09, 2025
8b791547
Revert "Merge branch 'v0.9.2-dev-lmcache-pd' into 'v0.9.2-dev'"
· 7b30ecd8
zhuwenwen
authored
Oct 09, 2025
This reverts merge request !219
7b30ecd8
[fix] Fix error raised when EP is not enabled
· 7bf7df7f
王敏
authored
Oct 09, 2025
7bf7df7f
06 Oct, 2025
1 commit
modify engine_id str
· 4bb8c6af
yangshj1
authored
Oct 06, 2025
4bb8c6af
05 Oct, 2025
2 commits
use opt rope
· 8c0143db
zhuwenwen
authored
Oct 05, 2025
8c0143db
fix rope acc error
· b7b4e639
zhuwenwen
authored
Oct 05, 2025
b7b4e639
30 Sep, 2025
4 commits
fix Z100L&K100 precision error
· 9e5ec696
zhuwenwen
authored
Sep 30, 2025
9e5ec696
Fix some code
· e0ba23b5
王敏
authored
Sep 30, 2025
e0ba23b5
Fix some code
· 46e26bf1
王敏
authored
Sep 30, 2025
46e26bf1
[feat] Optimize mori computation logic: support cudagraph, truncate the fused_moe input to bs*ep_size, do not TP-shard the shared experts, and remove the final allreduce
· d2e57a90
王敏
authored
Sep 30, 2025
d2e57a90
29 Sep, 2025
3 commits
update moe_sum interface
· 00bbf0bb
zhuwenwen
authored
Sep 29, 2025
00bbf0bb
fix pp error and update min_block_size
· 484fcfca
zhuwenwen
authored
Sep 29, 2025
484fcfca
when v1 is set to block_size 16, switch to triton implementation
· e470f6a1
zhuwenwen
authored
Sep 29, 2025
e470f6a1