OpenDAS
vllm_cscc
Commits
a1239b53e809ef2deb3df6818d7dca00669eec0c
vllm_cscc
vllm
07 Aug, 2025
1 commit
[feat] Support full_cuda_graph for MTP models
· a1239b53
王敏
authored
Aug 07, 2025
a1239b53
06 Aug, 2025
4 commits
[feat] Support full_cuda_graph for MTP models
· 9dd945c1
王敏
authored
Aug 06, 2025
9dd945c1
Revert "Merge remote-tracking branch 'origin/v0.9.2-dev-wm' into v0.9.2-dev"
· 0c1cd0f5
zhuwenwen
authored
Aug 06, 2025
This reverts merge request !169
0c1cd0f5
update lmslim import
· 0d4ff65d
zhuwenwen
authored
Aug 06, 2025
0d4ff65d
Revert "update lmslim import"
· 3ae8665d
zhuwenwen
authored
Aug 06, 2025
This reverts commit 1d575d52.
3ae8665d
05 Aug, 2025
7 commits
[feat] Optimize return types of MTP-related functions
· 7e71c143
王敏
authored
Aug 05, 2025
7e71c143
merge and debug tbo on 0.9.2
· 3f8b2afe
lizhigong
authored
Aug 05, 2025
3f8b2afe
[feat] 1. Support full_cuda_graph for MTP models; 2. Optimize MTP rejection sampling
· 8e0ae19d
王敏
authored
Aug 05, 2025
8e0ae19d
update lmslim import
· 1d575d52
zhuwenwen
authored
Aug 05, 2025
1d575d52
add glm4.5 k100-ai config
· d160ae26
zhuwenwen
authored
Aug 05, 2025
d160ae26
add step3-vl k100-ai config
· 3e1ed13b
zhuwenwen
authored
Aug 05, 2025
3e1ed13b
when using VLLM_FLASH_ATTN_V1, set block_size to 64
· 80a682c7
zhuwenwen
authored
Aug 05, 2025
80a682c7
04 Aug, 2025
4 commits
add step3-vl config
· 8e1c204b
zhuwenwen
authored
Aug 04, 2025
8e1c204b
add step3-vl tuning
· 2d364c4e
zhuwenwen
authored
Aug 04, 2025
2d364c4e
add tbo on v1 engine
· 20e75ed6
lizhigong
authored
Aug 02, 2025
20e75ed6
update conv layout
· eba84521
zhuwenwen
authored
Aug 04, 2025
eba84521
02 Aug, 2025
1 commit
add glm4.5 config
· 94b06a94
zhuwenwen
authored
Aug 02, 2025
94b06a94
01 Aug, 2025
9 commits
set default block_size to 16
· 80045bf7
zhuwenwen
authored
Aug 01, 2025
80045bf7
update N to N1
· 8c7075d1
zhuwenwen
authored
Aug 01, 2025
8c7075d1
Add changes for w4a8 support
· 2767fc34
gaoqiong
authored
Aug 01, 2025
2767fc34
back to default conv layout
· 5f18e876
zhuwenwen
authored
Aug 01, 2025
5f18e876
update rocm.py
· 0480314d
zhuwenwen
authored
Aug 01, 2025
0480314d
[Model] Update step3 vl
· 66540380
zhuwenwen
authored
Aug 01, 2025
66540380
[Model] Add step3 vl
· 53ffe40e
zhuwenwen
authored
Aug 01, 2025
53ffe40e
[fix] Prevent the cudagraph adaptation in MLA from affecting the non-parallel decoding logic
· 0e5d399a
王敏
authored
Aug 01, 2025
0e5d399a
update HIP_VISIBLE_DEVICES of rocm
· d0cc5577
zhuwenwen
authored
Aug 01, 2025
d0cc5577
31 Jul, 2025
7 commits
[feat] Support MTP cudagraph on the v1 engine
· fe393be8
王敏
authored
Jul 31, 2025
fe393be8
update mlp
· 741dbbbb
zhuwenwen
authored
Jul 31, 2025
741dbbbb
Remove redundant w4a8 parameters
· 961cce86
gaoqiong
authored
Jul 31, 2025
961cce86
Add w4a8-related changes to the fused moe files
· f5a7f12c
gaoqiong
authored
Jul 31, 2025
f5a7f12c
update common.py
· 3ae77d07
zhuwenwen
authored
Jul 31, 2025
3ae77d07
update common.py
· ef33478d
zhuwenwen
authored
Jul 31, 2025
ef33478d
update arg_utils.py
· ffaad3df
zhuwenwen
authored
Jul 31, 2025
ffaad3df
30 Jul, 2025
2 commits
Merge v0.9.2-dev-disagg into v0.9.2-dev
· af27c177
xuxz
authored
Jul 30, 2025
af27c177
Fix v1 cudagraph issues and integrate the fuse moe marlin_V3 version
· 9820d063
yangql
authored
Jul 30, 2025
9820d063
29 Jul, 2025
3 commits
[Misc] Clean up Aimv2 config registration in Ovis config
· be0549c4
zhuwenwen
authored
Jul 29, 2025
be0549c4
fix: Fix incorrect path when reading the W8A8 config; delete int8_utils.py
· 7e5fb6fe
jujl1
authored
Jul 29, 2025
7e5fb6fe
GLM-4.5 Model Support
· 751c492c
zhuwenwen
authored
Jul 29, 2025
751c492c
28 Jul, 2025
2 commits
fix: Fix W8A8INT8 config reading issue
· c6187ade
jujl1
authored
Jul 28, 2025
c6187ade
Update W4A8 and W8A8 quantization interfaces for 0.9.2
· 7017f30c
gaoqiong
authored
Jul 28, 2025
7017f30c