- 02 Dec, 2025 1 commit
-
-
Yang Yong (雍洋) authored
-
- 28 Nov, 2025 1 commit
-
-
Kane authored
1. Fix the accuracy problem in muxi int8-vllm inference results: torch.empty caused NaN values in the inference output. Co-authored-by: root <root@master.cluster.local>
-
- 27 Nov, 2025 2 commits
-
-
Kane authored
1. Fix mlu int8 quantization.
-
Gu Shiqiao authored
-
- 26 Nov, 2025 1 commit
-
-
Kane authored
1. Resolve the earlier code merge conflicts; tests pass. --------- Co-authored-by: Yang Yong (雍洋) <yongyang1030@163.com>
-
- 25 Nov, 2025 1 commit
-
-
yihuiwen authored
Co-authored-by: yihuiwen <yihuiwen@sensetime.com>
-
- 21 Nov, 2025 1 commit
-
-
Yang Yong (雍洋) authored
Thanks to HunyuanVideo Team and ModelTC Team. ---------
Co-authored-by: gushiqiao <975033167@qq.com>
Co-authored-by: gushiqiao <77222802+gushiqiao@users.noreply.github.com>
Co-authored-by: chendingyu <chendingyu1@sensetime.com>
Co-authored-by: XHPlus <xhplus@163.com>
Co-authored-by: wangshankun <wangshankun2011@hotmail.com>
Co-authored-by: STwangyingrui <86730325+STwangyingrui@users.noreply.github.com>
Co-authored-by: root <root@pt-80f094c20fc44a8cad096e5f3dbc962e-worker-0.pt-80f094c20fc44a8cad096e5f3dbc962e.ns-devsft-3460edd0.svc.cluster.local>
-
- 19 Nov, 2025 1 commit
-
-
Kane authored
Feature: 1. Added mlu590 bfloat16 inference (single-GPU and multi-GPU). 2. Added mlu590 int8 inference.
-
- 14 Nov, 2025 1 commit
-
-
gushiqiao authored
-
- 13 Nov, 2025 1 commit
-
-
Watebear authored
-
- 07 Nov, 2025 1 commit
-
-
gushiqiao authored
-
- 03 Nov, 2025 1 commit
- 31 Oct, 2025 1 commit
-
-
gushiqiao authored
Co-authored-by: gushiqiao <975033167@qq.ocm>
-
- 30 Oct, 2025 1 commit
-
-
Bilang ZHANG authored
-
- 17 Oct, 2025 1 commit
-
-
gushiqiao authored
-
- 24 Sep, 2025 1 commit
-
-
gushiqiao authored
-
- 18 Sep, 2025 1 commit
-
-
gushiqiao authored
-
- 16 Sep, 2025 2 commits
- 25 Aug, 2025 1 commit
-
-
gushiqiao authored
-
- 06 Aug, 2025 3 commits
- 05 Aug, 2025 2 commits
-
-
Zhuguanyu Wu authored
* add gguf loading functions
* support wan22 distill
-
PengGao authored
-
- 21 Jul, 2025 3 commits
- 12 Jul, 2025 1 commit
-
-
gushiqiao authored
-
- 16 Jun, 2025 2 commits
- 11 Jun, 2025 1 commit
-
-
gushiqiao authored
-
- 10 Jun, 2025 1 commit
-
-
gushiqiao authored
-
- 09 Jun, 2025 2 commits
-
-
gushiqiao authored
-
gushiqiao authored
* reconstruct quantization and fix memory leak bug.
* Support lazy load inference.
* reconstruct quantization
* Fix hunyuan bugs
* deleted tmp file
---------
Co-authored-by: root <root@pt-c0b333b3a1834e81a0d4d5f412c6ffa1-worker-0.pt-c0b333b3a1834e81a0d4d5f412c6ffa1.ns-devsft-3460edd0.svc.cluster.local>
Co-authored-by: gushiqiao <gushqiaio@sensetime.com>
Co-authored-by: gushiqiao <gushiqiao@sensetime.com>
-
- 28 May, 2025 1 commit
-
-
Xinchi Huang authored
* fix offload extra latency in the first step by pre-allocating pinned memory
* pre-commit
---------
Co-authored-by: “de1star” <“843414674@qq.com”>
-
- 14 May, 2025 1 commit
-
-
Xinchi Huang authored
* fix offload
* fix offload
---------
Co-authored-by: “de1star” <“843414674@qq.com”>
-
- 09 May, 2025 2 commits
-
-
gushiqiao authored
* Support loading advanced PTQ models.
* Update run_wan_i2v_advanced_ptq.sh
---------
Co-authored-by: gushiqiao <gushiqiao@sensetime.com>
Co-authored-by: Yang Yong(雍洋) <yongyang1030@163.com>
-
helloyongyang authored
-
- 07 May, 2025 1 commit
-
-
helloyongyang authored
-