- 05 Aug, 2025 1 commit
-
-
PengGao authored
-
- 01 Aug, 2025 2 commits
-
-
gushiqiao authored
-
helloyongyang authored
-
- 31 Jul, 2025 1 commit
-
-
helloyongyang authored
-
- 30 Jul, 2025 1 commit
-
-
gushiqiao authored
-
- 28 Jul, 2025 2 commits
-
-
gushiqiao authored
-
helloyongyang authored
-
- 21 Jul, 2025 3 commits
- 17 Jul, 2025 1 commit
-
-
helloyongyang authored
-
- 16 Jul, 2025 1 commit
-
-
gushiqiao authored
-
- 12 Jul, 2025 2 commits
- 11 Jul, 2025 1 commit
-
-
gushiqiao authored
-
- 10 Jul, 2025 1 commit
-
-
gushiqiao authored
-
- 09 Jul, 2025 1 commit
-
-
gushiqiao authored
-
- 08 Jul, 2025 2 commits
- 03 Jul, 2025 1 commit
-
-
wangshankun authored
-
- 02 Jul, 2025 1 commit
-
-
gushiqiao authored
Enable 720p model inference on low-spec GPUs/CPUs and accelerate T5/CLIP quantized models with vLLM operators
-
- 29 Jun, 2025 2 commits
-
-
Yang Yong(雍洋) authored
Co-authored-by: Linboyan-trc <1584340372@qq.com>
-
helloyongyang authored
-
- 26 Jun, 2025 1 commit
-
-
gushiqiao authored
-
- 23 Jun, 2025 1 commit
-
-
gushiqiao authored
-
- 16 Jun, 2025 3 commits
-
-
gushiqiao authored
-
gushiqiao authored
-
Zhuguanyu Wu authored
* add step & cfg distillation wan model
-
- 12 Jun, 2025 1 commit
-
-
Zhuguanyu Wu authored
* add step & cfg distillation wan model
* bug fixed
-
- 11 Jun, 2025 1 commit
-
-
gushiqiao authored
-
- 10 Jun, 2025 1 commit
-
-
gushiqiao authored
-
- 09 Jun, 2025 2 commits
-
-
gushiqiao authored
-
gushiqiao authored
* reconstruct quantization and fix memory leak bug.
* Support lazy load inference.
* reconstruct quantization
* Fix hunyuan bugs
* deleted tmp file
---------
Co-authored-by: root <root@pt-c0b333b3a1834e81a0d4d5f412c6ffa1-worker-0.pt-c0b333b3a1834e81a0d4d5f412c6ffa1.ns-devsft-3460edd0.svc.cluster.local>
Co-authored-by: gushiqiao <gushqiaio@sensetime.com>
Co-authored-by: gushiqiao <gushiqiao@sensetime.com>
-
- 30 May, 2025 1 commit
-
-
Zhuguanyu Wu authored
* split dit server from default runner
* split dit server from default runner
* update loading functions
* simplify loader functions and runner functions
* simplify code && split dit service
* simplify code && split dit service
* support split server for cogvideox
* clear code.
-
- 28 May, 2025 1 commit
-
-
Xinchi Huang authored
* fix offload extra latency in the first step by pre-allocating pinned memory
* pre-commit
---------
Co-authored-by: “de1star” <“843414674@qq.com”>
-
- 27 May, 2025 1 commit
-
-
Watebear authored
-
- 23 May, 2025 2 commits
-
-
Zhuguanyu Wu authored
* support prompt enhancer server
* bugs fixed
* finished prompt enhancer service
-
Zhuguanyu Wu authored
* add load_transformer methods for split server
* add service utils
* [feature] support split servers
-
- 22 May, 2025 2 commits
-
-
Xinchi Huang authored
* async offload & context4debug
* offload ratio
* Merge branch 'main' into xinchi/fix_offload
* adding offload ratio
* pre-commit
---------
Co-authored-by: “de1star” <“843414674@qq.com”>
-
root authored
-