- 11 Jul, 2025 1 commit
-
-
gushiqiao authored
-
- 02 Jul, 2025 1 commit
-
-
gushiqiao authored
Enable 720p model inference on low-spec GPUs/CPUs and accelerate T5/CLIP quantized models with vLLM operators
-
- 09 Jun, 2025 1 commit
-
-
gushiqiao authored
* reconstruct quantization and fix memory leak bug.
* Support lazy load inference.
* reconstruct quantization
* Fix hunyuan bugs
* deleted tmp file
---------
Co-authored-by: root <root@pt-c0b333b3a1834e81a0d4d5f412c6ffa1-worker-0.pt-c0b333b3a1834e81a0d4d5f412c6ffa1.ns-devsft-3460edd0.svc.cluster.local>
Co-authored-by: gushiqiao <gushqiaio@sensetime.com>
Co-authored-by: gushiqiao <gushiqiao@sensetime.com>
-
- 22 May, 2025 2 commits
-
-
Xinchi Huang authored
* async offload & context4debug
* offload ratio
* Merge branch 'main' into xinchi/fix_offload
* adding offload ratio
* pre-commit
---------
Co-authored-by: “de1star” <“843414674@qq.com”>
-
root authored
-
- 14 May, 2025 1 commit
-
-
Xinchi Huang authored
* fix offload
* fix offload
---------
Co-authored-by: “de1star” <“843414674@qq.com”>
-
- 29 Apr, 2025 1 commit
-
-
root authored
-
- 20 Apr, 2025 2 commits
-
-
helloyongyang authored
-
helloyongyang authored
-
- 08 Apr, 2025 4 commits
-
-
zhiwei.dong authored
-
Dongz authored
* [minor]: optimize Dockerfile for fewer layers
* [feature]: add pre-commit lint, update README with contribution guidance
* [minor]: fix run shell privileges
* [auto]: first lint without rule F, fix rule E
* [minor]: fix Dockerfile error
-
TorynCurtis authored
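* Modified main.py, the t5 model, the wan model, the three weights files and the three infer files, and registered a new operator among the conv3d operators in common
* Modified the Conv3dWeightForceBF16 operator and updated its use in wan's pre_weights
* Fixed a bug in the imports
* Fixed the bug where WanPreWeights and WanTransformerWeights had no self.config
* Fixed the bug where WanPreWeights and WanTransformerWeights had no self.config
* Fixed a config bug; currently, when cpu_offload is used, the vae stage still has a bug where tensors are not on the same device
* Fixed the device-migration bug in the vae stage
* Fixed the bug where scale still had to be reassigned after mean and inv_std were migrated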
-
helloyongyang authored
-