- 01 Mar, 2025 2 commits
- 28 Feb, 2025 1 commit
-
-
liam authored
-
- 27 Feb, 2025 4 commits
-
-
Shuaiyi authored
-
qiyuxinlin authored
-
Atream authored
-
lazymio authored
-
- 26 Feb, 2025 5 commits
- 25 Feb, 2025 9 commits
- 24 Feb, 2025 11 commits
-
-
Azure authored
-
Xiaodong Ye authored
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
-
lazymio authored
-
lazymio authored
-
lazymio authored
-
lazymio authored
-
lazymio authored
-
lazymio authored
-
Azure authored
-
Atream authored
-
Yuhao Tsui authored
This change explicitly clears the CUDA cache during weight loading to mitigate memory fragmentation, which is particularly beneficial for low-VRAM GPUs.
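A minimal sketch of the idea in this commit message, not the code from the commit itself: after each weight is placed, unused cached CUDA blocks are handed back to the driver so that later large allocations are less likely to fail on a fragmented pool. The names `release_cached_memory`, `load_weights`, `place`, and `clear_cache` are hypothetical and chosen for illustration; the only PyTorch calls used are `torch.cuda.is_available()` and `torch.cuda.empty_cache()`, and the sketch falls back to a no-op when PyTorch is not installed.

```python
import gc
from typing import Callable, Dict, Iterable, Tuple

try:
    import torch  # optional: the sketch also runs without PyTorch
except ImportError:
    torch = None


def release_cached_memory() -> None:
    """Collect dead Python objects, then release PyTorch's unused cached CUDA blocks."""
    gc.collect()
    if torch is not None and torch.cuda.is_available():
        # Frees only cached blocks that no live tensor is using.
        torch.cuda.empty_cache()


def load_weights(
    named_weights: Iterable[Tuple[str, object]],
    place: Callable[[str, object], object],
    clear_cache: Callable[[], None] = release_cached_memory,
) -> Dict[str, object]:
    """Place weights one at a time, clearing the cache after each one.

    `place` is a hypothetical callback that moves or converts a single
    weight (for example, copying it to the GPU) and returns the result.
    """
    loaded: Dict[str, object] = {}
    for name, weight in named_weights:
        loaded[name] = place(name, weight)
        clear_cache()  # hand unused cached memory back before the next allocation
    return loaded
```

Clearing the cache after every weight trades some loading speed for a lower peak of reserved memory; clearing less often (for example, once per layer) is a reasonable variation when load time matters more.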
-
- 23 Feb, 2025 7 commits
-
-
akemimadoka authored
-
Atream authored
-
Atream authored
-
DDong Jianwei authored
-
Atream authored
-
Atream authored
-
Xiaodong Ye authored
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
-
- 22 Feb, 2025 1 commit
-
-
Azure authored
Add fp8 linear kernel; add empty cache to fit in 16G VRAM. By 'wkGCaSS - Zhihu https://zhuanlan.zhihu.com/p/25491611225'
-