- 25 Feb, 2025 7 commits
- 24 Feb, 2025 9 commits
-
-
Azure authored
-
Atream authored
musa: support bf16
-
Atream authored
Ensure backward compatibility with PyTorch 2.2
-
Xiaodong Ye authored
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
-
Azure authored
-
Azure authored
-
Atream authored
-
Atream authored
fix KExpertsMarlin on GPU without CUDA Graph
-
Atream authored
-
- 23 Feb, 2025 8 commits
-
-
Atream authored
support moonlight, use ktransformers/optimize/optimize_rules/Moonlight-16B-A3B.yaml
-
Atream authored
-
Atream authored
-
DDong Jianwei authored
-
Atream authored
-
Atream authored
fix bf16 load, TODO: refactor cpu dequant
-
Atream authored
-
Xiaodong Ye authored
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
-
- 22 Feb, 2025 8 commits
-
-
Azure authored
Add fp8 linear kernel; add empty cache to fit in 16G VRAM; by 'wkGCaSS - Zhihu https://zhuanlan.zhihu.com/p/25491611225'
-
Atream authored
optimize CMake multi-core parallelism
-
Atream authored
Feat more context
-
Atream authored
-
Atream authored
Fix the link address in the doc install.md
-
Atream authored
Adjust the installation link to the correct section of docs
-
Atream authored
-
Atream authored
use marlin for lm_head; lm_head only calculates the last token for prefill; extend context window to 19K for DeepSeek-V3/R1 within 24GB VRAM
-
- 21 Feb, 2025 3 commits
-
-
_ authored
-
JiamingMai authored
-
Atream authored
-
- 20 Feb, 2025 5 commits
-
-
Azure authored
feat: Support Moore Threads GPU
-
ZiWei Yuan authored
Docker dev
-
liam authored
-
-
ZiWei Yuan authored
feat: add GitHub Actions workflow for building Docker image
-