- 26 Feb, 2025 6 commits
-
-
Atream authored
-
Chen Hongtao authored
fix numa cpu distribution
-
ZiWei Yuan authored
fix dockerfile in devcontainer and fix expert torch
-
liam authored
-
liam authored
-
wkgcass authored
The NUMA node location is calculated from the total number of worker threads, so we should always use the actual number of threads instead of clamping it with a min() op.
-
- 25 Feb, 2025 26 commits
-
-
Azure authored
📝 update benchmark.md
-
liam authored
-
Azure authored
[update] Update doc.
-
liam authored
-
Azure authored
-
ZiWei Yuan authored
⚡ release v0.2.2rc1
-
liam authored
-
Azure authored
[release] Release 0.2.2rc.
-
Azure authored
[update] Update readme.
-
Azure authored
Update README.md
-
Atream authored
-
Atream authored
-
Azure authored
-
Atream authored
-
Atream authored
-
Azure authored
-
ZiWei Yuan authored
📝 add benchmark.md
-
liam authored
-
ZiWei Yuan authored
⚡ update git ignore, add docker dev container
-
liam authored
-
Azure authored
-
Azure authored
-
Atream authored
Feat absorb for long prefill
-
Atream authored
-
Azure authored
[feat] Support fp8 linear kernel;
-
Azure authored
-
- 24 Feb, 2025 8 commits
-
-
Azure authored
-
Atream authored
musa: support bf16
-
Atream authored
Ensure backward compatibility with PyTorch 2.2
-
Xiaodong Ye authored
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
-
Azure authored
-
Azure authored
-
Atream authored
-
Atream authored
fix KExpertsMarlin on GPU without CUDA Graph
-