- 26 Feb, 2025 9 commits
-
-
Atream authored
Update DeepseekR1_V3_tutorial.md
-
Atream authored
-
Atream authored
Update DeepSeek-V3-Chat-multi-gpu-marlin.yaml
-
Atream authored
-
Chen Hongtao authored
fix numa cpu distribution
-
ZiWei Yuan authored
fix dockerfile in devcontainer and fix expert torch
-
liam authored
-
liam authored
-
wkgcass authored
The numa node location would be calculated based on the total number of worker threads. So we should always use the actual number of threads instead of using a min() op.
-
- 25 Feb, 2025 26 commits
-
-
Azure authored
📝 update benchmark.md
-
liam authored
-
Azure authored
[update] Update doc.
-
liam authored
-
Azure authored
-
ZiWei Yuan authored
⚡ release v0.2.2rc1
-
liam authored
-
Azure authored
[release] Release 0.2.2rc.
-
Azure authored
[update] Update readme.
-
Azure authored
Update README.md
-
Atream authored
-
Atream authored
-
Azure authored
-
Atream authored
-
Atream authored
-
Azure authored
-
ZiWei Yuan authored
📝 add benchmark.md
-
liam authored
-
ZiWei Yuan authored
⚡ update git ignore add docker dev container
-
liam authored
-
Azure authored
-
Azure authored
-
Atream authored
Feat absorb for long prefill
-
Atream authored
-
Azure authored
[feat] Support fp8 linear kernel;
-
Azure authored
-
- 24 Feb, 2025 5 commits
-
-
Azure authored
-
Atream authored
musa: support bf16
-
Atream authored
Ensure backward compatibility with PyTorch 2.2
-
Xiaodong Ye authored
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
-
Azure authored
-