- 17 Feb, 2025 3 commits
-
-
ceerrep authored
Merge branch 'fix_precision_MLA' of https://github.com/kvcache-ai/ktransformers into server-prefix-cache
-
Atream authored
-
ceerrep authored
-
- 16 Feb, 2025 10 commits
-
-
ceerrep authored
-
ceerrep authored
-
ceerrep authored
-
wang jiahao authored
Update install.md
-
wang jiahao authored
Update SUMMARY.md
-
Atream authored
-
wang jiahao authored
-
wang jiahao authored
-
Azure authored
[fix] Mock triton mla due to precision issue
-
Azure authored
-
- 15 Feb, 2025 20 commits
-
-
Atream authored
toy support for experts on GPU, no CUDA Graph
-
Atream authored
-
Atream authored
-
wang jiahao authored
thanks..., I was about to submit and found that you had already modified it. Thank you for your contribution
-
Azure authored
ensure that gguf_path argument is a directory.
-
Azure authored
Update DeepseekR1_V3_tutorial.md
-
12f23eddde authored
-
ZiWei Yuan authored
📝 fix typo
-
liam authored
-
彼方 authored
-
UnicornChan authored
[feature] update docker image and entrypoint
-
ZiWei Yuan authored
📝 update V0.2.1 Doc
-
liam authored
-
chenxl authored
-
Atream authored
Atream add adapted
-
Atream authored
-
Atream authored
-
Shuaiyi authored
-
Azure authored
[update] Reorganize documentation/README
-
Azure authored
-
- 14 Feb, 2025 7 commits
-
-
Azure authored
* Consolidate the installation section, as it's currently too cluttered
* Move the Multi-GPU section to the top-level structure
* Add a **detailed** tutorial on registering extra GPU memory with Marlin
-
Azure authored
Revert "[update] Reorganize documentation/README"
-
Azure authored
-
Azure authored
[update] Reorganize documentation/README
-
Azure authored
-
Azure authored
-
Atream authored
warm_up before capture
-