Commits · bb1cadfff39381ad28e0f341bb70c129cc788b54 · OpenDAS / ktransformers

17 Feb, 2025 3 commits
- Merge branch 'fix_precision_MLA' of... · bb1cadff
  ceerrep authored Feb 17, 2025
```
Merge branch 'fix_precision_MLA' of https://github.com/kvcache-ai/ktransformers into server-prefix-cache
```
- fix precision bug imported by position_ids in 0.2.0 · 038bc308
  Atream authored Feb 17, 2025
  
- fix: server: drop <think> tag in chat template · cd9f7f8f
  ceerrep authored Feb 17, 2025
  
16 Feb, 2025 10 commits
- feat: use model name in openai endpoint · ca2090d8
  ceerrep authored Feb 17, 2025
  
- fix: use flash_attn for faster prefill · 5ac26608
  ceerrep authored Feb 17, 2025
  
- feat: add prefix cache for server · bb0ccc7b
  ceerrep authored Feb 17, 2025
  
- Merge pull request #355 from kvcache-ai/qiyuxinlin-patch-1 · c515cc49
  wang jiahao authored Feb 16, 2025
```
Update install.md
```
- Merge pull request #357 from kvcache-ai/qiyuxinlin-patch-2 · 24de607b
  wang jiahao authored Feb 16, 2025
```
Update SUMMARY.md
```
- support bf16 read · b8452462
  Atream authored Feb 16, 2025
  
- Update SUMMARY.md · 76554dd6
  wang jiahao authored Feb 16, 2025
  
- Update install.md · 56d19d61
  wang jiahao authored Feb 16, 2025
  
- Merge pull request #354 from Azure-Tang/fix-mockTritonMLA · 9f9c3738
  Azure authored Feb 16, 2025
```
[fix] Mock triton mla due to precision issue
```
- Mock triton mla due to precision issue · ff6b265e
  Azure authored Feb 16, 2025
  
15 Feb, 2025 20 commits
- Merge pull request #333 from kvcache-ai/feat_experts_gpu · c5f036e8
  Atream authored Feb 15, 2025
```
toy support for experts on GPU, no CUDA Graph
```
- Update FAQ.md · 8ed8eb2a
  Atream authored Feb 15, 2025
  
- toy support for experts on GPU, no CUDA Graph · c189d55b
  Atream authored Feb 15, 2025
  
- Merge pull request #330 from hrz6976/fix-nonetype · ae8da019
  wang jiahao authored Feb 15, 2025
```
thanks..., I was about to submit and found that you had already modified it. Thank you for your contribution
```
- Merge pull request #290 from ZhangShuaiyi/dev/check_gguf_path · 56382aa8
  Azure authored Feb 15, 2025
```
ensure that gguf_path argument is a directory.
```
- Merge pull request #324 from BiFangKNT/patch-2 · c8bf2501
  Azure authored Feb 15, 2025
```
Update DeepseekR1_V3_tutorial.md
```
- Fix NoneType object has no attribute zero_ · 4516282c
  12f23eddde authored Feb 15, 2025
  
- Merge pull request #329 from kvcache-ai/fix_doc · 3c6035aa
  ZiWei Yuan authored Feb 15, 2025
```
📝 fix typo
```
- 📝 fix typo · 69b00753
  liam authored Feb 15, 2025
  
- Update DeepseekR1_V3_tutorial.md · bd693b69
  彼方 authored Feb 15, 2025
  
- Merge pull request #317 from kvcache-ai/develop-0.2.1 · 65d73ea3
  UnicornChan authored Feb 15, 2025
```
[feature] update docker image and entrypoint
```
- Merge pull request #316 from KMSorSMS/main · 718a71b3
  ZiWei Yuan authored Feb 15, 2025
```
📝 update V0.2.1 Doc
```
- 📝 update V0.2.1 Doc · 13382f88
  liam authored Feb 15, 2025
  
- [feature] update docker image and entrypoint · 0e4b7a39
  chenxl authored Feb 15, 2025
  
- Merge pull request #315 from kvcache-ai/Atream-add-adapted · f9f9f746
  Atream authored Feb 15, 2025
```
Atream add adapted
```
- Update attention.py · 92399283
  Atream authored Feb 15, 2025
  
- Update triton_attention.py · d90749d3
  Atream authored Feb 15, 2025
  
- get dirname if gguf_path is a file · 22280bf1
  Shuaiyi authored Feb 14, 2025
  
- Merge pull request #307 from Azure-Tang/main · 1548c992
  Azure authored Feb 15, 2025
```
[update] Reorganize documentation/README
```
- update zh readme · 227e81b0
  Azure authored Feb 15, 2025
  
14 Feb, 2025 7 commits
- * Reorganize documentation/README · ef89b152
  Azure authored Feb 14, 2025
```
    * Consolidate the installation section, as it's currently too cluttered
    * Move the Multi-GPU section to the top-level structure
    * Add a **detailed** tutorial on registering extra GPU memory with Marlin
```
- Merge pull request #306 from kvcache-ai/revert-305-main · b0b90270
  Azure authored Feb 15, 2025
```
Revert "[update] Reorganize documentation/README"
```
- Revert "[update] Reorganize documentation/README" · 4f4ed364
  Azure authored Feb 15, 2025
  
- [update] Reorganize documentation/README · 19d4a50b
  Azure authored Feb 15, 2025
```
[update] Reorganize documentation/README
```
- fix typo and detail · 483182fc
  Azure authored Feb 14, 2025
  
- Reorganize documentation/README · 823b25ee
  Azure authored Feb 14, 2025
  
- Merge pull request #301 from kvcache-ai/fix-cuda-graph-bug · cc8d627e
  Atream authored Feb 14, 2025
```
warm_up before capture
```