- 28 Apr, 2025 3 commits
- 25 Apr, 2025 3 commits
  - chenht2022 authored
  - wang jiahao authored: Fix loading the default max_new_tokens
  - qiyuxinlin authored
- 24 Apr, 2025 2 commits
- 23 Apr, 2025 3 commits
  - wang jiahao authored: Add check-para
  - Alisehen authored
  - Alisehen authored
- 22 Apr, 2025 9 commits
  - wang jiahao authored: Change test
  - qiyuxinlin authored
  - Alisehen authored
  - wang jiahao authored: Make killing serve also kill sched and engine
  - qiyuxinlin authored
  - wang jiahao authored: Update speed test
  - qiyuxinlin authored
  - wang jiahao authored: Update param
  - qiyuxinlin authored
- 21 Apr, 2025 1 commit
  - qiyuxinlin authored
- 19 Apr, 2025 2 commits
  - wang jiahao authored: Update function call
  - Creeper-MZ authored: Optimize prompts to resolve some Deepseek r1 compatibility issues; fix non-streaming mode
- 18 Apr, 2025 9 commits
  - Atream authored: Fix cmake config error
  - wang jiahao authored: Move KV cache creation to balance_serve
  - qiyuxinlin authored
  - mykg authored: Signed-off-by: onepick <jiajuku12@163.com>
  - Atream authored: Enh: Make Ollama perf data more accurate, consistent with OpenAI's implementation
  - Atream authored: Remove hard-coded max_length
  - Atream authored
  - Jianwei Dong authored: Update llama4 tutorial
  - djw authored
- 17 Apr, 2025 8 commits
  - Creeper-MZ authored
  - Yuhao Tsui authored
  - Creeper-MZ authored
  - ZiWei Yuan authored: Fix some build errors for ROCm
  - mykg authored: Signed-off-by: onepick <jiajuku12@163.com>
  - Yuhao Tsui authored: Change the performance data calculation module from estimation to retrieving values from `raw_usage`
  - wang jiahao authored: Feat: Support non-streaming chat in the Ollama backend
  - wang jiahao authored: Fix the error caused by the client not passing temperature and top_p (leaving them empty)