- 01 Mar, 2025 5 commits
- 28 Feb, 2025 11 commits
-
ZiWei Yuan authored
fix cache_lens bug in server and rm test prompt.txt
-
liam authored
-
Atream authored
Delete duplicate code
-
liam authored
-
ZiWei Yuan authored
⚡ fox docker build
-
liam authored
-
Azure authored
[fix] Fix template name
-
Azure authored
-
Azure authored
[UPDATE] Update ZH/EN issue template
-
Azure authored
-
- 27 Feb, 2025 17 commits
-
Shuaiyi authored
-
wang jiahao authored
fix temperature
-
qiyuxinlin authored
-
Atream authored
use generation config from json file in official repo
-
Atream authored
-
wang jiahao authored
Allow temperature and top_p from /v1/chat/completions
-
lazymio authored
-
wang jiahao authored
-
Azure authored
Update issue templates
-
Azure authored
-
Atream authored
Fix missing macro definition for KTRANSFORMERS_USE_CUDA and <chrono> includes on MSVC
-
Atream authored
Fix RuntimeError on Windows caused by integer overflow in np.prod
-
Atream authored
fix: fix SSE formatting
-
Atream authored
feat: basic api key support
-
Atream authored
Feat: Clear cache during weight loading to prevent OOM on GPUs with <=8GB VRAM
-
Atream authored
-
wang jiahao authored
feat:implementation of chat routing for Ollama
-
- 26 Feb, 2025 7 commits