- 28 Feb, 2025 11 commits
-
-
ZiWei Yuan authored
fix cache_lens bug in server and rm test prompt.txt
-
-
liam authored
-
Atream authored
Delete duplicate code
-
liam authored
-
ZiWei Yuan authored
⚡ fox docker build
-
liam authored
-
Azure authored
[fix] Fix template name
-
Azure authored
-
Azure authored
[UPDATE] Update ZH/EN issue template
-
Azure authored
-
- 27 Feb, 2025 17 commits
-
-
Shuaiyi authored
-
wang jiahao authored
fix temperature
-
qiyuxinlin authored
-
Atream authored
use generation config from json file in official repo
-
Atream authored
-
wang jiahao authored
Allow temperature and top_p from /v1/chat/completions
-
lazymio authored
-
wang jiahao authored
-
Azure authored
Update issue templates
-
Azure authored
-
Atream authored
Fix missing macro definition for KTRANSFORMERS_USE_CUDA and <chrono> includes on MSVC
-
Atream authored
Fix RuntimeError on Windows caused by integer overflow in np.prod
-
Atream authored
fix: fix SSE formatting
-
Atream authored
feat: basic api key support
-
Atream authored
Feat: Clear cache during weight loading to prevent OOM on GPUs with <=8GB VRAM
-
Atream authored
-
wang jiahao authored
feat: implementation of chat routing for Ollama
-
- 26 Feb, 2025 12 commits
-
-
Azure authored
[UPDATE] Update documents.
-
Azure authored
-
Atream authored
Update DeepseekR1_V3_tutorial.md
-
Atream authored
-
Atream authored
Update DeepSeek-V3-Chat-multi-gpu-marlin.yaml
-
Atream authored
-
swu-hyk authored
-
swu-hyk authored
-
Chen Hongtao authored
fix numa cpu distribution
-
ZiWei Yuan authored
fix dockerfile in devcontainer and fix expert torch
-
liam authored
-
liam authored
-