Merge pull request #1085 from kvcache-ai/qiyuxinlin-patch-5

Update balance-serve.md

94476ce5 · wang jiahao · GitHub · 41ce92bb · 23ceb1c0 · 94476ce5
Commit 94476ce5 (unverified), authored Apr 08, 2025 by wang jiahao, committed by GitHub Apr 08, 2025

Showing with 1 addition and 1 deletion

doc/en/balance-serve.md +1 -1

--- a/doc/en/balance-serve.md
+++ b/doc/en/balance-serve.md
@@ -133,7 +133,7 @@ It features the following arguments:
  corresponding to 32768 tokens, and the space occupied will be released after the requests are completed.
 - `--backend_type`: `balance_serve` is a multi-concurrency backend engine introduced in version v0.2.4. The original single-concurrency engine is `ktransformers`.
 - `--model_path`: Path to safetensor config path (only config required, not model safetensors).  
-  Please note that, since `ver 0.2.4`, the last segment of `${model_path}` directory name **MUST** be one of the model names defined in `ktransformers/configs/model_configs.json`.
+  Please note that, since `ver 0.2.4`, the last segment of `${model_path}` directory name **MUST** be a local directory that contains the model's configuration files. Hugging Face links (e.g., deepseek-ai/DeepSeek-R1) are not supported at the moment.
 - `--force_think`: Force responding the reasoning tag of `DeepSeek R1`.

 The relationship between `max_batch_size`, `cache_lens`, and `max_new_tokens` should satisfy: