Merge pull request #1031 from wangkuigang-yewu-cmss/doc-update

Docs update: naming requirement for model_path, and add force_think to the example

47a89ae7 · wang jiahao · GitHub · 72e8e16f · c5905832 · 47a89ae7
Unverified Commit 47a89ae7 authored Apr 03, 2025 by wang jiahao Committed by GitHub Apr 03, 2025

Showing with 5 additions and 1 deletion

doc/en/balance-serve.md +5 -1

--- a/doc/en/balance-serve.md
+++ b/doc/en/balance-serve.md
@@ -120,7 +120,8 @@ python ktransformers/server/main.py \
  --cache_lens 32768 \
  --chunk_size 256 \
  --max_batch_size 4 \
-  --backend_type balance_serve
+  --backend_type balance_serve \
+  --force_think # useful for R1
 ```
 It features the following arguments:
@@ -131,6 +132,9 @@ It features the following arguments:
  corresponding to 32768 tokens, and the space occupied will be released after the requests are completed.
 - `--max_batch_size`: Maximum number of requests (prefill + decode) processed in a single run by the engine. (Supported only by `balance_serve`)
 - `--backend_type`: `balance_serve` is a multi-concurrency backend engine introduced in version v0.2.4. The original single-concurrency engine is `ktransformers`.
+- `--model_path`: Path to the directory containing the model's config files (only the config is required, not the model safetensors).  
+  Note that since `ver 0.2.4`, the last segment of the `${model_path}` directory name **MUST** be one of the model names defined in `ktransformers/configs/model_configs.json`.
+- `--force_think`: Force the response to include the reasoning tag of `DeepSeek R1`.
 ### 2. access server
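
Review note: the new `--model_path` rule (the last path segment must match a known model name) is easy to get wrong, so a quick pre-flight check may help. The sketch below is not part of this commit or of ktransformers. The helper names `last_segment` and `check_model_path` and the names in `EXAMPLE_NAMES` are hypothetical, and the layout of `ktransformers/configs/model_configs.json` is not shown in this diff, so the allowed names are passed in by the caller rather than parsed from that file.

```python
from pathlib import Path
from typing import Iterable


def last_segment(model_path: str) -> str:
    """Return the final directory name of model_path, ignoring a trailing slash."""
    return Path(model_path).name


def check_model_path(model_path: str, known_names: Iterable[str]) -> bool:
    """Return True if the last path segment is one of the known model names."""
    return last_segment(model_path) in set(known_names)


# Hypothetical names for illustration only. The real list is whatever
# ktransformers/configs/model_configs.json defines in your checkout.
EXAMPLE_NAMES = {"DeepSeek-V3", "DeepSeek-R1"}

if __name__ == "__main__":
    for candidate in ("/models/DeepSeek-R1/", "/models/my-r1-copy"):
        ok = check_model_path(candidate, EXAMPLE_NAMES)
        print(f"{candidate}: {'ok' if ok else 'last segment is not a known model name'}")
```

If the check fails, renaming the directory (or pointing `--model_path` at a symlink whose final segment is an accepted name) should satisfy the requirement as the doc states it.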