Commit c5905832 authored by wangkuigang-yewu-cmss

doc upgrade: model_path requirements and reasoning

* add documentation about `--model_path` requirements
* add `--force_think` to the doc (most users run R1 and want it to show its reasoning process)
parent 72e8e16f
@@ -120,7 +120,8 @@ python ktransformers/server/main.py \
--cache_lens 32768 \
--chunk_size 256 \
--max_batch_size 4 \
--backend_type balance_serve \
--force_think # useful for R1
```
It features the following arguments:
@@ -131,6 +132,9 @@ It features the following arguments:
corresponding to 32768 tokens, and the space occupied will be released after the requests are completed.
- `--max_batch_size`: Maximum number of requests (prefill + decode) processed in a single run by the engine. (Supported only by `balance_serve`)
- `--backend_type`: `balance_serve` is a multi-concurrency backend engine introduced in version v0.2.4. The original single-concurrency engine is `ktransformers`.
- `--model_path`: Path to the directory containing the model's config files (only the config files are required, not the model safetensors weights).
Please note that, since v0.2.4, the last segment of the `${model_path}` path (the directory name) **MUST** be one of the model names defined in `ktransformers/configs/model_configs.json`.
- `--force_think`: Forces the response to include the reasoning tag of `DeepSeek R1`, so that the model outputs its reasoning process.
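
The `--model_path` naming rule can be checked before starting the server. The following is a minimal sketch, not part of ktransformers: the helper names are hypothetical, and it assumes that the top-level keys of `ktransformers/configs/model_configs.json` are the model names, which should be verified against the actual file.

```python
import json
from pathlib import Path


def model_dir_name(model_path: str) -> str:
    """Return the last path segment of model_path (a trailing slash is ignored)."""
    return Path(model_path).name


def load_known_model_names(configs_file: str) -> set[str]:
    """Read the model names from a model_configs.json file.

    ASSUMPTION: the top-level keys of the JSON document are the model names.
    """
    with open(configs_file, encoding="utf-8") as f:
        return set(json.load(f))


def check_model_path(model_path: str, known_names: set[str]) -> None:
    """Raise ValueError if the directory name is not a known model name."""
    name = model_dir_name(model_path)
    if name not in known_names:
        raise ValueError(
            f"directory name {name!r} is not one of the known model names: "
            f"{sorted(known_names)}"
        )
```

For example, `check_model_path("/models/some-model/", load_known_model_names("ktransformers/configs/model_configs.json"))` would raise a `ValueError` if `some-model` (a placeholder name) is not listed, in which case renaming the directory to a listed name is the likely fix.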
### 2. access server
......