Commit f2a75a66 authored by Ximingwang-09, committed by GitHub

update doc (#7046)


Co-authored-by: ximing.wxm <ximing.wxm@antgroup.com>
parent 6b12d6a8
@@ -185,7 +185,7 @@ Please consult the documentation below and [server_args.py](https://github.com/s ...
 | Arguments | Description | Defaults |
 |----------|-------------|---------|
 | `speculative_draft_model_path` | The draft model path for speculative decoding. | None |
-| `speculative_algorithm` | The algorithm for speculative decoding. Currently [EAGLE](https://arxiv.org/html/2406.16858v1) and [EAGLE3](https://arxiv.org/pdf/2503.01840) are supported. Note that the radix cache, chunked prefill, and overlap scheduler are disabled when using eagle speculative decoding. | None |
+| `speculative_algorithm` | The algorithm for speculative decoding. Currently [EAGLE](https://arxiv.org/html/2406.16858v1) and [EAGLE3](https://arxiv.org/pdf/2503.01840) are supported. Note that the overlap scheduler is disabled when using eagle speculative decoding. | None |
 | `speculative_num_steps` | How many draft passes we run before verifying. | None |
 | `speculative_num_draft_tokens` | The number of tokens proposed in a draft. | None |
 | `speculative_eagle_topk` | The number of top candidates we keep for verification at each step for [Eagle](https://arxiv.org/html/2406.16858v1). | None |
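
For readers who want to see how the five arguments in the table fit together, here is a minimal sketch. It is not the project's own code: `SpeculativeArgs` and `to_cli_flags` are hypothetical names, the numeric values and the draft model path are placeholders, and the `--kebab-case` flag spelling is an assumption that the diff does not confirm. Only the argument names and their `None` defaults come from the table.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class SpeculativeArgs:
    """Hypothetical container mirroring the five documented arguments."""

    # Every default is None, matching the "Defaults" column of the table.
    speculative_draft_model_path: Optional[str] = None
    speculative_algorithm: Optional[str] = None
    speculative_num_steps: Optional[int] = None
    speculative_num_draft_tokens: Optional[int] = None
    speculative_eagle_topk: Optional[int] = None


def to_cli_flags(args: SpeculativeArgs) -> List[str]:
    """Render every non-None field as a flag/value pair.

    Assumption: each argument maps to a command-line flag with
    underscores replaced by dashes.
    """
    flags: List[str] = []
    for f in fields(args):
        value = getattr(args, f.name)
        if value is None:
            continue  # unset arguments are left out entirely
        flags += ["--" + f.name.replace("_", "-"), str(value)]
    return flags


example = SpeculativeArgs(
    speculative_draft_model_path="path/to/draft-model",  # placeholder path
    speculative_algorithm="EAGLE",
    speculative_num_steps=3,         # illustrative value
    speculative_num_draft_tokens=8,  # illustrative value
    speculative_eagle_topk=4,        # illustrative value
)
print(" ".join(to_cli_flags(example)))
```

Leaving a field at `None` drops its flag from the rendered list, so the defaults from the table are never passed explicitly.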
...