Unverified Commit 656f7fc1 authored by Jhin, committed by GitHub

Docs: Quick fix for Speculative_decoding doc (#3228)


Co-authored-by: Chayenne <zhaochenyang@ucla.edu>
Co-authored-by: Chayenne <zhaochen20@outlook.com>
parent cf0f7eaf
@@ -8,10 +8,11 @@
 "\n",
 "SGLang now provides an EAGLE-based speculative decoding option. The implementation aims to maximize speed and efficiency and is considered to be among the fastest in open-source LLM engines.\n",
 "\n",
+"**Note:** Currently, Speculative Decoding in SGLang does not support radix cache.\n",
+"\n",
 "To run the following tests or benchmarks, you also need to install [**cutex**](https://pypi.org/project/cutex/): \n",
-"> ```bash\n",
-"> pip install cutex\n",
-"> ```\n",
+"\n",
+"`pip install cutex`\n",
 "\n",
 "### Performance Highlights\n",
 "\n",
......