Unverified Commit 95c231e5 authored by vzed, committed by GitHub

Tool Call: Add `chat_template_kwargs` documentation (#5679)

parent 3042f1da
...@@ -94,7 +94,63 @@
"\n",
"The chat completions API accepts OpenAI Chat Completions API's parameters. Refer to [OpenAI Chat Completions API](https://platform.openai.com/docs/api-reference/chat/create) for more details.\n",
"\n",
"SGLang extends the standard API with the `extra_body` parameter, allowing for additional customization. One key option within `extra_body` is `chat_template_kwargs`, which can be used to pass arguments to the chat template processor.\n",
"\n",
"#### Enabling Model Thinking/Reasoning\n",
"\n",
"You can use `chat_template_kwargs` to enable or disable the model's internal thinking or reasoning process output. Set `\"enable_thinking\": True` within `chat_template_kwargs` to include the reasoning steps in the response. This requires launching the server with a compatible reasoning parser (e.g., `--reasoning-parser qwen3` for Qwen3 models).\n",
"\n",
"Here's an example demonstrating how to enable thinking and retrieve the reasoning content separately (using `separate_reasoning: True`):\n",
"\n",
"```python\n",
"# Ensure the server is launched with a compatible reasoning parser, e.g.:\n",
"# python3 -m sglang.launch_server --model-path QwQ/Qwen3-32B-250415 --reasoning-parser qwen3 ...\n",
"\n",
"from openai import OpenAI\n",
"\n",
"# Modify OpenAI's API key and API base to use SGLang's API server.\n",
"openai_api_key = \"EMPTY\"\n",
"openai_api_base = f\"http://127.0.0.1:{port}/v1\" # Use the correct port\n",
"\n",
"client = OpenAI(\n",
" api_key=openai_api_key,\n",
" base_url=openai_api_base,\n",
")\n",
"\n",
"model = \"QwQ/Qwen3-32B-250415\" # Use the model loaded by the server\n",
"messages = [{\"role\": \"user\", \"content\": \"9.11 and 9.8, which is greater?\"}]\n",
"\n",
"response = client.chat.completions.create(\n",
" model=model,\n",
" messages=messages,\n",
" extra_body={\n",
" \"chat_template_kwargs\": {\"enable_thinking\": True},\n",
" \"separate_reasoning\": True\n",
" }\n",
")\n",
"\n",
"print(\"response.choices[0].message.reasoning_content: \\n\", response.choices[0].message.reasoning_content)\n",
"print(\"response.choices[0].message.content: \\n\", response.choices[0].message.content)\n",
"```\n",
"\n",
"**Example Output:**\n",
"\n",
"```\n",
"response.choices[0].message.reasoning_content: \n",
" Okay, so I need to figure out which number is greater between 9.11 and 9.8. Hmm, let me think. Both numbers start with 9, right? So the whole number part is the same. That means I need to look at the decimal parts to determine which one is bigger.\n",
"...\n",
"Therefore, after checking multiple methods—aligning decimals, subtracting, converting to fractions, and using a real-world analogy—it's clear that 9.8 is greater than 9.11.\n",
"\n",
"response.choices[0].message.content: \n",
" To determine which number is greater between **9.11** and **9.8**, follow these steps:\n",
"...\n",
"**Answer**: \n",
"9.8 is greater than 9.11.\n",
"```\n",
"\n",
"Setting `\"enable_thinking\": False` (or omitting it) will result in `reasoning_content` being `None`.\n",
"\n",
"Here is an example of a detailed chat completion request using standard OpenAI parameters:"
]
},
{
...