Unverified commit 4d51588e, authored by Yifan Qiao, committed by GitHub

[Feat] DeepSeek V4 Rebased (#40860)


Signed-off-by: Yifan Qiao <yifanqiao@inferact.ai>
Signed-off-by: Woosuk Kwon <woosuk@inferact.ai>
Signed-off-by: qizixi <zixi@inferact.ai>
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
Signed-off-by: Yongye Zhu <zyy1102000@gmail.com>
Co-authored-by: Yongye Zhu <zyy1102000@gmail.com>
Co-authored-by: Yongye Zhu <yongye@inferact.ai>
Co-authored-by: Simon Mo <simon@inferact.ai>
Co-authored-by: Bugen Zhao <i@bugenzhao.com>
Co-authored-by: Giancarlo Delfin <gdelfin@inferact.ai>
Co-authored-by: Jee Jee Li <pandaleefree@gmail.com>
Co-authored-by: Nick Hill <nickhill123@gmail.com>
Co-authored-by: Roger Wang <hey@rogerw.io>
Co-authored-by: Roy Wang <yasong.wang@inferact.ai>
Co-authored-by: Woosuk Kwon <woosuk@inferact.ai>
Co-authored-by: youkaichao <youkaichao@gmail.com>
Co-authored-by: Zhewen Li <jerven.vllm@gmail.com>
Co-authored-by: Zijing Liu <liuzijing2014@gmail.com>
Co-authored-by: khluu <khluu000@gmail.com>
Co-authored-by: qizixi <zixi@inferact.ai>
Co-authored-by: Zh...
parent 32e45636
<|begin▁of▁sentence|>该助手为DeepSeek,由深度求索公司创造。<|latest_reminder|>2026-02-21,星期六,广州,App,中文<|User|>小柴胡冲剂和布洛芬能一起吃吗?
CITATION FORMAT: 【{cursor_id}†L{start_line_id}(-L{end_line_id})?】
## Tools
You have access to a set of tools to help answer the user's question. You can invoke tools by writing a "<|DSML|tool_calls>" block like the following:
<|DSML|tool_calls>
<|DSML|invoke name="$TOOL_NAME">
<|DSML|parameter name="$PARAMETER_NAME" string="true|false">$PARAMETER_VALUE</|DSML|parameter>
...
</|DSML|invoke>
<|DSML|invoke name="$TOOL_NAME2">
...
</|DSML|invoke>
</|DSML|tool_calls>
String parameters should be specified as is and set `string="true"`. For all other types (numbers, booleans, arrays, objects), pass the value in JSON format and set `string="false"`.
If thinking_mode is enabled (triggered by <think>), you MUST output your complete reasoning inside <think>...</think> BEFORE any tool calls or final response.
Otherwise, output directly after </think> with tool calls or final response.
### Available Tool Schemas
{"name": "search", "description": "Web search. Split multiple queries with '||'.", "parameters": {"type": "object", "properties": {"queries": {"type": "string", "description": "query1||query2"}}, "required": ["queries"], "additionalProperties": false, "$schema": "http://json-schema.org/draft-07/schema#"}}
{"name": "open", "description": "Batch open IDs (format 【{id}†...】) or URLs.", "parameters": {"type": "object", "properties": {"open_list": {"type": "array", "items": {"type": "object", "properties": {"id": {"description": "ID or URL", "anyOf": [{"type": "integer"}, {"type": "string"}], "default": -1}, "cursor": {"type": "integer", "description": "", "default": -1}, "loc": {"type": "integer", "description": "Start line", "default": -1}, "num_lines": {"type": "integer", "description": "", "default": -1}, "view_source": {"type": "boolean", "description": "", "default": false}}, "additionalProperties": false}, "description": ""}}, "required": ["open_list"], "additionalProperties": false, "$schema": "http://json-schema.org/draft-07/schema#"}}
{"name": "find", "description": "Find exact text pattern in pages.", "parameters": {"type": "object", "properties": {"find_list": {"type": "array", "items": {"type": "object", "properties": {"pattern": {"type": "string", "description": ""}, "cursor": {"type": "integer", "description": "", "default": -1}}, "required": ["pattern"], "additionalProperties": false}, "description": ""}}, "required": ["find_list"], "additionalProperties": false, "$schema": "http://json-schema.org/draft-07/schema#"}}
You MUST strictly follow the above defined tool name and parameter schemas to invoke tool calls.
<|Assistant|><think>用户想知道小柴胡冲剂和布洛芬能否一起服用。</think>
<|DSML|tool_calls>
<|DSML|invoke name="search">
<|DSML|parameter name="queries" string="true">小柴胡冲剂 布洛芬 相互作用 一起吃</|DSML|parameter>
</|DSML|invoke>
</|DSML|tool_calls><|end▁of▁sentence|><|User|><tool_result>[0]</tool_result><|Assistant|><think>现在开始组织回答。</think>请及时就医。<|end▁of▁sentence|>
\ No newline at end of file
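The parameter rule stated in the fixture above (strings passed as-is with `string="true"`, every other type passed as JSON with `string="false"`) can be illustrated with a small renderer. This is a minimal sketch, not the chat-template code from this commit: `render_param` and `render_dsml_tool_calls` are hypothetical helper names, and the newline placement between tags is an assumption based on the fixture's layout.

```python
import json

TC_START = "<|DSML|tool_calls>"
TC_END = "</|DSML|tool_calls>"


def render_param(name: str, value: object) -> str:
    """Render one parameter: strings verbatim, everything else as JSON."""
    if isinstance(value, str):
        flag, body = "true", value
    else:
        # Numbers, booleans, arrays and objects are serialized as JSON.
        flag, body = "false", json.dumps(value, ensure_ascii=False)
    return (
        f'<|DSML|parameter name="{name}" string="{flag}">'
        f"{body}</|DSML|parameter>"
    )


def render_dsml_tool_calls(calls: list[tuple[str, dict]]) -> str:
    """Render (function name, params) pairs as one tool_calls block."""
    lines = [TC_START]
    for func_name, params in calls:
        lines.append(f'<|DSML|invoke name="{func_name}">')
        lines.extend(render_param(k, v) for k, v in params.items())
        lines.append("</|DSML|invoke>")
    lines.append(TC_END)
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_dsml_tool_calls([("search", {"queries": "a||b"})]))
    print(render_dsml_tool_calls([("open", {"open_list": [{"id": 0}]})]))
```

The first call emits a `string="true"` parameter because `queries` is a string; the second emits `string="false"` with a JSON array because `open_list` is not.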
<|begin▁of▁sentence|>该助手为DeepSeek-V3,由深度求索公司创造。
今天是2025年10月17日,星期五。<|latest_reminder|>2024-11-15,上海市,App,中文<|User|>热海大滚锅是世界著名温泉吗<|Assistant|></think>关于热海大滚锅是否是世界著名温泉,可以这样总结:它在**中国乃至全球的地热奇观中占有重要地位**,但“世界著名”的称号更侧重于它作为**独特的地质现象和旅游景点**,而非普遍的温泉疗养体验。
为了让你快速了解,我整理了一个简要的表格:
| 维度 | 热海大滚锅的地位与特点 |
| :--- | :--- |
| **地质奇观** | **世界罕见**的高温地热系统,泉眼多、水温高、形态多样。 |
| **国内声誉** | **中国三大地热区之一**,**国家5A级旅游景区**,自古闻名(徐霞客曾游历并记载)。 |
| **国际知名度** | 在一些旅游平台被提及为“世界六大温泉”之一,但此说法流传不广,其国际声誉更多建立在地质独特性上。 |
| **核心体验** | **观赏地热奇观**(如97℃沸腾的“大滚锅”)、**体验温泉煮鸡蛋**。 |
### 💡 游玩攻略与温馨提示
如果你计划前往热海大滚锅,这里有一些实用信息供你参考:
- **门票与开放时间**:
- **门票**:景区门票约为**50元/人**。如果选择包含温泉沐浴的套餐,价格会更高,例如约**288元**。
- **开放时间**:景区一般**08:00-18:00**开放,但具体时间可能变动,建议提前核实。
- **特色体验**:
- **温泉煮鸡蛋**:这几乎是必试项目。可以在景区门口购买用草绳串起的生鸡蛋(约5-8元/串),然后到“大滚锅”旁的指定区域蒸煮,几分钟便可熟食,趣味十足。
- **金汤足浴**:可以直接用从“大滚锅”流出的温泉水泡脚,缓解旅途疲劳。
- **注意事项**:
- **安全第一**:“大滚锅”水温极高,务必遵守游览规则,在指定区域内观赏,切勿随意触碰泉水。
- **规划行程**:建议为热海景区预留**3-4小时**的游览时间。景区内步道不走回头路,出入口有观光车接送。
希望这些信息能帮助你更好地了解热海大滚锅。如果你对腾冲的其他景点或者行程规划有更多疑问,我很乐意提供进一步的信息。<|end▁of▁sentence|><|User|>世界著名温泉有哪些<|Assistant|></think><|action|>Search<|end▁of▁sentence|>
\ No newline at end of file
This diff is collapsed.
# SPDX-License-Identifier: Apache-2.0
# SPDX-FileCopyrightText: Copyright contributors to the vLLM project
"""Unit tests for DeepSeekV4ToolParser."""

import json
from unittest.mock import MagicMock

from vllm.tool_parsers import ToolParserManager
from vllm.tool_parsers.deepseekv4_tool_parser import DeepSeekV4ToolParser

MOCK_TOKENIZER = MagicMock()
MOCK_TOKENIZER.get_vocab.return_value = {}

TC_START = "<|DSML|tool_calls>"
TC_END = "</|DSML|tool_calls>"
INV_START = '<|DSML|invoke name="'
INV_END = "</|DSML|invoke>"
PARAM_START = '<|DSML|parameter name="'
PARAM_END = "</|DSML|parameter>"


def make_parser(tools=None) -> DeepSeekV4ToolParser:
    return DeepSeekV4ToolParser(MOCK_TOKENIZER, tools=tools)


def make_request(tools=None) -> MagicMock:
    req = MagicMock()
    req.tools = tools
    return req


def build_tool_call(func_name: str, params: dict[str, str]) -> str:
    param_strs = "".join(
        f'{PARAM_START}{k}" string="true">{v}{PARAM_END}\n' for k, v in params.items()
    )
    return f'{TC_START}\n{INV_START}{func_name}">\n{param_strs}{INV_END}\n{TC_END}'


def stream(parser: DeepSeekV4ToolParser, full_text: str, chunk_size: int = 7):
    deltas = []
    previous_text = ""
    for start in range(0, len(full_text), chunk_size):
        delta_text = full_text[start : start + chunk_size]
        current_text = previous_text + delta_text
        delta = parser.extract_tool_calls_streaming(
            previous_text=previous_text,
            current_text=current_text,
            delta_text=delta_text,
            previous_token_ids=[],
            current_token_ids=[],
            delta_token_ids=[1],
            request=make_request(),
        )
        previous_text = current_text
        if delta is not None:
            deltas.append(delta)
    return deltas


def reconstruct_args(deltas, tool_index: int = 0) -> str:
    fragments = []
    for delta in deltas:
        if delta.tool_calls:
            for tool_call in delta.tool_calls:
                if (
                    tool_call.index == tool_index
                    and tool_call.function
                    and tool_call.function.arguments
                ):
                    fragments.append(tool_call.function.arguments)
    return "".join(fragments)


def test_registered():
    assert ToolParserManager.get_tool_parser("deepseek_v4") is DeepSeekV4ToolParser


def test_extract_tool_calls():
    parser = make_parser()
    model_output = "Let me check. " + build_tool_call(
        "get_weather", {"location": "Beijing", "unit": "celsius"}
    )
    result = parser.extract_tool_calls(model_output, make_request())
    assert result.tools_called
    assert result.content == "Let me check. "
    assert len(result.tool_calls) == 1
    tool_call = result.tool_calls[0]
    assert tool_call.function.name == "get_weather"
    assert json.loads(tool_call.function.arguments) == {
        "location": "Beijing",
        "unit": "celsius",
    }


def test_function_calls_block_is_not_accepted():
    parser = make_parser()
    model_output = build_tool_call("search", {"query": "vllm"}).replace(
        "tool_calls", "function_calls"
    )
    result = parser.extract_tool_calls(model_output, make_request())
    assert not result.tools_called
    assert result.content == model_output


def test_streaming_extracts_complete_invokes():
    parser = make_parser()
    full_text = build_tool_call("search", {"query": "deepseek v4"})
    deltas = stream(parser, full_text, chunk_size=5)
    names = [
        tool_call.function.name
        for delta in deltas
        if delta.tool_calls
        for tool_call in delta.tool_calls
    ]
    assert names == ["search"]
    assert json.loads(reconstruct_args(deltas)) == {"query": "deepseek v4"}
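
The non-streaming behaviour these tests assert (text before the block is returned as content, arguments decode to a dict, and a `function_calls` block is left untouched) can be sketched with a regex-based parser. This is an illustrative stand-in, not the `DeepSeekV4ToolParser` implementation, whose source is collapsed in this diff; `parse_dsml` and its return shape are hypothetical.

```python
import json
import re

TC_START = "<|DSML|tool_calls>"
TC_END = "</|DSML|tool_calls>"
INVOKE_RE = re.compile(
    r'<\|DSML\|invoke name="([^"]+)">(.*?)</\|DSML\|invoke>', re.DOTALL
)
PARAM_RE = re.compile(
    r'<\|DSML\|parameter name="([^"]+)" string="(true|false)">'
    r"(.*?)</\|DSML\|parameter>",
    re.DOTALL,
)


def parse_dsml(text: str) -> tuple[str, list[dict]]:
    """Return (content before the block, list of {"name", "arguments"})."""
    start = text.find(TC_START)
    end = text.find(TC_END, start + len(TC_START)) if start != -1 else -1
    if start == -1 or end == -1:
        # No complete tool_calls block: treat everything as plain content.
        return text, []
    block = text[start + len(TC_START) : end]
    calls = []
    for name, body in INVOKE_RE.findall(block):
        args = {}
        for key, is_string, raw in PARAM_RE.findall(body):
            # string="true" values are kept verbatim; others are JSON.
            args[key] = raw if is_string == "true" else json.loads(raw)
        calls.append({"name": name, "arguments": args})
    return text[:start], calls


if __name__ == "__main__":
    sample = (
        "Let me check. <|DSML|tool_calls>\n"
        '<|DSML|invoke name="get_weather">\n'
        '<|DSML|parameter name="location" string="true">Beijing</|DSML|parameter>\n'
        "</|DSML|invoke>\n"
        "</|DSML|tool_calls>"
    )
    print(parse_dsml(sample))
```

A streaming parser additionally has to buffer partial tags across chunk boundaries, which this sketch does not attempt.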
This diff is collapsed.
@@ -1855,10 +1855,11 @@ def test_generate_scheduler_kv_cache_config():
 def new_mla_spec(cache_dtype_str=None):
+    # head_size = kv_lora_rank(512) + qk_rope_head_dim(64) = 576
     return MLAAttentionSpec(
         block_size=16,
-        num_kv_heads=16,
-        head_size=64,
+        num_kv_heads=1,
+        head_size=576,
         dtype=torch.float32,
         cache_dtype_str=cache_dtype_str,
     )
......
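
The arithmetic behind the new spec values can be checked directly. The sketch below assumes that an MLA cache block stores one vector of `head_size` elements per KV head per token (no separate K and V tensors) and that `torch.float32` takes 4 bytes per element; `mla_block_bytes` is a hypothetical helper, not vLLM's page-size code.

```python
KV_LORA_RANK = 512
QK_ROPE_HEAD_DIM = 64
FLOAT32_BYTES = 4


def mla_block_bytes(
    block_size: int, num_kv_heads: int, head_size: int, dtype_bytes: int
) -> int:
    """Bytes in one cache block under the one-vector-per-token assumption."""
    return block_size * num_kv_heads * head_size * dtype_bytes


# head_size as stated in the diff comment: 512 + 64 = 576
head_size = KV_LORA_RANK + QK_ROPE_HEAD_DIM

new_bytes = mla_block_bytes(16, 1, head_size, FLOAT32_BYTES)
old_bytes = mla_block_bytes(16, 16, 64, FLOAT32_BYTES)

if __name__ == "__main__":
    print(head_size, new_bytes, old_bytes)
```

Under these assumptions the updated spec gives 36864 bytes per block, compared with 65536 bytes for the previous `num_kv_heads=16`, `head_size=64` values.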
This diff is collapsed.