@@ -158,13 +158,13 @@ All Llama 3.1, 3.2 and 4 models should be supported.
...
@@ -158,13 +158,13 @@ All Llama 3.1, 3.2 and 4 models should be supported.
* `meta-llama/Llama-3.2-*`
* `meta-llama/Llama-3.2-*`
* `meta-llama/Llama-4-*`
* `meta-llama/Llama-4-*`
The tool calling that is supported is the [JSON based tool calling](https://llama.meta.com/docs/model-cards-and-prompt-formats/llama3_1/#json-based-tool-calling). For [pythonic tool calling](https://github.com/meta-llama/llama-models/blob/main/models/llama3_2/text_prompt_format.md#zero-shot-function-calling) introduced by the Llama-3.2 models, see the `pythonic` tool parser below.
The tool calling that is supported is the [JSON based tool calling](https://llama.meta.com/docs/model-cards-and-prompt-formats/llama3_1/#json-based-tool-calling). For [pythonic tool calling](https://github.com/meta-llama/llama-models/blob/main/models/llama3_2/text_prompt_format.md#zero-shot-function-calling) introduced by the Llama-3.2 models, see the `pythonic` tool parser below. As for llama 4 models, it is recommended to use the `llama4_pythonic` tool parser.
Other tool calling formats like the built in python tool calling or custom tool calling are not supported.
Other tool calling formats like the built in python tool calling or custom tool calling are not supported.
Known issues:
Known issues:
1. Parallel tool calls are not supported.
1. Parallel tool calls are not supported for llama 3, but it is supported in llama 4 models.
2. The model can generate parameters with a wrong format, such as generating
2. The model can generate parameters with a wrong format, such as generating
an array serialized as string instead of an array.
an array serialized as string instead of an array.
VLLM also provides a JSON based chat template for Llama 4:
VLLM also provides a pythonic and JSON based chat template for Llama 4, but pythonic tool calling is recommended:
* <gh-file:examples/tool_chat_template_llama4_json.jinja> - this is based on the "official" chat template for the Llama 4
* <gh-file:examples/tool_chat_template_llama4_pythonic.jinja> - this is based on the [official chat template](https://www.llama.com/docs/model-cards-and-prompt-formats/llama4/) for the Llama 4 models.
models, but tweaked so that it works better with vLLM.
For Llama 4 use `--tool-call-parser llama4_json examples/tool_chat_template_llama4_json.jinja`.
For Llama 4 model, use `--tool-call-parser llama4_pythonic --chat-template examples/tool_chat_template_llama4_pythonic.jinja`.
{%- set tool_definition = tool_definition ~ (tools | tojson(indent=4)) %}
{%- endif %}
{%- else %}
{%- if not tools is defined %}
{%- set tools = none %}
{%- set tools = none %}
{%- endif %}
{%- endif %}
{#- This block extracts the system message, so we can slot it into the right place. #}
{#- This block extracts the system message, so we can slot it into the right place. #}
{%- if messages[0]['role'] == 'system' %}
{%- if messages[0]['role'] == 'system' %}
{%- set user_provided_system_message = true %}
{%- if messages[0]['content'] is string %}
{%- if messages[0]['content'] is string %}
{%- set system_message = messages[0]['content']|trim %}
{%- set system_message = messages[0]['content']|trim %}
{%- else %}
{%- else %}
...
@@ -18,68 +19,33 @@
...
@@ -18,68 +19,33 @@
{%- endif %}
{%- endif %}
{%- set messages = messages[1:] %}
{%- set messages = messages[1:] %}
{%- else %}
{%- else %}
{%- if tools is not none %}
{%- if tools is not none %}
{#- Add default tool system message when tools are provided #}
{#- Since not system_message was provided by user, if tool is provided, system_message is now default tool system message #}
{%- set system_message = "You are a helpful assistant with tool calling "
{#- This system message is from llama website:https://www.llama.com/docs/model-cards-and-prompt-formats/llama4/ #}
"capabilities. Only reply with a tool call if the function exists in the "
{%- set system_message = "You are a helpful assistant and an expert in function composition. You can answer general questions using your internal knowledge OR invoke functions when necessary. Follow these strict guidelines:\n\n1. FUNCTION CALLS:\n- ONLY use functions that are EXPLICITLY listed in the function list below\n- If NO functions are listed (empty function list []), respond ONLY with internal knowledge or \"I don't have access to [Unavailable service] information\"\n- If a function is not in the list, respond ONLY with internal knowledge or \"I don't have access to [Unavailable service] information\"\n- If ALL required parameters are present AND the query EXACTLY matches a listed function's purpose: output ONLY the function call(s)\n- Use exact format: [func_name1(param1=value1, param2=value2), func_name2(...)]\nExamples:\nCORRECT: [get_weather(location=\"Vancouver\"), calculate_route(start=\"Boston\", end=\"New York\")] <- Only if get_weather and calculate_route are in function list\nINCORRECT: get_weather(location=\"New York\")\nINCORRECT: Let me check the weather: [get_weather(location=\"New York\")]\nINCORRECT: [get_events(location=\"Singapore\")] <- If function not in list\n\n2. RESPONSE RULES:\n- For pure function requests matching a listed function: ONLY output the function call(s)\n- For knowledge questions: ONLY output text\n- For missing parameters: ONLY request the specific missing parameters\n- For unavailable services (not in function list): output ONLY with internal knowledge or \"I don't have access to [Unavailable service] information\". Do NOT execute a function call.\n- If the query asks for information beyond what a listed function provides: output ONLY with internal knowledge about your limitations\n- NEVER combine text and function calls in the same response\n- NEVER suggest alternative functions when the requested service is unavailable\n- NEVER create or invent new functions not listed below\n\n3. STRICT BOUNDARIES:\n- ONLY use functions from the list below - no exceptions\n- NEVER use a function as an alternative to unavailable information\n- NEVER call functions not present in the function list\n- NEVER add explanatory text to function calls\n- NEVER respond with empty brackets\n- Use proper Python/JSON syntax for function calls\n- Check the function list carefully before responding\n\n4. TOOL RESPONSE HANDLING:\n- When receiving tool responses: provide concise, natural language responses\n- Don't repeat tool response verbatim\n- Don't add supplementary information\n\nHere is a list of functions in JSON format that you can invoke:\n" %}
"library provided by the user. If it doesn't exist, just reply directly in "
"natural language. When you receive a tool call response, use the output to "
"format an answer to the original user question." %}
{%- else %}
{%- else %}
{%- set system_message = "" %}
{%- set system_message = "" %}
{%- endif %}
{%- endif %}
{%- endif %}
{%- endif %}
{#- Now writing the system message: use the user provided system message if user_provided_system_message, else default tool system message if tools presented #}
{#- System message if the user supplied one, or if tools are used (default tool system message) #}
{%- if system_message %}
{%- if system_message %}
{#- always use user provided system message to override default tool system message #}
{#- always use user provided system message to override default tool system message #}
{{- "<|header_start|>system<|header_end|>\n\n" }}
{{- "<|header_start|>system<|header_end|>\n\n" }}
{{- system_message }}
{{- system_message }}
{%- if tools is not none and not tools_in_user_message %}
{%- if user_provided_system_message and tools %}
{{- "Tools: You have access to the following tools. You might need to use one "
{{- "\nHere is a list of functions in JSON format that you can invoke. Use exact format: [func_name1(param1=value1, param2=value2), func_name2(...)]\n" }}
"or more function/tool calls to fulfill the task. \n"
{{- tool_definition -}}
"If none are needed, then proceed to the response.\n\n"
{%- elif tool_definition %}
"Tool Call Syntax: You can call tools using the following syntax:\n"