Unverified commit dabf0197, authored by Matt, committed by GitHub

Make "tool_use" the default chat template key when tools are passed (#31429)

* Make "tool_use" the default when tools are passed

* Add some opinionated text to the docs

* Add some opinionated text to the docs
parent cd71f938
@@ -677,6 +677,24 @@ template. This will ensure that text generation tools can correctly figure out w
 </Tip>
 
+### Why do some models have multiple templates?
+
+Some models use different templates for different use cases. For example, they might use one template for normal chat
+and another for tool-use, or retrieval-augmented generation. In these cases, `tokenizer.chat_template` is a dictionary.
+This can cause some confusion, and where possible, we recommend using a single template for all use-cases. You can use
+Jinja statements like `if tools is defined` and `{% macro %}` definitions to easily wrap multiple code paths in a
+single template.
+
+When a tokenizer has multiple templates, `tokenizer.chat_template` will be a `dict`, where each key is the name
+of a template. The `apply_chat_template` method has special handling for certain template names: Specifically, it will
+look for a template named `default` in most cases, and will raise an error if it can't find one. However, if a template
+named `tool_use` exists when the user has passed a `tools` argument, it will use that instead. To access templates
+with other names, pass the name of the template you want to the `chat_template` argument of
+`apply_chat_template()`.
+
+We find that this can be a bit confusing for users, though - so if you're writing a template yourself, we recommend
+trying to put it all in a single template where possible!
+
 ### What are "default" templates?
 
 Before the introduction of chat templates, chat handling was hardcoded at the model class level. For backwards
......
@@ -1781,16 +1781,20 @@ class PreTrainedTokenizerBase(SpecialTokensMixin, PushToHubMixin):
                 chat_template = template_dict[chat_template]
                 if using_default_dict:
                     using_default_template = True
-            elif chat_template is None and "default" in template_dict:
-                chat_template = template_dict["default"]
+            elif chat_template is None:
+                if tools is not None and "tool_use" in template_dict:
+                    chat_template = template_dict["tool_use"]
+                elif "default" in template_dict:
+                    chat_template = template_dict["default"]
+                else:
+                    raise ValueError(
+                        "This model has multiple chat templates with no default specified! Please either pass a chat "
+                        "template or the name of the template you wish to use to the `chat_template` argument. Available "
+                        f"template names are {sorted(template_dict.keys())}."
+                    )
                 if using_default_dict:
                     using_default_template = True
-            elif chat_template is None:
-                raise ValueError(
-                    "This model has multiple chat templates with no default specified! Please either pass a chat "
-                    "template or the name of the template you wish to use to the `chat_template` argument. Available "
-                    f"template names are {sorted(template_dict.keys())}."
-                )
         elif chat_template is None:
             # These are the cases when the model has a single template
             # priority: `chat_template` argument > `tokenizer.chat_template` > `tokenizer.default_chat_template
......
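
For readers who want to see the new lookup order in isolation, here is a minimal standalone sketch. It is not the library's code: `select_chat_template` is a hypothetical helper, it ignores the `using_default_dict` bookkeeping, and it assumes that a `chat_template` value which is not a key of the dictionary is meant as a literal template string. The error text is also shortened.

```python
from typing import Optional


def select_chat_template(
    template_dict: dict,
    chat_template: Optional[str] = None,
    tools: Optional[list] = None,
) -> str:
    """Hypothetical helper that mirrors the selection order in the hunk above."""
    if chat_template is not None:
        # A known name selects that template. Anything else is ASSUMED here
        # to be a literal template string and is returned unchanged.
        return template_dict.get(chat_template, chat_template)
    # New in this commit: "tool_use" wins whenever a `tools` argument is passed.
    if tools is not None and "tool_use" in template_dict:
        return template_dict["tool_use"]
    if "default" in template_dict:
        return template_dict["default"]
    # Shortened, illustrative message (not the library's exact wording).
    raise ValueError(
        "Multiple chat templates and no default specified. "
        f"Available template names are {sorted(template_dict)}."
    )


# Placeholder template bodies, for illustration only.
templates = {"default": "<default>", "tool_use": "<tool_use>", "rag": "<rag>"}
chosen = select_chat_template(templates, tools=[{"name": "get_weather"}])
```

Because the check is `tools is not None`, an empty list still selects `tool_use` in this sketch, and an explicit `chat_template` name takes priority over both `tool_use` and `default`.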