1. 11 Jun, 2024 1 commit
    • Chat Template support for function calling and RAG (#30621) · edc1dffd
      Matt authored
      
      
      * First draft, still missing automatic function conversion
      
      * First draft of the automatic schema generator
      
      * Lots of small fixes
      
      * the walrus has betrayed me
      
      * please stop committing your debug breakpoints
      
      * Lots of cleanup and edge cases, looking better now
      
      * Comments and bugfixes for the type hint parser
      
      * More cleanup
      
      * Add tests, update schema generator
      
      * Update tests, proper handling of return values
      
      * Small docstring change
      
      * More doc updates
      
      * More doc updates
      
      * Add json_schema decorator
      
      * Clean up the TODOs and finish the docs
      
      * self.maxDiff = None to see the whole diff for the nested list test
      
      * add import for add_json_schema
      
      * Quick test fix
      
      * Fix something that was bugging me in the chat template docstring
      
      * Less "anyOf" when unnecessary
      
      * Support return types for the templates that need them
      
      * Proper return type tests
      
      * Switch to Google format docstrings
      
      * Update chat templating docs to match new format
      
      * Stop putting the return type in with the other parameters
      
      * Add Tuple support
      
      * No more decorator - we just do it implicitly!
      
      * Add enum support to get_json_schema
      
      * Update docstring
      
      * Add copyright header
      
      * Update src/transformers/tokenization_utils_base.py
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * Update docs/source/en/chat_templating.md
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * Update src/transformers/utils/chat_template_utils.py
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * Update src/transformers/utils/chat_template_utils.py
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * Add copyright header
      
      * make fixup
      
      * Fix indentation
      
      * Reformat chat_template_utils
      
      * Correct return value
      
      * Make regexes module-level
      
      * Support more complex, multi-line arg docstrings
      
      * Update error message for ...
      
      * Update ruff
      
      * Add document type validation
      
      * Refactor docs
      
      * Refactor docs
      
      * Refactor docs
      
      * Clean up Tuple error
      
      * Add an extra test for very complex defs and docstrings and clean everything up for it
      
      * Document enum block
      
      * Quick test fixes
      
      * Stop supporting type hints in docstring to fix bugs and simplify the regex
      
      * Update docs for the regex change
      
      * Clean up enum regex
      
      * Wrap functions in {"type": "function", "function": ...}
      
      * Update src/transformers/utils/chat_template_utils.py
      Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>
      
      * Temporary tool calling commit
      
      * Add type hints to chat template utils, partially update docs (incomplete!)
      
      * Code cleanup based on @molbap's suggestion
      
      * Add comments to explain regexes
      
      * Fix up type parsing for unions and lists
      
      * Add custom exception types and adjust tests to look for them
      
      * Update docs with a demo!
      
      * Docs cleanup
      
      * Pass content as string
      
      * Update tool call formatting
      
      * Update docs with new function format
      
      * Update docs
      
      * Update docs with a second tool to show the model choosing correctly
      
      ---------
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>
  2. 19 Apr, 2024 1 commit
    • Deprecate default chat templates (#30346) · 0927bfd0
      Matt authored
      * initial commit, remove warnings on default chat templates
      
      * stash commit
      
      * Raise a much sterner warning for default chat templates, and prepare for deprecation
      
      * Update the docs
  3. 13 Mar, 2024 1 commit
  4. 16 Feb, 2024 1 commit
  5. 12 Feb, 2024 1 commit
  6. 01 Feb, 2024 1 commit
  7. 27 Nov, 2023 1 commit
  8. 14 Nov, 2023 1 commit
    • Update and reorder docs for chat templates (#27443) · 5468ab35
      Matt authored
      * Update and reorder docs for chat templates
      
      * Fix Mistral docstring
      
      * Add section link and small fixes
      
      * Remove unneeded line in Mistral example
      
      * Add comment on saving memory
      
      * Fix generation prompts link
      
      * Fix code block languages
  9. 06 Oct, 2023 1 commit
  10. 04 Oct, 2023 2 commits
  11. 15 Sep, 2023 1 commit
    • Tweaks to Chat Templates docs (#26168) · 2518e368
      Matt authored
      * Put tokenizer methods in the right alphabetical order in the docs
      
      * Quick tweak to ConversationalPipeline
      
      * Typo fixes in the developer doc
      
      * make fixup
  12. 14 Sep, 2023 1 commit
    • Overhaul Conversation class and prompt templating (#25323) · 866df66f
      Matt authored
      
      
      * First commit while I figure this out
      
      * make fixup
      
      * Remove unused method
      
      * Store prompt attrib
      
      * Fix prompt argument for tests
      
      * Make same changes in fast tokenizer
      
      * Remove global prompts from fast tokenizer too
      
      * stash commit
      
      * stash commit
      
      * Migrate PromptConfig to its True Final Location
      
      * Replace Conversation entirely with the new class
      
      * Import/dependency fixes
      
      * Import/dependency fixes
      
      * Change format for lots of default prompts
      
      * More default prompt fixups
      
      * Revert llama old methods so we can compare
      
      * Fix some default configs
      
      * Fix some default configs
      
      * Fix misspelled kwarg
      
      * Fixes for Blenderbot
      
      * make fixup
      
      * little rebase cleanup
      
      * Add basic documentation
      
      * Quick doc fix
      
      * Truncate docstring for now
      
      * Add handling for the case when messages is a single string
      
      * Quick llama merges
      
      * Update conversational pipeline and tests
      
      * Add a couple of legacy properties for backward compatibility
      
      * More legacy handling
      
      * Add docstring for build_conversation_input_ids
      
      * Restructure PromptConfig
      
      * Let's start T E M P L A T I N G
      
      * Refactor all default configs to use templates instead
      
      * Revert changes to the special token properties since we don't need them anymore
      
      * More class templates
      
      * Make the sandbox even sandier
      
      * Everything replaced with pure templating
      
      * Remove docs for PromptConfig
      
      * Add testing and optional requirement boilerplate
      
      * Fix imports and make fixup
      
      * Fix LLaMA tests and add Conversation docstring
      
      * Finally get LLaMA working with the template system
      
      * Finally get LLaMA working with the template system
      
      * make fixup
      
      * make fixup
      
      * fmt-off for the long lists of test tokens
      
      * Rename method to apply_chat_template for now
      
      * Start on documentation
      
      * Make chat_template a property that reads through to the default if it's not set
      
      * Expand docs
      
      * Expand chat templating doc some more
      
      * trim/lstrip blocks by default and update doc
      
      * Few doc tweaks
      
      * rebase cleanup
      
      * Clarify docstring
      
      * rebase cleanup
      
      * rebase cleanup
      
      * make fixup
      
      * Quick doc edit
      
      * Reformat the standard template to match ChatML
      
      * Re-add PEFT check
      
      * Update docs/source/en/chat_templating.md
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * Add apply_chat_template to the tokenizer doc
      
      * make fixup
      
      * Add doc links
      
      * Fix chat links
      
      * Fix chat links
      
      * Explain system messages in the doc
      
      * Add chat template test
      
      * Proper save-loading for chat template attribute
      
      * Add test skips for layout models
      
      * Remove _build_conversation_input_ids, add default_chat_template to code_llama
      
      * Make sure all LLaMA models are using the latest template
      
      * Remove default_system_prompt block in code_llama because it has no default prompt
      
      * Update ConversationalPipeline preprocess
      
      * Add correct #Copied from links to the default_chat_templates
      
      * Remove unneeded type checking line
      
      * Add a dummy mark_processed method
      
      * Reorganize Conversation to have **deprecated_kwargs
      
      * Update chat_templating.md
      
      * Quick fix to LLAMA tests
      
      * Small doc tweaks
      
      * Add proper docstrings and "copied from" statements to all default chat templates
      
      * Merge use_default_system_prompt support for code_llama too
      
      * Improve clarity around self.chat_template
      
      * Docstring fix
      
      * Fix blenderbot default template
      
      * More doctest fix
      
      * Break out some tokenizer kwargs
      
      * Update doc to explain default templates
      
      * Quick tweaks to tokenizer args
      
      * Cleanups for tokenizer args
      
      * Add note about caching
      
      * Quick tweak to the chat-templating doc
      
      * Update the LLaMA template with error checking and correct system message embedding
      
      * make fixup
      
      * make fixup
      
      * add requires_jinja
      
      * Cleanup to expected output formatting
      
      * Add caching
      
      * Fix typo in llama default template
      
      * Update LLaMA tests
      
      * Update documentation
      
      * Improved legacy handling in the Conversation class
      
      * Update Jinja template with proper error handling
      
      * Quick bugfix
      
      * Proper exception raising
      
      * Change caching behaviour so it doesn't try to pickle an entire Jinja env
      
      * make fixup
      
      * rebase cleanup
      
      ---------
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>