1. 29 Sep, 2023 3 commits
  2. 27 Sep, 2023 5 commits
  3. 26 Sep, 2023 3 commits
    • Deleted duplicate sentence (#26394) · a8531f3b
      titi authored
    • [ViTMatte] Add resources (#26317) · a09130fe
      NielsRogge authored
      Add resource
    • Add Nougat (#25942) · ace74d16
      NielsRogge authored
      
      
      * Add conversion script
      
      * Add NougatImageProcessor
      
      * Add crop margin
      
      * More improvements
      
      * Add docs, READMEs
      
      * Remove print statements
      
      * Include model_max_length
      
      * Add NougatTokenizerFast
      
      * Fix imports
      
      * Improve postprocessing
      
      * Improve image processor
      
      * Fix image processor
      
      * Improve normalize method
      
      * More improvements
      
      * More improvements
      
      * Add processor, improve docs
      
      * Simplify fast tokenizer
      
      * Remove test file
      
      * Fix docstrings
      
      * Use NougatProcessor in conversion script
      
      * Add is_levensthein_available
      
      * Add tokenizer tests
      
      * More improvements
      
      * Use numpy instead of opencv
      
      * Add is_cv2_available
      
      * Fix cv2_available
      
      * Add is_nltk_available
      
      * Add image processor tests, improve crop_margin
      
      * Add integration tests
      
      * Improve integration test
      
      * Use do_rescale instead of hacks, thanks Amy
      
      * Remove random_padding
      
      * Address comments
      
      * Address more comments
      
      * Add import
      
      * Address more comments
      
      * Address more comments
      
      * Address comment
      
      * Address comment
      
      * Set max_model_input_sizes
      
      * Add tests
      
      * Add requires_backends
      
      * Add Nougat to exotic tests
      
      * Use to_pil_image
      
      * Address comment regarding nltk
      
      * Add NLTK
      
      * Improve variable names, integration test
      
      * Add test
      
      * refactor, document, and test regexes
      
      * remove named capture groups, add comments
      
      * format
      
      * add non-markdown fixed tokenization
      
      * format
      
      * correct flakiness of args parse
      
      * add regex comments
      
      * test functionalities for crop_image, align long axis and expected output
      
      * add regex tests
      
      * remove cv2 dependency
      
      * test crop_margin equality between cv2 and python
      
      * refactor table regexes to markdown
      
      add newline
      
      * change print to log, improve doc
      
      * fix high count tables correction
      
      * address PR comments: naming, linting, asserts
      
      * Address comments
      
      * Add copied from
      
      * Update conversion script
      
      * Update conversion script to convert both small and base versions
      
      * Add inference example
      
      * Add more info
      
      * Fix style
      
      * Add require annotators to test
      
      * Define all keyword arguments explicitly
      
      * Move cv2 annotator
      
      * Add tokenizer init method
      
      * Transfer checkpoints
      
      * Add reference to Donut
      
      * Address comments
      
      * Skip test
      
      * Remove cv2 method
      
      * Add copied from statements
      
      * Use cached_property
      
      * Fix docstring
      
      * Add file to not doctested
      
      ---------
      Co-authored-by: Pablo Montalvo <pablo.montalvo.leroux@gmail.com>
  4. 25 Sep, 2023 3 commits
  5. 22 Sep, 2023 3 commits
  6. 19 Sep, 2023 1 commit
    • Add ViTMatte (#25843) · 7d6354e0
      NielsRogge authored
      * First draft
      
      * Simplify image processor
      
      * Fix rebase
      
      * Address comments
      
      * Address more comments
      
      * Address more comments
      
      * Address more comments
      
      * Address more comments
      
      * Improve pad_image
      
      * Add tests
      
      * Update integration test
      
      * Fix image processor tests
      
      * Fix model tests
      
      * Convert checkpoints
      
      * Fix doc tests
      
      * Remove file
      
      * Apply suggestions
      
      * Address comments
      
      * Fix typing hint
      
      * Add batch_norm_eps
      
      * Address comments
      
      * Fix style
  7. 18 Sep, 2023 4 commits
  8. 15 Sep, 2023 2 commits
  9. 14 Sep, 2023 2 commits
    • Add BROS (#23190) · 17fdd354
      Jinho Park authored
      
      
      * add Bros boilerplate
      
      * copy and pasted modeling_bros.py from official Bros repo
      
      * update copyright of bros files
      
      * copy tokenization_bros.py from official repo and update import path
      
      * copy tokenization_bros_fast.py from official repo and update import path
      
      * copy configuration_bros.py from official repo and update import path
      
      * remove trailing period in copyright line
      
      * copy and paste bros/__init__.py from official repo
      
      * save formatting
      
      * remove unused unnecessary pe_type argument - using only crel type
      
      * resolve import issue
      
      * remove unused model classes
      
      * remove unnecessary tests
      
      * remove unused classes
      
      * fix original code's bug - layer_module's argument order
      
      * clean up modeling auto
      
      * add bbox to prepare_config_and_inputs
      
      * set temporary value to hidden_size (32 is too low because of the
      Bros' positional embedding)
      
      * remove decoder test, update create_and_check* input arguments
      
      * add missing variable to model tests
      
      * do make fixup
      
      * update bros.mdx
      
      * add boilerplate for no_head inference test
      
      * update BROS_PRETRAINED_MODEL_ARCHIVE_LIST (add naver-clova-ocr prefix)
      
      * add prepare_bros_batch_inputs function
      
      * update modeling_common to add bbox inputs in Bros Model Test
      
      * remove unnecessary model inference
      
      * add test case
      
      * add model_doc
      
      * add test case for token_classification
      
      * apply fixup
      
      * update modeling code
      
      * update BrosForTokenClassification loss calculation logic
      
      * revert logits preprocessing logic to make sure logits have original shape
      
      * - update class name
      
      * - add BrosSpadeOutput
      - update BrosConfig arguments
      
      * add boilerplate for no_head inference test
      
      * add prepare_bros_batch_inputs function
      
      * add test case
      
      * add test case for token_classification
      
      * update modeling code
      
      * update BrosForTokenClassification loss calculation logic
      
      * revert logits preprocessing logic to make sure logits have original shape
      
      * apply masking on the fly
      
      * add BrosSpadeForTokenLinking
      
      * update class name
      put docstring to the beginning of the file
      
      * separate the logits calculation logic and loss calculation logic
      
      * update logic for loss calculation so that logits shape doesn't change
      when return
      
      * update typo
      
      * update prepare_config_and_inputs
      
      * update dummy node initialization
      
      * update last_hidden_states getting logic to consider when return_dict is False
      
      * update box first token mask param
      
      * bugfix: remove random attention mask generation
      
      * update keys to ignore on load missing
      
      * run make style and quality
      
      * apply make style and quality of other codes
      
      * update box_first_token_mask to bool type
      
      * update index.md
      
      * apply make style and quality
      
      * apply make fix-copies
      
      * pass check_repo
      
      * update bros model doc
      
      * docstring bugfix
      
      * add checkpoint for doc, tokenizer for doc
      
      * Update README.md
      
      * Update docs/source/en/model_doc/bros.md
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * Update bros.md
      
      * Update src/transformers/__init__.py
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * Update docs/source/en/model_doc/bros.md
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * Apply suggestions from code review
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * apply suggestions from code review
      
      * apply suggestions from code review
      
      * revert test_processor_markuplm.py
      
      * Update test_processor_markuplm.py
      
      * apply suggestions from code review
      
      * apply suggestions from code review
      
      * apply suggestions from code review
      
      * update BrosSpadeELForTokenClassification head name to entity linker
      
      * add doc string for config params
      
      * update class, var names to more explicit and apply suggestions from code review
      
      * remove unnecessary keys to ignore
      
      * update relation extractor to be initialized with config
      
      * add bros processor
      
      * apply make style and quality
      
      * update bros.md
      
      * remove bros tokenizer, add bros processor that wraps bert tokenizer
      
      * revert change
      
      * apply make fix-copies
      
      * update processor code, update itc -> initial token, stc -> subsequent token
      
      * add type hint
      
      * remove unnecessary condition branches in embedding forward
      
      * fix auto tokenizer fail
      
      * update docstring for each classes
      
      * update bbox input dimension as standard 2 points and convert them to 4
      points in forward pass
      
      * update bros docs
      
      * apply suggestions from code review : update Bros -> BROS in bros.md
      
      * 1. box prefix var -> bbox
      2. update variable names to be more explicit
      
      * replace einsum with torch matmul
      
      * apply style and quality
      
      * remove unused argument
      
      * remove unused arguments
      
      * update docstrings
      
      * apply suggestions from code review: add BrosBboxEmbeddings, replace
      einsum with classical matrix operations
      
      * revert einsum update
      
      * update bros processor
      
      * apply suggestions from code review
      
      * add conversion script for bros
      
      * Apply suggestions from code review
      
      * fix readme
      
      * apply fix-copies
      
      ---------
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
    • Overhaul Conversation class and prompt templating (#25323) · 866df66f
      Matt authored
      
      
      * First commit while I figure this out
      
      * make fixup
      
      * Remove unused method
      
      * Store prompt attrib
      
      * Fix prompt argument for tests
      
      * Make same changes in fast tokenizer
      
      * Remove global prompts from fast tokenizer too
      
      * stash commit
      
      * stash commit
      
      * Migrate PromptConfig to its True Final Location
      
      * Replace Conversation entirely with the new class
      
      * Import/dependency fixes
      
      * Import/dependency fixes
      
      * Change format for lots of default prompts
      
      * More default prompt fixups
      
      * Revert llama old methods so we can compare
      
      * Fix some default configs
      
      * Fix some default configs
      
      * Fix misspelled kwarg
      
      * Fixes for Blenderbot
      
      * make fixup
      
      * little rebase cleanup
      
      * Add basic documentation
      
      * Quick doc fix
      
      * Truncate docstring for now
      
      * Add handling for the case when messages is a single string
      
      * Quick llama merges
      
      * Update conversational pipeline and tests
      
      * Add a couple of legacy properties for backward compatibility
      
      * More legacy handling
      
      * Add docstring for build_conversation_input_ids
      
      * Restructure PromptConfig
      
      * Let's start T E M P L A T I N G
      
      * Refactor all default configs to use templates instead
      
      * Revert changes to the special token properties since we don't need them anymore
      
      * More class templates
      
      * Make the sandbox even sandier
      
      * Everything replaced with pure templating
      
      * Remove docs for PromptConfig
      
      * Add testing and optional requirement boilerplate
      
      * Fix imports and make fixup
      
      * Fix LLaMA tests and add Conversation docstring
      
      * Finally get LLaMA working with the template system
      
      * Finally get LLaMA working with the template system
      
      * make fixup
      
      * make fixup
      
      * fmt-off for the long lists of test tokens
      
      * Rename method to apply_chat_template for now
      
      * Start on documentation
      
      * Make chat_template a property that reads through to the default if it's not set
      
      * Expand docs
      
      * Expand chat templating doc some more
      
      * trim/lstrip blocks by default and update doc
      
      * Few doc tweaks
      
      * rebase cleanup
      
      * Clarify docstring
      
      * rebase cleanup
      
      * rebase cleanup
      
      * make fixup
      
      * Quick doc edit
      
      * Reformat the standard template to match ChatML
      
      * Re-add PEFT check
      
      * Update docs/source/en/chat_templating.md
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * Add apply_chat_template to the tokenizer doc
      
      * make fixup
      
      * Add doc links
      
      * Fix chat links
      
      * Fix chat links
      
      * Explain system messages in the doc
      
      * Add chat template test
      
      * Proper save-loading for chat template attribute
      
      * Add test skips for layout models
      
      * Remove _build_conversation_input_ids, add default_chat_template to code_llama
      
      * Make sure all LLaMA models are using the latest template
      
      * Remove default_system_prompt block in code_llama because it has no default prompt
      
      * Update ConversationPipeline preprocess
      
      * Add correct #Copied from links to the default_chat_templates
      
      * Remove unneeded type checking line
      
      * Add a dummy mark_processsed method
      
      * Reorganize Conversation to have **deprecated_kwargs
      
      * Update chat_templating.md
      
      * Quick fix to LLAMA tests
      
      * Small doc tweaks
      
      * Add proper docstrings and "copied from" statements to all default chat templates
      
      * Merge use_default_system_prompt support for code_llama too
      
      * Improve clarity around self.chat_template
      
      * Docstring fix
      
      * Fix blenderbot default template
      
      * More doctest fix
      
      * Break out some tokenizer kwargs
      
      * Update doc to explain default templates
      
      * Quick tweaks to tokenizer args
      
      * Cleanups for tokenizer args
      
      * Add note about caching
      
      * Quick tweak to the chat-templating doc
      
      * Update the LLaMA template with error checking and correct system message embedding
      
      * make fixup
      
      * make fixup
      
      * add requires_jinja
      
      * Cleanup to expected output formatting
      
      * Add caching
      
      * Fix typo in llama default template
      
      * Update LLaMA tests
      
      * Update documentation
      
      * Improved legacy handling in the Conversation class
      
      * Update Jinja template with proper error handling
      
      * Quick bugfix
      
      * Proper exception raising
      
      * Change caching behaviour so it doesn't try to pickle an entire Jinja env
      
      * make fixup
      
      * rebase cleanup
      
      ---------
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
  10. 13 Sep, 2023 2 commits
  11. 12 Sep, 2023 6 commits
  12. 11 Sep, 2023 1 commit
  13. 08 Sep, 2023 1 commit
  14. 07 Sep, 2023 1 commit
    • Added HerBERT to README.md (#26020) · 02c4a77f
      Muskan Kumar authored
      * Added HerBERT to README.md
      
      * Update README.md to contain HerBERT (#26016)
      
      * Resolved #26016: Updated READMEs and index.md to contain Herbert
      
      Updated READMEs and ran make fix-copies
  15. 06 Sep, 2023 2 commits
  16. 05 Sep, 2023 1 commit