1. 23 Feb, 2022 2 commits
  2. 18 Feb, 2022 2 commits
    • Sylvain Gugger authored
      d5083c33
    • Add PLBart (#13269) · ae1f8350
      Gunjan Chhablani authored
      * Init PLBART
      
      * Add missing configuration file
      
      * Add conversion script and configuration file
      
      * Fix style
      
      * Update modeling and conversion scripts
      
      * Fix scale embedding in config
      
      * Add comment
      
      * Fix conversion script
      
      * Add classification option to conversion script
      
      * Fix vocab size in config doc
      
      * Add tokenizer files from MBart50
      
      * Allow no lang code in regular tokenizer
      
      * Add PLBart Tokenizer Converters
      
      * Remove mask from multi tokenizer
      
      * Remove mask from multi tokenizer
      
      * Change from MBart-50 to MBart tokenizer
      
      * Fix names and modify src/tgt behavior
      
      * Fix imports for tokenizer
      
      * Remove <mask> from multi tokenizer
      
      * Fix style
      
      * Change tokenizer_class to processor_class
      
      * Add attribute map to config class
      
      * Update modeling file to modified MBart code
      
      * Update configuration file to MBart style configuration
      
      * Fix tokenizer
      
      * Separate tokenizers
      
      * Fix error in tokenization auto
      
      * Copy MBart tests
      
      * Replace with MBart tokenization tests
      
      * Fix style
      
      * Fix language code in multi tokenizer
      
      * Fix configuration docs
      
      * Add entry for plbart_multi in transformers init
      
      * Add dummy objects and fix imports
      
      * Fix modeling tests
      
      * Add TODO in config
      
      * Fix copyright year
      
      * Fix modeling docs and test
      
      * Fix some tokenization tests and style
      
      * Add changes from review
      
      * Fix copies
      
      * Fix docs
      
      * Fix docs
      
      * Fix style
      
      * Fix year
      
      * Add changes from review
      
      * Remove extra changes
      
      * Fix base tokenizer and doc
      
      * Fix style
      
      * Fix modeling and slow tokenizer tests
      
      * Remove Multi-tokenizer Converter and Tests
      
      * Delete QA model and Multi Tokenizer dummy objects
      
      * Fix repo consistency and code quality issues
      
      * Fix example documentation
      
      * Fix style
      
      * Remove PLBartTokenizer from type checking in init
      
      * Fix consistency issue
      
      * Add changes from review
      
      * Fix style
      
      * Remove PLBartTokenizerFast
      
      * Remove FastTokenizer converter
      
      * Fix AutoTokenizer mapping
      
      * Add plbart to toctree and fix consistency issues
      
      * Add language codes tokenizer test
      
      * Fix styling and doc issues
      
      * Add fixes for failing tests
      
      * Fix copies
      
      * Fix failing modeling test
      
      * Change assert to assertTrue in modeling tests
  3. 15 Feb, 2022 1 commit
  4. 11 Feb, 2022 1 commit
    • Custom feature extractor (#15630) · 7a32e472
      Sylvain Gugger authored
      * Rework AutoFeatureExtractor.from_pretrained internal
      
      * Custom feature extractor
      
      * Add more tests
      
      * Add support for custom feature extractor code
      
      * Clean up
  5. 10 Feb, 2022 1 commit
  6. 09 Feb, 2022 2 commits
  7. 07 Feb, 2022 1 commit
  8. 04 Feb, 2022 1 commit
  9. 02 Feb, 2022 1 commit
  10. 28 Jan, 2022 1 commit
    • Add XGLM models (#14876) · d25e25ee
      Suraj Patil authored
      
      
      * add xglm
      
      * update vocab size
      
      * fix model name
      
      * style and tokenizer
      
      * typo
      
      * no mask token
      
      * fix pos embed compute
      
      * fix args
      
      * fix tokenizer
      
      * fix positions
      
      * fix tokenization
      
      * style and dic fixes
      
      * fix imports
      
      * add fast tokenizer
      
      * update names
      
      * add pt tests
      
      * fix tokenizer
      
      * fix typo
      
      * fix tokenizer import
      
      * fix fast tokenizer
      
      * fix tokenizer
      
      * fix converter
      
      * add tokenizer test
      
      * update checkpoint names
      
      * fix tokenizer tests
      
      * fix slow tests
      
      * add copied from comments
      
      * rst -> mdx
      
      * flax model
      
      * update flax tests
      
      * quality
      
      * style
      
      * doc
      
      * update index and readme
      
      * fix copies
      
      * fix doc
      
      * update toctree
      
      * fix indent
      
      * minor fixes
      
      * fix config doc
      
      * don't save embed_pos weights
      
      * Apply suggestions from code review
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * address Sylvain's comments, few doc fixes
      
      * fix check_repo
      
      * align order of arguments
      
      * fix copies
      
      * fix labels
      
      * remove unnecessary mapping
      
      * fix saving tokenizer
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
  11. 27 Jan, 2022 2 commits
    • Fix tests_fetcher (#15376) · a81fd355
      Sylvain Gugger authored
    • [DocTests Speech] Add doc tests for all speech models (#15031) · 9f831bde
      Patrick von Platen authored
      * fix_torch_device_generate_test
      
      * remove @
      
      * doc tests
      
      * up
      
      * up
      
      * fix doctests
      
      * adapt files
      
      * finish refactor
      
      * up
      
      * save intermediate
      
      * add more logic
      
      * new change
      
      * improve
      
      * next try
      
      * next try
      
      * next try
      
      * next try
      
      * fix final spaces
      
      * fix final spaces
      
      * improve
      
      * renaming
      
      * correct more bugs
      
      * finish wavlm
      
      * add comment
      
      * run on test runner
      
      * finish all speech models
      
      * adapt
      
      * finish
  12. 24 Jan, 2022 1 commit
    • Add model like (#14992) · 81156d20
      Sylvain Gugger authored
      
      
      * Add new model like command
      
      * Bad doc-styler
      
      * black and doc-styler, stop fighting!
      
      * black and doc-styler, stop fighting!
      
      * At last
      
      * Clean up
      
      * Typo
      
      * Bad doc-styler
      
      * Bad doc-styler
      
      * All good maybe?
      
      * Use constants
      
      * Add doc and type hints
      
      * More cleaning
      
      * Add doc
      
      * Fix Copied from
      
      * Doc template
      
      * Use typing.Pattern instead
      
      * Framework-specific files
      
      * Fixes
      
      * Select frameworks clean model init
      
      * Deal with frameworks in main init
      
      * fixes
      
      * Last fix
      
      * Prompt user for info
      
      * Delete example config
      
      * Last fixes
      
      * Add test config
      
      * Fix bug with model_type included in each other
      
      * Fixes
      
      * More fixes
      
      * More fixes
      
      * Adapt config
      
      * Remove print statements
      
      * Will fix tokenization later, leave it broken for now
      
      * Add test
      
      * Quality
      
      * Try this way
      
      * Debug
      
      * Maybe by setting the path?
      
      * Let's try another way
      
      * It should go better when actually passing the arg...
      
      * Remove debug statements and style
      
      * Fix config
      
      * Add tests
      
      * Tests require the three backends
      
      * intermediate commit
      
      * Revamp pattern replacements and start work on feature extractors
      
      * Adapt model info
      
      * Finalize code for processors
      
      * Fix in main init additions
      
      * Finish questionnaire for processing classes
      
      * Fix file name
      
      * Fix for real
      
      * Fix patterns
      
      * Style
      
      * Remove needless warnings
      
      * Copied from should work now.
      
      * Include Copied from in blocks
      
      * Add test
      
      * More fixes and tests
      
      * Apply suggestions from code review
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
      
      * Address review comment
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
  13. 21 Jan, 2022 1 commit
    • Refine errors for pretrained objects (#15261) · 6ac77534
      Sylvain Gugger authored
      * Refine errors for pretrained objects
      
      * PoC to avoid using get_list_of_files
      
      * Adapt tests to use new errors
      
      * Quality + Fix PoC
      
      * Revert "PoC to avoid using get_list_of_files"
      
      This reverts commit cb93b7cae8504ef837c2a7663cb7955e714f323e.
      
      * Revert "Quality + Fix PoC"
      
      This reverts commit 3ba6d0d4ca546708b31d355baa9e68ba9736508f.
      
      * Fix doc
      
      * Revert PoC
      
      * Add feature extractors
      
      * More tests and PT model
      
      * Adapt error message
      
      * Feature extractor tests
      
      * TF model
      
      * Flax model and test
      
      * Merge flax auto tests
      
      * Add tokenization
      
      * Fix test
  14. 19 Jan, 2022 1 commit
    • Add ViLT (#14895) · ac227093
      NielsRogge authored
      
      
      * First commit
      
      * Add conversion script
      
      * Make conversion script work for base model
      
      * More improvements
      
      * Update conversion script, works for vqa
      
      * Add indexing argument to meshgrid
      
      * Make conversion script work for ViltForPreTraining
      
      * Add ViltForPreTraining to docs
      
      * Fix device issue
      
      * Add processor
      
      * Add MinMaxResize to feature extractor
      
      * Implement call method of ViltProcessor
      
      * Fix tests
      
      * Add integration test
      
      * Add loss calculation for VQA
      
      * Improve tests
      
      * Improve some more tests
      
      * Debug tests
      
      * Small improvements
      
      * Add support for attention_mask
      
      * Remove mask_it
      
      * Add pixel_mask
      
      * Add tests for ViltFeatureExtractor
      
      * Improve tests
      
      * Add ViltForNaturalLanguageVisualReasoning
      
      * Add ViltForNaturalLanguageVisualReasoning to conversion script
      
      * Minor fixes
      
      * Add support for image_embeds, update docstrings to markdown
      
      * Update docs to markdown
      
      * Improve conversion script
      
      * Rename ViltForPreTraining to ViltForMaskedLM
      
      * Improve conversion script
      
      * Convert docstrings to markdown
      
      * Fix code example of retrieval model
      
      * Properly convert masked language model
      
      * Add integration test for nlvr
      
      * Fix code quality
      
      * Apply suggestions from code review
      
      * Add copied from statements
      
      * Fix pretrained_config_archive_map
      
      * Fix docs
      
      * Add model to README
      
      * Apply suggestions from code review
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Apply more suggestions from code review
      
      * Make code more readable
      
      * Add ViltForNaturalLanguageVisualReasoning to the tests
      
      * Rename ViltForVisualQuestionAnswering to ViltForQuestionAnswering
      
      * Replace pixel_values_2 by single tensor
      
      * Add hidden_states and attentions
      
      * Fix one more test
      
      * Fix all tests
      
      * Update year
      
      * Fix rebase issues
      
      * Fix another rebase issue
      
      * Remove ViltForPreTraining from auto mapping
      
      * Rename ViltForImageRetrievalTextRetrieval to ViltForImageAndTextRetrieval
      
      * Make it possible to use BertTokenizerFast in the processor
      
      * Use BertTokenizerFast by default
      
      * Rename ViltForNaturalLanguageVisualReasoning, define custom model output
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
  15. 18 Jan, 2022 4 commits
    • Ignore empty subfolders when identifying submodules (#15204) · 84c60a7b
      Sylvain Gugger authored
      * Ignore empty subfolders when identifying submodules
      
      * Update utils/check_inits.py
    • Copies and docstring styling (#15202) · 1144d336
      Sylvain Gugger authored
      * Style docstrings when making/checking copies
      
      * Polish
    • Sylvain Gugger authored
      f6d3fee8
    • Add REALM (#13292) · 22454ae4
      Li-Huai (Allan) Lin authored
      
      
      * REALM initial commit
      
      * Retriever OK (Update new_gelu).
      
      * Encoder prediction score OK
      
      * Encoder pretrained model OK
      
      * Update retriever comments
      
      * Update docs, tests, and imports
      
      * Prune unused models
      
      * Make embedder as a module `RealmEmbedder`
      
      * Add RealmRetrieverOutput
      
      * Update tokenization
      
      * Pass all tests in test_modeling_realm.py
      
      * Prune RealmModel
      
      * Update docs
      
      * Add training test.
      
      * Remove completed TODO
      
      * Style & Quality
      
      * Prune `RealmModel`
      
      * Fixup
      
      * Changes:
      1. Remove RealmTokenizerFast
      2. Update docstrings
      3. Add a method to RealmTokenizer to handle candidates tokenization.
      
      * Fix up
      
      * Style
      
      * Add tokenization tests
      
      * Update `from_pretrained` tests
      
      * Apply suggestions
      
      * Style & Quality
      
      * Copy BERT model
      
      * Fix comment to avoid docstring copying
      
      * Make RealmBertModel private
      
      * Fix bug
      
      * Style
      
      * Basic QA
      
      * Save
      
      * Complete reader logits
      
      * Add searcher
      
      * Complete searcher & reader
      
      * Move block records init to constructor
      
      * Fix training bug
      
      * Add some outputs to RealmReader
      
      * Add finetuned checkpoint variable names parsing
      
      * Fix bug
      
      * Update REALM config
      
      * Add RealmForOpenQA
      
      * Update convert_tfrecord logits
      
      * Fix bugs
      
      * Complete imports
      
      * Update docs
      
      * Update naming
      
      * Add brute-force searcher
      
      * Pass realm model tests
      
      * Style
      
      * Exclude RealmReader from common tests
      
      * Fix
      
      * Fix
      
      * convert docs
      
      * up
      
      * up
      
      * more make style
      
      * up
      
      * upload
      
      * up
      
      * Fix
      
      * Update src/transformers/__init__.py
      
      * adapt testing
      
      * change modeling code
      
      * fix test
      
      * up
      
      * up
      
      * up
      
      * correct more
      
      * make retriever work
      
      * update
      
      * make style
      
      * finish main structure
      
      * Resolve merge conflict
      
      * Make everything work
      
      * Style
      
      * Fixup
      
      * Fixup
      
      * Update training test
      
      * fix retriever
      
      * remove hardcoded path
      
      * Fix
      
      * Fix modeling test
      
      * Update model links
      
      * Initial retrieval test
      
      * Fix modeling test
      
      * Complete retrieval tests
      
      * Fix
      
      * style
      
      * Fix tests
      
      * Fix docstring example
      
      * Minor fix of retrieval test
      
      * Update license headers and docs
      
      * Apply suggestions from code review
      
      * Style
      
      * Apply suggestions from code review
      
      * Add an example to RealmEmbedder
      
      * Fix
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
  16. 14 Jan, 2022 2 commits
  17. 11 Jan, 2022 2 commits
    • Update ONNX docs (#14904) · 16f0b7d7
      lewtun authored
      
      
      * Remove docs for deprecated ONNX export
      
      * Tidy up the CLI help messages
      
      * Revamp ONNX docs
      
      * Update auto-config table
      
      * Use DistilBERT as example for consistency
      
      * Wrap up first pass at ONNX docs
      
      * Fix table check
      
      * Add tweaks and introduction
      
      * Add cross-ref
      
      * Fix missing import
      
      * Fix style
      
      * Add permalinks to ONNX configs
      
      * Clarify role of OrderedDict
      
      * Update docs/source/serialization.mdx
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Add doctest syntax to code blocks
      
      * Remove permalinks
      
      * Revert "Remove permalinks"
      
      This reverts commit 099701daf0db27823457867938efdb2d4f22a7c1.
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
    • Doc styler tip (#15105) · 704d1fec
      Sylvain Gugger authored
      * Add new lines before/after tips
      
      * Check end of lines
  18. 10 Jan, 2022 3 commits
  19. 03 Jan, 2022 1 commit
  20. 28 Dec, 2021 1 commit
    • Doc styler examples (#14953) · b5e2b183
      Sylvain Gugger authored
      * Fix bad examples
      
      * Add black formatting to style_doc
      
      * Use first nonempty line
      
      * Put it at the right place
      
      * Don't add spaces to empty lines
      
      * Better templates
      
      * Deal with triple quotes in docstrings
      
      * Result of style_doc
      
      * Enable mdx treatment and fix code examples in MDXs
      
      * Result of doc styler on doc source files
      
      * Last fixes
      
      * Break copy from
  21. 27 Dec, 2021 1 commit
    • Doc styler v2 (#14950) · 87e6e4fe
      Sylvain Gugger authored
      * New doc styler
      
      * Fix issue with args at the start
      
      * Code sample fixes
      
      * Style code examples in MDX
      
      * Fix more patterns
      
      * Typo
      
      * Typo
      
      * More patterns
      
      * Do without black for now
      
      * Get more info in error
      
      * Docstring style
      
      * Re-enable check
      
      * Quality
      
      * Fix add_end_docstring decorator
      
      * Fix docstring
  22. 23 Dec, 2021 1 commit
    • Add TFCLIPModel (#13967) · 8f2cc1c3
      Yih-Dar authored
      
      
      * Start the work for TFCLIPModel
      
      * Convert to TF code (TODO: loss + doc)
      
      * Clean up
      
      * Fix pooled_output for TFCLIPTextTransformer - using tf.gather_nd
      
      * assert -> raise error
      
      * Expose TFCLIPModel
      
      * Deal with dummy_inputs
      
      * Add tests
      
      * Fix all tests. TODO: manual check weight loading + add more comments
      
      * Fix pt tf equivalence test
      
      * fixes
      
      * update TFCLIPVisionEmbeddings's Conv2D
      
      * Fix loss + overwrite test_pt_tf_model_equivalence from common
      
      * Add a comment about the change about MainLayer in test_keras_save_load
      
      * Set return_loss=True in TFCLIPModelTester + make tests pass
      
      * overwrite test_pt_tf_model_equivalence from tf common
      
      * fix base_model_prefix
      
      * Fix examples
      
      * remove unused
      
      * Apply suggestions from code review
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * apply review suggestions
      
      * change self.pre_layrnorm to self.pre_layernorm
      
      * apply more review suggestions
      
      * return attention probs before dropout (to align with PT)
      
      * fix weight init
      
      * fix
      
      * build doc
      
      * fix missing doc
      
      * fix for test
      Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
  23. 22 Dec, 2021 2 commits
  24. 21 Dec, 2021 2 commits
  25. 16 Dec, 2021 1 commit
    • Add Speaker Diarization and Verification heads (#14723) · 48463ebb
      Anton Lozhkov authored
      * Models
      
      * Squashed commit of the following:
      
      commit 72278e1e931a16d0879acc77f65762f3364833d0
      Author: anton-l <aglozhkov@gmail.com>
      Date:   Fri Dec 10 21:45:08 2021 +0300
      
      * Add unispeech heads
      
      * Add sd/sv automodels
      
      * Docs cleanup
      
      * Fix docstrings
      
      * rename xvector classes
      
      * examples
      
      * Tests cleanup
      
      * Style
      
      * Better checkpoints for tests
      
      * leftover docs
      
      * apply review suggestions
      
      * Style + init tests
      
      * Update unispeech-sat tdnn downsampling
  26. 15 Dec, 2021 2 commits