1. 08 Dec, 2021 3 commits
    • Add Perceiver IO (#14487) · 65b20b73
      NielsRogge authored
      * First draft
      
      * Style and remove mlm
      
      * Make forward pass work
      
      * More improvements
      
      * More improvements
      
      * Fix bug
      
      * More improvements
      
      * More improvements
      
      * Add PerceiverTokenizer first draft
      
      * Improve conversion script
      
      * More improvements
      
      * Make conversion script work for the encoder
      
      * Make conversion script work with local pickle files
      
      * Style & quality, fix-copies
      
      * Add dummy input to conversion script
      
      * Add absolute position embeddings to TextPreProcessor
      
      * Make forward pass of encoder work
      
      * More improvements
      
      * Move text preprocessor to separate script
      
      * More improvements
      
      * More improvements
      
      * Add post processor
      
      * Make MLM model work
      
      * Style
      
      * Add PerceiverForMaskedLM
      
      * Add PerceiverImagePreprocessor
      
      * Make style
      
      * Make PerceiverForImageClassification work
      
      * More improvements
      
      * More improvements
      
      * Use tokenizer in conversion script
      
      * Use PerceiverForMaskedLM in conversion script
      
      * Define custom PerceiverModelOutput
      
      * Improve PerceiverAttention to make it work for both MLM and image classification
      
      * More improvements
      
      * More improvements
      
      * More improvements to the conversion script
      
      * Make conversion script work for both MLM and image classification
      
      * Add PerceiverFeatureExtractor
      
      * More improvements
      
      * Style and quality
      
      * Add center cropping
      
      * Fix bug
      
      * Small fix
      
      * Add print statement
      
      * Fix bug in image preprocessor
      
      * Fix bug with conversion script
      
      * Make output position embeddings an nn.Parameter layer instead of nn.Embedding
      
      * Comment out print statements
      
      * Add position encoding classes
      
      * More improvements
      
      * Use position_encoding_kwargs
      
      * Add PerceiverForImageClassificationFourier
      
      * Make style & quality
      
      * Add PerceiverForImageClassificationConvProcessing
      
      * Style & quality
      
      * Add flow model
      
      * Move processors to modeling file
      
      * Make position encodings modular
      
      * Make basic decoder use modular position encodings
      
      * Add PerceiverForOpticalFlow to conversion script
      
      * Add AudioPreprocessor
      
      * Make it possible for the basic decoder to use Fourier position embeddings
      
      * Add PerceiverForMultimodalAutoencoding
      
      * Improve model for optical flow
      
      * Improve _build_network_inputs method
      
      * Add print statement
      
      * Fix device issue
      
      * Fix device of Fourier embeddings
      
      * Add print statements for debugging
      
      * Add another print statement
      
      * Add another print statement
      
      * Add another print statement
      
      * Add another print statement
      
      * Improve PerceiverAudioPreprocessor
      
      * Improve conversion script for multimodal model
      
      * More improvements
      
      * More improvements
      
      * Improve multimodal model
      
      * Make forward pass multimodal model work
      
      * More improvements
      
      * Improve tests
      
      * Fix some more tests
      
      * Add output dataclasses
      
      * Make more tests pass
      
      * Add print statements for debugging
      
      * Add tests for image classification
      
      * Add PerceiverClassifierOutput
      
      * More improvements
      
      * Make more tests pass for the optical flow model
      
      * Make style & quality
      
      * Small improvements
      
      * Don't support training for optical flow model for now
      
      * Fix _prepare_for_class for tests
      
      * Make more tests pass, add some docs
      
      * Add multimodal model to tests
      
      * Minor fixes
      
      * Fix tests
      
      * Improve conversion script
      
      * Make fixup
      
      * Remove pos_dim argument
      
      * Fix device issue
      
      * Potential fix for OOM
      
      * Revert previous commit
      
      * Fix test_initialization
      
      * Add print statements for debugging
      
      * Fix print statement
      
      * Add print statement
      
      * Add print statement
      
      * Add print statement
      
      * Add print statement
      
      * Add print statement
      
      * Add print statement
      
      * Remove need for output_shape
      
      * Comment out output_shape
      
      * Remove unnecessary code
      
      * Improve docs
      
      * Fix make fixup
      
      * Remove PerceiverTextProcessor from init
      
      * Improve docs
      
      * Small improvement
      
      * Apply first batch of suggestions from code review
      
      * Apply more suggestions from code review
      
      * Update docstrings
      
      * Define dicts beforehand for readability
      
      * Rename task to architecture in conversion script, include PerceiverModel in tests
      
      * Add print statements for debugging
      
      * Fix tests on GPU
      
      * Remove preprocessors, postprocessors and decoders from main init
      
      * Add integration test
      
      * Fix docs
      
      * Replace einops by torch
      
      * Update for new docs frontend
      
      * Rename PerceiverForImageClassification
      
      * Improve docs
      
      * Improve docs
      
      * Improve docs of PerceiverModel
      
      * Fix some more tests
      
      * Improve center_crop
      
      * Add PerceiverForSequenceClassification
      
      * Small improvements
      
      * Fix tests
      
      * Add integration test for optical flow model
      
      * Clean up
      
      * Add tests for tokenizer
      
      * Fix tokenizer by adding special tokens properly
      
      * Fix CI
    • [Wav2Vec2] PyCTCDecode Integration to support language model boosted decoding (#14339) · 961732c2
      Patrick von Platen authored
      
      
      * up
      
      * up
      
      * up
      
      * make it cleaner
      
      * correct
      
      * make styhahalal
      
      * add more tests
      
      * finish
      
      * small fix
      
      * make style
      
      * up
      
      * tryout to solve circle ci
      
      * up
      
      * fix more tests
      
      * fix more tests
      
      * apply sylvains suggestions
      
      * fix import
      
      * correct docs
      
      * add pyctcdecode only to speech tests
      
      * fix more tests
      
      * add tf, flax and pt tests
      
      * add pt
      
      * fix last tests
      
      * fix more tests
      
      * Apply suggestions from code review
      
      * change lines
      
      * Apply suggestions from code review
      Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
      
      * correct tests
      
      * correct tests
      
      * add doc string
      Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
    • Fixing Dataset for TQA + token-classification. (#14658) · 2e12d90b
      Nicolas Patry authored
      * Fixing Dataset for TQA + token-classification.
      
      * Fixing the tests.
      
      * Making sure `offset_mappings` is a valid argument.
  2. 07 Dec, 2021 2 commits
    • [deepspeed] fix --load_best_model_at_end (#14652) · b66c5ab2
      Stas Bekman authored
      * [deepspeed] fix load_best_model_at_end
      
      * try with pull_request_target
      
      * revert: try with pull_request_target
      
      * style
      
      * add test
      
      * cleanup
    • Add mLUKE (#14640) · 30646a0a
      Ryokan RI authored
      * implement MLukeTokenizer and LukeForMaskedLM
      
      * update tests
      
      * update docs
      
      * add LukeForMaskedLM to check_repo.py
      
      * update README
      
      * fix test and specify the entity pad id in tokenization_(m)luke
      
      * fix EntityPredictionHeadTransform
  3. 06 Dec, 2021 3 commits
    • Use cross_attention_hidden_size in Encoder-Decoder models (#14378) · 4cdb67ca
      Yih-Dar authored
      
      
      * add cross_attention_hidden_size to text-2-text encoder-decoder models (PT/Flax)
      
      * for TFEncoderDecoderModel
      
      * add equivalence test for TFEncoderDecoderModel
      
      * fix
      
      * fix failed equivalence tests
      
      * remove unused import
      
      * add detailed comment
      
      * Fix check_equivalence_tf_to_pt by using encoder/decoder
      
      * cleaning
      
      * Use cross_attention_hidden_size in speech-to-text
      
      * clean fast init logging msg in encoder decoder models
      
      * increase tol from 1e-5 to 1e-3 for tf test
      
      * style
      
      * style
      
      * make sure projection layer can run
      
      * remove type conversion + add check
      
      * fix conflict (config.output_hidden_size)
      
      * Remove TF -> PT in check_pt_tf_equivalence for TFEncoderDecoderModel
      Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
    • Auto processor fix (#14623) · e9688875
      Lysandre Debut authored
      
      
      * Add AutoProcessor class
      Init and tests
      Add doc
      Fix init
      Update src/transformers/models/auto/processing_auto.py
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
      Reverts to tokenizer or feature extractor when available
      Adapt test
      
      * Revert "Adapt test"
      
      This reverts commit bbdde5fab02465f24b54b227390073082cb32093.
      
      * Revert "Reverts to tokenizer or feature extractor when available"
      
      This reverts commit 77659ff5d21b6cc0baf6f443017e35e056a525bb.
      
      * Don't revert everything Lysandre!
      Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com>
    • Add GPTJForQuestionAnswering (#14503) · 0f3f045e
      tucan9389 authored
      
      
      * Add GPTJForQuestionAnswering
      
      * Reformat for GPTJForQuestionAnswering
      
      * Fix isort error
      
      * make style for GPTJForQA
      
      * Add _keys_to_ignore_on_load_missing
      
      * Change the sequence of qa and classification
      Co-authored-by: Suraj Patil <surajp815@gmail.com>
  4. 03 Dec, 2021 2 commits
  5. 02 Dec, 2021 2 commits
    • fix #14524 (IndexError when mask prob is too low) (#14525) · 6645eb61
      Nik authored
      * fix #14524 (IndexError when mask prob is too low)
      
      * fix formatting
      
      * correct documentation, add option for setting min_num_masks
      
      * change the semantic meaning of `mask_prob` in _compute_mask_indices
      
      With this commit the meaning of `mask_prob` actually adheres to the probability for each
      vector to be the start of a masked span of length.
      
      * fix check_copies test
      
      * fix documentation to semantic meaning of `upper bound of overall masking percentage`, revert changes to _compute_mask_indices
      
      * fix typo
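
The commit messages above describe `mask_prob` as an upper bound on the overall masking percentage and mention an option for a minimum number of masks. The sketch below illustrates that idea only; it is not the library's `_compute_mask_indices`. The function name `sketch_mask_spans` and its parameters are hypothetical, and it assumes spans are placed at uniformly sampled start positions.

```python
import random


def sketch_mask_spans(seq_len, mask_prob, mask_length, min_num_masks=0, seed=None):
    """Hypothetical illustration, not the transformers implementation."""
    if not 1 <= mask_length <= seq_len:
        raise ValueError("mask_length must be between 1 and seq_len")
    rng = random.Random(seed)

    # Choose the span count so that, if no spans overlapped, about
    # mask_prob of all positions would be masked. A very low mask_prob
    # rounds down to zero spans here.
    num_spans = int(mask_prob * seq_len / mask_length)

    # Optional floor, so that at least some spans are always masked.
    num_spans = max(num_spans, min_num_masks)

    # A span must fit inside the sequence, which limits the valid starts.
    max_starts = seq_len - mask_length + 1
    num_spans = min(num_spans, max_starts)

    mask = [False] * seq_len
    for start in rng.sample(range(max_starts), num_spans):
        for i in range(start, start + mask_length):
            mask[i] = True
    return mask


if __name__ == "__main__":
    mask = sketch_mask_spans(seq_len=100, mask_prob=0.2, mask_length=10, seed=0)
    print(sum(mask), "of", len(mask), "positions masked")
```

Because sampled spans may overlap, the masked fraction can fall below `mask_prob` but, with `min_num_masks=0`, cannot exceed it, which is why "upper bound" is the accurate description in this sketch.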
    • [Flax] Add FlaxBlenderbotSmall (#14576) · 50d909be
      Daniel Stancl authored
      
      
      * [WIP] Add FlaxBlenderbotSmall
      
      * Revert some unintentionally changed files
      
      Revert some files unintentionally changed by improperly filled cookiecutter instructions.
      
      * Fix repo consistency
      
      * Fix Flax-PT equivalence
      
      * Apply suggestions from code review
      
      * Update index.mdx
      
      * Apply suggestions from code review
      Co-authored-by: Suraj Patil <surajp815@gmail.com>
  6. 01 Dec, 2021 2 commits
  7. 30 Nov, 2021 3 commits
    • VisionTextDualEncoder (#13511) · fc1d97f2
      Suraj Patil authored
      
      
      * init vision_text_dual_encoder
      
      * fix merge
      
      * remove extra heads
      
      * fix tests
      
      * remove VISION_TEXT_DUAL_ENCODER_PRETRAINED_CONFIG_ARCHIVE_MAP
      
      * remove archive map
      
      * fix imports
      
      * fix more imports
      
      * fix init
      
      * delete tokenizers
      
      * fix imports
      
      * clean
      
      * support clip's vision model
      
      * handle None config
      
      * begin tests
      
      * more test and few fixes
      
      * warn about newly init weights
      
      * more tests
      
      * add loss to model
      
      * remove extra classes from doc
      
      * add processor
      
      * doc and small fixes
      
      * add start docstr
      
      * update flax model
      
      * flax tests
      
      * more flax tests
      
      * doc
      
      * quality
      
      * doc and quality
      
      * fix doc
      
      * doc
      
      * remove comments
      
      * update warning
      
      * quality
      
      * fix docs
      
      * Apply suggestions from code review
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * replace asserts, fix imports
      
      * update imports
      
      * fix import
      
      * address some review comments
      
      * fix check
      
      * reduce tolerance
      
      * fix test
      
      * add flax integration test
      
      * Apply suggestions from code review
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * address Sylvain's comments
      
      * fix style
      
      * add pt_flax_equivalence test in PT tests
      
      * add pt integration test
      
      * update test
      
      * use pre-trained checkpoint in examples
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
    • [Flax] Add FlaxBlenderbot (#13633) · faacd747
      Daniel Stancl authored
      
      
      * Init Flax implementation for Blenderbot
      
      * Add a majority of stuff except for tests
      
      * make style quality
      
      * Add tests and fix some bugs
      
      * Add tests
      
      * Clean source code and fix some bugs
      
      * Fix copies and docs
      
      * Fix jax device condition for tests
      
      * Fix layer norm in the encoder
      
      * Fix a few typos in the test file
      
      * make fix-copies
      
      * make fix-copies
      
      * fix layer norm
      
      * Fix Flax params dtype (#13090)
      
      * Fix PR reference (#13098)
      
      * make fix-copies
      
      * Update tests/test_modeling_flax_blenderbot.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      Co-authored-by: Suraj Patil <surajp815@gmail.com>
    • Tapas tf (#13393) · c468a87a
      Kamal Raj authored
      * TF Tapas first commit
      
      * updated docs
      
      * updated logger message
      
      * updated pytorch weight conversion
      script to support scalar array
      
      * added use_cache to tapas model config to
      work properly with tf input_processing
      
      * 1. rm embeddings_sum
      2. added # Copied
      3. + TFTapasMLMHead
      4. and lot other small fixes
      
      * updated docs
      
      * + test for tapas
      
      * updated testing_utils to check
      is_tensorflow_probability_available
      
      * converted model logits post processing using
      numpy to work with both PT and TF models
      
      * + TFAutoModelForTableQuestionAnswering
      
      * added TF support
      
      * added test for
      TFAutoModelForTableQuestionAnswering
      
      * added test for
      TFAutoModelForTableQuestionAnswering pipeline
      
      * updated auto model docs
      
      * fixed typo in import
      
      * added tensorflow_probability to run tests
      
      * updated MLM head
      
      * updated tapas.rst with TF  model docs
      
      * fixed optimizer import in docs
      
      * updated convert to np
      data from pt model is not
      `transformers.tokenization_utils_base.BatchEncoding`
      after pipeline upgrade
      
      * updated pipeline:
      1. with torch.no_grad removed, pipeline forward handles
      2. token_type_ids converted to numpy
      
      * updated docs.
      
      * removed `use_cache` from config
      
      * removed floats_tensor
      
      * updated code comment
      
      * updated Copyright Year and
      logits_aggregation Optional
      
      * updated docs and comments
      
      * updated docstring
      
      * fixed model weight loading
      
      * make fixup
      
      * fix indentation
      
      * added tf slow pipeline test
      
      * pip upgrade
      
      * upgrade python to 3.7
      
      * removed from_pt from tests
      
      * revert commit f18cfa9
  8. 29 Nov, 2021 1 commit
  9. 25 Nov, 2021 1 commit
  10. 24 Nov, 2021 1 commit
  11. 23 Nov, 2021 1 commit
  12. 22 Nov, 2021 2 commits
  13. 19 Nov, 2021 4 commits
  14. 18 Nov, 2021 3 commits
    • Sylvain Gugger's avatar
    • Add ImageGPT (#14240) · da36c557
      NielsRogge authored
      * First draft
      
      * More improvements
      
      * Improve conversion script
      
      * Fix init weights for layer norm
      
      * Fix correct model for conversion script
      
      * Don't tie input and output embeddings
      
      * Add print statements for debugging
      
      * Add print statements for debugging
      
      * Fix vocab size of model
      
      * Improve documentation, remove fast tokenizer
      
      * Add ImageGPTForImageClassification, improve docs
      
      * Fix docs issue
      
      * Set verbosity level back to info
      
      * Improve tests
      
      * Fix tests and add figure
      
      * Delete tokenizer file
      
      * Remove ImageGPTTokenizer from init files
      
      * Remove ImageGPTLayer from init files
      
      * Remove ImageGPT tokenizer from docs
      
      * First draft of ImageGPTFeatureExtractor
      
      * Fix typo
      
      * Fix bug
      
      * More improvements
      
      * Apply suggestions from code review, add tests for feature extractor
      
      * Fix layernorm
      
      * Update save_pretrained method
      
      * Fix issue
      
      * Make all tests of ImageGPTFeatureExtractor pass
      
      * Update code examples
      
      * Rename model inputs to pixel_values
      
      * Improve code examples
      
      * Update init_weights to post_init
      
      * Fix post_init
    • Add a post init method to all models (#14431) · d83b0e0c
      Sylvain Gugger authored
      * Add a post init method to all models
      
      * Fix tests
      
      * Fix last tests
      
      * Fix templates
      
      * Add comment
      
      * Forgot to save
  15. 17 Nov, 2021 3 commits
    • [WIP] Ensure TF model configs can be converted to proper JSON (#14415) · 1991da07
      N authored
      
      
      * test: make sure model configs are jsonifiable
      
      * fix: return python dict instead of config object
      
      * fix: accept pretrained config and use correct class
      
      * Re-enabling slow tests and applying them to core models only
      
      * Re-enabling slow tests and applying them to core models only
      
      * Add new test file to fetcher
      
      * Remove tooslow tests from test_modeling_tf_common.py
      
      * make style
      
      * Style fixes
      
      * Style fixes
      
      * Style fixes
      
      * Style fixes
      
      * Adding core tests to GPT2 and BART
      
      * Removing unused imports
      Co-authored-by: niklas.fruehauf <niklas.fruehauf@sovanta.com>
      Co-authored-by: matt <rocketknight1@gmail.com>
    • Improve semantic segmentation models (#14355) · a2864a50
      NielsRogge authored
      * Improve tests
      
      * Improve documentation
      
      * Add ignore_index attribute
      
      * Add semantic_ignore_index to BEiT model
      
      * Add segmentation maps argument to BEiTFeatureExtractor
      
      * Simplify SegformerFeatureExtractor and corresponding tests
      
      * Improve tests
      
      * Apply suggestions from code review
      
      * Minor docs improvements
      
      * Streamline segmentation map tests of SegFormer and BEiT
      
      * Improve reduce_labels docs and test
      
      * Fix code quality
      
      * Fix code quality again
    • [Wav2Vec2] Add New Wav2Vec2 Translation (#14392) · 700a748f
      Patrick von Platen authored
      * add new wav2vec2 translation
      
      * correct
      
      * up
      
      * add tests
      
      * correct end copy
      
      * correct more
      
      * up
      
      * correct unispeech sat
      
      * finish
      
      * finalize
      
      * finish
      
      * up
  16. 16 Nov, 2021 2 commits
    • Avoid looping when data exhausted (#14413) · a33168aa
      Valentin authored
      * stop training when a finite IterableDataset is exhausted
      
      when using an iterable dataset num_epochs is set to
      sys.maxsize to make sure all data is consumed
      likewise we want to set max_steps high enough
      but still stop when all data is consumed
      
      (cherry picked from commit 6f0e1d6363153da9051e93acffe1cbab3a3f3b12)
      
      * fix typo flase -> false
      
      * add test for stopping training on exhausted finite iterable dataset
      
      * remove redundant gradient_accumulation_steps
      
      * run make style
      
      reformat training_args docstring
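
The stopping rule described in this commit can be sketched as follows. This is a minimal illustration, not the `Trainer` code: the function `sketch_train` and its arguments are hypothetical, and a plain one-shot Python iterator stands in for a finite iterable dataset.

```python
import sys


def sketch_train(stream, max_steps=sys.maxsize):
    """Hypothetical loop: count steps until max_steps or data exhaustion."""
    # With an iterable dataset the length is unknown, so the epoch count
    # is set very high and the loop relies on other stopping conditions.
    num_epochs = sys.maxsize
    global_step = 0

    for _ in range(num_epochs):
        steps_in_epoch = 0
        for _batch in stream:
            global_step += 1  # stand-in for one optimizer step
            steps_in_epoch += 1
            if global_step >= max_steps:
                return global_step

        # An epoch that produced no batches means the finite stream is
        # exhausted; stop instead of spinning through empty epochs.
        if steps_in_epoch == 0:
            break

    return global_step


if __name__ == "__main__":
    print(sketch_train(iter(range(5))))               # stops once data runs out
    print(sketch_train(iter(range(5)), max_steps=3))  # stops at max_steps
```

Without the empty-epoch check, the outer loop in this sketch would keep iterating over an exhausted stream for up to `sys.maxsize` epochs.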
    • Fix gradient_checkpointing backward compatibility (#14408) · 040fd471
      Sylvain Gugger authored
      
      
      * Fix gradient_checkpointing backward compatibility
      
      * Remove needless line
      
      * make sure mask prob is big enough and length small enough
      
      * Fix tests
      Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>
  17. 15 Nov, 2021 4 commits
  18. 13 Nov, 2021 1 commit