1. 18 Nov, 2021 1 commit
    • Add ImageGPT (#14240) · da36c557
      NielsRogge authored
      * First draft
      
      * More improvements
      
      * Improve conversion script
      
      * Fix init weights for layer norm
      
      * Fix correct model for conversion script
      
      * Don't tie input and output embeddings
      
      * Add print statements for debugging
      
      * Add print statements for debugging
      
      * Fix vocab size of model
      
      * Improve documentation, remove fast tokenizer
      
      * Add ImageGPTForImageClassification, improve docs
      
      * Fix docs issue
      
      * Set verbosity level back to info
      
      * Improve tests
      
      * Fix tests and add figure
      
      * Delete tokenizer file
      
      * Remove ImageGPTTokenizer from init files
      
      * Remove ImageGPTLayer from init files
      
      * Remove ImageGPT tokenizer from docs
      
      * First draft of ImageGPTFeatureExtractor
      
      * Fix typo
      
      * Fix bug
      
      * More improvements
      
      * Apply suggestions from code review, add tests for feature extractor
      
      * Fix layernorm
      
      * Update save_pretrained method
      
      * Fix issue
      
      * Make all tests of ImageGPTFeatureExtractor pass
      
      * Update code examples
      
      * Rename model inputs to pixel_values
      
      * Improve code examples
      
      * Update init_weights to post_init
      
      * Fix post_init
  2. 17 Nov, 2021 2 commits
    • [Bart] Fix docs (#14434) · 754202de
      Patrick von Platen authored
    • Improve semantic segmentation models (#14355) · a2864a50
      NielsRogge authored
      * Improve tests
      
      * Improve documentation
      
      * Add ignore_index attribute
      
      * Add semantic_ignore_index to BEiT model
      
      * Add segmentation maps argument to BEiTFeatureExtractor
      
      * Simplify SegformerFeatureExtractor and corresponding tests
      
      * Improve tests
      
      * Apply suggestions from code review
      
      * Minor docs improvements
      
      * Streamline segmentation map tests of SegFormer and BEiT
      
      * Improve reduce_labels docs and test
      
      * Fix code quality
      
      * Fix code quality again
  3. 15 Nov, 2021 1 commit
  4. 09 Nov, 2021 2 commits
    • Add TFViTModel (#13778) · be4a6c64
      Yih-Dar authored
      
      
      * Start the work for TFViTModel
      
      * Convert to TF code - need to check in the follow up commits
      
      * Clean up model code
      
      * Expose TFViTModel
      
      * make style
      
      * make quality
      
      * Add test
      
      * make style & quality
      
      * Fix some imports
      
      * fix wrong usage - *kwargs => ** kwargs
      
      * Fix Conv2D weight loading (PT->TF) issue
      
      * Add tests for images with different sizes + fix model
      
      * Fix some common tests for TFViTModel
      
      * Use inputs instead of input_ids in test_compile_tf_model
      
      * Add a comment about transpose and Conv2D in convert_tf_weight_name_to_pt_weight_name
      
      * Avoid transpose in TFViT call
      
      * Fix Conv2D issue in load_tf2_weights_in_pytorch_model
      
      * Use tf.keras.layers.Conv2D instead of tf.nn.conv2d
      
      * Using simpler heuristic to detect Conv2D layer
      
      * Change convert_tf_weight_name_to_pt_weight_name to return TransposeType
      
      * Check tf_weight_shape is not None before using it
      
      * Apply suggestions from code review
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * fix missing comma
      
      * fix input dtype
      Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
    • Add FlaxVisionEncoderDecoderModel (#13359) · 95b3ec3b
      Yih-Dar authored
      
      
      * Start the work on FlaxVisionEncoderDecoderModel
      
      * Add FlaxVisionEncoderDecoderModel
      
      * Add VisionEncoderDecoderConfig
      
      * Make FlaxVisionEncoderDecoderModel visible to transformers
      
      * Add test
      
      * Fix wrong getattr usage
      
      * Fix tests
      
      * Add FlaxAutoModelForVision2Seq
      
      * Expose FLAX_MODEL_FOR_VISION_2_SEQ_MAPPING
      
      * clean-up
      
      * add integration test
      
      * update expected logits
      
      * update expected scores
      
      * Add ViT2GPT2ModelIntegrationTest + some cleaning
      
      * Add projection layer + PT/Flax equivalence tests
      
      * Fix import
      
      * minor changes
      
      * make test slow again
      
      * Apply suggestions
      
      * Add modeling_flax_vision_encoder_decoder to _ignore_modules in get_model_modules()
      
      * fix copies
      
      * Apply suggestions from code review
      Co-authored-by: Suraj Patil <surajp815@gmail.com>
      
      * split long strings in multiple lines
      
      * decoder_input_ids can't be None
      
      * Add back test_configuration_tie
      
      * Remove attention_mask parameter
      
      * fix test - encoder_last_hidden_state should be encoder_outputs.last_hidden_state instead of the projected vector
      
      * Apply suggestions from code review
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * Remove more encoder_attention_mask
      
      * remove encoder_attention_mask when calling self.decode (in FlaxVisionEncoderDecoderModule)
      
      * Fix style + pass 1s instead of None as encoder_attention_mask
      
      * fix init_weights
      
      * pass None for encoder_attention_mask
      
      * pass 1s instead of None as encoder_attention_mask
      
      * Fix doc style
      Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
      Co-authored-by: Suraj Patil <surajp815@gmail.com>
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
  5. 06 Nov, 2021 1 commit
  6. 03 Nov, 2021 1 commit
    • Add LayoutXLMProcessor (and LayoutXLMTokenizer, LayoutXLMTokenizerFast) (#14115) · 5f789a68
      NielsRogge authored
      
      
      * Add LayoutXLMTokenizer and LayoutXLMTokenizerFast
      
      * Fix styling issues
      
      * Fix more styling issues
      
      * Fix more styling issues
      
      * Fix docstring
      
      * Fix unit tests
      
      * Fix docs
      
      * Fix unit tests
      
      * Fix typos and styling issues
      
      * Fix styling issues
      
      * Fix docstring
      
      * Make all tests of test_tokenization_layoutxlm pass
      
      * Add LayoutXLMProcessor
      
      * Make fixup
      
      * Make all LayoutXLMProcessor tests pass
      
      * Minor fixes
      
      * Leave LayoutLMv2Processor tests unchanged
      
      * Fix code quality
      
      * Move LayoutXLM tokenizers and processor to separate folder
      
      * Fix code quality
      
      * Apply suggestions from code review
      
      * Replace assertions by value errors
      
      * Remove methods from fast tokenizer
      Co-authored-by: King Yiu Suen <kingyiusuen@gmail.com>
  7. 02 Nov, 2021 2 commits
  8. 01 Nov, 2021 1 commit
    • Add BeitForSemanticSegmentation (#14096) · e20faa6f
      NielsRogge authored
      
      
      * Add first draft
      
      * Make forward pass work
      
      * Improve conversion script
      
      * Add notebook that checks if it works
      
      * Add BeitForSemanticSegmentation to the tests
      
      * More improvements
      
      * Make BeitForSemanticSegmentation consistent with Segformer
      
      * Small bug fix
      
      * Add BeitForSemanticSegmentation to docs
      
      * Make sure model doesn't output hidden states when the user doesn't want to
      
      * Make it possible to convert the large model
      
      * Fix issue
      
      * Fix conversion script for large model
      
      * Add auxiliary_head option to semantic segmentation model
      
      * Apply suggestions from @sgugger's review
      
      * Apply suggestions from code review
      
      * Fix failing test
      Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
  9. 29 Oct, 2021 1 commit
    • Add `BlenderbotTokenizerFast` (#13720) · d37f1fb8
      Daniel Stancl authored
      * Add support for the fast (Rust) implementation of BlenderbotTokenizer
      
      * Fix a converter and a typo in a doc
      
      * Apply the patil-suraj's suggestion
      
      * (Nitpick) Fast tokenization -> Fast Tokenization in doc
      
      * Apply the SaulLu's suggestion
      
      * Apply Narsil's suggestion to fix test pipelines
      
      * Add encoder_no_repeat_ngram_size according to the Narsil's suggestion
      
      * Revert the last (unnecessary) commit
      
      * Override pipeline config for Blenderbot to allow for larger pos. emb.
      
      * make fix-copies
  10. 28 Oct, 2021 1 commit
    • Add SegFormer (#14019) · 1dc96a76
      NielsRogge authored
      
      
      * First draft
      
      * Make style & quality
      
      * Improve conversion script
      
      * Add print statement to see actual slice
      
      * Make absolute tolerance smaller
      
      * Fix image classification models
      
      * Add post_process_semantic method
      
      * Disable padding
      
      * Improve conversion script
      
      * Rename to ForSemanticSegmentation, add integration test, remove post_process methods
      
      * Improve docs
      
      * Fix code quality
      
      * Fix feature extractor tests
      
      * Fix tests for image classification model
      
      * Delete file
      
      * Add is_torch_available to feature extractor
      
      * Improve documentation of feature extractor methods
      
      * Apply suggestions from @sgugger's code review
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Apply some more suggestions of code review
      
      * Rebase with master
      
      * Fix rebase issues
      
      * Make sure model only outputs hidden states when the user wants to
      
      * Apply suggestions from code review
      
      * Add pad method
      
      * Support padding of 2d images
      
      * Add print statement
      
      * Add print statement
      
      * Move padding method to SegformerFeatureExtractor
      
      * Fix issue
      
      * Add casting of segmentation maps
      
      * Add test for padding
      
      * Add small note about padding
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
  11. 26 Oct, 2021 1 commit
  12. 18 Oct, 2021 3 commits
  13. 15 Oct, 2021 1 commit
  14. 14 Oct, 2021 1 commit
  15. 13 Oct, 2021 1 commit
    • Add TrOCR + VisionEncoderDecoderModel (#13874) · 408b2d2b
      NielsRogge authored
      * First draft
      
      * Update self-attention of RoBERTa as proposition
      
      * Improve conversion script
      
      * Add TrOCR decoder-only model
      
      * More improvements
      
      * Make forward pass with pretrained weights work
      
      * More improvements
      
      * Some more improvements
      
      * More improvements
      
      * Make conversion work
      
      * Clean up print statements
      
      * Add documentation, processor
      
      * Add test files
      
      * Small improvements
      
      * Some more improvements
      
      * Make fix-copies, improve docs
      
      * Make all vision encoder decoder model tests pass
      
      * Make conversion script support other models
      
      * Update URL for OCR image
      
      * Update conversion script
      
      * Fix style & quality
      
      * Add support for the large-printed model
      
      * Fix some issues
      
      * Add print statement for debugging
      
      * Add print statements for debugging
      
      * Make possible fix for sinusoidal embedding
      
      * Further debugging
      
      * Potential fix v2
      
      * Add more print statements for debugging
      
      * Add more print statements for debugging
      
      * Debug more
      
      * Comment out print statements
      
      * Make conversion of large printed model possible, address review comments
      
      * Make it possible to convert the stage1 checkpoints
      
      * Clean up code, apply suggestions from code review
      
      * Apply suggestions from code review, use Microsoft models in tests
      
      * Rename encoder_hidden_size to cross_attention_hidden_size
      
      * Improve docs
  16. 12 Oct, 2021 1 commit
    • Add TFEncoderDecoderModel + Add cross-attention to some TF models (#13222) · 8b240a06
      Yih-Dar authored
      
      
      * Add cross attentions to TFGPT2Model
      
      * Add TFEncoderDecoderModel
      
      * Add TFBaseModelOutputWithPoolingAndCrossAttentions
      
      * Add cross attentions to TFBertModel
      
      * Fix past or past_key_values argument issue
      
      * Fix generation
      
      * Fix save and load
      
      * Add some checks and comments
      
      * Clean the code that deals with past keys/values
      
      * Add kwargs to processing_inputs
      
      * Add serving_output to TFEncoderDecoderModel
      
      * Some cleaning + fix use_cache value issue
      
      * Fix tests + add bert2bert/bert2gpt2 tests
      
      * Fix more tests
      
      * Ignore crossattention.bias when loading GPT2 weights into TFGPT2
      
      * Fix return_dict_in_generate in tf generation
      
      * Fix is_token_logit_eos_token bug in tf generation
      
      * Finalize the tests after fixing some bugs
      
      * Fix another is_token_logit_eos_token bug in tf generation
      
      * Add/Update docs
      
      * Add TFBertEncoderDecoderModelTest
      
      * Clean test script
      
      * Add TFEncoderDecoderModel to the library
      
      * Add cross attentions to TFRobertaModel
      
      * Add TFRobertaEncoderDecoderModelTest
      
      * make style
      
      * Change the way of position_ids computation
      
      * bug fix
      
      * Fix copies in tf_albert
      
      * Remove some copied from and apply some fix-copies
      
      * Remove some copied
      
      * Add cross attentions to some other TF models
      
      * Remove encoder_hidden_states from TFLayoutLMModel.call for now
      
      * Make style
      
      * Fix TFRemBertForCausalLM
      
      * Revert the change to longformer + Remove copies
      
      * Revert the change to albert and convbert + Remove copies
      
      * make quality
      
      * make style
      
      * Add TFRembertEncoderDecoderModelTest
      
      * make quality and fix-copies
      
      * test TFRobertaForCausalLM
      
      * Fixes for failed tests
      
      * Fixes for failed tests
      
      * fix more tests
      
      * Fixes for failed tests
      
      * Fix Auto mapping order
      
      * Fix TFRemBertEncoder return value
      
      * fix tf_rembert
      
      * Check copies are OK
      
      * Fix missing TFBaseModelOutputWithPastAndCrossAttentions is not defined
      
      * Add TFEncoderDecoderModelSaveLoadTests
      
      * fix tf weight loading
      
      * check the change of use_cache
      
      * Revert the change
      
      * Add missing test_for_causal_lm for TFRobertaModelTest
      
      * Try cleaning past
      
      * fix _reorder_cache
      
      * Revert some files to original versions
      
      * Keep as many copies as possible
      
      * Apply suggested changes - Use raise ValueError instead of assert
      
      * Move import to top
      
      * Fix wrong require_torch
      
      * Replace more assert by raise ValueError
      
      * Add test_pt_tf_model_equivalence (the test won't pass for now)
      
      * add test for loading/saving
      
      * finish
      
      * finish
      
      * Remove test_pt_tf_model_equivalence
      
      * Update tf modeling template
      
      * Remove pooling, added in the prev. commit, from MainLayer
      
      * Update tf modeling test template
      
      * Move inputs["use_cache"] = False to modeling_tf_utils.py
      
      * Fix torch.Tensor in the comment
      
      * fix use_cache
      
      * Fix missing use_cache in ElectraConfig
      
      * Add a note to from_pretrained
      
      * Fix style
      
      * Change test_encoder_decoder_save_load_from_encoder_decoder_from_pt
      
      * Fix TFMLP (in TFGPT2) activation issue
      
      * Fix None past_key_values value in serving_output
      
      * Don't call get_encoderdecoder_model in TFEncoderDecoderModelTest.test_configuration_tie until we have a TF checkpoint on Hub
      
      * Apply review suggestions - style for cross_attns in serving_output
      
      * Apply review suggestions - change assert + docstrings
      
      * break the error message to respect the char limit
      
      * deprecate the argument past
      
      * fix docstring style
      
      * Update the encoder-decoder rst file
      
      * fix Unknown interpreted text role "method"
      
      * fix typo
      Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
  17. 08 Oct, 2021 2 commits
  18. 04 Oct, 2021 2 commits
    • Add Mistral GPT-2 Stability Tweaks (#13573) · 3a8de58c
      Sidd Karamcheti authored
      
      
      * Add layer-wise scaling
      
      * Add reorder & upcasting argument
      
      * Add OpenAI GPT-2 weight initialization scheme
      
      * start `layer_idx` count at zero for consistency
      
      * disentangle attn and reordered and upscaled attn function
      
      * rename `scale_attn_by_layer` to `scale_attn_by_layer_id`
      
      * make autocast from amp compatible with pytorch<1.6
      
      * fix docstring
      
      * style fixes
      
      * Add fixes from PR feedback, style tweaks
      
      * Fix doc whitespace
      
      * Reformat
      
      * First pass scale_attn_by_layer_idx and reorder_and_upcast_attn tests
      
      * Rename scale_attn_by_layer_idx, add tip
      
      * Remove extra newline
      
      * add test for weight initialization
      
      * update code format
      
      * add assert check weights are fp32
      
      * remove assert
      
      * Fix incorrect merge
      
      * Fix shape mismatch in baddbmm
      
      * Add generation test for Mistral flags
      Co-authored-by: leandro <leandro.vonwerra@spoud.io>
      Co-authored-by: Keshav Santhanam <keshav2@stanford.edu>
      Co-authored-by: J38 <jebolton@stanford.edu>
    • [docs/gpt-j] fix typo (#13851) · 955fd4fe
      Yaser Abdelaziz authored
  19. 30 Sep, 2021 1 commit
  20. 29 Sep, 2021 1 commit
  21. 22 Sep, 2021 3 commits
  22. 21 Sep, 2021 2 commits
    • beit-flax (#13515) · a2dec768
      Kamal Raj authored
      * beit-flax
      
      * updated FLAX_BEIT_MLM_DOCSTRING
      
      * removed bool_masked_pos from classification
      
      * updated Copyright
      
      * code refactoring: x -> embeddings
      
      * updated test: rm from_pt
      
      * Update docs/source/model_doc/beit.rst
      
      * model code dtype updates and other changes according to review
      
      * relative_position_bias revert back to pytorch design
    • Add Speech AutoModels (#13655) · 48fa42e5
      Patrick von Platen authored
      * upload
      
      * correct
      
      * correct
      
      * correct
      
      * finish
      
      * up
      
      * up
      
      * up again
  23. 20 Sep, 2021 3 commits
    • Fix typo distilbert doc (#13643) · ea921365
      flozi00 authored
    • Fix mT5 documentation (#13639) · 04976a32
      Ayaka Mikazuki authored
      * Fix MT5 documentation
      
      The abstract is incomplete
      
      * MT5 -> mT5
    • Add FNet (#13045) · d8049331
      Gunjan Chhablani authored
      
      
      * Init FNet
      
      * Update config
      
      * Fix config
      
      * Update model classes
      
      * Update tokenizers to use sentencepiece
      
      * Fix errors in model
      
      * Fix defaults in config
      
      * Remove position embedding type completely
      
      * Fix typo and take only real numbers
      
      * Fix type vocab size in configuration
      
      * Add projection layer to embeddings
      
      * Fix position ids bug in embeddings
      
      * Add minor changes
      
      * Add conversion script and remove CausalLM vestiges
      
      * Fix conversion script
      
      * Fix conversion script
      
      * Remove CausalLM Test
      
      * Update checkpoint names to dummy checkpoints
      
      * Add tokenizer mapping
      
      * Fix modeling file and corresponding tests
      
      * Add tokenization test file
      
      * Add PreTraining model test
      
      * Make style and quality
      
      * Make tokenization base tests work
      
      * Update docs
      
      * Add FastTokenizer tests
      
      * Fix fast tokenizer special tokens
      
      * Fix style and quality
      
      * Remove load_tf_weights vestiges
      
      * Add FNet to main README
      
      * Fix configuration example indentation
      
      * Comment tokenization slow test
      
      * Fix style
      
      * Add changes from review
      
      * Fix style
      
      * Remove bos and eos tokens from tokenizers
      
      * Add tokenizer slow test, TPU transforms, NSP
      
      * Add scipy check
      
      * Add scipy availability check to test
      
      * Fix tokenizer and use correct inputs
      
      * Remove remaining TODOs
      
      * Fix tests
      
      * Fix tests
      
      * Comment Fourier Test
      
      * Uncomment Fourier Test
      
      * Change to google checkpoint
      
      * Add changes from review
      
      * Fix activation function
      
      * Fix model integration test
      
      * Add more integration tests
      
      * Add comparison steps to MLM integration test
      
      * Fix style
      
      * Add masked tokenization fix
      
      * Improve mask tokenization fix
      
      * Fix index docs
      
      * Add changes from review
      
      * Fix issue
      
      * Fix failing import in test
      
      * some more fixes
      
      * correct fast tokenizer
      
      * finalize
      
      * make style
      
      * Remove additional tokenization logic
      
      * Set do_lower_case to False
      
      * Allow keeping accents
      
      * Fix tokenization test
      
      * Fix FNet Tokenizer Fast
      
      * fix tests
      
      * make style
      
      * Add tips to FNet docs
      Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>
  24. 14 Sep, 2021 1 commit
    • [Flax] Addition of FlaxPegasus (#13420) · c1e47bf4
      Bhadresh Savani authored
      
      
      * added initial files
      
      * fixes pipeline
      
      * fixes style and quality
      
      * fixes doc issue and positional encoding
      
      * fixes layer norm and test
      
      * fixes quality issue
      
      * fixes code quality
      
      * removed extra layer norm
      
      * added layer norm back in encoder and decoder
      
      * added more code copy quality checks
      
      * update tests
      
      * Apply suggestions from code review
      
      * fix import
      
      * fix test
      Co-authored-by: patil-suraj <surajp815@gmail.com>
  25. 08 Sep, 2021 1 commit
  26. 07 Sep, 2021 1 commit
  27. 02 Sep, 2021 2 commits