1. 09 Nov, 2021 1 commit
    • Add FlaxVisionEncoderDecoderModel (#13359) · 95b3ec3b
      Yih-Dar authored
      
      
      * Start the work on FlaxVisionEncoderDecoderModel
      
      * Add FlaxVisionEncoderDecoderModel
      
      * Add VisionEncoderDecoderConfig
      
      * Make FlaxVisionEncoderDecoderModel visible to transformers
      
      * Add test
      
      * Fix wrong getattr usage
      
      * Fix tests
      
      * Add FlaxAutoModelForVision2Seq
      
      * Expose FLAX_MODEL_FOR_VISION_2_SEQ_MAPPING
      
      * clean-up
      
      * add integration test
      
      * update expected logits
      
      * update expected scores
      
      * Add ViT2GPT2ModelIntegrationTest + some cleaning
      
      * Add projection layer + PT/Flax equivalence tests
      
      * Fix import
      
      * minor changes
      
      * make test slow again
      
      * Apply suggestions
      
      * Add modeling_flax_vision_encoder_decoder to _ignore_modules in get_model_modules()
      
      * fix copies
      
      * Apply suggestions from code review
      Co-authored-by: Suraj Patil <surajp815@gmail.com>
      
      * split long strings in multiple lines
      
      * decoder_input_ids can't be None
      
      * Add back test_configuration_tie
      
      * Remove attention_mask parameter
      
      * fix test - encoder_last_hidden_state should be encoder_outputs.last_hidden_state instead of the projected vector
      
      * Apply suggestions from code review
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * Remove more encoder_attention_mask
      
      * remove encoder_attention_mask when calling self.decode (in FlaxVisionEncoderDecoderModule)
      
      * Fix style + pass 1s instead of None as encoder_attention_mask
      
      * fix init_weights
      
      * pass None for encoder_attention_mask
      
      * pass 1s instead of None as encoder_attention_mask
      
      * Fix doc style
      Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
      Co-authored-by: Suraj Patil <surajp815@gmail.com>
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
  2. 13 Oct, 2021 1 commit
    • Add TrOCR + VisionEncoderDecoderModel (#13874) · 408b2d2b
      NielsRogge authored
      * First draft
      
      * Update self-attention of RoBERTa as proposition
      
      * Improve conversion script
      
      * Add TrOCR decoder-only model
      
      * More improvements
      
      * Make forward pass with pretrained weights work
      
      * More improvements
      
      * Some more improvements
      
      * More improvements
      
      * Make conversion work
      
      * Clean up print statements
      
      * Add documentation, processor
      
      * Add test files
      
      * Small improvements
      
      * Some more improvements
      
      * Make fix-copies, improve docs
      
      * Make all vision encoder decoder model tests pass
      
      * Make conversion script support other models
      
      * Update URL for OCR image
      
      * Update conversion script
      
      * Fix style & quality
      
      * Add support for the large-printed model
      
      * Fix some issues
      
      * Add print statement for debugging
      
      * Add print statements for debugging
      
      * Make possible fix for sinusoidal embedding
      
      * Further debugging
      
      * Potential fix v2
      
      * Add more print statements for debugging
      
      * Add more print statements for debugging
      
      * Debug more
      
      * Comment out print statements
      
      * Make conversion of large printed model possible, address review comments
      
      * Make it possible to convert the stage1 checkpoints
      
      * Clean up code, apply suggestions from code review
      
      * Apply suggestions from code review, use Microsoft models in tests
      
      * Rename encoder_hidden_size to cross_attention_hidden_size
      
      * Improve docs
  3. 06 Sep, 2021 1 commit
  4. 02 Sep, 2021 2 commits
  5. 01 Sep, 2021 2 commits