1. 13 Oct, 2021 1 commit
    • Add TrOCR + VisionEncoderDecoderModel (#13874) · 408b2d2b
      NielsRogge authored
      * First draft
      
      * Update self-attention of RoBERTa as proposition
      
      * Improve conversion script
      
      * Add TrOCR decoder-only model
      
      * More improvements
      
      * Make forward pass with pretrained weights work
      
      * More improvements
      
      * Some more improvements
      
      * More improvements
      
      * Make conversion work
      
      * Clean up print statements
      
      * Add documentation, processor
      
      * Add test files
      
      * Small improvements
      
      * Some more improvements
      
      * Make fix-copies, improve docs
      
      * Make all vision encoder decoder model tests pass
      
      * Make conversion script support other models
      
      * Update URL for OCR image
      
      * Update conversion script
      
      * Fix style & quality
      
      * Add support for the large-printed model
      
      * Fix some issues
      
      * Add print statement for debugging
      
      * Add print statements for debugging
      
      * Make possible fix for sinusoidal embedding
      
      * Further debugging
      
      * Potential fix v2
      
      * Add more print statements for debugging
      
      * Add more print statements for debugging
      
      * Debug more
      
      * Comment out print statements
      
      * Make conversion of large printed model possible, address review comments
      
      * Make it possible to convert the stage1 checkpoints
      
      * Clean up code, apply suggestions from code review
      
      * Apply suggestions from code review, use Microsoft models in tests
      
      * Rename encoder_hidden_size to cross_attention_hidden_size
      
      * Improve docs
  2. 29 Sep, 2021 1 commit
  3. 20 Sep, 2021 1 commit
    • Add FNet (#13045) · d8049331
      Gunjan Chhablani authored
      * Init FNet
      
      * Update config
      
      * Fix config
      
      * Update model classes
      
      * Update tokenizers to use sentencepiece
      
      * Fix errors in model
      
      * Fix defaults in config
      
      * Remove position embedding type completely
      
      * Fix typo and take only real numbers
      
      * Fix type vocab size in configuration
      
      * Add projection layer to embeddings
      
      * Fix position ids bug in embeddings
      
      * Add minor changes
      
      * Add conversion script and remove CausalLM vestiges
      
      * Fix conversion script
      
      * Fix conversion script
      
      * Remove CausalLM Test
      
      * Update checkpoint names to dummy checkpoints
      
      * Add tokenizer mapping
      
      * Fix modeling file and corresponding tests
      
      * Add tokenization test file
      
      * Add PreTraining model test
      
      * Make style and quality
      
      * Make tokenization base tests work
      
      * Update docs
      
      * Add FastTokenizer tests
      
      * Fix fast tokenizer special tokens
      
      * Fix style and quality
      
      * Remove load_tf_weights vestiges
      
      * Add FNet to main README
      
      * Fix configuration example indentation
      
      * Comment tokenization slow test
      
      * Fix style
      
      * Add changes from review
      
      * Fix style
      
      * Remove bos and eos tokens from tokenizers
      
      * Add tokenizer slow test, TPU transforms, NSP
      
      * Add scipy check
      
      * Add scipy availability check to test
      
      * Fix tokenizer and use correct inputs
      
      * Remove remaining TODOs
      
      * Fix tests
      
      * Fix tests
      
      * Comment Fourier Test
      
      * Uncomment Fourier Test
      
      * Change to google checkpoint
      
      * Add changes from review
      
      * Fix activation function
      
      * Fix model integration test
      
      * Add more integration tests
      
      * Add comparison steps to MLM integration test
      
      * Fix style
      
      * Add masked tokenization fix
      
      * Improve mask tokenization fix
      
      * Fix index docs
      
      * Add changes from review
      
      * Fix issue
      
      * Fix failing import in test
      
      * some more fixes
      
      * correct fast tokenizer
      
      * finalize
      
      * make style
      
      * Remove additional tokenization logic
      
      * Set do_lower_case to False
      
      * Allow keeping accents
      
      * Fix tokenization test
      
      * Fix FNet Tokenizer Fast
      
      * fix tests
      
      * make style
      
      * Add tips to FNet docs
      Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>
  4. 08 Sep, 2021 1 commit
  5. 15 Jul, 2021 1 commit
    • Translate README.md to Traditional Chinese (#12701) · 2349ac58
      qqaatw authored
      * Add README_zh-tw.md
      
      * Add links to each README.
      
      * Fix a mismatched term.
      
      * Minor improvements.
      
      * Rename language code to be more inclusive.
      
      * Polish terms to make them fluent.
      
      * Remove redundant spaces.
      
      * Fix typo.
  6. 12 Jul, 2021 2 commits