1. 20 Sep, 2021 1 commit
    • Gunjan Chhablani's avatar
      Add FNet (#13045) · d8049331
      Gunjan Chhablani authored
      
      
      * Init FNet
      
      * Update config
      
      * Fix config
      
      * Update model classes
      
      * Update tokenizers to use sentencepiece
      
      * Fix errors in model
      
      * Fix defaults in config
      
      * Remove position embedding type completely
      
      * Fix typo and take only real numbers
      
      * Fix type vocab size in configuration
      
      * Add projection layer to embeddings
      
      * Fix position ids bug in embeddings
      
      * Add minor changes
      
      * Add conversion script and remove CausalLM vestiges
      
      * Fix conversion script
      
      * Fix conversion script
      
      * Remove CausalLM Test
      
      * Update checkpoint names to dummy checkpoints
      
      * Add tokenizer mapping
      
      * Fix modeling file and corresponding tests
      
      * Add tokenization test file
      
      * Add PreTraining model test
      
      * Make style and quality
      
      * Make tokenization base tests work
      
      * Update docs
      
      * Add FastTokenizer tests
      
      * Fix fast tokenizer special tokens
      
      * Fix style and quality
      
      * Remove load_tf_weights vestiges
      
      * Add FNet to  main README
      
      * Fix configuration example indentation
      
      * Comment tokenization slow test
      
      * Fix style
      
      * Add changes from review
      
      * Fix style
      
      * Remove bos and eos tokens from tokenizers
      
      * Add tokenizer slow test, TPU transforms, NSP
      
      * Add scipy check
      
      * Add scipy availabilty check to test
      
      * Fix tokenizer and use correct inputs
      
      * Remove remaining TODOs
      
      * Fix tests
      
      * Fix tests
      
      * Comment Fourier Test
      
      * Uncomment Fourier Test
      
      * Change to google checkpoint
      
      * Add changes from review
      
      * Fix activation function
      
      * Fix model integration test
      
      * Add more integration tests
      
      * Add comparison steps to MLM integration test
      
      * Fix style
      
      * Add masked tokenization fix
      
      * Improve mask tokenization fix
      
      * Fix index docs
      
      * Add changes from review
      
      * Fix issue
      
      * Fix failing import in test
      
      * some more fixes
      
      * correct fast tokenizer
      
      * finalize
      
      * make style
      
      * Remove additional tokenization logic
      
      * Set do_lower_case to False
      
      * Allow keeping accents
      
      * Fix tokenization test
      
      * Fix FNet Tokenizer Fast
      
      * fix tests
      
      * make style
      
      * Add tips to FNet docs
      Co-authored-by: default avatarpatrickvonplaten <patrick.v.platen@gmail.com>
      d8049331