17 Oct, 2022 (1 commit)
      TF port of ESM (#19587) · 3b3024da
      Matt authored
      
      
      * Partial TF port for ESM model
      
      * Add ESM-TF tests
      
      * Add the various imports for TF-ESM
      
      * TF weight conversion almost ready
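
      Once the PyTorch-to-TF conversion works, the usual way to exercise it is to cross-load a PyTorch checkpoint into the TF class with `from_pt=True`. The snippet below is a minimal usage sketch; the checkpoint name is only illustrative.

      ```python
      # Sketch: cross-load a PyTorch ESM checkpoint into the TF port.
      # `from_pt=True` triggers on-the-fly weight conversion; the checkpoint
      # name below is an illustrative placeholder.
      from transformers import TFEsmModel

      tf_model = TFEsmModel.from_pretrained("facebook/esm2_t6_8M_UR50D", from_pt=True)
      # Re-save the converted weights in TensorFlow format:
      tf_model.save_pretrained("./esm2-tf")
      ```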
      
      * Stop ignoring the decoder weights in PT
      
      * Add tests and lots of fixes
      
      * fix-copies
      
      * Fix imports, add model docs
      
      * Add get_vocab() to tokenizer
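
      A `get_vocab()` implementation for a slow tokenizer typically just exposes the token-to-id mapping plus any added tokens. The sketch below assumes the tokenizer keeps that mapping in a `_token_to_id` attribute, which is an assumption about its internals.

      ```python
      # Sketch of a typical get_vocab(); `_token_to_id` is an assumed attribute name.
      def get_vocab(self):
          # Return a copy so callers cannot mutate the tokenizer's own mapping,
          # and layer any added tokens on top of the base vocabulary.
          vocab = dict(self._token_to_id)
          vocab.update(self.added_tokens_encoder)
          return vocab
      ```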
      
      * Fix vocab links for pretrained files
      
      * Allow multiple inputs with a sep
      
      * Use EOS as SEP token because ESM vocab lacks SEP
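
      Since the ESM vocabulary has no dedicated SEP token, the EOS id can stand in as the separator when joining multiple inputs. A minimal sketch of that layout (pairs come out as `<cls> A <eos> B <eos>`):

      ```python
      # Sketch: joining sequences when the vocabulary lacks a dedicated SEP token.
      def build_inputs_with_special_tokens(self, token_ids_0, token_ids_1=None):
          cls = [self.cls_token_id]
          sep = [self.eos_token_id]  # EOS reused as the separator
          if token_ids_1 is None:
              return cls + token_ids_0 + sep
          return cls + token_ids_0 + sep + token_ids_1 + sep
      ```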
      
      * Correctly return special tokens mask from ESM tokenizer
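
      The special-tokens mask then has to agree with that layout, marking the CLS and the EOS-as-SEP positions with 1 and sequence tokens with 0. A sketch of the idea:

      ```python
      # Sketch: 1 marks an added special token, 0 marks a sequence token,
      # consistent with the <cls> ... <eos> layout above.
      def get_special_tokens_mask(self, token_ids_0, token_ids_1=None, already_has_special_tokens=False):
          if already_has_special_tokens:
              return [1 if tok in self.all_special_ids else 0 for tok in token_ids_0]
          mask = [1] + [0] * len(token_ids_0) + [1]
          if token_ids_1 is not None:
              mask += [0] * len(token_ids_1) + [1]
          return mask
      ```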
      
      * make fixup
      
      * Stop testing unsupported embedding resizing
      
      * Handle TF bias correctly
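
      In Keras, a bias that PyTorch stores as a separate parameter on the LM head has to be created explicitly in `build()` so it is tracked and can be matched during weight conversion. The sketch below shows the pattern; the layer and attribute names are illustrative, not the actual TF-ESM code.

      ```python
      import tensorflow as tf

      # Sketch: create the LM-head bias as an explicit Keras weight so it is
      # tracked by TensorFlow and can be matched against the PyTorch bias.
      class LMHead(tf.keras.layers.Layer):
          def __init__(self, config, **kwargs):
              super().__init__(**kwargs)
              self.vocab_size = config.vocab_size

          def build(self, input_shape):
              # Created in build() so the weight exists before loading checkpoints.
              self.bias = self.add_weight(
                  "bias", shape=(self.vocab_size,), initializer="zeros", trainable=True
              )
              super().build(input_shape)

          def call(self, hidden_states, decoder_weight):
              logits = tf.matmul(hidden_states, decoder_weight, transpose_b=True)
              return logits + self.bias
      ```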
      
      * Skip all models with slow tokenizers in the token classification test
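
      The guard in such a test usually checks `tokenizer.is_fast` and bails out early, since the token-classification pipeline relies on fast-tokenizer offset mappings. A sketch (the method name and signature are illustrative):

      ```python
      # Sketch: skip token-classification pipeline tests when only a slow
      # (Python) tokenizer is available; `is_fast` distinguishes the backends.
      def run_pipeline_test(self, token_classifier, examples):
          tokenizer = token_classifier.tokenizer
          if tokenizer is None or not tokenizer.is_fast:
              self.skipTest("Token classification requires a fast tokenizer for offset mapping")
          # ... rest of the test runs only for fast tokenizers
      ```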
      
      * Fix the pipeline batcher/unbatcher to accommodate the `None` being passed around.
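
      The idea is that batching helpers should pass a `None` output through untouched instead of trying to index or stack it. An illustrative sketch, not the actual pipeline internals:

      ```python
      # Illustrative sketch: unbatch a dict of model outputs while letting
      # None values pass through instead of being sliced.
      def unbatch(outputs, batch_size):
          unbatched = []
          for i in range(batch_size):
              item = {}
              for key, value in outputs.items():
                  item[key] = None if value is None else value[i]
              unbatched.append(item)
          return unbatched
      ```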
      
      * Fix a pipeline bug caused by the slow tokenizer behaving differently from the fast one.
      
      * Update src/transformers/models/esm/modeling_tf_esm.py
      Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
      
      * Update src/transformers/models/esm/modeling_tf_esm.py
      Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
      
      * Update src/transformers/models/esm/modeling_tf_esm.py
      Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
      
      * Update set_input_embeddings and the copyright notices
      Co-authored-by: Your Name <you@example.com>
      Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
      Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>