    TF port of ESM (#19587) · 3b3024da
    Matt authored
    
    
    * Partial TF port for ESM model
    
    * Add ESM-TF tests
    
    * Add the various imports for TF-ESM
    
    * TF weight conversion almost ready
    
    * Stop ignoring the decoder weights in PT
    
    * Add tests and lots of fixes
    
    * fix-copies
    
    * Fix imports, add model docs
    
    * Add get_vocab() to tokenizer
    
    * Fix vocab links for pretrained files
    
    * Allow multiple inputs with a sep
    
    * Use EOS as SEP token because ESM vocab lacks SEP
    
    * Correctly return special tokens mask from ESM tokenizer
    
    * make fixup
    
    * Stop testing unsupported embedding resizing
    
    * Handle TF bias correctly
    
    * Skip all models with slow tokenizers in the token classification test
    
    * Fixing the batch/unbatcher of pipelines to accommodate the `None` being passed around.
    
    * Fixing pipeline bug caused by slow tokenizer being different.
    
    * Update src/transformers/models/esm/modeling_tf_esm.py
    Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
    
    * Update src/transformers/models/esm/modeling_tf_esm.py
    Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
    
    * Update src/transformers/models/esm/modeling_tf_esm.py
    Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
    
    * Update set_input_embeddings and the copyright notices
    Co-authored-by: Your Name <you@example.com>
    Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
    Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
tokenization_esm.py 5.55 KB