  1. 30 Jan, 2024 1 commit
    • Add tf_keras imports to prepare for Keras 3 (#28588) · 415e9a09
      Matt authored
      * Port core files + ESM (because ESM code is odd)
      
      * Search-replace in modelling code
      
      * Fix up transfo_xl as well
      
      * Fix other core files + tests (still need to add correct import to tests)
      
      * Fix cookiecutter
      
      * make fixup, fix imports in some more core files
      
      * Auto-add imports to tests
      
      * Cleanup, add imports to sagemaker tests
      
      * Use correct exception for importing tf_keras
      
      * Fixes in modeling_tf_utils
      
      * make fixup
      
      * Correct version parsing code
      
      * Ensure the pipeline tests correctly revert to float32 after each test
      
      * Ensure the pipeline tests correctly revert to float32 after each test
      
      * More tf.keras -> keras
      
      * Add dtype cast
      
      * Better imports of tf_keras
      
      * Add a cast for tf.assign, just in case
      
      * Fix callback imports
  2. 21 Dec, 2023 1 commit
    • Even more TF test fixes (#28146) · 260b9d21
      Matt authored
      * Fix vision text dual encoder
      
      * Small cleanup for wav2vec2 (not fixed yet)
      
      * Small fix for vision_encoder_decoder
      
      * Fix SAM builds
      
      * Update TFBertTokenizer test with modern exporting + tokenizer
      
      * Fix DeBERTa
      
      * Fix DeBERTav2
      
      * Try RAG fix but it's impossible to test locally
      
      * Actually fix RAG now that I got FAISS working somehow
      
      * Fix Wav2Vec2, add sermon
      
      * Fix Hubert
  3. 12 Jan, 2023 1 commit
  4. 14 Oct, 2022 1 commit
  5. 27 Jun, 2022 1 commit
    • Add a TF in-graph tokenizer for BERT (#17701) · ee0d001d
      Matt authored
      * Add a TF in-graph tokenizer for BERT
      
      * Add from_pretrained
      
      * Add proper truncation, option handling to match other tokenizers
      
      * Add proper imports and guards
      
      * Add test, fix all the bugs exposed by said test
      
      * Fix truncation of paired texts in graph mode, more test updates
      
      * Small fixes, add a (very careful) test for savedmodel
      
      * Add tensorflow-text dependency, make fixup
      
      * Update documentation
      
      * Update documentation
      
      * make fixup
      
      * Slight changes to tests
      
      * Add some docstring examples
      
      * Update tests
      
      * Update tests and add proper lowercasing/normalization
      
      * make fixup
      
      * Add docstring for padding!
      
      * Mark slow tests
      
      * make fixup
      
      * Fall back to BertTokenizerFast if BertTokenizer is unavailable
      
      * Fall back to BertTokenizerFast if BertTokenizer is unavailable
      
      * make fixup
      
      * Properly handle tensorflow-text dummies