    Add a TF in-graph tokenizer for BERT (#17701) · ee0d001d
    Matt authored
    * Add a TF in-graph tokenizer for BERT
    
    * Add from_pretrained
    
    * Add proper truncation, option handling to match other tokenizers
    
    * Add proper imports and guards
    
    * Add test, fix all the bugs exposed by said test
    
    * Fix truncation of paired texts in graph mode, more test updates
    
    * Small fixes, add a (very careful) test for savedmodel
    
    * Add tensorflow-text dependency, make fixup
    
    * Update documentation
    
    * Update documentation
    
    * make fixup
    
    * Slight changes to tests
    
    * Add some docstring examples
    
    * Update tests
    
    * Update tests and add proper lowercasing/normalization
    
    * make fixup
    
    * Add docstring for padding!
    
    * Mark slow tests
    
    * make fixup
    
    * Fall back to BertTokenizerFast if BertTokenizer is unavailable
    
    * Fall back to BertTokenizerFast if BertTokenizer is unavailable
    
    * make fixup
    
    * Properly handle tensorflow-text dummies
check_dummies.py 6.04 KB