1. 09 Oct, 2020 3 commits
  2. 08 Oct, 2020 3 commits
    • Fix RobertaForCausalLM docs (#7642) · 4a00613c
      Lysandre Debut authored
      
      
      * Fix RobertaForCausalLM docs
      
      * Apply review suggestion
      Co-authored-by: sgugger <sylvain.gugger@gmail.com>
      4a00613c
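      For reference, the class whose docs are fixed above is RoBERTa used as a standalone decoder. A minimal usage sketch in the spirit of the corrected docstring (checkpoint name and input text are illustrative):

      ```python
      from transformers import RobertaConfig, RobertaForCausalLM, RobertaTokenizer

      # RobertaForCausalLM is meant to be used as a decoder, so the config
      # must have is_decoder=True before loading the model.
      config = RobertaConfig.from_pretrained("roberta-base")
      config.is_decoder = True

      tokenizer = RobertaTokenizer.from_pretrained("roberta-base")
      model = RobertaForCausalLM.from_pretrained("roberta-base", config=config)

      inputs = tokenizer("Hello, my dog is cute", return_tensors="pt")
      outputs = model(**inputs, return_dict=True)
      prediction_logits = outputs.logits  # next-token prediction scores
      ```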
    • Adding Fast tokenizers for SentencePiece based tokenizers - Breaking: remove Transfo-XL fast tokenizer (#7141) · 9aeacb58
      Thomas Wolf authored
      
      Adding Fast tokenizers for SentencePiece based tokenizers - Breaking: remove Transfo-XL fast tokenizer (#7141)
      
      * [WIP] SP tokenizers
      
      * fixing tests for T5
      
      * WIP tokenizers
      
      * serialization
      
      * update T5
      
      * WIP T5 tokenization
      
      * slow to fast conversion script
      
      * Refactoring to move tokenizer implementations inside transformers
      
      * Adding gpt - refactoring - quality
      
      * WIP adding several tokenizers to the fast world
      
      * WIP Roberta - moving implementations
      
      * update to dev4 switch file loading to in-memory loading
      
      * Updating and fixing
      
      * advancing on the tokenizers - updating do_lower_case
      
      * style and quality
      
      * moving forward with tokenizers conversion and tests
      
      * MBart, T5
      
      * dumping the fast version of transformer XL
      
      * Adding to autotokenizers + style/quality
      
      * update init and space_between_special_tokens
      
      * style and quality
      
      * bump up tokenizers version
      
      * add protobuf
      
      * fix pickle Bert JP with Mecab
      
      * fix newly added tokenizers
      
      * style and quality
      
      * fix bert japanese
      
      * fix funnel
      
      * limit tokenizer warning to one occurrence
      
      * clean up file
      
      * fix new tokenizers
      
      * fast tokenizers deep tests
      
      * WIP adding all the special fast tests on the new fast tokenizers
      
      * quick fix
      
      * adding more fast tokenizers in the fast tests
      
      * all tokenizers in fast version tested
      
      * Adding BertGenerationFast
      
      * bump up setup.py for CI
      
      * remove BertGenerationFast (too early)
      
      * bump up tokenizers version
      
      * Clean old docstrings
      
      * Typo
      
      * Update following Lysandre comments
      Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com>
      9aeacb58
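      In practical terms, this change lets SentencePiece-based models load a Rust-backed "fast" tokenizer, converted from the slow SentencePiece files. A minimal sketch, assuming the "t5-small" checkpoint:

      ```python
      from transformers import AutoTokenizer

      # use_fast=True requests the Rust-backed tokenizer; for SentencePiece
      # models it is produced by the slow-to-fast conversion added here.
      tok = AutoTokenizer.from_pretrained("t5-small", use_fast=True)
      print(type(tok).__name__)  # e.g. T5TokenizerFast

      # Offset mappings are a fast-tokenizer-only feature.
      enc = tok("Translate English to German: hello", return_offsets_mapping=True)
      print(enc["input_ids"])
      print(enc["offset_mapping"])
      ```

      Note the breaking part of the change: Transfo-XL no longer ships a fast tokenizer, so the equivalent call with a Transfo-XL checkpoint falls back to the slow implementation.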
    • Replaced torch.load with pickle.load for loading the pretrained vocab of the TransformerXL tokenizer (#6935) · 4d04120c
      Piero Molino authored
      
      Replaced torch.load with pickle.load for loading the pretrained vocab of the TransformerXL tokenizer (#6935)
      
      * Replaced torch.load with pickle.load for loading the pretrained vocab of TransformerXL
      
      * Replaced torch.save with pickle.dump when saving the vocabulary
      
      * updating transformer-xl
      
      * uploaded on S3 - compatibility
      
      * fix tests
      
      * style
      
      * Address review comments
      Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>
      Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
      4d04120c
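      The substance of the change is a serialization swap. A minimal sketch (file name and vocab contents are illustrative):

      ```python
      import pickle

      vocab = {"sym2idx": {"<eos>": 0}, "idx2sym": ["<eos>"]}

      # Before: torch.save(vocab, path) / torch.load(path).
      # After: plain pickle, since the vocab is ordinary Python data and
      # does not need a torch-specific (and torch-dependent) format.
      with open("vocab.pkl", "wb") as f:
          pickle.dump(vocab, f)

      with open("vocab.pkl", "rb") as f:
          restored = pickle.load(f)
      assert restored["sym2idx"]["<eos>"] == 0
      ```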
  3. 07 Oct, 2020 3 commits
  4. 06 Oct, 2020 8 commits
  5. 05 Oct, 2020 12 commits
    • Documentation fixes (#7585) · 03835af7
      Sylvain Gugger authored
      03835af7
    • Custom TF weights loading (#7422) · 9cf7b23b
      Julien Plu authored
      
      
      * First try
      
      * Fix TF utils
      
      * Handle authorized unexpected keys when loading weights
      
      * Add several more authorized unexpected keys
      
      * Apply style
      
      * Fix test
      
      * Address Patrick's comments.
      
      * Update src/transformers/modeling_tf_utils.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Update src/transformers/modeling_tf_utils.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Apply style
      
      * Make return_dict the default behavior and display a warning message
      
      * Revert
      
      * Replace wrong keyword
      
      * Revert code
      
      * Add forgot key
      
      * Fix bug in loading PT models from a TF one.
      
      * Fix sort
      
      * Add a test for custom load weights in BERT
      
      * Apply style
      
      * Remove unused import
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      9cf7b23b
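      The key idea in "authorized unexpected keys" is a per-model allow-list of weight names that may legitimately be left over when loading a checkpoint into a different head. An illustrative sketch; the attribute name and patterns below are assumptions, not the exact API:

      ```python
      import re

      class SketchTFModel:
          # Weights matching these patterns are expected leftovers (e.g. heads
          # the target architecture does not use) and should not trigger an
          # "unexpected keys" warning at load time.
          authorized_unexpected_keys = [r"pooler", r"mlm___cls"]

      def filter_unexpected(model, unexpected_keys):
          patterns = getattr(model, "authorized_unexpected_keys", None) or []
          return [k for k in unexpected_keys
                  if not any(re.search(p, k) for p in patterns)]

      keys = ["pooler/dense/kernel", "encoder/layer_0/attention/kernel"]
      print(filter_unexpected(SketchTFModel(), keys))
      # -> ['encoder/layer_0/attention/kernel']
      ```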
    • Sylvain Gugger authored · ca05c2a4
    • Add new dummy PT objects · 3bd3d8b5
      Sylvain Gugger authored
      3bd3d8b5
    • Allow soft dependencies in the namespace with ImportErrors at use (#7537) · 28d183c9
      Sylvain Gugger authored
      * PoC on RAG
      
      * Format class name/obj name
      
      * Better name in message
      
      * PoC on one TF model
      
      * Add PyTorch and TF dummy objects + script
      
      * Treat scikit-learn
      
      * Bad copy pastes
      
      * Typo
      28d183c9
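      The pattern behind this PR: when an optional backend is missing, the public name still exists in the namespace, but it is a dummy whose use raises an informative ImportError. A hedged sketch (names are illustrative, not the generated code verbatim):

      ```python
      PYTORCH_IMPORT_ERROR = (
          "{0} requires the PyTorch library but it was not found in your environment."
      )

      def requires_pytorch(obj):
          name = getattr(obj, "__name__", obj.__class__.__name__)
          raise ImportError(PYTORCH_IMPORT_ERROR.format(name))

      class RagModel:  # dummy stand-in generated when torch is unavailable
          def __init__(self, *args, **kwargs):
              requires_pytorch(self)

      # `from transformers import RagModel` now always succeeds; only
      # instantiation fails, with a message naming the missing dependency.
      # RagModel()  # ImportError: RagModel requires the PyTorch library ...
      ```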
    • Fix tokenization in SQuAD for RoBERTa, Longformer, BART (#7387) · ba5ea66e
      Malte Pietsch authored
      * fix squad tokenization for roberta & co
      
      * change to pure type based check
      
      * sort imports
      ba5ea66e
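      The "pure type based check" presumably means dispatching on the tokenizer's concrete class rather than on model-name strings. A hypothetical sketch of such a check (the helper name is made up):

      ```python
      from transformers import BartTokenizer, LongformerTokenizer, RobertaTokenizer

      # RoBERTa-style tokenizers insert two separator tokens between the
      # question and the context, which shifts SQuAD span offsets.
      MULTI_SEP_TOKENIZERS = (BartTokenizer, LongformerTokenizer, RobertaTokenizer)

      def uses_double_sep(tokenizer) -> bool:
          return isinstance(tokenizer, MULTI_SEP_TOKENIZERS)
      ```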
    • Sylvain Gugger authored · 0270256b
    • Add `power` argument for TF PolynomialDecay (#5732) · 60de910e
      Cola authored
      * Add `power` argument for TF PolynomialDecay

      * Create default optimizer with power

      * Add argument to training args

      * Clean code format

      * Fix black warning

      * Fix code format
      60de910e
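      The `power` argument maps onto the standard Keras schedule, where it sets the exponent of the decay curve (power=1.0, the default, is linear decay). The Keras side is shown below; how it is threaded through transformers' optimizer helpers follows the PR:

      ```python
      import tensorflow as tf

      lr_schedule = tf.keras.optimizers.schedules.PolynomialDecay(
          initial_learning_rate=5e-5,
          decay_steps=10_000,
          end_learning_rate=0.0,
          power=2.0,  # quadratic decay instead of the linear default
      )
      optimizer = tf.keras.optimizers.Adam(learning_rate=lr_schedule)
      ```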
    • Add Electra unexpected keys (#7569) · 41c3a3b9
      Lysandre Debut authored
      41c3a3b9
    • SqueezeBERT architecture (#7083) · 02ef825b
      Forrest Iandola authored
      * configuration_squeezebert.py
      
      thin wrapper around bert tokenizer
      
      fix typos
      
      wip sb model code
      
      wip modeling_squeezebert.py. Next step is to get the multi-layer-output interface working
      
      set up squeezebert to use BertModelOutput when returning results.
      
      squeezebert documentation
      
      formatting
      
      allow head mask that is an array of [None, ..., None]
      
      docs
      
      docs cont'd
      
      path to vocab
      
      docs and pointers to cloud files (WIP)
      
      line length and indentation
      
      squeezebert model cards
      
      formatting of model cards
      
      untrack modeling_squeezebert_scratchpad.py
      
      update aws paths to vocab and config files
      
      get rid of stub of NSP code, and advise users to pretrain with mlm only
      
      fix rebase issues
      
      redo rebase of modeling_auto.py
      
      fix issues with code formatting
      
      more code format auto-fixes
      
      move squeezebert before bert in tokenization_auto.py and modeling_auto.py because squeezebert inherits from bert
      
      tests for squeezebert modeling and tokenization
      
      fix typo
      
      move squeezebert before bert in modeling_auto.py to fix inheritance problem
      
      disable test_head_masking, since squeezebert doesn't yet implement head masking
      
      fix issues exposed by the test_modeling_squeezebert.py
      
      fix an issue exposed by test_tokenization_squeezebert.py
      
      fix issue exposed by test_modeling_squeezebert.py
      
      auto generated code style improvement
      
      issue that we inherited from modeling_xxx.py: SqueezeBertForMaskedLM.forward() calls self.cls(), but there is no self.cls, and I think the goal was actually to call self.lm_head()
      
      update copyright
      
      resolve failing 'test_hidden_states_output' and remove unused encoder_hidden_states and encoder_attention_mask
      
      docs
      
      add integration test. rename squeezebert-mnli --> squeezebert/squeezebert-mnli
      
      autogenerated formatting tweaks
      
      integrate feedback from patrickvonplaten and sgugger to programming style and documentation strings
      
      * tiny change to order of imports
      02ef825b
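      A minimal usage sketch with the "squeezebert/squeezebert-mnli" checkpoint named in the commit message (the premise/hypothesis pair is illustrative):

      ```python
      from transformers import AutoModelForSequenceClassification, AutoTokenizer

      tokenizer = AutoTokenizer.from_pretrained("squeezebert/squeezebert-mnli")
      model = AutoModelForSequenceClassification.from_pretrained(
          "squeezebert/squeezebert-mnli"
      )

      inputs = tokenizer(
          "A soccer game with multiple males playing.",
          "Some men are playing a sport.",
          return_tensors="pt",
      )
      logits = model(**inputs)[0]
      print(logits.argmax(-1))  # index of the predicted MNLI class
      ```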
    • Cleanup documentation for BART, Marian, MBART and Pegasus (#7523) · e2c935f5
      Sylvain Gugger authored
      * Cleanup documentation for BART, Marian, MBART and Pegasus
      
      * Cleanup documentation for BART, Marian, MBART and Pegasus
      e2c935f5
    • LayoutLM: add exception handling for bbox values (#7452) · 5e941bec
      Alexandr authored
      
      
      * LayoutLM: add exception handling for bbox values
      
      To replicate the unhandled error:

      - In `test_modeling_layoutlm.py`, set `range_bbox=1025`, i.e. greater than 1024
      - Run `pytest tests/test_modeling_layoutlm.py`

      The requirement that bbox values be within the range 0-1000 is documented,
      but when it is violated, the error message does not make the issue clear.
      
      * Update src/transformers/modeling_layoutlm.py
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
      5e941bec
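      A sketch of the guard this commit describes; the exact message and placement are assumptions, but the intent is to fail fast with a clear error instead of an opaque embedding IndexError:

      ```python
      import torch

      def check_bbox(bbox: torch.Tensor, max_position: int = 1024):
          if bbox.min() < 0 or bbox.max() >= max_position:
              raise ValueError(
                  "The `bbox` coordinate values should be within the 0-1000 range."
              )

      check_bbox(torch.tensor([[0, 10, 500, 1000]]))    # passes
      # check_bbox(torch.tensor([[0, 10, 500, 1025]]))  # raises ValueError
      ```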
  6. 04 Oct, 2020 1 commit
  7. 01 Oct, 2020 9 commits
  8. 30 Sep, 2020 1 commit