1. 05 Oct, 2020 2 commits
    • [Model card] Java Code Summarizer model (#7568) · 071970fe
      Nathan Cooper authored
      
      
      * Create README.md
      
      * Update model_cards/ncoop57/bart-base-code-summarizer-java-v0/README.md
      Co-authored-by: Julien Chaumond <chaumond@gmail.com>
    • SqueezeBERT architecture (#7083) · 02ef825b
      Forrest Iandola authored
      * configuration_squeezebert.py
      
      thin wrapper around bert tokenizer
      
      fix typos
      
      wip sb model code
      
      wip modeling_squeezebert.py. Next step is to get the multi-layer-output interface working
      
      set up squeezebert to use BertModelOutput when returning results.
      
      squeezebert documentation
      
      formatting
      
      allow head mask that is an array of [None, ..., None]
      
      docs
      
      docs cont'd
      
      path to vocab
      
      docs and pointers to cloud files (WIP)
      
      line length and indentation
      
      squeezebert model cards
      
      formatting of model cards
      
      untrack modeling_squeezebert_scratchpad.py
      
      update aws paths to vocab and config files
      
      get rid of stub of NSP code, and advise users to pretrain with mlm only
      
      fix rebase issues
      
      redo rebase of modeling_auto.py
      
      fix issues with code formatting
      
      more code format auto-fixes
      
      move squeezebert before bert in tokenization_auto.py and modeling_auto.py because squeezebert inherits from bert
      
      tests for squeezebert modeling and tokenization
      
      fix typo
      
      move squeezebert before bert in modeling_auto.py to fix inheritance problem
      
      disable test_head_masking, since squeezebert doesn't yet implement head masking
      
      fix issues exposed by the test_modeling_squeezebert.py
      
      fix an issue exposed by test_tokenization_squeezebert.py
      
      fix issue exposed by test_modeling_squeezebert.py
      
      auto generated code style improvement
      
      issue that we inherited from modeling_xxx.py: SqueezeBertForMaskedLM.forward() calls self.cls(), but there is no self.cls, and I think the goal was actually to call self.lm_head()
      
      update copyright
      
      resolve failing 'test_hidden_states_output' and remove unused encoder_hidden_states and encoder_attention_mask
      
      docs
      
      add integration test. rename squeezebert-mnli --> squeezebert/squeezebert-mnli
      
      autogenerated formatting tweaks
      
      integrate feedback from patrickvonplaten and sgugger to programming style and documentation strings
      
      * tiny change to order of imports
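The ordering fix mentioned in the SqueezeBERT commit (moving it before BERT in tokenization_auto.py and modeling_auto.py) reflects how ordered auto-mappings resolve a config class: the first matching entry wins, so a subclass listed after its parent is shadowed. A minimal, self-contained sketch of that pitfall, using stand-in classes rather than the library's actual mapping code:

```python
# Sketch of why a subclass must precede its parent in an ordered
# auto-mapping: lookup by issubclass() returns the FIRST match.

class BertConfig:
    pass

class SqueezeBertConfig(BertConfig):  # SqueezeBERT inherits from BERT
    pass

def resolve(config_cls, mapping):
    """Return the model name for the first mapping entry that matches."""
    for cfg, name in mapping:
        if issubclass(config_cls, cfg):
            return name
    raise ValueError("no match")

# Wrong order: the BertConfig entry shadows its own subclass.
wrong = [(BertConfig, "bert"), (SqueezeBertConfig, "squeezebert")]
# Right order: most-derived classes first.
right = [(SqueezeBertConfig, "squeezebert"), (BertConfig, "bert")]

print(resolve(SqueezeBertConfig, wrong))  # bert  (wrong model picked)
print(resolve(SqueezeBertConfig, right))  # squeezebert
```

Listing most-derived classes first is the same design constraint the commit describes; the BERTweet/PhoBERT commit below reorders its tokenizers for the same reason.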
  2. 01 Oct, 2020 9 commits
  3. 30 Sep, 2020 1 commit
  4. 29 Sep, 2020 1 commit
  5. 28 Sep, 2020 3 commits
  6. 25 Sep, 2020 5 commits
  7. 22 Sep, 2020 2 commits
  8. 21 Sep, 2020 5 commits
  9. 19 Sep, 2020 4 commits
  10. 18 Sep, 2020 8 commits
    • Add new pre-trained models BERTweet and PhoBERT (#6129) · af2322c7
      Dat Quoc Nguyen authored
      * Add BERTweet and PhoBERT models
      
      * Update modeling_auto.py
      
      Re-add `bart` to LM_MAPPING
      
      * Update tokenization_auto.py
      
      Re-add `from .configuration_mobilebert import MobileBertConfig`
      not sure why it's replaced by `from transformers.configuration_mobilebert import MobileBertConfig`
      
      * Add BERTweet and PhoBERT to pretrained_models.rst
      
      * Update tokenization_auto.py
      
      Remove BertweetTokenizer and PhobertTokenizer from tokenization_auto.py (they are currently not supported by AutoTokenizer).
      
      * Update BertweetTokenizer - without nltk
      
      * Update model card for BERTweet
      
      * PhoBERT - with Auto mode - without import fastBPE
      
      * PhoBERT - with Auto mode - without import fastBPE
      
      * BERTweet - with Auto mode - without import fastBPE
      
      * Add PhoBERT and BERTweet to TF modeling auto
      
      * Improve Docstrings for PhobertTokenizer and BertweetTokenizer
      
      * Update PhoBERT and BERTweet model cards
      
      * Fixed a merge conflict in tokenization_auto
      
      * Used black to reformat BERTweet- and PhoBERT-related files
      
      * Used isort to reformat BERTweet- and PhoBERT-related files
      
      * Reformatted BERTweet- and PhoBERT-related files based on flake8
      
      * Updated test files
      
      * Updated test files
      
      * Updated tf test files
      
      * Updated tf test files
      
      * Updated tf test files
      
      * Updated tf test files
      
      * Update commits from huggingface
      
      * Delete unnecessary files
      
      * Add tokenizers to auto and init files
      
      * Add test files for tokenizers
      
      * Revised model cards
      
      * Update save_vocabulary function in BertweetTokenizer and PhobertTokenizer and test files
      
      * Revised test files
      
      * Update orders of Phobert and Bertweet tokenizers in auto tokenization file
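BertweetTokenizer normalizes tweets before BPE, mapping user mentions to @USER and links to HTTPURL. A rough, self-contained sketch of that normalization step, using simple regex rules as an illustration rather than the tokenizer's actual implementation:

```python
import re

def normalize_tweet(text):
    """Rough sketch of BERTweet-style tweet normalization:
    user mentions become @USER, URLs become HTTPURL."""
    text = re.sub(r"@\w+", "@USER", text)          # @someone -> @USER
    text = re.sub(r"https?://\S+", "HTTPURL", text)  # links -> HTTPURL
    return text

print(normalize_tweet("@bob check https://example.com now"))
# @USER check HTTPURL now
```

Collapsing mentions and URLs to fixed placeholders keeps the vocabulary small and lets the model treat all handles and links uniformly, which matters for noisy Twitter text.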
    • Create README.md · 9397436e
      Patrick von Platen authored
    • Create README.md · 7eeca4d3
      Patrick von Platen authored
    • Update README.md · 31516c77
      Patrick von Platen authored
    • Update README.md · 4c14669a
      Patrick von Platen authored
    • [model_cards] · eef8d94d
      Julien Chaumond authored
      We use ISO 639-1 cc @gentaiscool
    • Create README.md · afd6a9f8
      Patrick von Platen authored
    • Create README.md · 9f1544b9
      Patrick von Platen authored