1. 11 Oct, 2021 1 commit
  2. 17 Sep, 2021 1 commit
    • Optimize Token Classification models for TPU (#13096) · eae7a96b
      Ibraheem Moosa authored
      * Optimize Token Classification models for TPU
      
      According to the XLA documentation, XLA cannot handle masked indexing well. For this reason, the token
      classification models for BERT and others use an implementation based on `torch.where`, which
      works well on TPU.
      
      The ALBERT token classification model uses masked indexing, which causes performance issues
      on TPU. This PR fixes the issue by following the BERT implementation.
      
      * Same fix for ELECTRA
      
      * Same fix for LayoutLM
      eae7a96b
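The difference between the two approaches named in this commit can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from the PR: the function names `token_loss_masked_indexing` and `token_loss_where` are hypothetical, and the sketch relies on the default `ignore_index` of `nn.CrossEntropyLoss`.

```python
import torch
from torch import nn


def token_loss_masked_indexing(logits, labels, attention_mask):
    """Boolean indexing: the result size depends on the mask contents."""
    num_labels = logits.size(-1)
    active = attention_mask.view(-1) == 1
    active_logits = logits.view(-1, num_labels)[active]  # data-dependent shape
    active_labels = labels.view(-1)[active]
    return nn.CrossEntropyLoss()(active_logits, active_labels)


def token_loss_where(logits, labels, attention_mask):
    """torch.where: shapes stay fixed, padded positions get ignore_index."""
    loss_fct = nn.CrossEntropyLoss()
    num_labels = logits.size(-1)
    active = attention_mask.view(-1) == 1
    ignore = torch.tensor(loss_fct.ignore_index).type_as(labels)
    active_labels = torch.where(active, labels.view(-1), ignore)
    return loss_fct(logits.view(-1, num_labels), active_labels)


if __name__ == "__main__":
    torch.manual_seed(0)
    logits = torch.randn(2, 4, 3)
    labels = torch.randint(0, 3, (2, 4))
    mask = torch.tensor([[1, 1, 1, 0], [1, 1, 0, 0]])
    print(token_loss_masked_indexing(logits, labels, mask))
    print(token_loss_where(logits, labels, mask))
```

Both functions should return the same loss value. The second keeps every tensor at a fixed shape, which is the property the commit message relies on for TPU.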
  3. 31 Aug, 2021 1 commit
  4. 23 Aug, 2021 1 commit
  5. 12 Aug, 2021 1 commit
    • Fix classifier dropout in AlbertForMultipleChoice (#13087) · 3f52c685
      Ibraheem Moosa authored
      The classification head of AlbertForMultipleChoice uses `hidden_dropout_prob` instead of `classifier_dropout_prob`. This
      is not desirable, as we cannot change the classifier head dropout probability without changing the dropout probabilities of
      the whole model.
      3f52c685
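A minimal sketch of the idea behind this fix, assuming a simplified head: the classes `HeadConfig` and `MultipleChoiceHead` are hypothetical stand-ins, and only the two config field names come from the commit message.

```python
from dataclasses import dataclass

from torch import nn


@dataclass
class HeadConfig:
    """Hypothetical config holding only the fields this sketch needs."""
    hidden_size: int = 8
    hidden_dropout_prob: float = 0.0
    classifier_dropout_prob: float = 0.1


class MultipleChoiceHead(nn.Module):
    """Hypothetical classification head with its own dropout setting."""

    def __init__(self, config):
        super().__init__()
        # Read the classifier-specific probability, not hidden_dropout_prob,
        # so the head can be tuned without touching the rest of the model.
        self.dropout = nn.Dropout(config.classifier_dropout_prob)
        self.classifier = nn.Linear(config.hidden_size, 1)

    def forward(self, pooled_output):
        return self.classifier(self.dropout(pooled_output))
```

With this arrangement, changing `classifier_dropout_prob` affects only the head, while `hidden_dropout_prob` stays whatever the rest of the model uses.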
  6. 06 Aug, 2021 1 commit
    • Tpu tie weights (#13030) · 7fcee113
      Sylvain Gugger authored
      * Fix tied weights on TPU
      
      * Manually tie weights in no trainer examples
      
      * Fix for test
      
      * One last missing
      
      * Getting owned by my scripts
      
      * Address review comments
      
      * Fix test
      
      * Fix tests
      
      * Fix reformer tests
      7fcee113
  7. 26 Jul, 2021 1 commit
  8. 28 Jun, 2021 1 commit
  9. 22 Jun, 2021 1 commit
    • Fix for the issue of device-id getting hardcoded for token_type_ids during Tracing [WIP] (#11252) · af6e01c5
      Hamid Shojanazeri authored
      
      
      * registering a buffer for token_type_ids, to pass the error of device-id getting hardcoded when tracing
      
      * style format
      
      * adding persistent flag to the registered buffers, which prevents adding them to the state_dict and addresses the backward compatibility issue
      
      * adding the try catch to the fix as persistent flag is only available from PT >1.6
      
      * adding version check
      
      * added the condition to only use the token_type_ids buffer when it's autogenerated, not passed by the user
      
      * adding comments and making the condition where token_type_ids are None to use the registered buffer
      
      * taking out position-embedding from the if block
      
      * adding comments
      
      * handling the case if buffer for position_ids was not registered
      
      * reverted the changes on position_ids, fixed the issue with the size of the token_type_ids buffer, moved the modification for generated token_type_ids to BertModel instead of Embeddings
      
      * reverting the token_type_ids in case of None to the previous version
      
      * reverting changes on position_ids adding back the if block
      
      * changes added by running make fix-copies
      
      * changes added by running make fix-copies and added the import version as it was getting used
      
      * changes added by running make fix-copies
      
      * changes added by running make fix-copies
      
      * fixing the import format
      
      * fixing the import format
      
      * modified to use temp tensor for trimmed and expanded token_type_ids buffer
      
      * changes made by fix-copies after temp tensor modifications
      
      * changes made by fix-copies after temp tensor modifications
      
      * changes made by fix-copies after temp tensor modifications
      
      * clean up
      
      * clean up
      
      * clean up
      
      * clean up
      
      * Nit
      
      * Nit
      
      * Nit
      
      * modified according to support device conversion on traced models
      
      * modified according to support device conversion on traced models
      
      * modified according to support device conversion on traced models
      
      * modified according to support device conversion on traced models
      
      * changes based on latest in master
      
      * Adapt templates
      
      * Add version import
      Co-authored-by: Ubuntu <ubuntu@ip-172-31-32-81.us-west-2.compute.internal>
      Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
      af6e01c5
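The buffer approach described in this commit can be sketched roughly as below. This is a simplified illustration, not the code from the PR: the class `TinyEmbeddings` and its sizes are hypothetical, and it assumes a PyTorch version in which `register_buffer` accepts the `persistent` argument (the commit notes this is only available from PT >1.6).

```python
import torch
from torch import nn


class TinyEmbeddings(nn.Module):
    """Hypothetical embedding module with a registered token_type_ids buffer."""

    def __init__(self, max_positions=8, type_vocab_size=2, hidden_size=4):
        super().__init__()
        self.token_type_embeddings = nn.Embedding(type_vocab_size, hidden_size)
        # A buffer moves with the module across devices, so tracing does not
        # bake in a device id. persistent=False keeps it out of state_dict,
        # so older checkpoints still load.
        self.register_buffer(
            "token_type_ids",
            torch.zeros((1, max_positions), dtype=torch.long),
            persistent=False,
        )

    def forward(self, input_ids, token_type_ids=None):
        if token_type_ids is None:
            # Use the buffer only when the caller did not pass token_type_ids:
            # trim it to the sequence length and expand it over the batch.
            batch_size, seq_len = input_ids.shape
            token_type_ids = self.token_type_ids[:, :seq_len].expand(batch_size, seq_len)
        return self.token_type_embeddings(token_type_ids)
```

Calling `TinyEmbeddings()(torch.ones(2, 5, dtype=torch.long))` exercises the autogenerated path; passing `token_type_ids` explicitly bypasses the buffer.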
  10. 14 Jun, 2021 1 commit
  11. 07 Jun, 2021 1 commit
    • Fixes bug that appears when using QA bert and distillation. (#12026) · f8bd8c6c
      François Lagunas authored
      * Fixing bug that appears when using distillation (and potentially other uses).
      During backward pass PyTorch complains with:
      RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation
      This happens because the QA model code modifies the start_positions and end_positions input tensors using the clamp_ function. As a consequence, the teacher and the student both modify the inputs, and the backward pass fails.
      
      * Fixing all models QA clamp_ bug.
      f8bd8c6c
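The side effect behind this bug can be illustrated with a small sketch. It does not reproduce the `RuntimeError` from the commit message; it only shows that `clamp_` mutates the caller's tensor while `clamp` returns a new one. The helper names `clamp_positions_inplace` and `clamp_positions` are hypothetical.

```python
import torch


def clamp_positions_inplace(positions, ignored_index):
    """In-place variant: the caller's tensor is modified."""
    positions.clamp_(0, ignored_index)
    return positions


def clamp_positions(positions, ignored_index):
    """Out-of-place variant: the caller's tensor is left untouched."""
    return positions.clamp(0, ignored_index)


if __name__ == "__main__":
    start_positions = torch.tensor([3, 700, -1])
    clamped = clamp_positions(start_positions, 512)
    print(start_positions.tolist(), clamped.tolist())
```

When two models (for example a teacher and a student) receive the same position tensors, the out-of-place variant ensures that neither call changes what the other one sees.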
  12. 01 Jun, 2021 1 commit
  13. 20 May, 2021 1 commit
  14. 06 May, 2021 1 commit
  15. 04 May, 2021 1 commit
  16. 26 Apr, 2021 1 commit
  17. 31 Mar, 2021 1 commit
  18. 05 Mar, 2021 2 commits
    • Refactoring checkpoint names for multiple models (#10527) · 90ecc296
      Daniel Hug authored
      * Refactor checkpoint name in ALBERT and ALBERT_tf
      
      * Refactor checkpoint name in BART and BART_tf
      
      * Refactor checkpoint name in BERT generation
      
      * Refactor checkpoint name in Blenderbot_tf
      
      * Refactor checkpoint name in Blenderbot_small_tf
      
      * Refactor checkpoint name in ConvBERT AND CONVBERT_TF
      
      * Refactor checkpoint name in CTRL AND CTRL_TF
      
      * Refactor checkpoint name in DistilBERT AND DistilBERT_TF
      
      * Refactor checkpoint name in DistilBERT redo
      
      * Refactor checkpoint name in Electra and Electra_tf
      
      * Refactor checkpoint name in FlauBERT and FlauBERT_tf
      
      * Refactor checkpoint name in FSMT
      
      * Refactor checkpoint name in GPT2 and GPT2_tf
      
      * Refactor checkpoint name in IBERT
      
      * Refactor checkpoint name in LED and LED_tf
      
      * Refactor checkpoint name in Longformer and Longformer_tf
      
      * Refactor checkpoint name in Lxmert and Lxmert_tf
      
      * Refactor checkpoint name in Marian_tf
      
      * Refactor checkpoint name in MBART and MBART_tf
      
      * Refactor checkpoint name in MobileBERT and MobileBERT_tf
      
      * Refactor checkpoint name in mpnet and mpnet_tf
      
      * Refactor checkpoint name in openai and openai_tf
      
      * Refactor checkpoint name in pegasus_tf
      
      * Refactor checkpoint name in reformer
      
      * Refactor checkpoint name in Roberta and Roberta_tf
      
      * Refactor checkpoint name in SqueezeBert
      
      * Refactor checkpoint name in Transformer_xl and Transformer_xl_tf
      
      * Refactor checkpoint name in XLM and XLM_tf
      
      * Refactor checkpoint name in XLNET and XLNET_tf
      
      * Refactor checkpoint name in BERT_tf
      
      * run make tests, style, quality, fixup
      90ecc296
    • Fix embeddings for PyTorch 1.8 (#10549) · 7da995c0
      Sylvain Gugger authored
      * Fix embeddings for PyTorch 1.8
      
      * Try with PyTorch 1.8.0
      
      * Fix embeddings init
      
      * Fix copies
      
      * Typo
      
      * More typos
      7da995c0
  19. 23 Dec, 2020 1 commit
    • Add caching mechanism to BERT, RoBERTa (#9183) · 88ef8893
      Suraj Patil authored
      * add past_key_values
      
      * add use_cache option
      
      * make mask before cutting ids
      
      * adjust position_ids according to past_key_values
      
      * flatten past_key_values
      
      * fix positional embeds
      
      * fix _reorder_cache
      
      * set use_cache to false when not decoder, fix attention mask init
      
      * add test for caching
      
      * add past_key_values for Roberta
      
      * fix position embeds
      
      * add caching test for roberta
      
      * add doc
      
      * make style
      
      * doc, fix attention mask, test
      
      * small fixes
      
      * address Patrick's comments
      
      * input_ids shouldn't start with pad token
      
      * use_cache only when decoder
      
      * make consistent with bert
      
      * make copies consistent
      
      * add use_cache to encoder
      
      * add past_key_values to tapas attention
      
      * apply suggestions from code review
      
      * make copies consistent
      
      * add attn mask in tests
      
      * remove copied from longformer
      
      * apply suggestions from code review
      
      * fix bart test
      
      * nit
      
      * simplify model outputs
      
      * fix doc
      
      * fix output ordering
      88ef8893
  20. 02 Dec, 2020 1 commit
    • [PyTorch] Refactor Resize Token Embeddings (#8880) · 443f67e8
      Patrick von Platen authored
      * fix resize tokens
      
      * correct mobile_bert
      
      * move embedding fix into modeling_utils.py
      
      * refactor
      
      * fix lm head resize
      
      * refactor
      
      * break lines to make sylvain happy
      
      * add new tests
      
      * fix typo
      
      * improve test
      
      * skip bart-like for now
      
      * check if base_model = get(...) is necessary
      
      * clean files
      
      * improve test
      
      * fix tests
      
      * revert style templates
      
      * Update templates/adding_a_new_model/cookiecutter-template-{{cookiecutter.modelname}}/modeling_{{cookiecutter.lowercase_modelname}}.py
      443f67e8
  21. 27 Nov, 2020 1 commit
  22. 25 Nov, 2020 1 commit
    • [XLNet] Fix mems behavior (#8567) · 2a6fbe6a
      Patrick von Platen authored
      * fix mems in xlnet
      
      * fix use_mems
      
      * fix use_mem_len
      
      * fix use mems
      
      * clean docs
      
      * fix tf typo
      
      * make xlnet tf for generation work
      
      * fix tf test
      
      * refactor use cache
      
      * add use cache for missing models
      
      * correct use_cache in generate
      
      * correct use cache in tf generate
      
      * fix tf
      
      * correct getattr typo
      
      * make sylvain happy
      
      * change in docs as well
      
      * do not apply to cookie cutter statements
      
      * fix tf test
      
      * make pytorch model fully backward compatible
      2a6fbe6a
  23. 24 Nov, 2020 1 commit
    • Support various BERT relative position embeddings (2nd) (#8276) · 2c83b3c3
      zhiheng-huang authored
      
      
      * Support BERT relative position embeddings
      
      * Fix typo in README.md
      
      * Address review comment
      
      * Fix failing tests
      
      * [tiny] Fix style_doc.py check by adding an empty line to configuration_bert.py
      
      * make fix copies
      
      * fix configs of electra and albert and fix longformer
      
      * remove copy statement from longformer
      
      * fix albert
      
      * fix electra
      
      * Add bert variants forward tests for various position embeddings
      
      * [tiny] Fix style for test_modeling_bert.py
      
      * improve docstring
      
      * [tiny] improve docstring and remove unnecessary dependency
      
      * [tiny] Remove unused import
      
      * re-add to ALBERT
      
      * make embeddings work for ALBERT
      
      * add test for albert
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      2c83b3c3
  24. 23 Nov, 2020 1 commit
    • consistent ignore keys + make private (#8737) · e84786aa
      Stas Bekman authored
      * consistent ignore keys + make private
      
      * style
      
      * - authorized_missing_keys    => _keys_to_ignore_on_load_missing
        - authorized_unexpected_keys => _keys_to_ignore_on_load_unexpected
      
      * move public doc of private attributes to private comment
      e84786aa
  25. 17 Nov, 2020 2 commits
  26. 16 Nov, 2020 1 commit
    • Switch `return_dict` to `True` by default. (#8530) · 1073a2bd
      Sylvain Gugger authored
      * Use the CI to identify failing tests
      
      * Remove from all examples and tests
      
      * More default switch
      
      * Fixes
      
      * More test fixes
      
      * More fixes
      
      * Last fixes hopefully
      
      * Use the CI to identify failing tests
      
      * Remove from all examples and tests
      
      * More default switch
      
      * Fixes
      
      * More test fixes
      
      * More fixes
      
      * Last fixes hopefully
      
      * Run on the real suite
      
      * Fix slow tests
      1073a2bd
  27. 30 Oct, 2020 1 commit
  28. 28 Oct, 2020 1 commit
  29. 26 Oct, 2020 1 commit
    • Doc styling (#8067) · 08f534d2
      Sylvain Gugger authored
      * Important files
      
      * Styling them all
      
      * Revert "Styling them all"
      
      This reverts commit 7d029395fdae8513b8281cbc2a6c239f8093503e.
      
      * Styling them for realsies
      
      * Fix syntax error
      
      * Fix benchmark_utils
      
      * More fixes
      
      * Fix modeling auto and script
      
      * Remove new line
      
      * Fixes
      
      * More fixes
      
      * Fix more files
      
      * Style
      
      * Add FSMT
      
      * More fixes
      
      * More fixes
      
      * More fixes
      
      * More fixes
      
      * Fixes
      
      * More fixes
      
      * More fixes
      
      * Last fixes
      
      * Make sphinx happy
      08f534d2
  30. 12 Oct, 2020 1 commit
  31. 25 Sep, 2020 1 commit
  32. 24 Sep, 2020 1 commit
  33. 23 Sep, 2020 1 commit
  34. 04 Sep, 2020 1 commit
  35. 26 Aug, 2020 3 commits
  36. 25 Aug, 2020 1 commit