1. 17 Oct, 2022 2 commits
  2. 30 Sep, 2022 1 commit
    • Rebase ESM PR and update all file formats (#19055) · 368b649a
      Matt authored
      
      
      * Rebase ESM PR and update all file formats
      
      * Fix test relative imports
      
      * Add __init__.py to the test dir
      
      * Disable gradient checkpointing
      
      * Remove references to TFESM... FOR NOW >:|
      
      * Remove completed TODOs from tests
      
      * Convert docstrings to mdx, fix-copies from BERT
      
      * fix-copies for the README and index
      
      * Update ESM's __init__.py to the modern format
      
      * Add to _toctree.yml
      
      * Ensure we correctly copy the pad_token_id from the original ESM model
      
      * Ensure we correctly copy the pad_token_id from the original ESM model
      
      * Tiny grammar nitpicks
      
      * Make the layer norm after embeddings an optional flag
      
      * Make the layer norm after embeddings an optional flag
      
      * Update the conversion script to handle other model classes
      
      * Remove token_type_ids entirely, fix attention_masking and add checks to convert_esm.py
      
      * Break the copied from link from BertModel.forward to remove token_type_ids
      
      * Remove debug array saves
      
      * Begin ESM-2 porting
      
      * Add a hacky workaround for the precision issue in original repo
      
      * Code cleanup
      
      * Remove unused checkpoint conversion code
      
      * Remove unused checkpoint conversion code
      
      * Fix copyright notices
      
      * Get rid of all references to the TF weights conversion
      
      * Remove token_type_ids from the tests
      
      * Fix test code
      
      * Update src/transformers/__init__.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Update src/transformers/__init__.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Update README.md
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Add credit
      
      * Remove _ args and __ kwargs in rotary embedding
      
      * Assertively remove asserts
      
      * Replace einsum with torch.outer()
      
      * Fix docstring formatting
      
      * Remove assertions in tokenization
      
      * Add paper citation to ESMModel docstring
      
      * Move vocab list to single line
      
      * Remove ESMLayer from init
      
      * Add Facebook copyrights
      
      * Clean up RotaryEmbedding docstring
      
      * Fix docstring formatting
      
      * Fix docstring for config object
      
      * Add explanation for new config methods
      
      * make fix-copies
      
      * Rename all the ESM- classes to Esm-
      
      * Update conversion script to allow pushing to hub
      
      * Update tests to point at my repo for now
      
      * Set config properly for tests
      
      * Remove the gross hack that forced loss of precision in inv_freq and instead copy the data from the model being converted
      
      * make fixup
      
      * Update expected values for slow tests
      
      * make fixup
      
      * Remove EsmForCausalLM for now
      
      * Remove EsmForCausalLM for now
      
      * Fix padding idx test
      
      * Updated README and docs with ESM-1b and ESM-2 separately (#19221)
      
      * Updated README and docs with ESM-1b and ESM-2 separately
      
      * Update READMEs, longer entry with 3 citations
      
      * make fix-copies
      Co-authored-by: Your Name <you@example.com>
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      Co-authored-by: Tom Sercu <tsercu@fb.com>
      Co-authored-by: Your Name <you@example.com>
      368b649a
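The "Replace einsum with torch.outer()" item above can be illustrated with a minimal sketch. The tensor names and sizes below are hypothetical stand-ins for a rotary embedding's position indices and inverse frequencies, not the code from the commit; the point is only that an `"i,j->ij"` einsum is an outer product.

```python
import torch

# Hypothetical stand-ins: 4 positions and 3 inverse frequencies.
positions = torch.arange(4, dtype=torch.float32)
inv_freq = 1.0 / (10000 ** (torch.arange(0, 6, 2, dtype=torch.float32) / 6))

# The einsum spelling of an outer product ...
freqs_einsum = torch.einsum("i,j->ij", positions, inv_freq)
# ... and the direct call, which states the intent more plainly.
freqs_outer = torch.outer(positions, inv_freq)

# Both produce a (positions, frequencies) matrix with the same values.
same = torch.allclose(freqs_einsum, freqs_outer)
```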
  3. 14 Sep, 2022 1 commit
  4. 13 Sep, 2022 1 commit
  5. 05 Sep, 2022 1 commit
  6. 03 Aug, 2022 1 commit
    • Fix torch version comparisons (#18460) · 02b176c4
      LSinev authored
      Comparisons like
      version.parse(torch.__version__) > version.parse("1.6")
      are True for torch==1.6.0+cu101 or torch==1.6.0+cpu
      
      version.parse(version.parse(torch.__version__).base_version) is preferred (and available in pytorch_utils.py)
      02b176c4
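The comparison pitfall described in this commit message can be reproduced without installing PyTorch. The sketch below assumes the `packaging` library is available and uses a literal string in place of `torch.__version__`; the variable names are illustrative only.

```python
from packaging import version

# Stand-in for torch.__version__ on a CUDA build of PyTorch 1.6.0.
torch_version = "1.6.0+cu101"

# A local version label ("+cu101") sorts after the plain release,
# so this comparison is True even though the release is exactly 1.6.
naive_is_newer = version.parse(torch_version) > version.parse("1.6")

# Stripping the local label first compares the release numbers only.
base = version.parse(version.parse(torch_version).base_version)
base_is_newer = base > version.parse("1.6")
```

Here `naive_is_newer` is `True` while `base_is_newer` is `False`, which is the discrepancy the commit addresses.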
  7. 12 May, 2022 1 commit
  8. 04 May, 2022 1 commit
  9. 03 May, 2022 1 commit
  10. 12 Apr, 2022 1 commit
    • Moved functions to pytorch_utils.py (#16625) · a315988b
      Anmol Joshi authored
      * Moved functions to pytorch_utils.py
      
      * isort formatting
      
      * Reverted tf changes
      
      * isort, make fix-copies
      
      * documentation fix
      
      * Fixed Conv1D import
      
      * Reverted research examples file
      
      * backward compatibility for pytorch_utils
      
      * missing import
      
      * isort fix
      a315988b
  11. 31 Mar, 2022 1 commit
  12. 25 Mar, 2022 1 commit
  13. 23 Mar, 2022 1 commit
    • Reorganize file utils (#16264) · 4975002d
      Sylvain Gugger authored
      * Split file_utils in several submodules
      
      * Fixes
      
      * Add back more objects
      
      * More fixes
      
      * Who exactly decided to import that from there?
      
      * Second suggestion to code with code review
      
      * Revert wrong move
      
      * Fix imports
      
      * Adapt all imports
      
      * Adapt all imports everywhere
      
      * Revert this import, will fix in a separate commit
      4975002d
  14. 22 Mar, 2022 1 commit
  15. 11 Mar, 2022 1 commit
  16. 07 Feb, 2022 1 commit
    • FX tracing improvement (#14321) · 0fe17f37
      Michael Benayoun authored
      * Change the way tracing happens, enabling dynamic axes out of the box
      
      * Update the tests and modeling xlnet
      
      * Add the non-recording of leaf modules to avoid recording more values for the methods to record than what will be seen at tracing time (which would otherwise desynchronize the recorded values and the values that need to be given to the proxies during tracing, causing errors).
      
      * Comments and making tracing work for gpt-j and xlnet
      
      * Refactor things related to num_choices (and batch_size, sequence_length)
      
      * Update fx to work on PyTorch 1.10
      
      * Postpone autowrap_function feature usage for later
      
      * Add copyrights
      
      * Remove unnecessary file
      
      * Fix issue with add_new_model_like
      
      * Apply suggestions
      0fe17f37
  17. 29 Jan, 2022 1 commit
  18. 28 Dec, 2021 1 commit
    • Doc styler examples (#14953) · b5e2b183
      Sylvain Gugger authored
      * Fix bad examples
      
      * Add black formatting to style_doc
      
      * Use first nonempty line
      
      * Put it at the right place
      
      * Don't add spaces to empty lines
      
      * Better templates
      
      * Deal with triple quotes in docstrings
      
      * Result of style_doc
      
      * Enable mdx treatment and fix code examples in MDXs
      
      * Result of doc styler on doc source files
      
      * Last fixes
      
      * Break copy from
      b5e2b183
  19. 27 Dec, 2021 1 commit
    • Doc styler v2 (#14950) · 87e6e4fe
      Sylvain Gugger authored
      * New doc styler
      
      * Fix issue with args at the start
      
      * Code sample fixes
      
      * Style code examples in MDX
      
      * Fix more patterns
      
      * Typo
      
      * Typo
      
      * More patterns
      
      * Do without black for now
      
      * Get more info in error
      
      * Docstring style
      
      * Re-enable check
      
      * Quality
      
      * Fix add_end_docstring decorator
      
      * Fix docstring
      87e6e4fe
  20. 21 Dec, 2021 1 commit
    • Convert docstrings of modeling files (#14850) · 7af80f66
      Sylvain Gugger authored
      * Convert file_utils docstrings to Markdown
      
      * Test on BERT
      
      * Return block indent
      
      * Temporarily disable doc styler
      
      * Remove from quality checks as well
      
      * Remove doc styler mess
      
      * Remove check from circleCI
      
      * Fix typo
      
      * Convert file_utils docstrings to Markdown
      
      * Test on BERT
      
      * Return block indent
      
      * Temporarily disable doc styler
      
      * Remove from quality checks as well
      
      * Remove doc styler mess
      
      * Remove check from circleCI
      
      * Fix typo
      
      * Let's go on all other model files
      
      * Add templates too
      
      * Styling and quality
      7af80f66
  21. 30 Nov, 2021 1 commit
  22. 18 Nov, 2021 2 commits
  23. 09 Nov, 2021 1 commit
  24. 15 Oct, 2021 1 commit
  25. 11 Oct, 2021 1 commit
  26. 22 Sep, 2021 1 commit
  27. 31 Aug, 2021 1 commit
  28. 06 Aug, 2021 1 commit
    • Tpu tie weights (#13030) · 7fcee113
      Sylvain Gugger authored
      * Fix tied weights on TPU
      
      * Manually tie weights in no trainer examples
      
      * Fix for test
      
      * One last missing
      
      * Getting owned by my scripts
      
      * Address review comments
      
      * Fix test
      
      * Fix tests
      
      * Fix reformer tests
      7fcee113
  29. 03 Aug, 2021 1 commit
  30. 26 Jul, 2021 1 commit
  31. 01 Jul, 2021 1 commit
  32. 22 Jun, 2021 1 commit
    • Fix for the issue of device-id getting hardcoded for token_type_ids during Tracing [WIP] (#11252) · af6e01c5
      Hamid Shojanazeri authored
      
      
      * registering a buffer for token_type_ids, to pass the error of device-id getting hardcoded when tracing
      
      * style format
      
      * adding persistent flag to the registered buffers, which prevents them from being added to the state_dict and addresses the backward compatibility issue
      
      * adding the try catch to the fix as persistent flag is only available from PT >1.6
      
      * adding version check
      
      * added the condition to only use the token_type_ids buffer when it is autogenerated, not passed by the user
      
      * adding comments and making the condition where token_type_ids are None use the registered buffer
      
      * taking out position-embedding from the if block
      
      * adding comments
      
      * handling the case if buffer for position_ids was not registered
      
      * reverted the changes on position_ids, fix the issue with size of token_type_ids buffer, moved the modification for generated token_type_ids to Bertmodel, instead of Embeddings
      
      * reverting the token_type_ids in case of None to the previous version
      
      * reverting changes on position_ids adding back the if block
      
      * changes added by running make fix-copies
      
      * changes added by running make fix-copies and added the import version as it was getting used
      
      * changes added by running make fix-copies
      
      * changes added by running make fix-copies
      
      * fixing the import format
      
      * fixing the import format
      
      * modified to use temp tensor for trimmed and expanded token_type_ids buffer
      
      * changes made by fix-copies after temp tensor modifications
      
      * changes made by fix-copies after temp tensor modifications
      
      * changes made by fix-copies after temp tensor modifications
      
      * clean up
      
      * clean up
      
      * clean up
      
      * clean up
      
      * Nit
      
      * Nit
      
      * Nit
      
      * modified according to support device conversion on traced models
      
      * modified according to support device conversion on traced models
      
      * modified according to support device conversion on traced models
      
      * modified according to support device conversion on traced models
      
      * changes based on latest in master
      
      * Adapt templates
      
      * Add version import
      Co-authored-by: Ubuntu <ubuntu@ip-172-31-32-81.us-west-2.compute.internal>
      Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
      af6e01c5
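The buffer-based approach described in this commit message can be sketched roughly as follows. `TinyEmbeddings` and its sizes are hypothetical and are not the code from the commit; the sketch only assumes a PyTorch version whose `register_buffer` accepts the `persistent` flag. A non-persistent buffer follows the module across devices but stays out of the `state_dict`, so older checkpoints still load.

```python
import torch
from torch import nn


class TinyEmbeddings(nn.Module):
    """Hypothetical module illustrating a non-persistent token_type_ids buffer."""

    def __init__(self, max_positions=8):
        super().__init__()
        self.word_embeddings = nn.Embedding(10, 4)
        # persistent=False keeps the buffer out of state_dict().
        self.register_buffer(
            "token_type_ids",
            torch.zeros((1, max_positions), dtype=torch.long),
            persistent=False,
        )

    def forward(self, input_ids, token_type_ids=None):
        # Use the registered buffer only when the caller passed nothing.
        if token_type_ids is None:
            batch_size, seq_len = input_ids.shape
            trimmed = self.token_type_ids[:, :seq_len]
            token_type_ids = trimmed.expand(batch_size, seq_len)
        return token_type_ids


module = TinyEmbeddings()
generated = module(torch.ones((2, 5), dtype=torch.long))
saved_keys = list(module.state_dict().keys())
```

Because the default comes from a buffer rather than a tensor created inside `forward`, it sits on whatever device the module was moved to.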
  33. 14 Jun, 2021 1 commit
  34. 07 Jun, 2021 1 commit
    • Fixes bug that appears when using QA bert and distillation. (#12026) · f8bd8c6c
      François Lagunas authored
      * Fixing bug that appears when using distillation (and potentially other uses).
      During the backward pass, PyTorch complains with:
      RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation
      This happens because the QA model code modifies the start_positions and end_positions input tensors using the clamp_ function: as a consequence, the teacher and the student both modify the inputs, and the backward pass fails.
      
      * Fixing all models QA clamp_ bug.
      f8bd8c6c
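The difference between the in-place and out-of-place clamp that this commit message refers to can be shown with a small sketch. The helper names and values below are hypothetical; the sketch does not reproduce the RuntimeError itself, only the mutation of a shared input tensor that leads to it.

```python
import torch


def clamp_copy(positions, ignored_index):
    # Out-of-place: returns a new tensor, the caller's tensor is untouched.
    return positions.clamp(0, ignored_index)


def clamp_in_place(positions, ignored_index):
    # In-place: overwrites the caller's tensor and returns the same object.
    return positions.clamp_(0, ignored_index)


# Hypothetical start positions shared by a teacher and a student model.
shared = torch.tensor([3, 700, -1])

safe = clamp_copy(shared, 512)
after_copy = shared.tolist()      # shared still holds the original values

mutated = clamp_in_place(shared, 512)
after_in_place = shared.tolist()  # shared has now been overwritten
```

With the out-of-place version, a second model that receives `shared` still sees the original labels.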
  35. 01 Jun, 2021 1 commit
  36. 20 May, 2021 1 commit
  37. 04 May, 2021 1 commit
  38. 26 Apr, 2021 1 commit