1. 20 Feb, 2024 1 commit
  2. 16 Feb, 2024 1 commit
  3. 14 Feb, 2024 1 commit
    • Add tie_weights() to LM heads and set bias in set_output_embeddings() (#28948) · 725f4ad1
      JB (Don) authored
      * Add tie_weights() to LM heads and set bias in set_output_embeddings()
      
      The biases were not tied correctly in some LM heads; this change should fix that.
      
      * Moving test_save_and_load_low_cpu_mem_usage to ModelTesterMixin
      
      * Adding _tie_weights() to MPNet and Vilt
      
      * Skip test for low cpu mem usage for Deta/DeformableDetr since they cannot init on meta device
      
      * Rename the test name to save_load to match the convention
      725f4ad1
  4. 27 Oct, 2023 1 commit
  5. 25 Oct, 2023 1 commit
    • [`core`] Refactor of `gradient_checkpointing` (#27020) · 06e782da
      Younes Belkada authored
      * v1
      
      * fix
      
      * remove `create_custom_forward`
      
      * fixup
      
      * fixup
      
      * add test and fix all failing GC tests
      
      * remove all remaining `create_custom_forward` methods
      
      * fix idefics bug
      
      * fixup
      
      * replace with `__call__`
      
      * add comment
      
      * quality
      06e782da
  6. 11 Oct, 2023 1 commit
    • In assisted decoding, pass model_kwargs to model's forward call (fix... · dcc49d8a
      Billy Bradley authored
      In assisted decoding, pass model_kwargs to model's forward call (fix prepare_inputs_for_generation in all models) (#25242)
      
      * In assisted decoding, pass model_kwargs to model's forward call
      
      Previously, assisted decoding would ignore any additional kwargs
      that it doesn't explicitly handle. This was inconsistent with other
      generation methods, which pass the model_kwargs through
      prepare_inputs_for_generation and forward the returned dict to the
      model's forward call.
      
      The prepare_inputs_for_generation method needs to be amended in all
      models, as previously it only kept the last input ID when a past_key_values
      was passed.
      
      * Improve variable names in _extend_attention_mask
      
      * Refactor extending token_type_ids into a function
      
      * Replace deepcopy with copy to optimize performance
      
      * Update new persimmon model with llama changes for assisted generation
      
      * Update new mistral model for assisted generation with prepare_inputs_for_generation
      
      * Update position_ids creation in falcon prepare_inputs_for_generation to support assisted generation
      dcc49d8a
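The input-trimming change described in this commit can be sketched roughly as follows. This is a minimal, list-based illustration and not the library's code: the `past_length` parameter is a hypothetical stand-in for the cached sequence length that a real model would read from `past_key_values`, and plain Python lists stand in for tensors.

```python
from typing import Dict, List


def prepare_inputs_for_generation(
    input_ids: List[int],
    past_length: int = 0,  # hypothetical: number of tokens already covered by the cache
    **model_kwargs,
) -> Dict[str, object]:
    """Illustrative stand-in for a model's input preparation step."""
    if past_length > 0:
        if len(input_ids) > past_length:
            # Keep every token the cache has not seen yet (several candidate
            # tokens may arrive at once in assisted decoding).
            remove_prefix_length = past_length
        else:
            # Fall back to the older behaviour: keep only the final token.
            remove_prefix_length = len(input_ids) - 1
        input_ids = input_ids[remove_prefix_length:]
    # Forward the extra kwargs to the model instead of silently dropping them.
    return {"input_ids": input_ids, **model_kwargs}


# The cache covers 3 tokens; 2 further candidate tokens were proposed.
inputs = prepare_inputs_for_generation(
    [11, 12, 13, 14, 15], past_length=3, token_type_ids=[0, 0]
)
print(inputs)
```

Under these assumptions both uncached tokens survive the trimming, whereas keeping only the last input ID would have discarded token 14.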
  7. 14 Sep, 2023 1 commit
  8. 08 Aug, 2023 1 commit
    • Add warning for missing attention mask when pad tokens are detected (#25345) · 5ea2595e
      JB (Don) authored
      * Add attention mask and pad token warning to many of the models
      
      * Remove changes under examples/research_projects
      
      These files are not maintained by HF.
      
      * Skip the warning check during torch.fx or JIT tracing
      
      * Switch ordering for the warning and input shape assignment
      
      This ordering is a little cleaner for some of the cases.
      
      * Add missing line break in one of the files
      5ea2595e
  9. 17 Jul, 2023 1 commit
    • Replace assert statements with exceptions (#24856) · d0154015
      Syed Salman Habeeb Quadri authored
      * Changed AssertionError to ValueError
      
      The try-except block was catching AssertionError in its except statement, while the expected error is ValueError. Fixed this.
      
      * Changed AssertionError to ValueError
      
      The try-except block was catching AssertionError in its except statement, while the expected error is ValueError. Fixed this.
      Note: args are passed to the ValueError when it is raised, but are then added again while handling the error (see the code snippet).
      
      * Changed AssertionError to ValueError
      
      The try-except block was catching AssertionError in its except statement, while the expected error is ValueError. Fixed this.
      Note: args are passed to the ValueError when it is raised, but are then added again while handling the error (see the code snippet).
      
      * Changed AssertionError to ValueError
      
      * Changed AssertionError to ValueError
      
      * Changed AssertionError to ValueError
      
      * Changed AssertionError to ValueError
      
      * Changed AssertionError to ValueError
      
      * Changed assert statement to ValueError based
      
      * Changed assert statement to ValueError based
      
      * Changed assert statement to ValueError based
      
      * Changed incorrect error handling from AssertionError to ValueError
      
      * Undid change from AssertionError to ValueError as it is not needed
      
      * Reverted back to using AssertionError as it is not necessary to make it into ValueError
      
      * Fixed erroneous comparison
      
      Changed == to !=
      
      * Fixed erroneous comparison
      
      Changed == to !=
      
      * formatted the code
      
      * Ran make fix-copies
      d0154015
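A minimal sketch of the kind of conversion this commit describes, assuming a hypothetical validation helper (`check_hidden_size` and its parameters are illustrative and not taken from the repository). One practical reason for such a change is that `assert` statements are skipped when Python runs with `-O`, while an explicit `raise` always executes. Note that the condition flips when an `assert` becomes an `if ... raise`, which is presumably how a `==` that needed to become `!=` can slip in.

```python
def check_hidden_size(hidden_size: int, num_heads: int) -> int:
    """Hypothetical helper: return the per-head size or raise ValueError."""
    # Before (assert form): assert hidden_size % num_heads == 0, "..."
    # After: the condition is inverted (== becomes !=) and raises explicitly.
    if hidden_size % num_heads != 0:
        raise ValueError(
            f"hidden_size ({hidden_size}) must be divisible by num_heads ({num_heads})"
        )
    return hidden_size // num_heads


try:
    check_hidden_size(10, 3)
except ValueError as err:  # callers must now catch ValueError, not AssertionError
    message = str(err)
    print(message)
```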
  10. 30 Jun, 2023 1 commit
    • Show a warning for missing attention masks when pad_token_id is not None (#24510) · 78a2b19f
      JB (Don) authored
      
      
      * Adding warning messages to BERT for missing attention masks
      
      These warning messages are shown when there are pad tokens within the input ids and
      no attention masks are given. The warning message should only show up once.
      
      * Adding warning messages to BERT for missing attention masks
      
      These warning messages are shown when the pad_token_id is not None
      and no attention masks are given. The warning message should only
      show up once.
      
      * Ran fix copies to copy over the changes to some of the other models
      
      * Add logger.warning_once.cache_clear() to the test
      
      * Shows warning when there are no attention masks and input_ids start/end with pad tokens
      
      * Using warning_once() instead and fix indexing in input_ids check
      
      ---------
      Co-authored-by: JB Lau <hckyn@voyager2.local>
      78a2b19f
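The check described in this commit can be sketched as below. This is a simplified illustration under stated assumptions: nested Python lists stand in for tensors, the function name and warning text are hypothetical, and `warning_once` is a local helper built on `functools.lru_cache` (the commit's mention of `warning_once.cache_clear()` suggests a similar caching approach, but that is an inference).

```python
import functools
import logging
from typing import List, Optional

logger = logging.getLogger("attention_mask_demo")


@functools.lru_cache(None)
def warning_once(message: str) -> None:
    """Emit each distinct message only once (repeat calls hit the cache)."""
    logger.warning(message)


def warn_if_padding_and_no_attention_mask(
    input_ids: List[List[int]],
    attention_mask: Optional[List[List[int]]],
    pad_token_id: Optional[int],
) -> bool:
    """Return True if a warning applies; hypothetical list-based version."""
    if attention_mask is not None or pad_token_id is None:
        return False
    # Only the first and last token of each row are inspected, which keeps
    # the check cheap compared with scanning every position.
    if any(
        len(row) > 0 and (row[0] == pad_token_id or row[-1] == pad_token_id)
        for row in input_ids
    ):
        warning_once(
            "Pad tokens detected in input_ids but no attention_mask was given."
        )
        return True
    return False


# Two calls with padded input: the condition holds twice, the log line appears once.
warn_if_padding_and_no_attention_mask([[5, 6, 0]], None, pad_token_id=0)
warn_if_padding_and_no_attention_mask([[0, 7, 8]], None, pad_token_id=0)
```

In a test, calling `warning_once.cache_clear()` first resets the cache so the warning can be observed again.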
  11. 27 Jun, 2023 1 commit
    • Clean load keys (#24505) · 8e5d1619
      Sylvain Gugger authored
      * Preliminary work on some models
      
      * Fix test load missing and make sure nonpersistent buffers are tested
      
      * Always ignore nonpersistent buffers if in state_dict
      
      * Treat models
      
      * More models
      
      * Treat remaining models
      
      * Fix quality
      
      * Fix tests
      
      * Remove draft
      
      * This test is not needed anymore
      
      * Fix copies
      
      * Fix last test
      
      * Newly added models
      
      * Fix last tests
      
      * Address review comments
      8e5d1619
  12. 22 Jun, 2023 1 commit
  13. 21 Jun, 2023 1 commit
  14. 13 Jun, 2023 1 commit
    • Tied params cleanup (#24211) · 695928e1
      Sylvain Gugger authored
      * First test
      
      * Add info for all models
      
      * style
      
      * Repo consistency
      
      * Fix last model and cleanup prints
      
      * Repo consistency
      
      * Use consistent function for detecting tied weights
      695928e1
  15. 06 Mar, 2023 1 commit
  16. 27 Feb, 2023 1 commit
  17. 07 Feb, 2023 1 commit
    • [CI] Remove `past` in favor of `past_key_values` (#21443) · 12eb528b
      Arthur authored
      * fix past renamed to past_key_value
      
      * update more `past` that were skipped
      
      * fixup
      
      * remove changes made to rag
      
      * refactor `_reorder_cache` to use `past_key_values`
      
      * fix git `prepare_inputs_for_generation` to pass tests when false is needed in use_cache
      12eb528b
  18. 06 Feb, 2023 1 commit
    • Update quality tooling for formatting (#21480) · 6f79d264
      Sylvain Gugger authored
      * Result of black 23.1
      
      * Update target to Python 3.7
      
      * Switch flake8 to ruff
      
      * Configure isort
      
      * Apply isort with line limit
      
      * Put the right black version
      
      * adapt black in check copies
      
      * Fix copies
      6f79d264
  19. 23 Jan, 2023 1 commit
  20. 14 Jan, 2023 1 commit
  21. 08 Jan, 2023 1 commit
    • Replace `past` with `past_key_values` (#20944) · f0577df6
      Arthur authored
      * start cleanup
      
      * more updates
      
      * more models are affected
      
      * more updates
      
      * update generation utils
      
      * style
      
      * revert change that removed reorder cache
      
      * update generation utils
      
      * style
      
      * style
      
      * remove reorder cache
      f0577df6
  22. 15 Nov, 2022 1 commit
    • update relative positional embedding (#20203) · f60eec40
      Arthur authored
      * update relative positional embedding
      
      * make fix copies
      
      * add `use_cache` to list of arguments
      
      * fixup
      
      * 1-line function
      
      * add `test_decoder_model_past_with_large_inputs_relative_pos_emb`
      
      * add relative pos embedding test for more models
      
      * style
      f60eec40
  23. 09 Nov, 2022 1 commit
    • Attempting to test automatically the `_keys_to_ignore`. (#20042) · bac2d29a
      Nicolas Patry authored
      
      
      * Attempting to test automatically the `_keys_to_ignore`.
      
      * Style.
      
      * First fix pass.
      
      * Moving test on its own.
      
      * Another batch.
      
      * Second round removing BatchNorm
      
      * Fixing layoutlmv{2,3} + support older Python.
      
      * Disable miss missing warning.
      
      * Removing dodgy additions.
      
      * Big pass.
      
      * mbart.
      
      * More corrections.
      
      * Fixup.
      
      * Updating test_correct_missing_keys
      
      * Add escape hatch for when the head has no extra params and so doesn't need the missing keys check.
      
      * Fixing test.
      
      * Greener.
      
      * Green ! (except for weird splinter bug).
      
      * Adding a test about `named_parameters` usage.
      
      * Shorten message.
      
      * Apply suggestions from code review
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * After rebase modifications.
      
      * More explicit condition checking.
      
      * Fixing slow tests issues.
      
      * Remove extra pdb.
      
      * Remove print.
      
      * Attempt to make failure consistent + fixing roc_bert.
      
      * Removing the seed (all tests passing with it).
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      bac2d29a
  24. 14 Sep, 2022 1 commit
  25. 03 Aug, 2022 1 commit
    • Fix torch version comparisons (#18460) · 02b176c4
      LSinev authored
      Comparisons like
      version.parse(torch.__version__) > version.parse("1.6")
      are True for torch==1.6.0+cu101 or torch==1.6.0+cpu
      
      Comparisons using version.parse(version.parse(torch.__version__).base_version) are preferred (and available in pytorch_utils.py).
      02b176c4
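The pitfall in this commit message can be reproduced without torch installed. The sketch below assumes the `packaging` library is available and uses a hard-coded version string in place of `torch.__version__`; the variable names are illustrative.

```python
from packaging import version

# Stand-in for torch.__version__ on a CUDA build (taken from the commit message).
installed = "1.6.0+cu101"

# Naive comparison: the local segment "+cu101" sorts above the plain release,
# so 1.6.0+cu101 counts as strictly newer than 1.6.
naive = version.parse(installed) > version.parse("1.6")

# Preferred comparison: drop the local segment via base_version first.
robust = version.parse(version.parse(installed).base_version) > version.parse("1.6")

print(naive, robust)  # → True False
```

With the base version, a `1.6.0` build is correctly treated as equal to `1.6` rather than newer.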
  26. 12 May, 2022 1 commit
  27. 04 May, 2022 1 commit
  28. 03 May, 2022 1 commit
  29. 12 Apr, 2022 1 commit
    • Moved functions to pytorch_utils.py (#16625) · a315988b
      Anmol Joshi authored
      * Moved functions to pytorch_utils.py
      
      * isort formatting
      
      * Reverted tf changes
      
      * isort, make fix-copies
      
      * documentation fix
      
      * Fixed Conv1D import
      
      * Reverted research examples file
      
      * backward compatibility for pytorch_utils
      
      * missing import
      
      * isort fix
      a315988b
  30. 11 Apr, 2022 1 commit
  31. 31 Mar, 2022 1 commit
  32. 25 Mar, 2022 1 commit
  33. 23 Mar, 2022 1 commit
    • Reorganize file utils (#16264) · 4975002d
      Sylvain Gugger authored
      * Split file_utils in several submodules
      
      * Fixes
      
      * Add back more objects
      
      * More fixes
      
      * Who exactly decided to import that from there?
      
      * Second suggestion to code with code review
      
      * Revert wrong move
      
      * Fix imports
      
      * Adapt all imports
      
      * Adapt all imports everywhere
      
      * Revert this import, will fix in a separate commit
      4975002d
  34. 22 Mar, 2022 1 commit
  35. 11 Mar, 2022 1 commit
  36. 07 Feb, 2022 1 commit
    • FX tracing improvement (#14321) · 0fe17f37
      Michael Benayoun authored
      * Change the way tracing happens, enabling dynamic axes out of the box
      
      * Update the tests and modeling xlnet
      
      * Add the non-recording of leaf modules, to avoid recording more values for the methods than will be seen at tracing time (which would otherwise desynchronize the recorded values and the values that need to be given to the proxies during tracing, causing errors).
      
      * Comments and making tracing work for gpt-j and xlnet
      
      * Refactor things related to num_choices (and batch_size, sequence_length)
      
      * Update fx to work on PyTorch 1.10
      
      * Postpone autowrap_function feature usage for later
      
      * Add copyrights
      
      * Remove unnecessary file
      
      * Fix issue with add_new_model_like
      
      * Apply suggestions
      0fe17f37
  37. 31 Jan, 2022 1 commit
  38. 28 Dec, 2021 1 commit
    • Doc styler examples (#14953) · b5e2b183
      Sylvain Gugger authored
      * Fix bad examples
      
      * Add black formatting to style_doc
      
      * Use first nonempty line
      
      * Put it at the right place
      
      * Don't add spaces to empty lines
      
      * Better templates
      
      * Deal with triple quotes in docstrings
      
      * Result of style_doc
      
      * Enable mdx treatment and fix code examples in MDXs
      
      * Result of doc styler on doc source files
      
      * Last fixes
      
      * Break copy from
      b5e2b183
  39. 27 Dec, 2021 1 commit
    • Doc styler v2 (#14950) · 87e6e4fe
      Sylvain Gugger authored
      * New doc styler
      
      * Fix issue with args at the start
      
      * Code sample fixes
      
      * Style code examples in MDX
      
      * Fix more patterns
      
      * Typo
      
      * Typo
      
      * More patterns
      
      * Do without black for now
      
      * Get more info in error
      
      * Docstring style
      
      * Re-enable check
      
      * Quality
      
      * Fix add_end_docstring decorator
      
      * Fix docstring
      87e6e4fe
  40. 21 Dec, 2021 1 commit
    • Convert docstrings of modeling files (#14850) · 7af80f66
      Sylvain Gugger authored
      * Convert file_utils docstrings to Markdown
      
      * Test on BERT
      
      * Return block indent
      
      * Temporarily disable doc styler
      
      * Remove from quality checks as well
      
      * Remove doc styler mess
      
      * Remove check from circleCI
      
      * Fix typo
      
      * Let's go on all other model files
      
      * Add templates too
      
      * Styling and quality
      7af80f66