1. 18 Apr, 2024 1 commit
  2. 17 Apr, 2024 1 commit
• Upgrading to tokenizers 0.19.0 (#30289) · 8e5f76f5
      Nicolas Patry authored
      * [DO NOT MERGE] Testing tokenizers 0.19.0rc0
      
      * Accounting for the breaking change.
      
      * Ruff.
      
* Upgrading to tokenizers `0.19` (new release with `prepend_scheme` fixed
and a new surface for the BPE tiktoken bug).
  3. 20 Mar, 2024 1 commit
  4. 12 Mar, 2024 1 commit
  5. 21 Feb, 2024 1 commit
  6. 19 Feb, 2024 1 commit
  7. 16 Feb, 2024 1 commit
  8. 02 Feb, 2024 1 commit
• [Docs] Fix spelling and grammar mistakes (#28825) · 721ee783
      Klaus Hipp authored
      * Fix typos and grammar mistakes in docs and examples
      
      * Fix typos in docstrings and comments
      
      * Fix spelling of `tokenizer` in model tests
      
      * Remove erroneous spaces in decorators
      
      * Remove extra spaces in Markdown link texts
  9. 01 Feb, 2024 1 commit
  10. 29 Jan, 2024 1 commit
  11. 19 Jan, 2024 1 commit
  12. 15 Jan, 2024 1 commit
  13. 11 Jan, 2024 1 commit
• Set `cache_dir` for `evaluate.load()` in example scripts (#28422) · 95091e15
      Alex Hedges authored
      While using `run_clm.py`,[^1] I noticed that some files were being added
      to my global cache, not the local cache. I set the `cache_dir` parameter
      for the one call to `evaluate.load()`, which partially solved the
      problem. I figured that while I was fixing the one script upstream, I
      might as well fix the problem in all other example scripts that I could.
      
      There are still some files being added to my global cache, but this
      appears to be a bug in `evaluate` itself. This commit at least moves
      some of the files into the local cache, which is better than before.
      
      To create this PR, I made the following regex-based transformation:
      `evaluate\.load\((.*?)\)` -> `evaluate\.load\($1,
      cache_dir=model_args.cache_dir\)`. After using that, I manually fixed
      all modified files with `ruff` serving as useful guidance. During the
      process, I removed one existing usage of the `cache_dir` parameter in a
      script that did not have a corresponding `--cache-dir` argument
      declared.
      
      [^1]: I specifically used `pytorch/language-modeling/run_clm.py` from
      v4.34.1 of the library. For the original code, see the following URL:
      https://github.com/huggingface/transformers/tree/acc394c4f5e1283c19783581790b3dc3105a3697/examples/pytorch/language-modeling/run_clm.py.
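The regex-based transformation described above can be sketched with `re.sub`; the surrounding source line is invented for illustration, and note that in a Python replacement string the captured arguments are referenced as `\1` rather than `$1`:

```python
import re

# Capture the existing arguments to evaluate.load(...) in group 1,
# then re-emit the call with cache_dir appended.
pattern = r"evaluate\.load\((.*?)\)"
replacement = r"evaluate.load(\1, cache_dir=model_args.cache_dir)"

source = 'metric = evaluate.load("accuracy")'
print(re.sub(pattern, replacement, source))
# -> metric = evaluate.load("accuracy", cache_dir=model_args.cache_dir)
```

As the commit notes, a mechanical rewrite like this still needs a manual pass afterwards (e.g. for calls spanning multiple lines), which is where `ruff` served as guidance.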
  14. 13 Dec, 2023 1 commit
  15. 12 Dec, 2023 1 commit
  16. 07 Dec, 2023 1 commit
  17. 17 Nov, 2023 1 commit
  18. 16 Nov, 2023 2 commits
• [`Styling`] stylify using ruff (#27144) · 651408a0
      Arthur authored
      
      
      * try to stylify using ruff
      
      * might need to remove these changes?
      
* use ruff format and ruff check
      
* use isinstance instead of type comparison
      
      * use # fmt: skip
      
      * use # fmt: skip
      
      * nits
      
* some styling changes
      
      * update ci job
      
      * nits isinstance
      
      * more files update
      
      * nits
      
      * more nits
      
      * small nits
      
      * check and format
      
      * revert wrong changes
      
      * actually use formatter instead of checker
      
      * nits
      
      * well docbuilder is overwriting this commit
      
      * revert notebook changes
      
      * try to nuke docbuilder
      
      * style
      
* fix feature extraction test
      
* remove `indent-width = 4`
      
      * fixup
      
      * more nits
      
      * update the ruff version that we use
      
      * style
      
      * nuke docbuilder styling
      
* leave the print for detected changes
      
      * nits
      
      * Remove file I/O
Co-authored-by: charliermarsh <charlie.r.marsh@gmail.com>
      
      * style
      
      * nits
      
      * revert notebook changes
      
      * Add # fmt skip when possible
      
      * Add # fmt skip when possible
      
      * Fix
      
      * More `  # fmt: skip` usage
      
      * More `  # fmt: skip` usage
      
      * More `  # fmt: skip` usage
      
* Nits
      
      * more fixes
      
      * fix tapas
      
      * Another way to skip
      
      * Recommended way
      
* Fix two more files
      
* Remove asynch
      
      ---------
Co-authored-by: charliermarsh <charlie.r.marsh@gmail.com>
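Several of the steps above migrate toward the per-statement `# fmt: skip` directive, which tells `ruff format` (and black) to leave a statement's manual layout untouched instead of disabling formatting over a whole region with `# fmt: off`/`# fmt: on`. A minimal sketch, with an invented mapping:

```python
# Without the trailing directive, a formatter would collapse the extra
# alignment spaces in this literal; `# fmt: skip` preserves them.
LABEL_TO_ID = {"negative": 0, "neutral":  1, "positive": 2}  # fmt: skip

print(LABEL_TO_ID["positive"])
# -> 2
```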
• Phuc Van Phan authored
  19. 15 Nov, 2023 1 commit
  20. 02 Nov, 2023 1 commit
  21. 27 Oct, 2023 1 commit
  22. 12 Oct, 2023 1 commit
  23. 10 Oct, 2023 1 commit
  24. 03 Oct, 2023 1 commit
  25. 29 Sep, 2023 1 commit
• [Flax Examples] Seq2Seq ASR Fine-Tuning Script (#21764) · 68e85fc8
      Sanchit Gandhi authored
      * from seq2seq speech
      
      * [Flax] Example script for speech seq2seq
      
      * tests and fixes
      
      * make style
      
      * fix: label padding tokens
      
      * fix: label padding tokens over list
      
      * update ln names for Whisper
      
      * try datasets iter loader
      
      * create readme and append results
      
      * style
      
      * make style
      
      * adjust lr
      
      * use pt dataloader
      
      * make fast
      
      * pin gen max len
      
      * finish
      
      * add pt to requirements for test
      
      * fix pt -> torch
      
      * add accelerate
  26. 22 Sep, 2023 1 commit
  27. 18 Sep, 2023 1 commit
  28. 11 Sep, 2023 2 commits
  29. 04 Sep, 2023 1 commit
  30. 21 Aug, 2023 1 commit
  31. 07 Aug, 2023 1 commit
• Allow `trust_remote_code` in example scripts (#25248) · 14510938
      Jackmin801 authored
      * pytorch examples
      
      * pytorch mim no trainer
      
      * cookiecutter
      
      * flax examples
      
      * missed line in pytorch run_glue
      
      * tensorflow examples
      
      * tensorflow run_clip
      
      * tensorflow run_mlm
      
      * tensorflow run_ner
      
      * tensorflow run_clm
      
      * pytorch example from_configs
      
      * pytorch no trainer examples
      
      * Revert "tensorflow run_clip"
      
      This reverts commit 261f86ac1f1c9e05dd3fd0291e1a1f8e573781d5.
      
      * fix: duplicated argument
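The pattern added across the scripts above is a CLI flag that is forwarded to the model-loading call. A hedged sketch with stdlib `argparse` (the actual scripts use `HfArgumentParser` dataclass fields, and the model call at the end is shown only as a comment):

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument(
    "--trust_remote_code",
    action="store_true",
    help="Allow executing custom modeling code downloaded from the Hub.",
)

# Simulate invoking a script with the flag set.
args = parser.parse_args(["--trust_remote_code"])
print(args.trust_remote_code)
# -> True

# The value would then be forwarded to loading calls, e.g.:
# AutoModel.from_pretrained(model_name, trust_remote_code=args.trust_remote_code)
```

Defaulting the flag to off (`store_true`) matters here: executing remote code must be an explicit opt-in by the user.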
  32. 02 Aug, 2023 1 commit
  33. 28 Jul, 2023 2 commits
  34. 17 Jul, 2023 1 commit
  35. 12 Jul, 2023 1 commit
  36. 07 Jun, 2023 1 commit
  37. 22 May, 2023 1 commit