Commits · 5e673ed2dc73249b0195ebea305dbad1e4b7cf2a · chenpangpang / transformers

08 Apr, 2024 1 commit

updated examples/pytorch/language-modeling scripts and requirements.txt to... · 5e673ed2

Haz Sameen Shahgir authored Apr 08, 2024

updated examples/pytorch/language-modeling scripts and requirements.txt to require datasets>=2.14.0 (#30120)

updated requirements.txt and require_version() calls in examples/pytorch/language-modeling to require datasets>=2.14.0

5e673ed2

21 Mar, 2024 1 commit
- Add support for `torch_dtype` in the run_mlm example (#29776) · ef6e371d
  Jacky Lee authored Mar 21, 2024
```
feat: add support for torch_dtype
Co-authored-by: Jacky Lee <jackylee328@gmail.com>
```
  ef6e371d
20 Mar, 2024 1 commit
- v4.40.0.dev.0 · 1248f092
  Arthur Zucker authored Mar 20, 2024
  
  1248f092
11 Mar, 2024 1 commit

Make torch xla available on GPU (#29334) · 873d9bb3

Yitong Huang authored Mar 11, 2024



* add USE_TORCH_XLA env

* rename torch_tpu to torch_xla

* better is_torch_xla_available; fix some fsdp and performance issues

* fix format

* fix bug when pjrt_device is cpu

* fix bug

* fix the deprecation handling

---------
Co-authored-by: anw90 <ang868@gmail.com>
Co-authored-by: wangang.wa <wangang.wa@alibaba-inc.com>

873d9bb3

21 Feb, 2024 1 commit
- v4.39.dev.0 · 1a77f07f
  Arthur Zucker authored Feb 21, 2024
  
  1a77f07f
01 Feb, 2024 1 commit
- [docs] fix some bugs about parameter description (#28806) · d98591a1
  zspo authored Feb 02, 2024
```
Co-authored-by: p_spozzhang <p_spozzhang@tencent.com>
```
  d98591a1
19 Jan, 2024 1 commit
- v4.38.dev.0 · b2748a6e
  Amy Roberts authored Jan 19, 2024
  
  b2748a6e
11 Jan, 2024 1 commit

Set `cache_dir` for `evaluate.load()` in example scripts (#28422) · 95091e15

Alex Hedges authored Jan 11, 2024

While using `run_clm.py`,[^1] I noticed that some files were being added
to my global cache, not the local cache. I set the `cache_dir` parameter
for the one call to `evaluate.load()`, which partially solved the
problem. I figured that while I was fixing the one script upstream, I
might as well fix the problem in all other example scripts that I could.

There are still some files being added to my global cache, but this
appears to be a bug in `evaluate` itself. This commit at least moves
some of the files into the local cache, which is better than before.

To create this PR, I made the following regex-based transformation:
`evaluate\.load\((.*?)\)` -> `evaluate\.load\($1,
cache_dir=model_args.cache_dir\)`. After using that, I manually fixed
all modified files with `ruff` serving as useful guidance. During the
process, I removed one existing usage of the `cache_dir` parameter in a
script that did not have a corresponding `--cache-dir` argument
declared.

[^1]: I specifically used `pytorch/language-modeling/run_clm.py` from
v4.34.1 of the library. For the original code, see the following URL:
https://github.com/huggingface/transformers/tree/acc394c4f5e1283c19783581790b3dc3105a3697/examples/pytorch/language-modeling/run_clm.py.

95091e15
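
The regex-based transformation described in commit 95091e15 can be sketched with Python's `re` module. This is a minimal illustration, not the author's actual tooling: the helper name `add_cache_dir` is hypothetical, and Python writes the first capture group as `\1` where the commit message writes `$1`.

```python
import re

# Pattern quoted in the commit message: lazily capture the arguments of evaluate.load(...).
PATTERN = re.compile(r"evaluate\.load\((.*?)\)")

# Replacement: keep the captured arguments and append the cache_dir keyword.
REPLACEMENT = r"evaluate.load(\1, cache_dir=model_args.cache_dir)"


def add_cache_dir(source: str) -> str:
    """Hypothetical helper: rewrite every evaluate.load(...) call found in `source`."""
    return PATTERN.sub(REPLACEMENT, source)


before = 'metric = evaluate.load("accuracy")'
print(add_cache_dir(before))
# prints: metric = evaluate.load("accuracy", cache_dir=model_args.cache_dir)
```

Because the capture is lazy and stops at the first closing parenthesis, calls with nested parentheses in their arguments are rewritten incorrectly, which is consistent with the manual clean-up pass the commit message mentions.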

13 Dec, 2023 1 commit
- Dev version · 3ed3e319
  Lysandre authored Dec 13, 2023
  
  3ed3e319
17 Nov, 2023 1 commit
- Broken links fixed related to datasets docs (#27569) · ffbcfc01
  V.Prasanna kumar authored Nov 18, 2023
```
fixed the broken links belonging to the datasets library of transformers
```
  ffbcfc01
02 Nov, 2023 1 commit
- Dev version · bc78fd12
  Lysandre authored Nov 02, 2023
  
  bc78fd12
31 Oct, 2023 1 commit
- Unify warning styles for better readability (#27184) · 25e6e941
  Dong-geon Lee authored Nov 01, 2023
  
  25e6e941
27 Oct, 2023 1 commit
- Provide alternative when warning on use_auth_token (#27105) · 66b088fa
  Lucain authored Oct 27, 2023
  
  66b088fa
12 Oct, 2023 1 commit
- Add many missing spaces in adjacent strings (#26751) · 40ea9ab2
  Tom Aarsen authored Oct 12, 2023
```
Add missing spaces in adjacent strings
```
  40ea9ab2
03 Oct, 2023 1 commit
- v4.35.0.dev0 · bd620591
  Lysandre authored Oct 03, 2023
  
  bd620591
11 Sep, 2023 1 commit
- docs: update link huggingface map (#26077) · 9cebae64
  Phuc Van Phan authored Sep 11, 2023
  
  9cebae64
04 Sep, 2023 1 commit
- v4.34.dev.0 · d8e13b3e
  Lysandre authored Sep 04, 2023
  
  d8e13b3e
21 Aug, 2023 1 commit
- v4.33.0.dev0 · 5c67682b
  Sylvain Gugger authored Aug 21, 2023
  
  5c67682b
07 Aug, 2023 1 commit

Allow `trust_remote_code` in example scripts (#25248) · 14510938

Jackmin801 authored Aug 07, 2023

* pytorch examples

* pytorch mim no trainer

* cookiecutter

* flax examples

* missed line in pytorch run_glue

* tensorflow examples

* tensorflow run_clip

* tensorflow run_mlm

* tensorflow run_ner

* tensorflow run_clm

* pytorch example from_configs

* pytorch no trainer examples

* Revert "tensorflow run_clip"

This reverts commit 261f86ac1f1c9e05dd3fd0291e1a1f8e573781d5.

* fix: duplicated argument

14510938

02 Aug, 2023 1 commit

Add `token` argument in example scripts (#25172) · 149cb0cc

Yih-Dar authored Aug 02, 2023



* fix

* fix

* fix

* fix

* fix

* fix

* fix

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

149cb0cc

28 Jul, 2023 1 commit

Update `use_auth_token` -> `token` in example scripts (#25167) · d53b8ad7

Yih-Dar authored Jul 28, 2023



* pytorch examples

* tensorflow examples

* flax examples

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

d53b8ad7
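
A rename such as `use_auth_token` -> `token` is commonly paired with a small deprecation shim so that older invocations keep working for a while. The sketch below shows that general pattern only. The function name `resolve_token`, the argument types, and the message wording are assumptions for illustration, not code taken from the example scripts.

```python
import warnings
from typing import Optional


def resolve_token(token: Optional[str], use_auth_token: Optional[str]) -> Optional[str]:
    """Hypothetical helper: prefer `token`, fall back to the deprecated argument."""
    if use_auth_token is None:
        # Nothing deprecated was passed, so `token` (possibly None) is used as-is.
        return token

    # The old argument was supplied: warn, then make sure the two do not conflict.
    warnings.warn(
        "`use_auth_token` is deprecated; please use `token` instead.",
        FutureWarning,
    )
    if token is not None:
        raise ValueError(
            "`token` and `use_auth_token` are both specified. Please set only `token`."
        )
    return use_auth_token
```

Raising when both arguments are set avoids silently picking one of two possibly different credentials.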

20 Jul, 2023 1 commit
- Change logic for logging in the examples (#24956) · aa1b09c5
  Zach Mueller authored Jul 20, 2023
```
Change logic
```
  aa1b09c5
17 Jul, 2023 1 commit
- 4.32.0.dev0 · e9ad5130
  Sylvain Gugger authored Jul 17, 2023
  
  e9ad5130
07 Jun, 2023 1 commit
- v4.31.0.dev0 · ba695c1e
  Sylvain Gugger authored Jun 07, 2023
  
  ba695c1e
18 May, 2023 1 commit

fix bug in group_texts function, that was inserting short batches (#23429) · a7920065

Boda Sadallah authored May 18, 2023

* fix bug in group_texts function, that was inserting short batches

* fully exclude short batches and return empty dict instead

* fix style

a7920065
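
The behaviour described in commit a7920065 (dropping batches shorter than the block size) can be illustrated with a simplified grouping function. This is a sketch under stated assumptions rather than the script's exact code: here `block_size` is passed as a parameter, the batch is assumed to contain an `input_ids` column, and a too-short batch yields empty lists for every column.

```python
from itertools import chain


def group_texts(examples, block_size):
    """Concatenate tokenized texts and split them into chunks of exactly block_size."""
    # Flatten each column (e.g. input_ids) into one long list.
    concatenated = {k: list(chain(*examples[k])) for k in examples}
    total_length = len(next(iter(concatenated.values())))

    # Round down to a multiple of block_size. If the batch is shorter than
    # block_size this becomes 0, so no short chunk is ever emitted.
    total_length = (total_length // block_size) * block_size

    result = {
        k: [tokens[i : i + block_size] for i in range(0, total_length, block_size)]
        for k, tokens in concatenated.items()
    }
    # For causal language modeling the labels are a copy of the inputs.
    result["labels"] = [chunk.copy() for chunk in result["input_ids"]]
    return result


batch = {"input_ids": [[1, 2, 3], [4, 5], [6, 7, 8, 9]]}
print(group_texts(batch, block_size=4)["input_ids"])
# prints: [[1, 2, 3, 4], [5, 6, 7, 8]]

print(group_texts({"input_ids": [[1, 2]]}, block_size=4))
# prints: {'input_ids': [], 'labels': []}
```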

09 May, 2023 1 commit
- v4.30.0.dev0 · a0c0a782
  Sylvain Gugger authored May 09, 2023
  
  a0c0a782
13 Apr, 2023 1 commit
- v4.29.0.dev0 · 888c4a2a
  Sylvain Gugger authored Apr 12, 2023
  
  888c4a2a
22 Mar, 2023 1 commit

add low_cpu_mem_usage option in run_clm.py example which will benefit… (#22288) · 4ccaf268

Wang, Yi authored Mar 22, 2023



* add low_cpu_mem_usage option in run_clm.py example which will benefit LLM loading
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>

* update all the example and README under language-modeling
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>

---------
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>

4ccaf268

14 Mar, 2023 1 commit
- v4.28.0.dev0 · ebdb185b
  Sylvain Gugger authored Mar 14, 2023
  
  ebdb185b
22 Feb, 2023 1 commit
- Respect documentation on passive log level (#21700) · b19d64d8
  Sylvain Gugger authored Feb 22, 2023
```
* Respect documentation on passive log level

* Fix test and set log level in examples

* Add doc
```
  b19d64d8
06 Feb, 2023 2 commits

Update quality tooling for formatting (#21480) · 6f79d264

Sylvain Gugger authored Feb 06, 2023

* Result of black 23.1

* Update target to Python 3.7

* Switch flake8 to ruff

* Configure isort

* Configure isort

* Apply isort with line limit

* Put the right black version

* adapt black in check copies

* Fix copies

6f79d264

[examples] improve block_size warning message (#21463) · 3b9a1dc1
Stas Bekman authored Feb 06, 2023

3b9a1dc1

31 Jan, 2023 1 commit
- Simplify column_names in run_clm/mlm (#21382) · 074d6b75
  Quentin Lhoest authored Jan 31, 2023
```
* simplify column_names in run_clm

* simplify column_names in run_mlm

* minor
```
  074d6b75
30 Jan, 2023 1 commit

[`run_(clm|mlm).py` examples] add streaming dataset support (#21343) · 98d88b23

Stas Bekman authored Jan 30, 2023

* [run_clm example] add streaming dataset support

* unrefactor kwargs

* fix

* fix

* require datasets>=2.0.0

* port to mlm

98d88b23

23 Jan, 2023 1 commit
- v4.27.0.dev0 · 7119bb05
  Sylvain Gugger authored Jan 23, 2023
  
  7119bb05
01 Dec, 2022 1 commit
- v4.26.0.dev0 · 60d1f31b
  Sylvain Gugger authored Dec 01, 2022
  
  60d1f31b
03 Nov, 2022 1 commit
- Only resize embeddings when necessary (#20043) · 06886d5a
  Sylvain Gugger authored Nov 03, 2022
```
* Only resize embeddings when necessary

* Add comment
```
  06886d5a
01 Nov, 2022 1 commit
- v4.25.0.dev0 · c3a93d8d
  Sylvain Gugger authored Oct 31, 2022
  
  c3a93d8d
10 Oct, 2022 1 commit
- Dev version · 10100979
  Lysandre authored Oct 10, 2022
  
  10100979
20 Sep, 2022 1 commit
- Add a missing space in a script arg documentation (#19113) · 06f341de
  Santiago Castro authored Sep 20, 2022
  
  06f341de