Commits · 6c08840628a22a4d53ae563d1041479649d1a8e7 · chenpangpang / transformers

01 Oct, 2021 1 commit

[Examples] Add an official audio classification example (#13722) · 42137280

Anton Lozhkov authored Oct 01, 2021



* Restore broken merge

* Additional args, DDP, remove CommonLanguage

* Update examples for V100, add training results

* Style

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Remove custom datasets for simplicity, apply suggestions from code review

* Add the attention_mask flag, reorganize README
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

42137280

30 Sep, 2021 1 commit
- map only on one process (#13810) · 44eb8bde
  Patrick von Platen authored Sep 30, 2021
  
  44eb8bde
29 Sep, 2021 1 commit
- [examples `run_glue.py`] missing requirements `scipy`, `sklearn` (#13768) · b90096fe
  Stas Bekman authored Sep 29, 2021
```
* missing requirement

* list both
```
  b90096fe
27 Sep, 2021 2 commits
- Docs for version v4.11.0 · 11c69b80
  Lysandre authored Sep 27, 2021
  
  11c69b80
- Release: v4.11.0 · dc193c90
  Lysandre authored Sep 27, 2021
  
  dc193c90
26 Sep, 2021 1 commit
- Update requirements for speech example (#13745) · 044eff5b
  Sylvain Gugger authored Sep 26, 2021
  
  044eff5b
24 Sep, 2021 5 commits

Update README.md · 469b80d4
Patrick von Platen authored Sep 24, 2021

469b80d4
up (#13733) · 493643ff
Patrick von Platen authored Sep 24, 2021

493643ff
Add model card creation snippet to example scripts (#13730) · 38580455
Gunjan Chhablani authored Sep 24, 2021
```
* Update run_glue.py

* Update run_glue.py

* Add model creation snippet to other scripts

* Fix style
```
38580455
Update README.md · 95f888fd
Patrick von Platen authored Sep 24, 2021

95f888fd

[ASR] Add official ASR CTC example to `examples/pytorch/speech-recognition` (#13620) · 4a320f6c

Patrick von Platen authored Sep 24, 2021



* up

* rename

* add asr example

* add auto feature extractor

* some more fixes

* correct layerdrop

* correct for multi-gpu dist

* clean up

* refactor

* refactor

* more fixes

* more fixes

* clean-up

* finish

* up

* Apply suggestions from code review

* fix isort

* update

* up

* add note

* apply surajs suggestions

* Apply suggestions from code review
Co-authored-by: Suraj Patil <surajp815@gmail.com>

* isort

* small change

* Apply suggestions from code review
Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>

* Apply suggestions from code review
Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>

* add hubert

* Update examples/pytorch/speech-recognition/run_speech_recognition_ctc.py
Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>

4a320f6c

22 Sep, 2021 1 commit

Make gradient_checkpointing a training argument (#13657) · 27d46397

Sylvain Gugger authored Sep 22, 2021



* Make gradient_checkpointing a training argument

* Update src/transformers/modeling_utils.py
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Update src/transformers/configuration_utils.py
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Fix tests

* Style

* document Gradient Checkpointing as a performance feature

* Small rename

* PoC for not using the config

* Adapt BC to new PoC

* Forgot to save

* Rollout changes to all other models

* Fix typo
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Stas Bekman <stas@stason.org>

27d46397

21 Sep, 2021 1 commit

Add push_to_hub to no_trainer examples (#13659) · b7d264be

Sylvain Gugger authored Sep 21, 2021

* Add push_to_hub to no_trainer examples

* Quality

* Document integration

* Roll out to other examples

b7d264be

20 Sep, 2021 1 commit
- fix typo (#13647) · 87d5057d
  Suraj Patil authored Sep 20, 2021
  
  87d5057d
15 Sep, 2021 1 commit
- [Pretrained Model] Add resize_position_embeddings (#13559) · 95f933ea
  Patrick von Platen authored Sep 15, 2021
```
* finish

* delete bogus file

* correct some stuff

* finish

* finish
```
  95f933ea
09 Sep, 2021 1 commit

Fix typo in documentation (#13494) · 008c2d0b

Aleksander Smywiński-Pohl authored Sep 09, 2021

* Fix typo in deepspeed documentation

* Add missing import in deepspeed configuration

* Fix path in translation examples

008c2d0b

07 Sep, 2021 1 commit

Fix img classification tests (#13456) · 79815090

Nathan Raw authored Sep 07, 2021

* ✅ Update image-classification example's tests

* 🔥 remove cats_and_dogs test samples

* 💄 fix flake8

79815090

06 Sep, 2021 2 commits
- skip image classification test (#13451) · 2dd975b2
  Suraj Patil authored Sep 06, 2021
  
  2dd975b2
- add torchvision in example test requirements (#13438) · 6b29bff8
  Suraj Patil authored Sep 06, 2021
  
  6b29bff8
02 Sep, 2021 1 commit

✨ Add PyTorch image classification example (#13134) · 76c4d8bf

Nathan Raw authored Sep 02, 2021

* ✨ add pytorch image classification example

* 🔥 remove utils.py

* 💄 fix flake8 style issues

* 🔥 remove unnecessary line

* ✨ limit dataset sizes

* 📌 update reqs

* 🎨 restructure - use datasets lib

* 🎨 import transforms directly

* 📝 add comments

* 💄 style

* 🔥 remove flag

* 📌 update requirement warning

* 📝 add vision README.md

* 📝 update README.md

* 📝 update README.md

* 🎨 add image-classification tag to model card

* 🚚 rename vision ➡️ image-classification

* 📝 update image-classification README.md

76c4d8bf

31 Aug, 2021 3 commits
- Docs for v4.10.0 · 5ee67a44
  Lysandre authored Aug 31, 2021
  
  5ee67a44
- Release: v4.10.0 · d12bbe49
  Lysandre authored Aug 31, 2021
  
  d12bbe49
- Add generate kwargs to Seq2SeqTrainingArguments (#13339) · c76de105
  Sylvain Gugger authored Aug 31, 2021
```
* Add generate kwargs to Seq2SeqTrainingArguments

* typo

* Address review comments + doc

* Style
```
  c76de105
30 Aug, 2021 1 commit
- Update label2id in the model config for run_glue (#13334) · 139e8301
  Sylvain Gugger authored Aug 30, 2021
  
  139e8301
28 Aug, 2021 1 commit

examples: only use keep_linebreaks when reading TXT files (#13320) · 4046e66e

Stefan Schweter authored Aug 28, 2021

* examples: only use keep_linebreaks when reading TXT files for all CLM examples

* examples: only use keep_linebreaks when reading TXT files for all CLM examples

* examples: only use keep_linebreaks when reading TXT files for all CLM examples
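
A minimal sketch of the behaviour this commit describes, not the code from the example scripts themselves: the `keep_linebreaks` option is forwarded to the loader only when the input is a TXT file. The helper name `loader_config` and the file names are hypothetical; the `"text"` loader name and its `keep_linebreaks` argument are assumed to match the `datasets` library.

```python
from typing import Any, Dict, Tuple


def loader_config(data_file: str, keep_linebreaks: bool = True) -> Tuple[str, Dict[str, Any]]:
    """Return the loader name and extra loader arguments for a data file (hypothetical helper)."""
    extension = data_file.split(".")[-1]
    dataset_args: Dict[str, Any] = {}
    if extension == "txt":
        # Plain-text files go through the "text" loader, the only one assumed to accept keep_linebreaks
        extension = "text"
        dataset_args["keep_linebreaks"] = keep_linebreaks
    return extension, dataset_args


print(loader_config("corpus.txt", keep_linebreaks=False))
print(loader_config("corpus.csv", keep_linebreaks=False))
```

With this shape, CSV and JSON inputs never receive an argument their loader may not understand.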

4046e66e

27 Aug, 2021 1 commit

examples: add keep_linebreaks option to CLM examples (#13150) · 319d840b

Stefan Schweter authored Aug 27, 2021

* examples: add keep_linebreaks option to text dataset loader for all CLM examples

* examples: introduce new keep_linebreaks option as data argument in CLM examples

319d840b

19 Aug, 2021 1 commit
- Update namespaces inside torch.utils.data to the latest. (#13167) · 91ff480e
  Allan Lin authored Aug 19, 2021
```
* Update torch.utils.data namespaces to the latest.

* Format

* Update Dataloader.

* Style
```
  91ff480e
06 Aug, 2021 1 commit

Tpu tie weights (#13030) · 7fcee113

Sylvain Gugger authored Aug 06, 2021

* Fix tied weights on TPU

* Manually tie weights in no trainer examples

* Fix for test

* One last missing

* Getting owned by my scripts

* Address review comments

* Fix test

* Fix tests

* Fix reformer tests

7fcee113

02 Aug, 2021 1 commit
- fix typo in example/text-classification README (#12974) · 75b8990d
  Chungman Lee authored Aug 02, 2021
```
* fix typo in example/text-classification README

* add space to align the table
```
  75b8990d
28 Jul, 2021 2 commits
- Fix QA examples for roberta tokenizer (#12928) · 3ec851dc
  Sylvain Gugger authored Jul 28, 2021
  
  3ec851dc
- Add option to set max_len in run_ner (#12929) · fd85734e
  Sylvain Gugger authored Jul 28, 2021
  
  fd85734e
26 Jul, 2021 1 commit
- Add accelerate to examples requirements (#12888) · 303989de
  Sylvain Gugger authored Jul 26, 2021
  
  303989de
22 Jul, 2021 3 commits
- Docs for v4.10.0dev0 · 40de2d5a
  Lysandre authored Jul 22, 2021
  
  40de2d5a
- Release: v4.9.0 · 72aee83c
  Lysandre authored Jul 22, 2021
  
  72aee83c
- Fix type of max_seq_length arg in run_swag.py (#12832) · fcf83011
  Maxwell Forbes authored Jul 21, 2021
  
  fcf83011
08 Jul, 2021 1 commit
- Fix group_lengths for short datasets (#12558) · 6f1adc43
  Sylvain Gugger authored Jul 08, 2021
  
  6f1adc43
07 Jul, 2021 1 commit

MLM training fails with no validation file (same as #12406 for pytorch now) (#12517) · 1d6623c6

Souvic Chakraborty authored Jul 07, 2021

* Validation split percentage to be used for custom data files also

Same issue as https://github.com/huggingface/transformers/issues/12406, now fixed for the PyTorch run_mlm.py script.

* Validation split added in the right place

* Update run_clm.py

* validation split added for custom files

* Validation split added for custom files

* Update run_plm.py

* fixed validation split for custom files as input for pytorch examples in lm

* Update run_clm_no_trainer.py

* args modified
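
One way to picture the validation split for custom files, as a hedged sketch rather than the code in these scripts: when no validation file is given, a percentage of the training data is carved off for validation. The helper `split_specs` is hypothetical; the strings it builds assume the split-slicing syntax of the `datasets` library and would be passed as the `split` argument of `load_dataset`.

```python
from typing import Dict


def split_specs(validation_split_percentage: int) -> Dict[str, str]:
    """Build split strings that reserve the first N percent of "train" for validation."""
    if not 0 < validation_split_percentage < 100:
        raise ValueError("validation_split_percentage must be between 1 and 99")
    pct = validation_split_percentage
    return {
        # First pct percent of the training data becomes the validation set
        "validation": f"train[:{pct}%]",
        # The remainder stays as the training set
        "train": f"train[{pct}%:]",
    }


print(split_specs(5))
```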

1d6623c6

28 Jun, 2021 2 commits

[Examples] Added context manager to datasets map (#12367) · 04dbea31

Bhadresh Savani authored Jun 28, 2021

* added context manager to datasets map

* fixed style and spaces

* fixed warning of deprecation

* changed desc

04dbea31

Update run_mlm.py (#12344) · 9490d668

Taha ValizadehAslani authored Jun 28, 2021

Previously the code could not be used for validation only, because this line:
extension = data_args.train_file.split(".")[-1]
assumed that the extension must be extracted from the training file. The line ran regardless of whether the user asked for training or validation, which led to an error when the user wanted to run evaluation only, because the training file does not exist in that case. I modified it to extract the extension from the training file if the user wants to train, and from the validation file if the user wants to run evaluation. This way the code can be used for training and validation separately.
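
The fix described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual change to run_mlm.py: the helper name `pick_extension` and the file names are hypothetical.

```python
from typing import Optional


def pick_extension(train_file: Optional[str], validation_file: Optional[str]) -> str:
    """Return the file extension used to choose a dataset loader (hypothetical helper)."""
    # Use the training file when one is given; otherwise fall back to the validation file
    source = train_file if train_file is not None else validation_file
    if source is None:
        raise ValueError("Need either a training file or a validation file.")
    return source.split(".")[-1]


print(pick_extension("train.csv", None))   # training run
print(pick_extension(None, "dev.json"))    # evaluation-only run
```

An evaluation-only run no longer touches the missing training file, which is the failure the commit message describes.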

9490d668

26 Jun, 2021 1 commit
- replace print with logger (#12368) · ff5cdc08
  Bhadresh Savani authored Jun 26, 2021
  
  ff5cdc08