Commits · b0d49fd5363429659d9b494d4349fefc8577e788 · chenpangpang / transformers

02 Apr, 2021 1 commit
- fixed typo: logging instead of logger (#11025) · 335c0ca3
  versis authored Apr 02, 2021
  
  335c0ca3
31 Mar, 2021 3 commits

Add `examples/language_modeling/run_mlm_no_trainer.py` (#11001) · 838f83d8

Hemil Desai authored Apr 01, 2021



* Add initial script for finetuning MLM models with accelerate

* Add evaluation metric calculation

* Fix bugs

* Use no_grad on evaluation

* update script docstring

* Update examples/language-modeling/run_mlm_no_trainer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* PR feedback

* Fix CI failure

* Update examples/language-modeling/run_mlm_no_trainer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

838f83d8

Enforce string-formatting with f-strings (#10980) · acc3bd9d

Sylvain Gugger authored Mar 31, 2021



* First third

* Styling and fix mistake

* Quality

* All the rest

* Treat %s and %d

* typo

* Missing )

* Apply suggestions from code review
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

acc3bd9d
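
As a rough illustration of the kind of change this commit describes (not taken from the actual diff), the sketch below writes the same message with `%` formatting, with `str.format`, and as an f-string. The variable names and values are made up for the example.

```python
# Hypothetical values, chosen only for illustration.
model_name = "bert-base-uncased"
num_steps = 500

# Older styles: %-formatting with %s / %d, and str.format.
percent_style = "Loaded %s, training for %d steps" % (model_name, num_steps)
format_style = "Loaded {}, training for {} steps".format(model_name, num_steps)

# f-string equivalent: the expressions are written inline in the literal.
fstring_style = f"Loaded {model_name}, training for {num_steps} steps"

print(fstring_style)
```

All three strings are identical; the f-string keeps each value next to the place where it is printed.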

Fixed some typos and removed legacy url (#10989) · 645f45c4

WybeKoper authored Mar 31, 2021



* Fixed typos

* Removed legacy colab notebook from readme
Co-authored-by: WybeKoper <WybeKoper@users.noreply.github.com>

645f45c4

30 Mar, 2021 2 commits
- fix md file to avoid evaluation crash (#10962) · e031162a
  Yih-Dar authored Mar 30, 2021
  
  e031162a
- [examples/s2s] added py7zr dep (#10971) · 3e09d813
  Philipp Schmid authored Mar 30, 2021
```
* added py7zr

* comment out check_min for sagemaker test

* added min version again
```
  3e09d813
29 Mar, 2021 5 commits

[vulnerability] dep fix (#10954) · 05c966f2

Stas Bekman authored Mar 29, 2021

Fixes https://github.com/huggingface/transformers/security/dependabot/examples/research_projects/lxmert/requirements.txt/Pygments/open

@LysandreJik

05c966f2

Add `examples/multiple-choice/run_swag_no_trainer.py` (#10934) · 5057213b

Daniel Stancl authored Mar 29, 2021

* Initial commit

* Another bunch of updates

* make style quality + delete debug arg from bash script

* Use compute_metrics func

* Do a few fixes

* Add copyright

* Fix typos

5057213b

Remove duplicate code · 4002f95e
Sylvain Gugger authored Mar 29, 2021

4002f95e

Add `examples/run_ner_no_trainer.py` (#10902) · d7b50ce4

Daniel Stancl authored Mar 29, 2021

* Add NER example with accelerate library

* This commit contains the first (still quite unfinished)
version of a script showing how to train a HuggingFace model
with the new accelerate library.

* Fix metric calculation

* make style quality

* mv ner_no_trainer to token-classification dir

* Delete --debug flag from running script

* hf_datasets -> raw_datasets

* Make a few slight adjustments

* Add an informative comment + rewrite a help comment

* Change header

* Fix a few things

* Enforce the use of fast tokenizers only

* DataCollatorWithPadding -> DataCollatorForTokenClassification

* Change bash script: python3 -> accelerate launch

* make style

* Add a few missing things (see below)

* Add max-length padding to predictions and labels to
enable accelerate gather functionality

* Add PyTorch no trainer example to the example README.md

* Remove --do-train from args as being redundant for now

* DataCollatorWithPadding -> DataCollatorForTokenClassification

* Remove some obsolete args.do_train conditions from the script

* Delete --do_train from bash running script

* Delete use_slow_tokenizer from args

* Add unintentionally removed flag --label_all_tokens

* Delete --debug flag from running script

d7b50ce4
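
The padding step mentioned in this commit can be sketched in plain Python. Gather operations across processes generally expect every process to contribute arrays of the same shape, so variable-length predictions and labels are padded to a common length first and the padding is dropped afterwards. The helper names and the pad value `-100` are assumptions for this sketch; the real script works on tensors rather than lists.

```python
from typing import List

PAD_INDEX = -100  # assumed pad value; treated as "ignore" in this sketch


def pad_to_max_length(batch: List[List[int]], pad_index: int = PAD_INDEX) -> List[List[int]]:
    """Right-pad every sequence in `batch` to the length of the longest one."""
    max_length = max((len(seq) for seq in batch), default=0)
    return [seq + [pad_index] * (max_length - len(seq)) for seq in batch]


def strip_padding(seq: List[int], pad_index: int = PAD_INDEX) -> List[int]:
    """Drop padded positions again once the sequences have been gathered."""
    return [item for item in seq if item != pad_index]


# Hypothetical per-sequence predictions of different lengths.
predictions = [[1, 2, 3], [4], [5, 6]]
padded = pad_to_max_length(predictions)
restored = [strip_padding(seq) for seq in padded]
print(padded)
```

After padding, every row has the same length, which is the property a gather step relies on.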

Updated colab links in readme of examples (#10932) · ddea8771
WybeKoper authored Mar 29, 2021
```
Co-authored-by: WybeKoper <WybeKoper@users.noreply.github.com>
```
ddea8771

28 Mar, 2021 1 commit
- fixed filename (#10939) · 4f21e1dd
  Bhadresh Savani authored Mar 28, 2021
  
  4f21e1dd
26 Mar, 2021 1 commit

[vulnerability] fix dependency (#10914) · 3c27d246

Stas Bekman authored Mar 26, 2021

this PR fixes https://github.com/huggingface/transformers/security/dependabot/examples/research_projects/lxmert/requirements.txt/PyYAML/open

3c27d246

25 Mar, 2021 1 commit
- run_glue_no_trainer: datasets -> raw_datasets (#10898) · 5f1491d3
  Jethro Kuan authored Mar 25, 2021
```
Use the correct variable (raw_datasets) instead of the module (datasets)
where appropriate.
```
  5f1491d3
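
The mix-up this commit fixes is easy to reproduce without the real library. In the sketch below, `datasets` is a stand-in module object created with `types.ModuleType` and `raw_datasets` is a plain dict of splits; both are assumptions for illustration, not the actual objects used by the script.

```python
import types

# Stand-in for `import datasets`: a module object with the same name.
datasets = types.ModuleType("datasets")

# Stand-in for the loaded data: a mapping from split name to examples.
raw_datasets = {"train": ["example 1", "example 2"], "validation": ["example 3"]}

# Indexing the module instead of the loaded data fails,
# because module objects are not subscriptable.
try:
    datasets["train"]
    module_lookup_failed = False
except TypeError:
    module_lookup_failed = True

# Indexing the loaded data works as intended.
train_split = raw_datasets["train"]
print(module_lookup_failed, len(train_split))
```

Keeping the loaded data under a name that differs from the module name makes this class of mistake harder to introduce.
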
23 Mar, 2021 1 commit

[Examples] Added predict stage and Updated Example Template (#10868) · 7ef40120

Bhadresh Savani authored Mar 23, 2021



* added predict stage

* added test keyword in exception message

* removed example specific saving predictions

* fixed f-string error

* removed extra line
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

7ef40120

22 Mar, 2021 6 commits

Use DataCollatorForSeq2Seq in run_summarization in all cases (#10856) · 9f8fa4e9
Eliza Szczechla authored Mar 22, 2021
```
Co-authored-by: Eliza <eliza@habanero.tiger.com.pl>
```
9f8fa4e9

feat(wandb): logging and configuration improvements (#10826) · 125ccead

Boris Dayma authored Mar 22, 2021

* feat: ensure unique artifact id

* feat: allow manual init

* fix: simplify reinit logic

* fix: no dropped value + immediate commits

* fix: wandb use in sagemaker

* docs: improve documentation and formatting

* fix: typos

* docs: improve formatting

125ccead

[vulnerability] in example deps fix (#10817) · 8fb46718

Stas Bekman authored Mar 22, 2021

Takes care of:
https://github.com/huggingface/transformers/security/dependabot/examples/research_projects/lxmert/requirements.txt/jinja2/open



@LysandreJik
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

8fb46718

Bump jinja2 from 2.11.2 to 2.11.3 in /examples/research_projects/lxmert (#10818) · dbfe3795

dependabot[bot] authored Mar 22, 2021

Bumps [jinja2](https://github.com/pallets/jinja) from 2.11.2 to 2.11.3.
- [Release notes](https://github.com/pallets/jinja/releases)
- [Changelog](https://github.com/pallets/jinja/blob/master/CHANGES.rst)
- [Commits](https://github.com/pallets/jinja/compare/2.11.2...2.11.3)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

dbfe3795

Update FINE_TUNE_XLSR_WAV2VEC2.md (#10849) · 29904a96
Qiushi Pan authored Mar 22, 2021
```
Fix typo.
```
29904a96
push (#10846) · 0f226f78
Patrick von Platen authored Mar 22, 2021

0f226f78

21 Mar, 2021 4 commits
- Update FINE_TUNE_XLSR_WAV2VEC2.md · 82b8d8c7
  Suraj Patil authored Mar 21, 2021
  
  82b8d8c7
- Update FINE_TUNE_XLSR_WAV2VEC2.md · af6125ff
  Patrick von Platen authored Mar 21, 2021
  
  af6125ff
- small improvements for wav2vec2 info script (#10829) · 5aaf6e14
  Patrick von Platen authored Mar 21, 2021
  
  5aaf6e14
- add doc for Local machine (#10828) · 68b55885
  Suraj Patil authored Mar 21, 2021
  
  68b55885
19 Mar, 2021 6 commits
- wav2vec doc tweaks (#10808) · 1438c487
  Julien Chaumond authored Mar 19, 2021
```
* wording/typos tweaks

* Make model upload instructions simpler
```
  1438c487
- Update FINE_TUNE_XLSR_WAV2VEC2.md · b9570a81
  Patrick von Platen authored Mar 19, 2021
  
  b9570a81
- Expand a bit the presentation of examples (#10799) · 946400fb
  Sylvain Gugger authored Mar 19, 2021
```
* Expand a bit the presentation of examples

* Apply suggestions from code review
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Address review comments
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
```
  946400fb
- [Example] Updating Question Answering examples for Predict Stage (#10792) · fd1d9f1a
  Bhadresh Savani authored Mar 19, 2021
```
* added prediction stage and eval fix

* style correction

* removed extra lines
```
  fd1d9f1a
- [XLSR-Wav2Vec2 Info doc] Add a couple of lines (#10806) · e8968bd0
  Patrick von Platen authored Mar 19, 2021
```
* finish

* fix

* fix

* fix

* fix
```
  e8968bd0
- addressing vulnerability report in research project deps (#10802) · 427ea3fe
  Stas Bekman authored Mar 18, 2021
```
Following up on a security alert:
https://github.com/huggingface/transformers/security/dependabot/examples/research_projects/lxmert/requirements.txt/Pillow/open
```
  427ea3fe
18 Mar, 2021 8 commits

Update FINE_TUNE_XLSR_WAV2VEC2.md · 2ae67822
Patrick von Platen authored Mar 19, 2021

2ae67822
Update FINE_TUNE_XLSR_WAV2VEC2.md · 68a32159
Patrick von Platen authored Mar 19, 2021

68a32159
Update FINE_TUNE_XLSR_WAV2VEC2.md · 03df3fbc
Patrick von Platen authored Mar 19, 2021

03df3fbc

Add XLSR-Wav2Vec2 Fine-Tuning README.md (#10786) · e84adbed

Patrick von Platen authored Mar 19, 2021

* upload

* upload fine-tuning script

* improve

* adapt

* Apply suggestions from code review

* correct

* upload

* finalize

* remove @

* correct typos

e84adbed

[examples/seq2seq/README.md] fix t5 examples (#10734) · 9352b515

Stas Bekman authored Mar 18, 2021

* [examples/seq2seq] fix t5 examples

This PR:
* fixes T5 examples to include `--source_prefix` - it's **not** optional. If you give it a try you will see that you get 10x worse bleu scores w/o it: w/ `27.6849`, w/o `2.374`
* adds a normal translation example w/o the peculiarities of MBart and T5
* reduces the default max samples to 50 so it's much faster to test quickly

summarization seems to be broken for t5 score-wise: https://github.com/huggingface/transformers/issues/10733

@sgugger

* specify explicitly the t5 models requiring the special handling

* one more

* update the t5 summarization example to use cnn_dailymail

* move max*samples into the top level README.md

* better wording

* better wording

9352b515
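
To make the role of `--source_prefix` concrete, here is a minimal sketch of prepending a task prefix to every input before tokenization. The function name and the prefix string are assumptions for illustration; the commit message only states that the prefix is required for T5.

```python
from typing import List, Optional


def add_source_prefix(inputs: List[str], source_prefix: Optional[str]) -> List[str]:
    """Prepend the task prefix to every input text (no-op if no prefix is given)."""
    prefix = source_prefix if source_prefix is not None else ""
    return [prefix + text for text in inputs]


# Hypothetical source sentences and an example task prefix.
sentences = ["The house is wonderful.", "I like apples."]
with_prefix = add_source_prefix(sentences, "translate English to German: ")
without_prefix = add_source_prefix(sentences, None)

print(with_prefix[0])
```

Without the prefix the model receives the bare sentence and has no indication of which task it is expected to perform, which fits the large score gap reported above.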

[file_utils] do not gobble certain kinds of requests.ConnectionError (#10235) · 4f3e93cf

Julien Chaumond authored Mar 18, 2021



* do not gobble certain kinds of requests.ConnectionError

* Apply review comments
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>

4f3e93cf

add run_common_voice script (#10767) · 5f19c07a

Suraj Patil authored Mar 18, 2021

* add initial script

* finish script

* add shell script example

* accept chars_to_ignore as cl arg

* align the script with other example scripts

* add torchaudio dep

5f19c07a

wav2vec2: support datasets other than LibriSpeech (#10581) · af8afdc8

Mohamed El-Geish authored Mar 18, 2021

* wav2vec2: support datasets other than LibriSpeech

* Formatting run_asr.py to pass code quality test

* bundled orthography options and added verbose logs

* fixing a typo in timit fine-tuning script

* update comment for clarity

* resize_lm_head and load custom vocab from file

* adding a max_duration_in_seconds filter

* do not assign `duration_filter` lambda, use a def

* log untransliterated text as well

* fix base model for arabic

* fix duration filter when target_sr is not set

* drop duration_in_seconds when unneeded

* script for wav2vec2-large-lv60-timit-asr

* fix for "tha" in arabic corpus (huggingface#10581)

* adding more options to work with common_voice

* PR feedback (huggingface#10581)

* small README change

af8afdc8
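
Two of the items above, the `max_duration_in_seconds` filter and the switch from an assigned lambda to a `def`, can be sketched together. The example layout (the `speech` and `sampling_rate` keys) and the threshold are assumptions for this sketch and are not taken from the script itself.

```python
from typing import Any, Dict, List

MAX_DURATION_IN_SECONDS = 2.0  # assumed threshold for this sketch


# A named function instead of `duration_filter = lambda example: ...`;
# it gets a proper name in tracebacks and room for a docstring.
def duration_filter(example: Dict[str, Any]) -> bool:
    """Keep an example only if its audio is at most MAX_DURATION_IN_SECONDS long."""
    duration_in_seconds = len(example["speech"]) / example["sampling_rate"]
    return duration_in_seconds <= MAX_DURATION_IN_SECONDS


# Hypothetical examples: one second and five seconds of audio at 16 kHz.
examples: List[Dict[str, Any]] = [
    {"speech": [0.0] * 16_000, "sampling_rate": 16_000},
    {"speech": [0.0] * 80_000, "sampling_rate": 16_000},
]

kept = [example for example in examples if duration_filter(example)]
print(len(kept))
```

Only the one-second example passes the filter; the five-second one exceeds the assumed limit and is dropped.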

17 Mar, 2021 1 commit

[examples] document resuming (#10776) · 39373919

Stas Bekman authored Mar 17, 2021



* document resuming in examples

* fix

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* put trainer code last, adjust notes
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

39373919