Commits · 99fd3eb4a5449a24918b3981ebc42ebd3bd10dfc · chenpangpang / transformers

15 Mar, 2022 1 commit

Add the XTREME-S fine-tuning example (#15985) · 99fd3eb4

Anton Lozhkov authored Mar 16, 2022

* CTC+classification draft

* CTC+classification draft

* style

* multilingual runs

* Fix race condition during processor.from_pretrained

* Merge covost experiments

* Add README

* Quality

* Switch to .all configs

* Fix typos

99fd3eb4

12 Mar, 2022 1 commit

[Deepspeed] add support for bf16 mode (#14569) · 580dd87c

Stas Bekman authored Mar 11, 2022



* [WIP] add support for bf16 mode

* prep for bf16

* prep for bf16

* fix; zero2/bf16 is ok

* check bf16 is available

* test fixes

* enable zero3_bf16

* config files

* docs

* split stage_dtype; merge back to non-dtype-specific config file

* fix doc

* cleanup

* cleanup

* bfloat16 => bf16 to match the PR changes

* s/zero_gather_fp16_weights_on_model_save/zero_gather_16bit_weights_on_model_save/; s/save_fp16_model/save_16bit_model/

* test fixes/skipping

* move

* fix

* Update docs/source/main_classes/deepspeed.mdx
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* backticks

* cleanup

* cleanup

* cleanup

* new version

* add note about grad accum in bf16
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

580dd87c

10 Mar, 2022 1 commit
- Update README.md · 6c9010ef
  Sanchit Gandhi authored Mar 10, 2022
  
  6c9010ef
04 Mar, 2022 1 commit
- Update README.md · b7147489
  Sanchit Gandhi authored Mar 04, 2022
  
  b7147489
02 Mar, 2022 1 commit
- Fix tiny typo (#15884) · e535c389
  Ross Johnstone authored Mar 02, 2022
  
  e535c389
21 Feb, 2022 1 commit
- Fix minor comment typos (#15740) · 5444687f
  Ivan Agarský authored Feb 21, 2022
  
  5444687f
15 Feb, 2022 1 commit
- updated with latest PL and Ray (#15653) · 80f1a591
  Shamane Siri authored Feb 16, 2022
  
  80f1a591
11 Feb, 2022 1 commit
- [research_projects] deal with security alerts (#15594) · fcb0f743
  Stas Bekman authored Feb 11, 2022
```
* [research_projects] deal with security alerts

* add a note of the original PL ver and warning
```
  fcb0f743
09 Feb, 2022 1 commit
- Upgrade black to version ~=22.0 (#15565) · 7732d0fe
  Lysandre Debut authored Feb 09, 2022
```
* Upgrade black to version ~=22.0

* Check copies

* Fix code
```
  7732d0fe
07 Feb, 2022 1 commit

Add ASR CTC streaming example (#15309) · a459f7f9

Anton Lozhkov authored Feb 07, 2022



* Single-epoch run

* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Infinite dataset

* Trainer fix + distributed benchmark

* Benchmark fix

* unused import

* interleaved splits

* interleaved splits

* has_length util

* Move to research projects

* Leftover Sized checks

* Bump min version

* Unused import

* Revert trainer changes
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

a459f7f9

31 Jan, 2022 2 commits

[Robust Speech Challenge] Add missing LR parameter (#15428) · f624249d
Jonatas Grosman authored Jan 31, 2022

f624249d

Add (M)Luke model training for Token Classification in the examples (#14880) · aa19f478

Julien Plu authored Jan 31, 2022

* Add Luke training

* Fix true label tags

* Fix true label tags

* Fix true label tags

* Update the data collator for Luke

* Some training refactor for Luke

* Improve data collator for Luke

* Fix import

* Fix datasets concatenation

* Add the --max_entity_length argument for Luke models

* Remove unused code

* Fix style issues

* Fix style issues

* Move the Luke training into a separate folder

* Fix style

* Fix naming

* Fix filtering

* Fix filtering

* Fix filter

* Update some preprocessing

* Move luke to research_projects

* Checkstyle

* Address comments

* Fix style

aa19f478

27 Jan, 2022 4 commits

Bump numpy from 1.19.2 to 1.21.0 in /examples/research_projects/lxmert (#15369) · 628b59e5

dependabot[bot] authored Jan 27, 2022

Bumps [numpy](https://github.com/numpy/numpy) from 1.19.2 to 1.21.0.
- [Release notes](https://github.com/numpy/numpy/releases)
- [Changelog](https://github.com/numpy/numpy/blob/main/doc/HOWTO_RELEASE.rst.txt)
- [Commits](https://github.com/numpy/numpy/compare/v1.19.2...v1.21.0)

---
updated-dependencies:
- dependency-name: numpy
  dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

628b59e5

Bump notebook in /examples/research_projects/visual_bert (#15368) · ca0848b2

dependabot[bot] authored Jan 27, 2022

Bumps [notebook](http://jupyter.org) from 6.1.5 to 6.4.1.

---
updated-dependencies:
- dependency-name: notebook
  dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

ca0848b2

Bump numpy in /examples/research_projects/visual_bert (#15367) · 7d45a2e8

dependabot[bot] authored Jan 27, 2022

Bumps [numpy](https://github.com/numpy/numpy) from 1.19.2 to 1.21.0.
- [Release notes](https://github.com/numpy/numpy/releases)
- [Changelog](https://github.com/numpy/numpy/blob/main/doc/HOWTO_RELEASE.rst.txt)
- [Commits](https://github.com/numpy/numpy/compare/v1.19.2...v1.21.0)

---
updated-dependencies:
- dependency-name: numpy
  dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

7d45a2e8

Add a device argument to the eval script (#15371) · 196cce6e
Anton Lozhkov authored Jan 27, 2022
```
* Device argument for the eval script

* Default to none

* isort
```
196cce6e

24 Jan, 2022 1 commit
- Update eval.py (#15310) · 4bf97415
  Patrick von Platen authored Jan 24, 2022
  
  4bf97415
21 Jan, 2022 2 commits
- [Robust Speech Challenge] Add timeline (#15274) · 11afb709
  Patrick von Platen authored Jan 21, 2022
  
  11afb709
- Move BART + ONNX example to research_projects (#15271) · 833635e2
  lewtun authored Jan 21, 2022
```
* Move BART + ONNX example to research_projects

* Add author information
```
  833635e2
20 Jan, 2022 2 commits
- Update README.md (#15246) · 85ea462c
  Anton Lozhkov authored Jan 20, 2022
```
Clarify OVH instruction
```
  85ea462c
- Update README.md (#15239) · e57468b8
  Anton Lozhkov authored Jan 20, 2022
```
Add an OVHcloud tutorial URL for the Robust Speech Challenge
```
  e57468b8
19 Jan, 2022 5 commits
- Update README.md (#15233) · 691878ee
  Patrick von Platen authored Jan 19, 2022
  
  691878ee
- fix speech event readme (#15227) · 2a5a3849
  Suraj Patil authored Jan 19, 2022
  
  2a5a3849
- Update README.md (#15226) · 6d92c429
  Patrick von Platen authored Jan 19, 2022
  
  6d92c429
- Update README.md · 19c217b4
  Patrick von Platen authored Jan 19, 2022
  
  19c217b4
- Update README.md · 5439cda7
  Patrick von Platen authored Jan 19, 2022
  
  5439cda7
18 Jan, 2022 1 commit

[Robust Speech Event] Add guides (#15155) · e118e085

Patrick von Platen authored Jan 18, 2022



* up

* improve readme

* up

* up

* more info

* up

* up

* Apply suggestions from code review
Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>

* add more stuff for eval

* update

* up

* Update README.md

* Update examples/research_projects/xls_r/README.md
Co-authored-by: Omar Sanseviero <osanseviero@users.noreply.github.com>

* apply omar's suggestions
Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
Co-authored-by: Omar Sanseviero <osanseviero@users.noreply.github.com>

e118e085

12 Jan, 2022 1 commit
- fix: switch from slow to generic tokenizer class (#15122) · aa0135f2
  Leandro von Werra authored Jan 12, 2022
  
  aa0135f2
10 Jan, 2022 1 commit

[Wav2Vec2 Speech Event] Add speech event v2 (#15083) · d72343d2

Patrick von Platen authored Jan 10, 2022

* up

* up

* up

* up

* up

* up

* improve

* up

* up

* Update src/transformers/trainer.py

* up

* up

* up

d72343d2

23 Dec, 2021 1 commit
- add custom stopping criteria to human eval script (#14897) · 1d651868
  Leandro von Werra authored Dec 23, 2021
  
  1d651868
13 Dec, 2021 1 commit

Code parrot minor fixes/niceties (#14666) · 48bf7e47

Nathan Cooper authored Dec 13, 2021



* Add some nicety flags for better controlling evaluation.

* Fix dependency issue with outdated requirement

* Add additional flag to example to ensure eval is done

* Wrap code into main function for accelerate launcher to find

* Fix valid batch size flag in readme

* Add note to install git-lfs when initializing/training the model

* Update examples/research_projects/codeparrot/scripts/arguments.py
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>

* Update examples/research_projects/codeparrot/README.md
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>

* Revert "Wrap code into main function for accelerate launcher to find"

This reverts commit ff11df1c810d4df198d04b827538eb4572147ba3.

* Fix formatting issue

* Move git-lfs instructions to installation section

* Add a quick check before code generation for code evaluation

* Fix styling issue

* Update examples/research_projects/codeparrot/scripts/human_eval.py
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>

* Make iterable dataset use passed in tokenizer rather than globally defined one
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
Co-authored-by: ncoop57 <nac33@students.uwf.edu>

48bf7e47

06 Dec, 2021 1 commit
- [urls to hub] Replace outdated model tags with their now-canonical pipeline types (#14617) · 6cdc3a78
  Julien Chaumond authored Dec 06, 2021
```
* Replace outdated model tags with their now-canonical pipeline types

* spam the CI till it's green
```
  6cdc3a78
02 Dec, 2021 1 commit

Add CodeParrot 🦜 codebase (#14536) · 43f953cc

Leandro von Werra authored Dec 02, 2021



* add readme skeleton

* update readme

* add initialization script

* add deduplication script

* add codeparrot training script

* add code generation evaluation

* add validation loss script

* add requirements

* update readme

* tweak readme

* make style

* add highlights to readme

* add CLIs to scripts

* add tokenizer training script

* add docstring to constant length dataset

* fix defaults in arguments

* update readme with cli

* move image to hub

* tweaks of readme

* fix cli commands

* add author

* explain env variables

* fix formatting

* Update examples/research_projects/codeparrot/README.md
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>

* Apply suggestions from code review
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>

* replace generic with gpt2 tokenizer
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>

43f953cc

30 Nov, 2021 1 commit

use functional interface for softmax in attention (#14198) · 6ed9882d

Thomas Viehmann authored Nov 30, 2021

* use functional interface instead of instantiating module and immediately calling it

* fix torch.nn.functional to nn.functional. Thank you Stas!

6ed9882d

22 Nov, 2021 1 commit

Switch from using sum for flattening lists of lists in group_texts (#14472) · 69e16abf

Nicholas Broad authored Nov 22, 2021



* remove sum for list flattening

* change to chain(*)

* make chain object a list

* delete empty lines

per sgugger's suggestions
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Nicholas Broad <nicholas@nmbroad.com>

69e16abf
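
The list-flattening change named in the commit above can be sketched as follows. This is an illustrative example only, not the actual diff: the helper names `flatten_with_sum` and `flatten_with_chain` and the sample `examples` dictionary are hypothetical. The idea is that `sum(list_of_lists, [])` builds a new list on every addition, while `itertools.chain` walks each sublist once.

```python
from itertools import chain


def flatten_with_sum(list_of_lists):
    # Old pattern: every "+" allocates a new list, so the cost grows
    # quadratically with the total number of elements.
    return sum(list_of_lists, [])


def flatten_with_chain(list_of_lists):
    # New pattern: chain(*...) iterates over each sublist once;
    # list(...) materialises the chain object into a plain list.
    return list(chain(*list_of_lists))


# Hypothetical batch in the shape a group_texts-style function receives:
# each key maps to a list of token lists.
examples = {"input_ids": [[1, 2, 3], [4, 5], [6]]}
concatenated = {key: list(chain(*values)) for key, values in examples.items()}
```

Both helpers return the same flattened list; only the cost differs on large batches.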

19 Nov, 2021 1 commit

Add QDQBert model and quantization examples of SQUAD task (#14066) · a59e7c1e

Shang Zhang authored Nov 19, 2021



* clean up branch for add-qdqbert-model

* README update for QAT example; update docstrings in modeling_qdqbert.py

* Update qdqbert.rst

* Update README.md

* Update README.md

* calibration data using training set; QAT example runs in fp32

* re-use BERTtokenizer for qdqbert

* Update docs/source/model_doc/qdqbert.rst
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update docs/source/model_doc/qdqbert.rst
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update docs/source/model_doc/qdqbert.rst
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* remove qdqbert tokenizer

* Update qdqbert.rst

* update evaluate-hf-trt-qa.py

* update configuration_qdqbert.py

* update modeling_qdqbert.py: add copied statement; replace assert with ValueError

* update copied from statement

* add is_quantization_available; run make fix-copies

* unittest add require_quantization

* add backend dependency to qdqbert model

* update README; update evaluate script; make style

* lint

* docs qdqbert update

* circleci build_doc add pytorch-quantization for qdqbert

* update README

* update example readme with instructions to upgrade TensorRT to 8.2

* Update src/transformers/models/qdqbert/configuration_qdqbert.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/models/qdqbert/configuration_qdqbert.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/models/qdqbert/configuration_qdqbert.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/models/qdqbert/configuration_qdqbert.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* change quantization to pytorch_quantization for backend requirement

* feed_forward_chunking not supported in QDQBert

* make style

* update model docstrings and comments in testing scripts

* rename example to quantization-qdqbert; rename example scripts from qat to quant

* Update src/transformers/models/qdqbert/modeling_qdqbert.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* rm experimental functions in quant_trainer

* qa cleanup

* make fix-copies for docs index.rst

* fix doctree; use post_init() for qdqbert

* fix early device assignment for qdqbert

* fix CI:Model templates runner
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

a59e7c1e

17 Nov, 2021 1 commit
- [Gradient checkpointing] Update Wav2Vec scripts (#14036) · 7544efc9
  Antonio Carlos Falcão Petri authored Nov 17, 2021
```
Co-authored-by: Stas Bekman <stas@stason.org>
```
  7544efc9
15 Nov, 2021 1 commit

Replace BertLayerNorm with LayerNorm (#14385) · 9fd937ea

Eldar Kurtic authored Nov 15, 2021

Running Movement pruning experiments with the newest HuggingFace would crash due to non-existing BertLayerNorm.

9fd937ea

11 Nov, 2021 2 commits

fix --gradient_checkpointing (#13964) · 77262ef7
Stas Bekman authored Nov 11, 2021

77262ef7

Fix Flax params dtype (#13098) · e92190c0

Suraj Patil authored Nov 11, 2021



* fix inits

* fix embed dtype

* fix embed dtype

* add test to check default dtype

* quality

* add type conversion methods for flax models

* more robust casting

* cast sinusoidal positions

* update pegasus

* update albert

* update test

* make sure dtype is passed to every module

* style

* fix electra dense

* fix t5

* quality

* add more tests

* better name

* use the dtype for lm head computation

* fix albert

* style

* fix albert embed dtype

* more tests

* fix vision enc-dec

* cleanup

* fix embed dtype pegasus

* fix default param test

* doc

* update template

* fix final_logits_bias dtype

* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* fix doc

* fix doc

* add detailed docstring for dtype parameter

* remove un-necessary import
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

e92190c0