- 06 Feb, 2023 3 commits
-
-
Jinen Setpal authored
updated documentation
-
jianan-gu authored
* Update perf_train_cpu.mdx
* Update perf_train_cpu.mdx
* Update perf_train_cpu.mdx
* Update docs/source/en/perf_train_cpu.mdx
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update perf_train_cpu.mdx
* Update perf_train_cpu.mdx
* Update perf_train_cpu.mdx
* Update perf_train_cpu.mdx

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Yih-Dar authored
* fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
- 03 Feb, 2023 11 commits
-
-
Yih-Dar authored
* fix
* fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
agossard authored
For IterableDataset, return DataLoader using self._train_batch_size. This is consistent with how we generate a regular DataLoader, and leads to the correct args.per_device_train_batch_size eventually ending up on each GPU.
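A minimal pure-Python sketch of the behavior this change describes (illustrative names, not the actual `Trainer` code): both the regular and the iterable-dataset code paths now build the `DataLoader` from the trainer's internal `_train_batch_size`, so each device ends up with `args.per_device_train_batch_size` examples.

```python
# Hypothetical sketch of the fix described above. DummyLoader stands in
# for torch.utils.data.DataLoader; TrainerSketch for the Trainer.
class DummyLoader:
    def __init__(self, dataset, batch_size):
        self.dataset, self.batch_size = dataset, batch_size

class TrainerSketch:
    def __init__(self, per_device_batch_size, n_devices):
        # _train_batch_size is the total batch size handed to the DataLoader;
        # it is later split across devices back to per_device_batch_size.
        self._train_batch_size = per_device_batch_size * n_devices

    def get_train_dataloader(self, dataset, is_iterable):
        # Before the fix, the iterable branch used a different batch size;
        # after the fix, both branches use self._train_batch_size.
        return DummyLoader(dataset, batch_size=self._train_batch_size)

trainer = TrainerSketch(per_device_batch_size=8, n_devices=2)
loader = trainer.get_train_dataloader(range(100), is_iterable=True)
print(loader.batch_size)  # 16
```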
-
Matt authored
* Add tutorial doc for TF + TPU
* Fix all those extra asterisks in the markdown
* Use the actual Tip formatting
* Remove unnecessary spaces
* Reformat checklist
* Fix checklist and reformat tips slightly
* Update docs/source/en/perf_train_tpu_tf.mdx
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update docs/source/en/perf_train_tpu_tf.mdx
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update docs/source/en/perf_train_tpu_tf.mdx
  Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
* Update docs/source/en/perf_train_tpu_tf.mdx
  Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
* Add link to TPU notebook in the notebooks list
* Add links to the TPU notebook in the tutorial doc
* Make the markdown table a bit less wild
* Fix notebook link
* More notebook links
* More fixes to wild tables

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
-
Darren Tuit authored
exclude deleted files from fixup script
-
Matthijs Hollemans authored
* make SpeechT5 model by copying Wav2Vec2
* add paper to docs
* whoops added docs in wrong file
* remove SpeechT5Tokenizer + put CTC back in the name
* remove deprecated class
* remove unused docstring
* delete SpeechT5FeatureExtractor, use Wav2Vec2FeatureExtractor instead
* remove classes we don't need right now
* initial stab at speech encoder prenet
* add more speech encoder prenet stuff
* improve SpeechEncoderPrenet
* add encoder (not finished yet)
* add relative position bias to self-attention
* add encoder CTC layers
* fix formatting
* add decoder from BART, doesn't work yet
* make it work with generate loop
* wrap the encoder into a speech encoder class
* wrap the decoder in a text decoder class
* changed my mind
* changed my mind again ;-)
* load decoder weights, make it work
* add weights for text decoder postnet
* add SpeechT5ForCTC model that uses only the encoder
* clean up EncoderLayer and DecoderLayer
* implement _init_weights in SpeechT5PreTrainedModel
* cleanup config + Encoder and Decoder
* add head + cross attention masks
* improve doc comments
* fixup
* more cleanup
* more fixup
* TextDecoderPrenet works now, thanks Kendall
* add CTC loss
* add placeholders for other pre/postnets
* add type annotation
* fix freeze_feature_encoder
* set padding tokens to 0 in decoder attention mask
* encoder attention mask downsampling
* remove features_pen calculation
* disable the padding tokens thing again
* fixup
* more fixup
* code review fixes
* rename encoder/decoder wrapper classes
* allow checkpoints to be loaded into SpeechT5Model
* put encoder into wrapper for CTC model
* clean up conversion script
* add encoder for TTS model
* add speech decoder prenet
* add speech decoder post-net
* attempt to reconstruct the generation loop
* add speech generation loop
* clean up generate_speech
* small tweaks
* fix forward pass
* enable always dropout on speech decoder prenet
* sort declaration
* rename models
* fixup
* fix copies
* more fixup
* make consistency checker happy
* add Seq2SeqSpectrogramOutput class
* doc comments
* quick note about loss and labels
* add HiFi-GAN implementation (from Speech2Speech PR)
* rename file
* add vocoder to TTS model
* improve vocoder
* working on tokenizer
* more better tokenizer
* add CTC tokenizer
* fix decode and batch_code in CTC tokenizer
* fix processor
* two processors and feature extractors
* use SpeechT5WaveformFeatureExtractor instead of Wav2Vec2
* cleanup
* more cleanup
* even more fixup
* notebooks
* fix log-mel spectrograms
* support reduction factor
* fixup
* shift spectrograms to right to create decoder inputs
* return correct labels
* add labels for stop token prediction
* fix doc comments
* fixup
* remove SpeechT5ForPreTraining
* more fixup
* update copyright headers
* add usage examples
* add SpeechT5ProcessorForCTC
* fixup
* push unofficial checkpoints to hub
* initial version of tokenizer unit tests
* add slow test
* fix failing tests
* tests for CTC tokenizer
* finish CTC tokenizer tests
* processor tests
* initial test for feature extractors
* tests for spectrogram feature extractor
* fixup
* more fixup
* add decorators
* require speech for tests
* modeling tests
* more tests for ASR model
* fix imports
* add fake tests for the other models
* fixup
* remove jupyter notebooks
* add missing SpeechT5Model tests
* add missing tests for SpeechT5ForCTC
* add missing tests for SpeechT5ForTextToSpeech
* sort tests by name
* fix Hi-Fi GAN tests
* fixup
* add speech-to-speech model
* refactor duplicate speech generation code
* add processor for SpeechToSpeech model
* add usage example
* add tests for speech-to-speech model
* fixup
* enable gradient checkpointing for SpeechT5FeatureEncoder
* code review
* push_to_hub now takes repo_id
* improve doc comments for HiFi-GAN config
* add missing test
* add integration tests
* make number of layers in speech decoder prenet configurable
* rename variable
* rename variables
* add auto classes for TTS and S2S
* REMOVE CTC!!!
* S2S processor does not support save/load_pretrained
* fixup
* these models are now in an auto mapping
* fix doc links
* rename HiFiGAN to HifiGan, remove separate config file
* REMOVE auto classes
* there can be only one
* fixup
* replace assert
* reformat
* feature extractor can process input and target at same time
* update checkpoint names
* fix commit hash
-
Kashif Rasul authored
* do not scale gradient in bf16 mode
* fix since args.fp16 might be none
* fixed typo
* typo
* only do if grad scaling is true
* self.amp_dtype == torch.float16 is true
* put back prop when fsdp is not none
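The rule this commit describes can be sketched in pure Python (illustrative function, not the actual `Trainer` code): gradient scaling is only needed under fp16 autocast, since bf16 shares fp32's exponent range and does not underflow gradients the same way, and `args.fp16` may be `None` rather than `False`.

```python
# Hedged sketch of the decision logic: use a GradScaler only when mixed
# precision is fp16, never for bf16, and treat args.fp16=None as disabled.
def should_use_grad_scaler(amp_dtype, fp16_enabled):
    # args.fp16 might be None, so test truthiness explicitly
    if not fp16_enabled:
        return False
    # scale gradients only when autocasting to float16
    return amp_dtype == "float16"

print(should_use_grad_scaler("float16", True))   # True
print(should_use_grad_scaler("bfloat16", True))  # False
print(should_use_grad_scaler("bfloat16", None))  # False
```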
-
Yih-Dar authored
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Avi Singhal authored
* updated resources for LayoutLM
* Apply suggestions from code review
  Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* fixed formatting, removed extra section

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
-
Yih-Dar authored
* Remove unused type_vocab_size
* Remove unused initializer_factor
* Remove unused n_embd
* Remove unused scale_embedding
* Remove unused scale_attn_weights
* fix
* fix
* Remove unused head_hidden_scale
* Remove unused activation_dropout

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Pavel Denisov authored
Add accepting `.generate()` calls with `inputs_embeds` on BLOOM models
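An illustrative pure-Python sketch (not the actual BLOOM code) of what accepting `.generate()` calls with `inputs_embeds` means: when the caller supplies embeddings instead of token ids, the model starts generation from the embeddings rather than raising.

```python
# Hypothetical dispatch helper; name and shape are assumptions for
# illustration, not the real prepare_inputs_for_generation signature.
def prepare_model_inputs(input_ids=None, inputs_embeds=None):
    if input_ids is None and inputs_embeds is None:
        raise ValueError("Pass either input_ids or inputs_embeds")
    if inputs_embeds is not None and input_ids is None:
        # generation can now start directly from the embeddings
        return {"inputs_embeds": inputs_embeds}
    return {"input_ids": input_ids}

print(prepare_model_inputs(inputs_embeds=[[0.1, 0.2]]))  # {'inputs_embeds': [[0.1, 0.2]]}
```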
-
Joao Gante authored
-
- 02 Feb, 2023 12 commits
-
-
Erwann Millon authored
* Add VQGAN-CLIP research project
* fixed style issues
* Update examples/research_projects/vqgan-clip/README.md
  Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update examples/research_projects/vqgan-clip/VQGAN_CLIP.py
  Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update examples/research_projects/vqgan-clip/requirements.txt
  Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update examples/research_projects/vqgan-clip/README.md
  Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update examples/research_projects/vqgan-clip/VQGAN_CLIP.py
  Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update examples/research_projects/vqgan-clip/VQGAN_CLIP.py
  Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update examples/research_projects/vqgan-clip/VQGAN_CLIP.py
  Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update examples/research_projects/vqgan-clip/loaders.py
  Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* replace CLIPProcessor with tokenizer, change asserts to exceptions
* rm unused import
* remove large files (jupyter notebook linked in readme, imgs migrated to hf dataset)
* add tokenizers dependency
* Remove comment
  Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* rm model checkpoints

Co-authored-by: Erwann Millon <erwann@Erwanns-MacBook-Air.local>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Steven Liu authored
* first draft of audio section
* make style
* first draft of computer vision section
* add convnext and encoder tasks
* finish up nlp tasks
* minor edits
* add arch images, more edits
* fix image links
* apply sanchit feedback
* model naming convention
* apply niels vit feedback
* replace detr for segmentation with mask2former
* apply feedback
* apply feedback
-
Jorge C. Gomes authored
`input_ids_seq_length` doesn't exist on the `GenerationConfig`; it exists only as a local variable in the function. Setting `exponential_decay_length_penalty` therefore results in an error: `AttributeError: 'GenerationConfig' object has no attribute 'input_ids_seq_length'`. This simple change fixes the issue, and `exponential_decay_length_penalty` works as expected.
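A minimal sketch of the bug and fix (illustrative names; the config here is a plain dict standing in for `GenerationConfig`): `exponential_decay_length_penalty` is a `(start_index, decay_factor)` pair whose start index must be offset by the prompt length, which is a local variable, not a config attribute.

```python
# Before the fix, the code effectively read config["input_ids_seq_length"],
# which does not exist and raised an AttributeError on the real config
# object. The fix passes the locally computed prompt length instead.
def build_decay_start(config, input_ids_seq_length):
    start, factor = config["exponential_decay_length_penalty"]
    # offset the decay start by the local prompt length
    return start + input_ids_seq_length, factor

config = {"exponential_decay_length_penalty": (10, 1.05)}
print(build_decay_start(config, input_ids_seq_length=7))  # (17, 1.05)
```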
-
Steven Liu authored
fix formatting
-
Yih-Dar authored
* fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Yih-Dar authored
* Allow to add more information
* fix style

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Younes Belkada authored
* force `memory_efficient_backward=True`
* enhancements
  - trainer support
  - add new flag
* some changes
  - internal changes in `Trainer`
  - small refactor
* make quality
* Fixes
  - add new testing util
  - add new test
  - change test in Trainer
* fix CI test
* educate users on how to ft 8bit models
* more checks
* fix `logger` error
* Apply suggestions from code review
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* adapt from review
* fix
* add comment
* use return instead

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Clémentine Fourrier authored
* [FIX] path for Graphormer checkpoint
* [FIX] Test suite for graphormer
* [FIX] Update graphormer default num_classes
-
Joel Lamy-Poirier authored
* gelu_python_tanh
* rename
* Version check, add test
* Pr comment
-
Matt authored
* Add distinct section names for PyTorch and TF
* Remove extra space
-
Shikhar Tuli authored
Co-authored-by: Shreshth Tuli <shreshthtuli@gmail.com>
-
Yih-Dar authored
Use torch 1.13.1 in push/scheduled CI

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
- 01 Feb, 2023 7 commits
-
-
Joao Gante authored
-
amyeroberts authored
* TF image classification script
* Update requirements
* Fix up
* Add tests
* Update test fetcher
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Fix directory path
* Adding `zero-shot-object-detection` pipeline doctest. (#20274)
* Adding `zero-shot-object-detection` pipeline doctest.
* Remove nested_simplify.
* Add generate kwargs to `AutomaticSpeechRecognitionPipeline` (#20952)
* Add generate kwargs to AutomaticSpeechRecognitionPipeline
* Add test for generation kwargs
* Trigger CI
* Data collator returns np
* Update feature extractor -> image processor
* Bug fixes - updates to reflect changes in API
* Update flags to match PT & run faster
* Update instructions - Maria's comment
* Update examples/tensorflow/image-classification/README.md
* Remove slow decorator

Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
Co-authored-by: bofeng huang <bofenghuang7@gmail.com>
Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com>
-
Jinen Setpal authored
* integrated logger
* bugfix
* added data
* bugfix
* model + state artifacts should log
* fixed paths
* i lied, trying again
* updated function call
* typo this is painful :( what a stupid error
* typo this is painful :( what a stupid error
* pivoted to adding a directory
* silly path bug
* multiple experiments
* migrated to getattr
* syntax fix
* syntax fix
* fixed repo pointer
* fixed path error
* added dataset if dataloader is present, uploaded artifacts
* variable in scope
* removed unnecessary line
* updated error type
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* trimmed unused variables, imports
* style formatting
* removed type conversion reliance
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* reverted accidental line deletion

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Sylvain Gugger authored
* Skip batches fast with Accelerate
* remove debug statement
* Hack seed reload at the right time
* Reorganize RNG sync
* Fix accelerate version comp
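"Skip batches fast" refers to resuming mid-epoch without materializing and discarding the first N batches one by one; Accelerate provides a `skip_first_batches` helper for this. A pure-Python analog (illustrative, not the real implementation, which also preserves RNG state):

```python
import itertools

# Analog of accelerate's skip_first_batches: wrap the loader so iteration
# starts at batch N instead of consuming and throwing away N batches.
def skip_first_batches(dataloader, num_batches):
    return itertools.islice(iter(dataloader), num_batches, None)

batches = [[0, 1], [2, 3], [4, 5], [6, 7]]
resumed = list(skip_first_batches(batches, 2))
print(resumed)  # [[4, 5], [6, 7]]
```

Note that the real helper must also restore the dataloader's RNG state so shuffling matches the interrupted run, which is what the "Reorganize RNG sync" bullets above are about.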
-
raghavanone authored
* Fix the input embeds issue with tests
* Fix black and isort issue
* Clean up tests
* Add slow tag to the test introduced
* Incorporate PR feedbacks
-
Maria Khalusova authored
moved LiLT under multimodal models
-
Patrick von Platen authored
* Bump onnx in /examples/research_projects/decision_transformer

  Bumps [onnx](https://github.com/onnx/onnx) from 1.11.0 to 1.13.0.
  - [Release notes](https://github.com/onnx/onnx/releases)
  - [Changelog](https://github.com/onnx/onnx/blob/main/docs/Changelog.md)
  - [Commits](https://github.com/onnx/onnx/compare/v1.11.0...v1.13.0)

  updated-dependencies:
  - dependency-name: onnx
    dependency-type: direct:production

  Signed-off-by: dependabot[bot] <support@github.com>
* adapt
* finish
* Update examples/research_projects/decision_transformer/requirements.txt
* up
* add tests
* Apply suggestions from code review
  Co-authored-by: Lucain <lucainp@gmail.com>
  Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* fix test

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Lucain <lucainp@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
-
- 31 Jan, 2023 7 commits
-
-
Yih-Dar authored
* fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Joao Gante authored
Generate: fix TF XLA tests on models with `max_position_embeddings` or `max_target_positions` (#21389)
-
Yih-Dar authored
* remove unused classifier_dropout
* remove unused dropout
* remove unused pooler_fn
* remove unnecessary is_encoder_decoder
* remove unnecessary drop_rate
* remove unused classifier_dropout
* remove unused classifier_dropout
* remove unused dropout
* remove unused dropout
* remove unused summary_* attributes
* remove unused tie_word_embeddings
* remove unused summary_* attributes
* fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
raghavanone authored
* Add support of backward_prefetch and forward_prefetch
* Fix format issue
* Fix isort issue
* Fix doc style issue
* Update src/transformers/trainer.py
  Co-authored-by: Sourab Mangrulkar <13534540+pacman100@users.noreply.github.com>
* Update src/transformers/training_args.py
  Co-authored-by: Sourab Mangrulkar <13534540+pacman100@users.noreply.github.com>
* Update src/transformers/training_args.py
  Co-authored-by: Sourab Mangrulkar <13534540+pacman100@users.noreply.github.com>
* Update src/transformers/training_args.py
  Co-authored-by: Sourab Mangrulkar <13534540+pacman100@users.noreply.github.com>
* Fix black issue
* Fix doc-style issue
* Make additional fsdp parameters into fsdp config
* Fix black issue
* Remove unused imports
* Fix doc style issues
* Incorporate PR feedbacks
* Remove unused imports
* Fix tests
* Fix tests
* Fix tests
* Fix tests
* Fix tests
* Update src/transformers/training_args.py
  Co-authored-by: Sourab Mangrulkar <13534540+pacman100@users.noreply.github.com>
* Fix tests
* Incorporate PR feedbacks
* Incorporate PR feedbacks
* Fix black issues

Co-authored-by: Sourab Mangrulkar <13534540+pacman100@users.noreply.github.com>
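Per the "Make additional fsdp parameters into fsdp config" bullet, the extra FSDP options move into a single mapping rather than separate flags. A hedged sketch of the shape such a config takes; the key names and values below are assumptions based on the commit message, not a verified `TrainingArguments` schema:

```python
# Hypothetical fsdp_config mapping grouping the new prefetch options.
fsdp_config = {
    # prefetch the next set of parameters during the backward pass
    "backward_prefetch": "backward_pre",
    # optionally prefetch parameters during the forward pass
    "forward_prefetch": False,
}
print(sorted(fsdp_config))  # ['backward_prefetch', 'forward_prefetch']
```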
-
Quentin Lhoest authored
* simplify column_names in run_clm
* simplify column_names in run_mlm
* minor
-
NielsRogge authored
* Improve docs
* Add DETA resources

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
-
regisss authored
Do not log the generation config for each iteration
-