"vscode:/vscode.git/clone" did not exist on "12d9b957237ae55886d781cf84eb53e241bfa766"
- 08 Feb, 2023 6 commits
-
Motoki Wu authored
* add tests with multiple eos_token_ids
* make math.prod instead of sum
* make fixup
* fix long and also use np.prod, since math.prod does not exist before Python 3.8
* make fixup
* add prod util
* use prod util instead of np.prod
* make fixup
* previous .long location
* use tensor ops
* remove prod
* remove prod
* update device
* make fixup
* fix none
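The end state described above replaces the single-eos check with tensor ops over a set of ids. A minimal sketch of that idea, assuming illustrative names (`eos_token_ids`, `next_tokens`, `unfinished`) rather than the exact `generate()` internals:

```python
import torch

# Sketch: mark a sequence finished when its sampled token matches ANY of
# several eos_token_ids, using tensor ops instead of a Python loop.
eos_token_ids = torch.tensor([2, 50256])      # assumed eos ids
next_tokens = torch.tensor([5, 2, 50256, 7])  # one sampled token per sequence
unfinished = torch.ones_like(next_tokens, dtype=torch.long)

# a sequence stays unfinished only if its new token is not one of the eos ids
unfinished = unfinished * (~torch.isin(next_tokens, eos_token_ids)).long()
print(unfinished)  # tensor([1, 0, 0, 1])
```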
-
Nicolas Patry authored
-
Thomas Wang authored
-
Matthijs Hollemans authored
-
Joao Gante authored
-
Guillaume Klein authored
-
- 07 Feb, 2023 14 commits
-
Sylvain Gugger authored
-
Prajwal Kailas authored
check for mapping/dict in distributed_concat function
Co-authored-by: prajwal967 <user.email>
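A sketch of what such a check can look like; the recursion and names below are illustrative, not the verbatim Trainer utility, and running it requires an initialized process group:

```python
from collections.abc import Mapping

import torch
import torch.distributed as dist

def distributed_concat(tensor, num_total_examples=None):
    # Recurse into tuples/lists and, per this commit, into dicts, gathering
    # leaf tensors from every rank and concatenating them.
    if isinstance(tensor, (tuple, list)):
        return type(tensor)(distributed_concat(t, num_total_examples) for t in tensor)
    if isinstance(tensor, Mapping):
        return type(tensor)({k: distributed_concat(v, num_total_examples) for k, v in tensor.items()})
    output_tensors = [tensor.clone() for _ in range(dist.get_world_size())]
    dist.all_gather(output_tensors, tensor)
    concat = torch.cat(output_tensors, dim=0)
    # drop samples added to pad the dataset to a multiple of world size
    return concat[:num_total_examples] if num_total_examples is not None else concat
```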
-
Stefan Schweter authored
* doc: introduce new section for XLM-V model
* doc: mention more details for XLM-V integration
* docs: paper abstract in italics, model identifier for base model added
* doc: mention new XLM-V support
* auto: add XLM-V mapping
* doc: run make fix-copies ;)
-
Adrian Sager La Ganga authored
* added inverse sqrt lr scheduler
* Updated get_scheduler in src/transformers/optimization.py
* Updated src/transformers/__init__.py
* Added inverse sqrt lr scheduler test
* Updated docs/source/en/main_classes/optimizer_schedules.mdx
* Ran style and quality scripts
* Fix get_inverse_sqrt_schedule docstring
* Comment implementation URL
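A hedged usage sketch, assuming the scheduler landed as `get_inverse_sqrt_schedule(optimizer, num_warmup_steps, ...)` in `transformers.optimization`, per the files touched above:

```python
import torch
from transformers import get_inverse_sqrt_schedule

model = torch.nn.Linear(8, 8)  # stand-in model
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-4)

# Linear warmup for num_warmup_steps, then the lr decays proportionally
# to 1/sqrt(step) (with a timescale defaulting to the warmup length).
scheduler = get_inverse_sqrt_schedule(optimizer, num_warmup_steps=1000)

for _ in range(3):
    optimizer.step()
    scheduler.step()
```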
-
Stas Bekman authored
* [tokenizer] sanitize saved config
* rm config["name_or_path"] test
-
Sylvain Gugger authored
* Remove mentions of flake8/isort
* Clean up inits
* Deal with all other inits
* Last special rule for dummy files
-
raghavanone authored
* Add limit_all_gathers option to fsdp_config and fix forward_prefetch bug
* Fix black issue
* Fix ruff failure
* Incorporate PR feedbacks
* Incorporate PR feedbacks
* Incorporate PR feedbacks
-
Arthur authored
* Add ("opt", ("GPT2Tokenizer", "GPT2TokenizerFast" if is_tokenizers_available() else None)), * skip failing test * Add ("opt", ("GPT2Tokenizer", "GPT2TokenizerFast" if is_tokenizers_available() else None)), * skip failing test -
raghavanone authored
Sanity check the type of id2label and label2id arguments of from_pretrained for TokenClassification models (#21490)
* Sanity check the type of id2label and label2id arguments of from_pretrained for TokenClassification models
* Incorporate PR feedbacks
* Incorporate PR feedbacks
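A usage sketch of the now-validated arguments; the checkpoint name and label set are placeholders:

```python
from transformers import AutoModelForTokenClassification

# id2label must map int -> str and label2id str -> int; after this change,
# wrongly-typed mappings are rejected at load time instead of failing later.
model = AutoModelForTokenClassification.from_pretrained(
    "bert-base-cased",  # placeholder checkpoint
    num_labels=3,
    id2label={0: "O", 1: "B-ENT", 2: "I-ENT"},
    label2id={"O": 0, "B-ENT": 1, "I-ENT": 2},
)
```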
-
Iulian Taiatu authored
-
Joao Gante authored
-
Arthur authored
* fix past renamed to past_key_value
* update more `past` that were skipped
* fixup
* remove changes made to rag
* refactor `_reorder_cache` to use `past_key_values`
* fix git `prepare_inputs_for_generation` to pass tests when false is needed in use_cache
-
Sylvain Gugger authored
* Deprecate parallelize API
* Add documentation
* Fix copies
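For context, a hedged before/after sketch: the deprecation steers users toward accelerate-style device maps, and `device_map="auto"` is shown here as the usual replacement (an assumption, not a quote from the PR):

```python
from transformers import AutoModelForCausalLM

# Deprecated pattern:
#   model = AutoModelForCausalLM.from_pretrained("gpt2-large")
#   model.parallelize()  # naive pipeline split across GPUs
# Assumed replacement (needs `accelerate` installed): let from_pretrained
# place the model's shards across the available devices.
model = AutoModelForCausalLM.from_pretrained("gpt2-large", device_map="auto")
```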
-
Sylvain Gugger authored
-
- 06 Feb, 2023 7 commits
-
Sylvain Gugger authored
* Result of black 23.1
* Update target to Python 3.7
* Switch flake8 to ruff
* Configure isort
* Configure isort
* Apply isort with line limit
* Put the right black version
* adapt black in check copies
* Fix copies
-
Joao Gante authored
-
Nicolas Patry authored
* Removing `more_itertools` dependency.
* Update examples/research_projects/vqgan-clip/requirements.txt
-
Joao Gante authored
-
Matthijs Hollemans authored
* make doc examples deterministic
* add IGNORE_RESULT
-
Jinen Setpal authored
updated documentation
-
Yih-Dar authored
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
- 03 Feb, 2023 6 commits
-
agossard authored
For an IterableDataset, return a DataLoader built with self._train_batch_size. This is consistent with how we generate a regular DataLoader, and leads to the correct args.per_device_train_batch_size eventually ending up on each GPU.
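A minimal sketch of the corrected behavior, with standalone names (`build_train_dataloader`, `train_batch_size`) standing in for `Trainer.get_train_dataloader` and `self._train_batch_size`:

```python
from torch.utils.data import DataLoader, IterableDataset

def build_train_dataloader(train_dataset, data_collator, train_batch_size):
    # The fix in miniature: an IterableDataset must also be wrapped with the
    # resolved per-device batch size, matching the map-style branch below.
    if isinstance(train_dataset, IterableDataset):
        # no sampler/shuffle for iterable datasets
        return DataLoader(train_dataset, batch_size=train_batch_size, collate_fn=data_collator)
    return DataLoader(train_dataset, batch_size=train_batch_size, shuffle=True, collate_fn=data_collator)
```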
-
Matthijs Hollemans authored
* make SpeechT5 model by copying Wav2Vec2
* add paper to docs
* whoops added docs in wrong file
* remove SpeechT5Tokenizer + put CTC back in the name
* remove deprecated class
* remove unused docstring
* delete SpeechT5FeatureExtractor, use Wav2Vec2FeatureExtractor instead
* remove classes we don't need right now
* initial stab at speech encoder prenet
* add more speech encoder prenet stuff
* improve SpeechEncoderPrenet
* add encoder (not finished yet)
* add relative position bias to self-attention
* add encoder CTC layers
* fix formatting
* add decoder from BART, doesn't work yet
* make it work with generate loop
* wrap the encoder into a speech encoder class
* wrap the decoder in a text decoder class
* changed my mind
* changed my mind again ;-)
* load decoder weights, make it work
* add weights for text decoder postnet
* add SpeechT5ForCTC model that uses only the encoder
* clean up EncoderLayer and DecoderLayer
* implement _init_weights in SpeechT5PreTrainedModel
* cleanup config + Encoder and Decoder
* add head + cross attention masks
* improve doc comments
* fixup
* more cleanup
* more fixup
* TextDecoderPrenet works now, thanks Kendall
* add CTC loss
* add placeholders for other pre/postnets
* add type annotation
* fix freeze_feature_encoder
* set padding tokens to 0 in decoder attention mask
* encoder attention mask downsampling
* remove features_pen calculation
* disable the padding tokens thing again
* fixup
* more fixup
* code review fixes
* rename encoder/decoder wrapper classes
* allow checkpoints to be loaded into SpeechT5Model
* put encoder into wrapper for CTC model
* clean up conversion script
* add encoder for TTS model
* add speech decoder prenet
* add speech decoder post-net
* attempt to reconstruct the generation loop
* add speech generation loop
* clean up generate_speech
* small tweaks
* fix forward pass
* enable always dropout on speech decoder prenet
* sort declaration
* rename models
* fixup
* fix copies
* more fixup
* make consistency checker happy
* add Seq2SeqSpectrogramOutput class
* doc comments
* quick note about loss and labels
* add HiFi-GAN implementation (from Speech2Speech PR)
* rename file
* add vocoder to TTS model
* improve vocoder
* working on tokenizer
* more better tokenizer
* add CTC tokenizer
* fix decode and batch_code in CTC tokenizer
* fix processor
* two processors and feature extractors
* use SpeechT5WaveformFeatureExtractor instead of Wav2Vec2
* cleanup
* more cleanup
* even more fixup
* notebooks
* fix log-mel spectrograms
* support reduction factor
* fixup
* shift spectrograms to right to create decoder inputs
* return correct labels
* add labels for stop token prediction
* fix doc comments
* fixup
* remove SpeechT5ForPreTraining
* more fixup
* update copyright headers
* add usage examples
* add SpeechT5ProcessorForCTC
* fixup
* push unofficial checkpoints to hub
* initial version of tokenizer unit tests
* add slow test
* fix failing tests
* tests for CTC tokenizer
* finish CTC tokenizer tests
* processor tests
* initial test for feature extractors
* tests for spectrogram feature extractor
* fixup
* more fixup
* add decorators
* require speech for tests
* modeling tests
* more tests for ASR model
* fix imports
* add fake tests for the other models
* fixup
* remove jupyter notebooks
* add missing SpeechT5Model tests
* add missing tests for SpeechT5ForCTC
* add missing tests for SpeechT5ForTextToSpeech
* sort tests by name
* fix Hi-Fi GAN tests
* fixup
* add speech-to-speech model
* refactor duplicate speech generation code
* add processor for SpeechToSpeech model
* add usage example
* add tests for speech-to-speech model
* fixup
* enable gradient checkpointing for SpeechT5FeatureEncoder
* code review
* push_to_hub now takes repo_id
* improve doc comments for HiFi-GAN config
* add missing test
* add integration tests
* make number of layers in speech decoder prenet configurable
* rename variable
* rename variables
* add auto classes for TTS and S2S
* REMOVE CTC!!!
* S2S processor does not support save/load_pretrained
* fixup
* these models are now in an auto mapping
* fix doc links
* rename HiFiGAN to HifiGan, remove separate config file
* REMOVE auto classes
* there can be only one
* fixup
* replace assert
* reformat
* feature extractor can process input and target at same time
* update checkpoint names
* fix commit hash
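A short text-to-speech usage sketch with the classes introduced here; the Hub checkpoint names are the ones published after this work landed, and the random speaker embedding is a stand-in:

```python
import torch
from transformers import SpeechT5ForTextToSpeech, SpeechT5HifiGan, SpeechT5Processor

processor = SpeechT5Processor.from_pretrained("microsoft/speecht5_tts")
model = SpeechT5ForTextToSpeech.from_pretrained("microsoft/speecht5_tts")
vocoder = SpeechT5HifiGan.from_pretrained("microsoft/speecht5_hifigan")

inputs = processor(text="Hello, my dog is cute.", return_tensors="pt")
# SpeechT5 conditions on a 512-dim x-vector; a random vector stands in here
# for a real speaker embedding (e.g. from the CMU Arctic x-vectors).
speaker_embeddings = torch.randn(1, 512)

speech = model.generate_speech(inputs["input_ids"], speaker_embeddings, vocoder=vocoder)
print(speech.shape)  # 1-D waveform tensor
```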
-
Kashif Rasul authored
* do not scale gradient in bf16 mode
* fix since args.fp16 might be none
* fixed typo
* typo
* only do if grad scaling is true
* self.amp_dtype == torch.float16 is true
* put back prop when fsdp is not none
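The reasoning in miniature, as a hedged sketch rather than the Trainer's exact code: bf16 shares fp32's exponent range, so loss scaling is only needed for fp16.

```python
import torch

# fp16 gradients can underflow, so they are loss-scaled; bf16 gradients
# cannot in the same way, so the scaler should be disabled for bf16.
amp_dtype = torch.bfloat16  # or torch.float16
do_grad_scaling = amp_dtype == torch.float16
scaler = torch.cuda.amp.GradScaler(enabled=do_grad_scaling)
# scaler.scale(loss).backward() and scaler.step(optimizer) become
# pass-throughs when the scaler is disabled.
```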
-
Yih-Dar authored
* Remove unused type_vocab_size
* Remove unused initializer_factor
* Remove unused n_embd
* Remove unused scale_embedding
* Remove unused scale_attn_weights
* fix
* fix
* Remove unused head_hidden_scale
* Remove unused activation_dropout
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Pavel Denisov authored
Accept `.generate()` calls with `inputs_embeds` on BLOOM models
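A hedged example of the call this enables; the BLOOM checkpoint is just an example and the soft-prompt use case is assumed:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bigscience/bloom-560m")
model = AutoModelForCausalLM.from_pretrained("bigscience/bloom-560m")

input_ids = tokenizer("Hello, my dog is cute", return_tensors="pt").input_ids
# Build the embeddings yourself (e.g. to prepend soft prompts) and call
# generate() without input_ids; previously this path errored for BLOOM.
inputs_embeds = model.get_input_embeddings()(input_ids)
outputs = model.generate(inputs_embeds=inputs_embeds, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```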
-
Joao Gante authored
-
- 02 Feb, 2023 7 commits
-
Jorge C. Gomes authored
input_ids_seq_length doesn't exist in the GenerationConfig; it only exists as a local variable inside the function. Setting exponential_decay_length_penalty therefore results in an error: `AttributeError: 'GenerationConfig' object has no attribute 'input_ids_seq_length'`. This simple change fixes the issue, and exponential_decay_length_penalty works as expected.
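What a working call looks like after the fix, as a sketch (the gpt2 checkpoint and the `(10, 1.05)` tuple are arbitrary choices):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
inputs = tokenizer("The quick brown fox", return_tensors="pt")

# (start_index, decay_factor): once start_index new tokens have been
# generated, the eos logit gets an exponentially increasing boost,
# nudging generation to wrap up.
outputs = model.generate(
    **inputs,
    max_new_tokens=50,
    exponential_decay_length_penalty=(10, 1.05),
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```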
-
Yih-Dar authored
* fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Yih-Dar authored
* Allow to add more information
* fix style
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Younes Belkada authored
* force `memory_efficient_backward=True`
* enhancements - trainer support - add new flag
* some changes - internal changes in `Trainer` - small refactor
* make quality
* Fixes - add new testing util - add new test - change test in Trainer
* fix CI test
* educate users on how to ft 8bit models
* more checks
* fix `logger` error
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* adapt from review
* fix
* add comment
* use return instead
---------
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
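A hedged sketch of the setup this unblocks; it assumes `bitsandbytes` is installed, a CUDA GPU is available, and the checkpoint is arbitrary:

```python
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-350m",  # example checkpoint
    load_in_8bit=True,
    device_map="auto",
)
# The int8 base weights stay frozen; fine-tuning trains small fp16/fp32
# modules (e.g. adapters or heads) on top, which is what the Trainer
# support added here is aimed at.
```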
-
Clémentine Fourrier authored
* [FIX] path for Graphormer checkpoint
* [FIX] Test suite for graphormer
* [FIX] Update graphormer default num_classes
-
Joel Lamy-Poirier authored
* gelu_python_tanh
* rename
* Version check, add test
* PR comment
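Presumably this refers to the tanh approximation of GELU; a self-contained sketch of that function follows (the name matches the bullet above, the body is the standard formula, not necessarily the merged code):

```python
import math

import torch

def gelu_python_tanh(x: torch.Tensor) -> torch.Tensor:
    # Standard tanh approximation of GELU; recent PyTorch exposes the same
    # thing as torch.nn.functional.gelu(x, approximate="tanh"), hence the
    # version check mentioned above.
    return 0.5 * x * (1.0 + torch.tanh(math.sqrt(2.0 / math.pi) * (x + 0.044715 * x.pow(3.0))))

print(gelu_python_tanh(torch.tensor([-1.0, 0.0, 1.0])))
```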
-
Shikhar Tuli authored
Co-authored-by: Shreshth Tuli <shreshthtuli@gmail.com>
-