- 09 Mar, 2023 8 commits
-
-
Matt authored
* Add an argument to pt-to-tf to allow overriding the model class * make fixup * Minor fix to error message * Remove unused extra conversion from the script
-
anruijian authored
* return analysis for hyperparameter_search with ray backend * Revert "return analysis for hyperparameter_search with ray backend" This reverts commit cd5179070930e03020d96d98eb51dec3eb21ef75. * add run_summary attribute to BestRun and return analysis for ray backend * fix typo * add doc for run_summary for ray backend
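A minimal sketch of how the new `run_summary` attribute can be read back when searching with the Ray backend; the `trainer` object and the search space here are illustrative assumptions, not part of the commit:
```python
from ray import tune

# `trainer` is assumed to be an already-constructed transformers.Trainer
# with model_init set, as hyperparameter_search requires.
best_run = trainer.hyperparameter_search(
    backend="ray",
    n_trials=4,
    hp_space=lambda _: {"learning_rate": tune.loguniform(1e-5, 1e-3)},
)
print(best_run.hyperparameters)  # best trial's hyperparameters, as before
print(best_run.run_summary)      # new: Ray Tune analysis of the whole search
```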
-
Yih-Dar authored
* show hfh warnings --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Lucain authored
* Remove set_access_token usage + fail tests if FutureWarning * do not fail on FutureWarning in CI --------- Co-authored-by: testbot <lucainp@hf.co>
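For context, `set_access_token` was deprecated in `huggingface_hub` in favor of `login`; a hedged sketch of the migration (the token value is a placeholder):
```python
from huggingface_hub import login

# Old, emits a FutureWarning:
#   from huggingface_hub import set_access_token
#   set_access_token("hf_...")
# Current replacement; stores the token for both the API and git credentials.
login(token="hf_...")  # placeholder token
```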
-
Shaun VanWeelden authored
-
Shaun VanWeelden authored
In ZSH, not quoting the pip install target fails

Running
```
pip install transformers[torch]
```
in the default ZSH terminal will fail with the error `zsh: no matches found: transformers[torch]`.

The solution is to wrap the requirement in quotes:
```
pip install 'transformers[torch]'
```

Relevant StackOverflow: https://stackoverflow.com/questions/30539798/zsh-no-matches-found-requestssecurity
-
Nipun Jindal authored
* [21737][T5]: Fix gradient checkpoint bug * [21737][T5]: Fix gradient checkpoint bug * [21737][T5]: Fix gradient checkpoint bug * Update src/transformers/models/mt5/modeling_mt5.py * Update src/transformers/models/t5/modeling_t5.py --------- Co-authored-by: njindal <njindal@adobe.com> Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
-
Alara Dirik authored
* Fix typos and add code examples, resources
-
- 08 Mar, 2023 13 commits
-
-
Ceyda Cinarel authored
fix slow tokenizers when passing offset_mapping
-
Yih-Dar authored
* slow me --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Yih-Dar authored
* Avoid text_config_dict and vision_config_dict being saved * for other CLIP-like models --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Somasree Majumder authored
* fixing * Update modeling_whisper.py * Update modeling_whisper.py * Update src/transformers/models/whisper/modeling_whisper.py --------- Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
-
bofeng huang authored
* Add SpecAugment to run_speech_recognition_seq2seq.py * Remove useless argument: text_column * Fix quality * Update return_attention_mask condition * Update SpecAugment arguments only for Whisper models * Remove SpecAugment arguments from ModelArguments, only leave default values for simplicity * Apply suggestions from code review Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * Update apply_spec_augment only for Whisper models * Apply suggestions from code review Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * Rename return_attention_mask to forward_attention_mask to avoid confusion with wav2vec2 models --------- Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
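The `apply_spec_augment` switch referenced above lives on the Whisper config; a short sketch under that assumption (checkpoint name is illustrative):
```python
from transformers import WhisperConfig, WhisperForConditionalGeneration

# Enable SpecAugment masking of input features during fine-tuning.
config = WhisperConfig.from_pretrained("openai/whisper-tiny", apply_spec_augment=True)
model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-tiny", config=config)
```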
-
anruijian authored
add tokenize_kwargs doc in the FeatureExtractionPipeline
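A hedged sketch of what the documented argument does (model name illustrative): `tokenize_kwargs` is forwarded to the tokenizer call inside the pipeline.
```python
from transformers import pipeline

extractor = pipeline("feature-extraction", model="bert-base-uncased")
# Truncate overly long inputs at tokenization time.
features = extractor("Some very long text ...", tokenize_kwargs={"truncation": True})
```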
-
Sylvain Gugger authored
-
Anahita Bhiwandiwalla authored
* Add BridgeTower for ITC * Fix review feedback * Rename BridgeTowerForITC, cleanup * Fix style and quality * Implement tests --------- Co-authored-by: Tiep Le <97980157+tileintel@users.noreply.github.com> Co-authored-by: Tiep Le <tiep.le@intel.com>
-
Younes Belkada authored
* fix error message * make style
-
Yih-Dar authored
fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Qiushi authored
-
amyeroberts authored
Use valid dummy pixel values
-
jim authored
* add 1 to cur_len to make up the new beam length: cur_len is one token shorter than the length of the sequence whose best_sum_logprobs is the numerator * cur_len += 1 before checking if the beam hyp is done * format code * reformat with black --------- Co-authored-by: Chiming <chiming@biomap.com>
-
- 07 Mar, 2023 13 commits
-
-
Yih-Dar authored
* Update 1 * Update 2 * Update 3 * Update 4 * Update 5 * Update 6 * Update 7 * Update 8 * Update 9 * Update 10 --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Eli Simhayev authored
* added informer to gitignore * added informer to gitignore * WIP informer2020 * added checking that instantiate works * added config using gluonTS by kashif * WIP config * adding informeConfig. need to remove FeatureEmbedder * done InformerConfig, but need to change the names * Done informer model init. working on enc-dec * added things to address, after reading again enc-dec in the paper * done modeling - checking initialization work * added informer to gitignore * WIP informer2020 * added checking that instantiate works * added config using gluonTS by kashif * WIP config * adding informeConfig. need to remove FeatureEmbedder * done InformerConfig, but need to change the names * Done informer model init. working on enc-dec * added things to address, after reading again enc-dec in the paper * done modeling - checking initialization work * moved enc-dec init to InformerEncoder/Decoder init * added 'init_std' to config, now model init works! * WIP conversion script, and added code sources * WIP conversion script: loading original informer pth works * WIP conversion script: change defaults in the config * WIP conversion script: supporting Informer input embedding * WIP conversion script: added parameters for the informer embed * WIP conversion script: change dim_feedforward=2048 * WIP conversion script: remove unused args for loading checkpoint * just cleaning up * DataEmbedding removed, after thinking with Kashif * working on forward pass * WIP forward pass: trying to establish working batch for forward pass * cleaning and finalizing * adding HF names and docs * init after cleaning works * WIP in tests * added docs for the informer specific args * fix style * undo change * cleaning informer, now need to work only enc-dec * initial enc-dec classes * added encoder and decoder * added todo * add todos for conv_layers * added decoder docs from vanilla * added encoder docs from vanilla * remove encoder decoder from the original informer * removed AttentionLayer from the original paper * removed TriangularCausalMask, same as decoder_attention_mask * initial sparse attention * use conv_layers * fixed test_config test * fix parenthesis when itearting zip(layers, conv_layers) * error found in prob attention, added sizes as comments * fix sizes * added proposal for q_reduce indexing, and remove unused * WIP ProbMask, and changed factor=2 for testing * remove unused libs for this PR for creating the env * fix checking the attn_weights.size() after bmm * Q_reduce: changed from torch.gather to simple slicing * WIP calculate final attn_output * finish adding v_aggregated, attn_output ready * changed tgt_len to u in attention_mask, need to fix the size error * comment attention_mask for encoder, and fix if cond for v_agg * added ProbMask support (wip), removed old original code * finished ProbMask
😂 * Revert "remove unused libs for this PR for creating the env" This reverts commit 11a081e09e92771e51a5d2758d53a9afb59547f0. * fixes * make style * fix initial tests * fix more tests * dry * make style * remove unused files * style * added integration tests * fix num_static_real_features * fix header * remove unused function * fix example * fix docs * Update src/transformers/models/informer/configuration_informer.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/informer/modeling_informer.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/informer/configuration_informer.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/informer/configuration_informer.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/informer/configuration_informer.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/informer/configuration_informer.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * fixes for reviewer * use prediction_length from model * fix style * fixed informer.mdx * added to index * updated readme * undo * make fix-copies * typo * fix copy * added Informer to toctree * in order * fixed comments * remove unneeded new lines in docs * make static real and cat optional * fix use of distil conv layers * fixed integration test * added checkpoint for convlayer * make fix-copies * updated from time series model * make fix-copies * copy decoder * fix unit tests * updated scaling config * fix integration tests * IGNORE_NON_TESTED * IGNORE_NON_AUTO_CONFIGURED * IGNORE_NON_AUTO_CONFIGURED * updated check configs * fix formatting * undo change from time series * prediction_length should not be None * align with the blog: prettify ProbSparse and change attention_factor to sampling_factor * make style * make fix-copies * niels CR: update contributed by * niels CR: update configuration_informer.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * niels CR: update kashif -> huggingface Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * niels CR: `sampling_factor` only relevant when `attention_type`=prob * make style * fixed U_part: added multiplication by `L_Q` * fixed bug: remove `is not None` from `if config.distil` * fixed test: `decoder_seq_length` to `encoder_seq_length` in cross_attentions check * fix integration tests * updated model hub * do not shift as in training * undo * fix make-copies * make fix-copies * added `if prediction_length is None` * changed `ProbSparseAttention` to `InformerProbSparseAttention` * changed `V_sum` -> `v_mean_dim_time` * changed `ConvLayer` to `InformerConvLayer` and fixed `super()` * TimeSeriesTransformer->Informer in decoder's Copied from * more descriptive in ProbSparse * make style * fix copied from * Revert "added `if prediction_length is None`" This reverts commit b4cbddfa05e3bd739b79569cd3c3b89e316f2451. * fixed indent * use InformerSinusoidalPositionalEmbedding * make fix-style * fix from #21860 * fix name * make fix-copies * use time series utils * fix dec num_heads * docstring * added time series util doc * _import_structure * formatting * changes from review * make style * fix docs * fix doc * removed NegativeLogLikelihood --------- Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
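A hedged instantiation sketch for the new model, using only options named in the commit message above (`prediction_length`, `attention_type`, `sampling_factor`); the specific values are illustrative:
```python
from transformers import InformerConfig, InformerModel

# sampling_factor only matters when attention_type="prob" (the ProbSparse case)
config = InformerConfig(prediction_length=12, attention_type="prob", sampling_factor=2)
model = InformerModel(config)
```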
-
NielsRogge authored
* First draft * Fix to_dict * Improve conversion script * Update config * Remove timm dependency * Fix dummies * Fix typo, add integration test * Upload 101 model as well * Remove timm dummies * Fix style --------- Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
-
Arthur authored
* add create_pr arg * style * add test * fixup * update test * last nit: fix typo * add `is_pt_tf_cross_test` marker for the tests
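Assuming this is the `create_pr` flag exposed through `push_to_hub`, a sketch of its effect (repo id is a placeholder): with `create_pr=True`, the upload opens a pull request on the Hub repository instead of committing directly to the main branch.
```python
from transformers import AutoModel

model = AutoModel.from_pretrained("bert-base-uncased")
# Opens a PR on the target repo rather than pushing to main.
model.push_to_hub("my-user/my-model", create_pr=True)
```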
-
Matt authored
* Stop requiring Torch for our TF examples! * Slight tweak to logging in the example itself
-
Sanchit Gandhi authored
* [Whisper] Add model for audio classification * make fix-copies * add to docs * add docstring * empty returns * add code example * switch to fleurs * stick everything on one line
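A minimal usage sketch for the new class; the checkpoint name and the 16 kHz mono `raw_audio` array are assumptions, not taken from the commit:
```python
import torch
from transformers import AutoFeatureExtractor, WhisperForAudioClassification

ckpt = "sanchit-gandhi/whisper-medium-fleurs-lang-id"  # illustrative checkpoint
feature_extractor = AutoFeatureExtractor.from_pretrained(ckpt)
model = WhisperForAudioClassification.from_pretrained(ckpt)

# raw_audio: 1-D float array of 16 kHz mono audio (assumed to exist)
inputs = feature_extractor(raw_audio, sampling_rate=16_000, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(model.config.id2label[logits.argmax(-1).item()])
```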
-
Yih-Dar authored
skip test_multi_gpu_data_parallel_forward for some model tests Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Yih-Dar authored
* better check * better check --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
regisss authored
Remove cast to Bool
-
NielsRogge authored
* Fix integration test * Add test * Add test
-
Elad Segal authored
* Fix MinNewTokensLengthLogitsProcessor when used with a list of eos tokens * fix docs * Empty commit * formatting
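A sketch of the fixed case, with `eos_token_id` given as a list (the ids and lengths are illustrative):
```python
from transformers import MinNewTokensLengthLogitsProcessor

# Forces at least 10 new tokens after a 5-token prompt before any of the
# listed EOS token ids may be generated.
processor = MinNewTokensLengthLogitsProcessor(
    prompt_length_to_skip=5,
    min_new_tokens=10,
    eos_token_id=[2, 50256],  # the fixed case: a list of eos ids
)
```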
-
amyeroberts authored
* Add check before int casting for PIL conversion * Line length * Tidier logic
-
Yih-Dar authored
* update expected values for jukebox * update expected values for jukebox * update expected values for jukebox * update expected values for jukebox * update expected values for jukebox --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
- 06 Mar, 2023 6 commits
-
-
PD Hall authored
* docs: improve clarity for clm/mlm * docs: remove incorrect explanation * docs: remove incorrect explanation --------- Co-authored-by: pdhall99 <pdhall99>
-
Karim Foda authored
-
Karim Foda authored
-
Karim Foda authored
Make Format
-
Karim Foda authored
-
Karim Foda authored
-