- 16 Aug, 2023 3 commits
-
Yih-Dar authored
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Marc Sun authored
fix test
-
Joao Gante authored
-
- 14 Aug, 2023 1 commit
-
amyeroberts authored
* Remove softmax for EfficientNet
* Update integration test values
* Fix up
-
- 11 Aug, 2023 3 commits
-
amyeroberts authored
Make CI less brittle
-
Joao Gante authored
-
amyeroberts authored
* Refactor image processor test mixin
  - Move test_call_numpy, test_call_pytorch, test_call_pil to mixin
  - Rename mixin to reflect handling of logic more than saving
  - Add prepare_image_inputs, expected_image_outputs for tests
* Fix for oneformer
-
- 09 Aug, 2023 1 commit
-
Yoach Lacombe authored
* update bark generation configs for more coherent parameters
* make style
* update bark hub repo
-
- 08 Aug, 2023 2 commits
-
Yih-Dar authored
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Yih-Dar authored
* fix
* fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
- 07 Aug, 2023 2 commits
-
Pedro Lira authored
* Add mask2former fp16 support
* Clear consistency/quality issues
* Fix consistency/quality (2)
* Add integration test for mask2former (fp16 case)
* Fix code quality
* Add integration test for maskformer (fp16 case)
* Add integration test for oneformer (fp16 case)
* Remove slow decorator from fp16 tests
* Fix lint
* Remove usage of full inference and value checks for fp16
* Temporarily comment slow for {mask, mask2, one}former
* Add fp16 support to oneformer
* Revert "Temporarily comment slow for {mask, mask2, one}former"
  This reverts commit e5371edabd301cf56079def0421a0a87df307cb0.
* Remove dtype conversion noop
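A minimal sketch of the half-precision usage this commit enables; the checkpoint name, device placement, and random test image are illustrative assumptions, not taken from the PR:

```python
import numpy as np
import torch
from PIL import Image
from transformers import AutoImageProcessor, Mask2FormerForUniversalSegmentation

ckpt = "facebook/mask2former-swin-tiny-coco-instance"  # illustrative checkpoint
processor = AutoImageProcessor.from_pretrained(ckpt)
model = Mask2FormerForUniversalSegmentation.from_pretrained(ckpt, torch_dtype=torch.float16).to("cuda")

# Random image just to keep the sketch self-contained.
image = Image.fromarray(np.random.randint(0, 255, (480, 640, 3), dtype=np.uint8))
inputs = processor(images=image, return_tensors="pt").to("cuda")
inputs["pixel_values"] = inputs["pixel_values"].half()  # match the model's fp16 weights

with torch.no_grad():
    outputs = model(**inputs)
```
-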
Yih-Dar authored
* fix
* fix
* fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
- 04 Aug, 2023 1 commit
-
Yih-Dar authored
* temp
* update
* update
* update
* small dim
* small dim
* small dim
* fix
* update
* fix
* fix
* fix
* fix
* fix
* fix
* fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
- 03 Aug, 2023 2 commits
-
Yoach Lacombe authored
* add generate method to SpeechT5ForTextToSpeech
* update speecht5forTTS docstrings
* Remove defaults to None in generate docstrings
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
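A hedged sketch of the new entry point, assuming the added generate method mirrors the existing generate_speech signature; the zero speaker embedding is a placeholder (a real 512-dim x-vector gives usable audio):

```python
import torch
from transformers import SpeechT5ForTextToSpeech, SpeechT5HifiGan, SpeechT5Processor

processor = SpeechT5Processor.from_pretrained("microsoft/speecht5_tts")
model = SpeechT5ForTextToSpeech.from_pretrained("microsoft/speecht5_tts")
vocoder = SpeechT5HifiGan.from_pretrained("microsoft/speecht5_hifigan")

inputs = processor(text="Hello, this is a test.", return_tensors="pt")
speaker_embeddings = torch.zeros((1, 512))  # placeholder x-vector

# Assumed to behave like model.generate_speech(...), returning a waveform tensor.
speech = model.generate(inputs["input_ids"], speaker_embeddings=speaker_embeddings, vocoder=vocoder)
```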
-
amyeroberts authored
* Update InstructBLIP values
  Note: the tests are not independent. Running the test independently produces different logits compared to running all the integration tests
* Update test values after rescale update
* Remove left over commented out code
* Revert to previous rescaling logic
* Update rescale tests
-
- 02 Aug, 2023 4 commits
-
Yih-Dar authored
* CI with layers=2
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Patrick von Platen authored
* [MMS] Fix mms
* [MMS] Fix mms
* fix mms loading
* Apply suggestions from code review
* make style
* Update tests/models/wav2vec2/test_modeling_wav2vec2.py
-
Yupeng Jia authored
* Update modeling_deformable_detr.py
  Fix bugs for two stage training
* Update modeling_deformable_detr.py
* Add test_two_stage_training to DeformableDetrModelTest
Co-authored-by: yupeng.jia <yupeng.jia@momenta.ai>
-
amyeroberts authored
Rescale tests - cast to float after rescaling to reflect #25229
-
- 01 Aug, 2023 1 commit
-
Younes Belkada authored
* add `require_bitsandbytes` on MPT integration tests
* add it on mpt as well
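For reference, `require_bitsandbytes` lives in `transformers.testing_utils` and skips a test when bitsandbytes is not installed. A sketch of how such a test is typically guarded (the class and test names here are illustrative, not the actual test added by this commit):

```python
import unittest

from transformers.testing_utils import require_bitsandbytes, require_torch_gpu, slow


class MptIntegrationTest(unittest.TestCase):
    @slow
    @require_torch_gpu
    @require_bitsandbytes  # skipped unless bitsandbytes is installed
    def test_generate_8bit(self):
        ...  # load the model with load_in_8bit=True and check the generations
```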
-
- 31 Jul, 2023 2 commits
-
Yih-Dar authored
* update tiny_model_summary.json
* update
* update
* update
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Yih-Dar authored
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
- 28 Jul, 2023 4 commits
-
amyeroberts authored
* Fix rescaling bug
* Add tests
* Update integration tests
* Fix up
* Update src/transformers/image_transforms.py
* Update test - new possible order in list
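A simplified, self-contained sketch of the rescale-then-cast behaviour these tests pin down (not the exact code in image_transforms.py):

```python
import numpy as np


def rescale(image: np.ndarray, scale: float, dtype=np.float32) -> np.ndarray:
    # Scale pixel values (e.g. uint8 0-255 -> float 0-1) and cast afterwards,
    # so integer inputs do not silently truncate the scaled values.
    rescaled = image.astype(np.float64) * scale
    return rescaled.astype(dtype)


pixels = np.random.randint(0, 256, size=(3, 224, 224), dtype=np.uint8)
normalized = rescale(pixels, scale=1 / 255)
assert normalized.dtype == np.float32 and normalized.max() <= 1.0
```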
-
Sanchit Gandhi authored
* move to device
* update with cuda values
* fix fp16
* more rigorous
-
Younes Belkada authored
* fix instruct blip slow test
* Update tests/models/instructblip/test_modeling_instructblip.py
-
Younes Belkada authored
fix mpt slow test
-
- 27 Jul, 2023 3 commits
-
Sanchit Gandhi authored
* First commit
* step 1 working
* add alibi
* placeholder for `scan`
* add matrix mult alibi
* beta scaling factor for bmm
* working v1 - simple forward pass
* move layer_number from attribute to arg in call
* partial functioning scan
* hacky working scan
* add more modifs
* add test
* update scan for new kwarg order
* fix position_ids problem
* fix bug in attention layer
* small fix - do the alibi broadcasting only once
* prelim refactor
* finish refactor
* alibi shifting
* incorporate dropout_add to attention module
* make style
* make padding work again
* update
* remove bogus file
* up
* get generation to work
* clean code a bit
* added small tests
* adding alibi test
* make CI tests pass:
  - change init weight
  - add correct tuple for output attention
  - add scan test
  - make CI tests work
* fix few nits
* fix nit onnx
* fix onnx nit
* add missing dtype args to nn.Modules
* remove debugging statements
* fix scan generate
* Update modeling_flax_bloom.py
* Update test_modeling_flax_bloom.py
* Update test_modeling_flax_bloom.py
* Update test_modeling_flax_bloom.py
* fix small test issue + make style
* clean up
* Update tests/models/bloom/test_modeling_flax_bloom.py
  Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* fix function name
* small fix test
* forward contrib credits from PR17761
* Fix failing test
* fix small typo documentation
* fix non passing test - remove device from build alibi
* refactor call - refactor `FlaxBloomBlockCollection` module
* make style
* upcast to fp32
* cleaner way to upcast
* remove unused args
* remove layer number
* fix scan test
* make style
* fix i4 casting
* fix slow test
* Update src/transformers/models/bloom/modeling_flax_bloom.py
  Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* remove `layer_past`
* refactor a bit
* fix `scan` slow test
* remove useless import
* major changes - remove unused code - refactor a bit - revert import `torch`
* major refactoring - change build alibi
* remove scan
* fix tests
* make style
* clean-up alibi
* add integration tests
* up
* fix batch norm conversion
* style
* style
* update pt-fx cross tests
* update copyright
* Update src/transformers/modeling_flax_pytorch_utils.py
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* per-weight check
* style
* line formats
Co-authored-by: younesbelkada <younesbelkada@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: haileyschoelkopf <haileyschoelkopf@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
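A short usage sketch for the new Flax port; the checkpoint choice and the note about PyTorch-only weights are assumptions, not taken from the PR:

```python
from transformers import AutoTokenizer, FlaxBloomForCausalLM

tokenizer = AutoTokenizer.from_pretrained("bigscience/bloom-560m")
# Pass from_pt=True here if the checkpoint only ships PyTorch weights.
model = FlaxBloomForCausalLM.from_pretrained("bigscience/bloom-560m")

inputs = tokenizer("The Flax port of BLOOM", return_tensors="np")
outputs = model.generate(inputs["input_ids"], max_new_tokens=20, do_sample=False)
print(tokenizer.decode(outputs.sequences[0], skip_special_tokens=True))
```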
-
Yoach Lacombe authored
* initial Bark offload proposal
* use hooks instead of manually offloading
* add test of bark offload to cpu feature
* Apply nit suggestions from code review
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update docstrings of offload
  Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* remove unnecessary set_seed in Bark tests
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
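A sketch of the offload feature; the method name enable_cpu_offload, the checkpoint, and the voice preset are assumptions about this PR's API, and accelerate plus a CUDA device are required:

```python
from transformers import AutoProcessor, BarkModel

processor = AutoProcessor.from_pretrained("suno/bark-small")
model = BarkModel.from_pretrained("suno/bark-small")

# Assumed API from this PR: keep the sub-models on CPU and move each one to the
# GPU only for its step of the generation pipeline.
model.enable_cpu_offload()

inputs = processor("Hello, my dog is cooler than you!", voice_preset="v2/en_speaker_6")
audio_array = model.generate(**inputs)
```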
-
Arthur authored
* support from pretrained args
* draft addition of tests
* update test
* use parent assert true
* Update src/transformers/models/mpt/configuration_mpt.py
  Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
-
- 26 Jul, 2023 2 commits
-
amyeroberts authored
* Enable return_dict in order to compile
* Update tests
-
Yih-Dar authored
update
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
- 25 Jul, 2023 4 commits
-
Sebastian Husch Lee authored
* Initial addition of t5forsequenceclassification
* Adding imports and adding tests
* Formatting
* Running make fix-copies
* Adding mt5forseq
* Formatting
* run make fix-copies
* Adding to docs
* Add model_parallel
* Fix bug
* Fix
* Remove TODO
* Fixing tests for T5ForSequenceClassification
* Undo changes to dependency_versions_table.py
* Change classification head to work with T5Config directly
* Change seq length to let tests pass
* PR comments for formatting
* Formatting
* Initial addition of UMT5ForSequenceClassification
* Adding to inits and formatting
* run make fix-copies
* Add doc for UMT5ForSeqClass
* Update UMT5 config
* Fix docs
* Skip torch fx test for SequenceClassification
* Formatting
* Add skip to UMT5 tests as well
* Fix umt5 tests
* Running make fix-copies
* PR comments
* Fix for change to sentence_representation
* Rename seq_len to hidden_size since that's what it is
* Use base_model to follow format of the rest of the library
* Update docs
* Extract the decoder_input_ids changes and make one liner
* Make one-liner
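A minimal sketch of the new head; the checkpoint and label count are arbitrary, and the classification weights are freshly initialised, so the model needs fine-tuning before its predictions mean anything:

```python
from transformers import AutoTokenizer, T5ForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("t5-small")
# The classification head is newly initialised; fine-tune before relying on the logits.
model = T5ForSequenceClassification.from_pretrained("t5-small", num_labels=2)

inputs = tokenizer("This library keeps getting better.", return_tensors="pt")
logits = model(**inputs).logits  # shape: (1, num_labels)
```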
-
Arthur authored
* draft add new model like
* some cleaning of the config
* nits
* add nested configs
* nits
* update
* update
* added layer norms + triton kernels
* consider only LPLayerNorm for now.
* update
* all keys match.
* Update
* fixing nits here and there
* working forward pass.
* removed einops dependency
* nits
* format
* add alibi
* byebye head mask
* refactor attention
* nits.
* format
* fix nits.
* nuke ande updates
* nuke tokenizer test
* don't reshape query with kv heads
* added a bit of documentation.
* remove unneeded things
* nuke more stuff
* nit
* logits match - same generations
* rm unneeded methods
* 1 remaining failing CI test
* nit
* fix nits
* fix docs
* fix docs
* rm tokenizer
* fixup
* fixup
* fixup and fix tests
* fixed configuration object.
* use correct activation
* few minor fixes
* clarify docs a bit
* logits match à 1e-12
* skip and unskip a test
* added some slow tests.
* fix readme
* add more details
* Update docs/source/en/model_doc/mpt.md
  Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Apply suggestions from code review
  Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* fix configuration issues
* more fixes in config
* added more models
* Apply suggestions from code review
  Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* remove unneeded position ids
* fix some comments
* Apply suggestions from code review
  Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* revert suggestion
* mpt alibi + added batched generation
* Update src/transformers/models/mpt/__init__.py
  Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* remove init config
* Update src/transformers/models/mpt/configuration_mpt.py
  Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* fix nit
* add another slow test
* Apply suggestions from code review
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* fits in one line
* some refactor because make fixup doesn't pass
* add ft notebook
* update md
* correct doc path
Co-authored-by: younesbelkada <younesbelkada@gmail.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
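A hedged loading sketch for the newly supported architecture; the checkpoint, dtype, and device_map are illustrative choices (device_map needs accelerate), not something prescribed by the PR:

```python
import torch
from transformers import AutoTokenizer, MptForCausalLM

tokenizer = AutoTokenizer.from_pretrained("mosaicml/mpt-7b")
model = MptForCausalLM.from_pretrained(
    "mosaicml/mpt-7b", torch_dtype=torch.bfloat16, device_map="auto"
)

inputs = tokenizer("MPT is a decoder-only transformer that", return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=30)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```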
-
Xuehai Pan authored
-
Sylvain Gugger authored
* Fix last models for common tests that are too big.
* Remove print statement
-
- 24 Jul, 2023 2 commits
-
Rinat authored
* pull and push updates
* add docs
* fix modeling
* Add and run test
* make copies
* add task
* fix tests and fix small issues
* Checks on a Pull Request
* fix docs
* add desc pvt.md
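A short classification sketch for the new PVT model; the checkpoint name is assumed and the random image only keeps the example self-contained:

```python
import numpy as np
from PIL import Image
from transformers import AutoImageProcessor, PvtForImageClassification

ckpt = "Zetatech/pvt-tiny-224"  # assumed checkpoint name
processor = AutoImageProcessor.from_pretrained(ckpt)
model = PvtForImageClassification.from_pretrained(ckpt)

image = Image.fromarray(np.random.randint(0, 255, (224, 224, 3), dtype=np.uint8))
inputs = processor(images=image, return_tensors="pt")
predicted_label = model(**inputs).logits.argmax(-1).item()
print(model.config.id2label[predicted_label])
```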
-
Sylvain Gugger authored
* Make more test models tiny
* Make more test models tiny
* More models
* More models
-
- 21 Jul, 2023 1 commit
-
Arthur authored
* pad token should be None by default
* fix tests
* nits
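With no default pad token, code that pads batches has to set one explicitly. A sketch of the usual workaround; the checkpoint is illustrative only:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("huggyllama/llama-7b")  # illustrative checkpoint
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token  # reuse EOS as the padding token

batch = tokenizer(["a short prompt", "a noticeably longer prompt"], padding=True, return_tensors="pt")
```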
-
- 20 Jul, 2023 1 commit
-
Tom Aarsen authored
* Resolve typo in check_repo.py
* Specify encoding when opening modeling files
* Deprecate the OpenLlama architecture
* Add disclaimer pointing to Llama
  I'm open to different wordings here
* Match the capitalisation of LLaMA
-
- 18 Jul, 2023 1 commit
-
Arthur authored
* add llama
* add other readmes
* update padding id in readme
* add link to paper
* fix paths and tokenizer
* more nits
* styling
* fit operation in 2 lines when possible
* nits
* Apply suggestions from code review
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* add form
* update readme
* update readme, we don't have a default pad token
* update test and tokenization
* LLaMA instead of Llama
* nits
* add expected text
* add greedy output
* styling
* Update src/transformers/models/llama/modeling_llama.py
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* sequential device map
* skip relevant changes
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-