Commits · ebd5258975ad8673bc4532ba5f6bdbe2496066ec · chenpangpang / transformers

01 Mar, 2023 6 commits

Change the way tensor is reshaped in BartAttention (from .view to .reshape) (#21860) · ebd52589

raghavanone authored Mar 01, 2023

* Change the .view call to .reshape

* Change the .view call to .reshape to all the copies from bart attention

* Fix copies and style

* Fix copies and style

* Fix copies and style

* Fix copies and style

* Fix copies and style

* Revert unnecessary changes

* Revert unnecessary changes

* Revert unnecessary changes

* Revert unnecessary changes

ebd52589

[deepspeed] check whether model is NLP one instead of counting on input type (#21800) · f71873c5

Eugene Zapolsky authored Mar 01, 2023



* trying to figure out whether model is NLP

* drop my changes and apply easier fix

* trying to handle all int input types

* fix logic

---------
Co-authored-by: Stas Bekman <stas@stason.org>

f71873c5

Fix gradient checkpointing bug Bart (#21866) · 72e9ca75
saswatmeher authored Mar 01, 2023
```
Co-authored-by: saswatmeher <saswatmeher@cse.iitb.ac.in>
```
72e9ca75
Flax beam search fix (#21857) · 5e6cd51b
Andy Ehrenberg authored Mar 01, 2023

5e6cd51b

[ConvBert] Fix #21523 (#21849) · b599b192

Arthur authored Mar 01, 2023

* fix reshaping
Fixes #21523

* add test

* styling

* last fixes

* Update src/transformers/models/convbert/modeling_convbert.py

* code quality

b599b192

prepare for "__floordiv__ is deprecated and its behavior will change in a... · 44e3e3fb

Arthur authored Mar 01, 2023

prepare for "__floordiv__ is deprecated and its behavior will change in a future version of pytorch" (#20211)

* rounding_mode = "floor" instead of // to prevent behavioral change

* add other TODO

* use `torch_int_div` from pytorch_utils

* same for tests

* fix copies

* style

* use relative imports when needed

* Co-authored-by: sgugger <sylvain.gugger@gmail.com>

44e3e3fb
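
For context on why `//` and `rounding_mode="floor"` can differ: the PyTorch deprecation warning quoted in the commit title concerned `__floordiv__` rounding toward zero rather than toward negative infinity, which only matters for negative quotients. The sketch below illustrates the two rounding modes in plain Python rather than with tensors; `trunc_div` and `floor_div` are hypothetical helper names, not part of PyTorch or transformers.

```python
import math


def trunc_div(a, b):
    """Divide and round toward zero (the 'trunc' rounding mode)."""
    return math.trunc(a / b)


def floor_div(a, b):
    """Divide and round toward negative infinity (the 'floor' rounding mode)."""
    return math.floor(a / b)


# The two modes agree for positive quotients and differ for negative ones.
for a, b in [(7, 2), (-7, 2), (7, -2), (-7, -2)]:
    print(f"{a} / {b}: trunc={trunc_div(a, b)} floor={floor_div(a, b)}")
```

Spelling out the rounding mode, as the commit does, pins the behavior so that a later change in what `//` means cannot silently alter index arithmetic.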

28 Feb, 2023 17 commits

Fix flaky test for log level (#21776) · b29e2dca
Sylvain Gugger authored Feb 28, 2023
```
* Fix flaky test for log level

* Fix other flaky test
```
b29e2dca

Improve TF weight loading, especially PT crossloading (#21792) · acfb714b

Matt authored Feb 28, 2023

* First commit for the improved PT-TF weight loading

* Remove workarounds from TFEncoderDecoder tests

* Allow a custom weight renaming function in from_pretrained and use that to clean up EncoderDecoder

* make fixup

* First attempt at visionencoderdecoder

* Disable tensorfloat32 in tests to get consistent outputs

* Quick fix to tf_vision_encoder_decoder tests

* make fixup

* Update Blenderbot tests

* Remove unused arg in modeling_tf_opt

* load_tf_sharded_weights had strict=True! This meant transfer learning was impossible, so I'm setting it to False.

* Support prefixes when loading sharded TF checkpoints

* make fixup

* Add test to load sharded models with a weight prefix

* Fix sharded weight loading test

* Add a test for transfer from a sharded checkpoint

* make fixup

* Add test to check that crossloading from PT with a prefix works

* Refactor from_pretrained in the encoderdecoder classes

* Refactor from_pretrained in the encoderdecoder classes

* missmatched -> mismatched

* Explicitly check for None

* No comments showing my very impressive and attractive knowledge of Py3.9+

* Disable TF32 across all TF tests

acfb714b

🔥 Rework pipeline testing by removing `PipelineTestCaseMeta` 🚀 (#21516) · 871c31a6

Yih-Dar authored Feb 28, 2023



* Add PipelineTesterMixin

* remove class PipelineTestCaseMeta

* move validate_test_components

* Add for ViT

* Add to SPECIAL_MODULE_TO_TEST_MAP

* style and quality

* Add feature-extraction

* update

* raise instead of skip

* add tiny_model_summary.json

* more explicit

* skip tasks not in mapping

* add availability check

* Add Copyright

* A way to disable irrelevant tests

* update with main

* remove disable_irrelevant_tests

* skip tests

* better skip message

* better skip message

* Add all pipeline task tests

* revert

* Import PipelineTesterMixin

* subclass test classes with PipelineTesterMixin

* Add pipeline_model_mapping

* Fix import after adding pipeline_model_mapping

* Fix style and quality after adding pipeline_model_mapping

* Fix one more import after adding pipeline_model_mapping

* Fix style and quality after adding pipeline_model_mapping

* Fix test issues

* Fix import requirements

* Fix mapping for MobileViTModelTest

* Update

* Better skip message

* pipeline_model_mapping could not be None

* Remove some PipelineTesterMixin

* Fix typo

* revert tests_fetcher.py

* update

* rename

* revert

* Remove PipelineTestCaseMeta from ZeroShotAudioClassificationPipelineTests

* style and quality

* test fetcher for all pipeline/model tests

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

871c31a6

Add loss for BridgeTowerForMaskedLM and BridgeTowerForImageAndTextRetrieval (#21684) · 4cb5ffa9

Anahita Bhiwandiwalla authored Feb 28, 2023



* Add loss for BridgeTowerForMaskedLM and BridgeTowerForImageAndTextRetrieval

* minor fix return_dict

* implement test for loss computation

---------
Co-authored-by: Tiep Le <97980157+tileintel@users.noreply.github.com>
Co-authored-by: Tiep Le <tiep.le@intel.com>

4cb5ffa9

[`Blip2`] Fix Blip-2 multi gpu (#21707) · 7f4f8b97

Younes Belkada authored Feb 28, 2023



* fix blip multi gpu

* fix

* final changes

* adapt suggestions

* fix failing slow test

* forward contrib credits from testing and suggestions

* reformat

---------
Co-authored-by: akkikiki <akkikiki@users.noreply.github.com>

7f4f8b97

Make Slack CI reporting stronger (#21823) · aab895c3

Yih-Dar authored Feb 28, 2023



* Use token

* Avoid failure

* better error

* Fix

* fix style

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

aab895c3

Add: task guide for zero shot object detection (#21829) · 6ca84458

Maria Khalusova authored Feb 28, 2023



* zero shot object detection part 1

* added batch prediction section

* added image guided object detection section

* make style

* added the task guide to the TOC

* minor polishing

* Apply suggestions from code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com>

* added embedded owlvit demo

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* minor fix

* make style

---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

6ca84458

[GPTJ] Fix gradient checkpointing bug (#21794) · 31fa2b6c

Herumb Shandilya authored Feb 28, 2023



* If applied, this commit fixes generate bug in gptj

* Remove extra same code block

* formatting and test fix

* Conflict fix and declaration error fix

---------
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

31fa2b6c

Fix the issue of blip model returning loss even when the label is not provided. (#21811) · eec76042

raghavanone authored Feb 28, 2023

* Fix the issue of blip model returning loss even when the label is not provided

* Fix ruff failure

* Incorporate PR feedbacks

* Incorporate PR feedbacks

* Incorporate PR feedbacks

* Incorporate PR feedbacks

eec76042

[`Blip2`] Add `Blip2Model` (#21817) · b8de7e44

Younes Belkada authored Feb 28, 2023

* add v1

* add `Blip2Model`

- add relevant functions
- add tests
- add on automapping

* fix docs

* fix doctest

b8de7e44

[`T5`] Fix torchquant issue (#21843) · ae9230af
Younes Belkada authored Feb 28, 2023
```
* fix torchquant issue

* add tests
```
ae9230af

Fix tf random token masking probability in data collator (#21834) · 2d506ea4

anruijian authored Feb 28, 2023

* fix tf random mask tokens probability

* fix tf random mask tokens probability in collator for language modeling

2d506ea4
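
The commit message does not spell out what the probability was or became, so the following is background only. In the standard BERT-style scheme, a selected token is replaced by the mask token 80% of the time, by a random token 10% of the time, and left unchanged 10% of the time. When the random-replacement draw is applied only to tokens that were not already masked, its conditional probability has to be 0.5 to yield an overall 10%. The sketch below works through that arithmetic; `masking_split` is a hypothetical helper, and whether this matches the exact change in #21834 is an assumption.

```python
from fractions import Fraction


def masking_split(p_mask_token, p_random_given_rest):
    """Overall (mask, random, unchanged) shares among the selected tokens.

    p_mask_token:        probability a selected token becomes the mask token
    p_random_given_rest: probability that a token NOT masked is replaced
                         by a random token (a conditional probability)
    """
    rest = 1 - p_mask_token
    random_share = rest * p_random_given_rest
    unchanged = rest - random_share
    return p_mask_token, random_share, unchanged


# Conditional probability 1/2 gives the intended 80/10/10 split.
intended = masking_split(Fraction(8, 10), Fraction(1, 2))

# Using 1/10 as the conditional probability gives 80/2/18 instead.
skewed = masking_split(Fraction(8, 10), Fraction(1, 10))

print(intended, skewed)
```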

Fix gradient checkpointing imagegpt (#21816) · 4fe744f5

Karim Foda authored Feb 28, 2023



* Fix gradient checkpointing bug in gptneox

* Fix gradient checkpointing bug in modeling_imagegpt.py

* Revert gpt neox changes

---------
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

4fe744f5

Fix gradient checkpointing bug in git (#21818) · e07a3d95
Karim Foda authored Feb 28, 2023
```
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
```
e07a3d95
check for None forced tokens (#21793) · 50db7414
Andy Ehrenberg authored Feb 28, 2023

50db7414
Fix gradient checkpointing bug BioGpt (#21844) · 50644cf6
saswatmeher authored Feb 28, 2023
```
Co-authored-by: saswatmeher <saswatmeher@cse.iitb.ac.in>
```
50644cf6
Rename `MobileViTModelTest` to `TFMobileViTModelTest` (#21825) · a9dd1243
Yih-Dar authored Feb 28, 2023
```
Let's give TF a bit more love ❤️ 🙏

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
a9dd1243

27 Feb, 2023 13 commits

introduce `logger.warning_once` and use it for grad checkpointing code (#21804) · c7f3abc2
Stas Bekman authored Feb 27, 2023
```
* logger.warning_once

* style
```
c7f3abc2
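
A warn-once helper of this kind can be sketched with the standard library alone by memoizing on the message, so repeated calls with the same text become cache hits and emit nothing. This is a minimal illustration under that assumption, not necessarily how transformers implements `logger.warning_once`; the logger name, the `ListHandler` class, and the message text are chosen for the example.

```python
import functools
import logging

logger = logging.getLogger("warning_once_demo")  # hypothetical logger name
logger.propagate = False  # keep the demo output out of the root logger


class ListHandler(logging.Handler):
    """Collects emitted messages so the effect is easy to inspect."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


handler = ListHandler()
logger.addHandler(handler)


@functools.lru_cache(maxsize=None)
def warning_once(message):
    """Log `message` as a warning the first time it is seen; later calls hit the cache."""
    logger.warning(message)


# Simulate a forward pass that would otherwise warn on every step.
for _ in range(3):
    warning_once("use_cache=True is incompatible with gradient checkpointing")

print(len(handler.messages))
```

One consequence of caching on the message is that two warnings differing only in an interpolated value are treated as distinct and each emitted once.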

Fix quality with `ruff==0.0.253` (#21828) · f95f60c8

Yih-Dar authored Feb 27, 2023



fix quality with ruff 0.0.253
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

f95f60c8

Inheritance-based framework detection (#21784) · 92dfceb1
Joao Gante authored Feb 27, 2023

92dfceb1
Fix gradient checkpointing bug in gptneox (#21815) · 7811bf7e
Karim Foda authored Feb 27, 2023
```
* Fix gradient checkpointing bug in gptneox

* Remove use_cache block
```
7811bf7e
Fix nn.init.trunc_normal_ call on torch.float16 data (#21789) · 0c7f93f5
fxmarty authored Feb 27, 2023
```
fix nn.init.trunc_normal_ call on half data
```
0c7f93f5
Fix PyTorch Perceiver `PerceiverFourierPositionEncoding` with fp16 (#21787) · ebf84f07
fxmarty authored Feb 27, 2023
```
* fix perceiver fp16

* hopefully fix tests
```
ebf84f07
[`tests`] add `accelerate` marker (#21743) · 831f3144
Younes Belkada authored Feb 27, 2023
```
* add `accelerate` marker

* add to docs

* Update docs/source/en/testing.mdx
```
831f3144

[torch] remove deprecated uint8 in favor of bool (#21384) · c51dc4f9

Arthur authored Feb 27, 2023



* uint8 -> bool

* fix copies

* style

* update test modeling common when checking attention buffers

* style

* use logical not on random mask instead of subtraction with 1

* remove torch uint8

* quality

* remove modified modeling utils

* Update based on review
Co-authored-by: sgugger <sylvain.gugger@gmail.com>

---------
Co-authored-by: sgugger <sylvain.gugger@gmail.com>

c51dc4f9

[Pipeline] Add zero shot audio classification pipeline (#21600) · cc44e72d

Arthur authored Feb 27, 2023



* add pipeline

* update init

* add zero shot to init

* update inits and correct checkpoints

* update base to support input features

* add tests

* Update src/transformers/pipelines/zero_shot_audio_classification.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

* Update src/transformers/pipelines/zero_shot_audio_classification.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

* update pipeline code

* use tiny checkpoint

* nits and expected value with tiny model

* style

* last nit on tests values

* fix styling

* fix collate fn that was casting to float

* update

---------
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

cc44e72d

[FX tracer] Make `concrete_args` from outside available (#21775) · 2ea1ef90
Tianqi Zhang (张天启) authored Feb 27, 2023
```
make concrete_args from outside available
```
2ea1ef90
Fix en documentation typos (#21799) · ba2a5f13
Thomas Paviot authored Feb 27, 2023
```
* fix wrong url

* typos in english documentation
```
ba2a5f13
Fix type in gpt2 config docstring (#21782) · a3698365
Julian Weber authored Feb 27, 2023
```
Fix docstring gpt2 config
```
a3698365

[examples/summarization] deal with `max_length` and `num_beams` (#21740) · 3c0ce608

bofeng huang authored Feb 27, 2023

* Override the decoding parameters of Seq2SeqTrainer

* Fix quality

* Fix max_length parameter

* Fix quality

* Remove redundant parameter max_length

* Separate the preprocess of train and validation to use different max_target_length

3c0ce608

25 Feb, 2023 1 commit

Fix resume_from_checkpoint for deepspeed (#21735) · 9ddf4f4f

Moshe Berchansky authored Feb 25, 2023



* Fix resume_from_checkpoint for deepspeed

Fix resume_from_checkpoint for deepspeed, by ensuring that the deepspeed engine is the one to load the checkpoint.

* Empty commit to trigger CI

* Removed deepspeed skipping 

Removed deepspeed skipping inside the _load_from_checkpoint function, as it is obsolete

* another adjustment

* Trigger CI

* trigger circleci

* style

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Stas Bekman <stas@stason.org>

9ddf4f4f

24 Feb, 2023 3 commits
- [SpeechT5] Fix HiFiGAN tests (#21788) · 3dae0d7b
  Sanchit Gandhi authored Feb 24, 2023
  
  3dae0d7b
- [GPT2, ProphetNet] Fix gradient checkpointing bug (#21772) · 59c1d5b9
  Yi Heng Lim authored Feb 24, 2023
```
* fix gradient checkpointing bug

* fix gradient checkpointing bug

* ran make fix-copies

* fixed bug

* fixed bug
```
  59c1d5b9
- [time series] updated expected values for integration test. (#21762) · ba0e370d
  Kashif Rasul authored Feb 24, 2023
```
* updated expected

* prediction_length fix

* prediction_length default value

* default prediction_length 24

* revert back prediction_length default

* move prediction_length test
```
  ba0e370d