Commits · 07c54413ac24f891fc37920f6c61ad8b7b035dc3 · chenpangpang / transformers

02 Jun, 2023 2 commits

Shehan Munasinghe authored Jun 02, 2023



* generated code from add-new-model-like

* Add code for modeling, config, and weight conversion

* add tests for image-classification, update modeling and config

* add code, tests for semantic-segmentation

* make style, make quality, make fix-copies

* make fix-copies

* Update modeling_mobilevitv2.py

fix bugs

* Update _toctree.yml

* update modeling, config

fix bugs

* Edit docs - fix bug MobileViTv2v2 -> MobileViTv2

* Update mobilevitv2.mdx

* update docstrings

* Update configuration_mobilevitv2.py

make style

* Update convert_mlcvnets_to_pytorch.py

remove unused options

* Update convert_mlcvnets_to_pytorch.py

make style

* Add suggestions from code review
Co-Authored-By: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* make style, make quality

* Add suggestions from code review
Co-Authored-By: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Add suggestions from code review

Remove MobileViTv2ImageProcessor
Co-Authored-By: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* make style

* Add suggestions from code review

Rename MobileViTv2 -> MobileViTV2
Co-Authored-By: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Add suggestions from code review
Co-Authored-By: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update modeling_mobilevitv2.py

make style

* Update serialization.mdx

* Update modeling_mobilevitv2.py

---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

07c54413

[MMS] Scaling Speech Technology to 1,000+ Languages | Add attention adapter to Wav2Vec2 (#23813) · 5dfd407b

Patrick von Platen authored Jun 02, 2023



* add fine-tuned with adapter layer

* Add set_target_lang to tokenizer

* Implement load adapter

* add tests

* make style

* Apply suggestions from code review

* Update src/transformers/models/wav2vec2/tokenization_wav2vec2.py

* make fix-copies

* Apply suggestions from code review

* make fix-copies

* make style again

* make style again

* fix doc string

* Update tests/models/wav2vec2/test_tokenization_wav2vec2.py

* Apply suggestions from code review

* fix

* Correct wav2vec2 adapter

* make style

* Update src/transformers/models/wav2vec2/modeling_wav2vec2.py
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

* add more nice docs

* finish

* finish

* Apply suggestions from code review
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Apply suggestions from code review
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Apply suggestions from code review

* all finish

---------
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

5dfd407b

01 Jun, 2023 12 commits

Fix `ReduceLROnPlateau` object has no attribute 'get_last_lr' (#23944) · f49a3453
wasupandceacar authored Jun 02, 2023
```
* Fix 'ReduceLROnPlateau' object has no attribute 'get_last_lr'

* fix style
```
f49a3453
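The error named in this commit title comes from calling `get_last_lr()` on a scheduler that does not define it. In the PyTorch releases current at the time, `ReduceLROnPlateau` was such a scheduler. A minimal sketch of one way to guard the call follows. It is not the code from the commit: the `Fake*` classes and `current_learning_rate` are hypothetical stand-ins so the example runs without PyTorch, and only the `param_groups` layout mirrors the real optimizer attribute.

```python
# Hypothetical stand-ins so the sketch runs without PyTorch installed.
class FakeOptimizer:
    def __init__(self, lr):
        # Same shape as torch.optim.Optimizer.param_groups: a list of dicts.
        self.param_groups = [{"lr": lr}]


class FakeStepScheduler:
    """Stand-in for a scheduler that exposes get_last_lr()."""

    def __init__(self, optimizer):
        self.optimizer = optimizer

    def get_last_lr(self):
        return [group["lr"] for group in self.optimizer.param_groups]


class FakePlateauScheduler:
    """Stand-in for a scheduler that has no get_last_lr() method."""

    def __init__(self, optimizer):
        self.optimizer = optimizer


def current_learning_rate(optimizer, scheduler):
    """Return the learning rate of the first parameter group."""
    get_last_lr = getattr(scheduler, "get_last_lr", None)
    if callable(get_last_lr):
        return get_last_lr()[0]
    # Fall back to reading the value straight from the optimizer.
    return optimizer.param_groups[0]["lr"]


optimizer = FakeOptimizer(lr=0.001)
step_lr = current_learning_rate(optimizer, FakeStepScheduler(optimizer))
plateau_lr = current_learning_rate(optimizer, FakePlateauScheduler(optimizer))
```

Both calls return the same value here because the stand-in schedulers never change the learning rate; the point is only that the second call does not raise `AttributeError`.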
use _make_causal_mask in clip/vit models (#23942) · c62b01d0
Kashif Rasul authored Jun 01, 2023
```
use _make_causal_mask in clip models
```
c62b01d0

Modify device_map behavior when loading a model using from_pretrained (#23922) · e03a9cc0

Marc Sun authored Jun 01, 2023



* Modify device map behavior for 4/8 bits model

* Remove device_map arg for training 4/8 bit model

* Remove index
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Add Exceptions

* Modify comment
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Fix formatting

* Get current device with accelerate

* Revert "Get current device with accelerate"

This reverts commit 46f00799103bbe15bd58762ba029aab35363c4f7.

* Fix Exception

* Modify quantization doc

* Fix error
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

---------
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

e03a9cc0

#23675 Registering Malay language (#23689) · d1fa349e

Brendon Soong authored Jun 02, 2023

* #23675 Registering Malay language

* removing untranslated files

* some translate

* more updates to toctree

* inc index

* additional translations for toctree

* translations of more sections

* removing untranslated file

* translated index.mdx to malay

d1fa349e

Revert "Update stale.yml to use HuggingFaceBot" (#23943) · dc67da01
Lysandre Debut authored Jun 01, 2023
```
Revert "Update stale.yml to use HuggingFaceBot (#23941)"

This reverts commit 5929f86e.
```
dc67da01
Make TF ESM inv_freq non-trainable like PyTorch (#23940) · 8088ca41
Matt authored Jun 01, 2023
```
Make TF inv_freq non-trainable like PyTorch
```
8088ca41
Update stale.yml to use HuggingFaceBot (#23941) · 5929f86e
Lysandre Debut authored Jun 01, 2023

5929f86e
rename DocumentQuestionAnsweringTool parameter input to match docstring (#23939) · 857d4e1c
Adam Lewis authored Jun 01, 2023
```
rename encode input to match docstring
```
857d4e1c
Pin rhoknp (#23937) · 91931882
Sylvain Gugger authored Jun 01, 2023

91931882
Fix doc string nits (#23929) · af2c3679
Sheon Han authored Jun 01, 2023

af2c3679
Effectively allow `encoder_outputs` input to be a tuple in pix2struct (#23932) · 9a35a7b9
fxmarty authored Jun 01, 2023
```
consistency
```
9a35a7b9
[Flax Whisper] Update decode docstring (#23908) · 9603ef89
Sanchit Gandhi authored Jun 01, 2023

9603ef89

31 May, 2023 26 commits

Skip device placement for past key values in decoder models (#23919) · fabe17a7
Sylvain Gugger authored May 31, 2023

fabe17a7
[PushToHub] Make it possible to upload folders (#23920) · 6affd9cd
NielsRogge authored May 31, 2023
```
Add first draft
```
6affd9cd
Update the update metadata job to use upload_folder (#23917) · 4aa13224
Sylvain Gugger authored May 31, 2023

4aa13224

Re-enable squad test (#23912) · 3ff443a6

Sylvain Gugger authored May 31, 2023

* Re-enable squad test

* [all-test]

* [all-test] Fix all test command

* Fix the all-test

3ff443a6

remove the extra `accelerator.prepare` (#23914) · d13021e3
Sourab Mangrulkar authored May 31, 2023
```
remove the extra `accelerator.prepare` that slipped in with multiple update from main 😅
```
d13021e3
Bug fix - flip_channel_order for channels first images (#23701) · c608b8fc
amyeroberts authored May 31, 2023
```
Bug fix - flip_channel_order for channels_first
```
c608b8fc
Empty circleci config (#23913) · 0b3d092f
Sylvain Gugger authored May 31, 2023
```
* Try easy first

* Add an empty job

* Fix name

* Fix method
```
0b3d092f
Raise error if loss can't be calculated - ViT MIM (#23872) · 8714b964
amyeroberts authored May 31, 2023
```
Raise error if loss can't be calculated
```
8714b964
add conditional statement for auxiliary loss calculation (#23899) · 404d9253
Hari authored May 31, 2023
```
* add conditional statement for auxiliary loss calculation

* fix style and copies
```
404d9253
[`RWKV`] Fix RWKV 4bit (#23910) · c63bfc30
Younes Belkada authored May 31, 2023
```
fix RWKV 4bit
```
c63bfc30
Upgrade safetensors version (#23911) · 55451c66
Zachary Mueller authored May 31, 2023
```
* Upgrade safetensors

* Second table
```
55451c66

fix: Replace `add_prefix_space` in `get_prompt_ids` with manual space for FastTokenizer compatibility (#23796) · 7adce8b5

Connor Henderson authored May 31, 2023

* add ' ' replacement for add_prefix_space

* add fast tokenizer test

7adce8b5
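The first bullet of this commit describes swapping a tokenizer option for a manually prepended space, so that slow and fast tokenizers receive the same input string. A minimal sketch of that idea, assuming the prompt is plain text; `with_leading_space` is a hypothetical helper name and is not taken from the library:

```python
def with_leading_space(text: str) -> str:
    """Hypothetical helper: trim surrounding whitespace, then prepend one space."""
    return " " + text.strip()


# The result always starts with exactly one space, regardless of the input padding.
prompt = with_leading_space("  Mr. Quilter  ")
```

Doing the normalization on the string itself keeps the behavior independent of whichever tokenizer class later encodes it.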

Move import check to before state reset (#23906) · 84bac652
Zachary Mueller authored May 31, 2023
```
* Move import check to before state reset

* Guard better
```
84bac652
[`bnb`] add warning when no linear (#23894) · e42869b0
Younes Belkada authored May 31, 2023
```
* add warning for gpt2-like models

* more details

* adapt from suggestions
```
e42869b0

Unpin numba (#23162) · 8f915c45

Sanchit Gandhi authored May 31, 2023

* fix for ragged list

* unpin numba

* make style

* np.object -> object

* propagate changes to tokenizer as well

* np.long -> "long"

* revert tokenization changes

* check with tokenization changes

* list/tuple logic

* catch numpy

* catch else case

* clean up

* up

* better check

* trigger ci

* Empty commit to trigger CI

8f915c45

ensure banned_mask and indices in same device (#23901) · d99f11e8

Xinyu Yang authored May 31, 2023

* ensure banned_mask and indices in same device

* ensure banned_mask and indices in same device

switch the order in which indices and banned_mask are created and create banned_mask on the proper device

d99f11e8

Support shared tensors (#23871) · d68d6665

Thomas Wang authored May 31, 2023



* Support shared storage

* Really be sure we have the same storage

* Make style

* - Refactor storage identifier mechanism
 - Group everything into a single for loop

* Make style

* PR

* make style

* Update src/transformers/pytorch_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

---------
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

d68d6665

Fix Trainer when model is loaded on a different GPU (#23792) · 68d53bc7
Sylvain Gugger authored May 31, 2023

68d53bc7
fix(configuration_llama): add `keys_to_ignore_at_inference` to `LlamaConfig` (#23891) · 0963a250
Calico authored May 31, 2023

0963a250
Skip failing test for now · 00f6ba0e
Sylvain Gugger authored May 31, 2023

00f6ba0e

accelerate deepspeed and gradient accumulation integrate (#23236) · a73b1d59

Sourab Mangrulkar authored May 31, 2023

* mixed precision support via accelerate

* fix issues

* fix for the sharded ddp case

* fix flax and tf failing tests

* refactor the place to create `Accelerator` object

* move ddp prep to accelerate

* fix 😅

* resolving comments

* move fsdp handling to accelerate

* fixes

* fix saving

* shift torch dynamo handling to accelerate

* shift deepspeed integration and save & load utils to accelerate

* fix accelerate launcher support

* oops

* fix 🐛

* save ckpt fix

* Trigger CI

* nasty 🐛 😅

* as deepspeed needs grad_acc fixes, transfer grad_acc to accelerate

* make tests happy

* quality ✨

* loss tracked needs to account for grad_acc

* fixing the deepspeed tests

* quality ✨

* 😅😅😅

* tests 😡

* quality ✨



* Trigger CI

* resolve comments and fix the issue with the previous merge from branch

* Trigger CI

* accelerate took over deepspeed integration

---------
Co-authored-by: Stas Bekman <stas@stason.org>

a73b1d59

Add TensorFlow implementation of EfficientFormer (#22620) · 88f50a1e

Denisa Roberts authored May 31, 2023

* Add tf code for efficientformer

* Fix return dict bug - return last hidden state after last stage

* Fix corresponding return dict bug

* Override test tol

* Change default values of training to False

* Set training to default False X3

* Rm axis from ln

* Set init in dense projection

* Rm debug stuff

* Make style; all tests pass.

* Modify year to 2023

* Fix attention biases codes

* Update the shape list logic

* Add a batch norm eps config

* Remove extract comments in test files

* Add conditional attn and hidden states return for serving output

* Change channel dim checking logic

* Add exception for withteacher model in training mode

* Revert layer count for now

* Add layer count for conditional layer naming

* Transpose for conv happens only in main layer

* Make tests smaller

* Make style

* Update doc

* Rm from_pt

* Change to actual expected image class label

* Remove stray print in tests

* Update image processor test

* Remove the old serving output logic

* Make style

* Make style

* Complete test

88f50a1e

Fix last instances of kbit -> quantized (#23797) · 9fea71b4
Sylvain Gugger authored May 31, 2023

9fea71b4
Fix bug leading to missing token in GPTSanJapaneseTokenizer (#23883) · 38dbbc26
Sam Passaglia authored May 31, 2023
```
* add \n

* removed copied from header
```
38dbbc26

shift torch dynamo handling to accelerate (#23168) · 03db5910

Sourab Mangrulkar authored May 31, 2023

* mixed precision support via accelerate

* fix issues

* fix for the sharded ddp case

* fix flax and tf failing tests

* refactor the place to create `Accelerator` object

* move ddp prep to accelerate

* fix 😅

* resolving comments

* move fsdp handling to accelerate

* fixex

* fix saving

* shift torch dynamo handling to accelerate

03db5910

move fsdp handling to accelerate (#23158) · 0b774074

Sourab Mangrulkar authored May 31, 2023

* mixed precision support via accelerate

* fix issues

* fix for the sharded ddp case

* fix flax and tf failing tests

* refactor the place to create `Accelerator` object

* move ddp prep to accelerate

* fix 😅

* resolving comments

* move fsdp handling to accelerate

* fixes

* fix saving

0b774074