"examples/vscode:/vscode.git/clone" did not exist on "725a56329d304bb997fc3cf3013c908a20ae50a0"
- 31 May, 2023 (26 commits)
-
Sylvain Gugger authored
* Re-enable squad test * [all-test] * [all-test] Fix all test command * Fix the all-test
-
Sourab Mangrulkar authored
remove the extra `accelerator.prepare` call that slipped in with multiple updates from main 😅
-
amyeroberts authored
Bug fix - flip_channel_order for channels_first
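A hedged sketch of what this fixes (illustrative function, not the library code): flipping the channel order, e.g. RGB to BGR, must reverse a different axis depending on the data format:

```python
import numpy as np

def flip_channel_order(image: np.ndarray, channels_first: bool) -> np.ndarray:
    """Reverse the channel axis (e.g. RGB -> BGR); illustrative sketch."""
    if channels_first:
        return image[::-1, ...]  # channel axis is 0 for (C, H, W)
    return image[..., ::-1]      # channel axis is last for (H, W, C)

img_chw = np.arange(48).reshape(3, 4, 4)  # channels-first
img_hwc = np.arange(48).reshape(4, 4, 3)  # channels-last
assert flip_channel_order(img_chw, channels_first=True)[0, 0, 0] == img_chw[2, 0, 0]
assert flip_channel_order(img_hwc, channels_first=False)[0, 0, 0] == img_hwc[0, 0, 2]
```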
-
Sylvain Gugger authored
* Try easy first * Add an empty job * Fix name * Fix method
-
amyeroberts authored
Raise error if loss can't be calculated
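A hedged sketch of the pattern (illustrative names, not the library code): when labels are passed but the configuration makes the loss impossible to compute, fail with an explicit error instead of silently returning no loss:

```python
import torch

def forward_loss(logits, labels=None, loss_supported=True):
    """Illustrative: raise instead of silently skipping the loss."""
    if labels is None:
        return None
    if not loss_supported:
        raise ValueError(
            "Labels were provided but the loss cannot be calculated "
            "for this model configuration."
        )
    return torch.nn.functional.cross_entropy(logits, labels)
```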
-
Hari authored
* add conditional statement for auxiliary loss calculation * fix style and copies
-
Younes Belkada authored
fix RWKV 4bit
-
Zachary Mueller authored
* Upgrade safetensors * Second table
-
Connor Henderson authored
fix: Replace `add_prefix_space` in `get_prompt_ids` with a manual space for FastTokenizer compatibility (#23796)
* add ' ' replacement for add_prefix_space
* add fast tokenizer test
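A hedged sketch of the workaround (not the exact PR diff): fast tokenizers may not accept `add_prefix_space` at call time, so the prompt text is given a literal leading space before encoding:

```python
def get_prompt_ids_sketch(tokenizer, text: str):
    # Instead of tokenizer(text, add_prefix_space=True), which a fast
    # tokenizer may reject, prepend the space manually.
    return tokenizer(" " + text.strip(), add_special_tokens=False).input_ids
```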
-
Zachary Mueller authored
* Move import check to before state reset * Guard better
-
Younes Belkada authored
* add warning for gpt2-like models * more details * adapt from suggestions
-
Sanchit Gandhi authored
* fix for ragged list
* unpin numba
* make style
* np.object -> object
* propagate changes to tokenizer as well
* np.long -> "long"
* revert tokenization changes
* check with tokenization changes
* list/tuple logic
* catch numpy
* catch else case
* clean up
* up
* better check
* trigger ci
* Empty commit to trigger CI
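For context, a minimal sketch of the NumPy change behind the `np.object -> object` fix: NumPy 1.24 removed the aliases deprecated in 1.20 (`np.object`, `np.long`, and friends), so ragged arrays must use the builtin `object` dtype:

```python
import numpy as np

ragged = [[1, 2, 3], [4, 5]]  # rows of unequal length
# np.array(ragged, dtype=np.object) fails on NumPy >= 1.24 (alias removed);
# the builtin `object` dtype is the supported spelling.
arr = np.array(ragged, dtype=object)
print(arr.dtype)  # object
```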
-
Xinyu Yang authored
* ensure banned_mask and indices are on the same device
* switch the order in which indices and banned_mask are created, and create banned_mask on the proper device
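A hedged sketch of the device fix (illustrative helper, not the transformers code): both the index tensor and the mask are created on the scores tensor's device, so the indexing never mixes CPU and GPU tensors:

```python
import torch

def build_banned_mask(scores: torch.Tensor, banned_token_ids: list) -> torch.Tensor:
    """Create indices and mask directly on scores.device."""
    indices = torch.tensor(banned_token_ids, device=scores.device)
    banned_mask = torch.zeros_like(scores, dtype=torch.bool)  # inherits device
    banned_mask[..., indices] = True
    return banned_mask

scores = torch.randn(2, 10)
mask = build_banned_mask(scores, [0, 3])
assert mask.device == scores.device
```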
-
Thomas Wang authored
* Support shared storage
* Really be sure we have the same storage
* Make style
* Refactor storage identifier mechanism; group everything into a single for loop
* Make style
* PR
* make style
* Update src/transformers/pytorch_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
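A hedged sketch of the shared-storage idea (illustrative; the real helper lives in src/transformers/pytorch_utils.py): two tensors that are views of the same memory can be recognized by their storage pointer, so a checkpoint writer can serialize that storage only once:

```python
import torch

def storage_id(tensor: torch.Tensor):
    # Views of the same memory report the same storage pointer.
    # (torch >= 2.0; older versions expose tensor.storage() instead.)
    return (tensor.device, tensor.untyped_storage().data_ptr())

weight = torch.randn(4, 4)
tied = weight.t()        # a view: same storage, different shape
clone = weight.clone()   # fresh storage

assert storage_id(weight) == storage_id(tied)
assert storage_id(weight) != storage_id(clone)
```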
-
Sylvain Gugger authored
-
Calico authored
-
Sylvain Gugger authored
-
Sourab Mangrulkar authored
* mixed precision support via accelerate
* fix issues
* fix for the sharded ddp case
* fix flax and tf failing tests
* refactor the place to create the `Accelerator` object
* move ddp prep to accelerate
* fix 😅
* resolving comments
* move fsdp handling to accelerate
* fixes
* fix saving
* shift torch dynamo handling to accelerate
* shift deepspeed integration and save & load utils to accelerate
* fix accelerate launcher support
* oops
* fix 🐛
* save ckpt fix
* Trigger CI
* nasty 🐛 😅
* as deepspeed needs grad_acc fixes, transfer grad_acc to accelerate
* make tests happy
* quality ✨
* loss tracked needs to account for grad_acc
* fixing the deepspeed tests
* quality ✨
* 😅 😅 😅
* tests 😡
* quality ✨
* Trigger CI
* resolve comments and fix the issue with the previous merge from branch
* Trigger CI
* accelerate took over deepspeed integration
Co-authored-by: Stas Bekman <stas@stason.org>
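A hedged sketch of what "mixed precision support via accelerate" means in practice (a minimal loop using the public accelerate API, not the Trainer internals):

```python
import torch
from accelerate import Accelerator

accelerator = Accelerator(mixed_precision="fp16")  # or "bf16" / "no"

model = torch.nn.Linear(8, 2)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
dataset = torch.utils.data.TensorDataset(
    torch.randn(32, 8), torch.randint(0, 2, (32,))
)
dataloader = torch.utils.data.DataLoader(dataset, batch_size=8)

# accelerate now owns device placement, autocast, and grad scaling.
model, optimizer, dataloader = accelerator.prepare(model, optimizer, dataloader)
for inputs, labels in dataloader:
    optimizer.zero_grad()
    loss = torch.nn.functional.cross_entropy(model(inputs), labels)
    accelerator.backward(loss)  # replaces loss.backward()
    optimizer.step()
```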
-
Denisa Roberts authored
* Add tf code for efficientformer
* Fix return dict bug - return last hidden state after last stage
* Fix corresponding return dict bug
* Override test tol
* Change default values of training to False
* Set training to default False X3
* Rm axis from ln
* Set init in dense projection
* Rm debug stuff
* Make style; all tests pass.
* Modify year to 2023
* Fix attention biases codes
* Update the shape list logic
* Add a batch norm eps config
* Remove extra comments in test files
* Add conditional attn and hidden states return for serving output
* Change channel dim checking logic
* Add exception for withteacher model in training mode
* Revert layer count for now
* Add layer count for conditional layer naming
* Transpose for conv happens only in main layer
* Make tests smaller
* Make style
* Update doc
* Rm from_pt
* Change to actual expected image class label
* Remove stray print in tests
* Update image processor test
* Remove the old serving output logic
* Make style
* Make style
* Complete test
-
Sylvain Gugger authored
-
Sam Passaglia authored
* add \n * removed copied from header
-
Sourab Mangrulkar authored
* mixed precision support via accelerate
* fix issues
* fix for the sharded ddp case
* fix flax and tf failing tests
* refactor the place to create the `Accelerator` object
* move ddp prep to accelerate
* fix 😅
* resolving comments
* move fsdp handling to accelerate
* fixes
* fix saving
* shift torch dynamo handling to accelerate
-
Sourab Mangrulkar authored
* mixed precision support via accelerate
* fix issues
* fix for the sharded ddp case
* fix flax and tf failing tests
* refactor the place to create the `Accelerator` object
* move ddp prep to accelerate
* fix 😅
* resolving comments
* move fsdp handling to accelerate
* fixes
* fix saving
-
Sohyun Sim authored
* docs: ko: pad_truncation.mdx
* feat: manual draft
* fix: resolve suggestions
Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>
-
Sourab Mangrulkar authored
* mixed precision support via accelerate
* fix issues
* fix for the sharded ddp case
* fix flax and tf failing tests
* refactor the place to create the `Accelerator` object
* move ddp prep to accelerate
* fix 😅
* resolving comments
-
Sourab Mangrulkar authored
* mixed precision support via accelerate
* fix issues
* fix for the sharded ddp case
* fix flax and tf failing tests
* refactor the place to create the `Accelerator` object
* address comments by removing debugging print statements
-
- 30 May, 2023 (14 commits)
-
Abhinav Patil authored
Adds support for AutoProcessor.from_pretrained to MCTCTProcessor models
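A hedged usage sketch (the checkpoint name is the canonical M-CTC-T model and is an assumption here): with this change the Auto API resolves to the MCTCT processor directly:

```python
from transformers import AutoProcessor

# AutoProcessor now maps M-CTC-T checkpoints to MCTCTProcessor.
processor = AutoProcessor.from_pretrained("speechbrain/m-ctc-t-large")
print(type(processor).__name__)  # MCTCTProcessor
```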
-
George authored
* Editing issue with pickle def with lambda function
* fix type
* Made helper function private
* delete tab
Co-authored-by: georgebredis <9454-georgebredis@users.noreply.gitlab.aicrowd.com>
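A hedged sketch of the underlying pickling issue (illustrative names): a lambda cannot be pickled, so the fix replaces it with a named, module-level (and here private) helper:

```python
import pickle

make_default = lambda: 0
try:
    pickle.dumps(make_default)  # lambdas are not picklable
except (pickle.PicklingError, AttributeError) as err:
    print("lambda failed:", err)

def _make_default():
    return 0

pickle.dumps(_make_default)  # a named top-level function pickles fine
```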
-
Arthur authored
* Better warning
* Update src/transformers/modeling_utils.py
* format line
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Vijeth Moudgalya authored
-
Arthur authored
* Update the processor when changing add_eos and add_bos * fixup * update * add a test * fix failing tests * fixup
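A hedged sketch of the intended behavior (the checkpoint name is an illustrative assumption): toggling `add_eos_token` on a Llama-style tokenizer should also rebuild the underlying processor, so new encodings actually end with EOS:

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("huggyllama/llama-7b")
tok.add_eos_token = True  # after the fix, this updates the processor too
ids = tok("hello").input_ids
assert ids[-1] == tok.eos_token_id
```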
-
Clémentine Fourrier authored
-
peridotml authored
* initial flyte callback
* lint
* logs should still be saved to Flyte even if pandas isn't installed (unlikely)
* cr - flyte team
* add docs for Flytecallback
* fix doc string - cr sgugger
* Apply suggestions from code review - cr sgugger, fix doc strings
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
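For orientation, a minimal sketch of the callback surface a Trainer integration like this plugs into (illustrative subclass, not the actual FlyteCallback):

```python
from transformers import TrainerCallback

class LoggingCallback(TrainerCallback):
    """Forward training logs somewhere external, as FlyteCallback does for Flyte."""

    def on_log(self, args, state, control, logs=None, **kwargs):
        # Only the main process should emit, mirroring common callback practice.
        if state.is_world_process_zero and logs:
            print(f"step {state.global_step}: {logs}")
```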
-
Hyeonseo Yun authored
* docs: ko: troubleshooting.mdx
* revised: fix _toctree.yml #23112
* feat: nmt draft `troubleshooting.mdx`
* fix: manual edits `troubleshooting.mdx`
* revised: resolve suggestions troubleshooting.mdx
Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
-
Kihoon Son authored
* task/video_classification translated
* Apply review suggestions to docs/source/ko/tasks/video_classification.mdx (several rounds, co-authored by Jungnerd and Sohyun Sim)
* Apply suggestions from code review
* Update video_classification.mdx
* Update _toctree.yml (x4)
Co-authored-by: Hyeonseo Yun <0525_hhgus@naver.com>
Co-authored-by: Gabriel Yang <gabrielwithhappy@gmail.com>
Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
Co-authored-by: Nayeon Han <nayeon2.han@gmail.com>
Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com>
Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com>
Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>
-
Kihoon Son authored
* docs: ko: fast_tokenizer.mdx content - translated
* Apply review suggestions to docs/source/ko/fast_tokenizers.mdx (several rounds, co-authored by Sohyun Sim and Hyeonseo Yun)
* Update fast_tokenizers.mdx (x4)
* Update _toctree.yml
Co-authored-by: Gabriel Yang <gabrielwithhappy@gmail.com>
Co-authored-by: Nayeon Han <nayeon2.han@gmail.com>
Co-authored-by: Hyeonseo Yun <0525_hhgus@naver.com>
Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com>
Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com>
Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>
-
Matthijs Hollemans authored
* move input features to GPU * skip these tests because undefined behavior * unskip tests
-
Matt authored
SAM shape flexibility fixes for compilation
-
Samin Yasar authored
* add type hint in pipeline model argument * add pretrainedmodel and tfpretrainedmodel type hint * make type hints string
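A hedged sketch of the "make type hints string" technique: quoting the annotations (with imports guarded by `TYPE_CHECKING`) keeps the pipeline signature valid even when the framework behind a hint is not installed:

```python
from typing import TYPE_CHECKING, Union

if TYPE_CHECKING:  # imported only for type checkers, never at runtime
    from transformers import PreTrainedModel, TFPreTrainedModel

def pipeline_sketch(model: Union[str, "PreTrainedModel", "TFPreTrainedModel"]):
    """String annotations defer evaluation, so torch/TF need not be importable."""
    ...
```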
-
Eli Simhayev authored
* ran `transformers-cli add-new-model-like`
* added `AutoformerLayernorm` and `AutoformerSeriesDecomposition`
* added `decomposition_layer` in `init` and `moving_avg` to config
* added `AutoformerAutoCorrelation` to encoder & decoder
* removed canonical self attention `AutoformerAttention`
* added arguments in config and model tester. Init works! 😁
* WIP autoformer attention with autocorrelation
* fixed `attn_weights` size
* wip time_delay_agg_training
* fixing sizes and debug time_delay_agg_training
* aggregation in training works! 😁
* `top_k_delays` -> `top_k_delays_index` and added `contiguous()`
* wip time_delay_agg_inference
* finish time_delay_agg_inference 😎
* added resize to autocorrelation
* bug fix: added the length of the output signal to `irfft`
* `attention_mask = None` in the decoder
* fixed test: changed attention expected size, `test_attention_outputs` works!
* removed unnecessary code
* apply AutoformerLayernorm in final norm in enc & dec
* added series decomposition to the encoder
* added series decomp to decoder, with inputs
* added trend todos
* added autoformer to README
* added to index
* added autoformer.mdx
* remove scaling and init attention_mask in the decoder
* make style
* fix copies
* make fix-copies
* initial fix-copies
* fix from https://github.com/huggingface/transformers/pull/22076
* make style
* fix class names
* added trend
* added d_model and projection layers
* added `trend_projection` source, and decomp layer init
* added trend & seasonal init for decoder input
* AutoformerModel cannot be copied as it has the decomp layer too
* encoder can be copied from time series transformer
* fixed generation and made distrib. output more robust
* use context window to calculate decomposition
* use the context_window for decomposition
* use output_params helper
* clean up AutoformerAttention
* subsequences_length off by 1
* make fix copies
* fix test
* added init for nn.Conv1d
* fix IGNORE_NON_TESTED
* added model_doc
* fix ruff
* ignore tests
* remove dup
* fix SPECIAL_CASES_TO_ALLOW
* do not copy due to conv1d weight init
* remove unused imports
* added short summary
* added label_length and made the model non-autoregressive
* added params docs
* better doc for `factor`
* fix tests
* renamed `moving_avg` to `moving_average`
* renamed `factor` to `autocorrelation_factor`
* make style
* apply review suggestions to configuration_autoformer.py (co-authored by NielsRogge)
* fix configurations
* fix integration tests
* fixing `lags_sequence` doc
* Revert "fixing `lags_sequence` doc" (reverts commit 21e34911e36a6f8f45f25cbf43584a49e5316c55)
* apply several rounds of review suggestions to configuration_autoformer.py and modeling_autoformer.py (co-authored by amyeroberts)
* model layers now take the config
* added `layer_norm_eps` to the config
* added `config.layer_norm_eps` to AutoformerLayernorm and to all layernorm layers
* fix variable names
* added initial pretrained model
* added use_cache docstring
* doc strings for trend and use_cache
* fix order of args
* imports on one line
* fixed get_lagged_subsequences docs
* add docstring for create_network_inputs
* get rid of layer_norm_eps config
* add back layernorm
* update fixture location
* fix signature
* use AutoformerModelOutput dataclass
* fix pretrain config
* no need as default exists
* subclass ModelOutput
* remove layer_norm_eps config
* fix test_model_outputs_equivalence test
* test hidden_states_output
* make fix-copies
* removed unused attr
* apply review suggestions to test_modeling_autoformer.py and modeling_autoformer.py (co-authored by amyeroberts)
* use AutoFormerDecoderOutput
* fix formatting
* fix formatting
Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
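Since this log references `AutoformerSeriesDecomposition` and the `moving_avg`/`moving_average` kernel, here is a hedged sketch of the decomposition idea from the Autoformer paper (illustrative, not the library implementation): a moving average extracts the trend, and the remainder is the seasonal component:

```python
import torch

def series_decomposition(x: torch.Tensor, kernel_size: int = 25):
    """Split a (batch, time, features) series into seasonal and trend parts."""
    # Pad the ends so the moving average preserves the sequence length.
    pad = (kernel_size - 1) // 2
    front = x[:, :1, :].repeat(1, pad, 1)
    back = x[:, -1:, :].repeat(1, kernel_size - 1 - pad, 1)
    padded = torch.cat([front, x, back], dim=1)
    trend = torch.nn.functional.avg_pool1d(
        padded.permute(0, 2, 1), kernel_size, stride=1
    ).permute(0, 2, 1)
    seasonal = x - trend
    return seasonal, trend

x = torch.randn(2, 100, 3)
seasonal, trend = series_decomposition(x)
assert seasonal.shape == trend.shape == x.shape
```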
-