1. 31 May, 2023 12 commits
    • Sylvain Gugger
    • Calico
    • Skip failing test for now · 00f6ba0e
      Sylvain Gugger authored
    • accelerate deepspeed and gradient accumulation integrate (#23236) · a73b1d59
      Sourab Mangrulkar authored
      * mixed precision support via accelerate
      * fix issues
      * fix for the sharded ddp case
      * fix flax and tf failing tests
      * refactor the place to create `Accelerator` object
      * move ddp prep to accelerate
      * fix 😅
      * resolving comments
      * move fsdp handling to accelerate
      * fixes
      * fix saving
      * shift torch dynamo handling to accelerate
      * shift deepspeed integration and save & load utils to accelerate
      * fix accelerate launcher support
      * oops
      * fix 🐛
      * save ckpt fix
      * Trigger CI
      * nasty 🐛 😅
      * as deepspeed needs grad_acc fixes, transfer grad_acc to accelerate
      * make tests happy
      * quality
      * loss tracked needs to account for grad_acc
      * fixing the deepspeed tests
      * quality
      * 😅😅😅
      * tests 😡
      * quality
      * Trigger CI
      * resolve comments and fix the issue with the previous merge from branch
      * Trigger CI
      * accelerate took over deepspeed integration
      ---------
      Co-authored-by: Stas Bekman <stas@stason.org>
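      With this PR the Trainer stops counting accumulation steps itself and lets Accelerate drive gradient accumulation (and, with it, the DeepSpeed engine). A minimal sketch of the Accelerate pattern being adopted; `model`, `optimizer`, and `dataloader` are placeholders:

          from accelerate import Accelerator

          # Accelerate owns the accumulation schedule; DeepSpeed settings are
          # picked up from `accelerate launch` / `accelerate config` when enabled.
          accelerator = Accelerator(gradient_accumulation_steps=4)
          model, optimizer, dataloader = accelerator.prepare(model, optimizer, dataloader)

          for batch in dataloader:
              # no_sync, loss scaling, and the real optimizer-step cadence are
              # all handled inside this context manager
              with accelerator.accumulate(model):
                  loss = model(**batch).loss
                  accelerator.backward(loss)
                  optimizer.step()
                  optimizer.zero_grad()

      This also explains the "loss tracked needs to account for grad_acc" fix above: once Accelerate owns the schedule, the Trainer's reported loss has to be normalized over the accumulated micro-batches.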
    • Add TensorFlow implementation of EfficientFormer (#22620) · 88f50a1e
      Denisa Roberts authored
      * Add tf code for efficientformer
      * Fix return dict bug - return last hidden state after last stage
      * Fix corresponding return dict bug
      * Override test tolerance
      * Change default values of training to False
      * Set training to default False x3
      * Rm axis from ln
      * Set init in dense projection
      * Rm debug stuff
      * Make style; all tests pass.
      * Modify year to 2023
      * Fix attention biases code
      * Update the shape list logic
      * Add a batch norm eps config
      * Remove extra comments in test files
      * Add conditional attn and hidden states return for serving output
      * Change channel dim checking logic
      * Add exception for with-teacher model in training mode
      * Revert layer count for now
      * Add layer count for conditional layer naming
      * Transpose for conv happens only in main layer
      * Make tests smaller
      * Make style
      * Update doc
      * Rm from_pt
      * Change to actual expected image class label
      * Remove stray print in tests
      * Update image processor test
      * Remove the old serving output logic
      * Make style
      * Make style
      * Complete test
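      A hedged usage sketch of the new TF classes; the checkpoint name is an assumption (any EfficientFormer checkpoint with TF weights works the same way):

          import tensorflow as tf
          from PIL import Image
          from transformers import AutoImageProcessor, TFEfficientFormerForImageClassification

          ckpt = "snap-research/efficientformer-l1-300"  # assumed checkpoint name
          processor = AutoImageProcessor.from_pretrained(ckpt)
          model = TFEfficientFormerForImageClassification.from_pretrained(ckpt)

          image = Image.open("cat.png")  # placeholder image path
          inputs = processor(images=image, return_tensors="tf")
          logits = model(**inputs, training=False).logits  # training defaults to False, per the PR
          print(model.config.id2label[int(tf.argmax(logits, axis=-1)[0])])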
    • Sylvain Gugger · 9fea71b4
    • Fix bug leading to missing token in GPTSanJapaneseTokenizer (#23883) · 38dbbc26
      Sam Passaglia authored
      * add \n
      * removed copied from header
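      The missing token was the newline. A quick check of the fix, assuming the reference GPTSAN checkpoint:

          from transformers import GPTSanJapaneseTokenizer

          tokenizer = GPTSanJapaneseTokenizer.from_pretrained("Tanrei/GPTSAN-japanese")
          ids = tokenizer("こんにちは\nさようなら").input_ids
          # the round-trip should now keep the "\n" instead of silently dropping it
          assert "\n" in tokenizer.decode(ids)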
    • shift torch dynamo handling to accelerate (#23168) · 03db5910
      Sourab Mangrulkar authored
      * mixed precision support via accelerate
      * fix issues
      * fix for the sharded ddp case
      * fix flax and tf failing tests
      * refactor the place to create `Accelerator` object
      * move ddp prep to accelerate
      * fix 😅
      * resolving comments
      * move fsdp handling to accelerate
      * fixes
      * fix saving
      * shift torch dynamo handling to accelerate
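      This PR moves torch.compile / TorchDynamo setup out of the Trainer and behind Accelerate. A minimal sketch of the Accelerate side; the backend choice is illustrative and `model` is a placeholder nn.Module:

          from accelerate import Accelerator

          # with a dynamo backend set, Accelerate compiles the model inside prepare()
          accelerator = Accelerator(dynamo_backend="inductor")
          model = accelerator.prepare(model)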
    • move fsdp handling to accelerate (#23158) · 0b774074
      Sourab Mangrulkar authored
      * mixed precision support via accelerate
      * fix issues
      * fix for the sharded ddp case
      * fix flax and tf failing tests
      * refactor the place to create `Accelerator` object
      * move ddp prep to accelerate
      * fix 😅
      * resolving comments
      * move fsdp handling to accelerate
      * fixes
      * fix saving
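      FSDP wrapping likewise moves behind Accelerate's plugin interface. A hedged sketch; the plugin is left at its defaults rather than the Trainer's exact configuration (normally it is filled in from `accelerate config`):

          from accelerate import Accelerator, FullyShardedDataParallelPlugin

          fsdp_plugin = FullyShardedDataParallelPlugin()  # default sharding settings
          accelerator = Accelerator(fsdp_plugin=fsdp_plugin)
          model, optimizer = accelerator.prepare(model, optimizer)  # placeholders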
    • 🌐 [i18n-KO] Translated `pad_truncation.mdx` to Korean (#23823) · 015829e6
      Sohyun Sim authored
      * docs: ko: pad_truncation.mdx
      * feat: manual draft
      * fix: resolve suggestions
      ---------
      Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>
    • Smangrul/accelerate ddp integrate (#23151) · 1cf148a6
      Sourab Mangrulkar authored
      * mixed precision support via accelerate
      * fix issues
      * fix for the sharded ddp case
      * fix flax and tf failing tests
      * refactor the place to create `Accelerator` object
      * move ddp prep to accelerate
      * fix 😅
      * resolving comments
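      "DDP prep" here means device placement and DistributedDataParallel wrapping, which prepare() now does for the Trainer. A minimal sketch, assuming the script is started with `accelerate launch` and that `model`, `optimizer`, and `dataloader` already exist:

          from accelerate import Accelerator

          accelerator = Accelerator()
          # prepare() moves the model to the local device and wraps it in DDP
          # whenever more than one process was launched
          model, optimizer, dataloader = accelerator.prepare(model, optimizer, dataloader)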
    • Smangrul/accelerate mp integrate (#23148) · 9f0646a5
      Sourab Mangrulkar authored
      * mixed precision support via accelerate
      * fix issues
      * fix for the sharded ddp case
      * fix flax and tf failing tests
      * refactor the place to create `Accelerator` object
      * address comments by removing debugging print statements
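      This first PR of the stack routes mixed precision through Accelerate instead of the Trainer's own amp/apex branches. A hedged sketch of the underlying pattern; `model`, `optimizer`, and `batch` are placeholders:

          from accelerate import Accelerator

          # "fp16" or "bf16"; Accelerate inserts the autocast context and, for
          # fp16, the gradient scaler around backward()
          accelerator = Accelerator(mixed_precision="fp16")
          model, optimizer = accelerator.prepare(model, optimizer)
          loss = model(**batch).loss
          accelerator.backward(loss)  # scaling/unscaling handled here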
  2. 30 May, 2023 14 commits
  3. 26 May, 2023 7 commits
  4. 25 May, 2023 7 commits