".github/vscode:/vscode.git/clone" did not exist on "4ef85fee718969f1703d7dffa134deb72f4de828"
  1. 30 May, 2023 1 commit
  2. 26 May, 2023 7 commits
  3. 25 May, 2023 9 commits
  4. 24 May, 2023 17 commits
    • Remove the last few TF serving sigs (#23738) · e45e756d
      Matt authored
      Remove some more serving methods that (I think?) turned up while this PR was open
    • Enable prompts on the Hub (#23662) · 9850e6dd
      Sylvain Gugger authored
      * Enable prompts on the Hub

      * Update src/transformers/tools/prompts.py
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

      * Address review comments

      ---------
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
    • Fix sagemaker DP/MP (#23681) · 75bbf20b
      Zachary Mueller authored
      * Check for use_sagemaker_dp

      * Add a check for is_sagemaker_mp when setting _n_gpu again. Should be the last broken thing

      * Try explicit check?

      * Quality
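      The guard these bullets describe boils down to: don't derive the device count from torch.cuda when SageMaker MP/DP manages placement. A rough sketch of the logic (hypothetical helper, not the actual Trainer patch):

```python
import torch

from transformers.utils import is_sagemaker_dp_enabled, is_sagemaker_mp_enabled


def infer_n_gpu() -> int:
    """Hypothetical stand-in for the `_n_gpu` logic in TrainingArguments."""
    if is_sagemaker_mp_enabled():
        # Model parallelism: each process owns one partition of the model.
        return 1
    if is_sagemaker_dp_enabled():
        # SageMaker data parallelism runs one GPU per process, like DDP.
        return 1
    return torch.cuda.device_count()
```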
    • Fix the regex in `get_imports` to support multiline try blocks and excepts with specific exception types (#23725) · 89159651
      Daniel King authored
      * fix and test get_imports for multiline try blocks, and excepts with specific errors

      * fixup

      * add some more tests

      * add license
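      For context, `get_imports` strips try/except blocks before scanning a file for imports, so that guarded (optional) imports aren't counted as hard requirements. A minimal sketch of the fixed behaviour (not the exact transformers regex): re.DOTALL lets the pattern span multiline try bodies, and `except.*?:` accepts specific exception types such as `except ImportError:`.

```python
import re


def get_imports_sketch(content: str) -> list[str]:
    # Drop try/except blocks, including multiline bodies and excepts that
    # name specific exception types, so guarded imports stay optional.
    content = re.sub(r"\s*try\s*:.*?except.*?:", "", content, flags=re.DOTALL)
    # Collect top-level module names from "import x" and "from x import y".
    found = re.findall(r"^\s*import\s+(\S+)", content, flags=re.MULTILINE)
    found += re.findall(r"^\s*from\s+(\S+)\s+import", content, flags=re.MULTILINE)
    return sorted({name.split(".")[0] for name in found})
```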
    • Sanchit Gandhi authored · d8222be5
    • Overhaul TF serving signatures + dummy inputs (#23234) · 814de8fa
      Matt authored
      * Let's try autodetecting serving sigs
      
      * Don't clobber existing sigs
      
      * Change shapes for multiplechoice models
      
      * Make default dummy inputs smarter too
      
      * Fix missing f-string
      
      * Let's YOLO a serving output too
      
      * Read __class__.__name__ properly
      
      * Don't just pass naked lists in there and expect it to be okay
      
      * Code cleanup
      
      * Update default serving sig
      
      * Clearer error messages
      
      * Further updates to the default serving output
      
      * make fixup
      
      * Update the serving output a bit more
      
      * Cleanups and renames, raise errors appropriately when we can't infer inputs
      
      * More renames
      
      * we're building in a functional context again, yolo
      
      * import DUMMY_INPUTS from the right place

      * Support cross-attention in the dummies
      
      * Complete removal of dummy/serving overrides in BERT
      
      * Complete removal of dummy/serving overrides in RoBERTa
      
      * Obliterate lots and lots of serving sig and dummy overrides
      
      * merge type hint changes
      
      * Fix for token_type_ids with vocab_size 1
      
      * Add missing property decorator
      
      * Fix T5 and hopefully some models that take conv inputs
      
      * More signature pruning
      
      * Fix T5's signature
      
      * Fix Wav2Vec2 signature
      
      * Fix LongformerForMultipleChoice input signature
      
      * Fix BLIP and LED
      
      * Better default serving output error handling
      
      * Fix BART dummies
      
      * Fix dummies for cross-attention, esp encoder-decoder models
      
      * Fix visionencoderdecoder signature
      
      * Fix BLIP serving output
      
      * Small tweak to BART dummies
      
      * Cleanup the ugly parameter inspection line that I used in a few places
      
      * committed a breakpoint again
      
      * Move the text_dims check
      
      * Remove blip_text serving_output
      
      * Add decoder_input_ids to the default input sig
      
      * Remove all the manual overrides for encoder-decoder model signatures
      
      * Tweak longformer/led input sigs
      
      * Tweak default serving output
      
      * output.keys() -> output
      
      * make fixup
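      For readers unfamiliar with the term: a TF "serving signature" is a tf.function with a fixed input_signature of tf.TensorSpec entries, used by SavedModel exports. This PR infers those specs per model instead of hand-writing them. A minimal generic illustration (toy model, not the transformers implementation):

```python
import tensorflow as tf

# Toy stand-in; in transformers this would be a TFPreTrainedModel.
model = tf.keras.Sequential([tf.keras.layers.Embedding(1000, 8)])


@tf.function(
    input_signature=[tf.TensorSpec((None, None), tf.int32, name="input_ids")]
)
def serving(input_ids):
    # Fixing dtypes and shapes here means the SavedModel export doesn't
    # need concrete example inputs at save time.
    return model(input_ids)
```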
    • fix: Whisper generate, move text_prompt_ids trim up for max_new_tokens calculation (#23724) · 3d7baef1
      Connor Henderson authored
      move text_prompt_ids trimming to top
    • fix: delete duplicate sentences in `document_question_answering.mdx` (#23735) · 50a56bed
      Jungnerd authored
      fix: delete duplicate sentence
    • TF SAM memory reduction (#23732) · d2d88226
      Matt authored
      * Extremely small change to TF SAM dummies to reduce memory usage on build

      * remove debug breakpoint

      * Debug print statement to track array sizes

      * More debug shape printing

      * Now remove the debug shape printing

      * make fixup
    • Minor awesome-transformers.md fixes (#23453) · 28aa438c
      pagarsky authored
      Minor docs fixes
    • Better TF docstring types (#23477) · f8b25744
      Matt authored
      * Rework TF type hints to use | None instead of Optional[] for tf.Tensor
      * Don't forget the imports
      
      * Add the imports to tests too
      
      * make fixup
      
      * Refactor tests that depended on get_type_hints
      
      * Better test refactor
      
      * Fix an old hidden bug in the test_keras_fit input creation code
      
      * Fix for the Deit tests
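      The gist of the change, as a before/after on a hypothetical signature (`from __future__ import annotations` makes the `X | None` syntax safe on the Python versions transformers supported at the time):

```python
from __future__ import annotations

from typing import Optional

import tensorflow as tf


# Before:
def call_old(attention_mask: Optional[tf.Tensor] = None) -> Optional[tf.Tensor]:
    return attention_mask


# After:
def call_new(attention_mask: tf.Tensor | None = None) -> tf.Tensor | None:
    return attention_mask
```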
    • fix gptj could not jit.trace in GPU (#23317) · 767e6b53
      Wang, Yi authored
      Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
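      The message carries no detail, but for reference, tracing a transformers model on GPU generally follows this shape (a generic sketch using gpt2 as the example, not the GPT-J patch itself):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
# torchscript=True makes the model return tuples instead of ModelOutput
# objects, which torch.jit.trace requires.
model = AutoModelForCausalLM.from_pretrained("gpt2", torchscript=True)
model = model.to("cuda").eval()

inputs = tokenizer("Hello world", return_tensors="pt").to("cuda")
traced = torch.jit.trace(model, (inputs["input_ids"],))
```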
    • fix: use bool instead of uint8/byte in Deberta/DebertaV2/SEW-D to make it compatible with TensorRT (#23683) · b4698b7e
      uchuhimo authored
      * Use bool instead of uint8/byte in DebertaV2 to make it compatible with TensorRT

      TensorRT cannot accept an ONNX graph with uint8/byte intermediate tensors. This PR uses bool tensors instead of uint8/byte tensors so that the exported ONNX file works with TensorRT.

      * fix: use bool instead of uint8/byte in Deberta and SEW-D

      ---------
      Co-authored-by: Yuxian Qiu <yuxianq@nvidia.com>
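      The pattern behind the fix, in isolation (hypothetical mask helper, not the actual Deberta code):

```python
import torch


def build_attention_mask(input_ids: torch.Tensor, pad_token_id: int) -> torch.Tensor:
    # Before: (input_ids != pad_token_id).byte() put a uint8 intermediate
    # tensor into the exported ONNX graph, which TensorRT rejects.
    # After: the comparison result stays torch.bool end to end.
    return input_ids != pad_token_id
```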
    • Export to ONNX doc refocused on using optimum, added tflite (#23434) · 2eaaf17a
      Maria Khalusova authored
      * doc refocused on using optimum, tflite

      * minor updates to fix checks

      * Apply suggestions from code review
      Co-authored-by: regisss <15324346+regisss@users.noreply.github.com>

      * TFLite to separate page, added links

      * Removed the onnx list builder

      * make style

      * Update docs/source/en/serialization.mdx
      Co-authored-by: regisss <15324346+regisss@users.noreply.github.com>

      ---------
      Co-authored-by: regisss <15324346+regisss@users.noreply.github.com>
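      The direction of the rewritten doc is "use optimum for ONNX export". One way that looks from Python (hedged sketch: `export=True` is the optimum API as of this era, and the checkpoint name is just an example):

```python
from optimum.onnxruntime import ORTModelForSequenceClassification

# export=True converts the PyTorch checkpoint to ONNX on the fly.
ort_model = ORTModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased-finetuned-sst-2-english",
    export=True,
)
ort_model.save_pretrained("onnx_model/")
```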
    • Paged Optimizer + Lion Optimizer for Trainer (#23217) · 796162c5
      Tim Dettmers authored
      * Added lion and paged optimizers and made original tests pass.

      * Added tests for paged and lion optimizers.

      * Added and fixed optimizer tests.

      * Style and quality checks.

      ---------
      Co-authored-by: younesbelkada <younesbelkada@gmail.com>
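      These optimizers are selected through `TrainingArguments.optim`. A hedged sketch of usage (the exact string values are an assumption; check `transformers.training_args.OptimizerNames` for the canonical list):

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="out",
    # Assumed name for one of the bitsandbytes-backed optimizers added
    # here; paged variants page optimizer state out of GPU memory to
    # survive memory spikes.
    optim="paged_adamw_8bit",  # e.g. "paged_lion_8bit" works the same way
)
```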
    • 4-bit QLoRA via bitsandbytes (4-bit base model + LoRA) (#23479) · 9d73b922
      Tim Dettmers authored
      * Added lion and paged optimizers and made original tests pass.

      * Added tests for paged and lion optimizers.

      * Added and fixed optimizer tests.

      * Style and quality checks.

      * Initial draft. Some tests fail.

      * Fixed dtype bug.

      * Fixed bug caused by torch_dtype='auto'.

      * All test green for 8-bit and 4-bit layers.

      * Added fix for fp32 layer norms and bf16 compute in LLaMA.

      * Fixing issues for PR #23479.

      * Reverted variable name change.

      * Added missing tests.

      * Fixup changes.

      * Added fixup changes.

      * Missed some variables to rename.

      * revert trainer tests

      * revert test trainer

      * another revert

      * fix tests and safety checkers

      * protect import

      * simplify a bit

      * Update src/transformers/trainer.py

      * few fixes

      * add warning

      * replace with `load_in_kbit = load_in_4bit or load_in_8bit`

      * fix test

      * fix tests

      * this time fix tests

      * safety checker

      * add docs

      * revert torch_dtype

      * Apply suggestions from code review
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

      * multiple fixes

      * update docs

      * version checks and multiple fixes

      * replace `is_loaded_in_kbit`

      * replace `load_in_kbit`

      * change methods names

      * better checks

      * oops

      * oops

      * address final comments

      ---------
      Co-authored-by: younesbelkada <younesbelkada@gmail.com>
      Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
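      The user-facing API from this PR is 4-bit loading through bitsandbytes. A minimal sketch (the checkpoint name is an example; the BitsAndBytesConfig fields shown are the commonly documented 4-bit options):

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",              # NF4 quantization for base weights
    bnb_4bit_compute_dtype=torch.bfloat16,  # bf16 compute, per the LLaMA fix above
)
model = AutoModelForCausalLM.from_pretrained(
    "huggyllama/llama-7b",
    quantization_config=bnb_config,
    device_map="auto",
)
```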
    • Wang, Yi
  5. 23 May, 2023 6 commits