- 30 Jan, 2024 11 commits
-
-
Matt authored
* Pin torch to <2.2.0 * Pin torchvision and torchaudio as well * Playing around with versions to see if this helps * twiddle something to restart the CI * twiddle it back * Try changing the natten version * make fixup * Revert "Try changing the natten version" This reverts commit de0d6592c35dc39ae8b5a616c27285db28262d06. * make fixup * fix fix fix * fix fix fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
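For context, pins like these land in the dependency table rather than in model code; a minimal sketch of how such an upper bound is typically expressed in a setup.py-style list (the exact version strings here are illustrative, not the actual diff):

```python
# Illustrative only: pinning torch below 2.2.0 and keeping torchvision /
# torchaudio in lockstep, in a setup.py-style dependency list.
_deps = [
    "torch<2.2.0",         # upper bound until the CI breakage is resolved
    "torchvision<0.17.0",  # torchvision releases track specific torch versions
    "torchaudio<2.2.0",    # same for torchaudio
]
```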
-
Matt authored
* Port core files + ESM (because ESM code is odd) * Search-replace in modelling code * Fix up transfo_xl as well * Fix other core files + tests (still need to add correct import to tests) * Fix cookiecutter * make fixup, fix imports in some more core files * Auto-add imports to tests * Cleanup, add imports to sagemaker tests * Use correct exception for importing tf_keras * Fixes in modeling_tf_utils * make fixup * Correct version parsing code * Ensure the pipeline tests correctly revert to float32 after each test * Ensure the pipeline tests correctly revert to float32 after each test * More tf.keras -> keras * Add dtype cast * Better imports of tf_keras * Add a cast for tf.assign, just in case * Fix callback imports
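The core of this port is choosing between the Keras 3 package and the backwards-compatible tf_keras package at import time; a minimal sketch of that pattern (error message wording is illustrative):

```python
# Prefer the tf_keras compatibility package; fall back to keras only if it is
# still a 2.x release, since Keras 3 breaks the TF modeling code.
try:
    import tf_keras as keras
except (ModuleNotFoundError, ImportError):
    import keras

    if int(keras.__version__.split(".")[0]) > 2:
        raise ValueError(
            "Keras 3 is installed, but this code needs the backwards-compatible "
            "tf-keras package. Install it with `pip install tf-keras`."
        )
```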
-
amyeroberts authored
* Abstract out pipeline init args * Address PR comments * Reword * BC PIPELINE_INIT_ARGS * Remove old arguments * Small fix
-
amyeroberts authored
* Enable instantiating model with pretrained backbone weights * Remove doc updates until changes made in modeling code * Use load_backbone instead * Add use_timm_backbone to the model configs * Add missing imports and arguments * Update docstrings * Make sure test is properly configured * Include recent DPT updates
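A hypothetical usage sketch of the options this commit touches (argument names follow the commit message; the backbone choice is illustrative):

```python
# Instantiate a detection model whose backbone is loaded with pretrained
# weights, sourced from timm rather than a transformers implementation.
from transformers import DetrConfig, DetrForObjectDetection

config = DetrConfig(
    backbone="resnet50",           # which backbone architecture to use
    use_pretrained_backbone=True,  # load pretrained backbone weights
    use_timm_backbone=True,        # resolve the backbone through timm
)
model = DetrForObjectDetection(config)
```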
-
Yih-Dar authored
fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
fxmarty authored
guard sdpa on torch>=2.0
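A minimal sketch of such a guard (the helper name is illustrative): `torch.nn.functional.scaled_dot_product_attention` only exists from torch 2.0, so any SDPA code path has to check the installed version first.

```python
from packaging import version

import torch


def is_sdpa_available() -> bool:
    # scaled_dot_product_attention was introduced in torch 2.0
    return version.parse(torch.__version__) >= version.parse("2.0.0")
```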
-
Thien Tran authored
* use conv for tdnn * run make fixup * update TDNN * add PEFT LoRA check * propagate tdnn warnings to others * add missing imports * update TDNN in wav2vec2_bert * add missing imports
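The underlying idea is that a TDNN layer over fixed time offsets is mathematically a 1D convolution, so an unfold-plus-linear formulation can be replaced by `nn.Conv1d`; a runnable sketch with illustrative shapes:

```python
import torch
import torch.nn as nn

# (batch, time, features) activations, as in a speaker-verification head
hidden_states = torch.randn(2, 100, 512)

# A TDNN layer with context width 5 expressed as a plain 1D convolution
tdnn = nn.Conv1d(in_channels=512, out_channels=512, kernel_size=5, dilation=1)

# Conv1d expects (batch, channels, time), so transpose in and out
out = tdnn(hidden_states.transpose(1, 2)).transpose(1, 2)
print(out.shape)  # torch.Size([2, 96, 512])
```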
-
Younes Belkada authored
Update _toctree.yml
-
Poedator authored
* squashed earlier commits for easier rebase * rm rebase leftovers * 4bit save enabled @quantizers * TMP gptq test use exllama * fix AwqConfigTest::test_wrong_backend for A100 * quantizers AWQ fixes * _load_pretrained_model low_cpu_mem_usage branch * quantizers style * remove require_low_cpu_mem_usage attr * rm dtype arg from process_model_before_weight_loading * rm config_origin from Q-config * rm inspect from q_config * fixed docstrings in QuantizationConfigParser * logger.warning fix * mv is_loaded_in_4(8)bit to BnbHFQuantizer * is_accelerate_available error msg fix in quantizer * split is_model_trainable in bnb quantizer class * rm llm_int8_skip_modules as separate var in Q * Q rm todo * fwd ref to HFQuantizer in type hint * rm note re optimum.gptq.GPTQQuantizer * quantization_config in __init__ simplified * replaced NonImplemented with create_quantized_param * rm load_in_4/8_bit deprecation warning * QuantizationConfigParser refactoring * awq-related minor changes * awq-related changes * awq config.modules_to_not_convert * raise error if no q-method in q-config in args * minor cleanup * awq quantizer docstring * combine common parts in bnb process_model_before_weight_loading * revert test_gptq * .process_model_ cleanup * restore dict config warning * removed typevars in quantizers.py * cleanup post-rebase 16 jan * QuantizationConfigParser classmethod refactor * rework of handling of unexpected aux elements of bnb weights * moved q-related stuff from save_pretrained to quantizers * refactor v1 * more changes * fix some tests * remove it from main init * ooops * Apply suggestions from code review Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * fix awq issues * fix * fix * fix * fix * fix * fix * add docs * Apply suggestions from code review Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update docs/source/en/hf_quantizer.md * address comments * fix * fixup * Update src/transformers/modeling_utils.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/modeling_utils.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * address final comment * update * Update src/transformers/quantizers/base.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/quantizers/auto.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * fix * add kwargs update * fixup * add `optimum_quantizer` attribute * oops * rm unneeded file * fix doctests --------- Co-authored-by: younesbelkada <younesbelkada@gmail.com> Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
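Despite the internal move to `HfQuantizer` classes, the user-facing entry point is unchanged; a usage sketch (requires bitsandbytes and a CUDA device at runtime):

```python
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# The quantization_config is now dispatched to a quantizer class internally
quantization_config = BitsAndBytesConfig(load_in_4bit=True)
model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-350m",
    quantization_config=quantization_config,
)
```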
-
Zhan Ling authored
Add _no_split_modules to CLIPModel
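For context, `_no_split_modules` tells accelerate's `device_map="auto"` dispatch which submodule classes must stay on a single device; with it defined, something like the following sketch becomes possible (checkpoint is illustrative):

```python
from transformers import CLIPModel

# Works once CLIPModel declares _no_split_modules, so accelerate knows it may
# shard the model across devices without cutting through the listed blocks.
model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32", device_map="auto")
```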
-
Omar Sanseviero authored
* Update quantization_config.py * Style * Protect from setting directly * add tests * Update tests/quantization/bnb/test_4bit.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> --------- Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
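A minimal sketch of the "protect from setting directly" pattern, assuming a property-based guard (class and attribute names are illustrative, not the actual diff):

```python
class QuantizationConfig:
    def __init__(self, load_in_4bit: bool = False):
        self._load_in_4bit = load_in_4bit

    @property
    def load_in_4bit(self) -> bool:
        return self._load_in_4bit

    @load_in_4bit.setter
    def load_in_4bit(self, value: bool):
        # Validate instead of silently accepting any direct assignment
        if not isinstance(value, bool):
            raise TypeError("load_in_4bit must be a boolean")
        self._load_in_4bit = value
```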
-
- 29 Jan, 2024 13 commits
-
-
ThibaultLengagne authored
* doc: french README Signed-off-by: ThibaultLengagne <thibaultl@padok.fr> * doc: Add Depth Anything Signed-off-by: ThibaultLengagne <thibaultl@padok.fr> * doc: Add french link in other docs Signed-off-by: ThibaultLengagne <thibaultl@padok.fr> * doc: Add missing links in fr docs * doc: fix several mistakes in translation Signed-off-by: ThibaultLengagne <thibaultl@padok.fr> --------- Signed-off-by: ThibaultLengagne <thibaultl@padok.fr> Co-authored-by: Sarapuce <alexandreh@padok.fr>
-
Ajay Patel authored
* Update trainer.py * Revert "Update trainer.py" This reverts commit 0557e2cc9effa3a41304322032239a3874b948a7. * Make trainer.py use adapter_only=True when using FSDP + PEFT * Support load_best_model with adapter_only=True * Ruff format * Inspect function args for save_ load_ fsdp utility functions and only pass adapter_only=True if they support it
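The compatibility trick in the last bullet can be sketched as follows (the wrapper name is hypothetical): only forward `adapter_only=True` to accelerate's FSDP save/load helpers when the installed version actually accepts it.

```python
import inspect


def call_with_optional_adapter_only(fn, *args, **kwargs):
    # Pass adapter_only=True only if the helper's signature supports it
    if "adapter_only" in inspect.signature(fn).parameters:
        kwargs["adapter_only"] = True
    return fn(*args, **kwargs)
```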
-
Sanchit Gandhi authored
* [Whisper] Make tokenizer normalization public * add to docs
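A usage sketch of the now-public API (method names follow the PR title; treat the exact behaviour as illustrative):

```python
from transformers import WhisperTokenizer

tokenizer = WhisperTokenizer.from_pretrained("openai/whisper-tiny")
# English normalizer: lowercasing, punctuation and number handling, etc.
normalized = tokenizer.normalize("Mr. Smith bought 2 apples.")
# Basic multilingual normalizer
basic = tokenizer.basic_normalize("Näive café text")
```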
-
xkszltl authored
-
amyeroberts authored
* Mark test_constrained_beam_search_generate as flaky * Update tests/generation/test_utils.py
-
amyeroberts authored
* Pin pytest version <8.0.0 * Update setup.py * make deps_table_update
-
Julien Chaumond authored
-
Nate Cibik authored
* Enabled gradient checkpointing in Deformable DETR * Enabled gradient checkpointing in Deformable DETR encoder * Removed # Copied from headers in modeling_deta.py to break dependence on Deformable DETR code
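With checkpointing wired in, the standard toggle applies; a usage sketch (checkpoint name is illustrative):

```python
from transformers import DeformableDetrForObjectDetection

model = DeformableDetrForObjectDetection.from_pretrained("SenseTime/deformable-detr")
# Recompute encoder activations in the backward pass to save memory
model.gradient_checkpointing_enable()
```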
-
Wesley Gifford authored
* 🐛 fix .max bug * remove prediction_length from regression output dimensions * fix parameter names, fix output names, update tests * ensure shape for PatchTST * ensure output shape for PatchTSMixer * update model, batch, and expected for regression distribution test * update test expected Signed-off-by: Wesley M. Gifford <wmgifford@us.ibm.com> * Update tests/models/patchtst/test_modeling_patchtst.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/patchtst/test_modeling_patchtst.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/patchtst/test_modeling_patchtst.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/patchtsmixer/modeling_patchtsmixer.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/patchtsmixer/test_modeling_patchtsmixer.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/patchtsmixer/test_modeling_patchtsmixer.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * standardize on patch_length Signed-off-by: Wesley M. Gifford <wmgifford@us.ibm.com> * Update tests/models/patchtsmixer/test_modeling_patchtsmixer.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/patchtsmixer/test_modeling_patchtsmixer.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Make arguments more explicit Signed-off-by: Wesley M. Gifford <wmgifford@us.ibm.com> * adjust prepared inputs Signed-off-by: Wesley M. Gifford <wmgifford@us.ibm.com> --------- Signed-off-by: Wesley M. Gifford <wmgifford@us.ibm.com> Co-authored-by: Wesley M. Gifford <wmgifford@us.ibm.com> Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Vinyzu authored
* [Docs] Fix Typo in English CLIP model_doc * [Docs] Fix Typo in Japanese CLIP model_doc
-
Klaus Hipp authored
-
Yih-Dar authored
* fix * fix * Fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Angela Yi authored
* Add serialized type name to pytrees * Modify context * add serde test
-
- 28 Jan, 2024 1 commit
-
-
amyeroberts authored
[Siglip] protect from imports if sentencepiece not installed
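A minimal sketch of the guarded-import pattern used for optional dependencies like sentencepiece:

```python
from transformers.utils import is_sentencepiece_available

if is_sentencepiece_available():
    from transformers import SiglipTokenizer
else:
    # Importing transformers no longer fails; the tokenizer simply isn't exposed
    SiglipTokenizer = None
```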
-
- 27 Jan, 2024 2 commits
-
-
Joao Gante authored
-
Joao Gante authored
-
- 26 Jan, 2024 12 commits
-
-
Sanchit Gandhi authored
-
Steven Liu authored
* change datasets * fix
-
Yih-Dar authored
* try pydantic v2 * try pydantic v2 --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Scruel Tao authored
* fix: suppress `GatedRepoError` to use cache file (fix #28558). * move condition_to_return parameter back to outside.
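A sketch of the fallback behaviour this fix restores, written as a hypothetical wrapper (not the actual patched function): when the Hub refuses access to a gated repo, fall back to a previously downloaded cache entry instead of raising.

```python
from huggingface_hub import hf_hub_download
from huggingface_hub.utils import GatedRepoError


def resolve_with_cache_fallback(repo_id: str, filename: str) -> str:
    try:
        return hf_hub_download(repo_id=repo_id, filename=filename)
    except GatedRepoError:
        # Re-resolve against the local cache only; still raises if nothing is cached
        return hf_hub_download(repo_id=repo_id, filename=filename, local_files_only=True)
```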
-
Matt authored
* Stop confusing the TF compiler with ModelOutput objects * Stop confusing the TF compiler with ModelOutput objects
-
Yih-Dar authored
fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Shukant Pal authored
Initialize _tqdm_active with hf_hub_utils.are_progress_bars_disabled() to respect HF_HUB_DISABLE_PROGRESS_BARS. enable_progress_bar() and disable_progress_bar() sync up with huggingface_hub, but the initial value was always True. This change makes sure the user's preference is respected implicitly on initialization.
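A sketch of the initialization change (a module-level variable, as described above):

```python
import huggingface_hub.utils as hf_hub_utils

# Previously: _tqdm_active = True, which ignored HF_HUB_DISABLE_PROGRESS_BARS.
# Now the initial state mirrors the user's huggingface_hub preference.
_tqdm_active = not hf_hub_utils.are_progress_bars_disabled()
```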
-
D authored
* Update preprocessing.md adjust ImageProcessor link to working target (same as in lower section of file) * Update preprocessing.md
-
Turetskii Mikhail authored
-
Facico authored
* support PeftMixedModel signature inspect * import PeftMixedModel only peft>=0.7.0 * Update src/transformers/trainer.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Update src/transformers/trainer.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Update src/transformers/trainer.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Update src/transformers/trainer.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Update src/transformers/trainer.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Update src/transformers/trainer.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * fix styling * Update src/transformers/trainer.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Update src/transformers/trainer.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * style fixup * fix note --------- Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
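A minimal sketch of the version-gated import in the second bullet: `PeftMixedModel` only exists from peft 0.7.0, so the import must be conditional.

```python
from packaging import version

import peft

if version.parse(peft.__version__) >= version.parse("0.7.0"):
    from peft import PeftMixedModel  # safe: the class exists in this version
```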
-
fxmarty authored
* fix duplicate & unnecessary flash warnings * trigger ci * warning_once * if/else order --------- Co-authored-by: Your Name <you@example.com>
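The deduplication relies on the `warning_once` helper on transformers' logger, which emits a given message only the first time it is seen; a usage sketch:

```python
from transformers.utils import logging

logger = logging.get_logger(__name__)

# Emitted once, no matter how many layers hit this code path
logger.warning_once("Falling back from Flash Attention; see docs for details.")
```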
-
Yih-Dar authored
* fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
- 25 Jan, 2024 1 commit
-
-
Peter Götz authored
The documentation says "We refer to this Model parallelism as “Vertical” because of how models are typically visualized.", but the accompanying diagram laid the model out horizontally. This change updates the diagram so the model is indeed visualized vertically.
-