- 26 Jan, 2024 9 commits
-
-
Scruel Tao authored
* fix: suppress `GatedRepoError` to use cache file (fix #28558). * move the `condition_to_return` parameter back outside.
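A minimal sketch of the fallback pattern this fix describes, assuming an illustrative helper name (`resolve_file` is hypothetical; the real logic lives in transformers' hub utilities):

```python
from huggingface_hub import hf_hub_download, try_to_load_from_cache
from huggingface_hub.utils import GatedRepoError

def resolve_file(repo_id: str, filename: str) -> str:
    """Hypothetical helper: prefer the Hub, but fall back to the local cache
    when the repo is gated and the error can be suppressed."""
    try:
        return hf_hub_download(repo_id=repo_id, filename=filename)
    except GatedRepoError:
        cached = try_to_load_from_cache(repo_id=repo_id, filename=filename)
        if isinstance(cached, str):  # a usable cached copy exists
            return cached
        raise  # no cache to fall back on, so surface the original error
```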
-
Matt authored
* Stop confusing the TF compiler with ModelOutput objects
-
Yih-Dar authored
fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Shukant Pal authored
Initialize _tqdm_active with hf_hub_utils.are_progress_bars_disabled() to respect HF_HUB_DISABLE_PROGRESS_BARS. It seems like enable_progress_bar() and disable_progress_bar() sync up with huggingface_hub, but the initial value is always True. This change makes sure the user's preference is respected implicitly on initialization.
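A minimal sketch of the initialization this commit describes, assuming the module-level flag layout used in transformers' progress-bar utilities:

```python
import huggingface_hub.utils as hf_hub_utils

# Seed the flag from huggingface_hub rather than hard-coding True, so an
# HF_HUB_DISABLE_PROGRESS_BARS set before import is respected implicitly.
_tqdm_active = not hf_hub_utils.are_progress_bars_disabled()
```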
-
D authored
* Update preprocessing.md: adjust the ImageProcessor link to a working target (same as in the lower section of the file) * Update preprocessing.md
-
Turetskii Mikhail authored
-
Facico authored
* support PeftMixedModel signature inspect * import PeftMixedModel only for peft>=0.7.0 * Update src/transformers/trainer.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * fix styling * Update src/transformers/trainer.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * style fixup * fix note --------- Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
-
fxmarty authored
* fix duplicate & unnecessary flash warnings * trigger ci * warning_once * if/else order
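The `warning_once` bullet refers to transformers' logger method that memoizes a message so it fires only on first use; a small sketch (the message text is illustrative):

```python
from transformers.utils import logging

logger = logging.get_logger(__name__)

for _ in range(3):
    # Emitted a single time despite the loop: warning_once caches the message.
    logger.warning_once("Padding mask detected: falling back from Flash Attention 2.")
```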
-
Yih-Dar authored
* fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
- 25 Jan, 2024 6 commits
-
-
Peter Götz authored
The documentation says "We refer to this Model parallelism as “Vertical” because of how models are typically visualized.", but then visualizes the model horizontally. This change makes the visualization vertical, as the text describes.
-
Fanli Lin authored
align dtype
-
Yusuf authored
fix typo: change "model = TFAutoModelForQuestionAnswering("distilbert-base-uncased")" to "model = TFAutoModelForQuestionAnswering.from_pretrained("distilbert-base-uncased")"
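The corrected call, for reference (auto classes are instantiated through the `from_pretrained` classmethod, not by calling the class directly):

```python
from transformers import TFAutoModelForQuestionAnswering

model = TFAutoModelForQuestionAnswering.from_pretrained("distilbert-base-uncased")
```
-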
Merve Noyan authored
Update backbones.md
-
Tom Aarsen authored
Add missing space in warning
-
NielsRogge authored
* First draft * More improvements * More improvements * More improvements * More improvements * Add docs * Remove file * Add copied from * Address comments * Address comments * Address comments * Fix style * Update docs * Convert all checkpoints, add integration test * Rename checkpoints * Add pretrained backbone attributes * Fix default config * Address comment * Add figure to docs * Fix bug thanks to @xenova * Update conversion script * Fix integration test
-
- 24 Jan, 2024 7 commits
-
-
Steven Liu authored
* fix hfoptions * revert changes to other files * fix
-
Fanli Lin authored
* update doc * revert * typo fix * refine * add dtypes * Update docs/source/en/perf_train_cpu.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * no comma * use avx512-vnni --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
-
nakranivaibhav authored
* Changed type hinting for all attention inputs to 'Optional[Tuple[torch.FloatTensor, ...]] = None' * Fixed the ruff formatting issue * Fixed type hinting for all hidden_states to 'Optional[Tuple[torch.FloatTensor, ...]] = None' * Changed type hinting in these 12 scripts: modeling_dpr.py, modeling_nat.py, idefics/vision.py, modeling_tf_dpr.py, modeling_luke.py, modeling_swin.py, modeling_tf_swin.py, modeling_blip.py, modeling_tf_blip.py, modeling_donut_swin.py, modeling_dinat.py, modeling_swinv2.py * test fail update * Fixed type hinting for these 15 scripts: modeling_xlnet.py, modeling_tf_xlnet.py, modeling_led.py, modeling_tf_led.py, modeling_rwkv.py, modeling_dpt.py, modeling_tf_cvt.py, modeling_clip.py, modeling_flax_clip.py, modeling_tf_clip.py, modeling_longformer.py, modeling_tf_longformer.py, modeling_siglip.py, modeling_clap.py, modeling_git.py * Changed type hinting in these 12 scripts: modeling_dpr.py, modeling_nat.py, idefics/vision.py, modeling_tf_dpr.py, modeling_luke.py, modeling_swin.py, modeling_tf_swin.py, modeling_blip.py, modeling_tf_blip.py, modeling_donut_swin.py, modeling_dinat.py, modeling_swinv2.py * test fail update * Removed the myvenv file * Fixed type hinting for these 8 scripts: modeling_tvlt.py, modeling_sam.py, modeling_tf_sam.py, modeling_tvp.py, modeling_rag.py, modeling_tf_rag.py, modeling_tf_xlm.py, modeling_xlm.py
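An illustrative sketch of the revised annotation style (the output class below is invented for demonstration; the PR touched existing model files):

```python
from dataclasses import dataclass
from typing import Optional, Tuple

import torch
from transformers.utils import ModelOutput

@dataclass
class ExampleModelOutput(ModelOutput):
    # Tuples of tensors now carry an explicit ellipsis and default to None.
    last_hidden_state: torch.FloatTensor = None
    hidden_states: Optional[Tuple[torch.FloatTensor, ...]] = None
    attentions: Optional[Tuple[torch.FloatTensor, ...]] = None
```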
-
Steven Liu authored
* config * optim * pre deploy * deploy * save weights, memory, troubleshoot, non-Trainer * done
-
amyeroberts authored
-
jeffhataws authored
* Use save_safetensors to disable safe serialization for XLA https://github.com/huggingface/transformers/issues/28438 * Style fixup
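A usage sketch under the assumption of a standard `Trainer` setup; `save_safetensors` is the existing training argument the fix routes through:

```python
from transformers import TrainingArguments

# On XLA/Neuron backends where safetensors serialization is problematic,
# disable safe serialization when saving checkpoints.
args = TrainingArguments(output_dir="out", save_safetensors=False)
```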
-
Khai Mai authored
* fix the function load_balancing_loss_func in Mixtral_Moe to include attention_mask * format code using black and ruff * skip computing mask if attention_mask=None * add tests for load balancing loss Mixtral-Moe * fix assert loss is different in mixtral_test * fix pad_leng * use assertNotAlmostEqual and print to debug * remove print for debug * minor updates * reduce rtol and atol
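A simplified, hedged sketch of a load-balancing auxiliary loss that skips padded positions; the actual Mixtral implementation in `modeling_mixtral.py` differs in shape handling and batching details:

```python
import torch

def load_balancing_loss(router_logits, num_experts, top_k, attention_mask=None):
    # router_logits: (num_tokens, num_experts); attention_mask: (num_tokens,)
    probs = torch.softmax(router_logits, dim=-1)
    _, selected_experts = torch.topk(probs, top_k, dim=-1)
    expert_mask = torch.nn.functional.one_hot(selected_experts, num_experts).float()
    if attention_mask is None:  # skip computing the mask, as in the fix
        tokens_per_expert = expert_mask.mean(dim=0)
        router_prob_per_expert = probs.mean(dim=0)
    else:  # weight the statistics so padded tokens do not count
        keep = attention_mask.float().reshape(-1, 1, 1)
        tokens_per_expert = (expert_mask * keep).sum(dim=0) / keep.sum(dim=0)
        router_prob_per_expert = (probs * keep[:, :, 0]).sum(dim=0) / keep[:, :, 0].sum(dim=0)
    return num_experts * torch.sum(tokens_per_expert.mean(dim=0) * router_prob_per_expert)
```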
-
- 23 Jan, 2024 11 commits
-
-
Vladimir Pinera authored
Fixing grammatical errors in the text
-
Zhenwei authored
fix a hidden bug in `GenerationConfig` so that `generation_config.json` can be loaded successfully (#28604) * fix a hidden bug of GenerationConfig * keep `sort_keys=True` to maintain visibility * Update src/transformers/generation/configuration_utils.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update configuration_utils.py: in case `obj` is a list, check the items in the list --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
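A round-trip check of the behavior the fix restores, assuming a local save directory:

```python
from transformers import GenerationConfig

config = GenerationConfig(max_new_tokens=32, bad_words_ids=[[1, 2], [3]])
config.save_pretrained("./gen_cfg")             # writes generation_config.json
reloaded = GenerationConfig.from_pretrained("./gen_cfg")
assert reloaded.bad_words_ids == [[1, 2], [3]]  # list values survive the round trip
```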
-
Matt authored
* Remove deprecated eager_serving fn * Fix the input_signature docstring while I'm here
-
cmathw authored
convert token id to list in .decode()
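A small example of the call this fix targets (a bare int token id is now wrapped into a list internally):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
token_id = tokenizer.encode("hello")[0]  # a bare int, not a list
print(tokenizer.decode(token_id))        # converted to a list inside .decode()
```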
-
Quentin Meeus authored
* add dataloader prefetch factor in training args and trainer * remove trailing spaces * prevent dataloader_num_workers == 0 and dataloader_prefetch_factor != None; dataloader_prefetch_factor works only when data is loaded in a different process than the main one. This commit adds the necessary checks to avoid having prefetch_factor set when there is no such process. * Remove whitespace in empty line * Update src/transformers/training_args.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
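A usage sketch of the new argument pair; note the constraint the commit enforces:

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="out",
    dataloader_num_workers=2,      # must be > 0 for prefetching to apply
    dataloader_prefetch_factor=4,  # batches prefetched in advance per worker
)
```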
-
Zach Mueller authored
Fix Windows error
-
Scruel Tao authored
Fix copy/paste error msg typo
-
amyeroberts authored
* Enable instantiating model with pretrained backbone weights * Update tests so backbone checkpoint isn't passed in * Remove doc updates until changes made in modeling code * Clarify pretrained import * Update configs - docs and validation check * Update src/transformers/utils/backbone_utils.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Clarify exception message * Update config init in tests * Add test for when use_timm_backbone=True * Small test updates --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
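A hedged sketch of the configuration pattern this enables, using DETR as an example (loading the timm backbone weights requires `timm` to be installed):

```python
from transformers import DetrConfig, DetrModel

config = DetrConfig(
    use_timm_backbone=True,
    backbone="resnet50",
    use_pretrained_backbone=True,  # load pretrained backbone weights at init
)
model = DetrModel(config)
```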
-
Lysandre Debut authored
Enable safetensors conversion from PyTorch to other frameworks without the torch requirement (#27599) * Initial commit * Requirements & tests * Tests * Tests * Rogue import * Rogue torch import * Cleanup * Apply suggestions from code review Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com> * bfloat16 management * Sanchit's comments * Import shield * apply suggestions from code review * correct bf16 * rebase --------- Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com> Co-authored-by: sanchit-gandhi <sanchit@huggingface.co>
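A hedged sketch of the framework-agnostic idea: safetensors' numpy API can round-trip weights with no torch import (file names are illustrative):

```python
from safetensors.numpy import load_file, save_file

weights = load_file("model.safetensors")          # dict of numpy arrays, no torch needed
save_file(weights, "model_converted.safetensors")
```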
-
Dave Berenbaum authored
-
Huazhong Ji authored
get default device through `PartialState().default_device` as it has been officially released (#27256)
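For reference, the accessor the commit switches to:

```python
from accelerate import PartialState

device = PartialState().default_device  # e.g. cuda:0, mps, or cpu
```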
-
- 22 Jan, 2024 7 commits
-
-
amyeroberts authored
Co-authored-by: Pashmina Cameron <11311835+pashminacameron@users.noreply.github.com>
-
amyeroberts authored
Only import class if sentencepiece is available
-
Sounak Dey authored
* Update image_processing_deformable_detr.py * Changes after running make fix-copies
-
Younes Belkada authored
Update modeling_gpt_neox.py
-
isaac-vidas authored
* Update convert_llava_weights_to_hf.py script * Remove config update of adding padding to `vocab_size` and `text_config.vocab_size`, which caused a `ValueError` exception. * Remove keys that end with `inv_freq` from the state dict. * Add examples and instructions for creating `model_state_dict.bin` that can be used by the script. * Update convert_llava_weights_to_hf.py * Update convert_vipllava_weights_to_hf.py
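A minimal sketch of the state-dict filtering the second bullet describes (`strip_inv_freq` is a hypothetical helper name):

```python
def strip_inv_freq(state_dict: dict) -> dict:
    # Rotary-embedding inv_freq buffers are recomputed at load time,
    # so drop them before handing the state dict to the conversion script.
    return {k: v for k, v in state_dict.items() if not k.endswith("inv_freq")}
```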
-
bofeng huang authored
* Fix lr_scheduler * Fix lr scheduler
-
Matt authored
Add tip to custom model docs
-