- 21 Nov, 2023 8 commits
-
-
amyeroberts authored
* Enable tracing with DINOv2 model
* ABC
* Add note to model doc
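A minimal sketch of what the enabled tracing looks like, assuming the `transformers` fx helper and an illustrative checkpoint:

```python
import torch
from transformers import Dinov2Model
from transformers.utils.fx import symbolic_trace

# Illustrative checkpoint; any DINOv2 checkpoint should trace the same way.
model = Dinov2Model.from_pretrained("facebook/dinov2-base")
model.eval()

# transformers' fx wrapper produces a torch.fx GraphModule for the model.
traced = symbolic_trace(model, input_names=["pixel_values"])

pixel_values = torch.randn(1, 3, 224, 224)
with torch.no_grad():
    outputs = traced(pixel_values=pixel_values)
```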
-
fxmarty authored
* fix various bugs with flash attention
* bump
* fix test
* fix mistral
* use skipTest instead of a return, which may be misleading
* fix on review
-
fxmarty authored
* add scheduled ci on amdgpu
* fix likely typo
* more tests, avoid parallelism
* precise comment
* fix report channel
* trigger docker build on this branch
* fix
* run rocm scheduled ci
* fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Leo Tronchon authored
* fix image_attention gate in idefics modeling
* update comment
* cleaner gating
* fix gate condition
* create attention gate once
* update comment
* update doc of cross-attention forward
* improve comment
* bring back no_images
* pass cross_attention_gate similarly to no_images gate
* add information on gate shape
* fix no_images placement
* make tests for gate
* take off no_images logic
* update test based on comments
* raise value error if cross_attention_gate is None
* send cross_attention_gate to device
* Revert "send cross_attention_gate to device" (reverts commit 054f84228405bfa2e75fecc502f6a96dc83cdc0b)
* send cross_attention_gate to device
* fix device in test + nit
* fill hidden_states with zeros instead of multiplying with the gate
* style
* Update src/transformers/models/idefics/modeling_idefics.py
* Update src/transformers/models/idefics/modeling_idefics.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
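The gating fix above boils down to zeroing gated-off positions of the cross-attention output instead of multiplying the whole tensor by the gate; a standalone sketch with assumed shapes and names:

```python
import torch

# Assumed shapes: hidden_states is (batch, seq_len, hidden); the per-token
# cross_attention_gate is (batch, seq_len), 0 where no image is attended.
hidden_states = torch.randn(2, 4, 8)
cross_attention_gate = torch.tensor([[1.0, 0.0, 1.0, 1.0],
                                     [0.0, 1.0, 1.0, 0.0]])

# Fill hidden_states with zeros where the gate is closed, rather than
# scaling by the gate values, so closed positions contribute nothing.
hidden_states = hidden_states.masked_fill(
    (cross_attention_gate == 0)[:, :, None], 0.0
)
```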
-
Joao Gante authored
-
NielsRogge authored
* Improve convnext backbone * Fix convnext2
-
Younes Belkada authored
* add support for old GC method
* add also disable
* up
* oops
-
Dave Berenbaum authored
* dvclive callback: warn instead of fail when logging non-scalars * tests: log lr as scalar
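A sketch of the warn-instead-of-fail behavior described above (helper name is illustrative; `Live.log_metric` is the DVCLive call):

```python
import logging

logger = logging.getLogger(__name__)

def log_trainer_metrics(live, metrics):
    # Warn and skip non-scalar values instead of letting the callback fail.
    for name, value in metrics.items():
        if isinstance(value, (int, float)):
            live.log_metric(name, value)
        else:
            logger.warning(
                "Trainer is attempting to log %r for key %r, but DVCLive only "
                "supports scalar metrics here; the value is skipped.",
                value, name,
            )
```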
-
- 20 Nov, 2023 9 commits
-
-
amyeroberts authored
* Fix torch.fx import issue for torch 1.12
* Fix up
* Python version dependent import
* Woops - fix
* Fix
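The "Python version dependent import" bullet refers to gating an import on the installed torch version; a generic sketch of the pattern (the exact symbols touched by the commit are not shown):

```python
import torch
from packaging import version

# torch 1.12's torch.fx lacks pieces that newer releases expose, so gate
# the import instead of crashing at import time on older installs.
if version.parse(torch.__version__) >= version.parse("1.13"):
    import torch.fx as fx  # full feature set available
else:
    fx = None  # callers check for None and skip the fx-dependent path
```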
-
Yeonwoo Sung authored
Update Korean tutorial for using LLMs, and refactor the nested conditional statements in hr_argparser.py (#27489)
docs: Update Korean LLM tutorial to use Mistral-7B, not Llama-v1
-
Dmitrii Mukhutdinov authored
* Enable large-v3 downloading and update language list
* Fix type annotation
* make fixup
* Export Whisper feature extractor
* Fix error after extractor loading
* Do not use pre-computed mel filters
* Save the full preprocessor properly
* Update docs
* Remove comment
* Add alignment heads consistent with each Whisper version
* Remove alignment heads calculation
* Save fast tokenizer format as well
* Fix slow to fast conversion
* Fix bos/eos/pad token IDs in the model config
* Add decoder_start_token_id to config
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
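Once converted, large-v3 loads like any other Whisper checkpoint; a quick sketch (assuming the official model id):

```python
from transformers import WhisperForConditionalGeneration, WhisperProcessor

# large-v3 uses 128 mel bins instead of the 80 used by earlier Whisper
# checkpoints, which is why the converter now exports the feature extractor
# rather than relying on pre-computed mel filters.
processor = WhisperProcessor.from_pretrained("openai/whisper-large-v3")
model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-large-v3")
```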
-
Said Taghadouini authored
* timm to pytorch conversion for vit model fix
* remove unnecessary print statements
* Detect non-supported ViTs in transformers & better handle id2label mapping
* detect non-supported hybrid resnet-vit models in conversion script
* remove check for overlap between cls token and pos embed
-
Younes Belkada authored
* add fa2 support for from_config * Update test_modeling_common.py
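A sketch of what the new `from_config` path enables, using the flag name transformers accepted at the time (later releases switched to `attn_implementation`):

```python
import torch
from transformers import AutoConfig, AutoModelForCausalLM

config = AutoConfig.from_pretrained("mistralai/Mistral-7B-v0.1")

# Build an (untrained) model from a config with Flash Attention 2 enabled,
# which previously was only possible through from_pretrained.
model = AutoModelForCausalLM.from_config(
    config,
    use_flash_attention_2=True,
    torch_dtype=torch.bfloat16,
)
```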
-
Mathias Nielsen authored
* Renamed variable extension to builder_name
* If builder name is jsonl, change to json to align with load_dataset
* Apply suggestions from code review
Co-authored-by: Quentin Lhoest <42851186+lhoestq@users.noreply.github.com>
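The rename plus the jsonl-to-json mapping, condensed into a sketch (function name is illustrative):

```python
def infer_builder_name(file_extension: str) -> str:
    # Formerly named `extension`; renamed to reflect what it is used for.
    builder_name = file_extension
    # datasets' load_dataset has a "json" builder but no "jsonl" builder,
    # so .jsonl files are routed to the json builder.
    if builder_name == "jsonl":
        builder_name = "json"
    return builder_name

assert infer_builder_name("jsonl") == "json"
assert infer_builder_name("csv") == "csv"
```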
-
Peter Pan authored
Signed-off-by: Peter Pan <Peter.Pan@daocloud.io>
-
Xabier de Zuazo authored
Add `convert_hf_to_openai.py` script to Whisper documentation resources.
-
Joel Tang authored
* Load idx2sym from pretrained vocab file in Transformer XL. When loading a vocab file from a pretrained tokenizer for Transformer XL, the pickled vocabulary contains an idx2sym key, but it isn't loaded: it is discarded because an empty list already exists as an attribute. The fix is to handle it explicitly, just like sym2idx.
* ran make style
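A condensed sketch of the described bug and fix (the real logic lives in the Transformer XL tokenizer; names follow the commit message):

```python
import pickle

def load_pretrained_vocab(tokenizer, vocab_file):
    with open(vocab_file, "rb") as f:
        vocab_dict = pickle.load(f)
    for key, value in vocab_dict.items():
        # The generic restore skips keys whose attribute already exists;
        # this is how the pickled idx2sym used to be discarded, since an
        # empty-list default was already set on the tokenizer.
        if key not in tokenizer.__dict__:
            setattr(tokenizer, key, value)
    # Explicitly restore idx2sym, mirroring the existing sym2idx handling.
    if "idx2sym" in vocab_dict:
        tokenizer.idx2sym = vocab_dict["idx2sym"]
```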
-
- 19 Nov, 2023 1 commit
-
-
Rafael Padilla authored
Co-authored-by: Rafael Padilla <rafael.padilla@huggingface.co>
-
- 18 Nov, 2023 1 commit
-
-
Omar Sanseviero authored
-
- 17 Nov, 2023 7 commits
-
-
jiaqiw09 authored
* translate deepspeed.md * update
-
V.Prasanna kumar authored
Fixed the broken links belonging to the Datasets library in the Transformers docs
-
V.Prasanna kumar authored
-
Joao Gante authored
-
Joao Gante authored
-
Yih-Dar authored
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Yih-Dar authored
* fix
* fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
- 16 Nov, 2023 13 commits
-
-
jiaqiw09 authored
* translate
* update
* update
-
Nathaniel Egwu authored
* Updated albert.md doc for ALBERT model
* Update docs/source/en/model_doc/albert.md: fixed Resources heading
* Update the ALBERT model doc resources: fixed resource example for fine-tuning the ALBERT sentence-pair classification
* Update docs/source/en/model_doc/albert.md: removed resource duplicate
* Updated albert.md doc with reviewed changes
* Updated albert.md doc for ALBERT
* Update docs/source/en/model_doc/albert.md: removed duplicates
* Update docs/source/en/model_doc/albert.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
-
Joao Gante authored
-
Arthur authored
* try to stylify using ruff
* might need to remove these changes?
* use ruff format and ruff check
* use isinstance instead of type comparison
* use # fmt: skip
* nits
* some styling changes
* update ci job
* nits isinstance
* more files update
* nits
* more nits
* small nits
* check and format
* revert wrong changes
* actually use formatter instead of checker
* nits
* well docbuilder is overwriting this commit
* revert notebook changes
* try to nuke docbuilder
* style
* fix feature extraction test
* remove `indent-width = 4`
* fixup
* more nits
* update the ruff version that we use
* style
* nuke docbuilder styling
* leave the print for detected changes
* nits
* Remove file I/O
* style
* nits
* revert notebook changes
* Add # fmt: skip when possible
* Fix
* More ` # fmt: skip` usage
* Nits
* more fixes
* fix tapas
* Another way to skip
* Recommended way
* Fix two more files
* Remove asynch
Co-authored-by: charliermarsh <charlie.r.marsh@gmail.com>
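Several bullets above replace `# fmt: off` / `# fmt: on` pairs with `# fmt: skip`, which ruff's formatter honors as a trailing comment on a single statement; a small sketch:

```python
# `# fmt: skip` keeps this one statement exactly as written, with no need
# for a surrounding fmt: off / fmt: on pair.
IDENTITY = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]  # fmt: skip
```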
-
Yih-Dar authored
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Marc Sun authored
add error msg
-
Lucain authored
* Set usedforsecurity=False in hashlib methods (FIPS compliance)
* trigger ci
* tokenizers version
* deps
* bump hfh version
* let's try this
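The FIPS change marks non-security hash uses so MD5/SHA1 remain callable on FIPS-enabled builds; a minimal sketch of the call pattern (`usedforsecurity` exists on Python 3.9+):

```python
import hashlib
import sys

payload = b"cache-key material, not a security boundary"

if sys.version_info >= (3, 9):
    # usedforsecurity=False declares a non-cryptographic use, keeping MD5
    # available on FIPS-enabled Python builds.
    digest = hashlib.md5(payload, usedforsecurity=False).hexdigest()
else:
    digest = hashlib.md5(payload).hexdigest()

print(digest)
```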
-
Patrick von Platen authored
* Revert "add attention_mask and position_ids in assisted model (#26892)" This reverts commit 184f60dc. * more debug
-
Matt authored
* Move the TF pin for 2.15 * make fixup
-
Phuc Van Phan authored
-
Arthur authored
add flash attn markers
-
Dean Wyatte authored
support onnx for causal lm sequence classification
-
Hz, Ji authored
* translate model.md to chinese
* apply review suggestion
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
-
- 15 Nov, 2023 1 commit
-
-
Marc Sun authored
* fix
* style
* add test
-