1. 11 Oct, 2023 5 commits
    • [Assistant Generation] Improve Encoder Decoder (#26701) · da69de17
      Patrick von Platen authored
      * [Assistant Generation] Improve enc dec
      
      * save more
      
      * Fix logit processor checks
      
      * Clean
      
      * make style
      
      * fix deprecation
      
      * fix generation test
      
      * Apply suggestions from code review
      
      * fix biogpt
      
      * make style
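      For context, assisted generation uses a small draft model to propose tokens
      that the main model then verifies in one forward pass. A minimal usage sketch,
      assuming a pair of encoder-decoder checkpoints that share a tokenizer (the
      checkpoint names here are illustrative):

          from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

          # Main model plus a smaller assistant that shares its tokenizer.
          tokenizer = AutoTokenizer.from_pretrained("t5-large")
          model = AutoModelForSeq2SeqLM.from_pretrained("t5-large")
          assistant = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

          inputs = tokenizer("translate English to German: Hello, world!", return_tensors="pt")
          # Passing assistant_model switches generate() into assisted decoding.
          outputs = model.generate(**inputs, assistant_model=assistant)
          print(tokenizer.decode(outputs[0], skip_special_tokens=True))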
    • `Copied from` for test files (#26713) · 5334796d
      Yih-Dar authored
      * copied statement for test files
      
      ---------
      Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
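      For context, a `Copied from` statement marks code duplicated from another
      module so that `utils/check_copies.py` can verify the two copies never drift
      apart; this change extends the mechanism to test files. An illustration of
      the marker format only (the dotted path is an example, not from this commit):

          # Copied from tests.models.bert.test_modeling_bert.BertModelTest.test_model
          def test_model(self):
              config_and_inputs = self.model_tester.prepare_config_and_inputs()
              self.model_tester.create_and_check_model(*config_and_inputs)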
    • Update docs to explain disabling callbacks using report_to (#26155) · 9f406392
      Ben Gubler authored
      * feat: update callback doc to explain disabling callbacks using report_to
      
      * docs: update report_to docstring
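      For reference, the logging-integration callbacks are controlled through the
      report_to training argument; a minimal example:

          from transformers import TrainingArguments

          # report_to="none" disables all integration callbacks (W&B,
          # TensorBoard, MLflow, ...); a list such as ["wandb"] enables
          # only the integrations named.
          args = TrainingArguments(output_dir="out", report_to="none")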
    • In assisted decoding, pass model_kwargs to model's forward call (fix prepare_input_for_generation in all models) (#25242) · dcc49d8a
      Billy Bradley authored
      * In assisted decoding, pass model_kwargs to model's forward call
      
      Previously, assisted decoding ignored any additional kwargs that it
      did not explicitly handle. This was inconsistent with the other
      generation methods, which pass the model_kwargs through
      prepare_inputs_for_generation and forward the returned dict to the
      model's forward call (see the sketch after this entry).
      
      The prepare_inputs_for_generation method needs to be amended in all
      models, as previously it kept only the last input ID when
      past_key_values was passed.
      
      * Improve variable names in _extend_attention_mask
      
      * Refactor extending token_type_ids into a function
      
      * Replace deepcopy with copy to optimize performance
      
      * Update new persimmon model with llama changes for assisted generation
      
      * Update new mistral model for assisted generation with prepare_inputs_for_generation
      
      * Update position_ids creation in falcon prepare_inputs_for_generation to support assisted generation
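      The pattern described in the commit message can be sketched as follows. This
      is a simplified illustration, not the exact code of any model file; in
      particular, reading the cached length from the key/value tensors' shape is an
      assumption about the cache layout:

          import torch

          def prepare_inputs_for_generation(input_ids, past_key_values=None, **model_kwargs):
              if past_key_values is not None:
                  # With a cache, only the not-yet-processed tokens are fed to
                  # the model. In assisted decoding this can be several
                  # candidate tokens at once, not just the single last one.
                  past_length = past_key_values[0][0].shape[2]
                  input_ids = input_ids[:, past_length:]
              # Any extra model_kwargs are passed through rather than dropped.
              return {"input_ids": input_ids, "past_key_values": past_key_values, **model_kwargs}

          # The generation loop forwards the returned dict to the model:
          #   model_inputs = prepare_inputs_for_generation(input_ids, **model_kwargs)
          #   outputs = model(**model_inputs)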
    • Make Whisper Encoder's sinusoidal PE non-trainable by default (#26032) · 1e3c9dda
      Thien Tran authored
      * set encoder's PE as non-trainable
      
      * freeze flax
      
      * init sinusoids
      
      * add test for non-trainable embed positions
      
      * simplify TF encoder embed_pos
      
      * revert tf
      
      * clean up
      
      * add sinusoidal init for jax
      
      * make consistent sinusoidal function
      
      * fix dtype
      
      * add default dtype
      
      * use numpy for sinusoids. fix jax
      
      * add sinusoid init for TF
      
      * fix
      
      * use custom embedding
      
      * use specialized init for each impl
      
      * fix sinusoids init. add test for pytorch
      
      * fix TF dtype
      
      * simplify sinusoid init for flax and tf
      
      * add tests for TF
      
      * change default dtype to float32
      
      * add sinusoid test for flax
      
      * Update src/transformers/models/whisper/modeling_flax_whisper.py
      Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
      
      * Update src/transformers/models/whisper/modeling_tf_whisper.py
      Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
      
      * move sinusoidal init to _init_weights
      
      ---------
      Co-authored-by: sanchit-gandhi <sanchit@huggingface.co>
      Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
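      For context, sinusoidal position embeddings are a fixed function of the
      position, so nothing is lost by freezing them. A minimal PyTorch sketch of the
      idea behind this change; the helper mirrors the standard sin/cos construction
      Whisper uses (first half sine, second half cosine) but is an illustration,
      not the exact transformers code:

          import numpy as np
          import torch
          import torch.nn as nn

          def sinusoids(length: int, channels: int, max_timescale: float = 10000.0) -> torch.Tensor:
              """Sinusoidal position embeddings: first half sin, second half cos."""
              assert channels % 2 == 0
              log_timescale_increment = np.log(max_timescale) / (channels // 2 - 1)
              inv_timescales = torch.exp(-log_timescale_increment * torch.arange(channels // 2))
              scaled_time = torch.arange(length)[:, None].float() * inv_timescales[None, :]
              return torch.cat([scaled_time.sin(), scaled_time.cos()], dim=1)

          # Fill the encoder's position table with sinusoids and freeze it so
          # the optimizer never updates it (the new default this PR sets).
          embed_positions = nn.Embedding(1500, 384)  # Whisper-tiny encoder shape
          with torch.no_grad():
              embed_positions.weight.copy_(sinusoids(1500, 384))
          embed_positions.weight.requires_grad_(False)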
  2. 10 Oct, 2023 6 commits
  3. 09 Oct, 2023 10 commits
  4. 06 Oct, 2023 11 commits
  5. 05 Oct, 2023 8 commits