- 03 Aug, 2024 1 commit
-
Shaopeng Fu authored
fix: (issue #32124) Exception raised when running `transformers/examples/flax/language-modeling/t5_tokenizer_model.py` (#32157)
-
- 02 Aug, 2024 3 commits
-
Sanchit Gandhi authored
* up
* style
* stopping
-
Joao Gante authored
tests! :D
-
Raushan Turganbay authored
nits
-
- 01 Aug, 2024 13 commits
-
Zach Mueller authored
* Test this zach
* Test for improper init w/o zero3
* Move back
* Apply suggestions from code review
* Get rid of stars in warning
* Make private
* Make clear

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
OsamaS99 authored
* fixed hybrid cache init, added test
* Fix test typo

Co-authored-by: Aaron Haag <aaron.haag@siemens.com>
-
Joao Gante authored
-
Nikos Karampatziakis authored
* Initial implementation of OffloadedCache
* enable usage via cache_implementation
* Address feedback, add tests, remove legacy methods
* Remove flash-attn, discover synchronization bugs, fix bugs
* Prevent usage in CPU-only mode
* Add a section about the offloaded KV cache to the docs
* Fix typos in docs
* Clarifications and better explanation of streams
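The offloading idea behind this commit — keep only the KV tensors of the layer currently executing (plus a prefetched neighbour) on the accelerator and park the rest in CPU memory — can be sketched with a toy class. `ToyOffloadedCache` and its method names are hypothetical illustrations, not the transformers API:

```python
class ToyOffloadedCache:
    """Toy sketch of an offloaded KV cache: each layer's tensors live on the
    CPU except for the layer currently running and the prefetched next one."""

    def __init__(self, num_layers):
        self.device = ["cpu"] * num_layers  # where each layer's KV pair lives

    def before_layer(self, i):
        # Fetch the current layer's cache and prefetch the next layer's,
        # so the accelerator never waits on a transfer mid-forward.
        self.device[i] = "gpu"
        self.device[(i + 1) % len(self.device)] = "gpu"

    def after_layer(self, i):
        # Evict the finished layer back to CPU memory.
        self.device[i] = "cpu"


cache = ToyOffloadedCache(num_layers=4)
peak_on_gpu = 0
for layer in range(4):
    cache.before_layer(layer)
    peak_on_gpu = max(peak_on_gpu, cache.device.count("gpu"))
    cache.after_layer(layer)
```

With this scheme GPU residency stays constant (here, two layers) regardless of model depth, which is the memory/latency trade the commit's docs section describes.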
-
Omar Salman authored
* Fix conflicting key in init kwargs in PreTrainedTokenizerBase
* Update code to check for callable key in save_pretrained
* Apply PR suggestions
* Invoke CI
* Updates based on PR suggestion
-
Viktor Scherbakov authored
empty list in defaults
-
Ita Zaporozhets authored
-
Hanna Yukhymenko authored
* Remove TPU device map for saving tokenizer config
* Update tokenization_utils_base.py
* Fix error msg when passing non-string device into tokenizer
* Fix error message for non-string tokenizer device
* Print out tokenizer device type in error msg
* Update tokenization_utils_base.py
-
nv-guomingz authored
Co-authored-by: Guoming Zhang <37257613+nv-guomingz@users.noreply.github.com>
-
Lunwen He authored
* Remove size check between attn_weights and kv_seq_len
* add unit tests
-
Sanchit Gandhi authored
* [whisper] compile compatibility with long-form decoding
* clarify comment
* fix after rebase
* finalise
* fix bsz
* fix cache split
* remove contiguous
* style
* finish
* update doc
* prevent cuda graph trace
-
Sanchit Gandhi authored
-
Raushan Turganbay authored
cache class flag
-
- 31 Jul, 2024 9 commits
-
Ricardo authored
-
Sai-Suraj-27 authored
Fixed staticmethods with self as first argument.
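The bug pattern fixed here — a `@staticmethod` whose signature declares `self` — can be shown with a minimal, hypothetical example (the real fixes touched utility classes in the repo, not these names):

```python
class Broken:
    @staticmethod
    def double(self, x):  # bug: a staticmethod receives no instance, so
        return x * 2      # `self` silently swallows the first argument


class Fixed:
    @staticmethod
    def double(x):        # correct: no `self` in a staticmethod signature
        return x * 2
```

Calling `Broken.double(3)` raises a `TypeError` because `3` binds to `self` and `x` is left missing, while `Fixed.double(3)` returns `6` — which is why the stray `self` goes unnoticed until the method is actually called.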
-
fxmarty authored
* draft
* apply changes to all relevant archs
* rerun ci - check_docstrings.py failing?
* fix docstring
* move 2D->4D mask creation to modeling file
* repo consistency
* fix the batch size = 1 case - calling contiguous is not enough
* nit
* style
* propagate to gemma/gemma-2
* prepare inputs for gemma generation
* implement test and tiny fix in gemma2
* Update src/transformers/models/bloom/modeling_bloom.py
* fix copies
* ci pass
* fix gemma's test_compile_static_cache tests
* flaky
* retrigger ci

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: sanchit-gandhi <sanchit@huggingface.co>
-
Aymeric Roucher authored
Fix error when streaming agent run to gradio with non-string tool arguments
-
Joao Gante authored
-
amyeroberts authored
* Fix FA2 call for Perceiver layer
* Fix up
* [run_slow] idefics2
-
Joao Gante authored
fix 💩
-
Raushan Turganbay authored
* enable flash-attn & static cache
* this works, not the prev
* fix for sliding window layers
* not needed anymore
-
Raushan Turganbay authored
fix
-
- 30 Jul, 2024 12 commits
-
Joshua Lochner authored
* Remove user-defined tokens which can be obtained through merges
* Remove debug line
* formatting
* Refactor spm slow -> fast converter
* revert unnecessary refactor
* set comprehension
* remove test files
* Use `vocab_scores`
* Always replace spiece underline with space in decode
* we no longer need token filtering
* Add save fast load slow unit test
* Remove tokenizers version check
* Remove duplicate code
* Make `<start_of_turn>` and `<end_of_turn>` special tokens
* Bias merge priority with length if score is the same
* Add unit test for merge priority
* CI
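The "bias merge priority with length if score is the same" step can be sketched as a compound sort key. The tie-break direction shown (longer pieces first) and the `merges` data are assumptions for illustration, not taken from the converter:

```python
# Hypothetical (piece, score) pairs as a SentencePiece vocab would provide.
merges = [("ab", -1.0), ("abc", -1.0), ("x", -0.5)]

# Primary key: higher score first.  Tie-break: longer piece first, so
# equal-score merges get a deterministic order instead of an arbitrary one.
ordered = sorted(merges, key=lambda m: (-m[1], -len(m[0])))
```

Without the length term, `"ab"` and `"abc"` (identical scores) could land in either order depending on the input ordering, which is exactly the kind of instability a converter between slow and fast tokenizers needs to avoid.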
-
Joao Gante authored
* tmp
* skip files not in the diff
* use git.Repo instead of an external subprocess
* add tiny change to confirm that the diff is working on pushed changes
* add make quality task
* more professional main commit reference
-
fkrasnov2 authored
Fixes #32329: the Torch code is correct. To get an average of 10% of the total, we take 50% of the remainder after 80% has already been masked with [MASK] in the previous step.
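The arithmetic behind that fix — 50% of the post-[MASK] remainder equals 10% of the total — in a few lines of Python (variable names are illustrative, not the collator's):

```python
mask_fraction = 0.8              # tokens already replaced with [MASK]
random_share_of_rest = 0.5       # share of the remainder to randomize

remainder = 1.0 - mask_fraction                       # 0.2 of selected tokens
random_fraction = remainder * random_share_of_rest    # 0.1 of the total
```

This is why masking code samples the random-replacement step with probability 0.5 *conditioned on not already being [MASK]*, rather than with probability 0.1 outright.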
-
Wing Lian authored
Fix FSDP sharding across cpu and meta devices in cpu_efficient_loading for prequantized 4-bit models (#32276)
-
Sai-Suraj-27 authored
Fixed the raising of a few exceptions.
-
plaggy authored
* new agent plan
* plan type assertion
* style corrections
* better prompt naming
* make fixup
-
Joao Gante authored
* doc formatting nits
* ignore non-autodocs
* Apply suggestions from code review
* Update src/transformers/models/esm/modeling_esm.py
* make fixup

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Yoach Lacombe authored
* tentative fix
* do the same for M4T
-
Luc Georges authored
-
Teddy Ferdinan authored
* fix epochs_trained as int when resuming training
* refactor

Co-authored-by: teddyferdinan <teddy.ferdinan@pwr.edu.pl>
-
Isotr0py authored
* fix gguf dequantize for gguf==0.9.1
* fix old version
* make style
-
Gilad Turok authored
Docs: fix GaLore optimizer example. Fixes incorrect usage of the GaLore optimizer in the Transformers trainer code example. The GaLore optimizer uses low-rank gradient updates to reduce memory usage. GaLore is quite popular and is implemented by the authors at [https://github.com/jiaweizzhao/GaLore](https://github.com/jiaweizzhao/GaLore). A few months ago GaLore was added to the Hugging Face Transformers library in https://github.com/huggingface/transformers/pull/29588. The documentation of the Trainer module includes a few code examples of how to use GaLore; however, the `optim_target_modules` argument to `TrainingArguments` was incorrect, as discussed in https://github.com/huggingface/transformers/pull/29588#issuecomment-2006289512. This pull request fixes that issue.
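For orientation, a hedged sketch of what a GaLore configuration looks like in the Trainer: GaLore is selected via an `optim` value and `optim_target_modules` picks which modules receive low-rank updates. The `output_dir` and the regex patterns below are illustrative assumptions, not values from the fixed example:

```python
from transformers import TrainingArguments

# Sketch only: a GaLore optimizer name plus optim_target_modules
# selecting the attention and MLP modules for low-rank updates.
args = TrainingArguments(
    output_dir="./out",                               # illustrative path
    optim="galore_adamw",
    optim_target_modules=[r".*attn.*", r".*mlp.*"],   # illustrative patterns
)
```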
-
- 29 Jul, 2024 2 commits
-
Yih-Dar authored
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Aymeric Roucher authored
Add stream_to_gradio method for running agent in gradio demo
-