Commits · b8e5cd5396f7c0cc2d5e10be6696ea38742abf51 · chenpangpang / transformers

26 Jul, 2024 5 commits

Refactor: Removed un-necessary `object` base class (#32230) · b8e5cd53
Sai-Suraj-27 authored Jul 26, 2024
```
* Refactored to remove un-necessary object base class.

* small fix.
```
b8e5cd53
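
As a brief illustration of why the explicit `object` base class is unnecessary in Python 3, the sketch below compares a class written both ways. The class names are made up for this example and do not come from the PR.

```python
# In Python 3 every class inherits from `object` implicitly, so spelling the
# base out changes nothing. Both class names here are hypothetical.


class WithExplicitBase(object):  # older, Python 2 compatible spelling
    pass


class WithImplicitBase:  # equivalent Python 3 spelling
    pass


# Both classes report `object` as their only direct base.
same_bases = WithExplicitBase.__bases__ == WithImplicitBase.__bases__ == (object,)

# Past the class itself, the method resolution order is identical.
same_mro_tail = WithExplicitBase.__mro__[1:] == WithImplicitBase.__mro__[1:]
```

Both `same_bases` and `same_mro_tail` evaluate to `True`, which is why such a refactor is purely cosmetic.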

don't log base model architecture in wandb if log model is false (#32143) · 1c7ebf1d

João Nadkarni authored Jul 26, 2024



* don't log base model architecture in wandb if log model is false

* Update src/transformers/integrations/integration_utils.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* convert log model setting into an enum

* fix formatting

---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

1c7ebf1d
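
The "convert log model setting into an enum" step above could look roughly like the sketch below. This is only an assumption about the general shape of such a change: the enum name, its members, and both helper functions are hypothetical and are not taken from `integration_utils.py`.

```python
from enum import Enum


class LogModelSetting(Enum):
    """Hypothetical enum standing in for a string-valued log-model setting."""

    FALSE = "false"
    CHECKPOINT = "checkpoint"
    END = "end"

    @property
    def is_enabled(self) -> bool:
        # Any member other than FALSE means model logging was requested.
        return self is not LogModelSetting.FALSE


def parse_log_model(raw: str) -> LogModelSetting:
    """Map a raw string to the enum; unknown values disable logging."""
    try:
        return LogModelSetting(raw.strip().lower())
    except ValueError:
        return LogModelSetting.FALSE


def should_log_architecture(setting: LogModelSetting) -> bool:
    # Assumed rule, following the commit title: skip the base model
    # architecture whenever model logging is off.
    return setting.is_enabled
```

Using an enum instead of raw strings keeps the comparison in one place, so callers ask `setting.is_enabled` rather than repeating string checks.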

Resize embeds with DeepSpeed (#32214) · c46edfb8
Raushan Turganbay authored Jul 26, 2024
```
* fix resize when deepspeed

* deepspeed uses new embeds

* we needed this
```
c46edfb8
Llava: generate without images (#32183) · fad15fba
Raushan Turganbay authored Jul 26, 2024
```
* llava w/o images

* tests
```
fad15fba

Generation: stop at `eos` for assisted decoding (#31301) · 4ab33c2d

Raushan Turganbay authored Jul 26, 2024



* fix

* move changes to prompt lookup

* add test

* set eos in assistant model

* style

* fix flakiness

* changes for new `main`

* Update tests/generation/test_utils.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update tests/generation/test_utils.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* add comment to explain

---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

4ab33c2d

25 Jul, 2024 9 commits
- Fix code snippet for Grounding DINO (#32229) · 9d6c0641
  Pavel Iakubovskii authored Jul 25, 2024
```
Fix code snippet for grounding-dino
```
  9d6c0641
- Allow a specific microphone to be used by the ffmpeg audio pipeline utility... · 3a83ec48
  jrhe authored Jul 25, 2024
```
Allow a specific microphone to be used by the ffmpeg audio pipeline utility functions. Default to using the currently active microphone on Mac (#31846)

* use currently active microphone on mac for ffmpeg_microphone

* Allow ffmpeg_microphone device to be specified
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
```
  3a83ec48
- translate philosophy.md to chinese (#32177) · 6ed0bf1e
  Huazhong Ji authored Jul 26, 2024
```
* translate philosophy.md to chinese

* add the missing link
```
  6ed0bf1e
- Follow up for #31973 (#32025) · df6eee92
  Yih-Dar authored Jul 25, 2024
```
* fix

* [test_all] trigger full CI

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
  df6eee92
- [warnings] fix E721 warnings (#32223) · de231889
  Kashif Rasul authored Jul 25, 2024
```
fix E721 warnings
```
  de231889
- [BigBird Pegasus] set _supports_param_buffer_assignment to False (#32222) · 9b9a54e6
  Kashif Rasul authored Jul 25, 2024
```
set _supports_param_buffer_assignment to False
```
  9b9a54e6
- Update question_answering.py (#32208) · 1ecedf1d
  Austin authored Jul 25, 2024
  
  1ecedf1d
- remove unnecessary guard code related with pytorch versions 1.4.2 ~ 1.7.0 (#32210) · f53a5dec
  Huazhong Ji authored Jul 25, 2024
```
remove unnecessary guard code related with pytorch versions 1.4.2 ~ 1.7.0
```
  f53a5dec
- [whisper] fix short-form output type (#32178) · 5658e749
  Sanchit Gandhi authored Jul 25, 2024
```
* [whisper] fix short-form output type

* add test

* make style

* update long-form tests

* fixes

* last fix

* finalise test
```
  5658e749
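
For context on the E721 fix listed above (#32223): E721 is the lint rule that flags type comparisons written with `==`. The sketch below shows the usual replacements. The function is a made-up example, not code from the PR.

```python
def describe(value):
    """Classify a value without comparing types via `==`."""
    # A linter reports E721 for code such as `type(value) == int`.
    # `isinstance` is the usual replacement and also accepts subclasses.
    # `bool` is a subclass of `int`, so it has to be checked first.
    if isinstance(value, bool):
        return "bool"
    if isinstance(value, int):
        return "int"
    # When an exact type match is really intended, compare with `is`.
    if type(value) is float:
        return "float"
    return "other"
```

The choice between `isinstance` and `is` depends on whether subclasses should match, which is a decision the `==` form tends to hide.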
24 Jul, 2024 11 commits

fix: Replaced deprecated `unittest method` with the correct one (#32198) · 85a1269e
Sai-Suraj-27 authored Jul 24, 2024
```
Replaced deprecated unittest method with the correct one.
```
85a1269e

🚨 No more default chat templates (#31733) · edd68f4e

Matt authored Jul 24, 2024

* No more default chat templates

* Add the template to the GPT-SW3 tests since it's not available by default now

* Fix GPT2 test

* Fix Bloom test

* Fix Bloom test

* Remove default templates again

edd68f4e

Support dequantizing GGUF FP16 format (#31783) · 1c122a46
Penut Chen authored Jul 24, 2024
```
* support gguf fp16

* support gguf bf16 with pytorch

* add gguf f16 test

* remove bf16
```
1c122a46
Fix float8_e4m3fn in modeling_utils (#32193) · af0e4b7b
Marc Sun authored Jul 24, 2024
```
* Fix float8_e4m3fn in modeling_utils

* style

* fix

* comment
```
af0e4b7b
Fix resize embedding with Deepspeed (#32192) · 1392a686
Raushan Turganbay authored Jul 24, 2024
```
fix resize when deepspeed
```
1392a686

let's not warn when someone is running a forward (#32176) · 8d2534c4

Arthur authored Jul 24, 2024

* let's not warn when someone is running a forward without cache + self.training

* more models

* fixup

8d2534c4

RoPE: relaxed rope validation (#32182) · e0182f3b

Joao Gante authored Jul 24, 2024

* relaxed rope check

* let's also accept rope_type=None, defaulting to the original implementation

* type and rope_type can coexist

e0182f3b

Remove conversational pipeline tests (#32099) · 165116bc
amyeroberts authored Jul 24, 2024
```
Remove conversation pipeline tests
```
165116bc

Update qwen2.md (#32108) · 5f4ee98a

Dr. Artificial曾小健 authored Jul 24, 2024

* Update qwen2.md

outdated description

* Update qwen2.md

amended

* Update qwen2.md

Update

* Update qwen2.md

fix wrong version code, now good to go

5f4ee98a

fix: default value reflects the runtime environment variables rather than the... · 8678879f

조준래 authored Jul 24, 2024

fix: default value reflects the runtime environment variables rather than the ones present at import time. (#32153)

* fix: default value reflects the runtime environment variables rather than the ones present at import time.

* Fix: Change `deterministic` to None by default; use env var if None

8678879f
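
The difference between a default fixed at import time and one resolved at call time can be sketched as follows. The environment variable name and both function names are hypothetical. Only the pattern (default to `None`, read the environment when `None`) comes from the commit message.

```python
import os

ENV_VAR = "EXAMPLE_DETERMINISTIC"  # hypothetical variable name

# Start from a known state so the example behaves the same everywhere.
os.environ.pop(ENV_VAR, None)


def frozen_default(deterministic=os.environ.get(ENV_VAR, "0") == "1"):
    # The default was evaluated once, when this `def` statement ran.
    return deterministic


def runtime_default(deterministic=None):
    # `None` means "not specified": consult the environment on every call.
    if deterministic is None:
        deterministic = os.environ.get(ENV_VAR, "0") == "1"
    return deterministic


# The variable is set only after both functions have been defined.
os.environ[ENV_VAR] = "1"

stale = frozen_default()   # still reflects the environment at definition time
fresh = runtime_default()  # reflects the environment at call time
```

Here `stale` is `False` while `fresh` is `True`, and an explicit argument such as `runtime_default(False)` still overrides the environment.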

adds: extra_repr() to MambaRMSNorm to include hidden size / size of weights in the layer (#32171) · 01be5b48
Rohit Dwivedula authored Jul 24, 2024
```
* adds: extra_repr() to MambaRMSNorm to include the hidden size of the layer

* style fix with ruff:
```
01be5b48

23 Jul, 2024 15 commits

[docs] change temperature to a positive value (#32077) · c85510f9
Fanli Lin authored Jul 24, 2024
```
fix
```
c85510f9
fix: Fixed an if condition that is always evaluating to true (#32160) · bc2adb01
Sai-Suraj-27 authored Jul 23, 2024
```
Fixed an if condition always evaluating to true.
```
bc2adb01
fix (#32162) · 23f6a43f
Joao Gante authored Jul 23, 2024

23f6a43f
Llama 3.1 conversion · d5a99dfc
Lysandre authored Jul 23, 2024
```
Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
```
d5a99dfc
Dev version: v4.44.0.dev0 · ff0d708f
Lysandre authored Jul 23, 2024

ff0d708f

Updated `ruff` to the latest version (#31926) · d2c687b3

Sai-Suraj-27 authored Jul 23, 2024

* Updated ruff version and fixed the required code according to the latest version.

* Updated ruff version and fixed the required code according to the latest version.

* Added noqa directive to ignore 1 error shown by ruff

d2c687b3

Enhancing SFT Training Efficiency Using Packing and FlashAttention2 with Position IDs (#31629) · 9cf4f2aa

RhuiDih authored Jul 23, 2024

* add DataCollatorBatchFlattening

* Update data_collator.py

* change name

* new FA2 flow if position_ids is provided

* add comments

* minor fix

* minor fix data collator

* add test cases for models

* add test case for data collator

* remove extra code

* formatting for ruff check and check_repo.py

* ruff format

ruff format tests src utils

* custom_init_isort.py

9cf4f2aa
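
The idea behind packing with position IDs can be sketched in plain Python. This is a simplified illustration and not the code of `DataCollatorBatchFlattening`: the function names, the returned dictionary layout, and the label-masking rule are assumptions made for the example.

```python
from typing import Dict, List


def flatten_batch(
    examples: List[List[int]], ignore_index: int = -100
) -> Dict[str, List[int]]:
    """Concatenate tokenized examples into one padding-free sequence."""
    input_ids: List[int] = []
    position_ids: List[int] = []
    labels: List[int] = []
    for tokens in examples:
        if not tokens:
            continue  # nothing to pack for an empty example
        input_ids.extend(tokens)
        # Positions restart at 0 for each example, which is how a consumer
        # can recover the sequence boundaries without an attention mask.
        position_ids.extend(range(len(tokens)))
        # Assumed rule: mask the first token of each example so the loss
        # never spans the boundary between two packed examples.
        labels.extend([ignore_index] + tokens[1:])
    return {"input_ids": input_ids, "position_ids": position_ids, "labels": labels}


def sequence_starts(position_ids: List[int]) -> List[int]:
    """Offsets at which a new packed example begins (position id == 0)."""
    return [i for i, pos in enumerate(position_ids) if pos == 0]
```

Packing `[[5, 6, 7], [8, 9]]` this way gives position IDs `[0, 1, 2, 0, 1]`, so the two examples start at offsets 0 and 3 of the flattened sequence.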

Added additional kwarg for successful running of optuna hyperparameter search (#31924) · 7d92009a
Deep Gandhi authored Jul 23, 2024
```
Update integration_utils.py

Added additional kwarg
```
7d92009a

feat(cache): StaticCache uses index_copy_ to avoid useless copy (#31857) · 63700628

Alvaro Moran authored Jul 23, 2024

* feat(cache): StaticCache uses index_copy_ to avoid useless copy

Using index_copy_ allows for explicit in-place change of the tensor.
Some backends (XLA) will otherwise copy the tensor, making the code
slower and using more memory.

Proposed implementation will end up using less memory and on XLA will
result in less compilation, but the change is also quite generic, making
no change whatsoever on CUDA or CPU backend.

* feat(cache): SlidingWindowCache uses index_copy_ to avoid useless copy

Applying the same change done in StaticCache.

* fix(cache): fallback of index_copy_ when not implemented

* fix(cache): in index_copy_ ensure tensors are on same device

* [run slow] llama

* fix(cache): add move of cache_position to same device in SlidingWindowCache

* Revert "[run slow] llama"

This reverts commit 02608dd14253ccd464e31c108e0cd94364f0e8b9.

63700628

Fix typing to be compatible with later py versions (#32155) · a009fbda
amyeroberts authored Jul 23, 2024

a009fbda
Revert "Incorrect Whisper long-form decoding timestamps " (#32148) · 3263b343
Sanchit Gandhi authored Jul 23, 2024
```
Revert "Incorrect Whisper long-form decoding timestamps (#32003)"

This reverts commit cd48553f.
```
3263b343

Rename Phi-3 rope scaling type (#31436) · 034b4778

Amit Garg authored Jul 23, 2024

* renamed phi3 rope_scaling type

* fixed trailing whitespaces

* fixed test

* added warning

* fixed format

034b4778

Added mamba.py backend (#30139) · bab32d6f

Alexandre TL authored Jul 23, 2024



* Update README.md

* tests: forward ok

* backward test done

* done testing

* removed check. scripts

* Update README.md

* added use_mambapy arg

* fixed typo in warning

* protected imports w/ mambapy package

* delete pscan.py + raise rather than assert

* Update import_utils.py

* fix whitespaces and unused import

* trailing whitespace + import block unformatted

* Update modeling_mamba.py

* transpose before pscan

* shape comment

* ran make style

* use_mambapy=False by default
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* ran make fix-copies

---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

bab32d6f

Fix video batching to videollava (#32139) · 9ced33ca
Merve Noyan authored Jul 23, 2024
```
---------
Co-authored-by: Merve Noyan <mervenoyan@Merve-MacBook-Pro.local>
```
9ced33ca
Fix flash attention speed issue (#32028) · a5b226ce
Cyril Vallez authored Jul 23, 2024
```
Add the lru_cache for speed
```
a5b226ce
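
The commit message above only says that an `lru_cache` was added for speed. As a generic illustration of that technique, the sketch below caches a repeated check. The function, its argument, and the call log are invented for the example and are not the code touched by the PR.

```python
from functools import lru_cache

call_log = []  # records each time the function body actually runs


@lru_cache(maxsize=None)
def supports_feature(name: str) -> bool:
    """Hypothetical stand-in for a check that is costly to repeat."""
    call_log.append(name)
    return name in {"flash", "sdpa"}
```

Repeated calls with the same argument are answered from the cache, so the body runs once per distinct `name`. `supports_feature.cache_info()` reports the hit and miss counts, and `supports_feature.cache_clear()` resets the cache.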