1. 27 Mar, 2024 5 commits
    • Reimplement "Automatic safetensors conversion when lacking these files" (#29846) · 4d8427f7
      Lysandre Debut authored
      * Automatic safetensors conversion when lacking these files (#29390)
      
      * Automatic safetensors conversion when lacking these files
      
      * Remove debug
      
      * Thread name
      
      * Typo
      
      * Ensure that raised exceptions do not affect the main thread
      
      * Catch all errors
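
      The point of the bullets above is isolating the conversion work on a named background thread and catching every exception there, so a failed conversion can never break the caller's loading path. A minimal sketch of that pattern, assuming a hypothetical `_convert_to_safetensors` entry point (the thread name is likewise illustrative, not the PR's actual API):

      ```python
      import logging
      import threading

      logger = logging.getLogger(__name__)


      def _convert_to_safetensors(model_id: str) -> None:
          # Hypothetical stand-in for the PyTorch-bin -> safetensors conversion.
          ...


      def start_background_conversion(model_id: str) -> threading.Thread:
          def target() -> None:
              try:
                  _convert_to_safetensors(model_id)
              except Exception:
                  # "Catch all errors": a failed conversion must never crash
                  # the main thread; log it and move on.
                  logger.info("Could not convert %s to safetensors", model_id, exc_info=True)

          # "Thread name": a descriptive name makes the worker easy to spot in debuggers.
          thread = threading.Thread(target=target, name="Thread-autoconversion")
          thread.start()
          return thread
      ```
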
    • Fix #29807, sinusoidal positional encodings overwritten by post_init() (#29813) · a81cf9ee
      Hovnatan Karapetyan authored
      * Check for requires_grad when initializing weights
      
      * Add unit test
      
      * Move sinusoidal positional encoding generation after post_init()
      
      * Add modules to skip init list
      
      * Move create_sinusoidal_embeddings to _init_weights
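
      The failure mode behind this fix: `post_init()` calls `_init_weights`, which re-randomizes whatever `__init__` already wrote, so a sin/cos table created up front gets clobbered. A hedged sketch of such a table builder, filled in place so it can run from `_init_weights` (the signature mirrors common practice, not necessarily this PR's exact code):

      ```python
      import math

      import torch


      def create_sinusoidal_embeddings(n_pos: int, dim: int, out: torch.Tensor) -> None:
          # Fill `out` in place with the fixed sin/cos table. Because the values
          # are deterministic, recreating them inside _init_weights means
          # post_init() writes the correct table instead of overwriting it
          # with random initialization.
          position_enc = torch.tensor(
              [[pos / math.pow(10000, 2 * (j // 2) / dim) for j in range(dim)] for pos in range(n_pos)],
              dtype=torch.float,
          )
          with torch.no_grad():
              out[:, 0::2] = torch.sin(position_enc[:, 0::2])
              out[:, 1::2] = torch.cos(position_enc[:, 1::2])
      ```
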
    • Mamba `slow_forward` gradient fix (#29563) · cefb819f
      Anton Vlasjuk authored
      * FIX: Cached slow forward in mamba
      - additionally added mamba cached test
      - added unused test (mamba causal lm forward and backward)
      - fixed typo: "causl" --> "causal"
      
      * formatting
      
      * fix: use real `slow_forward` call instead of torch module's
      
      * add shape assertion for mixer block test
      
      * adjust shape assertion
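
      A hedged sketch of the kind of forward-and-backward regression test described above, assuming the standard Hugging Face causal-LM interface (`use_cache`, `outputs.logits`): if the cached slow path detaches its states, gradients silently stop flowing, and the assertion below catches that.

      ```python
      import torch


      def check_backward_through_cached_forward(model, input_ids: torch.Tensor) -> None:
          # Run a cached forward pass, backprop a scalar loss, and confirm
          # that gradients actually reach the trainable parameters.
          model.train()
          outputs = model(input_ids, use_cache=True)
          loss = outputs.logits.float().mean()
          loss.backward()
          assert any(
              p.grad is not None and p.grad.abs().sum() > 0
              for p in model.parameters()
              if p.requires_grad
          ), "no gradient reached the model parameters"
      ```
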
    • Add Qwen2MoE (#29377) · 1c39974a
      Bo Zheng authored
      * add support for qwen2 MoE models
      
      * update docs
      
      * update model name & test
      
      * update readme
      
      * update class names & readme & model_doc of Qwen2MoE.
      
      * update architecture name
      
      * fix qwen2_moe tests
      
      * use Qwen2Tokenizer instead of Qwen2MoeTokenizer
      
      * update modeling_qwen2_moe.py
      
      * fix model architecture
      
      * fix style
      
      * fix test when there are sparse and non-sparse layers
      
      * fixup
      
      * Update README.md
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * fixup
      
      * fixup
      
      * add archive back
      
      * fix integration test
      
      * fixup
      
      ---------
      Co-authored-by: bozheng-hit <dsoul0621@gmail.com>
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
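
      Context for the "sparse and non-sparse layers" bullet: Qwen2MoE interleaves ordinary dense MLP layers with mixture-of-experts layers, where a learned gate picks the top-k experts for each token. A minimal sketch of such top-k routing (names and shapes are illustrative, not the modeling_qwen2_moe.py implementation):

      ```python
      import torch
      import torch.nn as nn
      import torch.nn.functional as F


      class TopKRouter(nn.Module):
          # Per-token gate of a sparse MoE block: score every expert, keep the
          # top-k, and renormalize so each token's expert mix sums to 1.
          def __init__(self, hidden_size: int, num_experts: int, top_k: int):
              super().__init__()
              self.gate = nn.Linear(hidden_size, num_experts, bias=False)
              self.top_k = top_k

          def forward(self, hidden_states: torch.Tensor):
              # hidden_states: (num_tokens, hidden_size)
              router_logits = self.gate(hidden_states)               # (num_tokens, num_experts)
              probs = F.softmax(router_logits, dim=-1)
              weights, selected_experts = torch.topk(probs, self.top_k, dim=-1)
              weights = weights / weights.sum(dim=-1, keepdim=True)  # renormalize over chosen experts
              return weights, selected_experts
      ```
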
    • Support `num_attention_heads` != `num_key_value_heads` in Flax Llama Implementation (#29557) · 8e08acad
      Benjamin Minixhofer authored
      * fix tinyllama flax modelling
      
      * rename vars to minimize changes
      
      * move
      
      * formatting
      
      * remove unused var
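
      When `num_key_value_heads` is smaller than `num_attention_heads` (grouped-query attention, as in TinyLlama), each key/value head is shared by a group of query heads. A hedged sketch of the usual trick, tiling the KV heads up to the query-head count (the function name and layout follow common practice, not necessarily this PR's code):

      ```python
      import jax.numpy as jnp


      def repeat_kv(hidden: jnp.ndarray, n_rep: int) -> jnp.ndarray:
          # hidden: (batch, seq_len, num_kv_heads, head_dim).
          # Tile each KV head n_rep times so the result lines up with the
          # query heads: num_kv_heads * n_rep == num_attention_heads.
          if n_rep == 1:
              return hidden
          batch, seq_len, num_kv_heads, head_dim = hidden.shape
          hidden = jnp.broadcast_to(
              hidden[:, :, :, None, :],
              (batch, seq_len, num_kv_heads, n_rep, head_dim),
          )
          return hidden.reshape(batch, seq_len, num_kv_heads * n_rep, head_dim)
      ```
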
  2. 26 Mar, 2024 8 commits
  3. 25 Mar, 2024 6 commits
  4. 24 Mar, 2024 1 commit
    • model_summary.md - Restore link to Harvard's Annotated Transformer. (#29702) · 76a33a10
      gamepad_coder authored
      * model_summary.md - Add link to Harvard's Annotated Transformer.
      
      * model_summary.md - slight wording change + capitalize name of the paper
      
      * model_summary.md - moves the Annotated Transformer link into parentheses next to the link to the original paper (great idea, stevhliu!)
      
      * model_summary.md - moves the Annotated Transformer link into parentheses next to the link to the original paper (commit pt. 2, accidentally removed "has" in pt. 1)
  5. 23 Mar, 2024 1 commit
  6. 22 Mar, 2024 10 commits
  7. 21 Mar, 2024 9 commits