- 05 Apr, 2023 3 commits
-
-
Sylvain Gugger authored
-
Joao Gante authored
-
Sylvain Gugger authored
-
- 04 Apr, 2023 14 commits
-
-
Matt authored
* Fix inverted conditional in TF common test!
* Make the same change in the PT tests file
* Make sure hidden states for GPT2 have the same output shape in PT/TF
* Minor fix to PT implementation of token classification loss
* Skip loss equivalence test for TFHubert because it keeps overflowing to inf
* Compute LM loss for TF the (weird) way it's computed in PT
* Skip loss equivalence test for Wav2Vec2 for the same reason as Hubert
* Fix - don't try to access the hidden states property when output is a tuple
-
Sourab Mangrulkar authored
-
Shubhamai authored
* initial commit * review changes * post model PR merge * updating doc
-
Sun Haozhe authored
* corrected/clarified the code comment of find_pruneable_heads_and_indices * have run make style
-
Matt authored
* Initial commit
* more stash commit
* Yet another stash commit
* yet more stash commit
* Mostly working except for docs / repo consistency
* Stop importing model list from torch file
* Add TF BLIP models to docs
* Add auto classes
* Move get_text_features and get_image_features
* Apply review suggestions to modeling_tf_blip.py, modeling_tf_blip_text.py, test_modeling_tf_blip.py and test_modeling_tf_blip_text.py (many iterations, co-authored by amyeroberts and Joao Gante)
* Use channels_last convolutions in TF (better performance + compatibility)
* Remove _shape function
* Move multi-line statement to one line in PT + TF
* Specify tf.keras.layers instead of importing from it
* Remove test_gradient_checkpointing and empty test_training methods
* move some multi-line statements to one line
* Update docstring for generate
* Remove pruned heads set
* Remove self.seq_len_dim
* Fixed issues with loss computation, should resolve some tests. Also ensured that the PT version follows the config for output_attentions and output_hidden_states
* ensure original model follows config in more cases
* Skip the same cross-attention tests in the PT tests - didn't realize we did it twice!
* Add training args throughout the models and layers
* make fixup
* Fix docstring for inputs_embeds
* Add docstring for is_decoder
* Add docstrings to text models
* Remove redundant computation
* Add unpack_inputs / keras_serializable
* Add modeling_tf_blip to doctests
* Add config classes for keras serialization
* Changes to allow model porting with pt-to-tf
* Quick fix to decoder head and test tweaks
* Revert an issue with masking the embeddings outputs
* Allow missing keys in some equivalence tests (for unused layers)
* Add tf-pt equivalence tests back in
* Apply review suggestions from Sylvain Gugger to modeling_tf_blip.py and modeling_tf_blip_text.py
* make fixup
* Refactor invert_attention_mask out into tf_utils
* Re-enable cross-tests on the PT side too

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Nicolas Patry authored
* Soft error whisper.
* Fix format.

Co-authored-by: Ubuntu <ubuntu@ip-172-31-34-94.taildb5d.ts.net>
-
Maziyar Panahi authored
Add id2label and label2id to config in run_xnli
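A minimal, stdlib-only sketch of what wiring label maps into a config amounts to; the label names here are the standard XNLI set, and the function name is illustrative rather than the script's actual code:

```python
# Build the id2label / label2id mappings that a fine-tuning script can
# store in the model config so exported checkpoints keep readable labels.
def build_label_mappings(label_list):
    id2label = {i: label for i, label in enumerate(label_list)}
    label2id = {label: i for i, label in id2label.items()}
    return id2label, label2id

# Standard XNLI label set
id2label, label2id = build_label_mappings(["entailment", "neutral", "contradiction"])
```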
-
Younes Belkada authored
Update modeling_utils.py
-
Sylvain Gugger authored
-
Viktor Scherbakov authored
* implemented safetensors save/load
* remove duplicated file
* added tests
* more tests
* style fix
* fix tf tests
* change to list comprehension
* review fixes + safe load for sharded checkpoint
* style fix
* remove rogue import
* remove partial to avoid undefined exception
* use naming alias instead of safetensors.torch
* fix safe sharding in tests
* grammar
* update docs
* update docs
* minor corrections
* style

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
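For context, the file layout this save/load path relies on is simple enough to sketch with the stdlib: an 8-byte little-endian header length, a JSON header describing each tensor, then one contiguous byte buffer. Real code should use the `safetensors` library; the functions below are an illustrative stand-in that stores raw bytes instead of tensors:

```python
import json
import struct

def save_flat(tensors):
    """tensors: dict of name -> (dtype_str, shape, raw_bytes)."""
    header, buffer, offset = {}, b"", 0
    for name, (dtype, shape, data) in tensors.items():
        header[name] = {"dtype": dtype, "shape": shape,
                        "data_offsets": [offset, offset + len(data)]}
        buffer += data
        offset += len(data)
    header_bytes = json.dumps(header).encode("utf-8")
    # 8-byte little-endian header size, then JSON header, then raw buffer
    return struct.pack("<Q", len(header_bytes)) + header_bytes + buffer

def load_flat(blob):
    (n,) = struct.unpack("<Q", blob[:8])
    header = json.loads(blob[8:8 + n])
    body = blob[8 + n:]
    return {name: body[meta["data_offsets"][0]:meta["data_offsets"][1]]
            for name, meta in header.items()}
```

Because the header is plain JSON read with known offsets, loading never executes arbitrary code, which is the safety property motivating the switch away from pickle-based checkpoints.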
-
Arthur authored
* fix the prefix tokens
* update fast and test values
* add legacy behaviour
* update disclaimer, link issue in PR and behavioral changes
* Apply suggestions from code review
* styling
* make a quote
* quote this time

Co-authored-by: sgugger <sylvain.gugger@gmail.com>
Co-authored-by: Lysandre Debut <hi@lysand.re>
-
TheWall9 authored
[Roformer] Fixing a bug in RoFormerEncoder where it was ignoring the length of past_key_values when generating as a decoder (#22416)
* fix RoFormerEncoder position embedding when generating as decoder
* make fixup
* add test case to check generation with past key values
* remove duplicated code
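The idea behind the fix can be sketched in a few lines: when decoding with a cache, position ids for the new tokens must start at the cached length rather than at zero. This is an illustrative stand-in, not the actual RoFormer code:

```python
# Compute position ids for the current decoding step. Without the
# past_key_values_length offset, every cached-generation step would
# reuse position 0 and get the wrong rotary position embedding.
def position_ids_for_step(seq_length, past_key_values_length=0):
    return list(range(past_key_values_length,
                      past_key_values_length + seq_length))
```

For example, the first forward pass over a 3-token prompt uses positions 0..2, and the next single-token step with 3 cached keys must use position 3.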
-
Joao Gante authored
-
Pavel T authored
* Fix OPTForQuestionAnswering doc string for more adequate model answer decoding * black style fix * doc-builder style
-
- 03 Apr, 2023 19 commits
-
-
Younes Belkada authored
-
Sylvain Gugger authored
-
Xuehai Pan authored
* [setup] migrate setup script to `pyproject.toml` * [setup] cleanup configurations * remove unused imports
-
Vladimir Blagojevic authored
-
Xuehai Pan authored
* [setup] drop deprecated `distutils` usage * drop deprecated `distutils.util.strtobool` usage * fix import order * reformat docstring by `doc-builder`
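Since `distutils.util.strtobool` is deprecated (and `distutils` itself is removed in Python 3.12), the usual replacement is a small local helper with the same documented semantics, roughly like this sketch:

```python
# Local replacement for the deprecated distutils.util.strtobool,
# accepting the same truth/false strings, case-insensitively.
def strtobool(value: str) -> bool:
    value = value.lower()
    if value in ("y", "yes", "t", "true", "on", "1"):
        return True
    if value in ("n", "no", "f", "false", "off", "0"):
        return False
    raise ValueError(f"invalid truth value {value!r}")
```

Note one deliberate difference: this returns a real `bool` rather than the `0`/`1` integers the distutils version returned.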
-
Ilya authored
-
Younes Belkada authored
* enable PP for T5 * make fixup * fix failing tests
-
Younes Belkada authored
[`Trainer`] Force `is_model_parallel` when model is loaded in multiple GPUs using `accelerate` (#22532)
* add `is_model_parallel` arg on Trainer
* add warning
* adapt from suggestions
* revert t5 changes
* remove commas
* adapt from suggestions
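The underlying heuristic is easy to sketch: if `accelerate` dispatched the model's modules onto more than one device, the Trainer should treat the model as model-parallel. The function name and the plain-dict `device_map` below are illustrative, not the actual Trainer code:

```python
# A model is effectively model-parallel when its device map places
# modules on more than one distinct device.
def infer_is_model_parallel(device_map):
    return len(set(device_map.values())) > 1
```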
-
zhbh01 authored
-
Thibault Douzon authored
LayoutLMv3TokenizerFast produces an empty '臓' token with `offset_mapping = (0, 0)`. The next token is then wrongly assumed to also be the beginning of a word and isn't correctly assigned `pad_token_label`. Modify the test with text that produces a '臓' token. Remove the copy check from LayoutLMv2TokenizerFast for `_batch_encode_plus`. Solves issue #19978.
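The labeling rule involved can be sketched as follows: only a token that starts a word gets the word's label, while empty tokens (`offset_mapping == (0, 0)`) and subword continuations get the pad label. This is an illustrative simplification, not the tokenizer's actual alignment code:

```python
# Align word-level labels to tokens using word-relative offset mappings.
# An offset of (0, 0) marks an empty/special token; a nonzero start marks
# a subword continuation. Both should receive pad_token_label.
def align_labels(offset_mappings, word_labels, pad_token_label=-100):
    labels, word_idx = [], 0
    for start, end in offset_mappings:
        if (start, end) == (0, 0) or start != 0:
            labels.append(pad_token_label)
        else:
            labels.append(word_labels[word_idx])
            word_idx += 1
    return labels
```

The reported bug corresponds to the `(0, 0)` branch being missed, so the empty token consumed a word label and shifted every later label by one.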
-
Kirill authored
-
larekrow authored
`load_checkpoint()` silently fails because `".qkj_proj." in key` is always `False`, but will eventually cause an error at `model.load_state_dict(state_dict)`.
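The failure mode reads well in miniature: a typo in a substring test means the rename loop never fires, nothing is reported, and the key mismatch only surfaces later at load time. The key names and the corrected substring below are illustrative, not the actual conversion script:

```python
# Rename checkpoint keys matching `old`. With a typo'd `old` (e.g.
# ".qkj_proj."), the condition is always False, every key passes through
# unchanged, and load_state_dict fails much later with missing keys.
def rename_keys(state_dict, old=".qkv_proj.", new=".qkv."):
    return {key.replace(old, new) if old in key else key: value
            for key, value in state_dict.items()}

buggy = rename_keys({"layer.qkv_proj.weight": 1}, old=".qkj_proj.")  # silently a no-op
fixed = rename_keys({"layer.qkv_proj.weight": 1})
```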
-
Joao Gante authored
* haha text go brrr (but in gradio)
-
Mohammed Jabir authored
* added biogpt token classifier
* fix reviews
* Updated modeling_biogpt.py

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
-
Jungnerd authored
docs: ko: sagemaker.mdx
-
Arthur authored
* draft
* update tokenization llama and conversion script
* more updates
* initial commit
* style
* default pad to None
* draft tokenization tests
* update test
* update tokenization tests
* nits
* update
* versioning test
* major fix
* fix more tests
* finish fixing special masks
* last nit
* more nits
* add encode decode tests
* add more
* fix token type ids
* style
-
Eli Simhayev authored
added > 0.5 to `past_observed_mask`
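The change is a one-liner worth spelling out: a float-valued observed mask is thresholded at 0.5 so it behaves as boolean, making exact `1.0`/`0.0` values and near-one floats equivalent. A pure-Python stand-in for the tensor op:

```python
# Convert a float observed-values mask into a boolean mask by
# thresholding at 0.5, rather than relying on truthiness of floats.
def to_bool_mask(past_observed_mask):
    return [value > 0.5 for value in past_observed_mask]
```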
-
amyeroberts authored
* Add out_indices to backbones, deprecate out_features
* Update - can specify both out_features and out_indices but not both
* Can specify both
* Fix copies
* Add out_indices to convnextv2 configuration
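Keeping the two ways of selecting backbone outputs consistent comes down to mapping stage names to indices and back. A hedged sketch of that alignment, with illustrative names rather than the actual config code:

```python
# Keep out_features (stage names) and out_indices (stage positions) in
# sync: derive whichever one the caller did not provide.
def align_output_features_indices(stage_names, out_features=None, out_indices=None):
    if out_features is not None and out_indices is None:
        out_indices = [stage_names.index(name) for name in out_features]
    elif out_indices is not None and out_features is None:
        out_features = [stage_names[i] for i in out_indices]
    return out_features, out_indices

stages = ["stem", "stage1", "stage2", "stage3"]
```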
-
kevinpro authored
-
- 31 Mar, 2023 4 commits
-
-
Sylvain Gugger authored
* Test fetcher v2
* Fix regexes
* Remove sanity check
* Fake modification to OPT
* Fixes some .sep issues
* Remove fake OPT change
* Fake modif for BERT
* Fake modif for init
* Exclude SageMaker tests
* Fix test and remove fake modif
* Fake setup modif
* Fake pipeline modif
* Remove all fake modifs
* Adds options to skip/force tests
* [test-all-models] Fake modif for BERT
* Try this way
* Does the command actually work?
* [test-all-models] Try again!
* [skip circleci] Remove fake modif
* Remove debug statements
* Add the list of important models
* Quality
* Update utils/tests_fetcher.py
* Address review comments
* Address review comments
* Fix and add test
* Apply suggestions from code review
* Address review comments

Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
-
Sabine authored
* update NeptuneCallback docstring
* formatting
* apply make style

Co-authored-by: Aleksander Wojnarowicz <alwojnarowicz@gmail.com>
-
dependabot[bot] authored
Bump redis in /examples/research_projects/decision_transformer

Bumps [redis](https://github.com/redis/redis-py) from 4.5.3 to 4.5.4.
- [Release notes](https://github.com/redis/redis-py/releases)
- [Changelog](https://github.com/redis/redis-py/blob/master/CHANGES)
- [Commits](https://github.com/redis/redis-py/compare/v4.5.3...v4.5.4)

updated-dependencies:
- dependency-name: redis
  dependency-type: direct:production

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
-
Nicolas Patry authored
* Making sure we can use safetensors to serialize all the time.
* Expanding the tests for increased coverage.
* Update the test.
* Getting current state of affairs.
* Tentative fix.
* Fixing black version.
* Fixing the worst offenders.
* Try to modify less files.
* Fixing blip_2 (Weird solution right now).
* Fixing deta.
* Fix blip ?
* Missing extra newline.
* No deta modification.
* Adding some comments.
* Apply suggestions from code review
* Addressing comments.
* Addressing comments.
* creating warn_once.
* Warning_once !

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-