Commits · 06e782da4e58f93a60c6bedc84b5991abaae58f5 · chenpangpang / transformers

25 Oct, 2023 1 commit

[`core`] Refactor of `gradient_checkpointing` (#27020) · 06e782da

Younes Belkada authored Oct 25, 2023

* v1

* fix

* remove `create_custom_forward`

* fixup

* fixup

* add test and fix all failing GC tests

* remove all remaining `create_custom_forward` methods

* fix idefics bug

* fixup

* replace with `__call__`

* add comment

* quality

06e782da

25 Aug, 2023 1 commit

Add type hints for several pytorch models (batch-3) (#25705) · 4d9e45f3

David Reguera authored Aug 25, 2023

* Add missing type hints for ErnieM family

* Add missing type hints for EsmForProteinFolding model

* Add missing type hints for Graphormer model

* Add type hints for InstructBlipQFormer model

* Add missing type hints for LayoutLMForMaskedLM model

* Add missing type hints for LukeForEntitySpanClassification model

4d9e45f3

08 Aug, 2023 1 commit

Fix `test_model_parallelism` (#25359) · 6ea3ee3c

Yih-Dar authored Aug 08, 2023



* fix

* fix

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

6ea3ee3c

04 Aug, 2023 1 commit

Move usage of deprecated logging.warn to logging.warning (#25310) · 67683095

Peter Law authored Aug 04, 2023

The former spelling is deprecated and has been discouraged for a
while. The latter spelling seems to be more common in this project
anyway, so this change ought to be safe.

Fixes https://github.com/huggingface/transformers/issues/25283

67683095

02 Aug, 2023 1 commit

Fix return_dict_in_generate bug in InstructBlip generate function (#25246) · 1baeed5b

Euan Ong authored Aug 02, 2023

Fix bug in InstructBlip generate function

Previously, the postprocessing conducted on generated sequences in InstructBlip's generate function assumed these sequences were tensors (i.e. that `return_dict_in_generate == False`).

This commit checks whether the result of the call to the wrapped language model `generate()` is a tensor, and if not attempts to postprocess the sequence attribute of the returned results object.

1baeed5b

21 Jul, 2023 1 commit

fix: cast input pixels to appropriate dtype for image_to_text pipelines (#24947) · 83f9314d

Jim Allanson authored Jul 21, 2023

* fix: cast input pixels to appropriate dtype for image_to_text tasks

* fix: add casting to pixel inputs of additional models after running copy checks

83f9314d

18 Jul, 2023 1 commit
- [`InstructBlip`] Fix int8/fp4 issues (#24888) · a9e067a4
  Younes Belkada authored Jul 18, 2023
```
* fix dtype issue

* revert `.float()`

* fix copies
```
  a9e067a4
11 Jul, 2023 1 commit
- [InstructBLIP] Fix bos token of LLaMa checkpoints (#24492) · bb13a928
  NielsRogge authored Jul 11, 2023
```
* Add fix

* Fix doctest
```
  bb13a928
28 Jun, 2023 2 commits
- [`InstructBlip`] Add instruct blip int8 test (#24555) · 33b5ef5c
  Younes Belkada authored Jun 28, 2023
```
* add 8bit instructblip test

* update tests
```
  33b5ef5c
- Finishing tidying keys to ignore on load (#24535) · 89b6ee49
  Sylvain Gugger authored Jun 27, 2023
  
  89b6ee49
26 Jun, 2023 2 commits

[`InstructBlip`] Add accelerate support for instructblip (#24488) · 9895670e

Younes Belkada authored Jun 26, 2023

* add accelerate support for instructblip

* add `_keep_in_fp32_modules`

* dynamically adapt `_no_split_modules`

* better fix

* same logic for `_keep_in_fp32_modules`

9895670e

Add InstructBLIP (#23460) · 868363ab

NielsRogge authored Jun 26, 2023



* Squash 88 commits

* Use markdown

* Remove mdx files due to bad rebase

* Fix modeling files due to bad rebase

* Fix style

* Update comment

* fix

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

868363ab

22 Jun, 2023 1 commit
- Revert "Fix gradient checkpointing + fp16 autocast for most models" (#24420) · 3ce3385c
  Younes Belkada authored Jun 22, 2023
```
Revert "Fix gradient checkpointing + fp16 autocast for most models (#24247)"

This reverts commit 285a4801.
```
  3ce3385c
21 Jun, 2023 1 commit

Fix gradient checkpointing + fp16 autocast for most models (#24247) · 285a4801

Younes Belkada authored Jun 21, 2023



* fix gc bug

* continue PoC on OPT

* fixes

* :exploding_head:

* fix tests

* remove pytest.mark

* fixup

* forward contrib credits from discussions

* forward contrib credits from discussions

* reverting changes on untouched files.

---------
Co-authored-by: zhaoqf123 <zhaoqf123@users.noreply.github.com>
Co-authored-by: 7eu7d7 <7eu7d7@users.noreply.github.com>

285a4801

13 Jun, 2023 1 commit

Tied params cleanup (#24211) · 695928e1

Sylvain Gugger authored Jun 13, 2023

* First test

* Add info for all models

* style

* Repo consistency

* Fix last model and cleanup prints

* Repo consistency

* Use consistent function for detecting tied weights

695928e1

31 May, 2023 1 commit
- Skip device placement for past key values in decoder models (#23919) · fabe17a7
  Sylvain Gugger authored May 31, 2023
  
  fabe17a7
23 May, 2023 1 commit

Fix some docs what layerdrop does (#23691) · 003a0cf8

zspo authored May 24, 2023



* Fix some docs what layerdrop does

* Update src/transformers/models/data2vec/configuration_data2vec_audio.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Fix more docs

---------
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

003a0cf8

01 May, 2023 1 commit
- Fix string syntax error in logger warning message (additional comma) (#23083) · 549e5f9f
  Xin Wen authored May 01, 2023
  
  549e5f9f
07 Apr, 2023 1 commit
- Move labels to the same device as logits for LlamaForSequenceClassification and Blip2 (#22596) · 1de8ce9e
  Shikhar Chauhan authored Apr 07, 2023
```
* (feat): Move labels to the same device as logits

* Trigger CI

* Trigger CI

* Trigger CI

* (feat): Making changes for Blip2
```
  1de8ce9e
04 Apr, 2023 1 commit

Add TF port of BLIP (#22090) · 5f3ea66b

Matt authored Apr 04, 2023



* Initial commit

* more stash commit

* Yet another stash commit

* yet more stash commit

* Mostly working except for docs / repo consistency

* Stop importing model list from torch file

* Add TF BLIP models to docs

* Add auto classes

* Move get_text_features and get_image_features

* Update src/transformers/models/blip/modeling_tf_blip.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/blip/modeling_tf_blip.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/blip/modeling_tf_blip.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/blip/modeling_tf_blip_text.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/blip/modeling_tf_blip.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/blip/modeling_tf_blip.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/blip/modeling_tf_blip.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/blip/modeling_tf_blip.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/blip/modeling_tf_blip.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update tests/models/blip/test_modeling_tf_blip.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update tests/models/blip/test_modeling_tf_blip.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/blip/modeling_tf_blip.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/blip/modeling_tf_blip.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update tests/models/blip/test_modeling_tf_blip_text.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/blip/modeling_tf_blip_text.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update src/transformers/models/blip/modeling_tf_blip.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Use channels_last convolutions in TF (better performance + compatibility)

* Remove _shape function

* Move multi-line statement to one line in PT + TF

* Specify tf.keras.layers instead of importing from it

* Remove test_gradient_checkpointing and empty test_training methods

* move some multi-line statements to one line

* Update docstring for generate

* Remove pruned heads set

* Remove self.seq_len_dim

* Fixed issues with loss computation, should resolve some tests. Also ensured that the PT version follows the config for output_attentions and output_hidden_states

* ensure original model follows config in more cases

* Skip the same cross-attention tests in the PT tests - didn't realize we did it twice!

* Add training args throughout the models and layers

* make fixup

* Fix docstring for inputs_embeds

* Add docstring for is_decoder

* Add docstrings to text models

* Remove redundant computation

* Add unpack_inputs / keras_serializable

* Add modeling_tf_blip to doctests

* Add config classes for keras serialization

* Changes to allow model porting with pt-to-tf

* Quick fix to decoder head and test tweaks

* Revert an issue with masking the embeddings outputs

* Allow missing keys in some equivalence tests (for unused layers)

* Add tf-pt equivalence tests back in

* Update src/transformers/models/blip/modeling_tf_blip.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/blip/modeling_tf_blip_text.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/blip/modeling_tf_blip_text.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* make fixup

* Refactor invert_attention_mask out into tf_utils

* Re-enable cross-tests on the PT side too

---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

5f3ea66b

31 Mar, 2023 1 commit

Making sure we can use safetensors to serialize all the time. (#22437) · d143087d

Nicolas Patry authored Mar 31, 2023



* Making sure we can use safetensors to serialize all the time.

* Expanding the tests for increased coverage.

* Update the test.

* Getting current state of affairs.

* Tentative fix.

* Fixing black version.

* Fixing the worst offenders.

* Try to modify less files.

* Fixing blip_2 (Weird solution right now).

* Fixing deta.

* Fix blip ?

* Missing extra newline.

* No deta modification.

* Adding some comments.

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Addressing comments.

* Addressing comments.

* creating warn_once.

* Warning_once !

---------
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

d143087d

28 Feb, 2023 2 commits

[`Blip2`] Fix Blip-2 multi gpu (#21707) · 7f4f8b97

Younes Belkada authored Feb 28, 2023



* fix blip multi gpu

* fix

* final changes

* adapt suggestions

* fix failing slow test

* forward contrib credits from testing and suggestions

* reformat

---------
Co-authored-by: akkikiki <akkikiki@users.noreply.github.com>

7f4f8b97

[`Blip2`] Add `Blip2Model` (#21817) · b8de7e44

Younes Belkada authored Feb 28, 2023

* add v1

* add `Blip2Model`

- add relevant functions
- add tests
- add on automapping

* fix docs

* fix doctest

b8de7e44

27 Feb, 2023 1 commit

Fix quality with `ruff==0.0.253` (#21828) · f95f60c8

Yih-Dar authored Feb 27, 2023



fix quality with ruff 0.0.253
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

f95f60c8

10 Feb, 2023 1 commit
- [`Blip2`] Add int8 support for `blip2-flan-t5-xxl` (#21574) · 75a208ef
  Younes Belkada authored Feb 10, 2023
```
add int8 support
```
  75a208ef
09 Feb, 2023 1 commit

Add BLIP-2 (#21441) · d7f1e7c0

NielsRogge authored Feb 09, 2023



* First draft

* More improvements

* More improvements

* Improve conversion script

* Convert all weights

* Make forward pass work

* Make logits match

* More improvements

* More improvements

* More improvements

* Use get_input_embeddings

* Improve some more

* Improve model tests

* Improve model tests

* More improvements

* Fix processor

* Update files

* Update prepare_inputs_for_generation

* More improvements

* Fix copies

* More fixes

* Make fixup

* More improvements

* Add support for seq2seq language model

* More improvements

* Fix test

* More improvements

* Improve conversion script

* Remove some todo's

* Fix README's

* Improve conversion script

* Fix generation

* Fix style and remove Blip2Model

* Fix model outputs

* More improvements

* Set eos_token_id in config

* Fix quality

* Small improvements

* Add processor tests

* More improvements

* Apply suggestions

* Apply suggestions

* Add integration test

* Update image URL

* Add integration test

* Fix model_type

* Update style

* Improve docs

* Add doc tests

* Fix copies

* Remove tests which are passing

* Improve some more

* Add tests for seq2seq language models

* Minor fix

* Convert more checkpoints

* finalize CI

* Fix blip and blip2 processors

* add `accelerate` support for `blip2`

* clean up

* make style

* Update conversion script

* Update conversion script some more

* Update organization

* revert toc file

* add blip-2 to toc file

* Some more improvements

* Fix docstring

* Improve docs

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: younesbelkada <younesbelkada@gmail.com>

d7f1e7c0