Commits · ea7b0a539a92a79b829cfc7d41d28f33f993e820 · chenpangpang / transformers

17 Apr, 2023 2 commits
- Use code on the Hub from another repo (#22698) · ea7b0a53
  Sylvain Gugger authored Apr 17, 2023
```
* initial work

* Add other classes

* Refactor code

* Move warning and fix dynamic pipeline

* Issue warning when necessary

* Add test
```
  ea7b0a53
- Fix `test_word_time_stamp_integration` for `Wav2Vec2ProcessorWithLMTest` (#22800) · 76d24f1a
  Yih-Dar authored Apr 17, 2023
```
* fix

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
  76d24f1a
14 Apr, 2023 1 commit
- Generate: pin number of beams in BART test (#22763) · 9af845af
  Joao Gante authored Apr 14, 2023
  
  9af845af
13 Apr, 2023 4 commits

Revert (for now) the change on `Deta` in #22437 (#22750) · 410b61ad
Yih-Dar authored Apr 13, 2023
```
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
410b61ad
Generate: handle text conditioning with multimodal encoder-decoder models (#22748) · 9dfd6a4b
Joao Gante authored Apr 13, 2023

9dfd6a4b
`DocumentQuestionAnsweringPipeline` only for fast ⚡ tokenizers (#22745) · 32b08742
Yih-Dar authored Apr 13, 2023
```
* fix

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
32b08742

[Pix2struct] Simplify generation (#22527) · 8eb38f63

NielsRogge authored Apr 13, 2023

* Add model to doc tests

* Remove generate and replace by prepare_inputs_for_generation

* More fixes

* Remove print statements

* Update integration tests

* Fix generate

* Remove model from auto mapping

* Use auto processor

* Fix integration tests

* Fix test

* Add inference code snippet

* Remove is_encoder_decoder

* Update docs

* Remove notebook link

8eb38f63

12 Apr, 2023 4 commits

Fix docstrings for TF BLIP (#22618) · 50f82e12

Matt authored Apr 12, 2023

* Fix docstrings for TFBLIP

* Fix missing line in TF port!

* Use values from torch tests now other bugs fixed

* Use values from torch tests now other bugs fixed

* Fix doctest string

50f82e12

[tests] switch to torchrun (#22712) · 1306b7d3
Stas Bekman authored Apr 12, 2023

1306b7d3

[`bnb`] Let's make serialization of int8 models possible (#22177) · 370f0ca1

Younes Belkada authored Apr 12, 2023



* make serialization of int8 models possible

* make fixup

* add docs

* add ability to push to hub and save pretrained

* fixes

* more addition

* more tests

* fix issues

* change variable

* clearer message

* adapt from suggestions

* few fixes

* remove unused function

* Update src/transformers/utils/quantization_config.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* address last comments

* last warning

* clarify doc

* protect import

* Update src/transformers/modeling_utils.py

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

---------
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

370f0ca1

add model resources for CPMAnt (new) (#20906) · 523ca4e0

pioliverse authored Apr 12, 2023



* resolve conflicts

* rebase and make style

* test

* test

* test

* rebase and make style

* rebase and make style

* tests

* tests

* rewrite some functions

* rebase and make style

* fix load_tf_weights_in_cpmant

* reformat some unrelated files

* upgrade quality

* fix some bugs & docstring

* add models and tests

* solve conflicts

* resolve conflicts

* resolve conflicts

* resolve conflicts

* resolve conflicts

* tests

* resolve conflicts

* resolve conflicts

* fix load_tf_weights_in_cpmant

* reformat some unrelated files

* upgrade quality

* fix some bugs & docstring

* save resolution

* make style

* delete redefinition code

* reformat function

* reformat

* resolve conflicts

* resolve conflicts

* resolve conflicts

* resolve conflicts

* resolve conflicts

* tests

* resolve conflicts

* resolve conflicts

* fix load_tf_weights_in_cpmant

* reformat some unrelated files

* upgrade quality

* resolve conflicts

* resolve conflicts

* resolve conflicts

* resolve conflicts

* resolve conflicts

* fix load_tf_weights_in_cpmant

* reformat some unrelated files

* upgrade quality

* resolve conflicts

* make style

* fix bugs and refactor

* modify docstrings and make style

* unify import format in __init__.py

* fix import-altclp bug

* fix copies to update index.md

* fix unused config parameters

* fix unused config parameters

* fix unused config parameters

* update README_ja.md

* dummy commit for unit test

* fix attention mask

* add CPMAntTokenizer&-Fast to auto-mapping

* drop redundant changes in README_ko

* fix  defaults in docstring

* fix use_cache and some docstring

* add missing args in tokenizer

* modify tester inheritance

* add is_jieba_available

* fix some bugs

* make style and fix-copies

* add doctests

* skip integration tests

* add is_jieba_available

* fix bugs in common tests

* adjust docstrings and make style

* add argument docstring

* adjust code to some specifications

* make style and fix-copies

* add fast tokenization test

* dummy commit for unit test

* dummy commit for unit test

* dummy commit for unit test

* normalize some comments and names

* Bert->CPMAnt

* camel names and drop redundant codes

* make style and fix-coies

* add CpmTokenizerFast _import_structure

* drop cpmanttokenizerfast in model_doc

* fix some problems

* fix CPMAnt tokenization for common test

* make style and fixup

* fix copies and fixup

* fix bugs in tokenization test

* dummy commit for connection failure in unittest

* fix copies

* drop trailing comma

* fix decorator in tests

* dummy commit for connection failure in unittest

---------
Co-authored-by: Gong Baitao <gongbaitao11@gmail.com>

523ca4e0

11 Apr, 2023 2 commits
- Fix decorator order (#22708) · fe1f5a63
  Yih-Dar authored Apr 11, 2023
```
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
  fe1f5a63
- Update some `MarkupLM` tests' expected values (#22667) · 4c01231e
  Yih-Dar authored Apr 11, 2023
```
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
  4c01231e
10 Apr, 2023 3 commits

add GPTNeoXForSequenceClassification (#22671) · 6daa9cb5

Sugawara authored Apr 11, 2023

* add GPTNeoXForSequenceClassification

* move the labels to logits.device (ref: #22561)

* fix

6daa9cb5

Make dynamic code work with offline mode (#22661) · 3876fc68
Sylvain Gugger authored Apr 10, 2023
```
* Make dynamic code work with offline mode

* Clean up

* Quality
```
3876fc68

Add GPTBigCode model (Optimized GPT2 with MQA from Santacoder & BigCode) (#22575) · e0921c6b

Joel Lamy-Poirier authored Apr 10, 2023



* Add model with cli tool

* Remove unwanted stuff

* Add new code

* Remove inference runner

* Style

* Fix checks

* Test updates

* make fixup

* fix docs

* fix doc

* fix test

* hopefully fix pipeline tests

* refactor

* fix CIs

* add comment

* rename to `GPTBigCodeForCausalLM`

* correct readme

* make fixup + docs

* make fixup

* fixes

* fixes

* Remove pruning

* Remove import

* Doc updates

* More pruning removal

* Combine copies

* Single MQA implementation, remove kv cache pre-allocation and padding

* Update doc

* Revert refactor to match gpt2 style

* Merge back key and value caches, fix some type hints

* Update doc

* Fix position ids pith padding (PR 21080)

* Add conversion script temporarily

* Update conversion script

* Remove checkpoint conversion

* New model

* Fix MQA test

* Fix copies

* try fix tests

* FIX TEST!!

* remove  `DoubleHeadsModel`

* add MQA tests

* add slow tests

* clean up

* add CPU checker

* final fixes

* fixes

- fix GPU issue
- fixed slow tests
- skip disk offload

* fix final issue

* Simplify and comment baddbmm fix

* Remove unnecessary code

* Transpose tweaks

* Use beta=1 on cpu, improve tests

---------
Co-authored-by: younesbelkada <younesbelkada@gmail.com>

e0921c6b

07 Apr, 2023 2 commits

[OPT] Fix default attention mask size (#22649) · f3341926

Arthur authored Apr 07, 2023

* Fix default attention mask size

* fixup

* add a test to make sure that even if attention mask are not provided, works

* style

f3341926

Fix `MegaModel` CI (#22652) · 14d5b2b6

Yih-Dar authored Apr 07, 2023



* fix

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

14d5b2b6

06 Apr, 2023 7 commits

Update tiny model summary file for recent models (#22637) · c7ec71ba

Yih-Dar authored Apr 06, 2023



* Update tiny model summary file for recent models

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

c7ec71ba

[`Blip`] Fix slow tests and doctests with correct values (#22632) · ed672864
Younes Belkada authored Apr 06, 2023
```
fix slow tests and doctests
```
ed672864

update_pip_test_mapping (#22606) · fa01127a

Yih-Dar authored Apr 06, 2023



* Add TFBlipForConditionalGeneration

* update pipeline_model_mapping

* Add import

* Revert changes in GPTSanJapaneseTest

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

fa01127a

Make tiny model creation + pipeline testing more robust (#22500) · 2c22bc79
Yih-Dar authored Apr 06, 2023
```
* Final Tiny things

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
2c22bc79

Backbone add mixin tests (#22542) · 12d51db2

amyeroberts authored Apr 06, 2023

* Add out_indices to backbones, deprecate out_features

* Update - can specify both out_features and out_indices but not both

* Add backbone mixin tests

* Test tidy up

* Add test_backbone for convnext

* Remove redefinition of method

* Update for Dinat and Nat backbones

* Update tests

* Smarter indexing

* Add checks on config creation for backbone

* PR comments

12d51db2

Revert error back into warning for byte fallback conversion. (#22607) · 0aa1153f
Nicolas Patry authored Apr 06, 2023

0aa1153f

Adding Llama FastTokenizer support. (#22264) · 1670be4b

Nicolas Patry authored Apr 06, 2023

* Adding Llama FastTokenizer support.

- Requires https://github.com/huggingface/tokenizers/pull/1183 version
- Only support byte_fallback for llama, raise otherwise (safety net).
- Lots of questions are special tokens

How to test:

```python

from transformers.convert_slow_tokenizer import convert_slow_tokenizer
from transformers import AutoTokenizer
from tokenizers import Tokenizer

tokenizer = AutoTokenizer.from_pretrained("huggingface/llama-7b")

if False:
    new_tokenizer = Tokenizer.from_file("tok.json")
else:
    new_tokenizer = convert_slow_tokenizer(tokenizer)
    new_tokenizer.save("tok.json")

strings = [
    "This is a test",
    "生活的真谛是",
    "生活的真谛是[MASK]。",
    # XXX: This one is problematic because of special tokens
    # "<s> Something something",
]

for string in strings:
    encoded = tokenizer(string)["input_ids"]
    encoded2 = new_tokenizer.encode(string).ids

    assert encoded == encoded2, f"{encoded} != {encoded2}"

    decoded = tokenizer.decode(encoded)
    decoded2 = new_tokenizer.decode(encoded2)

    assert decoded.strip() == decoded2, f"{repr(decoded)} != {repr(decoded2)}"
```

The converter + some test script.

The test script.

Tmp save.

Adding Fast tokenizer + tests.

Adding the tokenization tests.

Correct combination.

Small fix.

Fixing tests.

Fixing with latest update.

Rebased.

fix copies + normalized added tokens  + copies.

Adding doc.

TMP.

Doc + split files.

Doc.

Versions + try import.

Fix Camembert + warnings -> Error.

Fix by ArthurZucker.

Not a decorator.

* Fixing comments.

* Adding more to docstring.

* Doc rewriting.

1670be4b

05 Apr, 2023 4 commits
- Use native TF checkpoints for the BLIP TF tests (#22593) · e577bd0f
  Matt authored Apr 05, 2023
```
* Use native TF checkpoints for the TF tests

* Remove unneeded exceptions
```
  e577bd0f
- Fix PT-TF equivalence test for GPT1 (#22586) · 2a91a9ef
  Matt authored Apr 05, 2023
```
* Re-enable skipped test and fix the hidden state shape issue

* Actually fix the bug instead of just doing something wrong
```
  2a91a9ef
- Generate: `TextIteratorStreamer` timeout (#22576) · 861ff890
  Joao Gante authored Apr 05, 2023
  
  861ff890
- Skip failing test · 11fd2c77
  Sylvain Gugger authored Apr 04, 2023
  
  11fd2c77
04 Apr, 2023 8 commits

Fix inverted conditional in TF common test! (#22540) · edb704b2

Matt authored Apr 04, 2023

* Fix inverted conditional in TF common test!

* Make the same change in the PT tests file

* Make sure hidden states for GPT2 have the same output shape in PT/TF

* Minor fix to PT implementation of token classification loss

* Skip loss equivalence test for TFHubert because it keeps overflowing to inf

* Compute LM loss for TF the (weird) way it's computed in PT

* Skip loss equivalence test for Wav2Vec2 for the same reason as Hubert

* Fix - don't try to access the hidden states property when output is a tuple

edb704b2

Flax Regnet (#21867) · 90067748

Shubhamai authored Apr 04, 2023

* initial commit

* review changes

* post model PR merge

* updating doc

90067748

Add TF port of BLIP (#22090) · 5f3ea66b

Matt authored Apr 04, 2023



* Initial commit

* more stash commit

* Yet another stash commit

* yet more stash commit

* Mostly working except for docs / repo consistency

* Stop importing model list from torch file

* Add TF BLIP models to docs

* Add auto classes

* Move get_text_features and get_image_features

* Update src/transformers/models/blip/modeling_tf_blip.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/blip/modeling_tf_blip.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/blip/modeling_tf_blip.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/blip/modeling_tf_blip_text.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/blip/modeling_tf_blip.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/blip/modeling_tf_blip.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/blip/modeling_tf_blip.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/blip/modeling_tf_blip.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/blip/modeling_tf_blip.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update tests/models/blip/test_modeling_tf_blip.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update tests/models/blip/test_modeling_tf_blip.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/blip/modeling_tf_blip.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/blip/modeling_tf_blip.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update tests/models/blip/test_modeling_tf_blip_text.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/blip/modeling_tf_blip_text.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update src/transformers/models/blip/modeling_tf_blip.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Use channels_last convolutions in TF (better performance + compatibility)

* Remove _shape function

* Move multi-line statement to one line in PT + TF

* Specify tf.keras.layers instead of importing from it

* Remove test_gradient_checkpointing and empty test_training methods

* move some multi-line statements to one line

* Update docstring for generate

* Remove pruned heads set

* Remove self.seq_len_dim

* Fixed issues with loss computation, should resolve some tests. Also ensured that the PT version follows the config for output_attentions and output_hidden_states

* ensure original model follows config in more cases

* Skip the same cross-attention tests in the PT tests - didn't realize we did it twice!

* Add training args throughout the models and layers

* make fixup

* Fix docstring for inputs_embeds

* Add docstring for is_decoder

* Add docstrings to text models

* Remove redundant computation

* Add unpack_inputs / keras_serializable

* Add modeling_tf_blip to doctests

* Add config classes for keras serialization

* Changes to allow model porting with pt-to-tf

* Quick fix to decoder head and test tweaks

* Revert an issue with masking the embeddings outputs

* Allow missing keys in some equivalence tests (for unused layers)

* Add tf-pt equivalence tests back in

* Update src/transformers/models/blip/modeling_tf_blip.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/blip/modeling_tf_blip_text.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/blip/modeling_tf_blip_text.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* make fixup

* Refactor invert_attention_mask out into tf_utils

* Re-enable cross-tests on the PT side too

---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

5f3ea66b

Soft error whisper. (#22475) · a515d0a7

Nicolas Patry authored Apr 04, 2023



* Soft error whisper.

* Fix format.

---------
Co-authored-by: Ubuntu <ubuntu@ip-172-31-34-94.taildb5d.ts.net>

a515d0a7

Implemented safetensors checkpoints save/load for Trainer (#22498) · 871598be

Viktor Scherbakov authored Apr 04, 2023



* implemented safetensors save/load

* remove duplicated file

* added tests

* more tests

* style fix

* fix tf tests

* change to list comprehension
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* review fixes + safe load for sharded checkpoint

* style fix

* remove rogue import

* remove partial to avoid undefined exception

* use naming alias instead of safetensors.torch

* fix safe sharding in tests

* grammar
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* update docs
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* update docs
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* minor corrections

* style

---------
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

871598be

🚨

`[NLLB Tokenizer]` Fix the prefix tokens

🚨

(#22313) · 00b5887b

Arthur authored Apr 04, 2023



* fix the prefix tokens

* update fast and test values

* add legacy behaviour
Co-authored-by: sgugger <sylvain.gugger@gmail.com>

* update disclaimer, linkissue PR and behaviral changes

* Apply suggestions from code review
Co-authored-by: Lysandre Debut <hi@lysand.re>

* styling

* make a quote

* quote this time

---------
Co-authored-by: sgugger <sylvain.gugger@gmail.com>
Co-authored-by: Lysandre Debut <hi@lysand.re>

00b5887b

[Roformer] Fixing a bug in RoFormerEncoder where it was ignoring the length of... · ad5e9b6c

TheWall9 authored Apr 04, 2023

[Roformer] Fixing a bug in RoFormerEncoder where it was ignoring the length of past_key_values when generating as a decoder (#22416)

* fix RoFormerEncoder postion embedding when generate as decoder

* make fixup

* add test case for check generate with past key values

* remove duplicating code

ad5e9b6c

Generate: Add text streamer decoding options (#22544) · 1905384f
Joao Gante authored Apr 04, 2023

1905384f

03 Apr, 2023 3 commits

Update test_image_processing_pix2struct.py (#22543) · 159ff334
Younes Belkada authored Apr 03, 2023

159ff334
Skip failing test · c14d3129
Sylvain Gugger authored Apr 03, 2023

c14d3129

fix LayoutLMv3TokenizerFast subword label after 'Ġ' token (#21695) · 4e441e52

Thibault Douzon authored Apr 03, 2023

LayoutLMv3TokenizerFast produces empty 'Ġ' token with `offset_mapping = (0, 0)`.
Next token is wrongly assumed to also be beginning of word and isn't
correctly assigned `pad_token_label`.
Modify test with text that produce 'Ġ' token.
Remove copy check from LayoutLMv2TokenizerFast for `_batch_encode_plus`.

solves issue: #19978

4e441e52