Commits · ff73deeb0ed72a54db76ed2e204aaba85066d952 · chenpangpang / transformers

11 Apr, 2023 5 commits
- Remove 2 failing ONNX conversion tests (#22660) · ff73deeb
  Yih-Dar authored Apr 11, 2023
```
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
  ff73deeb
- Clarify stride option (#22684) · 06b05d45
  Luc CAILLIAU authored Apr 11, 2023
```
* Clarify stride option

* formatting
```
  06b05d45
- Enable naive Pipeline Parallelism training for Gpt neox japanese and san japanese (#22702) · 0224aaf6
  Mayank Agarwal authored Apr 11, 2023
```
Move labels to same device as logits
```
  0224aaf6
- Make it easier to develop without a dev install (#22697) · 28c19ab5
  Sylvain Gugger authored Apr 11, 2023
```
* Make it easier to develop without a dev install

* Remove ugly hack that doesn't work anyway
```
  28c19ab5
- Update some `MarkupLM` tests' expected values (#22667) · 4c01231e
  Yih-Dar authored Apr 11, 2023
```
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
  4c01231e
10 Apr, 2023 8 commits

Model parallelism: Moving labels to same devices as the logits are (#22691) · 151425dd
Shahad Mahmud authored Apr 10, 2023
```
Model parallelism correct labels device
```
151425dd

add GPTNeoXForSequenceClassification (#22671) · 6daa9cb5

Sugawara authored Apr 11, 2023

* add GPTNeoXForSequenceClassification

* move the labels to logits.device (ref: #22561)

* fix

6daa9cb5

use __func__ to check can_generate (#22643) · f74b4020
xinhe authored Apr 10, 2023

f74b4020
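
For context, a bound method's `__func__` attribute exposes the plain function underneath it, which can be used to tell whether a subclass has overridden a method inherited from a mixin. The sketch below illustrates that general Python technique only. All class and method names in it (`BaseMixin`, `prepare_inputs`, `can_generate`, `PlainModel`, `GenerativeModel`) are hypothetical and are not taken from this commit or from the `transformers` source.

```python
class BaseMixin:
    """Hypothetical mixin that provides a placeholder method."""

    def prepare_inputs(self):
        raise NotImplementedError("subclasses are expected to override this")

    def can_generate(self):
        # self.prepare_inputs is a bound method; its __func__ attribute is the
        # underlying function object. If that is still the mixin's function,
        # no subclass has overridden it.
        return self.prepare_inputs.__func__ is not BaseMixin.prepare_inputs


class PlainModel(BaseMixin):
    """Hypothetical model that does not override prepare_inputs."""


class GenerativeModel(BaseMixin):
    """Hypothetical model that overrides prepare_inputs."""

    def prepare_inputs(self):
        return {"input_ids": [0]}


plain_result = PlainModel().can_generate()
generative_result = GenerativeModel().can_generate()
print(plain_result, generative_result)  # prints "False True"
```

Comparing `__func__` rather than the bound methods themselves matters because a new bound-method object is created on every attribute access, while the underlying function object stays the same.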
Fix quantization docs typo (#22666) · 14fc1a24
Kirill authored Apr 10, 2023

14fc1a24
Make dynamic code work with offline mode (#22661) · 3876fc68
Sylvain Gugger authored Apr 10, 2023
```
* Make dynamic code work with offline mode

* Clean up

* Quality
```
3876fc68
(feat): Moving labels to same device as logits for Deit (#22679) · 98597725
Shikhar Chauhan authored Apr 10, 2023

98597725
Model parallelism: Moving labels to the same device as logits for BridgeTower models (#22676) · 870d91fb
Shahad Mahmud authored Apr 10, 2023
```
BridgeTower Model parallelism logits device for loss calculation
```
870d91fb

Add GPTBigCode model (Optimized GPT2 with MQA from Santacoder & BigCode) (#22575) · e0921c6b

Joel Lamy-Poirier authored Apr 10, 2023



* Add model with cli tool

* Remove unwanted stuff

* Add new code

* Remove inference runner

* Style

* Fix checks

* Test updates

* make fixup

* fix docs

* fix doc

* fix test

* hopefully fix pipeline tests

* refactor

* fix CIs

* add comment

* rename to `GPTBigCodeForCausalLM`

* correct readme

* make fixup + docs

* make fixup

* fixes

* fixes

* Remove pruning

* Remove import

* Doc updates

* More pruning removal

* Combine copies

* Single MQA implementation, remove kv cache pre-allocation and padding

* Update doc

* Revert refactor to match gpt2 style

* Merge back key and value caches, fix some type hints

* Update doc

* Fix position ids with padding (PR 21080)

* Add conversion script temporarily

* Update conversion script

* Remove checkpoint conversion

* New model

* Fix MQA test

* Fix copies

* try fix tests

* FIX TEST!!

* remove `DoubleHeadsModel`

* add MQA tests

* add slow tests

* clean up

* add CPU checker

* final fixes

* fixes

- fix GPU issue
- fixed slow tests
- skip disk offload

* fix final issue

* Simplify and comment baddbmm fix

* Remove unnecessary code

* Transpose tweaks

* Use beta=1 on cpu, improve tests

---------
Co-authored-by: younesbelkada <younesbelkada@gmail.com>

e0921c6b

07 Apr, 2023 11 commits

moved labels to the same device as logits for BLOOM, GPT Neo, GPT NeoX,... · 656e869a

Arun Brahma authored Apr 08, 2023

moved labels to the same device as logits for BLOOM, GPT Neo, GPT NeoX, RoBERTa and VIT models (#22663)

moved labels to the same device as logits

656e869a

Revert migration of setup to pyproject.toml (#22658) · 6db23af5
Sylvain Gugger authored Apr 07, 2023

6db23af5
Generate: add API warning to streamers (#22659) · 3f96e0b4
Joao Gante authored Apr 07, 2023
```
add API warning
```
3f96e0b4

[OPT] Fix default attention mask size (#22649) · f3341926

Arthur authored Apr 07, 2023

* Fix default attention mask size

* fixup

* add a test to make sure the model works even if no attention mask is provided

* style

f3341926

[tokenization] do not push special file (#22657) · b1b3dc3e

Arthur authored Apr 07, 2023



* do not push special file

* Update src/transformers/tokenization_utils_base.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

---------
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

b1b3dc3e

Small nit, (#22653) · 117a0f6a

Arthur authored Apr 07, 2023

* Small nit,
Fixes #21986

* Update src/transformers/pipelines/__init__.py

117a0f6a

🌐 [i18n-KO] Translated `pipeline_tutorial.mdx` to Korean (#22508) · fc1ba6fd

Wonhyeong Seo authored Apr 08, 2023



docs: feat: Korean pipeline_tutorial
Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com>
Co-authored-by: Hyeonseo Yun <0525_hhgus@naver.com>
Co-authored-by: gabrielwithappy <102908949+gabrielwithappy@users.noreply.github.com>
Co-authored-by: Na Yeon Han <nayeon2.han@gmail.com>

fc1ba6fd

Fix `MegaModel` CI (#22652) · 14d5b2b6

Yih-Dar authored Apr 07, 2023



* fix

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

14d5b2b6

Fix typo (#22650) · f2cc8ffd
Seung-Moo Yang authored Apr 07, 2023

f2cc8ffd
Move labels to the same device as logits for LlamaForSequenceClassification and Blip2 (#22596) · 1de8ce9e
Shikhar Chauhan authored Apr 07, 2023
```
* (feat): Move labels to the same device as logits

* Trigger CI

* Trigger CI

* Trigger CI

* (feat): Making changes for Blip2
```
1de8ce9e
🌐 [i18n-KO] Translate `autoclass_tutorial` to Korean and Fix the typo of `quicktour` (#22533) · d59034ff
gabrielwithappy authored Apr 07, 2023
```
translate the autoclass_tutorial and fix the typo of the quicktour
```
d59034ff

06 Apr, 2023 13 commits

fix FSDP version related issues (#22489) · ee8e80a0
Sourab Mangrulkar authored Apr 07, 2023
```
fix fsdp
```
ee8e80a0

Update tiny model summary file for recent models (#22637) · c7ec71ba

Yih-Dar authored Apr 06, 2023



* Update tiny model summary file for recent models

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

c7ec71ba

[`Blip`] Fix slow tests and doctests with correct values (#22632) · ed672864
Younes Belkada authored Apr 06, 2023
```
fix slow tests and doctests
```
ed672864
LlamaTokenizerFast Fix (.., from_slow=True). (#22630) · 6a02e980
Nicolas Patry authored Apr 06, 2023

6a02e980
[`bnb`] 8bit models should not be converted to `DDP` (#22628) · 09a9888f
Younes Belkada authored Apr 06, 2023
```
add safety checker
```
09a9888f

A script to add/update `pipeline_model_mapping` systematically (#22180) · d0b83fe2

Yih-Dar authored Apr 06, 2023



* Auto. add and update pipeline_model_mapping

* Fix style and quality

* Finalize (comments)

* fix

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

d0b83fe2

update_pip_test_mapping (#22606) · fa01127a

Yih-Dar authored Apr 06, 2023



* Add TFBlipForConditionalGeneration

* update pipeline_model_mapping

* Add import

* Revert changes in GPTSanJapaneseTest

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

fa01127a

docs: Fix broken link to generation strategies (#22623) · 321b0908
Connor Henderson authored Apr 06, 2023
```
fix broken link
```
321b0908
Make tiny model creation + pipeline testing more robust (#22500) · 2c22bc79
Yih-Dar authored Apr 06, 2023
```
* Final Tiny things

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
2c22bc79

Backbone add mixin tests (#22542) · 12d51db2

amyeroberts authored Apr 06, 2023

* Add out_indices to backbones, deprecate out_features

* Update - can specify both out_features and out_indices but not both

* Add backbone mixin tests

* Test tidy up

* Add test_backbone for convnext

* Remove redefinition of method

* Update for Dinat and Nat backbones

* Update tests

* Smarter indexing

* Add checks on config creation for backbone

* PR comments

12d51db2

Seq2SeqTrainer: use unwrapped model to retrieve the generation config (#22584) · 48706c71
Joao Gante authored Apr 06, 2023

48706c71
Revert error back into warning for byte fallback conversion. (#22607) · 0aa1153f
Nicolas Patry authored Apr 06, 2023

0aa1153f

Adding Llama FastTokenizer support. (#22264) · 1670be4b

Nicolas Patry authored Apr 06, 2023

* Adding Llama FastTokenizer support.

- Requires https://github.com/huggingface/tokenizers/pull/1183 version
- Only support byte_fallback for llama, raise otherwise (safety net).
- Lots of questions are special tokens

How to test:

```python

from tokenizers import Tokenizer
from transformers import AutoTokenizer
from transformers.convert_slow_tokenizer import convert_slow_tokenizer

# Reference tokenizer that the converted one is compared against.
tokenizer = AutoTokenizer.from_pretrained("huggingface/llama-7b")

# Set to True to reload a previously converted tokenizer instead of converting again.
LOAD_FROM_FILE = False

if LOAD_FROM_FILE:
    new_tokenizer = Tokenizer.from_file("tok.json")
else:
    new_tokenizer = convert_slow_tokenizer(tokenizer)
    new_tokenizer.save("tok.json")

strings = [
    "This is a test",
    "生活的真谛是",
    "生活的真谛是[MASK]。",
    # XXX: This one is problematic because of special tokens
    # "<s> Something something",
]

for string in strings:
    # Both tokenizers must produce identical token ids.
    encoded = tokenizer(string)["input_ids"]
    encoded2 = new_tokenizer.encode(string).ids

    assert encoded == encoded2, f"{encoded} != {encoded2}"

    # Decoding must give the same text (the reference output is stripped first).
    decoded = tokenizer.decode(encoded)
    decoded2 = new_tokenizer.decode(encoded2)

    assert decoded.strip() == decoded2, f"{decoded!r} != {decoded2!r}"
```

The converter + some test script.

The test script.

Tmp save.

Adding Fast tokenizer + tests.

Adding the tokenization tests.

Correct combination.

Small fix.

Fixing tests.

Fixing with latest update.

Rebased.

fix copies + normalized added tokens  + copies.

Adding doc.

TMP.

Doc + split files.

Doc.

Versions + try import.

Fix Camembert + warnings -> Error.

Fix by ArthurZucker.

Not a decorator.

* Fixing comments.

* Adding more to docstring.

* Doc rewriting.

1670be4b

05 Apr, 2023 3 commits

feat(model parallelism): moving the labels to the same device as the logits... · 15641892
Kaustubh authored Apr 06, 2023
```
feat(model parallelism): moving the labels to the same device as the logits for gpt2 and bart (#22591)
```
15641892
Use native TF checkpoints for the BLIP TF tests (#22593) · e577bd0f
Matt authored Apr 05, 2023
```
* Use native TF checkpoints for the TF tests

* Remove unneeded exceptions
```
e577bd0f

Add DePlot + MatCha on `transformers` (#22528) · 176ceff9

Younes Belkada authored Apr 05, 2023



* add deplot + matcha on `transformers`

* more docs

* correct path

* Update docs/source/en/model_doc/deplot.mdx
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* fix

* use auto processor

* Update docs/source/en/model_doc/matcha.mdx
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* make fixup

* Update docs/source/en/model_doc/deplot.mdx
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* add correct names

---------
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

176ceff9