Commits · d85bf954361f39f2ea38386940f40d29ed201910 · chenpangpang / transformers
13 Apr, 2023 9 commits
- [trainer] update url (#22747) · d85bf954
  Stas Bekman authored Apr 13, 2023
```
* [trainer] update url

* style
```
  d85bf954
- `DocumentQuestionAnsweringPipeline` only for fast ⚡ tokenizers (#22745) · 32b08742
  Yih-Dar authored Apr 13, 2023
```
* fix

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
  32b08742
- Change `torch_dtype` to `str` when `saved_model=True` in `save_pretrained` for TF models (#22740) · 7df13432
  Yih-Dar authored Apr 13, 2023
```
* fix

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
  7df13432
- [Pix2struct] Simplify generation (#22527) · 8eb38f63
  NielsRogge authored Apr 13, 2023
```
* Add model to doc tests

* Remove generate and replace by prepare_inputs_for_generation

* More fixes

* Remove print statements

* Update integration tests

* Fix generate

* Remove model from auto mapping

* Use auto processor

* Fix integration tests

* Fix test

* Add inference code snippet

* Remove is_encoder_decoder

* Update docs

* Remove notebook link
```
  8eb38f63
- Make vilt, switch_transformers compatible with model parallelism (#22703) · 95e70575
  Rinat authored Apr 13, 2023
```
* Update modeling_vilt.py

Vilt compatible with model parallelism

* Update modeling_switch_transformers.py

switch_transformers compatible with model parallelism
```
  95e70575
- Indexing fix for gpt_bigcode (#22737) · 89087597
  Joel Lamy-Poirier authored Apr 13, 2023
```
Fix indexing
```
  89087597
- [Doctest] Add configuration_mvp.py (#22735) · 7ade6ef7
  Elabonga Atuo authored Apr 13, 2023
```
* added configuration file for mvp model

* added configuration_mvp.py line to file
```
  7ade6ef7
- [Doctest] Add configuration_m2m_100.py (#22733) · 51007976
  Elabonga Atuo authored Apr 13, 2023
```
m2m-100-config for doctest
```
  51007976
- v4.29.0.dev0 · 888c4a2a
  Sylvain Gugger authored Apr 12, 2023
  
  888c4a2a
12 Apr, 2023 8 commits

Fix docstrings for TF BLIP (#22618) · 50f82e12

Matt authored Apr 12, 2023

* Fix docstrings for TFBLIP

* Fix missing line in TF port!

* Use values from torch tests now other bugs fixed

* Use values from torch tests now other bugs fixed

* Fix doctest string

50f82e12

Update warning levels (#22727) · ce06e478

NielsRogge authored Apr 12, 2023

* Use different level

* Remove futurewarning

* Use warning_once

* Update copies

ce06e478

add fast support and option (#22724) · 98581954

Arthur authored Apr 12, 2023



* add fast support and option

* update based on review

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/llama/convert_llama_weights_to_hf.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* nit

* add print

* fixup

---------
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

98581954

`torch.distributed` group initialization for `torch_neuron` disabled when `optimum-neuron` is installed (#22728) · 10fab90f

Michael Benayoun authored Apr 12, 2023

* Make the process group initialization not happen if optimum_neuron is installed

* Add warning

* Remove list and added warning

10fab90f

[`bnb`] Let's make serialization of int8 models possible (#22177) · 370f0ca1

Younes Belkada authored Apr 12, 2023



* make serialization of int8 models possible

* make fixup

* add docs

* add ability to push to hub and save pretrained

* fixes

* more addition

* more tests

* fix issues

* change variable

* clearer message

* adapt from suggestions

* few fixes

* remove unused function

* Update src/transformers/utils/quantization_config.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* address last comments

* last warning

* clarify doc

* protect import

* Update src/transformers/modeling_utils.py

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

---------
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

370f0ca1

add model resources for CPMAnt (new) (#20906) · 523ca4e0

pioliverse authored Apr 12, 2023



* resolve conflicts

* rebase and make style

* test

* test

* test

* rebase and make style

* rebase and make style

* tests

* tests

* rewrite some functions

* rebase and make style

* fix load_tf_weights_in_cpmant

* reformat some unrelated files

* upgrade quality

* fix some bugs & docstring

* add models and tests

* solve conflicts

* resolve conflicts

* resolve conflicts

* resolve conflicts

* resolve conflicts

* tests

* resolve conflicts

* resolve conflicts

* fix load_tf_weights_in_cpmant

* reformat some unrelated files

* upgrade quality

* fix some bugs & docstring

* save resolution

* make style

* delete redefinition code

* reformat function

* reformat

* resolve conflicts

* resolve conflicts

* resolve conflicts

* resolve conflicts

* resolve conflicts

* tests

* resolve conflicts

* resolve conflicts

* fix load_tf_weights_in_cpmant

* reformat some unrelated files

* upgrade quality

* resolve conflicts

* resolve conflicts

* resolve conflicts

* resolve conflicts

* resolve conflicts

* fix load_tf_weights_in_cpmant

* reformat some unrelated files

* upgrade quality

* resolve conflicts

* make style

* fix bugs and refactor

* modify docstrings and make style

* unify import format in __init__.py

* fix import-altclp bug

* fix copies to update index.md

* fix unused config parameters

* fix unused config parameters

* fix unused config parameters

* update README_ja.md

* dummy commit for unit test

* fix attention mask

* add CPMAntTokenizer&-Fast to auto-mapping

* drop redundant changes in README_ko

* fix  defaults in docstring

* fix use_cache and some docstring

* add missing args in tokenizer

* modify tester inheritance

* add is_jieba_available

* fix some bugs

* make style and fix-copies

* add doctests

* skip integration tests

* add is_jieba_available

* fix bugs in common tests

* adjust docstrings and make style

* add argument docstring

* adjust code to some specifications

* make style and fix-copies

* add fast tokenization test

* dummy commit for unit test

* dummy commit for unit test

* dummy commit for unit test

* normalize some comments and names

* Bert->CPMAnt

* camel names and drop redundant codes

* make style and fix-copies

* add CpmTokenizerFast _import_structure

* drop cpmanttokenizerfast in model_doc

* fix some problems

* fix CPMAnt tokenization for common test

* make style and fixup

* fix copies and fixup

* fix bugs in tokenization test

* dummy commit for connection failure in unittest

* fix copies

* drop trailing comma

* fix decorator in tests

* dummy commit for connection failure in unittest

---------
Co-authored-by: Gong Baitao <gongbaitao11@gmail.com>

523ca4e0

- Added parallel device usage for GPT-J (#22713) · 17503b00
  jprivera44 authored Apr 12, 2023
  
  17503b00
- Update input values for docstring (#22631) · 5a71977b
  amyeroberts authored Apr 12, 2023
  
  5a71977b
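
The `Use warning_once` step in "Update warning levels (#22727)" above refers to logging a given message only once per process instead of on every call. A minimal sketch of that idea, assuming a cache keyed on the message text; the `warning_once` helper and the `demo` logger name below are illustrative and are not taken from the commit itself:

```python
import functools
import logging

logger = logging.getLogger("demo")  # hypothetical logger name for this sketch


@functools.lru_cache(maxsize=None)
def warning_once(message: str) -> None:
    """Emit `message` at WARNING level the first time it is seen.

    lru_cache remembers each distinct message, so repeated calls with the
    same text return from the cache without logging again.
    """
    logger.warning(message)


if __name__ == "__main__":
    logging.basicConfig(level=logging.WARNING)
    for _ in range(3):
        warning_once("`foo` is deprecated")  # logged a single time
    warning_once("`bar` is deprecated")  # different text, logged once
```

Because the cache key is the message string, two warnings that differ only in an interpolated value count as different messages and are each logged.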

11 Apr, 2023 3 commits
- Remove 2 failing ONNX conversion tests (#22660) · ff73deeb
  Yih-Dar authored Apr 11, 2023
```
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
  ff73deeb
- Clarify stride option (#22684) · 06b05d45
  Luc CAILLIAU authored Apr 11, 2023
```
* Clarify stride option

* formatting
```
  06b05d45
- Enable naive Pipeline Parallelism training for Gpt neox japanese and san japanese (#22702) · 0224aaf6
  Mayank Agarwal authored Apr 11, 2023
```
Move labels to same device as logits
```
  0224aaf6
10 Apr, 2023 7 commits

- Model parallelism: Moving labels to same devices as the logits are (#22691) · 151425dd
  Shahad Mahmud authored Apr 10, 2023
```
Model parallelism correct labels device
```
  151425dd

add GPTNeoXForSequenceClassification (#22671) · 6daa9cb5

Sugawara authored Apr 11, 2023

* add GPTNeoXForSequenceClassification

* move the labels to logits.device (ref: #22561)

* fix

6daa9cb5

- use __func__ to check can_generate (#22643) · f74b4020
  xinhe authored Apr 10, 2023
  
  f74b4020
- Make dynamic code work with offline mode (#22661) · 3876fc68
  Sylvain Gugger authored Apr 10, 2023
```
* Make dynamic code work with offline mode

* Clean up

* Quality
```
  3876fc68
- (feat): Moving labels to same device as logits for Deit (#22679) · 98597725
  Shikhar Chauhan authored Apr 10, 2023
  
  98597725
- Model parallelism: Moving labels to the same device as logits for BridgeTower models (#22676) · 870d91fb
  Shahad Mahmud authored Apr 10, 2023
```
BridgeTower Model parallelism logits device for loss calculation
```
  870d91fb

Add GPTBigCode model (Optimized GPT2 with MQA from Santacoder & BigCode) (#22575) · e0921c6b

Joel Lamy-Poirier authored Apr 10, 2023



* Add model with cli tool

* Remove unwanted stuff

* Add new code

* Remove inference runner

* Style

* Fix checks

* Test updates

* make fixup

* fix docs

* fix doc

* fix test

* hopefully fix pipeline tests

* refactor

* fix CIs

* add comment

* rename to `GPTBigCodeForCausalLM`

* correct readme

* make fixup + docs

* make fixup

* fixes

* fixes

* Remove pruning

* Remove import

* Doc updates

* More pruning removal

* Combine copies

* Single MQA implementation, remove kv cache pre-allocation and padding

* Update doc

* Revert refactor to match gpt2 style

* Merge back key and value caches, fix some type hints

* Update doc

* Fix position ids with padding (PR 21080)

* Add conversion script temporarily

* Update conversion script

* Remove checkpoint conversion

* New model

* Fix MQA test

* Fix copies

* try fix tests

* FIX TEST!!

* remove  `DoubleHeadsModel`

* add MQA tests

* add slow tests

* clean up

* add CPU checker

* final fixes

* fixes

- fix GPU issue
- fixed slow tests
- skip disk offload

* fix final issue

* Simplify and comment baddbmm fix

* Remove unnecessary code

* Transpose tweaks

* Use beta=1 on cpu, improve tests

---------
Co-authored-by: younesbelkada <younesbelkada@gmail.com>

e0921c6b
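
The commit "use __func__ to check can_generate (#22643)" above has no message body. As a rough illustration of the general technique its title names, the sketch below uses a bound method's `__func__` attribute to tell whether a subclass has overridden a base-class method. The class and method names (`GenerationBase`, `prepare_inputs`, `EncoderOnly`, `Decoder`) are hypothetical and do not reproduce the library's code:

```python
class GenerationBase:
    """Hypothetical base class standing in for a generation mixin."""

    def prepare_inputs(self, *args, **kwargs):
        raise NotImplementedError("subclasses that can generate override this")

    def can_generate(self) -> bool:
        # A bound method exposes the plain function it wraps as `__func__`.
        # If that function is still the base-class one, nothing overrode it.
        return self.prepare_inputs.__func__ is not GenerationBase.prepare_inputs


class EncoderOnly(GenerationBase):
    """Does not override `prepare_inputs`, so it cannot generate."""


class Decoder(GenerationBase):
    def prepare_inputs(self, *args, **kwargs):
        return {"input_ids": args[0] if args else None}


if __name__ == "__main__":
    print(EncoderOnly().can_generate())  # False
    print(Decoder().can_generate())      # True
```

Comparing the underlying functions directly avoids calling the method, so the check has no side effects and works even when the base implementation raises.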

07 Apr, 2023 8 commits

moved labels to the same device as logits for BLOOM, GPT Neo, GPT NeoX, RoBERTa and VIT models (#22663) · 656e869a

Arun Brahma authored Apr 08, 2023

moved labels to the same device as logits

656e869a

- Generate: add API warning to streamers (#22659) · 3f96e0b4
  Joao Gante authored Apr 07, 2023
```
add API warning
```
  3f96e0b4

[OPT] Fix default attention mask size (#22649) · f3341926

Arthur authored Apr 07, 2023

* Fix default attention mask size

* fixup

* add a test to make sure it works even if no attention mask is provided

* style

f3341926

[tokenization] do not push special file (#22657) · b1b3dc3e

Arthur authored Apr 07, 2023



* do not push special file

* Update src/transformers/tokenization_utils_base.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

---------
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

b1b3dc3e

Small nit, (#22653) · 117a0f6a

Arthur authored Apr 07, 2023

* Small nit,
Fixes #21986

* Update src/transformers/pipelines/__init__.py

117a0f6a

Fix `MegaModel` CI (#22652) · 14d5b2b6

Yih-Dar authored Apr 07, 2023



* fix

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

14d5b2b6

- Fix typo (#22650) · f2cc8ffd
  Seung-Moo Yang authored Apr 07, 2023
  
  f2cc8ffd
- Move labels to the same device as logits for LlamaForSequenceClassification and Blip2 (#22596) · 1de8ce9e
  Shikhar Chauhan authored Apr 07, 2023
```
* (feat): Move labels to the same device as logits

* Trigger CI

* Trigger CI

* Trigger CI

* (feat): Making changes for Blip2
```
  1de8ce9e

06 Apr, 2023 5 commits
- fix FSDP version related issues (#22489) · ee8e80a0
  Sourab Mangrulkar authored Apr 07, 2023
```
fix fsdp
```
  ee8e80a0
- [`Blip`] Fix slow tests and doctests with correct values (#22632) · ed672864
  Younes Belkada authored Apr 06, 2023
```
fix slow tests and doctests
```
  ed672864
- LlamaTokenizerFast Fix (.., from_slow=True). (#22630) · 6a02e980
  Nicolas Patry authored Apr 06, 2023
  
  6a02e980
- [`bnb`] 8bit models should not be converted to `DDP` (#22628) · 09a9888f
  Younes Belkada authored Apr 06, 2023
```
add safety checker
```
  09a9888f
- update_pip_test_mapping (#22606) · fa01127a
  Yih-Dar authored Apr 06, 2023
```
* Add TFBlipForConditionalGeneration

* update pipeline_model_mapping

* Add import

* Revert changes in GPTSanJapaneseTest

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
  fa01127a