- 28 Jul, 2023 4 commits
-
Alexander Markov authored
* added compiled model support for inference
* linter
* Fix tests
* linter
* linter
* remove inference mode from pipelines
* Linter
Co-authored-by: amarkov <alexander@inworld.ai>
-
jiqing-feng authored
Fix jit trace
-
YQ authored
-
Lucain authored
* Fix `.push_to_hub` and clean up `get_full_repo_name` usage
* Do not rely on Python bool conversion magic
* Address requested changes
-
- 27 Jul, 2023 10 commits
-
Sylvain Gugger authored
-
Sanchit Gandhi authored
* First commit
* step 1 working
* add alibi
* placeholder for `scan`
* add matrix mult alibi
* beta scaling factor for bmm
* working v1 - simple forward pass
* move layer_number from attribute to arg in call
* partial functioning scan
* hacky working scan
* add more modifs
* add test
* update scan for new kwarg order
* fix position_ids problem
* fix bug in attention layer
* small fix - do the alibi broadcasting only once
* prelim refactor
* finish refactor
* alibi shifting
* incorporate dropout_add to attention module
* make style
* make padding work again
* update
* remove bogus file
* up
* get generation to work
* clean code a bit
* added small tests
* adding alibi test
* make CI tests pass: - change init weight - add correct tuple for output attention - add scan test - make CI tests work
* fix a few nits
* fix nit onnx
* fix onnx nit
* add missing dtype args to nn.Modules
* remove debugging statements
* fix scan generate
* Update modeling_flax_bloom.py
* Update test_modeling_flax_bloom.py
* Update test_modeling_flax_bloom.py
* Update test_modeling_flax_bloom.py
* fix small test issue + make style
* clean up
* Update tests/models/bloom/test_modeling_flax_bloom.py Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* fix function name
* small fix test
* forward contrib credits from PR17761
* Fix failing test
* fix small typo in documentation
* fix non-passing test - remove device from build alibi
* refactor call - refactor `FlaxBloomBlockCollection` module
* make style
* upcast to fp32
* cleaner way to upcast
* remove unused args
* remove layer number
* fix scan test
* make style
* fix i4 casting
* fix slow test
* Update src/transformers/models/bloom/modeling_flax_bloom.py Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* remove `layer_past`
* refactor a bit
* fix `scan` slow test
* remove useless import
* major changes - remove unused code - refactor a bit - revert import `torch`
* major refactoring - change build alibi
* remove scan
* fix tests
* make style
* clean-up alibi
* add integration tests
* up
* fix batch norm conversion
* style
* style
* update pt-fx cross tests
* update copyright
* Update src/transformers/modeling_flax_pytorch_utils.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* per-weight check
* style
* line formats
Co-authored-by: younesbelkada <younesbelkada@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: haileyschoelkopf <haileyschoelkopf@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
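For readers coming from the changelog, a minimal sketch of running the new Flax BLOOM port; the `bigscience/bloom-560m` checkpoint and the prompt are illustrative, not taken from the commit.

```python
# Minimal sketch of greedy generation with the Flax BLOOM port.
from transformers import AutoTokenizer, FlaxBloomForCausalLM

tokenizer = AutoTokenizer.from_pretrained("bigscience/bloom-560m")
model = FlaxBloomForCausalLM.from_pretrained("bigscience/bloom-560m")

input_ids = tokenizer("BLOOM uses ALiBi instead of positional embeddings, so", return_tensors="np").input_ids
# Flax generate returns an output object whose `sequences` field holds the generated token ids.
output_ids = model.generate(input_ids, max_new_tokens=20).sequences
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```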
-
Yih-Dar authored
* fix
* fix
* fix
* fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Yoach Lacombe authored
* initial Bark offload proposal
* use hooks instead of manually offloading
* add test of the Bark offload-to-CPU feature
* Apply nit suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update docstrings of offload Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* remove unnecessary set_seed in Bark tests
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
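A hedged sketch of the offload feature this PR adds; the `enable_cpu_offload()` call and the `suno/bark-small` checkpoint come from the Bark documentation rather than this commit message, and the feature requires `accelerate` plus a CUDA device.

```python
# Assumed usage of Bark's hook-based sub-model offload (requires accelerate + CUDA).
from transformers import AutoProcessor, BarkModel

processor = AutoProcessor.from_pretrained("suno/bark-small")
model = BarkModel.from_pretrained("suno/bark-small")

# Keep only the sub-model that is currently generating on the GPU; the rest stays on CPU.
model.enable_cpu_offload()

inputs = processor("Hello, this is a test of Bark with CPU offload.")
audio_array = model.generate(**inputs)
```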
-
Arthur authored
* support `from_pretrained` args
* draft addition of tests
* update test
* use parent assertTrue
* Update src/transformers/models/mpt/configuration_mpt.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
-
Zach Mueller authored
* Change defaults
* Address Sylvain's comments
-
Bram Vanroy authored
* clarify 4/8-bit loading log message
* make style
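For context, a minimal sketch of the 4-bit/8-bit loading path whose log message is being clarified; it requires `bitsandbytes` and `accelerate`, and the checkpoint is illustrative.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# 4-bit quantized loading; switch to load_in_8bit=True for the 8-bit path.
quant_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.float16)

model = AutoModelForCausalLM.from_pretrained(
    "bigscience/bloom-560m",
    quantization_config=quant_config,
    device_map="auto",
)
print(f"{model.get_memory_footprint() / 1e6:.0f} MB")  # quantized memory footprint
```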
-
Arthur authored
default legacy to None
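This touches the `legacy` flag of the sentencepiece-based tokenizers (Llama/T5). A hedged sketch of pinning the behaviour explicitly; the checkpoint is illustrative and `sentencepiece` must be installed.

```python
# Passing `legacy` explicitly pins the tokenization behaviour instead of relying on the default.
from transformers import LlamaTokenizer

tokenizer = LlamaTokenizer.from_pretrained("huggyllama/llama-7b", legacy=False)
print(tokenizer.tokenize("Hello world"))
```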
-
Pbihao authored
-
Sourab Mangrulkar authored
-
- 26 Jul, 2023 9 commits
-
amyeroberts authored
-
amyeroberts authored
* Enable return_dict in order to compile
* Update tests
-
Eric Bezzam authored
Fix docstring for dropout.
-
amyeroberts authored
Move out common methods
-
Yih-Dar authored
update
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Yih-Dar authored
* update
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Leo authored
fix "UserWarning: Creating a tensor from a list of numpy.ndarrays is extremely slow. Please consider converting the list to a single numpy.ndarray with numpy.array() before converting to a tensor." Co-authored-by:刘长伟 <hzliuchw@corp.netease.com>
-
David Reguera authored
* Add descriptive docstring to TemperatureLogitsWarper; it addresses https://github.com/huggingface/transformers/issues/24783
* Remove niche features Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Commit suggestion Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Refactor the examples into simpler ones
* Add a missing comma Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Make args description more compact Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Remove extra text after making the description more compact Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Fix linter
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
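The warper documented here is driven by the `temperature` argument of `generate` when sampling; a short usage sketch (checkpoint and prompt are illustrative):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer("Today was an amazing day because", return_tensors="pt")
# temperature < 1.0 sharpens the next-token distribution, > 1.0 flattens it.
outputs = model.generate(**inputs, do_sample=True, temperature=0.7, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```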
-
Shauray Singh authored
* fix documentation
* changes
-
- 25 Jul, 2023 15 commits
-
Marc Sun authored
* fix tied_params for meta tensor
* remove duplicate
-
Sebastian Husch Lee authored
* Initial addition of T5ForSequenceClassification
* Adding imports and adding tests
* Formatting
* Running make fix-copies
* Adding MT5ForSequenceClassification
* Formatting
* run make fix-copies
* Adding to docs
* Add model_parallel
* Fix bug
* Fix
* Remove TODO
* Fixing tests for T5ForSequenceClassification
* Undo changes to dependency_versions_table.py
* Change classification head to work with T5Config directly
* Change seq length to let tests pass
* PR comments for formatting
* Formatting
* Initial addition of UMT5ForSequenceClassification
* Adding to inits and formatting
* run make fix-copies
* Add doc for UMT5ForSequenceClassification
* Update UMT5 config
* Fix docs
* Skip torch fx test for SequenceClassification
* Formatting
* Add skip to UMT5 tests as well
* Fix UMT5 tests
* Running make fix-copies
* PR comments
* Fix for change to sentence_representation
* Rename seq_len to hidden_size since that's what it is
* Use base_model to follow format of the rest of the library
* Update docs
* Extract the decoder_input_ids changes and make one-liner
* Make one-liner
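A minimal sketch of the new classification head; the checkpoint and label count are illustrative, and the head is randomly initialized until fine-tuned.

```python
import torch
from transformers import AutoTokenizer, T5ForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = T5ForSequenceClassification.from_pretrained("t5-small", num_labels=3)

inputs = tokenizer("transformers now ships a T5 sequence classification head", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits  # shape: (1, num_labels)
print(logits.shape)
```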
-
Yih-Dar authored
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Arthur authored
* draft solution
* use `setdefault`
* nits
* add tests and fix truncation issue
* fix test
* test passes locally
* quality
* updates
* update tests
-
Arthur authored
* tf versions
* apply changes to other models
* 3 models slipped through the cracks
-
Arthur authored
* support left padding
* nit
* Update src/transformers/models/gpt_neox/modeling_gpt_neox.py
* Update src/transformers/models/gpt_neox/modeling_gpt_neox.py
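Left padding is what makes batched generation line up for decoder-only models like GPT-NeoX; a short sketch (the Pythia checkpoint is an illustrative GPT-NeoX model):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "EleutherAI/pythia-70m"  # illustrative GPT-NeoX checkpoint
tokenizer = AutoTokenizer.from_pretrained(checkpoint, padding_side="left")
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(checkpoint)

prompts = ["Hello, my name is", "The capital of France is"]
inputs = tokenizer(prompts, return_tensors="pt", padding=True)
# With left padding, every prompt ends at the last position, so new tokens are appended correctly.
outputs = model.generate(**inputs, max_new_tokens=10)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True))
```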
-
Yih-Dar authored
* fix
* Update src/transformers/generation/utils.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* update
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
-
Gema Parreño authored
* add example for NoBadWordsLogitsProcessor
* fix L764 & L767
* make style
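The processor being documented is what `generate` uses when `bad_words_ids` is passed; a short sketch (word list and checkpoint are illustrative):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Token ids of phrases we never want to generate; for BPE tokenizers the
# leading-space variant (" word") usually needs to be listed as well.
bad_words_ids = tokenizer(["stupid", " stupid"], add_special_tokens=False).input_ids

inputs = tokenizer("The new regulation is", return_tensors="pt")
outputs = model.generate(**inputs, bad_words_ids=bad_words_ids, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```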
-
Arthur authored
* draft add new model like
* some cleaning of the config
* nits
* add nested configs
* nits
* update
* update
* added layer norms + triton kernels
* consider only LPLayerNorm for now.
* update
* all keys match.
* Update
* fixing nits here and there
* working forward pass.
* removed einops dependency
* nits
* format
* add alibi
* byebye head mask
* refactor attention
* nits.
* format
* fix nits.
* nuke and updates
* nuke tokenizer test
* don't reshape query with kv heads
* added a bit of documentation.
* remove unneeded things
* nuke more stuff
* nit
* logits match - same generations
* rm unneeded methods
* 1 remaining failing CI test
* nit
* fix nits
* fix docs
* fix docs
* rm tokenizer
* fixup
* fixup
* fixup and fix tests
* fixed configuration object.
* use correct activation
* few minor fixes
* clarify docs a bit
* logits match to 1e-12
* skip and unskip a test
* added some slow tests.
* fix readme
* add more details
* Update docs/source/en/model_doc/mpt.md Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Apply suggestions from code review Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* fix configuration issues
* more fixes in config
* added more models
* Apply suggestions from code review Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* remove unneeded position ids
* fix some comments
* Apply suggestions from code review Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* revert suggestion
* mpt alibi + added batched generation
* Update src/transformers/models/mpt/__init__.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* remove init config
* Update src/transformers/models/mpt/configuration_mpt.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* fix nit
* add another slow test
* Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* fits in one line
* some refactor because make fixup doesn't pass
* add ft notebook
* update md
* correct doc path
Co-authored-by: younesbelkada <younesbelkada@gmail.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
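With MPT now supported natively, loading no longer requires `trust_remote_code`. A hedged sketch, assuming the Hub checkpoint below is compatible with the in-library implementation (it is a large download):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "mosaicml/mpt-7b"  # illustrative MPT checkpoint
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint, torch_dtype=torch.bfloat16, device_map="auto")

inputs = tokenizer("MPT uses ALiBi instead of positional embeddings, which means", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=30)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```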
-
Xiaoke Huang authored
Repeat per sample for SAM image embeddings
-
Arthur authored
[`generate`] Only warn users if the `generation_config`'s `max_length` is set to the default value (#25030)
* check max length is default
* nit
* update warning: no longer deprecate
* comment in configuration_utils in case max_length's default gets changed in the future
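The warning now only fires when `max_length` is left at its default of 20; being explicit about the generation budget avoids it entirely, for example:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer("The quick brown fox", return_tensors="pt")
# Relying on the default generation_config.max_length (20) triggers the warning;
# specifying the budget for new tokens explicitly does not.
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```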
-
Xuehai Pan authored
-
Sylvain Gugger authored
* Fix last models for common tests that are too big.
* Remove print statement
-
Kashif Rasul authored
fix rope_scaling doc string
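The docstring being fixed describes the `rope_scaling` dict on RoPE models; a hedged sketch using the 4.31-era format (`{"type", "factor"}`), with an illustrative checkpoint:

```python
from transformers import AutoModelForCausalLM

# Linear RoPE scaling: stretch the usable context window by the given factor.
model = AutoModelForCausalLM.from_pretrained(
    "openlm-research/open_llama_3b",
    rope_scaling={"type": "linear", "factor": 2.0},
)
print(model.config.rope_scaling)
```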
-
Joao Gante authored
-
- 24 Jul, 2023 2 commits
-
Sylvain Gugger authored
* Better error message when signal is not supported on OS
* Address review comments
-
Younes Belkada authored
Fix 8-bit corner case with Blip2
-