Commits · add0895dd98c1a193bf0ada2f3575fcd5e256cb2 · chenpangpang / transformers

28 Jul, 2023 8 commits
- [`Mpt`] Fix mpt slow test (#25170) · add0895d
  Younes Belkada authored Jul 28, 2023
```
fix mpt slow test
```
  add0895d
- Update `use_auth_token` -> `token` in example scripts (#25167) · d53b8ad7
  Yih-Dar authored Jul 28, 2023
```
* pytorch examples

* tensorflow examples

* flax examples

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
  d53b8ad7
- added compiled model support for inference (#25124) · 3cbc560d
  Alexander Markov authored Jul 28, 2023
```
* added compiled model support for inference

* linter

* Fix tests

* linter

* linter

* remove inference mode from pipelines

* Linter

---------
Co-authored-by: amarkov <alexander@inworld.ai>
```
  3cbc560d
- make run_generation more generic for other devices (#25133) · afa96fff
  Alan Ji authored Jul 28, 2023
```
* make run_generation more generic for other devices

* use Accelerate to support any device type it supports.

* make style

* fix error usage of accelerator.prepare_model

* use `PartialState` to make sure everything is running on the right device

---------
Co-authored-by: statelesshz <jihuazhong1@huawei.com>
```
  afa96fff
- Represent query_length in a different way to solve jit issue (#25164) · d23d2c27
  jiqing-feng authored Jul 28, 2023
```
Fix jit trace
```
  d23d2c27
- override .cuda() to check if model is already quantized (#25166) · 2a787201
  YQ authored Jul 28, 2023
  
  2a787201
- Add test when downloading from gated repo (#25039) · c1dba111
  Lucain authored Jul 28, 2023
  
  c1dba111
- Fix `.push_to_hub` and cleanup `get_full_repo_name` usage (#25120) · 6232c380
  Lucain authored Jul 28, 2023
```
* Fix .push_to_hub and cleanup get_full_repo_name usage

* Do not rely on Python bool conversion magic

* request changes
```
  6232c380
27 Jul, 2023 10 commits

Add new model in doc table of content (#25148) · 400e76ef
Sylvain Gugger authored Jul 27, 2023

400e76ef

Sanchit Gandhi authored Jul 27, 2023



* First commit

* step 1 working

* add alibi

* placeholder for `scan`

* add matrix mult alibi

* beta scaling factor for bmm

* working v1 - simple forward pass

* move layer_number from attribute to arg in call

* partial functioning scan

* hacky working scan

* add more modifs

* add test

* update scan for new kwarg order

* fix position_ids problem

* fix bug in attention layer

* small fix

- do the alibi broadcasting only once

* prelim refactor

* finish refactor

* alibi shifting

* incorporate dropout_add to attention module

* make style

* make padding work again

* update

* remove bogus file

* up

* get generation to work

* clean code a bit

* added small tests

* adding alibi test

* make CI tests pass:

- change init weight
- add correct tuple for output attention
- add scan test
- make CI tests work

* fix few nits

* fix nit onnx

* fix onnx nit

* add missing dtype args to nn.Modules

* remove debugging statements

* fix scan generate

* Update modeling_flax_bloom.py

* Update test_modeling_flax_bloom.py

* Update test_modeling_flax_bloom.py

* Update test_modeling_flax_bloom.py

* fix small test issue + make style

* clean up

* Update tests/models/bloom/test_modeling_flax_bloom.py
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

* fix function name

* small fix test

* forward contrib credits from PR17761

* Fix failing test

* fix small typo documentation

* fix non passing test

- remove device from build alibi

* refactor call

- refactor `FlaxBloomBlockCollection` module

* make style

* upcast to fp32

* cleaner way to upcast

* remove unused args

* remove layer number

* fix scan test

* make style

* fix i4 casting

* fix slow test

* Update src/transformers/models/bloom/modeling_flax_bloom.py
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

* remove `layer_past`

* refactor a bit

* fix `scan` slow test

* remove useless import

* major changes

- remove unused code
- refactor a bit
- revert import `torch`

* major refactoring

- change build alibi

* remove scan

* fix tests

* make style

* clean-up alibi

* add integration tests

* up

* fix batch norm conversion

* style

* style

* update pt-fx cross tests

* update copyright

* Update src/transformers/modeling_flax_pytorch_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* per-weight check

* style

* line formats

---------
Co-authored-by: younesbelkada <younesbelkada@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: haileyschoelkopf <haileyschoelkopf@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

e9310363

More `token` things (#25146) · 0c790ddb

Yih-Dar authored Jul 27, 2023



* fix

* fix

* fix

* fix

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

0c790ddb

Add offload support to Bark (#25037) · 0b92ae34

Yoach Lacombe authored Jul 27, 2023



* initial Bark offload proposal

* use hooks instead of manually offloading

* add test of bark offload to cpu feature

* Apply nit suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update docstrings of offload
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

* remove unnecessary set_seed in Bark tests

---------
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

0b92ae34

[`MptConfig`] support from pretrained args (#25116) · 9cea3e7b

Arthur authored Jul 27, 2023



* support from pretrained args

* draft addition of tests

* update test

* use parent assert true

* Update src/transformers/models/mpt/configuration_mpt.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

---------
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

9cea3e7b

🚨🚨🚨Change default from `adamw_hf` to `adamw_torch` 🚨🚨🚨 (#25109) · a1c4954d
Zach Mueller authored Jul 27, 2023
```
* Change defaults

* Sylvain's comments
```
a1c4954d
Clarify 4/8 bit loading log message (#25134) · 9a220ce3
Bram Vanroy authored Jul 27, 2023
```
* clarify 4/8 bit loading log message

* make style
```
9a220ce3
[`T5/LlamaTokenizer`] default legacy to `None` to not always warn (#25131) · 9429642e
Arthur authored Jul 27, 2023
```
default legacy to None
```
9429642e
fix delete all checkpoints when save_total_limit is set to 1 (#25136) · de9e3b59
Pbihao authored Jul 27, 2023

de9e3b59
fix deepspeed load best model at end when the model gets sharded (#25057) · a0042379
Sourab Mangrulkar authored Jul 27, 2023

a0042379

26 Jul, 2023 11 commits

Move center_crop to BaseImageProcessor (#25122) · 1689aea7
amyeroberts authored Jul 26, 2023

1689aea7
MaskFormer - enable return_dict in order to compile (#25052) · 659829b6
amyeroberts authored Jul 26, 2023
```
* Enable return_dict in order to compile

* Update tests
```
659829b6
Fix ViT docstring regarding default dropout values. (#25118) · b914ec98
Eric Bezzam authored Jul 26, 2023
```
Fix docstring for dropout.
```
b914ec98
Move common image processing methods to BaseImageProcessor (#25089) · 1486d2ae
amyeroberts authored Jul 26, 2023
```
Move out common methods
```
1486d2ae
Fix past CI after #24334 (#25113) · d30cf3d0
Yih-Dar authored Jul 26, 2023
```
update
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
d30cf3d0
update `use_auth_token` -> `token` (#25083) · 224da5df
Yih-Dar authored Jul 26, 2023
```
* update

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
224da5df
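
The `use_auth_token` -> `token` rename above is a keyword-argument deprecation. A minimal sketch of how such a rename can stay backward compatible is shown below. The helper name `resolve_token` and the warning text are hypothetical and are not taken from this log; the library's actual handling may differ.

```python
import warnings


def resolve_token(token=None, use_auth_token=None):
    """Return the effective token, accepting the old keyword for now.

    Hypothetical helper for illustration only.
    """
    if use_auth_token is not None:
        # The old keyword still works, but callers are told to migrate.
        warnings.warn(
            "`use_auth_token` is deprecated; pass `token` instead.",
            FutureWarning,
        )
        if token is not None:
            # Both keywords given: refuse to guess which one is intended.
            raise ValueError("Pass only one of `token` and `use_auth_token`.")
        token = use_auth_token
    return token
```

Callers that already pass `token` see no warning; callers on the old keyword keep working and get a `FutureWarning`.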

fix "UserWarning: Creating a tensor from a list of numpy.ndarrays is … (#24772) · c53c8e49

Leo authored Jul 26, 2023



fix "UserWarning: Creating a tensor from a list of numpy.ndarrays is extremely slow. Please consider converting the list to a single numpy.ndarray with numpy.array() before converting to a tensor."
Co-authored-by: 刘长伟 <hzliuchw@corp.netease.com>

c53c8e49

Add descriptive docstring to TemperatureLogitsWarper (#24892) · 04a5c859

David Reguera authored Jul 26, 2023

* Add descriptive docstring to TemperatureLogitsWarper

It addresses https://github.com/huggingface/transformers/issues/24783



* Remove niche features
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Commit suggestion
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Refactor the examples to simpler ones

* Add a missing comma
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Make args description more compact
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Remove extra text after making description more compact
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Fix linter

---------
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

04a5c859
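
For readers unfamiliar with what a temperature warper does: it divides the logits by a positive temperature before the softmax, so values below 1 sharpen the distribution and values above 1 flatten it. The sketch below is plain Python for illustration. The function name `apply_temperature` is an assumption and is not the library's class or API.

```python
import math


def apply_temperature(logits, temperature):
    """Scale logits by 1/temperature and return softmax probabilities."""
    if temperature <= 0:
        raise ValueError("temperature must be strictly positive")
    scaled = [x / temperature for x in logits]
    # Subtract the max before exponentiating for numerical stability.
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.0]
probs_sharp = apply_temperature(logits, 0.5)  # lower temperature: more peaked
probs_flat = apply_temperature(logits, 2.0)   # higher temperature: more uniform
```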

Fix `PvtModelIntegrationTest::test_inference_fp16` (#25106) · 31acba56
Yih-Dar authored Jul 26, 2023
```
update
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
31acba56

🌐 [i18n-KO] Translated pipeline_webserver.md to Korean (#24828) · ee63520a

Kihoon Son authored Jul 26, 2023



* translated pipeline_webserver.md
Co-Authored-By: Hyeonseo Yun <0525yhs@gmail.com>
Co-Authored-By: Wonhyeong Seo <wonhseo@kakao.com>
Co-Authored-By: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
Co-Authored-By: Gabriel Yang <gabrielwithhappy@gmail.com>
Co-Authored-By: Nayeon Han <nayeon2.han@gmail.com>
Co-Authored-By: Jungnerd <46880056+jungnerd@users.noreply.github.com>

* Update pipeline_webserver.md

* Apply suggestions from code review
Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>
Co-authored-by: Sangam Lee <74291999+augustinLib@users.noreply.github.com>
Co-authored-by: Kim haewon <ehdvkf02@naver.com>

---------
Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>
Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com>
Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
Co-authored-by: Gabriel Yang <gabrielwithhappy@gmail.com>
Co-authored-by: Nayeon Han <nayeon2.han@gmail.com>
Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com>
Co-authored-by: Sangam Lee <74291999+augustinLib@users.noreply.github.com>
Co-authored-by: Kim haewon <ehdvkf02@naver.com>

ee63520a

documentation for llama2 models (#25102) · 277d3aed
Shauray Singh authored Jul 26, 2023
```
* fix documentation

* changes
```
277d3aed

25 Jul, 2023 11 commits

fix tied_params for meta tensor (#25101) · a5cc30d7
Marc Sun authored Jul 25, 2023
```
* fix tied_params for meta tensor

* remove duplicate
```
a5cc30d7

Bump certifi from 2022.12.7 to 2023.7.22 in /examples/research_projects/visual_bert (#25097) · f1deb21f

dependabot[bot] authored Jul 25, 2023

Bump certifi in /examples/research_projects/visual_bert

Bumps [certifi](https://github.com/certifi/python-certifi) from 2022.12.7 to 2023.7.22.
- [Commits](https://github.com/certifi/python-certifi/compare/2022.12.07...2023.07.22)

---
updated-dependencies:
- dependency-name: certifi
  dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

f1deb21f

Bump certifi from 2022.12.7 to 2023.7.22 in... · 45bde362

dependabot[bot] authored Jul 25, 2023

Bump certifi from 2022.12.7 to 2023.7.22 in /examples/research_projects/decision_transformer (#25098)

Bump certifi in /examples/research_projects/decision_transformer

Bumps [certifi](https://github.com/certifi/python-certifi) from 2022.12.7 to 2023.7.22.
- [Commits](https://github.com/certifi/python-certifi/compare/2022.12.07...2023.07.22)

---
updated-dependencies:
- dependency-name: certifi
  dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

45bde362

Bump certifi from 2022.12.7 to 2023.7.22 in /examples/research_projects/lxmert (#25096) · 6b8dbc28

dependabot[bot] authored Jul 25, 2023

Bump certifi in /examples/research_projects/lxmert

Bumps [certifi](https://github.com/certifi/python-certifi) from 2022.12.7 to 2023.7.22.
- [Commits](https://github.com/certifi/python-certifi/compare/2022.12.07...2023.07.22)

---
updated-dependencies:
- dependency-name: certifi
  dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

6b8dbc28

Fix doctest (#25031) · da5ff18a

Yih-Dar authored Jul 25, 2023



fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

da5ff18a

[`T5`, `MT5`, `UMT5`] Add [T5, MT5, UMT5]ForSequenceClassification (#24726) · 8f36ab3e

Sebastian Husch Lee authored Jul 25, 2023

* Initial addition of t5forsequenceclassification

* Adding imports and adding tests

* Formatting

* Running make fix-copies

* Adding mt5forseq

* Formatting

* run make fix-copies

* Adding to docs

* Add model_parallel

* Fix bug

* Fix

* Remove TODO

* Fixing tests for T5ForSequenceClassification

* Undo changes to dependency_versions_table.py

* Change classification head to work with T5Config directly

* Change seq length to let tests pass

* PR comments for formatting

* Formatting

* Initial addition of UMT5ForSequenceClassification

* Adding to inits and formatting

* run make fix-copies

* Add doc for UMT5ForSeqClass

* Update UMT5 config

* Fix docs

* Skip torch fx test for SequenceClassification

* Formatting

* Add skip to UMT5 tests as well

* Fix umt5 tests

* Running make fix-copies

* PR comments

* Fix for change to sentence_representation

* Rename seq_len to hidden_size since that's what it is

* Use base_model to follow format of the rest of the library

* Update docs

* Extract the decoder_input_ids changes and make one liner

* Make one-liner

8f36ab3e

Hotfix for failing `MusicgenForConditionalGeneration` tests (#25091) · 21150cb0
Yih-Dar authored Jul 25, 2023
```
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
21150cb0

[ `PreTrainedTokenizerFast`] Keep properties from fast tokenizer (#25053) · f9cc3338

Arthur authored Jul 25, 2023

* draft solution

* use `setdefault`

* nits

* add tests and fix truncation issue

* fix test

* test passes locally

* quality

* updates

* update tests

f9cc3338

Edit err message and comment in `test_model_is_small` (#25087) · 0779fc8e
Connor Henderson authored Jul 25, 2023
```
* Edit err message and comment in

* put back 80M comment
```
0779fc8e
[`TF`] Also apply patch to support left padding (#25085) · 2fac3422
Arthur authored Jul 25, 2023
```
* tf versions

* apply changes to other models

* 3 models slipped through the cracks
```
2fac3422

[ `ForSequenceClassification`] Support `left` padding (#24979) · f1045227

Arthur authored Jul 25, 2023

* support left padding

* nit

* Update src/transformers/models/gpt_neox/modeling_gpt_neox.py

* Update src/transformers/models/gpt_neox/modeling_gpt_neox.py

f1045227
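
Supporting `left` padding in a sequence-classification head comes down to picking the last real token of each sequence, whichever side the padding is on. One way to do that is sketched below in plain Python. The function name `last_token_index` is hypothetical, and this log does not confirm that the commit uses this exact approach.

```python
def last_token_index(input_ids, pad_token_id):
    """Return the index of the token to pool from for one sequence.

    Take the position of the first pad token (0 if there is none) and
    subtract 1. With right padding this is the last real token. With
    left padding, or no padding at all, the result is -1, which Python
    indexing resolves to the final position.
    """
    first_pad = next(
        (i for i, tok in enumerate(input_ids) if tok == pad_token_id), 0
    )
    return first_pad - 1


right_padded = [5, 6, 7, 0, 0]
left_padded = [0, 0, 5, 6, 7]
unpadded = [5, 6, 7]
```

This assumes padding sits entirely on one side and that `pad_token_id` never appears among the real tokens.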