Commits · 9884862383766b9de666fb677e66b841e9dde63b · chenpangpang / transformers

01 May, 2023 5 commits

Deprecate xpu_backend for ddp_backend (#23085) · 98848623

Zachary Mueller authored May 01, 2023



* Deprecate xpu_backend for ddp_backend

* Typo

* Only do a minor deprecation, no need for major
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

---------
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

98848623

Fix `convnext` __init__ (#23078) · 95cf3725
IMvision12 authored May 01, 2023
```
fix
```
95cf3725

Add `BioGPTForSequenceClassification` (#22253) · 487f132a

Ashwin Mathur authored May 01, 2023



* added BioGptForSequenceClassification

* added source of copied code

* typo

* Format code with black

* Update comments for copied code

* Remove code copy comment

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Fix failing tests

* Update code copied from comments

* Fix code quality

* Update src/transformers/models/biogpt/modeling_biogpt.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Fix lint error

* Update src/transformers/models/biogpt/modeling_biogpt.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Rename model to biogpt for consistency

* Add PipelineTesterMixin to test_modeling_biogpt.py

* Apply suggestions from code review
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Resolve merge conflict

---------
Co-authored-by: Guillem García Subies <37592763+GuillemGSubies@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

487f132a

Fix string syntax error in logger warning message (additional comma) (#23083) · 549e5f9f
Xin Wen authored May 01, 2023

549e5f9f
Fix grammar error in summarization pipeline (#23080) · 9062d1ba
Stephen Kaplan authored May 01, 2023
```
Fix minor grammar issue
```
9062d1ba

29 Apr, 2023 1 commit

Generate: prepare assisted generation for release (#23052) · 849367cc
Joao Gante authored Apr 29, 2023

849367cc

28 Apr, 2023 5 commits

Fix model parallelism for `BridgeTower` (#23039) · b6865b9b
Yih-Dar authored Apr 28, 2023
```
* fix

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
b6865b9b
🚨🚨🚨 [`Blip`] remove labels masking (#23024) · d337631b
Younes Belkada authored Apr 28, 2023
```
* remove labels masking

* add fix on blip tf
```
d337631b

add open-llama model with ckpt (#22795) · c2c99dc7

s-JoL authored Apr 28, 2023



* update Open-Llama model

* update

* update format

* update doc

* update

* update stable embedding test

* update test case

* update format

* update readme

* fix typo

* update name

* remove tokenizer and update format

* remove convert_open_llama_weights_to_hf

* update warning and doc_string

---------
Co-authored-by: songliang.bayesian <songliang.bayesian@bytedance.com>

c2c99dc7

Cuda rng_state_all is used when saving in distributed mode so same should also... · 4d0ea3d2

Shivam Shrirao authored Apr 28, 2023

Cuda rng_state_all is used when saving in distributed mode so same should also be used when loading (#23045)

cuda rng state should be all for distributed bc all were saved

4d0ea3d2

Add Trainer support for ReduceLROnPlateau (#23010) · 9b435204

Maxime Méloux authored Apr 28, 2023



* Add Trainer support for ReduceLROnPlateau

Fixes #16503

* Remove training argument and add default instance

---------
Co-authored-by: mmeloux <maxime.meloux@loria.fr>

9b435204

27 Apr, 2023 6 commits

Fix bigbird random attention (#21023) · 88399476

Bartosz Szmelczynski authored Apr 27, 2023

* switch np.random.permutation to jax.random.permutation

* remove comments

* remove leftover comment

* skip similarity tests

* modify indices_prng_key usage, add deterministic behaviour

* update style

* remove unused import

* remove copy statement since classes are not identical

* remove numpy import

* revert removing copied from statements

* make style from copied

* remove copied from statement

* update copied from statement to include only np.ndarray

* add deterministic args, unittestskip equivalence tests

88399476

added GPTNeoForTokenClassification (#22908) · d65b14ed

peter-sk authored Apr 27, 2023



* added GPTNeoForTokenClassification

* add to top-level init

* fixup

* test

* more fixup

* add to gpt_neo.mdx

* repo consistency

* dummy copy

* fix copies

* optax >= 0.1.5 assumes jax.Array exists - which it doesn't for jax <= 0.3.6

* merge with main made this superfluous

* added classifier_dropout

* remove legacy code

* removed fmt:on/off
removed expected_outputs

* doc style fix

* classifier_dropout is always in config

---------
Co-authored-by: Prof. Peter Schneider-Kamp <jps@ordbogen.com>

d65b14ed

added GPTNeoXForTokenClassification (#23002) · 614e191c

peter-sk authored Apr 27, 2023



* initial commit

* added GPTNeoXForTokenClassification

* typo

* doc
fixed extra comma that turned into a tuple

* unifying variable names
fixing forward call

* classifier_dropout is in config
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

---------
Co-authored-by: Prof. Peter Schneider-Kamp <jps@ordbogen.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

614e191c
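
The "extra comma that turned into a tuple" fix mentioned in this commit refers to a common Python pitfall. A minimal sketch of that pitfall, assuming a hypothetical string assignment (the variable names and text are illustrative and not taken from the commit):

```python
# A trailing comma after an expression makes Python build a one-element tuple.
intended = "Returns the token classification logits."
accidental = "Returns the token classification logits.",  # note the stray comma


def describe(value):
    """Return the type name of value, to make the difference visible."""
    return type(value).__name__


print(describe(intended))    # str
print(describe(accidental))  # tuple
```

The stray comma is easy to miss because the line still parses and runs; the problem only shows up later, when the value is used as a string.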

[MEGA] nit size test (#23028) · 1933231a

Arthur authored Apr 27, 2023

* add fast not use warning

* properly check sequence_length vs chunk_size

* fixup

1933231a

[`Pix2Struct`] Fix pix2struct doctest (#23023) · 9435cc66
Younes Belkada authored Apr 27, 2023
```
fix pix2struct doctest
```
9435cc66

Add methods to PreTrainedModel to use PyTorch's BetterTransformer (#21259) · 3042c63a

fxmarty authored Apr 27, 2023



* fix mess

* better documentation

* typo

* fix doc

* update

* add test

* fix test

* more tests

* Update src/transformers/modeling_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* move to utils

* Apply suggestions from code review
Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>

* nit

---------
Co-authored-by: younesbelkada <younesbelkada@gmail.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>

3042c63a

26 Apr, 2023 9 commits

🚨🚨🚨 Use default ignore index in Luke (#23014) · 0083b149
Sylvain Gugger authored Apr 26, 2023
```
Use default ignore index in Luke
```
0083b149

Bring back PartialState DeepSpeed (#22921) · 8b129030

Zachary Mueller authored Apr 26, 2023

* Bring back deepspeed integration

* Branchname

* Self-scheduled

* newline

* Use deepspeed env var

* Remove comment

* Del env var after partialstate

8b129030

Fix None value when adding info to auto_map (#22990) · 4331923b
Sylvain Gugger authored Apr 26, 2023

4331923b

[Llama Tokenizer] Fast llama template (#22959) · d0b50023

Arthur authored Apr 26, 2023

* update template processing for llama fast to add eos

* style

* update

* address training from new issue

* fix

* update

* special tokens can be given even if not used

d0b50023

[`PEFT`] Add HFTracer support for PEFT (#23006) · 00bc6e20

Younes Belkada authored Apr 26, 2023



* add hack fx

* continue hacking

* final changes

* Test

* Add a keys method

* Fix keys method

* revert unneeded changes

* small nit

---------
Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>

00bc6e20

🚨🚨🚨 [`Pix2Struct`] Attempts to fix training issues 🚨🚨🚨 (#23004) · 304aacac
Younes Belkada authored Apr 26, 2023
```
* multiple fixes

- add `add_special_tokens` to `True` by default
- remove label smoothing and labels masking

* fix test
```
304aacac

Add gradient checkpointing to Whisper Flax (#22954) · ba0dc545

Javier de la Rosa authored Apr 26, 2023

* Add gradient checkpointing to Whisper Flax

* self.gradient_checkpointing only needed in nn.Module, removing unnecessary comments

ba0dc545

Remove a failing ONNX test (#23011) · a72b82eb
Yih-Dar authored Apr 26, 2023
```
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
a72b82eb

Add TensorFlow Wav2Vec2 for sequence classification (#22073) · 20ac86c6

Ritik Nandwal authored Apr 26, 2023

* Add initial changes for TF wav2vec2 for sequence classification

* Add suggested changes

* Add serving and serving output methods

* Add serving_output implementation and fix layer_weights

* Add fixes

* Fixed test cases

* Fixing test and adding suggested changes

20ac86c6

25 Apr, 2023 4 commits

[`DocTest`] Fix correct checkpoint (#22988) · a0ae2310
Younes Belkada authored Apr 25, 2023
```
fix pipeline issue
```
a0ae2310
Avoid invalid escape sequences, use raw strings (#22936) · 54272503
Lingepumpe authored Apr 25, 2023
```
* Avoid invalid escape sequences, use raw strings

* Integrate PR feedback
```
54272503
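
The change described in this commit can be illustrated with a short sketch. In a normal string literal a sequence such as `\d` is not a valid escape, and recent Python versions warn about it; a raw string keeps the backslash literally, which is usually what a regular expression needs. The pattern and the `parse_version` helper below are hypothetical examples, not code from the commit:

```python
import re

# Raw string: the backslashes reach the regex engine unchanged.
VERSION_PATTERN = re.compile(r"(\d+)\.(\d+)")


def parse_version(text):
    """Return (major, minor) from the first 'N.M' found in text, or None."""
    match = VERSION_PATTERN.search(text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


print(parse_version("transformers 4.28"))  # (4, 28)
```

The raw literal `r"\d"` is equal to the escaped form `"\\d"`, so switching to raw strings changes only how the source is written, not the resulting pattern.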

Neptune fix bug init run (#22836) · 0a570dbd

AleksanderWWW authored Apr 25, 2023



* [neptune] fix checkpoint bug with relative out_dir

* update imports

* reformat with black

* check neptune without imports

* fix typing-related issue

* run black on code

* use os.path.sep instead of raw \

* simplify imports and remove type annotation

* make ruff happy

* apply review suggestions

* replace run with with_id kwarg to run

* update imports to avoid deprecation warnings for the latest client

---------
Co-authored-by: kshitij12345 <kshitijkalambarkar@gmail.com>

0a570dbd
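
One item in the commit above replaces a hard-coded backslash with `os.path.sep`. A minimal sketch of why that matters, assuming a hypothetical helper `relative_parts` that splits a relative output directory into its components (this is not the Neptune callback's actual code):

```python
import os


def relative_parts(path):
    """Split a relative path into its components using the platform separator."""
    normalized = os.path.normpath(path)
    return [part for part in normalized.split(os.path.sep) if part not in ("", ".")]


checkpoint_dir = os.path.join("out_dir", "checkpoint-500")
print(relative_parts(checkpoint_dir))  # ['out_dir', 'checkpoint-500']
```

Splitting on a literal `"\\"` would only work on Windows, whereas `os.path.sep` resolves to the right separator on each platform.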

[`SAM`] Add sam doc (#22984) · d4d62846
Younes Belkada authored Apr 25, 2023
```
* add sam doc

* fixes

* multiple fixes
```
d4d62846

24 Apr, 2023 9 commits

Generate: assisted generation with sample (take 2) (#22949) · e4a97f82
Joao Gante authored Apr 24, 2023
```
* temperature controls speed
```
e4a97f82

Update feature selection in to_tf_dataset (#21935) · 8f20e61c

amyeroberts authored Apr 24, 2023

* Update feature selection

* Check compatibility with datasets version

* Checkout from datasets main

8f20e61c

fix ValueError message in LlamaAttention (#22966) · 503e8c8b
othertea authored Apr 24, 2023

503e8c8b

Reverting Deta cloning mechanism. (#22656) · 6e329593

Nicolas Patry authored Apr 24, 2023



* Fixed the revert by making sure that even the regexp can cover all
duplicates.

* Code simplification using hash.

* Fixing the `ident`.

* Fixing ignoring patterned duplicate names.

* Using `accelerate@find_tied_parameters` for from_pretrained

This is more correct there, since it handles meta device seamlessly
and we don't need to handle "non-duplicate" tensors (slices of each
other).

* Protecting accelerate.

* Update src/transformers/modeling_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

---------
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

6e329593

Prepare tests for hfh 0.14 (#22958) · 74c55ab9

Lucain authored Apr 24, 2023



* Test hf_hub 0.14.0rc1

* fix mocked tests

* package version

---------
Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com>
Co-authored-by: testbot <lucainp@hf.co>

74c55ab9

[Fix Bugs] Fix keys in `_load_pretrained_model` (#22947) · 69f2d538
hanrui1sensetime authored Apr 24, 2023
```
fix transformers keys
```
69f2d538
Raise error if `stride` is too high in `TokenClassificationPipeline` (#22942) · b5f06d6c
Connor Boyle authored Apr 24, 2023
```
* Raise error if `stride` is too high

* Clarify use of `stride`
```
b5f06d6c
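
The reason a `stride` that is too high has to be rejected can be sketched in a few lines. The sketch assumes that `stride` means the number of tokens shared between consecutive chunks; the function `chunk_with_stride` and its parameters are hypothetical and are not the pipeline's API:

```python
def chunk_with_stride(tokens, max_length, stride):
    """Split tokens into windows of max_length that overlap by stride tokens."""
    if max_length <= 0:
        raise ValueError("max_length must be positive")
    if not 0 <= stride < max_length:
        # With stride >= max_length the window would never move forward.
        raise ValueError(
            f"stride ({stride}) must be non-negative and smaller than max_length ({max_length})"
        )
    step = max_length - stride
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + max_length])
        if start + max_length >= len(tokens):
            break  # the last window already reaches the end of the input
    return chunks


print(chunk_with_stride(list(range(10)), max_length=4, stride=2))
# [[0, 1, 2, 3], [2, 3, 4, 5], [4, 5, 6, 7], [6, 7, 8, 9]]
```

Each window advances by `max_length - stride` tokens, so an overlap equal to or larger than the window size leaves no room to advance, and raising an error early is clearer than looping or silently producing no progress.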

Add an attribute to disable custom kernels in deformable detr in order to make... · edb6d950

fxmarty authored Apr 24, 2023


Add an attribute to disable custom kernels in deformable detr in order to make the model ONNX exportable (#22918)

* add disable kernel option

* add comment

* fix copies

* add disable_custom_kernels to config

* Update src/transformers/models/deta/modeling_deta.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/deta/modeling_deta.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/deta/modeling_deta.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* style

* fix

---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

edb6d950

Generate: Add exception path for Donut (#22955) · 2fbd6df8
Joao Gante authored Apr 24, 2023

2fbd6df8

23 Apr, 2023 1 commit

Add FocalNet (#21532) · 3d3204c0

NielsRogge authored Apr 23, 2023



Adds FocalNet by Microsoft to transformers

---------
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
Co-authored-by: alaradirik <alaradirik@gmail.com>

3d3204c0