1. 05 Oct, 2023 1 commit
  2. 04 Oct, 2023 1 commit
  3. 03 Oct, 2023 1 commit
  4. 18 Sep, 2023 1 commit
    • 🚨🚨 🚨🚨 [`Tokenizer`] attempt to fix add_token issues 🚨🚨 🚨🚨 (#23909) · 2da88537
      Arthur authored
      
      
      * fix test for bart. Order is correct now let's skip BPEs
      
      * ouf
      
      * styling
      
      * fix bert....
      
      * slow refactoring
      
      * current updates
      
      * massive refactoring
      
      * update
      
      * NICE!
      
      * update to see where I am at
      
      * updates
      
      * update
      
      * update
      
      * revert
      
      * updates
      
      * updates
      
      * start supporting legacy_save
      
      * styling
      
      * big update
      
      * revert some changes
      
      * nits
      
      * nniiiiiice
      
      * small fixes
      
      * kinda fix t5 with new behaviour
      
      * major update
      
      * fixup
      
      * fix copies
      
      * today's updates
      
      * fix byt5
      
      * update
      
      * update
      
      * update
      
      * updates
      
      * update vocab size test
      
      * Barthez does not need the fairseq offset ids
      
      * super call must be after
      
      * call super
      
      * move all super init
      
      * move other super init
      
      * fixup
      
      * nits
      
      * more fixes
      
      * nits
      
      * more fixes
      
      * nits
      
      * more fix
      
      * remove useless files
      
      * ouch all of them are affected
      
      * and more!
      
      * small improvements
      
      * no more sanitize token
      
      * more changes around unique no split tokens
      
      * partially fix more things
      
      * keep legacy save but add warning
      
      * so... more fixes
      
      * updates
      
      * guess deberta tokenizer could be nuked
      
      * fixup
      
      * fixup did some bad things
      
      * nuke it if it breaks
      
      * remove prints and pretrain fast from slow with new format.
      
      * fixups
      
      * Apply suggestions from code review
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * fiou
      
      * nit
      
      * by default specials should not be normalized?
      
      * update
      
      * remove breakpoint
      
      * updates
      
      * a lot of updates
      
      * fixup
      
      * fixes revert some changes to match fast
      
      * small nits
      
      * that makes it cleaner
      
      * fix camembert accordingly
      
      * update
      
      * some less breaking changes
      
      * update
      
      * fixup
      
      * fix byt5 and whisper mostly
      
      * some more fixes, canine's byte vocab
      
      * fix gpt2
      
      * fix most of the perceiver tests (4 left)
      
      * fix layoutlmv3
      
      * fixup
      
      * fix copies for gpt2 style
      
      * make sure to only warn once
      
      * fix perceiver and gpt2 tests
      
      * some more backward compatibility: also read the special tokens map because some people use it
      
      * fixup
      
      * add else when reading
      
      * nits
      
      * fresh updates
      
      * fix copies
      
      * will this make everything faster?
      
      * fixes
      
      * more fixes
      
      * update
      
      * more fixes
      
      * fixup
      
      * is the source of truth right?
      
      * sorry camembert for the troubles
      
      * current updates
      
      * fixup
      
      * update led
      
      * update
      
      * fix regression
      
      * fix single word
      
      * more model specific fixes
      
      * fix t5 tests
      
      * fixup
      
      * more comments
      
      * update
      
      * fix nllb
      
      * rstrip removed
      
      * small fixes
      
      * better handle additional_special_tokens and vocab sizes
      
      * fixing
      
      * styling
      
      * fix 4 / 21
      
      * fixup
      
      * fix nllb's tests
      
      * some fixes
      
      * fix t5
      
      * fixes
      
      * style
      
      * fix canine tests
      
      * damn this is nice
      
      * nits
      
      * m2m100 nit
      
      * fixups
      
      * fixes!
      
      * fixup
      
      * stash
      
      * fix merge
      
      * revert bad change
      
      * fixup
      
      * correct order for Code Llama
      
      * fix speecht5 post merge
      
      * styling
      
      * revert source of 11 fails
      
      * small nits
      
      * all changes in one go
      
      * fnet hack
      
      * fix 2 more tests
      
      * update based on main branch of tokenizers
      
      * fixup
      
      * fix VITS issues
      
      * more fixes
      
      * fix mgp test
      
      * fix camembert issues
      
      * oops camembert still has 2 failing tests
      
      * mluke fixes
      
      * decode fixes
      
      * small nits
      
      * nits
      
      * fix llama and vits
      
      * fix camembert
      
      * small nits
      
      * more fixes when initialising a fast tokenizer from a slow one, etc.
      
      * fix one of the last tests
      
      * fix CPM tokenizer test
      
      * fixups
      
      * fix pop2piano
      
      * fixup
      
      * Change tokenizers required version

      * Change tokenizers required version
      
      * "tokenizers>=0.14,<0.15", don't forget smaller than
      
      * fix musicgen tests and PreTrainedTokenizerFast
      
      * fix owlvit and all
      
      * update t5
      
      * fix 800 red
      
      * fix tests
      
      * fix the fix of the fix of t5
      
      * styling
      
      * documentation nits
      
      * cache _added_tokens_encoder
      
      * fixups
      
      * Nit
      
      * fix red tests
      
      * one last nit!
      
      * make everything a lot simpler
      
      * Now it's over 😉
      
      
      
      * few small nits
      
      * Apply suggestions from code review
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * updates that work for now
      
      * tests that should not be skipped / changed and fixed next
      
      * fixup
      
      * i am ashamed
      
      * push the fix
      
      * update
      
      * fixups
      
      * nits
      
      * fix added_tokens_encoder
      
      * fix canine test
      
      * fix pegasus vocab
      
      * fix transfoXL
      
      * fixup
      
      * whisper needs to be fixed for train new
      
      * pegasus nits
      
      * more pegasus fixes
      
      * minor update
      
      * better error message in failed test
      
      * fix whisper failing test
      
      * fix whisper failing test
      
      * fix pegasus
      
      * fixup
      
      * fix **** pegasus
      
      * reset things
      
      * remove another file
      
      * attempts to fix the strange custom encoder and offset
      
      * nits here and there
      
      * update
      
      * fixup
      
      * nit
      
      * fix the whisper test
      
      * nits nits
      
      * Apply suggestions from code review
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * updates based on review
      
      * some small update to potentially remove
      
      * nits
      
      * import lru cache
      
      * Update src/transformers/tokenization_utils_base.py
      Co-authored-by: Lysandre Debut <hi@lysand.re>
      
      * move warning to `from_pretrained`
      
      * update test results now that the special tokens are always added
      
      ---------
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      Co-authored-by: Lysandre Debut <hi@lysand.re>
      2da88537
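      For reference, the PR above also bumps the required tokenizers version to ">=0.14,<0.15" and reworks how added tokens are stored and serialized. Below is a minimal, hedged sketch of the API surface it touches: registering an extra token with explicit `AddedToken` flags. The checkpoint name and the flag values are illustrative assumptions, not the PR's final defaults.

      ```python
      # Minimal sketch of registering an added token with explicit AddedToken flags.
      # The checkpoint and flag values are illustrative; the PR is precisely about
      # how these flags and the added-token vocabulary are handled.
      from transformers import AddedToken, AutoTokenizer

      tokenizer = AutoTokenizer.from_pretrained("t5-small")

      # normalized=False keeps the token out of the model's normalizer;
      # lstrip/rstrip control whether surrounding whitespace is absorbed.
      tokenizer.add_tokens(
          AddedToken("<my_extra_token>", normalized=False, lstrip=False, rstrip=False)
      )

      print(tokenizer.tokenize("hello <my_extra_token> world"))
      print(len(tokenizer))  # vocabulary size now includes the added token
      ```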
  5. 05 Sep, 2023 1 commit
  6. 11 Aug, 2023 1 commit
  7. 02 Aug, 2023 2 commits
  8. 30 Jun, 2023 1 commit
    • Speed up TF tests by reducing hidden layer counts (#24595) · 134caef3
      Matt authored
      * hidden layers, huh, what are they good for (absolutely nothing)
      
      * Some tests break with 1 hidden layer, use 2
      
      * Use 1 hidden layer in a few slow models
      
      * Use num_hidden_layers=2 everywhere
      
      * Slightly higher tol for groupvit
      
      * Slightly higher tol for groupvit
      134caef3
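      For context, the speed-up comes from shrinking the tiny configs built by the model testers, where the layer count dominates test-time compute. A minimal sketch of such a config (the exact sizes are illustrative assumptions; the commit's point is num_hidden_layers=2):

      ```python
      # Sketch: a deliberately tiny config so test-time forward passes stay cheap.
      # Sizes are illustrative assumptions, not the suite's exact values.
      import tensorflow as tf
      from transformers import BertConfig, TFBertModel

      config = BertConfig(
          vocab_size=99,
          hidden_size=32,
          num_hidden_layers=2,    # two layers still exercise the layer-stacking logic
          num_attention_heads=4,
          intermediate_size=37,
      )
      model = TFBertModel(config)
      outputs = model(tf.constant([[1, 2, 3, 4]]))
      print(outputs.last_hidden_state.shape)  # (1, 4, 32)
      ```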
  9. 27 Jun, 2023 1 commit
  10. 20 Jun, 2023 2 commits
  11. 12 Jun, 2023 1 commit
  12. 07 Jun, 2023 1 commit
  13. 02 Jun, 2023 1 commit
  14. 24 May, 2023 1 commit
    • Better TF docstring types (#23477) · f8b25744
      Matt authored
      * Rework TF type hints to use | None instead of Optional[] for tf.Tensor
      
      * Rework TF type hints to use | None instead of Optional[] for tf.Tensor
      
      * Don't forget the imports
      
      * Add the imports to tests too
      
      * make fixup
      
      * Refactor tests that depended on get_type_hints
      
      * Better test refactor
      
      * Fix an old hidden bug in the test_keras_fit input creation code
      
      * Fix for the DeiT tests
      f8b25744
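      A hedged sketch of the type-hint style change (the functions below are illustrative): `tf.Tensor | None` needs `from __future__ import annotations` on the Python versions supported at the time, which is presumably what the "Don't forget the imports" items refer to, and the resulting string-ified annotations are likely why tests depending on `get_type_hints` had to be refactored.

      ```python
      # Sketch of the old vs. new annotation style; function names are illustrative.
      from __future__ import annotations  # allows "tf.Tensor | None" before Python 3.10

      from typing import Optional

      import tensorflow as tf


      def old_style(attention_mask: Optional[tf.Tensor] = None) -> Optional[tf.Tensor]:
          return attention_mask


      def new_style(attention_mask: tf.Tensor | None = None) -> tf.Tensor | None:
          return attention_mask
      ```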
  15. 22 May, 2023 1 commit
  16. 26 Apr, 2023 1 commit
    • Add TensorFlow Wav2Vec2 for sequence classification (#22073) · 20ac86c6
      Ritik Nandwal authored
      * Add initial changes for TF wav2vec2 for sequence classification
      
      * Add suggested changes
      
      * Add serving and serving output methods
      
      * Add serving_output implementation and fix layer_weights
      
      * Add fixes
      
      * Fixed test cases
      
      * Fixing test and adding suggested changes
      20ac86c6
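      A hedged usage sketch for the new class; the checkpoint name is an assumption (any wav2vec2 checkpoint with a sequence-classification head would do), and from_pt=True is only there in case the checkpoint ships PyTorch weights exclusively.

      ```python
      # Sketch: classifying a fake waveform with the TF sequence-classification head.
      # The checkpoint is an assumed example, not one referenced by the PR.
      import numpy as np
      from transformers import AutoFeatureExtractor, TFWav2Vec2ForSequenceClassification

      checkpoint = "superb/wav2vec2-base-superb-ks"  # assumed example checkpoint
      feature_extractor = AutoFeatureExtractor.from_pretrained(checkpoint)
      model = TFWav2Vec2ForSequenceClassification.from_pretrained(checkpoint, from_pt=True)

      waveform = np.zeros(16000, dtype=np.float32)  # one second of silence at 16 kHz
      inputs = feature_extractor(waveform, sampling_rate=16000, return_tensors="tf")

      logits = model(**inputs).logits
      print(logits.shape)  # (batch_size, num_labels)
      ```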
  17. 04 Apr, 2023 1 commit
    • Fix inverted conditional in TF common test! (#22540) · edb704b2
      Matt authored
      * Fix inverted conditional in TF common test!
      
      * Make the same change in the PT tests file
      
      * Make sure hidden states for GPT2 have the same output shape in PT/TF
      
      * Minor fix to PT implementation of token classification loss
      
      * Skip loss equivalence test for TFHubert because it keeps overflowing to inf
      
      * Compute LM loss for TF the (weird) way it's computed in PT
      
      * Skip loss equivalence test for Wav2Vec2 for the same reason as Hubert
      
      * Fix - don't try to access the hidden states property when output is a tuple
      edb704b2
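      For the "Compute LM loss for TF the (weird) way it's computed in PT" item: on the PyTorch side the causal-LM loss drops the last logit and the first label so that position i predicts token i + 1. A short sketch of that convention (shapes are illustrative):

      ```python
      # Sketch of the shifted causal-LM loss convention used on the PyTorch side.
      import torch
      import torch.nn.functional as F

      batch, seq_len, vocab = 2, 5, 11
      logits = torch.randn(batch, seq_len, vocab)
      labels = torch.randint(0, vocab, (batch, seq_len))

      shift_logits = logits[:, :-1, :]  # predictions for positions 0 .. n-2
      shift_labels = labels[:, 1:]      # targets are the next tokens
      loss = F.cross_entropy(shift_logits.reshape(-1, vocab), shift_labels.reshape(-1))
      print(loss.item())
      ```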
  18. 21 Mar, 2023 1 commit
  19. 01 Mar, 2023 1 commit
    • prepare for "__floordiv__ is deprecated and its behavior will change in a... · 44e3e3fb
      Arthur authored
      prepare for "__floordiv__ is deprecated  and its behavior will change in a future version of pytorch" (#20211)
      
      * rounding_mode = "floor"  instead of // to prevent behavioral change
      
      * add other TODO
      
      * use `torch_int_div` from pytorch_utils
      
      * same for tests
      
      * fix copies
      
      * style
      
      * use relative imports when needed
      
      * Co-authored-by: sgugger <sylvain.gugger@gmail.com>
      44e3e3fb
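      The replacement pattern is small: torch.div with an explicit rounding_mode="floor" (wrapped in the shared torch_int_div helper from pytorch_utils) instead of the deprecated tensor // operator. A minimal sketch:

      ```python
      # Sketch: explicit floor rounding instead of the deprecated tensor __floordiv__.
      import torch

      positions = torch.tensor([7, 8, 9])
      stride = 4

      # old pattern that triggered the deprecation warning: positions // stride
      bucket = torch.div(positions, stride, rounding_mode="floor")
      print(bucket)  # tensor([1, 2, 2])
      ```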
  20. 28 Feb, 2023 1 commit
    • 🔥 Rework pipeline testing by removing `PipelineTestCaseMeta` 🚀 (#21516) · 871c31a6
      Yih-Dar authored
      
      
      * Add PipelineTesterMixin
      
      * remove class PipelineTestCaseMeta
      
      * move validate_test_components
      
      * Add for ViT
      
      * Add to SPECIAL_MODULE_TO_TEST_MAP
      
      * style and quality
      
      * Add feature-extraction
      
      * update
      
      * raise instead of skip
      
      * add tiny_model_summary.json
      
      * more explicit
      
      * skip tasks not in mapping
      
      * add availability check
      
      * Add Copyright
      
      * A way to disable irrelevant tests
      
      * update with main
      
      * remove disable_irrelevant_tests
      
      * skip tests
      
      * better skip message
      
      * better skip message
      
      * Add all pipeline task tests
      
      * revert
      
      * Import PipelineTesterMixin
      
      * subclass test classes with PipelineTesterMixin
      
      * Add pipeline_model_mapping

      * Fix import after adding pipeline_model_mapping

      * Fix style and quality after adding pipeline_model_mapping

      * Fix one more import after adding pipeline_model_mapping

      * Fix style and quality after adding pipeline_model_mapping
      
      * Fix test issues
      
      * Fix import requirements
      
      * Fix mapping for MobileViTModelTest
      
      * Update
      
      * Better skip message
      
      * pipeline_model_mapping could not be None
      
      * Remove some PipelineTesterMixin
      
      * Fix typo
      
      * revert tests_fetcher.py
      
      * update
      
      * rename
      
      * revert
      
      * Remove PipelineTestCaseMeta from ZeroShotAudioClassificationPipelineTests
      
      * style and quality
      
      * test fetcher for all pipeline/model tests
      
      ---------
      Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
      871c31a6
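      A rough sketch, based on the commit messages, of how a model test class ends up wired after this PR; the class body and mapping contents below are illustrative, not the real test file.

      ```python
      # Sketch: a model test class declaring which pipeline tasks its models back.
      # Attribute names follow the commit messages; the mapping is illustrative.
      # The real class also subclasses ModelTesterMixin, PipelineTesterMixin and
      # unittest.TestCase.
      from transformers import ViTForImageClassification, ViTModel


      class ViTModelTest:
          all_model_classes = (ViTModel, ViTForImageClassification)
          pipeline_model_mapping = {
              "feature-extraction": ViTModel,
              "image-classification": ViTForImageClassification,
          }
      ```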
  21. 22 Feb, 2023 1 commit
  22. 15 Feb, 2023 2 commits
  23. 14 Feb, 2023 2 commits
  24. 13 Feb, 2023 2 commits
  25. 10 Feb, 2023 1 commit
  26. 06 Feb, 2023 1 commit
    • Update quality tooling for formatting (#21480) · 6f79d264
      Sylvain Gugger authored
      * Result of black 23.1
      
      * Update target to Python 3.7
      
      * Switch flake8 to ruff
      
      * Configure isort
      
      * Configure isort
      
      * Apply isort with line limit
      
      * Put the right black version
      
      * adapt black in check copies
      
      * Fix copies
      6f79d264
  27. 27 Dec, 2022 1 commit
  28. 10 Nov, 2022 1 commit
    • [processor] Add 'model input names' property (#20117) · 905e5773
      Sanchit Gandhi authored
      * [processor] Add 'model input names' property
      
      * add test
      
      * no f string
      
      * add generic property method to mixin
      
      * copy to multimodal
      
      * copy to vision
      
      * tests for all audio
      
      * remove ad-hoc tests
      
      * style
      
      * fix flava test
      
      * fix test
      
      * fix processor code
      905e5773
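      As a hedged sketch of what a generic model_input_names property on a processor can look like: it merges the names declared by the underlying feature extractor and tokenizer, deduplicated while preserving order. The merging rule and class names below are illustrative assumptions, not necessarily the exact behaviour adopted.

      ```python
      # Sketch: a processor-style class exposing model_input_names by combining the
      # names of its feature extractor and tokenizer. Illustrative only.
      class ToyProcessor:
          def __init__(self, feature_extractor, tokenizer):
              self.feature_extractor = feature_extractor
              self.tokenizer = tokenizer

          @property
          def model_input_names(self):
              names = list(self.feature_extractor.model_input_names)
              names += list(self.tokenizer.model_input_names)
              return list(dict.fromkeys(names))  # keep order, drop duplicates


      class FakeFeatureExtractor:
          model_input_names = ["input_values", "attention_mask"]


      class FakeTokenizer:
          model_input_names = ["input_ids", "attention_mask"]


      processor = ToyProcessor(FakeFeatureExtractor(), FakeTokenizer())
      print(processor.model_input_names)
      # ['input_values', 'attention_mask', 'input_ids']
      ```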
  29. 27 Oct, 2022 1 commit
  30. 18 Oct, 2022 2 commits
  31. 12 Oct, 2022 1 commit
  32. 16 Sep, 2022 1 commit
  33. 12 Sep, 2022 1 commit
  34. 09 Sep, 2022 1 commit
    • Fix train_step, test_step and tests for CLIP (#18684) · 660e0b97
      Matt authored
      
      
      * Fix train_step and test_step, correctly enable CLIP fit test
      
      * Stop using get_args on older Python versions
      
      * Don't use get_origin either
      
      * UnionType is actually even newer, don't use that either
      
      * Apply the same fix to test_loss_computation
      
      * Just realized I was accidentally skipping a bunch of tests!
      
      * Fix test_loss_computation for models without separable labels
      
      * Fix scalar losses in test_step and train_step
      
      * Stop committing your breakpoints
      
      * Fix Swin loss shape
      
      * Fix Tapas loss shape
      
      * Shape fixes for TAPAS, DeiT, HuBERT and ViTMAE
      
      * Add loss computation to TFMobileBertForPreTraining
      
      * make fixup and move copied from statement
      
      * make fixup and move copied from statement
      
      * Correct copied from
      
      * Add labels and next_sentence_label inputs to TFMobileBERT
      
      * Make sure total_loss is always defined
      
      * Update tests/test_modeling_tf_common.py
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * Fix copied from
      
      * Ensure CTC models get labels in tests
      
      * Ensure CTC models get labels in tests
      
      * Fix tests for vit_mae
      
      * Fix tests for vit_mae
      
      * Fix tests for vit_mae
      
      * Reduce batch size for wav2vec2 testing because it was causing OOM
      
      * Skip some TAPAS tests that are failing
      
      * Skip a failing HuBERT test
      
      * make style
      
      * Fix mobilebertforpretraining test
      
      * Skip Wav2Vec2 tests that use huge amounts of mem
      
      * Skip keras_fit for Wav2Vec2 as well
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      660e0b97
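      The "Stop using get_args / get_origin / UnionType" items likely refer to version-gated typing introspection: typing.get_args and get_origin only exist from Python 3.8, and types.UnionType (the runtime type of `int | None`) only from 3.10, so the test helpers dropped them while Python 3.7 was still supported. A small stdlib-only sketch of what those helpers report:

      ```python
      # Sketch: the version-gated typing helpers the commit stops relying on.
      import sys
      from typing import Optional, get_args, get_origin  # both need Python >= 3.8

      hint = Optional[int]
      print(get_origin(hint))  # typing.Union
      print(get_args(hint))    # (<class 'int'>, <class 'NoneType'>)

      if sys.version_info >= (3, 10):
          import types

          print(isinstance(int | None, types.UnionType))  # True
      ```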