Commits · fe1c16e95a1c2f7730b5d7340b669a63ab0b8ded · chenpangpang / transformers

"templates/vscode:/vscode.git/clone" did not exist on "00031785a846c5a9fe7541a717a436840b609084"

22 Nov, 2023 1 commit

[Whisper] Add sequential longform decoding (#27492) · 4151fbb4

Patrick von Platen authored Nov 22, 2023

* [Whisper] Add seq gen

* [Whisper] Add seq gen

* more debug

* Fix whisper logit processor

* Improve whisper code further

* Fix more

* more debug

* more debug

* Improve further

* Add tests

* Prep for batch size > 1

* Get batch_size>1 working

* Correct more

* Add extensive tests

* more debug

* more debug

* more debug

* add more tests

* more debug

* Apply suggestions from code review

* more debug

* add comments to explain the code better

* add comments to explain the code better

* add comments to explain the code better

* Add more examples

* add comments to explain the code better

* fix more

* add comments to explain the code better

* add comments to explain the code better

* correct

* correct

* finalize

* Apply suggestions from code review

* Apply suggestions from code review

4151fbb4

16 Nov, 2023 2 commits

[`Styling`] stylify using ruff (#27144) · 651408a0

Arthur authored Nov 16, 2023



* try to stylify using ruff

* might need to remove these changes?

* use ruf format andruff check

* use isinstance instead of type comparision

* use # fmt: skip

* use # fmt: skip

* nits

* soem styling changes

* update ci job

* nits isinstance

* more files update

* nits

* more nits

* small nits

* check and format

* revert wrong changes

* actually use formatter instead of checker

* nits

* well docbuilder is overwriting this commit

* revert notebook changes

* try to nuke docbuilder

* style

* fix feature exrtaction test

* remve `indent-width = 4`

* fixup

* more nits

* update the ruff version that we use

* style

* nuke docbuilder styling

* leve the print for detected changes

* nits

* Remove file I/O
Co-authored-by: charliermarsh <charlie.r.marsh@gmail.com>

* style

* nits

* revert notebook changes

* Add # fmt skip when possible

* Add # fmt skip when possible

* Fix

* More `  # fmt: skip` usage

* More `  # fmt: skip` usage

* More `  # fmt: skip` usage

* NIts

* more fixes

* fix tapas

* Another way to skip

* Recommended way

* Fix two more fiels

* Remove asynch
Remove asynch

---------
Co-authored-by: charliermarsh <charlie.r.marsh@gmail.com>

651408a0

Set `usedforsecurity=False` in hashlib methods (FIPS compliance) (#27483) · fd65aa98

Lucain authored Nov 16, 2023

* Set usedforsecurity=False in hashlib methods (FIPS compliance)

* trigger ci

* tokenizers version

* deps

* bump hfh version

* let's try this

fd65aa98

14 Nov, 2023 1 commit
- [Whisper] Fix pipeline test (#27442) · a4616c67
  Sanchit Gandhi authored Nov 14, 2023
  
  a4616c67
09 Nov, 2023 1 commit
- Fix RequestCounter to make it more future-proof (#27406) · e38348ae
  Lucain authored Nov 09, 2023
```
* Fix RequestCounter to make it more future-proof

* code quality
```
  e38348ae
07 Nov, 2023 1 commit

[Whisper] Block language/task args for English-only (#27322) · da7ea9a4

Sanchit Gandhi authored Nov 07, 2023



* [Whisper] Block language/task args for English-only

* Update src/transformers/models/whisper/modeling_whisper.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

da7ea9a4

02 Nov, 2023 1 commit

Enrich TTS pipeline parameters naming (#26473) · 0ed6729b

Yoach Lacombe authored Nov 02, 2023



* enrich TTS pipeline docstring for clearer forward_params use

* change token leghts

* update Pipeline parameters

* correct docstring and make style

* fix tests

* make style

* change music prompt
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

* Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

* raise errors if generate_kwargs with forward-only models

* make style

---------
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

0ed6729b

31 Oct, 2023 3 commits

Backward compatibility fix for the Conversation class (#27176) · 05f22901
Matt authored Oct 31, 2023
```
* Backward compatibility fix for the Conversation class

* Explain what's going on in the conditional
```
05f22901
device agnostic pipelines testing (#27129) · f53041a7
Hz, Ji authored Oct 31, 2023
```
* device agnostic pipelines testing

* pass torch_device
```
f53041a7

Shorten the conversation tests for speed + fixing position overflows (#26960) · 08fadc80

Matt authored Oct 31, 2023



* Shorten the conversation tests for speed + fixing position overflows

* Put max_new_tokens back to 5

* Remove test skips

* Increase max_position_embeddings in blenderbot tests

* Add skips for blenderbot_small

* Correct TF test skip

* make fixup

* Reformat skips to use is_pipeline_test_to_skip

* Update tests/models/blenderbot_small/test_modeling_blenderbot_small.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update tests/models/blenderbot_small/test_modeling_flax_blenderbot_small.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update tests/models/blenderbot_small/test_modeling_tf_blenderbot_small.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

08fadc80

19 Oct, 2023 1 commit

Fix and re-enable ConversationalPipeline tests (#26907) · bdbcd5d4

Matt authored Oct 19, 2023

* Fix and re-enable conversationalpipeline tests

* Fix the batch test so the change only applies to conversational pipeline

bdbcd5d4

12 Oct, 2023 1 commit
- Add many missing spaces in adjacent strings (#26751) · 40ea9ab2
  Tom Aarsen authored Oct 12, 2023
```
Add missing spaces in adjacent strings
```
  40ea9ab2
03 Oct, 2023 1 commit

Add tokenizer kwargs to fill mask pipeline. (#26234) · b5ca8fcd

Nathan Cahill authored Oct 03, 2023



* add tokenizer kwarg inputs

* Adding tokenizer_kwargs to _sanitize_parameters

* Add truncation=True example to tests

* Update test_pipelines_fill_mask.py

* Update test_pipelines_fill_mask.py

* make fix-copies and make style

* Update fill_mask.py

Replace single tick with double

* make fix-copies

* Style

---------
Co-authored-by: Lysandre <lysandre@huggingface.co>

b5ca8fcd

25 Sep, 2023 1 commit

Update tiny model information and pipeline tests (#26285) · d9e4bc28

Yih-Dar authored Sep 25, 2023



* Update tiny model summary file

* add to pipeline tests

* revert

* fix import

* fix import

* fix

* fix

* update

* update

* update

* fix

* remove BarkModelTest

* fix

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

d9e4bc28

22 Sep, 2023 1 commit

Add image to image pipeline (#25393) · 576cd45a

LeviVasconcelos authored Sep 22, 2023



* Add image to image pipeline

Add image to image pipeline

* remove swin2sr from tf auto

* make ImageToImage importable

* make style

make style

make style

make style

* remove tf support

* remove nonused imports

* fix postprocessing

* add important comments; add unit tests

* add documentation

* remove support for TF

* make fixup

* fix typehint Image.Image

* fix documentation code

* address review request; fix unittest type checking

* address review request; fix unittest type checking

* make fixup

* address reviews

* Update src/transformers/pipelines/image_to_image.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* enhance docs

* make style

* make style

* improve docetest time

* improve docetest time

* Update tests/pipelines/test_pipelines_image_to_image.py
Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>

* Update tests/pipelines/test_pipelines_image_to_image.py
Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>

* make fixup

* undo faulty merge

* undo faulty merge

* add image-to-image to test pipeline mixin

* Update src/transformers/pipelines/image_to_image.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update tests/pipelines/test_pipelines_image_to_image.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* improve docs

---------
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

576cd45a

18 Sep, 2023 2 commits

🚨

[`Tokenizer`] attemp to fix add_token issues

🚨

(#23909) · 2da88537

Arthur authored Sep 18, 2023



* fix test for bart. Order is correct now let's skip BPEs

* ouf

* styling

* fix bert....

* slow refactoring

* current updates

* massive refactoring

* update

* NICE!

* update to see where I am at

* updates

* update

* update

* revert

* updates

* updates

* start supporting legacy_save

* styling

* big update

* revert some changes

* nits

* nniiiiiice

* small fixes

* kinda fix t5 with new behaviour

* major update

* fixup

* fix copies

* today's updates

* fix byt5

* upfate

* update

* update

* updates

* update vocab size test

* Barthez does not use not need the fairseq offset ids

* super calll must be after

* calll super

* move all super init

* move other super init

* fixup

* nits

* more fixes

* nits

* more fixes

* nits

* more fix

* remove useless files

* ouch all of them are affected

* and more!

* small imporvements

* no more sanitize token

* more changes around unique no split tokens

* partially fix more things

* keep legacy save but add warning

* so... more fixes

* updates

* guess deberta tokenizer could be nuked

* fixup

* fixup did some bad things

* nuke it if it breaks

* remove prints and pretrain fast from slow with new format.

* fixups

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* fiou

* nit

* by default specials should not be normalized?

* update

* remove brakpoint

* updates

* a lot of updates

* fixup

* fixes revert some changes to match fast

* small nits

* that makes it cleaner

* fix camembert accordingly

* update

* some lest breaking changes

* update

* fixup

* fix byt5 and whisper mostly

* some more fixes, canine's byte vocab

* fix gpt2

* fix most of the perceiver tests (4 left)

* fix layout lmv3

* fixup

* fix copies for gpt2 style

* make sure to only warn once

* fix perciever and gpt2 tests

* some more backward compatibility: also read special tokens map because some ppl use it........////.....

* fixup

* add else when reading

* nits

* fresh updates

* fix copies

* will this make everything faster?

* fixes

* more fixes

* update

* more fixes

* fixup

* is the source of truth right?

* sorry camembert for the troubles

* current updates

* fixup

* update led

* update

* fix regression

* fix single word

* more model specific fixes

* fix t5 tests

* fixup

* more comments

* update

* fix nllb

* rstrip removed

* small fixes

* better handle additional_special_tokens and vocab sizes

* fixing

* styling

* fix 4 / 21

* fixup

* fix nlbb's tests

* some fixes

* fix t5

* fixes

* style

* fix canine tests

* damn this is nice

* nits

* m2m100 nit

* fixups

* fixes!

* fixup

* stash

* fix merge

* revert bad change

* fixup

* correct order for code Llama

* fix speecht5 post merge

* styling

* revert source of 11 fails

* small nits

* all changes in one go

* fnet hack

* fix 2 more tests

* update based on main branch of tokenizers

* fixup

* fix VITS issues

* more fixes

* fix mgp test

* fix camembert issues

* oups camembert still has 2 failing tests

* mluke fixes

* decode fixes

* small nits

* nits

* fix llama and vits

* fix camembert

* smal nits

* more fixes when initialising a fast from a slow and etc

* fix one of the last test

* fix CPM tokenizer test

* fixups

* fix pop2piano

* fixup

* ⚠️ Change tokenizers required version ⚠️

* ⚠️ Change tokenizers required version ⚠️

* "tokenizers>=0.14,<0.15", don't forget smaller than

* fix musicgen tests and pretraiendtokenizerfast

* fix owlvit and all

* update t5

* fix 800 red

* fix tests

* fix the fix of the fix of t5

* styling

* documentation nits

* cache _added_tokens_encoder

* fixups

* Nit

* fix red tests

* one last nit!

* make eveything a lot simpler

* Now it's over 😉



* few small nits

* Apply suggestions from code review
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* updates that work for now

* tests that should no be skipped / changed and fixed next

* fixup

* i am ashamed

* pushe the fix

* update

* fixups

* nits

* fix added_tokens_encoder

* fix canine test

* fix pegasus vocab

* fix transfoXL

* fixup

* whisper needs to be fixed for train new

* pegasus nits

* more pegasus fixes

* minor update

* better error message in failed test

* fix whisper failing test

* fix whisper failing test

* fix pegasus

* fixup

* fix **** pegasus

* reset things

* remove another file

* attempts to fix the strange custome encoder and offset

* nits here and there

* update

* fixup

* nit

* fix the whisper test

* nits nits

* Apply suggestions from code review
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* updates based on review

* some small update to potentially remove

* nits

* import rlu cache

* Update src/transformers/tokenization_utils_base.py
Co-authored-by: Lysandre Debut <hi@lysand.re>

* move warning to `from_pretrained`

* update tests results now that the special tokens are always added

---------
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: Lysandre Debut <hi@lysand.re>

2da88537

Fix ConversationalPipeline tests (#26217) · f0a6057f

Matt authored Sep 18, 2023

Add BlenderbotSmall templates and correct handling for conversation.past_user_inputs

f0a6057f

14 Sep, 2023 3 commits

[Whisper] Fix word-level timestamps for audio < 30 seconds (#25607) · 95fe0f5d

Joshua Lochner authored Sep 14, 2023



* Fix word-level timestamps for audio < 30 seconds

* Fix code quality

* fix unit tests

* Fix unit tests

* Fix unit test

* temp: print out result

* temp: set max diff to None

* fix unit tests

* fix typo

* Fix typo
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Use generation config for `num_frames`

* fix docs

* Move `num_frames` to kwargs

* compute stride/attn_mask once

* mark test as slow

---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: sanchit-gandhi <sanchit@huggingface.co>

95fe0f5d

[MusicGen] Add sampling rate to config (#26136) · 44a0490d

Sanchit Gandhi authored Sep 14, 2023



* [MusicGen] Add sampling rate to config

* remove tiny

* make property

* Update tests/pipelines/test_pipelines_text_to_audio.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* style

---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

44a0490d

Overhaul Conversation class and prompt templating (#25323) · 866df66f

Matt authored Sep 14, 2023



* First commit while I figure this out

* make fixup

* Remove unused method

* Store prompt attrib

* Fix prompt argument for tests

* Make same changes in fast tokenizer

* Remove global prompts from fast tokenizer too

* stash commit

* stash commit

* Migrate PromptConfig to its True Final Location

* Replace Conversation entirely with the new class

* Import/dependency fixes

* Import/dependency fixes

* Change format for lots of default prompts

* More default prompt fixups

* Revert llama old methods so we can compare

* Fix some default configs

* Fix some default configs

* Fix misspelled kwarg

* Fixes for Blenderbot

* make fixup

* little rebase cleanup

* Add basic documentation

* Quick doc fix

* Truncate docstring for now

* Add handling for the case when messages is a single string

* Quick llama merges

* Update conversational pipeline and tests

* Add a couple of legacy properties for backward compatibility

* More legacy handling

* Add docstring for build_conversation_input_ids

* Restructure PromptConfig

* Let's start T E M P L A T I N G

* Refactor all default configs to use templates instead

* Revert changes to the special token properties since we don't need them anymore

* More class templates

* Make the sandbox even sandier

* Everything replaced with pure templating

* Remove docs for PromptConfig

* Add testing and optional requirement boilerplate

* Fix imports and make fixup

* Fix LLaMA tests and add Conversation docstring

* Finally get LLaMA working with the template system

* Finally get LLaMA working with the template system

* make fixup

* make fixup

* fmt-off for the long lists of test tokens

* Rename method to apply_chat_template for now

* Start on documentation

* Make chat_template a property that reads through to the default if it's not set

* Expand docs

* Expand chat templating doc some more

* trim/lstrip blocks by default and update doc

* Few doc tweaks

* rebase cleanup

* Clarify docstring

* rebase cleanup

* rebase cleanup

* make fixup

* Quick doc edit

* Reformat the standard template to match ChatML

* Re-add PEFT check

* Update docs/source/en/chat_templating.md
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Add apply_chat_template to the tokenizer doc

* make fixup

* Add doc links

* Fix chat links

* Fix chat links

* Explain system messages in the doc

* Add chat template test

* Proper save-loading for chat template attribute

* Add test skips for layout models

* Remove _build_conversation_input_ids, add default_chat_template to code_llama

* Make sure all LLaMA models are using the latest template

* Remove default_system_prompt block in code_llama because it has no default prompt

* Update ConversationPipeline preprocess

* Add correct #Copied from links to the default_chat_templates

* Remove unneeded type checking line

* Add a dummy mark_processsed method

* Reorganize Conversation to have **deprecated_kwargs

* Update chat_templating.md

* Quick fix to LLAMA tests

* Small doc tweaks

* Add proper docstrings and "copied from" statements to all default chat templates

* Merge use_default_system_prompt support for code_llama too

* Improve clarity around self.chat_template

* Docstring fix

* Fix blenderbot default template

* More doctest fix

* Break out some tokenizer kwargs

* Update doc to explain default templates

* Quick tweaks to tokenizer args

* Cleanups for tokenizer args

* Add note about cacheing

* Quick tweak to the chat-templating doc

* Update the LLaMA template with error checking and correct system message embedding

* make fixup

* make fixup

* add requires_jinja

* Cleanup to expected output formatting

* Add cacheing

* Fix typo in llama default template

* Update LLaMA tests

* Update documentation

* Improved legacy handling in the Conversation class

* Update Jinja template with proper error handling

* Quick bugfix

* Proper exception raising

* Change cacheing behaviour so it doesn't try to pickle an entire Jinja env

* make fixup

* rebase cleanup

---------
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

866df66f

05 Sep, 2023 2 commits

[`CI`] Fix red CI and ERROR failed should show (#25995) · d0354e5e

Arthur authored Sep 05, 2023

* start with error too

* fix ?

* start with nit

* one more path

* use `job_name`

* mark pipeline test as slow

d0354e5e

[Wav2Vec2 Conformer] Fix inference float16 (#25985) · 8d518013
Sanchit Gandhi authored Sep 05, 2023
```
* [Wav2Vec2 Conformer] Fix inference float16

* fix test

* fix test more

* clean pipe test
```
8d518013

01 Sep, 2023 1 commit

[VITS] Add to TTA pipeline (#25906) · b439129e

Sanchit Gandhi authored Sep 01, 2023



* [VITS] Add to TTA pipeline

* Update tests/pipelines/test_pipelines_text_to_audio.py
Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com>

* remove extra spaces

---------
Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com>

b439129e

31 Aug, 2023 1 commit
- Save image_processor while saving pipeline (ImageSegmentationPipeline) (#25884) · 2be8a909
  raghavanone authored Aug 31, 2023
```
* Save image_processor while saving pipeline (ImageSegmentationPipeline)

* Fix black issues
```
  2be8a909
30 Aug, 2023 1 commit

Add Blip2 model in VQA pipeline (#25532) · 09dc9951

Juan Pizarro authored Aug 30, 2023

* Add Blip2 model in VQA pipeline

* use require_torch_gpu for test_large_model_pt_blip2

* use can_generate in vqa pipeline

* test Blip2ForConditionalGeneration using float16

* remove custom can_generate from Blip2ForConditionalGeneration

09dc9951

24 Aug, 2023 1 commit
- [ASR Pipe Test] Fix CTC timestamps error message (#25727) · 02188768
  Sanchit Gandhi authored Aug 24, 2023
  
  02188768
18 Aug, 2023 1 commit

[`Llama`] remove prompt and fix prefix finetuning (#25565) · bc3e20dc

Arthur authored Aug 18, 2023

* nit

* update

* make sure use_default_system_prompt is saved

* update checkpointing

* consistency

* use_default_system_prompt for test

bc3e20dc

17 Aug, 2023 1 commit

Add Text-To-Speech pipeline (#24952) · b8f69d0d

Yoach Lacombe authored Aug 17, 2023



* add AutoModelForTextToSpeech class

* add TTS pipeline and tessting

* add docstrings to text_to_speech pipeline

* fix torch dependency

* corrector 'processor is None' case in Pipeline

* correct repo id

* modify text-to-speech -> text-to-audio

* remove processor

* rename text_to_speech pipelines files to text_audio

* add textToWaveform and textToSpectrogram instead of textToAudio classes

* update TTS pipeline to the bare minimum

* update tests TTS pipeline

* make style and erase useless import torch in TTS pipeline tests

* modify how to check if generate or forward in TTS pipeline

* remove unnecessary extra new lines

* Apply suggestions from code review
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

* refactor input_texts -> text_inputs

* correct docstrings of TTS.__call__

* correct the shape of generated waveform

* take care of Bark tokenizer special case

* correct run_pipeline_test TTS

* make style

* update TTS docstrings

* address Sylvain nit refactors

* make style

* refactor into one liners

* correct squeeze

* correct way to test if forward or generate

* Update output audio waveform shape

* make style

* correct import

* modify how the TTS pipeline test if a model can generate

* align shape output of TTS pipeline with consistent shape

---------
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

b8f69d0d

16 Aug, 2023 1 commit

[ASR Pipeline] Fix init with timestamps (#25438) · 36f183eb

Sanchit Gandhi authored Aug 16, 2023

* [ASR Pipeline] Fix init

* refactor test

* change default kwarg setting

* only perform checks if we have to

* override init

* move pre/forward/post checks to sanitize

36f183eb

08 Aug, 2023 1 commit

[ASR Pipeline] Clarify return timestamps (#25344) · dedd1116

Sanchit Gandhi authored Aug 08, 2023

* [ASR Pipeline] Clarify return timestamps

* fix indentation

* fix ctc check

* fix ctc error message!

* fix test

* fix other test

* add new tests

* final comment

dedd1116

28 Jul, 2023 1 commit

🚨

Fix rescale ViVit Efficientnet (#25174) · 05cda5df

amyeroberts authored Jul 28, 2023

* Fix rescaling bug

* Add tests

* Update integration tests

* Fix up

* Update src/transformers/image_transforms.py

* Update test - new possible order in list

05cda5df

18 Jul, 2023 2 commits

[`Llama2`] Add support for Llama 2 (#24891) · 07360b6c

Arthur authored Jul 18, 2023



* add llama

* add other readmes

* update padding id in readme

* add link to paper

* fix paths and tokenizer

* more nits

* styling

* fit operation in 2 lines when possible

* nits

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* add form

* update reademe

* update readme, we don't have a default pad token

* update test and tokenization

* LLaMA instead of Llama

* nits

* add expected text

* add greeedy output

* styling

* Update src/transformers/models/llama/modeling_llama.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* sequential device map

* skip relevant changes

---------
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

07360b6c

Enable `ZeroShotAudioClassificationPipelineTests::test_small_model_pt` (#24882) · 57da42ad
Yih-Dar authored Jul 18, 2023
```
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
57da42ad

17 Jul, 2023 1 commit
- Skip failing `ZeroShotAudioClassificationPipelineTests::test_small_model_pt` for now (#24867) · 870dfc15
  Yih-Dar authored Jul 17, 2023
```
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
  870dfc15
13 Jul, 2023 1 commit

Llama/GPTNeoX: add RoPE scaling (#24653) · 34d94094

Joao Gante authored Jul 13, 2023



* add rope_scaling

* tmp commit

* add gptneox

* add tests

* GPTNeoX can now handle long inputs, so the pipeline test was wrong

* Update src/transformers/models/open_llama/configuration_open_llama.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* remove ntk

* remove redundant validation

---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

34d94094

26 Jun, 2023 1 commit

[`pipeline`] Fix str device issue (#24396) · 914289ac

Younes Belkada authored Jun 26, 2023



* fix str device issue

* fixup

* adapt from suggestions

* forward contrib credits from suggestions

* better fix

* added backward compatibility for older PT versions

* final fixes

* oops

* Attempting something with less branching.

---------
Co-authored-by: amyeroberts <amyeroberts@users.noreply.github.com>
Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>

914289ac

23 Jun, 2023 1 commit

Allow dict input for audio classification pipeline (#23445) · 8767958f

Sanchit Gandhi authored Jun 23, 2023



* Allow dict input for audio classification pipeline

* make style

* Empty commit to trigger CI

* Empty commit to trigger CI

* check for torchaudio

* add pip instructions
Co-authored-by: Sylvain <sylvain.gugger@gmail.com>

* Update src/transformers/pipelines/audio_classification.py
Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>

* asr -> audio class

* asr -> audio class

---------
Co-authored-by: Sylvain <sylvain.gugger@gmail.com>
Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>

8767958f

22 Jun, 2023 1 commit
- Skip `test_conditional_generation_pt_pix2struct` in Past CI (torch < 1.11) (#24417) · 652ece07
  Yih-Dar authored Jun 22, 2023
```
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
  652ece07
21 Jun, 2023 1 commit

add word-level timestamps to Whisper (#23205) · cd927a47

Matthijs Hollemans authored Jun 21, 2023

* let's go!

* initial implementation of token-level timestamps

* only return a single timestamp per token

* remove token probabilities

* fix return type

* fix doc comment

* strip special tokens

* rename

* revert to not stripping special tokens

* only support models that have alignment_heads

* add integration test

* consistently name it token-level timestamps

* small DTW tweak

* initial support for ASR pipeline

* fix pipeline doc comments

* resolve token timestamps in pipeline with chunking

* change warning when no final timestamp is found

* return word-level timestamps

* fixup

* fix bug that skipped final word in each chunk

* fix failing unit tests

* merge punctuations into the words

* also return word tokens

* also return token indices

* add (failing) unit test for combine_tokens_into_words

* make combine_tokens_into_words private

* restore OpenAI's punctuation rules

* add pipeline tests

* make requested changes

* PR review changes

* fix failing pipeline test

* small stuff from PR

* only return words and their timestamps, not segments

* move alignment_heads into generation config

* forgot to set alignment_heads in pipeline tests

* tiny comment fix

* grr

cd927a47

20 Jun, 2023 1 commit
- Update tiny models for pipeline testing. (#24364) · c23d131e
  Yih-Dar authored Jun 20, 2023
```
* fix

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
  c23d131e