1. 15 Jan, 2024 1 commit
  2. 11 Jan, 2024 2 commits
  3. 08 Jan, 2024 1 commit
    • NielsRogge's avatar
      Add SigLIP (#26522) · 3b742ea8
      NielsRogge authored
      
      
      * Add first draft
      
      * Use appropriate gelu function
      
      * More improvements
      
      * More improvements
      
      * More improvements
      
      * Convert checkpoint
      
      * More improvements
      
      * Improve docs, remove print statements
      
      * More improvements
      
      * Add link
      
      * remove unused masking function
      
      * begin tokenizer
      
      * do_lower_case
      
      * debug
      
      * set split_special_tokens=True
      
      * Remove script
      
      * Fix style
      
      * Fix rebase
      
      * Use same design as CLIP
      
      * Add fast tokenizer
      
      * Add SiglipTokenizer to init, remove extra_ids
      
      * Improve conversion script
      
      * Use smaller inputs in conversion script
      
      * Update conversion script
      
      * More improvements
      
      * Add processor to conversion script
      
      * Add tests
      
      * Remove print statements
      
      * Add tokenizer tests
      
      * Fix more tests
      
      * More improvements related to weight initialization
      
      * More improvements
      
      * Make more tests pass
      
      * More improvements
      
      * More improvements
      
      * Add copied from
      
      * Add canonicalize_text
      
      * Enable fast tokenizer tests
      
      * More improvements
      
      * Fix most slow tokenizer tests
      
      * Address comments
      
      * Fix style
      
      * Remove script
      
      * Address some comments
      
      * Add copied from to tests
      
      * Add more copied from
      
      * Add more copied from
      
      * Add more copied from
      
      * Remove is_flax_available
      
      * More updates
      
      * Address comment
      
      * Remove SiglipTokenizerFast for now
      
      * Add caching
      
      * Remove umt5 test
      
      * Add canonicalize_text inside _tokenize, thanks Arthur
      
      * Fix image processor tests
      
      * Skip tests which are not applicable
      
      * Skip test_initialization
      
      * More improvements
      
      * Compare pixel values
      
      * Fix doc tests, add integration test
      
      * Add do_normalize
      
      * Remove causal mask and leverage ignore copy
      
      * Fix attention_mask
      
      * Fix remaining tests
      
      * Fix dummies
      
      * Rename temperature and bias
      
      * Address comments
      
      * Add copied from to tokenizer tests
      
      * Add SiglipVisionModel to auto mapping
      
      * Add copied from to image processor tests
      
      * Improve doc
      
      * Remove SiglipVisionModel from index
      
      * Address comments
      
      * Improve docs
      
      * Simplify config
      
      * Add first draft
      
      * Make it like mistral
      
      * More improvements
      
      * Fix attention_mask
      
      * Fix output_attentions
      
      * Add note in docs
      
      * Convert multilingual model
      
      * Convert large checkpoint
      
      * Convert more checkpoints
      
      * Add pipeline support, correct image_mean and image_std
      
      * Use padding=max_length by default
      
      * Make processor like llava
      
      * Add code snippet
      
      * Convert more checkpoints
      
      * Set keep_punctuation_string=None as in OpenCLIP
      
      * Set normalized=False for special tokens
      
      * Fix doc test
      
      * Update integration test
      
      * Add figure
      
      * Update organization
      
      * Happy new year
      
      * Use AutoModel everywhere
      
      ---------
      Co-authored-by: default avatarpatil-suraj <surajp815@gmail.com>
      3b742ea8
  4. 05 Jan, 2024 1 commit
  5. 22 Dec, 2023 1 commit
  6. 15 Dec, 2023 1 commit
  7. 14 Dec, 2023 1 commit
    • Matt's avatar
      Proper build() methods for TF (#27794) · 050e0b44
      Matt authored
      * Add a convenience method for building in your own name scope
      
      * Second attempt at auto layer building
      
      * Revert "Second attempt at auto layer building"
      
      This reverts commit e03a3aaecf9ec41a805582b83cbdfe3290a631be.
      
      * Attempt #3
      
      * Revert "Attempt #3"
      
      This reverts commit b9df7a0857560d29b5abbed6127d9e9eca77cf47.
      
      * Add missing attributes that we're going to need later
      
      * Add some attributes we're going to need later
      
      * A fourth attempt! Feel the power flow through you!
      
      * Revert "A fourth attempt! Feel the power flow through you!"
      
      This reverts commit 6bf4aaf3875d6f28485f50187617a4c616c8aff7.
      
      * Add more values we'll need later
      
      * TF refactor that we'll need later
      
      * Revert "TF refactor that we'll need later"
      
      This reverts commit ca07202fb5b7b7436b893baa8d688b4f348ea7b9.
      
      * Revert "Revert "TF refactor that we'll need later""
      
      This reverts commit 1beb0f39f293ed9c27594575e1c849aadeb15c13.
      
      * make fixup
      
      * Attempt five!
      
      * Revert "Attempt five!"
      
      This reverts commit 3302207958dfd0374b0447a51c06eea51a506044.
      
      * Attempt six - this time don't add empty methods
      
      * Revert "Attempt six - this time don't add empty methods"
      
      This reverts commit 67d60129be75416b6beb8f47c7d38d77b18d79bb.
      
      * Attempt seven - better base model class detection!
      
      * Revert "Attempt seven - better base model class detection!"
      
      This reverts commit 5f14845e92ea0e87c598da933bfbfee10f553bc9.
      
      * Another attribute we'll need later
      
      * Try again with the missing attribute!
      
      * Revert "Try again with the missing attribute!"
      
      This reverts commit 760c6f30c5dffb3e04b0e73c34a77d1882a0fef7.
      
      * This is the attempt that will pierce the heavens!
      
      * Revert "This is the attempt that will pierce the heavens!"
      
      This reverts commit c868bb657de057aca7a5260350a3f831fc4dfee6.
      
      * Attempt seven - snag list is steadily decreasing
      
      * Revert "Attempt seven - snag list is steadily decreasing"
      
      This reverts commit 46fbd975deda64429bfb3e5fac4fc0370c00d316.
      
      * Attempt eight - will an empty snag list do it?
      
      * Revert "Attempt eight - will an empty snag list do it?"
      
      This reverts commit 7c8a3c2b083253649569e9877e02054ae5cec67b.
      
      * Fixes to Hubert issues that cause problems later
      
      * Trying again with Conv1D/SeparableConv fixes
      
      * Revert "Trying again with Conv1D/SeparableConv fixes"
      
      This reverts commit 55092bca952bc0f750aa1ffe246a640bf1e2036e.
      
      * Apply the build shape fixes to Wav2Vec2 as well
      
      * One more attempt!
      
      * Revert "One more attempt!"
      
      This reverts commit 5ac3e4cb01b9458cc93312873725f9444ae7261c.
      
      * Another attempt!
      
      * Revert "Another attempt!"
      
      This reverts commit ea16d890e019d7de8792a3b8e72f3b1c02adae50.
      
      * Let's see how many failures we get without the internal build method
      
      * Fix OpenAI
      
      * Fix MobileBERT
      
      * (Mostly) fix GroupVIT
      
      * Fix BLIP
      
      * One more BLIP fix
      
      * One more BLIP fix!
      
      * Fix Regnet
      
      * Finally fully fix GroupViT
      
      * Fix Data2Vec and add the new AdaptivePool
      
      * Fix Segformer
      
      * Fix Albert
      
      * Fix Deberta/DebertaV2
      
      * Fix XLM
      
      * Actually fix XLM
      
      * Fix Flaubert
      
      * Fix lxmert
      
      * Fix Resnet
      
      * Fix ConvBERT
      
      * Fix ESM
      
      * Fix Convnext / ConvnextV2
      
      * Fix SAM
      
      * Fix Efficientformer
      
      * Fix LayoutLMv3
      
      * Fix speech_to_text
      
      * Fix mpnet and mobilevit
      
      * Fix Swin
      
      * Fix CTRL
      
      * Fix CVT
      
      * Fix DPR
      
      * Fix Wav2Vec2
      
      * Fix T5
      
      * Fix Hubert
      
      * Fix GPT2
      
      * Fix Whisper
      
      * Fix DeiT
      
      * Fix the encoder-decoder / dual-encoder classes
      
      * make fix-copies
      
      * build in name scope
      
      * Fix summarization test
      
      * Fix tied weight names for BART + Blenderbot
      
      * Fix tied weight name building
      
      * Fix to TFESM weight building
      
      * Update TF SAM
      
      * Expand all the shapes out into Big Boy Shapes
      050e0b44
  8. 08 Dec, 2023 1 commit
  9. 07 Dec, 2023 1 commit
    • Younes Belkada's avatar
      [`Llava`] Add Llava to transformers (#27662) · 44b5506d
      Younes Belkada authored
      * add model like
      
      * logits match
      
      * minor fixes
      
      * fixes
      
      * up
      
      * up
      
      * add todo
      
      * llava processor
      
      * keep the processor simple
      
      * add conversion script
      
      * fixup
      
      * fix copies
      
      * up
      
      * add to index
      
      * fix config + logits
      
      * fix
      
      * refactor
      
      * more refactor
      
      * more refactor
      
      * fix copies
      
      * add authors
      
      * v1 tests
      
      * add `LlavaProcessor` in init
      
      * remove unneeded import
      
      * up
      
      * up
      
      * docs
      
      * up
      
      * fix CI
      
      * fix CI
      
      * add attention  mask in test
      
      * make fixup
      
      * remove the vision model
      
      * that' s the dirty way to do it
      
      * nits
      
      * nits
      
      * updates
      
      * add more tests
      
      * add input tests
      
      * fixup
      
      * more styling
      
      * nits
      
      * updates amd cleanup
      
      * fixup the generation expected results
      
      * fix the testing script
      
      * some cleanup and simplification which does not work yet but almost there!
      
      * make correct dispatch operations
      
      * vectorize works for batch of images and text
      
      * last todos
      
      * nits
      
      * update test and modeling code
      
      * remove useless function for now
      
      * fix few issues
      
      * fix generation
      
      * some nits
      
      * add bakllava
      
      * nits
      
      * remove duplicated code
      
      * finis merge
      
      * cleanup
      
      * missed this line
      
      * fill the todos
      
      * add left padding offset
      
      * add left and rignt padding logic
      
      * bool to properly index
      
      * make sure
      
      * more cleanups
      
      * batch is fixed 😉
      
      
      
      * add correct device for tensor creation
      
      * fix some dtype missmatch
      
      * ruff
      
      * update conversion script
      
      * Update src/transformers/__init__.py
      
      * fa 2 support + fix conversion script
      
      * more
      
      * correct reshaping
      
      * fix test dict
      
      * fix copies by ignoring
      
      * fix nit
      
      * skip clip vision model
      
      * fixup
      
      * fixup
      
      * LlavaForVisionText2Text -> LlavaForCausalLM
      
      * update
      
      * fix
      
      * raise correct errors
      
      * fix
      
      * docs
      
      * nuke for now
      
      * nits here and there
      
      * fixup
      
      * fix remaining tests
      
      * update LlavaForConditionalGeneration instead of CausalLM
      
      * fixups
      
      * pipeline support
      
      * slow and piepline tests
      
      * supports batch
      
      * nits
      
      * cleanup
      
      * fix first integration tests
      
      * add pad token where needed
      
      * correct etsts
      
      * fixups
      
      * update pipeline testr
      
      * fix quality
      
      * nits
      
      * revert unneeded change
      
      * nit
      
      * use BatchFeature
      
      * from ...feature_extraction_utils import BatchFeature
      
      * nits
      
      * nits
      
      * properly update
      
      * more f*** nits
      
      * fix copies
      
      * comment
      
      * keep slow test slow
      
      * Update src/transformers/models/llava/processing_llava.py
      Co-authored-by: default avatarArthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * add piepline example
      
      * add pixel values in docstrign
      
      * update pr doctest
      
      * fix
      
      * fix slow tests
      
      * remove hack
      
      * fixup
      
      * small note
      
      * forward contrib credits from PR25789
      
      * forward contrib credits from original implementation and work
      
      * add arthur
      
      * Update src/transformers/models/llava/processing_llava.py
      Co-authored-by: default avatarLysandre Debut <hi@lysand.re>
      
      * update docstring
      
      * nit
      
      * move to not doctested because of timeout issues
      
      * fixup
      
      * add description
      
      * more
      
      * fix-copies
      
      * fix docs
      
      * add beam search
      
      * add more comments
      
      * add typehints on processor
      
      * add speedup plot
      
      * update slow tests and docs
      
      * push test
      
      * push batched test
      
      * fix batched generation with different number of images
      
      * remove benchmark due to a bug
      
      * fix test
      
      * fix copies
      
      * add gcolab demo
      
      ---------
      Co-authored-by: default avatarArthur Zucker <arthur.zucker@gmail.com>
      Co-authored-by: default avatarArthur <48595927+ArthurZucker@users.noreply.github.com>
      Co-authored-by: default avatarshauray8 <shauray8@users.noreply.github.com>
      Co-authored-by: default avatarhaotian-liu <haotian-liu@users.noreply.github.com>
      Co-authored-by: default avatarLysandre Debut <hi@lysand.re>
      44b5506d
  10. 04 Dec, 2023 1 commit
  11. 23 Nov, 2023 1 commit
  12. 22 Nov, 2023 1 commit
    • Patrick von Platen's avatar
      [Whisper] Add sequential longform decoding (#27492) · 4151fbb4
      Patrick von Platen authored
      * [Whisper] Add seq gen
      
      * [Whisper] Add seq gen
      
      * more debug
      
      * Fix whisper logit processor
      
      * Improve whisper code further
      
      * Fix more
      
      * more debug
      
      * more debug
      
      * Improve further
      
      * Add tests
      
      * Prep for batch size > 1
      
      * Get batch_size>1 working
      
      * Correct more
      
      * Add extensive tests
      
      * more debug
      
      * more debug
      
      * more debug
      
      * add more tests
      
      * more debug
      
      * Apply suggestions from code review
      
      * more debug
      
      * add comments to explain the code better
      
      * add comments to explain the code better
      
      * add comments to explain the code better
      
      * Add more examples
      
      * add comments to explain the code better
      
      * fix more
      
      * add comments to explain the code better
      
      * add comments to explain the code better
      
      * correct
      
      * correct
      
      * finalize
      
      * Apply suggestions from code review
      
      * Apply suggestions from code review
      4151fbb4
  13. 16 Nov, 2023 2 commits
    • Arthur's avatar
      [`Styling`] stylify using ruff (#27144) · 651408a0
      Arthur authored
      
      
      * try to stylify using ruff
      
      * might need to remove these changes?
      
      * use ruf format andruff check
      
      * use isinstance instead of type comparision
      
      * use # fmt: skip
      
      * use # fmt: skip
      
      * nits
      
      * soem styling changes
      
      * update ci job
      
      * nits isinstance
      
      * more files update
      
      * nits
      
      * more nits
      
      * small nits
      
      * check and format
      
      * revert wrong changes
      
      * actually use formatter instead of checker
      
      * nits
      
      * well docbuilder is overwriting this commit
      
      * revert notebook changes
      
      * try to nuke docbuilder
      
      * style
      
      * fix feature exrtaction test
      
      * remve `indent-width = 4`
      
      * fixup
      
      * more nits
      
      * update the ruff version that we use
      
      * style
      
      * nuke docbuilder styling
      
      * leve the print for detected changes
      
      * nits
      
      * Remove file I/O
      Co-authored-by: default avatarcharliermarsh <charlie.r.marsh@gmail.com>
      
      * style
      
      * nits
      
      * revert notebook changes
      
      * Add # fmt skip when possible
      
      * Add # fmt skip when possible
      
      * Fix
      
      * More `  # fmt: skip` usage
      
      * More `  # fmt: skip` usage
      
      * More `  # fmt: skip` usage
      
      * NIts
      
      * more fixes
      
      * fix tapas
      
      * Another way to skip
      
      * Recommended way
      
      * Fix two more fiels
      
      * Remove asynch
      Remove asynch
      
      ---------
      Co-authored-by: default avatarcharliermarsh <charlie.r.marsh@gmail.com>
      651408a0
    • Lucain's avatar
      Set `usedforsecurity=False` in hashlib methods (FIPS compliance) (#27483) · fd65aa98
      Lucain authored
      * Set usedforsecurity=False in hashlib methods (FIPS compliance)
      
      * trigger ci
      
      * tokenizers version
      
      * deps
      
      * bump hfh version
      
      * let's try this
      fd65aa98
  14. 14 Nov, 2023 1 commit
  15. 09 Nov, 2023 1 commit
  16. 07 Nov, 2023 1 commit
  17. 02 Nov, 2023 1 commit
  18. 31 Oct, 2023 3 commits
  19. 19 Oct, 2023 1 commit
  20. 12 Oct, 2023 1 commit
  21. 03 Oct, 2023 1 commit
  22. 25 Sep, 2023 1 commit
  23. 22 Sep, 2023 1 commit
  24. 18 Sep, 2023 2 commits
    • Arthur's avatar
      🚨🚨 🚨🚨 [`Tokenizer`] attemp to fix add_token issues🚨🚨 🚨🚨 (#23909) · 2da88537
      Arthur authored
      
      
      * fix test for bart. Order is correct now let's skip BPEs
      
      * ouf
      
      * styling
      
      * fix bert....
      
      * slow refactoring
      
      * current updates
      
      * massive refactoring
      
      * update
      
      * NICE!
      
      * update to see where I am at
      
      * updates
      
      * update
      
      * update
      
      * revert
      
      * updates
      
      * updates
      
      * start supporting legacy_save
      
      * styling
      
      * big update
      
      * revert some changes
      
      * nits
      
      * nniiiiiice
      
      * small fixes
      
      * kinda fix t5 with new behaviour
      
      * major update
      
      * fixup
      
      * fix copies
      
      * today's updates
      
      * fix byt5
      
      * upfate
      
      * update
      
      * update
      
      * updates
      
      * update vocab size test
      
      * Barthez does not use not need the fairseq offset ids
      
      * super calll must be after
      
      * calll super
      
      * move all super init
      
      * move other super init
      
      * fixup
      
      * nits
      
      * more fixes
      
      * nits
      
      * more fixes
      
      * nits
      
      * more fix
      
      * remove useless files
      
      * ouch all of them are affected
      
      * and more!
      
      * small imporvements
      
      * no more sanitize token
      
      * more changes around unique no split tokens
      
      * partially fix more things
      
      * keep legacy save but add warning
      
      * so... more fixes
      
      * updates
      
      * guess deberta tokenizer could be nuked
      
      * fixup
      
      * fixup did some bad things
      
      * nuke it if it breaks
      
      * remove prints and pretrain fast from slow with new format.
      
      * fixups
      
      * Apply suggestions from code review
      Co-authored-by: default avatarSylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * fiou
      
      * nit
      
      * by default specials should not be normalized?
      
      * update
      
      * remove brakpoint
      
      * updates
      
      * a lot of updates
      
      * fixup
      
      * fixes revert some changes to match fast
      
      * small nits
      
      * that makes it cleaner
      
      * fix camembert accordingly
      
      * update
      
      * some lest breaking changes
      
      * update
      
      * fixup
      
      * fix byt5 and whisper mostly
      
      * some more fixes, canine's byte vocab
      
      * fix gpt2
      
      * fix most of the perceiver tests (4 left)
      
      * fix layout lmv3
      
      * fixup
      
      * fix copies for gpt2 style
      
      * make sure to only warn once
      
      * fix perciever and gpt2 tests
      
      * some more backward compatibility: also read special tokens map because some ppl use it........////.....
      
      * fixup
      
      * add else when reading
      
      * nits
      
      * fresh updates
      
      * fix copies
      
      * will this make everything faster?
      
      * fixes
      
      * more fixes
      
      * update
      
      * more fixes
      
      * fixup
      
      * is the source of truth right?
      
      * sorry camembert for the troubles
      
      * current updates
      
      * fixup
      
      * update led
      
      * update
      
      * fix regression
      
      * fix single word
      
      * more model specific fixes
      
      * fix t5 tests
      
      * fixup
      
      * more comments
      
      * update
      
      * fix nllb
      
      * rstrip removed
      
      * small fixes
      
      * better handle additional_special_tokens and vocab sizes
      
      * fixing
      
      * styling
      
      * fix 4 / 21
      
      * fixup
      
      * fix nlbb's tests
      
      * some fixes
      
      * fix t5
      
      * fixes
      
      * style
      
      * fix canine tests
      
      * damn this is nice
      
      * nits
      
      * m2m100 nit
      
      * fixups
      
      * fixes!
      
      * fixup
      
      * stash
      
      * fix merge
      
      * revert bad change
      
      * fixup
      
      * correct order for code Llama
      
      * fix speecht5 post merge
      
      * styling
      
      * revert source of 11 fails
      
      * small nits
      
      * all changes in one go
      
      * fnet hack
      
      * fix 2 more tests
      
      * update based on main branch of tokenizers
      
      * fixup
      
      * fix VITS issues
      
      * more fixes
      
      * fix mgp test
      
      * fix camembert issues
      
      * oups camembert still has 2 failing tests
      
      * mluke fixes
      
      * decode fixes
      
      * small nits
      
      * nits
      
      * fix llama and vits
      
      * fix camembert
      
      * smal nits
      
      * more fixes when initialising a fast from a slow and etc
      
      * fix one of the last test
      
      * fix CPM tokenizer test
      
      * fixups
      
      * fix pop2piano
      
      * fixup
      
      * ️ Change tokenizers required version ️
      
      * ️ Change tokenizers required version ️
      
      * "tokenizers>=0.14,<0.15", don't forget smaller than
      
      * fix musicgen tests and pretraiendtokenizerfast
      
      * fix owlvit and all
      
      * update t5
      
      * fix 800 red
      
      * fix tests
      
      * fix the fix of the fix of t5
      
      * styling
      
      * documentation nits
      
      * cache _added_tokens_encoder
      
      * fixups
      
      * Nit
      
      * fix red tests
      
      * one last nit!
      
      * make eveything a lot simpler
      
      * Now it's over 😉
      
      
      
      * few small nits
      
      * Apply suggestions from code review
      Co-authored-by: default avataramyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * updates that work for now
      
      * tests that should no be skipped / changed and fixed next
      
      * fixup
      
      * i am ashamed
      
      * pushe the fix
      
      * update
      
      * fixups
      
      * nits
      
      * fix added_tokens_encoder
      
      * fix canine test
      
      * fix pegasus vocab
      
      * fix transfoXL
      
      * fixup
      
      * whisper needs to be fixed for train new
      
      * pegasus nits
      
      * more pegasus fixes
      
      * minor update
      
      * better error message in failed test
      
      * fix whisper failing test
      
      * fix whisper failing test
      
      * fix pegasus
      
      * fixup
      
      * fix **** pegasus
      
      * reset things
      
      * remove another file
      
      * attempts to fix the strange custome encoder and offset
      
      * nits here and there
      
      * update
      
      * fixup
      
      * nit
      
      * fix the whisper test
      
      * nits nits
      
      * Apply suggestions from code review
      Co-authored-by: default avataramyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * updates based on review
      
      * some small update to potentially remove
      
      * nits
      
      * import rlu cache
      
      * Update src/transformers/tokenization_utils_base.py
      Co-authored-by: default avatarLysandre Debut <hi@lysand.re>
      
      * move warning to `from_pretrained`
      
      * update tests results now that the special tokens are always added
      
      ---------
      Co-authored-by: default avatarSylvain Gugger <35901082+sgugger@users.noreply.github.com>
      Co-authored-by: default avataramyeroberts <22614925+amyeroberts@users.noreply.github.com>
      Co-authored-by: default avatarLysandre Debut <hi@lysand.re>
      2da88537
    • Matt's avatar
      Fix ConversationalPipeline tests (#26217) · f0a6057f
      Matt authored
      Add BlenderbotSmall templates and correct handling for conversation.past_user_inputs
      f0a6057f
  25. 14 Sep, 2023 3 commits
    • Joshua Lochner's avatar
      [Whisper] Fix word-level timestamps for audio < 30 seconds (#25607) · 95fe0f5d
      Joshua Lochner authored
      
      
      * Fix word-level timestamps for audio < 30 seconds
      
      * Fix code quality
      
      * fix unit tests
      
      * Fix unit tests
      
      * Fix unit test
      
      * temp: print out result
      
      * temp: set max diff to None
      
      * fix unit tests
      
      * fix typo
      
      * Fix typo
      Co-authored-by: default avatarArthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * Use generation config for `num_frames`
      
      * fix docs
      
      * Move `num_frames` to kwargs
      
      * compute stride/attn_mask once
      
      * mark test as slow
      
      ---------
      Co-authored-by: default avatarArthur <48595927+ArthurZucker@users.noreply.github.com>
      Co-authored-by: default avatarsanchit-gandhi <sanchit@huggingface.co>
      95fe0f5d
    • Sanchit Gandhi's avatar
      [MusicGen] Add sampling rate to config (#26136) · 44a0490d
      Sanchit Gandhi authored
      
      
      * [MusicGen] Add sampling rate to config
      
      * remove tiny
      
      * make property
      
      * Update tests/pipelines/test_pipelines_text_to_audio.py
      Co-authored-by: default avatarArthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * style
      
      ---------
      Co-authored-by: default avatarArthur <48595927+ArthurZucker@users.noreply.github.com>
      44a0490d
    • Matt's avatar
      Overhaul Conversation class and prompt templating (#25323) · 866df66f
      Matt authored
      
      
      * First commit while I figure this out
      
      * make fixup
      
      * Remove unused method
      
      * Store prompt attrib
      
      * Fix prompt argument for tests
      
      * Make same changes in fast tokenizer
      
      * Remove global prompts from fast tokenizer too
      
      * stash commit
      
      * stash commit
      
      * Migrate PromptConfig to its True Final Location
      
      * Replace Conversation entirely with the new class
      
      * Import/dependency fixes
      
      * Import/dependency fixes
      
      * Change format for lots of default prompts
      
      * More default prompt fixups
      
      * Revert llama old methods so we can compare
      
      * Fix some default configs
      
      * Fix some default configs
      
      * Fix misspelled kwarg
      
      * Fixes for Blenderbot
      
      * make fixup
      
      * little rebase cleanup
      
      * Add basic documentation
      
      * Quick doc fix
      
      * Truncate docstring for now
      
      * Add handling for the case when messages is a single string
      
      * Quick llama merges
      
      * Update conversational pipeline and tests
      
      * Add a couple of legacy properties for backward compatibility
      
      * More legacy handling
      
      * Add docstring for build_conversation_input_ids
      
      * Restructure PromptConfig
      
      * Let's start T E M P L A T I N G
      
      * Refactor all default configs to use templates instead
      
      * Revert changes to the special token properties since we don't need them anymore
      
      * More class templates
      
      * Make the sandbox even sandier
      
      * Everything replaced with pure templating
      
      * Remove docs for PromptConfig
      
      * Add testing and optional requirement boilerplate
      
      * Fix imports and make fixup
      
      * Fix LLaMA tests and add Conversation docstring
      
      * Finally get LLaMA working with the template system
      
      * Finally get LLaMA working with the template system
      
      * make fixup
      
      * make fixup
      
      * fmt-off for the long lists of test tokens
      
      * Rename method to apply_chat_template for now
      
      * Start on documentation
      
      * Make chat_template a property that reads through to the default if it's not set
      
      * Expand docs
      
      * Expand chat templating doc some more
      
      * trim/lstrip blocks by default and update doc
      
      * Few doc tweaks
      
      * rebase cleanup
      
      * Clarify docstring
      
      * rebase cleanup
      
      * rebase cleanup
      
      * make fixup
      
      * Quick doc edit
      
      * Reformat the standard template to match ChatML
      
      * Re-add PEFT check
      
      * Update docs/source/en/chat_templating.md
      Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
      
      * Add apply_chat_template to the tokenizer doc
      
      * make fixup
      
      * Add doc links
      
      * Fix chat links
      
      * Fix chat links
      
      * Explain system messages in the doc
      
      * Add chat template test
      
      * Proper save-loading for chat template attribute
      
      * Add test skips for layout models
      
      * Remove _build_conversation_input_ids, add default_chat_template to code_llama
      
      * Make sure all LLaMA models are using the latest template
      
      * Remove default_system_prompt block in code_llama because it has no default prompt
      
      * Update ConversationPipeline preprocess
      
      * Add correct #Copied from links to the default_chat_templates
      
      * Remove unneeded type checking line
      
      * Add a dummy mark_processsed method
      
      * Reorganize Conversation to have **deprecated_kwargs
      
      * Update chat_templating.md
      
      * Quick fix to LLAMA tests
      
      * Small doc tweaks
      
      * Add proper docstrings and "copied from" statements to all default chat templates
      
      * Merge use_default_system_prompt support for code_llama too
      
      * Improve clarity around self.chat_template
      
      * Docstring fix
      
      * Fix blenderbot default template
      
      * More doctest fix
      
      * Break out some tokenizer kwargs
      
      * Update doc to explain default templates
      
      * Quick tweaks to tokenizer args
      
      * Cleanups for tokenizer args
      
      * Add note about cacheing
      
      * Quick tweak to the chat-templating doc
      
      * Update the LLaMA template with error checking and correct system message embedding
      
      * make fixup
      
      * make fixup
      
      * add requires_jinja
      
      * Cleanup to expected output formatting
      
      * Add cacheing
      
      * Fix typo in llama default template
      
      * Update LLaMA tests
      
      * Update documentation
      
      * Improved legacy handling in the Conversation class
      
      * Update Jinja template with proper error handling
      
      * Quick bugfix
      
      * Proper exception raising
      
      * Change cacheing behaviour so it doesn't try to pickle an entire Jinja env
      
      * make fixup
      
      * rebase cleanup
      
      ---------
      Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
      866df66f
  26. 05 Sep, 2023 2 commits
  27. 01 Sep, 2023 1 commit
  28. 31 Aug, 2023 1 commit
  29. 30 Aug, 2023 1 commit
    • Juan Pizarro's avatar
      Add Blip2 model in VQA pipeline (#25532) · 09dc9951
      Juan Pizarro authored
      * Add Blip2 model in VQA pipeline
      
      * use require_torch_gpu for test_large_model_pt_blip2
      
      * use can_generate in vqa pipeline
      
      * test Blip2ForConditionalGeneration using float16
      
      * remove custom can_generate from Blip2ForConditionalGeneration
      09dc9951
  30. 24 Aug, 2023 1 commit
  31. 18 Aug, 2023 1 commit
  32. 17 Aug, 2023 1 commit
    • Yoach Lacombe's avatar
      Add Text-To-Speech pipeline (#24952) · b8f69d0d
      Yoach Lacombe authored
      
      
      * add AutoModelForTextToSpeech class
      
      * add TTS pipeline and tessting
      
      * add docstrings to text_to_speech pipeline
      
      * fix torch dependency
      
      * corrector 'processor is None' case in Pipeline
      
      * correct repo id
      
      * modify text-to-speech -> text-to-audio
      
      * remove processor
      
      * rename text_to_speech pipelines files to text_audio
      
      * add textToWaveform and textToSpectrogram instead of textToAudio classes
      
      * update TTS pipeline to the bare minimum
      
      * update tests TTS pipeline
      
      * make style and erase useless import torch in TTS pipeline tests
      
      * modify how to check if generate or forward in TTS pipeline
      
      * remove unnecessary extra new lines
      
      * Apply suggestions from code review
      Co-authored-by: default avatarSanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
      
      * refactor input_texts -> text_inputs
      
      * correct docstrings of TTS.__call__
      
      * correct the shape of generated waveform
      
      * take care of Bark tokenizer special case
      
      * correct run_pipeline_test TTS
      
      * make style
      
      * update TTS docstrings
      
      * address Sylvain nit refactors
      
      * make style
      
      * refactor into one liners
      
      * correct squeeze
      
      * correct way to test if forward or generate
      
      * Update output audio waveform shape
      
      * make style
      
      * correct import
      
      * modify how the TTS pipeline test if a model can generate
      
      * align shape output of TTS pipeline with consistent shape
      
      ---------
      Co-authored-by: default avatarSanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
      b8f69d0d