1. 30 May, 2023 1 commit
  2. 24 May, 2023 2 commits
    • Sanchit Gandhi authored
      d8222be5
    • Better TF docstring types (#23477) · f8b25744
      Matt authored
      * Rework TF type hints to use | None instead of Optional[] for tf.Tensor
      
      * Rework TF type hints to use | None instead of Optional[] for tf.Tensor
      
      * Don't forget the imports
      
      * Add the imports to tests too
      
      * make fixup
      
      * Refactor tests that depended on get_type_hints
      
      * Better test refactor
      
      * Fix an old hidden bug in the test_keras_fit input creation code
      
      * Fix for the Deit tests
      f8b25744
  3. 23 May, 2023 1 commit
  4. 19 May, 2023 1 commit
    • feat: Whisper prompting (#22496) · 2acedf47
      Connor Henderson authored
      * initial working additions
      
      * clean and rename, add cond stripping initial prompt to decode
      
      * cleanup, edit create_initial_prompt_ids, add tests
      
      * repo consistency, flip order of conditional
      
      * fix error, move the processor fn to the tokenizer
      
      * repo consistency, update test ids to corresponding tokenizer
      
      * use convert_tokens_to_ids not get_vocab...
      
      * use actual conditional in generate
      
      * make style
      
      * initial address comments
      
      * initial working add new params to pipeline
      
      * first draft of sequential generation for condition_on_previous_text
      
      * add/update tests, make compatible with timestamps
      
      * make compatible with diff. input kwargs and max length
      
      * add None check
      
      * add temperature check
      
      * flip temp check operand
      
      * refocusing to prev pr scope
      
      * remove the params too
      
      * make style
      
      * edits, move max length incorporating prompt to whisper
      
      * address comments
      
      * remove asr pipeline prompt decoding, fix indexing
      
      * address comments (more tests, validate prompt)
      
      * un-comment out tests (from debug)
      
      * remove old comment
      
      * address comments
      
      * fix typo
      
      * remove timestamp token from test
      
      * make style
      
      * cleanup
      
      * copy method to fast tokenizer, set max_new_tokens for test
      
      * prompt_ids type just pt
      
      * address Amy's comments
      
      * make style
      2acedf47
  5. 11 May, 2023 2 commits
  6. 09 May, 2023 1 commit
    • audio_utils improvements (#21998) · 7f919509
      Matthijs Hollemans authored
      * silly change to allow making a PR
      
      * clean up doc comments
      
      * simplify hertz_to_mel and mel_to_hertz
      
      * fixup
      
      * clean up power_to_db
      
      * also add amplitude_to_db
      
      * move functions
      
      * clean up mel_filter_bank
      
      * fixup
      
      * credit librosa & torchaudio authors
      
      * add unit tests
      
      * tests for power_to_db and amplitude_to_db
      
      * add mel_filter_bank tests
      
      * rewrite STFT
      
      * add convenience spectrogram function
      
      * missing transpose
      
      * fewer transposes
      
      * add integration test to M-CTC-T
      
      * frame length can be either window or FFT length
      
      * rewrite stft API
      
      * add preemphasis coefficient
      
      * move argument
      
      * add log option to spectrogram
      
      * replace M-CTC-T feature extractor
      
      * fix api thing
      
      * replace whisper STFT
      
      * replace whisper mel filters
      
      * replace tvlt's stft
      
      * allow alternate window names
      
      * replace speecht5 stft
      
      * fixup
      
      * fix integration tests
      
      * fix doc comments
      
      * remove manual FFT length calculation
      
      * fix docs
      
      * go away, deprecation warnings
      
      * combine everything into spectrogram function
      
      * add deprecated functions back
      
      * fixup
      7f919509
  7. 05 May, 2023 2 commits
  8. 04 May, 2023 2 commits
  9. 18 Apr, 2023 1 commit
    • Generate: Add assisted generation (#22211) · 78cda46f
      Joao Gante authored
      * working mvp
      
      * remove breakpoint
      
      * fix commit
      
      * standardize outputs
      
      * tmp commit
      
      * tests almost ready
      
      * tmp commit
      
      * skip a few models
      
      * Add streaming; Docs and examples
      
      * document limitations
      
      * PR commits
      
      * Amy PR comments
      78cda46f
  10. 06 Apr, 2023 2 commits
  11. 22 Mar, 2023 1 commit
  12. 13 Mar, 2023 1 commit
  13. 09 Mar, 2023 1 commit
  14. 07 Mar, 2023 1 commit
  15. 03 Mar, 2023 1 commit
  16. 02 Mar, 2023 2 commits
  17. 01 Mar, 2023 1 commit
  18. 28 Feb, 2023 1 commit
    • 🔥 Rework pipeline testing by removing `PipelineTestCaseMeta` 🚀 (#21516) · 871c31a6
      Yih-Dar authored
      
      
      * Add PipelineTesterMixin
      
      * remove class PipelineTestCaseMeta
      
      * move validate_test_components
      
      * Add for ViT
      
      * Add to SPECIAL_MODULE_TO_TEST_MAP
      
      * style and quality
      
      * Add feature-extraction
      
      * update
      
      * raise instead of skip
      
      * add tiny_model_summary.json
      
      * more explicit
      
      * skip tasks not in mapping
      
      * add availability check
      
      * Add Copyright
      
      * A way to disable irrelevant tests
      
      * update with main
      
      * remove disable_irrelevant_tests
      
      * skip tests
      
      * better skip message
      
      * better skip message
      
      * Add all pipeline task tests
      
      * revert
      
      * Import PipelineTesterMixin
      
      * subclass test classes with PipelineTesterMixin
      
      * Add pipeline_model_mapping
      
      * Fix import after adding pipeline_model_mapping
      
      * Fix style and quality after adding pipeline_model_mapping
      
      * Fix one more import after adding pipeline_model_mapping
      
      * Fix style and quality after adding pipeline_model_mapping
      
      * Fix test issues
      
      * Fix import requirements
      
      * Fix mapping for MobileViTModelTest
      
      * Update
      
      * Better skip message
      
      * pipeline_model_mapping could not be None
      
      * Remove some PipelineTesterMixin
      
      * Fix typo
      
      * revert tests_fetcher.py
      
      * update
      
      * rename
      
      * revert
      
      * Remove PipelineTestCaseMeta from ZeroShotAudioClassificationPipelineTests
      
      * style and quality
      
      * test fetcher for all pipeline/model tests
      
      ---------
      Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
      871c31a6
  19. 24 Feb, 2023 2 commits
  20. 21 Feb, 2023 1 commit
  21. 20 Feb, 2023 2 commits
    • Fix quality · c87bbe1f
      Sylvain Gugger authored
      c87bbe1f
    • add flax whisper implementation (#20479) · 2840272c
      Andy Ehrenberg authored
      
      
      * add flax whisper implementation
      
      * revert change to setup
      
      * remove unused imports
      
      * revert generation changes
      
      * flax whisper docs
      
      * docs
      
      * import order
      
      * import sorting
      
      * isort
      
      * add dummy objects
      
      * doc formatting
      
      * formatting
      
      * remove trailing whitespaces
      
      * fix flax whisper docs
      
      * add generation logic to unlock flax whisper
      
      * remove scans
      
      * give credits to Flax Bart implementation
      
      * remove unused imports
      
      * add license
      
      * remove assert
      
      * more credits to Bart
      
      * fix style
      
      * formatting
      
      * support left padding
      
      * add flax whisper generation test
      
      * remove copied from comments whenever not a full copy
      
      * fix docstrings for logits processors
      
      * revert change to FlaxForceTokensLogitsProcessor
      
      * revert doc changes
      
      * improve generation docs
      
      * reorganize
      
      * formatting
      
      * cleanup docs
      
      * add tests
      
      * handle empty list case
      
      * fix forced decoder ids in flax tests
      
      * add flax whisper to inits
      
      * update dummy objects
      
      * docs for FlaxAutoModelForSpeechSeq2Seq
      
      * fix decoder_position_ids computation in pretrained model decode/__call__ fns
      
      * add Copied from statements as necessary
      
      * compute position_ids only in __call__ and decode methods of pretrained model subclasses
      
      * improve readability of compute positional embeddings
      
      * check dimensionality of input_features instead of hidden_states
      
      * copied from statement for init_cache
      
      * formatting
      
      * fix copies
      
      * fix copies
      
      * pass attention mask to encoder layers
      
      * fix decoder module outputs
      
      * set dtype
      Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
      
      * smaller flax model for whisper test
      
      * Update src/transformers/generation/flax_utils.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Update src/transformers/models/whisper/modeling_flax_whisper.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Update tests/models/whisper/test_modeling_flax_whisper.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * cleanup
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Update src/transformers/models/whisper/modeling_flax_whisper.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * bias cleanup
      
      * doc fix
      
      * align style for force tokens processor
      
      * readability
      
      * fix input shape in tests
      
      * revert FlaxGenerationMixin docstring
      
      * formatting
      
      * fix tests
      
      * fix imports
      
      * consistent encoder hidden states
      
      * consistent hidden states
      
      * input shapes
      
      * typo
      
      * partial class trick
      
      * partial class for input shape
      
      * base_class with correct input shape
      
      * partial base classes
      
      * match by name
      
      * set main_input_name
      
      * compare on names
      
      * formatting
      
      * remove unused import
      
      * safer position ids computation
      
      * safer position id computation
      
      * Update src/transformers/models/whisper/modeling_flax_whisper.py
      Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
      
      * Update src/transformers/models/whisper/modeling_flax_whisper.py
      Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
      
      * remove identical inherited tests
      
      * fix prompt ids in tests
      
      * use generation config
      
      * use jnp array
      
      * better var names
      
      * more explicit bias use
      
      * import transformers
      
      * formatting
      
      * test formatting
      
      * remove unused imports
      
      * remove unused imports
      
      * formatting
      
      * isort
      
      * docs
      
      * fix ln orders for encoder hidden states
      
      * whisper unique generation stuff
      
      * flake
      
      * use finfo for attention bias
      
      * docs
      
      * Update src/transformers/generation/flax_utils.py
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * docs
      
      * add timestamp flax test
      
      * jit for timestamps
      
      * formatting
      
      * clean up timestamps processor
      
      * formatting
      
      * remove if_true
      
      * cleanup
      
      ---------
      Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      2840272c
  22. 13 Feb, 2023 2 commits
  23. 10 Feb, 2023 2 commits
    • Remove CLI spams with Whisper FeatureExtractor (#21267) · 5b72b341
      Quentin Meeus authored
      * Remove CLI spams with Whisper FeatureExtractor
      
      The Whisper feature extractor's representation includes the mel filters, a list of lists that prints as ~16,000 lines and needlessly spams the command line. I added a `__repr__` method that replaces this list with the string "<array of shape (80, 201)>".
      
      * Remove mel_filters from to_dict output  
      
      Credits to @ArthurZucker
      
      * remove unused import
      
      * update feature extraction tests for the changes in to_dict
      5b72b341
    • Skip failing test for now · 97d3390f
      Sylvain Gugger authored
      97d3390f
  24. 06 Feb, 2023 1 commit
    • Update quality tooling for formatting (#21480) · 6f79d264
      Sylvain Gugger authored
      * Result of black 23.1
      
      * Update target to Python 3.7
      
      * Switch flake8 to ruff
      
      * Configure isort
      
      * Configure isort
      
      * Apply isort with line limit
      
      * Put the right black version
      
      * adapt black in check copies
      
      * Fix copies
      6f79d264
  25. 27 Jan, 2023 1 commit
  26. 25 Jan, 2023 2 commits
    • [WHISPER] Small patch (#21307) · 6f3faf38
      Arthur authored
      * add small patch
      
      * update tests, forced decoder ids do not take priority over generation config
      
      * fix two new tests
      6f3faf38
    • [Whisper] Refactor whisper (#21252) · 255257f3
      Arthur authored
      * update whisper logit processor
      
      * add generate for whisper
      
      * remove part of the whisper specific code from pipeline
      
      * update logit processes
      
      * major update
      
      * enforce first timestamp
      
      * update generate
      
      * add more tests
      
      * update new decoding strategy
      
      * Apply suggestions from code review
      
      * update docstring
      
      * fixup
      
      * default config will not have multilingual ar
      
      * update expected tokenizer size, see pull on the hub for whisper-tiny
      255257f3
  27. 19 Jan, 2023 1 commit
    • [Whisper] Fix timestamp processor (#21187) · e9b4800d
      Arthur authored
      
      
      * add draft logit processor
      
      * add template functions
      
      * update timestamp processor parameters
      
      * draft script
      
      * simplify code
      
      * cleanup
      
      * fixup and clean
      
      * update pipeline
      
      * style
      
      * clean up previous idea
      
      * add tokenization utils
      
      * update tokenizer and asr output
      
      * fit whisper type
      
      * style and update test
      
      * clean test
      
      * style test
      
      * update tests
      
      * update error test
      
      * update code (not based on review yet)
      
      * update tokenization
      
      * update asr pipeline
      
      * update code
      
      * cleanup and update test
      
      * fmt
      
      * remove text verification
      
      * cleanup
      
      * cleanup
      
      * add model test
      
      * update tests
      
      * update code add docstring
      
      * update code and add docstring
      
      * fix pipeline tests
      
      * Small update.
      
      * Fixup.
      
      * Tmp.
      
      * More support.
      
      * Making `forced_decoder_ids` non mandatory for users to set.
      
      * update and fix first bug
      
      * properly process sequence right after merge if last
      
      * tofo
      
      * allow list inputs + compute begin index better
      
      * start adding tests
      
      * add the 3 edge cases
      
      * style
      
      * format sequences
      
      * fixup
      
      * update
      
      * update
      
      * style
      
      * test passes, edge cases should be good
      
      * update last value
      
      * remove Trie
      
      * update tests and expected values
      
      * handle bigger chunk_length
      
      * clean tests a bit
      
      * refactor chunk iter and clean pipeline
      
      * update tests
      
      * style
      
      * refactor chunk iter and clean pipeline
      
      * update
      
      * resolve comments
      
      * Apply suggestions from code review
      Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
      
      * take stride right into account
      
      * update test expected values
      
      * Update code based on review
      Co-authored-by: sgugger <sylvain.gugger@gmail.com>
      
      * major refactor
      
      * add correct strides for tests
      
      * Update src/transformers/pipelines/automatic_speech_recognition.py
      
      * fix whisper timestamp test
      Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
      Co-authored-by: sgugger <sylvain.gugger@gmail.com>
      e9b4800d
  28. 17 Jan, 2023 1 commit
    • Whisper Timestamp processor and prediction (#20620) · bb300ac6
      Arthur authored
      
      
      * add draft logit processor
      
      * add template functions
      
      * update timestamp processor parameters
      
      * draft script
      
      * simplify code
      
      * cleanup
      
      * fixup and clean
      
      * update pipeline
      
      * style
      
      * clean up previous idea
      
      * add tokenization utils
      
      * update tokenizer and asr output
      
      * fit whisper type
      
      * style and update test
      
      * clean test
      
      * style test
      
      * update tests
      
      * update error test
      
      * update code (not based on review yet)
      
      * update tokenization
      
      * update asr pipeline
      
      * update code
      
      * cleanup and update test
      
      * fmt
      
      * remove text verification
      
      * cleanup
      
      * cleanup
      
      * add model test
      
      * update tests
      
      * update code add docstring
      
      * update code and add docstring
      
      * fix pipeline tests
      
      * Small update.
      
      * Fixup.
      
      * Tmp.
      
      * More support.
      
      * Making `forced_decoder_ids` non mandatory for users to set.
      
      * update and fix first bug
      
      * properly process sequence right after merge if last
      
      * tofo
      
      * allow list inputs + compute begin index better
      
      * start adding tests
      
      * add the 3 edge cases
      
      * style
      
      * format sequences
      
      * fixup
      
      * update
      
      * update
      
      * style
      
      * test passes, edge cases should be good
      
      * update last value
      
      * remove Trie
      
      * update tests and expected values
      
      * handle bigger chunk_length
      
      * clean tests a bit
      
      * refactor chunk iter and clean pipeline
      
      * update tests
      
      * style
      
      * refactor chunk iter and clean pipeline
      
      * update
      
      * resolve comments
      
      * Apply suggestions from code review
      Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
      
      * take stride right into account
      
      * update test expected values
      
      * Update code based on review
      Co-authored-by: sgugger <sylvain.gugger@gmail.com>
      Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
      Co-authored-by: sgugger <sylvain.gugger@gmail.com>
      bb300ac6
  29. 07 Dec, 2022 1 commit