1. 14 Nov, 2022 2 commits
  2. 14 Oct, 2022 1 commit
      Improve error messaging for ASR pipeline. (#19570) · 463226e2
      Nicolas Patry authored
      * Improve error messaging for ASR pipeline.
      
      - Raise error early (in `_sanitize`) so users don't waste time trying to
        run queries with invalid params.
      
      - Fix: the check ran after `config.inputs_to_logits_ratio` was read, so
        our error was masked by the failure on the missing property.
      
      - Added a manual check on s2t for the error message.
        No non-CTC model seems to be used by the default runner (they are all
        skipped).
      
      * Removing pdb.
      
      * Stop raising the error early; it doesn't really work :(.
  3. 07 Oct, 2022 1 commit
  4. 28 Jul, 2022 1 commit
  5. 21 Apr, 2022 1 commit
  6. 19 Apr, 2022 1 commit
  7. 12 Apr, 2022 1 commit
      Change the chunk_iter function to handle (#16730) · a192f61e
      Nicolas Patry authored
      * Change the `chunk_iter` function to handle the subtle cases where the
        last chunk gets ignored because all of its data is in the
        `left_strided` data.
      
      We need to remove the right striding on the previous item.
      
      * Remove commented line.
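To illustrate the fix described in this commit, here is a minimal sketch of chunking with strides. It is not the library's `chunk_iter`: it works on a plain Python list, skips feature extraction, and `drop_strides` is a hypothetical helper added only for the example. What it tries to show is that the chunk reaching the end of the input gets a right stride of 0, so its trailing samples are kept instead of being discarded as overlap.

```python
def chunk_iter(samples, chunk_len, stride_left, stride_right):
    """Yield overlapping chunks together with the stride to drop on each side."""
    step = chunk_len - stride_left - stride_right
    if step <= 0:
        raise ValueError("chunk_len must be larger than stride_left + stride_right")
    n = len(samples)
    for start in range(0, n, step):
        end = start + chunk_len
        chunk = samples[start:end]
        # A chunk is the last one as soon as it reaches the end of the input.
        is_last = end >= n
        left = 0 if start == 0 else stride_left
        # No right stride on the last chunk: nothing follows to cover that region.
        right = 0 if is_last else stride_right
        yield {"chunk": chunk, "stride": (len(chunk), left, right), "is_last": is_last}
        if is_last:
            break


def drop_strides(item):
    """Hypothetical helper: keep only the part of a chunk outside its strides."""
    n, left, right = item["stride"]
    return item["chunk"][left : n - right]


samples = list(range(10))
items = list(chunk_iter(samples, chunk_len=6, stride_left=2, stride_right=2))
kept = [x for item in items for x in drop_strides(item)]
```

With these numbers, applying the right stride to the final chunk would drop its last two samples, and no later chunk would bring them back.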
  8. 23 Mar, 2022 1 commit
      Reorganize file utils (#16264) · 4975002d
      Sylvain Gugger authored
      * Split file_utils in several submodules
      
      * Fixes
      
      * Add back more objects
      
      * More fixes
      
      * Who exactly decided to import that from there?
      
      * Second suggestion to code with code review
      
      * Revert wrong move
      
      * Fix imports
      
      * Adapt all imports
      
      * Adapt all imports everywhere
      
      * Revert this import, will fix in a separate commit
  9. 02 Mar, 2022 1 commit
  10. 28 Feb, 2022 1 commit
  11. 25 Feb, 2022 1 commit
  12. 15 Feb, 2022 1 commit
  13. 07 Feb, 2022 1 commit
  14. 02 Feb, 2022 2 commits
      Fix docstring of ASR pipeline (#15481) · 13297ac7
      Sylvain Gugger authored
      Adding support for `microphone` streaming within pipeline. (#15046) · 623d8cb4
      Nicolas Patry authored
      
      
      * Adding support for `microphone` streaming within pipeline.
      
      - Uses `ffmpeg` to get microphone data.
      - Makes sure alignment is made to `size_of_sample`.
      - Works by sending `{"raw": ..data.., "stride": (n, left, right),
        "partial": bool}` directly to the pipeline, which makes it possible to
        stream partial results and still get inference.
      - Lets `partial` information flow through the pipeline so the caller can
        get it back and choose whether to display the text.
      
      - The striding reconstitution is bound to have errors since CTC does not
        keep previous state. Currently most of the errors arise because we
        don't know whether there is a space between two chunks.
        Since we have some left striding info, we could use that during
        decoding to choose what to do with those spaces, and maybe even extra
        letters (if the stride is long enough, it's bound to cover at least a
        few symbols).
      
      Fixing tests.
      
      Protecting with `require_torch`.
      
      `raw_ctc` support for nicer demo.
      
      Post rebase fixes.
      
      Revamp to split raw_mic_data from its live chunking.
      
      - Requires a refactor to make everything a bit cleaner.
      
      Automatic resampling.
      
      Small fix.
      
      Small fix.
      
      * Post rebase fix (need to let super handle more logic, reorder args.)
      
      * Update docstrings
      
      * Docstring format.
      
      * Remove print.
      
      * Prevent flow of `input_values`.
      
      * Fixing `stride` too.
      
      * Fixing the PR by removing `raw_ctc`.
      
      * Better docstrings.
      
      * Fixing init.
      
      * Update src/transformers/pipelines/audio_utils.py
      Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
      
      * Update tests/test_pipelines_automatic_speech_recognition.py
      Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
      
      * Quality.
      Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
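The streaming format described in this commit can be sketched roughly as follows. This is an illustration only, not the code from `audio_utils.py`: `stream_chunks` and `align_to_sample` are hypothetical names, strides are assumed to be counted in bytes, and 4-byte samples are assumed. Items marked `"partial": True` are provisional views of audio that has not yet filled a full chunk, so a caller could display them and later overwrite them.

```python
def align_to_sample(n_bytes, size_of_sample):
    """Round a byte count down to a whole number of samples."""
    return n_bytes - (n_bytes % size_of_sample)


def stream_chunks(byte_iter, chunk_len, stride_left, stride_right, size_of_sample=4):
    """Turn an iterator of raw byte blocks into strided chunk dictionaries."""
    chunk_len = align_to_sample(chunk_len, size_of_sample)
    stride_left = align_to_sample(stride_left, size_of_sample)
    stride_right = align_to_sample(stride_right, size_of_sample)
    step = chunk_len - stride_left - stride_right
    if step <= 0:
        raise ValueError("chunk_len must be larger than the two strides combined")

    acc = b""
    left = 0  # the very first chunk has no left context to discard
    for raw in byte_iter:
        acc += raw
        if len(acc) < chunk_len:
            # Not enough audio for a full chunk yet: emit a provisional item.
            usable = align_to_sample(len(acc), size_of_sample)
            if usable > left:
                yield {"raw": acc[:usable], "stride": (usable, left, 0), "partial": True}
        else:
            while len(acc) >= chunk_len:
                yield {
                    "raw": acc[:chunk_len],
                    "stride": (chunk_len, left, stride_right),
                    "partial": False,
                }
                left = stride_left
                # Keep the overlap so the next chunk has left context.
                acc = acc[step:]
    # Flush what remains; there is no right stride because nothing follows.
    usable = align_to_sample(len(acc), size_of_sample)
    if usable > left:
        yield {"raw": acc[:usable], "stride": (usable, left, 0), "partial": False}
```

Concatenating the non-partial items after dropping `left` bytes from the front and `right` bytes from the end of each gives back the original stream, up to a trailing incomplete sample.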
  15. 24 Jan, 2022 1 commit
  16. 19 Jan, 2022 1 commit
  17. 18 Jan, 2022 1 commit
  18. 12 Jan, 2022 1 commit
  19. 04 Jan, 2022 1 commit
      Hotfix `chunk_length_s` instead of `_ms`. (#15029) · 19d37c2d
      Nicolas Patry authored
      * Hotfix `chunk_length_s` instead of `_ms`.
      
      * Adding fix of `pad_token`, which should be the last/previous token for
        proper CTC decoding.
      
      * Fixing ChunkPipeline unwrapping.
      
      * Adding a PackIterator specific test.
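One plausible reading of the `pad_token` item above, sketched under assumptions: greedy CTC decoding merges consecutive repeated ids and then drops the blank id, so filling a discarded region with a copy of the neighbouring kept id makes that region disappear during decoding, whereas filling it with an unrelated id could add a spurious symbol. The names `ctc_collapse` and `fill_strides` are hypothetical and the log does not show the actual change.

```python
def ctc_collapse(ids, blank_id):
    """Greedy CTC decoding: merge consecutive repeats, then drop blanks."""
    out = []
    prev = None
    for token in ids:
        if token != prev and token != blank_id:
            out.append(token)
        prev = token
    return out


def fill_strides(ids, left, right):
    """Overwrite the stride regions by repeating the nearest kept id."""
    n = len(ids)
    if left + right >= n:
        raise ValueError("strides must leave at least one kept position")
    filled = list(ids)
    first_kept = filled[left]
    last_kept = filled[n - right - 1]
    filled[:left] = [first_kept] * left
    filled[n - right:] = [last_kept] * right
    return filled


blank = 0
frame_ids = [5, 5, 0, 7, 7, 0, 8, 8]  # per-frame argmax ids for one chunk
decoded_naive = ctc_collapse(frame_ids, blank)
decoded_filled = ctc_collapse(fill_strides(frame_ids, left=2, right=2), blank)
```

Here `decoded_naive` still contains the symbols that sit inside the strides, while `decoded_filled` matches what decoding only the kept middle region would give, with the sequence length unchanged.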
  20. 03 Jan, 2022 1 commit
  21. 27 Dec, 2021 1 commit
      Doc styler v2 (#14950) · 87e6e4fe
      Sylvain Gugger authored
      * New doc styler
      
      * Fix issue with args at the start
      
      * Code sample fixes
      
      * Style code examples in MDX
      
      * Fix more patterns
      
      * Typo
      
      * Typo
      
      * More patterns
      
      * Do without black for now
      
      * Get more info in error
      
      * Docstring style
      
      * Re-enable check
      
      * Quality
      
      * Fix add_end_docstring decorator
      
      * Fix docstring
  22. 21 Dec, 2021 1 commit
      Mass conversion of documentation from rst to Markdown (#14866) · 27b3031d
      Sylvain Gugger authored
      * Convert docstrings of all configurations and tokenizers
      
      * Processors and fixes
      
      * Last modeling files and fixes to models
      
      * Pipeline modules
      
      * Utils files
      
      * Data submodule
      
      * All the other files
      
      * Style
      
      * Missing examples
      
      * Style again
      
      * Fix copies
      
      * Say bye bye to rst docstrings forever
  23. 06 Dec, 2021 1 commit
  24. 29 Oct, 2021 1 commit
  25. 27 Oct, 2021 1 commit
  26. 21 Sep, 2021 1 commit
  27. 10 Sep, 2021 1 commit
      [Large PR] Entire rework of pipelines. (#13308) · c63fcabf
      Nicolas Patry authored
      
      
      * Enabling dataset iteration on pipelines.
      
      
      Unifying parameters under `set_parameters` function.
      
      Small fix.
      
      Last fixes after rebase
      
      Remove print.
      
      Fixing text2text `generate_kwargs`
      
      No more `self.max_length`.
      
      Fixing tf only conversational.
      
      Consistency in start/stop index over TF/PT.
      
      Speeding up drastically on TF (nasty bug where max_length would increase
      a ton.)
      
      Adding test for support for non fast tokenizers.
      
      Fixing GPU usage on zero-shot.
      
      Fix working on Tf.
      
      Update src/transformers/pipelines/base.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      Update src/transformers/pipelines/base.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      Small cleanup.
      
      Remove all asserts + simple format.
      
      * Fixing audio-classification for large PR.
      
      * Overly explicit null checking.
      
      * Encapsulating GPU/CPU pytorch manipulation directly within `base.py`.
      
      * Removed internal state for parameters of the pipeline.

      Instead of implicitly overriding internal state, we moved
      to real named arguments on every `preprocess`, `_forward`, and
      `postprocess` function.

      `_sanitize_parameters` is now used to split all kwargs of both
      `__init__` and `__call__` into the 3 kinds of named parameters.
      
      * Move import warnings.
      
      * Small fixes.
      
      * Quality.
      
      * Another small fix, using the CI to debug faster.
      
      * Last fixes.
      
      * Last fix.
      
      * Small cleanup of tensor moving.
      
      * is not None.
      
      * Adding a bunch of docs + an iteration test.
      
      * Fixing doc style.
      
      * KeyDataset = None guard.
      
      * Removing the CUDA test for pipelines (was testing).
      
      * Even more simple iteration test.
      
      * Correct import.
      
      * Long day.
      
      * Fixes in docs.
      
      * [WIP] migrating object detection.
      
      * Fixed the target_size bug.
      
      * Fixup.
      
      * Bad variable name.
      
      * Fixing `ensure_on_device` respects original ModelOutput.
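The parameter handling described in this commit (named arguments on `preprocess`, `_forward` and `postprocess`, with `_sanitize_parameters` splitting the kwargs) can be illustrated with a toy class. This is a self-contained sketch of the pattern, not the `transformers` base class; `ToyPipeline` and the `lowercase`, `repeat` and `suffix` parameters are invented for the example.

```python
class ToyPipeline:
    """Toy stand-in for a pipeline; not the transformers `Pipeline` class."""

    def __init__(self, **kwargs):
        # Defaults given at construction time are split once and stored.
        self._pre, self._fwd, self._post = self._sanitize_parameters(**kwargs)

    def _sanitize_parameters(self, lowercase=None, repeat=None, suffix=None):
        """Route each keyword argument to the step that consumes it."""
        pre, fwd, post = {}, {}, {}
        if lowercase is not None:
            pre["lowercase"] = lowercase
        if repeat is not None:
            fwd["repeat"] = repeat
        if suffix is not None:
            post["suffix"] = suffix
        return pre, fwd, post

    def preprocess(self, text, lowercase=False):
        return text.lower() if lowercase else text

    def _forward(self, model_input, repeat=1):
        # Placeholder for the model call.
        return model_input * repeat

    def postprocess(self, model_output, suffix=""):
        return model_output + suffix

    def __call__(self, text, **kwargs):
        pre, fwd, post = self._sanitize_parameters(**kwargs)
        # Call-time values override constructor defaults for this call only.
        pre = {**self._pre, **pre}
        fwd = {**self._fwd, **fwd}
        post = {**self._post, **post}
        model_input = self.preprocess(text, **pre)
        model_output = self._forward(model_input, **fwd)
        return self.postprocess(model_output, **post)


pipe = ToyPipeline(lowercase=True)
first = pipe("Ab", repeat=2, suffix="!")
second = pipe("Ab")
```

Because the merged dictionaries are local to each call, `second` is unaffected by the `repeat` and `suffix` values passed to the first call.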
  28. 01 Sep, 2021 1 commit
  29. 26 May, 2021 1 commit
  30. 30 Apr, 2021 1 commit
      Adding `AutomaticSpeechRecognitionPipeline`. (#11337) · db9dd09c
      Nicolas Patry authored
      
      
      * Adding `AutomaticSpeechRecognitionPipeline`.
      
      - Because we added everything to enable this pipeline, we probably
      should add it to `transformers`.
      - This PR tries to limit the scope and focuses only on the pipeline part
      (what should go in, and out).
      - The tests are very specific for S2T and Wav2vec2 to make sure both
      architectures are supported by the pipeline. We don't use the mixin for
      tests right now, because that requires more work in the `pipeline`
      function (will be done in a follow up PR).
      - Unsure about the "helper" function `ffmpeg_read`. It makes a lot of
        sense from a user perspective and does not add any hard dependency
        (users can always use their own loading mechanism). Meanwhile, it
        feels slightly clunky to have so much optional preprocessing.
      - The pipeline is not designed to support streaming audio right now.
      
      Future work:
      
      - Add `automatic-speech-recognition` as a `task`. And add the
      FeatureExtractor.from_pretrained within `pipeline` function.
      - Add small models within tests
      - Add the Mixin to tests.
      - Make the logic between ForCTC vs ForConditionalGeneration better.
      
      * Update tests/test_pipelines_automatic_speech_recognition.py
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
      
      * Adding docs + main import + type checking + LICENSE.
      
      * Doc style !.
      
      * Fixing TYPE_HINT.
      
      * Specifying waveform shape in the docs.
      
      * Adding asserts + specify in the documentation the shape of the input
      np.ndarray.
      
      * Update src/transformers/pipelines/automatic_speech_recognition.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * Adding require to tests + move the `feature_extractor` doc.
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
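As a rough idea of what a helper like the `ffmpeg_read` mentioned above might do, the sketch below pipes encoded audio through the `ffmpeg` command-line tool and decodes the result as mono 32-bit little-endian floats. The function names `ffmpeg_command`, `decode_f32le` and `read_audio` are hypothetical, the log does not show the real signature, and the sketch returns a plain list rather than an `np.ndarray` to stay dependency-free. It assumes an `ffmpeg` binary is available on the `PATH`.

```python
import struct
import subprocess


def ffmpeg_command(sampling_rate):
    """Build an ffmpeg invocation: stdin in, mono raw float32 samples out."""
    return [
        "ffmpeg",
        "-i", "pipe:0",             # read the encoded file from stdin
        "-ac", "1",                 # downmix to a single channel
        "-ar", str(sampling_rate),  # resample to the requested rate
        "-f", "f32le",              # raw 32-bit little-endian floats
        "-hide_banner",
        "-loglevel", "quiet",
        "pipe:1",                   # write the samples to stdout
    ]


def decode_f32le(raw):
    """Decode raw little-endian float32 bytes into a list of Python floats."""
    count = len(raw) // 4
    return list(struct.unpack("<%df" % count, raw[: count * 4]))


def read_audio(encoded_bytes, sampling_rate):
    """Decode an in-memory audio file (requires ffmpeg to be installed)."""
    result = subprocess.run(
        ffmpeg_command(sampling_rate),
        input=encoded_bytes,
        stdout=subprocess.PIPE,
        check=True,
    )
    samples = decode_f32le(result.stdout)
    if not samples:
        raise ValueError("ffmpeg produced no audio; the input may be malformed")
    return samples
```

Keeping the decoding in a subprocess is what makes the dependency optional: callers who already have a waveform can skip the helper entirely.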