"tests/models/rt_detr/__init__.py" did not exist on "783d7d2629e97c5f0c5f9ef01b8c66410275c204"
  1. 12 May, 2022 1 commit
  2. 12 Apr, 2022 1 commit
      Change the chunk_iter function to handle (#16730) · a192f61e
      Nicolas Patry authored
      * Change the chunk_iter function to handle the subtle cases where the
      last chunk gets ignored since all the data is in the `left_strided` data.
      
      We need to remove the right striding on the previous item.
      
      * Remove commented line.
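
The edge case described in this commit can be illustrated with a minimal sketch of strided chunking. This is not the library's `chunk_iter`; the names `chunk_iter_sketch` and `reconstruct` are hypothetical, and the sketch works on plain Python lists instead of audio tensors. It assumes each chunk carries a `(n, left, right)` stride and that only the unstrided centre of each chunk is kept. The chunk that reaches the end of the input gets a right stride of 0, so no trailing data is thrown away.

```python
def chunk_iter_sketch(samples, chunk_len, stride_left, stride_right):
    """Yield (chunk, (n, left, right)) pairs over `samples` (hypothetical helper)."""
    step = chunk_len - stride_left - stride_right
    if step <= 0:
        raise ValueError("chunk_len must be larger than the two strides combined")
    total = len(samples)
    for start in range(0, total, step):
        chunk = samples[start:start + chunk_len]
        # The very first chunk has no left context to discard.
        left = 0 if start == 0 else stride_left
        # A chunk that reaches the end of the input is the last one:
        # it must keep its right side, otherwise that data is lost.
        is_last = start + chunk_len >= total
        right = 0 if is_last else stride_right
        yield chunk, (len(chunk), left, right)
        if is_last:
            break


def reconstruct(chunks):
    """Concatenate the unstrided centre of every chunk."""
    out = []
    for chunk, (n, left, right) in chunks:
        out.extend(chunk[left:n - right])
    return out
```

With `chunk_len=6`, `stride_left=2` and `stride_right=1` on ten samples, the kept centres of the three chunks tile the input exactly once, which is the property the fix is meant to preserve.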
  3. 02 Mar, 2022 1 commit
  4. 28 Feb, 2022 1 commit
  5. 25 Feb, 2022 1 commit
  6. 23 Feb, 2022 1 commit
  7. 15 Feb, 2022 2 commits
  8. 07 Feb, 2022 1 commit
  9. 02 Feb, 2022 1 commit
      Adding support for `microphone` streaming within pipeline. (#15046) · 623d8cb4
      Nicolas Patry authored
      
      
      * Adding support for `microphone` streaming within pipeline.
      
      - Uses `ffmpeg` to get microphone data.
      - Makes sure alignment is made to `size_of_sample`.
      - Works by sending `{"raw": ..data.., "stride": (n, left, right), "partial": bool}`
        directly to the pipeline, which makes it possible to stream partial
        results and still get inference.
      - Lets `partial` information flow through the pipeline to enable the caller
        to get it back and choose whether to display the text or not.
      
      - The striding reconstitution is bound to have errors since CTC does not
        keep previous state. Currently most of the errors come from not knowing
        whether there's a space between two chunks.
        Since we have some left striding info, we could use that during decoding
        to choose what to do with those spaces and maybe even extra letters (if
        the stride is long enough, it's bound to cover at least a few symbols).
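
The payload format in the bullets above can be sketched as a small generator. This is an illustration under stated assumptions, not the code added by this PR: `stream_chunks` and its parameters are hypothetical names, the byte source is any iterable of `bytes` (standing in for the data read from `ffmpeg`), and all lengths are expressed in bytes. It emits `partial` items while a chunk is still filling up, and keeps every payload aligned to `size_of_sample`.

```python
def stream_chunks(byte_iter, chunk_len, stride_left, stride_right, size_of_sample=4):
    """Yield {"raw", "stride": (n, left, right), "partial"} dicts (hypothetical helper)."""
    for value in (chunk_len, stride_left, stride_right):
        if value % size_of_sample:
            raise ValueError("lengths must be multiples of size_of_sample")
    step = chunk_len - stride_left - stride_right
    if step <= 0:
        raise ValueError("chunk_len must be larger than the two strides combined")

    acc = b""
    left = 0  # the first chunk has no left context
    for raw in byte_iter:
        acc += raw
        if len(acc) < chunk_len:
            # Not enough data for a full chunk yet: emit an aligned preview.
            usable = len(acc) - len(acc) % size_of_sample
            if usable:
                yield {"raw": acc[:usable], "stride": (usable, left, 0), "partial": True}
            continue
        while len(acc) >= chunk_len:
            yield {
                "raw": acc[:chunk_len],
                "stride": (chunk_len, left, stride_right),
                "partial": False,
            }
            left = stride_left
            # Keep the overlap needed as context for the next chunk.
            acc = acc[step:]
    # Flush what is left; trailing bytes of an incomplete sample are dropped.
    usable = len(acc) - len(acc) % size_of_sample
    if usable > left:
        yield {"raw": acc[:usable], "stride": (usable, left, 0), "partial": False}
```

A caller could display the text of `partial` items as a live preview and keep only the non-partial results, trimmed by their strides, as the final transcript.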
      
      Fixing tests.
      
      Protecting with `require_torch`.
      
      `raw_ctc` support for nicer demo.
      
      Post rebase fixes.
      
      Revamp to split raw_mic_data from its live chunking.
      
      - Requires a refactor to make everything a bit cleaner.
      
      Automatic resampling.
      
      Small fix.
      
      Small fix.
      
      * Post rebase fix (need to let super handle more logic, reorder args.)
      
      * Update docstrings
      
      * Docstring format.
      
      * Remove print.
      
      * Prevent flow of `input_values`.
      
      * Fixing `stride` too.
      
      * Fixing the PR by removing `raw_ctc`.
      
      * Better docstrings.
      
      * Fixing init.
      
      * Update src/transformers/pipelines/audio_utils.py
      Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
      
      * Update tests/test_pipelines_automatic_speech_recognition.py
      Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
      
      * Quality.
      Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
  10. 19 Jan, 2022 1 commit
  11. 18 Jan, 2022 2 commits
  12. 12 Jan, 2022 1 commit
  13. 04 Jan, 2022 1 commit
      Hotfix `chunk_length_s` instead of `_ms`. (#15029) · 19d37c2d
      Nicolas Patry authored
      * Hotfix `chunk_length_s` instead of `_ms`.
      
      * Adding fix of `pad_token`, which should be the last/previous token for
      proper CTC decoding.
      
      * Fixing ChunkPipeline unwrapping.
      
      * Adding a PackIterator specific test.
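
The commit message does not spell out how the `pad_token` fix works. One plausible reading, stated here as an assumption, is that token positions covered by a stride are overwritten with the nearest kept token before CTC decoding. The sketch below shows why that choice matters for greedy CTC collapse when a letter spans a chunk boundary. All names (`ctc_collapse`, `fill_stride_with_neighbor`, `fill_stride_with_blank`) and the blank id `0` are hypothetical.

```python
from itertools import groupby

BLANK = 0  # assumed CTC blank id for this sketch


def ctc_collapse(token_ids, blank_id=BLANK):
    """Greedy CTC post-processing: merge consecutive repeats, then drop blanks."""
    merged = [tok for tok, _ in groupby(token_ids)]
    return [tok for tok in merged if tok != blank_id]


def fill_stride_with_neighbor(token_ids, left, right):
    """Overwrite strided positions with the nearest kept token."""
    tokens = list(token_ids)
    end = len(tokens) - right
    if left >= end:
        raise ValueError("stride covers the whole chunk")
    tokens[:left] = [tokens[left]] * left
    tokens[end:] = [tokens[end - 1]] * right
    return tokens


def fill_stride_with_blank(token_ids, left, right, blank_id=BLANK):
    """Overwrite strided positions with the blank token instead."""
    tokens = list(token_ids)
    end = len(tokens) - right
    tokens[:left] = [blank_id] * left
    tokens[end:] = [blank_id] * right
    return tokens
```

When the last kept token of one chunk and the first kept token of the next are the same letter, the neighbor fill lets the repeats merge into one symbol after concatenation, while the blank fill separates them and decodes the letter twice.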
  14. 03 Jan, 2022 1 commit
  15. 30 Dec, 2021 1 commit
  16. 27 Dec, 2021 1 commit
      ChunkPipeline (batch_size enabled on `zero-cls` and `qa` pipelines). (#14225) · b058490c
      Nicolas Patry authored
      
      
      * Pipeline chunks.
      
      * Batching for Chunking pipelines ?
      
      * Batching for `question-answering` and `zero-shot-cls`.
      
      * Fixing for FNet.
      
      * Making ASR a chunk pipeline.
      
      * Chunking ASR API.
      
      * doc style.
      
      * Fixing ASR test.
      
      * Fixing QA error (p_mask, padding is 1, not 0).
      
      * Enable both vad and simple chunking.
      
      * Max length for vad.
      
      * remove inference mode, crashing on s2t.
      
      * Revert ChunkPipeline for ASRpipeline.
      
      Too many knobs for simple integration within the pipeline, better stick
      to external convenience functions instead, more control to be had,
      simpler pipeline and also easier to replace with other things later.
      
      * Drop necessity for PT for these.
      
      * Enabling generators.
      
      * Add mic + cleanup.
      
      * Typo.
      
      * Typo2.
      
      * Remove ASR work, it does not belong in this PR anymore.
      
      * Update src/transformers/pipelines/pt_utils.py
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
      
      * Update src/transformers/pipelines/zero_shot_classification.py
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
      
      * Adding many comments.
      
      * Doc quality.
      
      * `hidden_states` handling.
      
      * Adding doc.
      
      * Bad rebase.
      
      * Autofixing docs.
      
      * Fixing CRITICAL bug in the new Zerocls pipeline.
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
  17. 16 Dec, 2021 1 commit
  18. 17 Nov, 2021 1 commit
  19. 29 Oct, 2021 1 commit
  20. 14 Oct, 2021 1 commit
  21. 21 Sep, 2021 2 commits
  22. 01 Sep, 2021 1 commit
  23. 07 Jul, 2021 1 commit
      Adding support for `pipeline("automatic-speech-recognition")`. (#11525) · ebc69afc
      Nicolas Patry authored
      * Adding support for `pipeline("automatic-speech-recognition")`.
      
      - Ugly `"config"` choice for AutoModel. It would be great to have the
      possibility to have something like `AutoModelFor` that would implement
      the same logic (Load the config, check Architectures and load the first
      one)
      
      * Remove `model_id`, which was not needed in the end.
      
      * Rebased !
      
      * Remove old code.
      
      * Rename `nlp`.
  24. 30 Apr, 2021 1 commit
      Adding `AutomaticSpeechRecognitionPipeline`. (#11337) · db9dd09c
      Nicolas Patry authored
      
      
      * Adding `AutomaticSpeechRecognitionPipeline`.
      
      - Because we added everything to enable this pipeline, we probably
      should add it to `transformers`.
      - This PR tries to limit the scope and focuses only on the pipeline part
      (what should go in, and out).
      - The tests are very specific for S2T and Wav2vec2 to make sure both
      architectures are supported by the pipeline. We don't use the mixin for
      tests right now, because that requires more work in the `pipeline`
      function (will be done in a follow up PR).
      - Unsure about the "helper" function `ffmpeg_read`. It makes a lot of
        sense from a user perspective, it does not add any additional
      dependencies (as in hard dependency, because users can always use their
      own load mechanism). Meanwhile, it feels slightly clunky to have so much
      optional preprocessing.
      - The pipeline is not designed to support streaming audio right now.
      
      Future work:
      
      - Add `automatic-speech-recognition` as a `task`. And add the
      FeatureExtractor.from_pretrained within `pipeline` function.
      - Add small models within tests
      - Add the Mixin to tests.
      - Make the logic between ForCTC vs ForConditionalGeneration better.
      
      * Update tests/test_pipelines_automatic_speech_recognition.py
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
      
      * Adding docs + main import + type checking + LICENSE.
      
      * Doc style !.
      
      * Fixing TYPE_HINT.
      
      * Specifying waveform shape in the docs.
      
      * Adding asserts + specify in the documentation the shape of the input
      np.ndarray.
      
      * Update src/transformers/pipelines/automatic_speech_recognition.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * Adding require to tests + move the `feature_extractor` doc.
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>