1. 27 Feb, 2022 1 commit
  2. 26 Feb, 2022 3 commits
    • Enable ffmpeg prototype unit test (#2261) · 955ffb47
      Moto Hira authored
      Summary:
      Pull Request resolved: https://github.com/pytorch/audio/pull/2261
      
      Enables prototype ffmpeg io tests in fbcode.
      
      Reviewed By: nateanl
      
      Differential Revision: D33698353
      
      fbshipit-source-id: 61de997c564135e677cd68e34fd7cc5dc0c5e036
    • Add apply_beamforming to torchaudio.functional (#2232) · 9c56ffb4
      Zhaoheng Ni authored
      Summary:
      This PR adds the ``apply_beamforming`` method to ``torchaudio.functional``.
      The method applies the beamforming weights to the multi-channel noisy spectrum to obtain the single-channel enhanced spectrum.
      The input arguments are the complex-valued beamforming weight Tensor and the multi-channel noisy spectrum.
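
As a rough illustration of the operation described above, here is a plain-Python sketch that uses nested lists instead of Tensors. It is not the torchaudio implementation. The index layout (`[freq][channel]` for the weights, `[channel][freq][time]` for the spectrum) and the convention of conjugating the weights before summing over channels are assumptions made for this example.

```python
def apply_beamforming(weights, specgram):
    """Apply per-frequency beamforming weights to a multi-channel spectrum.

    weights:  nested list indexed [freq][channel] of complex numbers
    specgram: nested list indexed [channel][freq][time] of complex numbers
    returns:  nested list indexed [freq][time] of complex numbers
    """
    n_channels = len(specgram)
    n_freqs = len(specgram[0])
    n_frames = len(specgram[0][0])
    enhanced = []
    for f in range(n_freqs):
        row = []
        for t in range(n_frames):
            # y[f, t] = sum_c conj(w[f, c]) * x[c, f, t]  (assumed convention)
            row.append(sum(
                weights[f][c].conjugate() * specgram[c][f][t]
                for c in range(n_channels)
            ))
        enhanced.append(row)
    return enhanced


# Two channels, one frequency bin, two frames.
specgram = [
    [[1 + 1j, 2 + 0j]],  # channel 0
    [[1 - 1j, 0 + 2j]],  # channel 1
]
weights = [[0.5 + 0j, 0.5 + 0j]]  # equal weights: average the two channels
enhanced = apply_beamforming(weights, specgram)
print(enhanced)
```

With equal real weights the result is simply the channel average for each frequency bin and frame, which makes the sketch easy to check by hand.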
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2232
      
      Reviewed By: mthrok
      
      Differential Revision: D34474561
      
      Pulled By: nateanl
      
      fbshipit-source-id: 2910251a8f111e65375dfb50495b6a415113f06d
    • Improve device streaming (#2202) · 365313ed
      moto authored
      Summary:
      This commit adds a tutorial for device ASR and updates the API for device streaming.
      
      The interface changes are:
      1. Add `timeout` and `backoff` parameters to the `process_packet` and `stream` methods.
      2. Make the `fill_buffer` method private.
      
      When dealing with a device stream, there are situations where the device buffer is not
      ready and the system returns `EAGAIN`. In such cases, the previous implementation of the
      `process_packet` method raised an exception in the Python layer, which is inefficient
      for device ASR. A better approach is to retry within the C++ layer in a blocking manner.
      The new `timeout` parameter serves this purpose.
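
The retry logic described above can be sketched in Python as follows. This is only an illustration of the control flow. The actual retry lives in the C++ layer, and the function name `process_packet_with_retry`, the `FakeDevice` class, the use of seconds as the unit, and the use of `BlockingIOError` to stand in for `EAGAIN` are all assumptions made for this example.

```python
import time


def process_packet_with_retry(read_once, timeout=None, backoff=0.01):
    """Call `read_once` until it stops raising BlockingIOError.

    timeout: seconds to keep retrying; None means retry indefinitely
    backoff: seconds to sleep between attempts
    """
    deadline = None if timeout is None else time.monotonic() + timeout
    while True:
        try:
            return read_once()
        except BlockingIOError:
            # The device buffer is not ready yet (the EAGAIN case).
            if deadline is not None and time.monotonic() >= deadline:
                raise
            time.sleep(backoff)


class FakeDevice:
    """Hypothetical device that is not ready for the first few reads."""

    def __init__(self, not_ready_reads):
        self.remaining = not_ready_reads
        self.attempts = 0

    def read(self):
        self.attempts += 1
        if self.remaining > 0:
            self.remaining -= 1
            raise BlockingIOError("resource temporarily unavailable")
        return b"packet"


device = FakeDevice(not_ready_reads=2)
packet = process_packet_with_retry(device.read, timeout=1.0, backoff=0.001)
print(packet, device.attempts)
```

Retrying inside one blocking call means the caller does not have to catch an exception and re-enter the call for every not-ready read, which is the inefficiency the commit message points out.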
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2202
      
      Reviewed By: nateanl
      
      Differential Revision: D34475829
      
      Pulled By: mthrok
      
      fbshipit-source-id: bb6d0b125d800f87d189db40815af06fbd4cab59
  3. 25 Feb, 2022 6 commits
  4. 24 Feb, 2022 4 commits
  5. 23 Feb, 2022 1 commit
  6. 18 Feb, 2022 2 commits
  7. 17 Feb, 2022 4 commits
  8. 16 Feb, 2022 10 commits
    • Add EMFORMER_RNNT_BASE_MUSTC into pipeline demo script (#2248) · 38569ef0
      Zhaoheng Ni authored
      Summary:
      This PR adds ``EMFORMER_RNNT_BASE_MUSTC`` support to `pipeline_demo.py`. The bundle is trained on the MuST-C release 2.0 dataset. The model preserves the casing and punctuation in the transcript.
      
      Here is a screen recording of how it works in streaming and non-streaming modes:
      
      https://user-images.githubusercontent.com/8653221/154356521-fe84bdc1-fb0c-41bd-8729-9edbb3224a07.mov
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2248
      
      Reviewed By: hwangjeff
      
      Differential Revision: D34282598
      
      Pulled By: nateanl
      
      fbshipit-source-id: 42ed7e2623031dfebd176ef0c6bfd70da3c897d4
    • Refactor torchscript consistency test in functional (#2246) · 87d79889
      Zhaoheng Ni authored
      Summary:
      In the torchscript_consistency tests, the `func` in each test method accepts only one `tensor` argument; the other arguments of the `F.xyz` method must be defined inside `func`. If `F.xyz` has no `Tensor` argument, the tests pass a `dummy` tensor that is not used anywhere. This PR refactors ``_assert_consistency`` and ``_assert_consistency_complex`` to accept a tuple of inputs instead of a single `tensor`.
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2246
      
      Reviewed By: carolineechen
      
      Differential Revision: D34273057
      
      Pulled By: nateanl
      
      fbshipit-source-id: a3900edb3b2c58638e513e1490279d771ebc3d0b
    • Refactor pipeline_demo script in emformer_rnnt recipes (#2239) · fdea0a7c
      Zhaoheng Ni authored
      Summary:
      - Use dictionary to select the `RNNTBundle` and the corresponding dataset.
      - Use the dictionary's keys as choices in ArgumentParser
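
The pattern in the two bullets above can be sketched as follows. This is not the code from `pipeline_demo.py`. The registry name `MODEL_TYPE_TO_CONFIG`, the `--model-type` option, the keys, and the placeholder string values are all hypothetical and stand in for the real bundle and dataset objects.

```python
import argparse

# Hypothetical registry. The keys and the placeholder string values stand in
# for the real bundle and dataset objects, which are not shown in the summary.
MODEL_TYPE_TO_CONFIG = {
    "librispeech": {"bundle": "BUNDLE_FOR_LIBRISPEECH", "dataset": "DATASET_FOR_LIBRISPEECH"},
    "mustc": {"bundle": "BUNDLE_FOR_MUSTC", "dataset": "DATASET_FOR_MUSTC"},
}


def parse_args(argv=None):
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--model-type",
        required=True,
        # The dictionary keys double as the accepted command-line values.
        choices=sorted(MODEL_TYPE_TO_CONFIG),
    )
    return parser.parse_args(argv)


def select_config(argv=None):
    """Look up the bundle/dataset pair for the requested model type."""
    args = parse_args(argv)
    return MODEL_TYPE_TO_CONFIG[args.model_type]


config = select_config(["--model-type", "mustc"])
print(config["bundle"], config["dataset"])
```

Because the choices are derived from the dictionary, adding a new entry makes it selectable from the command line without touching the parser.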
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2239
      
      Reviewed By: mthrok
      
      Differential Revision: D34267070
      
      Pulled By: nateanl
      
      fbshipit-source-id: 99c7942d5c7c1518694e1ae02a55a7decd87c220
    • Refactor eval and pipeline_demo scripts in emformer_rnnt (#2238) · e3b40d1c
      Zhaoheng Ni authored
      Summary:
      - Add docstring to `eval.py` and `pipeline_demo.py` under `emformer_rnnt` directory.
      - Refactor logger and ArgumentParser
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2238
      
      Reviewed By: mthrok
      
      Differential Revision: D34267059
      
      Pulled By: nateanl
      
      fbshipit-source-id: 4b8d3d183ee7bc0ad71ce305cab87bfa90208b2e
    • Add complex dtype support in functional autograd test (#2244) · eeba91dc
      Zhaoheng Ni authored
      Summary:
      In the autograd tests, real Tensors are converted to `torch.float64` to guarantee precision. However, complex dtypes are not considered. This PR adds `self.complex_dtype` support to the inputs.
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2244
      
      Reviewed By: mthrok
      
      Differential Revision: D34272998
      
      Pulled By: nateanl
      
      fbshipit-source-id: e8698a74d7b8d99ee0fcb5f5cb5f2ffc8c80b9b5
    • Fix lm used for ctc decoder example (#2235) · c2decba4
      Caroline Chen authored
      Summary:
      The LM in the example script was unintentionally changed to None when no-LM support was added. This PR changes it back, making the script consistent with the WERs listed in the README.
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2235
      
      Reviewed By: nateanl
      
      Differential Revision: D34273042
      
      Pulled By: carolineechen
      
      fbshipit-source-id: 824b1ce18195e39dc534b2ec9c5312bbe3bb1812
    • Add shebang lines to scripts in emformer_rnnt recipes (#2237) · aac83fe5
      Zhaoheng Ni authored
      Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2237
      
      Reviewed By: mthrok
      
      Differential Revision: D34267000
      
      Pulled By: nateanl
      
      fbshipit-source-id: 4c264aea6cf3fba5d8728d5fe60f9f471815852d
    • Add EMFORMER_RNNT_BASE_MUSTC bundle to torchaudio.prototype (#2241) · 99b5ef5c
      Zhaoheng Ni authored
      Summary:
      This PR provides an RNNTBundle that is pre-trained on the MuST-C release v2.0 dataset.
      The model preserves the casing and punctuation of the transcripts when training the SentencePiece model.
      
      Here is the model performance on the dev and test sets of MuST-C 2.0:
      |                   |          WER |
      |:-----------------:|-------------:|
      | dev               |       0.190  |
      | tst-COMMON        |       0.213  |
      | tst-HE            |       0.186  |
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2241
      
      Reviewed By: mthrok
      
      Differential Revision: D34267792
      
      Pulled By: nateanl
      
      fbshipit-source-id: 67bca9f277e66d41a4530d01615f249b3cec7167
    • Refactor ArgumentParser arguments in emformer_rnnt recipes (#2236) · 81f56f64
      Zhaoheng Ni authored
      Summary:
      Replace underscores with dashes in the ArgumentParser argument names.
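
A small sketch of why this rename does not affect the code that reads the parsed values: `argparse` derives the attribute name from the long option by replacing dashes with underscores. The option names below are hypothetical and used only for illustration.

```python
import argparse

parser = argparse.ArgumentParser()
# Hypothetical option names, used only for illustration.
parser.add_argument("--librispeech-path", type=str, default=None)
parser.add_argument("--use-cuda", action="store_true")

args = parser.parse_args(["--librispeech-path", "/data/librispeech", "--use-cuda"])
# argparse derives the attribute name by replacing dashes with underscores.
print(args.librispeech_path, args.use_cuda)
```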
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2236
      
      Reviewed By: mthrok
      
      Differential Revision: D34266977
      
      Pulled By: nateanl
      
      fbshipit-source-id: ceacac12c04016a8dbf2a1a7d6bbcf65d4d53d21
    • Fix prototype exclusion in release (#2225) · a007e922
      moto authored
      Summary:
      This commit fixes the feature that excludes the `torchaudio.prototype` module.
      
      In `setup.py` there is a special case, triggered when the commit is on a release branch or release tag, that excludes `torchaudio.prototype`. This was introduced to simplify release-related work.
      It turned out that the submodules under `torchaudio.prototype`, such as `torchaudio.prototype.pipelines`, were not properly excluded from packaging.
      These submodules did not exist in previous releases, so it was not an issue.
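
The commit message does not show the diff, so the following is only one plausible illustration of how subpackages can slip through an exclusion. It assumes the exclusion is expressed as glob-style patterns matched against full package names, as in setuptools' `find_packages(exclude=...)`. The helper `filter_packages` is hypothetical and models that matching with the standard-library `fnmatch` module.

```python
from fnmatch import fnmatchcase

packages = [
    "torchaudio",
    "torchaudio.models",
    "torchaudio.prototype",
    "torchaudio.prototype.io",
    "torchaudio.prototype.pipelines",
    "torchaudio.prototype.ctc_decoder",
]


def filter_packages(names, exclude):
    """Drop every package name that matches one of the glob patterns."""
    return [n for n in names if not any(fnmatchcase(n, pat) for pat in exclude)]


# Matching only the parent name leaves its subpackages in the list.
parent_only = filter_packages(packages, ["torchaudio.prototype"])
# Adding a wildcard pattern removes the subpackages as well.
with_subpackages = filter_packages(
    packages, ["torchaudio.prototype", "torchaudio.prototype.*"]
)
print(parent_only)
print(with_subpackages)
```

Under this assumption, a pattern that names only the parent package would explain why the exclusion worked while `torchaudio.prototype` had no subpackages and stopped working once they were added.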
      
      **Note** This feature is triggered only on a release branch, so the fix is not visible in the CI of this PR.
      https://app.circleci.com/pipelines/github/pytorch/audio/9674/workflows/d0c9a6f1-8ca9-441a-a5f5-08926075fa39/jobs/553985?invite=true#step-104-193
      
      The following outputs were observed when running it in a local environment.
      
      * Before the change
      
      ```
      $ BUILD_FFMPEG=0 BUILD_SOX=0 BUILD_CTC_DECODER=0 BUILD_RNNT=0 BUILD_KALDI=0 python setup.py clean bdist_wheel
      ```
      ```
      -- Git branch: prototype-exclusion
      -- Git SHA: 0af1edaa420c46be10292cbea7150c34ef80a0e1
      -- Git tag: None
      -- PyTorch dependency: torch
      -- Building version 0.11.0+0af1eda
       --- Initializing submodules
       --- Initialized submodule
      Excluding torchaudio.prototype from the package.
      ...
      creating build/lib.macosx-11.0-arm64-3.9/torchaudio/prototype
      creating build/lib.macosx-11.0-arm64-3.9/torchaudio/prototype/io
      copying torchaudio/prototype/io/streamer.py -> build/lib.macosx-11.0-arm64-3.9/torchaudio/prototype/io
      copying torchaudio/prototype/io/__init__.py -> build/lib.macosx-11.0-arm64-3.9/torchaudio/prototype/io
      creating build/lib.macosx-11.0-arm64-3.9/torchaudio/prototype/pipelines
      copying torchaudio/prototype/pipelines/__init__.py -> build/lib.macosx-11.0-arm64-3.9/torchaudio/prototype/pipelines
      copying torchaudio/prototype/pipelines/rnnt_pipeline.py -> build/lib.macosx-11.0-arm64-3.9/torchaudio/prototype/pipelines
      creating build/lib.macosx-11.0-arm64-3.9/torchaudio/prototype/ctc_decoder
      copying torchaudio/prototype/ctc_decoder/ctc_decoder.py -> build/lib.macosx-11.0-arm64-3.9/torchaudio/prototype/ctc_decoder
      copying torchaudio/prototype/ctc_decoder/__init__.py -> build/lib.macosx-11.0-arm64-3.9/torchaudio/prototype/ctc_decoder
      warning: build_py: byte-compiling is disabled, skipping.
      ```
      
      * After the change
      
      ```
      $ BUILD_FFMPEG=0 BUILD_SOX=0 BUILD_CTC_DECODER=0 BUILD_RNNT=0 BUILD_KALDI=0 python setup.py clean bdist_wheel
      ```
      
      ```
      -- Git branch: prototype-exclusion
      -- Git SHA: 0af1edaa420c46be10292cbea7150c34ef80a0e1
      -- Git tag: None
      -- PyTorch dependency: torch
      -- Building version 0.11.0+0af1eda
       --- Initializing submodules
       --- Initialized submodule
      Excluding torchaudio.prototype from the package.
      ...
      creating build/lib.macosx-11.0-arm64-3.9/torchaudio/models/wav2vec2
      copying torchaudio/models/wav2vec2/__init__.py -> build/lib.macosx-11.0-arm64-3.9/torchaudio/models/wav2vec2
      copying torchaudio/models/wav2vec2/model.py -> build/lib.macosx-11.0-arm64-3.9/torchaudio/models/wav2vec2
      copying torchaudio/models/wav2vec2/components.py -> build/lib.macosx-11.0-arm64-3.9/torchaudio/models/wav2vec2
      creating build/lib.macosx-11.0-arm64-3.9/torchaudio/models/wav2vec2/utils
      copying torchaudio/models/wav2vec2/utils/__init__.py -> build/lib.macosx-11.0-arm64-3.9/torchaudio/models/wav2vec2/utils
      copying torchaudio/models/wav2vec2/utils/import_huggingface.py -> build/lib.macosx-11.0-arm64-3.9/torchaudio/models/wav2vec2/utils
      copying torchaudio/models/wav2vec2/utils/import_fairseq.py -> build/lib.macosx-11.0-arm64-3.9/torchaudio/models/wav2vec2/utils
      warning: build_py: byte-compiling is disabled, skipping.
      ```
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2225
      
      Reviewed By: nateanl
      
      Differential Revision: D34257128
      
      Pulled By: mthrok
      
      fbshipit-source-id: a3d6eca5803356e5aa3fe0eda82f6a9f5affb8e8
  9. 15 Feb, 2022 3 commits
    • Improve ffmpeg library discovery (#2204) · 963905e4
      moto authored
      Summary:
      This commit fixes issues with ffmpeg discovery at build time.
      The original implementation had the following issues:
      
      1. Wrong usage of FindFFMPEG, which caused a mixture of ffmpeg libraries from the system directory and the user directory.
      2. The optional `FFMPEG_ROOT` variable was not set within cmake.
      
      Issue 1 is problematic when a user does not have permission to
      modify the environment. For example, if an old version of ffmpeg is
      installed in a directory managed by the system (such as `/usr/local/lib`),
      there is no way to specify the path in which the user installed a supported version
      of ffmpeg.
      
      This commit changes the behavior to first search for the library
      in the path given by the `FFMPEG_ROOT` environment variable, then
      fall back to the original behavior of searching the custom paths together with
      the system default paths.
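
The search order described above can be illustrated with a short Python sketch. The real logic is implemented in the CMake build scripts, which are not shown here. The function `find_library_dir`, the `lib` subdirectory layout under the root, and the file name `libavcodec.so` are assumptions made for this example.

```python
import os
import tempfile


def find_library_dir(filename, root=None, default_dirs=()):
    """Return the first directory containing `filename`, preferring `root`."""
    candidates = []
    if root:
        # An explicit root takes priority over every default location.
        candidates.append(os.path.join(root, "lib"))
    candidates.extend(default_dirs)
    for directory in candidates:
        if os.path.isfile(os.path.join(directory, filename)):
            return directory
    return None


with tempfile.TemporaryDirectory() as tmp:
    # Simulate a system-managed copy and a user-installed copy of the library.
    system_dir = os.path.join(tmp, "system", "lib")
    user_root = os.path.join(tmp, "user")
    for directory in (system_dir, os.path.join(user_root, "lib")):
        os.makedirs(directory)
        open(os.path.join(directory, "libavcodec.so"), "w").close()

    with_root = find_library_dir("libavcodec.so", root=user_root, default_dirs=[system_dir])
    without_root = find_library_dir("libavcodec.so", default_dirs=[system_dir])
    found_user_copy = with_root == os.path.join(user_root, "lib")
    found_system_copy = without_root == system_dir

print(found_user_copy, found_system_copy)
```

In a real build the root would come from something like `os.environ.get("FFMPEG_ROOT")`. The point of the ordering is that a user-installed copy wins over a system-managed one whenever the root is set.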
      
      Also, this commit removes support for `libavresample`, which is deprecated in
      ffmpeg 4 and removed in ffmpeg 5.
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2204
      
      Reviewed By: carolineechen
      
      Differential Revision: D34225769
      
      Pulled By: mthrok
      
      fbshipit-source-id: 95b0bfaaef31e2e69e6df29f789010f48a48210b
    • Update context building to not delay the inference (#2213) · 8e3c6144
      moto authored
      Summary:
      Updates the context cacher so that each fetched audio chunk is used for inference immediately.
      
      https://github.com/pytorch/audio/pull/2202#discussion_r802838174
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2213
      
      Reviewed By: hwangjeff
      
      Differential Revision: D34235230
      
      Pulled By: mthrok
      
      fbshipit-source-id: 6e4aee7cca34ca81e40c0cb13497182f20f7f04e
    • Adjust Conformer args (#2223) · 411b5dcf
      hwangjeff authored
      Summary:
      Orders and names Conformer's initializer args to be more consistent with Emformer's.
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2223
      
      Reviewed By: mthrok
      
      Differential Revision: D34226177
      
      Pulled By: hwangjeff
      
      fbshipit-source-id: 111c7ff27841aeac302ea5f6f7b50cc72c570829
  10. 11 Feb, 2022 6 commits