1. 24 Feb, 2022 1 commit
  2. 23 Feb, 2022 1 commit
  3. 18 Feb, 2022 2 commits
  4. 17 Feb, 2022 4 commits
  5. 16 Feb, 2022 10 commits
    • Add EMFORMER_RNNT_BASE_MUSTC into pipeline demo script (#2248) · 38569ef0
      Zhaoheng Ni authored
      Summary:
      This PR adds `EMFORMER_RNNT_BASE_MUSTC` support to `pipeline_demo.py`. The bundle is trained on the MuST-C release 2.0 dataset. The model preserves the casing and punctuation of the transcript.
      
      Here is a screen recording of how it works in streaming and non-streaming modes:
      
      https://user-images.githubusercontent.com/8653221/154356521-fe84bdc1-fb0c-41bd-8729-9edbb3224a07.mov
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2248
      
      Reviewed By: hwangjeff
      
      Differential Revision: D34282598
      
      Pulled By: nateanl
      
      fbshipit-source-id: 42ed7e2623031dfebd176ef0c6bfd70da3c897d4
    • Refactor torchscript consistency test in functional (#2246) · 87d79889
      Zhaoheng Ni authored
      Summary:
      In the torchscript_consistency tests, the `func` in each test method accepts only one `tensor` as its argument; the other arguments of the `F.xyz` function have to be defined inside `func`. If `F.xyz` has no `Tensor` argument, the test passes a `dummy` tensor that is never used. This PR refactors `_assert_consistency` and `_assert_consistency_complex` to accept a tuple of inputs instead of a single `tensor`.
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2246
      
      Reviewed By: carolineechen
      
      Differential Revision: D34273057
      
      Pulled By: nateanl
      
      fbshipit-source-id: a3900edb3b2c58638e513e1490279d771ebc3d0b
    • Refactor pipeline_demo script in emformer_rnnt recipes (#2239) · fdea0a7c
      Zhaoheng Ni authored
      Summary:
      - Use a dictionary to select the `RNNTBundle` and the corresponding dataset.
      - Use the dictionary's keys as the choices in ArgumentParser.
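
A minimal sketch of the pattern described in the two bullets above, using only the standard library. The `Config` dataclass, the registry keys, and the `--model-type` flag name are hypothetical stand-ins chosen for illustration; the real script maps to actual `RNNTBundle` objects and dataset classes and may be organized differently.

```python
import argparse
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    """Hypothetical stand-in for an (RNNTBundle, dataset) pair."""

    bundle_name: str
    dataset_name: str


# Hypothetical registry: one entry per supported model type.
MODEL_TYPE_TO_CONFIG = {
    "librispeech": Config("EMFORMER_RNNT_BASE_LIBRISPEECH", "LIBRISPEECH"),
    "mustc": Config("EMFORMER_RNNT_BASE_MUSTC", "MUSTC"),
}


def parse_args(argv=None):
    parser = argparse.ArgumentParser(description="Pipeline demo (sketch).")
    parser.add_argument(
        "--model-type",
        type=str,
        choices=sorted(MODEL_TYPE_TO_CONFIG),  # the keys double as valid choices
        required=True,
        help="Which bundle/dataset pair to use.",
    )
    return parser.parse_args(argv)


def select_config(argv=None):
    """Parse the command line and look up the matching registry entry."""
    args = parse_args(argv)
    return MODEL_TYPE_TO_CONFIG[args.model_type]
```

Because the choices are derived from the dictionary, adding a new entry to the registry makes it selectable from the command line without touching the parser.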
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2239
      
      Reviewed By: mthrok
      
      Differential Revision: D34267070
      
      Pulled By: nateanl
      
      fbshipit-source-id: 99c7942d5c7c1518694e1ae02a55a7decd87c220
    • Refactor eval and pipeline_demo scripts in emformer_rnnt (#2238) · e3b40d1c
      Zhaoheng Ni authored
      Summary:
      - Add docstrings to `eval.py` and `pipeline_demo.py` under the `emformer_rnnt` directory.
      - Refactor the logger and ArgumentParser setup.
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2238
      
      Reviewed By: mthrok
      
      Differential Revision: D34267059
      
      Pulled By: nateanl
      
      fbshipit-source-id: 4b8d3d183ee7bc0ad71ce305cab87bfa90208b2e
    • Add complex dtype support in functional autograd test (#2244) · eeba91dc
      Zhaoheng Ni authored
      Summary:
      In the autograd tests, real-valued Tensors are converted to `torch.float64` to guarantee precision. Complex dtypes, however, were not handled. This PR adds `self.complex_dtype` support for the inputs.
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2244
      
      Reviewed By: mthrok
      
      Differential Revision: D34272998
      
      Pulled By: nateanl
      
      fbshipit-source-id: e8698a74d7b8d99ee0fcb5f5cb5f2ffc8c80b9b5
    • Fix lm used for ctc decoder example (#2235) · c2decba4
      Caroline Chen authored
      Summary:
      The LM in the example script was unintentionally changed to None when no-LM support was added earlier. This PR changes it back, so the script is consistent with the WERs listed in the README.
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2235
      
      Reviewed By: nateanl
      
      Differential Revision: D34273042
      
      Pulled By: carolineechen
      
      fbshipit-source-id: 824b1ce18195e39dc534b2ec9c5312bbe3bb1812
    • Add shebang lines to scripts in emformer_rnnt recipes (#2237) · aac83fe5
      Zhaoheng Ni authored
      Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2237
      
      Reviewed By: mthrok
      
      Differential Revision: D34267000
      
      Pulled By: nateanl
      
      fbshipit-source-id: 4c264aea6cf3fba5d8728d5fe60f9f471815852d
    • Add EMFORMER_RNNT_BASE_MUSTC bundle to torchaudio.prototype (#2241) · 99b5ef5c
      Zhaoheng Ni authored
      Summary:
      This PR provides an `RNNTBundle` that is pre-trained on the MuST-C release v2.0 dataset.
      The casing and punctuation of the transcripts are preserved when training the SentencePiece model.
      
      Here is the model performance on the dev and test sets of MuST-C 2.0:
      |                   |          WER |
      |:-----------------:|-------------:|
      | dev               |       0.190  |
      | tst-COMMON        |       0.213  |
      | tst-HE            |       0.186  |
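
For readers unfamiliar with the metric, the following is a minimal sketch of how a word error rate like those in the table can be computed. It assumes whitespace tokenization and a corpus-level ratio (total word edits divided by total reference words); the recipe's own evaluation script may tokenize or aggregate differently.

```python
def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(
                min(
                    prev[j] + 1,         # deletion
                    curr[j - 1] + 1,     # insertion
                    prev[j - 1] + cost,  # substitution or match
                )
            )
        prev = curr
    return prev[-1]


def word_error_rate(references, hypotheses):
    """Corpus-level WER: total edits divided by total reference words."""
    total_edits = 0
    total_words = 0
    for ref, hyp in zip(references, hypotheses):
        ref_words, hyp_words = ref.split(), hyp.split()
        total_edits += edit_distance(ref_words, hyp_words)
        total_words += len(ref_words)
    return total_edits / total_words
```

In this sketch no text normalization is applied, so a hypothesis such as `hello world` scored against the reference `Hello, world.` counts as two substitutions. That is the kind of error a model trained on cased, punctuated transcripts can avoid.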
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2241
      
      Reviewed By: mthrok
      
      Differential Revision: D34267792
      
      Pulled By: nateanl
      
      fbshipit-source-id: 67bca9f277e66d41a4530d01615f249b3cec7167
    • Refactor ArgumentParser arguments in emformer_rnnt recipes (#2236) · 81f56f64
      Zhaoheng Ni authored
      Summary:
      Replace underscore with dash in ArgumentParser's arguments.
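
A short illustration of what this renaming means for callers, using the standard `argparse` module. The option names `--checkpoint-path` and `--num-epochs` are hypothetical examples, not necessarily the options used in the recipes.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(description="Dash-style options (sketch).")
    # Dashes are used on the command line ...
    parser.add_argument("--checkpoint-path", type=str, required=True)
    parser.add_argument("--num-epochs", type=int, default=1)
    return parser


args = build_parser().parse_args(
    ["--checkpoint-path", "model.ckpt", "--num-epochs", "3"]
)
# ... while argparse exposes them as attributes with underscores.
print(args.checkpoint_path, args.num_epochs)
```

Code that reads the parsed attributes does not need to change, since argparse converts dashes to underscores when deriving attribute names. Command lines that still use the underscore spelling do need to be updated.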
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2236
      
      Reviewed By: mthrok
      
      Differential Revision: D34266977
      
      Pulled By: nateanl
      
      fbshipit-source-id: ceacac12c04016a8dbf2a1a7d6bbcf65d4d53d21
    • Fix prototype exclusion in release (#2225) · a007e922
      moto authored
      Summary:
      This commit fixes the feature that excludes the `torchaudio.prototype` module from packaging.
      
      In `setup.py` there is a special case, triggered when the commit is on a release branch or a release tag, that excludes `torchaudio.prototype`. It was introduced to simplify release-related work.
      It turned out that the submodules under `torchaudio.prototype`, such as `torchaudio.prototype.pipelines`, were not properly excluded from packaging.
      These submodules did not exist in previous releases, so this was not an issue before.
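
The failure mode described above can be reproduced with a small standard-library sketch. It mimics wildcard-based exclusion of dotted package names, which is how setuptools documents the `exclude` argument of `find_packages` (`foo.*` excludes the subpackages of `foo` but not `foo` itself). The helper `filter_packages` is hypothetical, and the actual fix in `setup.py` may be implemented differently.

```python
from fnmatch import fnmatchcase


def filter_packages(packages, exclude):
    """Drop every package whose dotted name matches an exclude pattern."""
    return [
        name
        for name in packages
        if not any(fnmatchcase(name, pattern) for pattern in exclude)
    ]


packages = [
    "torchaudio",
    "torchaudio.models",
    "torchaudio.prototype",
    "torchaudio.prototype.io",
    "torchaudio.prototype.pipelines",
    "torchaudio.prototype.ctc_decoder",
]

# Excluding only the parent name leaves its subpackages in the list.
parent_only = filter_packages(packages, ["torchaudio.prototype"])

# Adding a wildcard pattern removes the subpackages as well.
with_wildcard = filter_packages(
    packages, ["torchaudio.prototype", "torchaudio.prototype.*"]
)

print(parent_only)
print(with_wildcard)
```

With the parent-only pattern, `torchaudio.prototype.io`, `torchaudio.prototype.pipelines`, and `torchaudio.prototype.ctc_decoder` survive the filter, which matches the "before" build log in which those directories are still copied.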
      
      **Note** This feature is triggered only on a release branch, so the fix is not visible in the CI of this PR.
      https://app.circleci.com/pipelines/github/pytorch/audio/9674/workflows/d0c9a6f1-8ca9-441a-a5f5-08926075fa39/jobs/553985?invite=true#step-104-193
      
      The following outputs were observed when running the build in a local environment.
      
      * Before the change
      
      ```
      $ BUILD_FFMPEG=0 BUILD_SOX=0 BUILD_CTC_DECODER=0 BUILD_RNNT=0 BUILD_KALDI=0 python setup.py clean bdist_wheel
      ```
      ```
      -- Git branch: prototype-exclusion
      -- Git SHA: 0af1edaa420c46be10292cbea7150c34ef80a0e1
      -- Git tag: None
      -- PyTorch dependency: torch
      -- Building version 0.11.0+0af1eda
       --- Initializing submodules
       --- Initialized submodule
      Excluding torchaudio.prototype from the package.
      ...
      creating build/lib.macosx-11.0-arm64-3.9/torchaudio/prototype
      creating build/lib.macosx-11.0-arm64-3.9/torchaudio/prototype/io
      copying torchaudio/prototype/io/streamer.py -> build/lib.macosx-11.0-arm64-3.9/torchaudio/prototype/io
      copying torchaudio/prototype/io/__init__.py -> build/lib.macosx-11.0-arm64-3.9/torchaudio/prototype/io
      creating build/lib.macosx-11.0-arm64-3.9/torchaudio/prototype/pipelines
      copying torchaudio/prototype/pipelines/__init__.py -> build/lib.macosx-11.0-arm64-3.9/torchaudio/prototype/pipelines
      copying torchaudio/prototype/pipelines/rnnt_pipeline.py -> build/lib.macosx-11.0-arm64-3.9/torchaudio/prototype/pipelines
      creating build/lib.macosx-11.0-arm64-3.9/torchaudio/prototype/ctc_decoder
      copying torchaudio/prototype/ctc_decoder/ctc_decoder.py -> build/lib.macosx-11.0-arm64-3.9/torchaudio/prototype/ctc_decoder
      copying torchaudio/prototype/ctc_decoder/__init__.py -> build/lib.macosx-11.0-arm64-3.9/torchaudio/prototype/ctc_decoder
      warning: build_py: byte-compiling is disabled, skipping.
      ```
      
      * After the change
      
      ```
      $ BUILD_FFMPEG=0 BUILD_SOX=0 BUILD_CTC_DECODER=0 BUILD_RNNT=0 BUILD_KALDI=0 python setup.py clean bdist_wheel
      ```
      
      ```
      -- Git branch: prototype-exclusion
      -- Git SHA: 0af1edaa420c46be10292cbea7150c34ef80a0e1
      -- Git tag: None
      -- PyTorch dependency: torch
      -- Building version 0.11.0+0af1eda
       --- Initializing submodules
       --- Initialized submodule
      Excluding torchaudio.prototype from the package.
      ...
      creating build/lib.macosx-11.0-arm64-3.9/torchaudio/models/wav2vec2
      copying torchaudio/models/wav2vec2/__init__.py -> build/lib.macosx-11.0-arm64-3.9/torchaudio/models/wav2vec2
      copying torchaudio/models/wav2vec2/model.py -> build/lib.macosx-11.0-arm64-3.9/torchaudio/models/wav2vec2
      copying torchaudio/models/wav2vec2/components.py -> build/lib.macosx-11.0-arm64-3.9/torchaudio/models/wav2vec2
      creating build/lib.macosx-11.0-arm64-3.9/torchaudio/models/wav2vec2/utils
      copying torchaudio/models/wav2vec2/utils/__init__.py -> build/lib.macosx-11.0-arm64-3.9/torchaudio/models/wav2vec2/utils
      copying torchaudio/models/wav2vec2/utils/import_huggingface.py -> build/lib.macosx-11.0-arm64-3.9/torchaudio/models/wav2vec2/utils
      copying torchaudio/models/wav2vec2/utils/import_fairseq.py -> build/lib.macosx-11.0-arm64-3.9/torchaudio/models/wav2vec2/utils
      warning: build_py: byte-compiling is disabled, skipping.
      ```
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2225
      
      Reviewed By: nateanl
      
      Differential Revision: D34257128
      
      Pulled By: mthrok
      
      fbshipit-source-id: a3d6eca5803356e5aa3fe0eda82f6a9f5affb8e8
  6. 15 Feb, 2022 3 commits
    • Improve ffmpeg library discovery (#2204) · 963905e4
      moto authored
      Summary:
      This commit fixes issues with ffmpeg discovery at build time.
      The original implementation had the following issues:
      
      1. Wrong usage of FindFFMPEG, which caused mixture of ffmpeg libraries from system directory and user directory.
      2. The optional `FFMPEG_ROOT` variable was not set within cmake.
      
      Issue 1 is problematic when a user does not have permission to
      modify the environment. For example, if an old version of ffmpeg is
      installed in a directory managed by the system (such as `/usr/local/lib`),
      there is no way to specify a path in which the user has installed a supported
      version of ffmpeg.
      
      This commit changes the behavior to first search for the libraries
      in the path given by the `FFMPEG_ROOT` environment variable, then
      fall back to the original behavior of searching the custom paths together
      with the system default paths.
      
      This commit also removes support for `libavresample`, which is deprecated in
      ffmpeg 4 and removed in ffmpeg 5.
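
The search order described in this summary (user-specified root first, system defaults second) can be sketched in plain Python. This only illustrates the lookup order and is not the CMake logic from the PR. The `lib` subdirectory under `FFMPEG_ROOT`, the fallback directory list, and the helper name `find_library_dir` are all assumptions made for the example.

```python
import os
from pathlib import Path

# Hypothetical fallback locations; the real build relies on CMake's defaults.
DEFAULT_LIB_DIRS = ("/usr/local/lib", "/usr/lib")


def find_library_dir(lib_filename, env=None, default_dirs=DEFAULT_LIB_DIRS):
    """Return the first directory that contains ``lib_filename``, or None.

    The directory derived from ``FFMPEG_ROOT`` (if set) is tried before any
    of the default directories.
    """
    env = os.environ if env is None else env
    candidates = []
    root = env.get("FFMPEG_ROOT")
    if root:
        # Assumed layout: libraries live under <FFMPEG_ROOT>/lib.
        candidates.append(Path(root) / "lib")
    candidates.extend(Path(d) for d in default_dirs)
    for directory in candidates:
        if (directory / lib_filename).is_file():
            return directory
    return None
```

Trying the user-specified root first means that an outdated copy in a system-managed directory no longer shadows a supported version that the user installed elsewhere.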
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2204
      
      Reviewed By: carolineechen
      
      Differential Revision: D34225769
      
      Pulled By: mthrok
      
      fbshipit-source-id: 95b0bfaaef31e2e69e6df29f789010f48a48210b
    • Update context building to not delay the inference (#2213) · 8e3c6144
      moto authored
      Summary:
      Update the context cacher so that each fetched audio chunk is used for inference immediately.
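
One plausible shape for such a cacher is sketched below: it keeps the tail of the previous chunk as context and prepends it to the chunk that was just fetched, so the new audio is passed on in the same call. Plain Python lists stand in for audio tensors, and the class and parameter names are assumptions; the code in the PR may differ.

```python
class ContextCacher:
    """Prepend the tail of the previous chunk to each new chunk.

    Plain lists stand in for audio tensors in this sketch.
    """

    def __init__(self, segment_length, context_length):
        self.segment_length = segment_length
        self.context_length = context_length
        # Before the first chunk arrives, the context is silence.
        self.context = [0.0] * context_length

    def __call__(self, chunk):
        chunk = list(chunk)
        if len(chunk) < self.segment_length:
            # Zero-pad a short (for example, final) chunk to the full length.
            chunk += [0.0] * (self.segment_length - len(chunk))
        chunk_with_context = self.context + chunk
        # Keep the tail of this chunk as context for the next call.
        self.context = chunk[len(chunk) - self.context_length:]
        return chunk_with_context


cacher = ContextCacher(segment_length=4, context_length=2)
first = cacher([1.0, 2.0, 3.0, 4.0])
second = cacher([5.0, 6.0])
print(first)
print(second)
```

Each call returns `context_length + segment_length` samples, and the newly fetched samples are always part of the returned value rather than being held back for the next call.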
      
      https://github.com/pytorch/audio/pull/2202#discussion_r802838174
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2213
      
      Reviewed By: hwangjeff
      
      Differential Revision: D34235230
      
      Pulled By: mthrok
      
      fbshipit-source-id: 6e4aee7cca34ca81e40c0cb13497182f20f7f04e
    • Adjust Conformer args (#2223) · 411b5dcf
      hwangjeff authored
      Summary:
      Orders and names Conformer's initializer args to be more consistent with Emformer's.
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2223
      
      Reviewed By: mthrok
      
      Differential Revision: D34226177
      
      Pulled By: hwangjeff
      
      fbshipit-source-id: 111c7ff27841aeac302ea5f6f7b50cc72c570829
  7. 11 Feb, 2022 7 commits
  8. 10 Feb, 2022 1 commit
  9. 09 Feb, 2022 2 commits
    • Clean up Emformer (#2207) · 87d7694d
      hwangjeff authored
      Summary:
      - Make `segment_length` a required argument rather than optional argument to force users to consciously choose input segment lengths for their use cases.
      - Clarify expected input shapes in API documentation.
      - Adjust `infer` tests to reflect expected usage.
      - Add assertion for input shape for `infer`.
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2207
      
      Reviewed By: mthrok
      
      Differential Revision: D34101205
      
      Pulled By: hwangjeff
      
      fbshipit-source-id: 1d1233d5edee5818d4669b4e47d44559e7ebb304
    • Fix librosa calls (#2208) · e5d567c9
      hwangjeff authored
      Summary:
      Yesterday's release of librosa 0.9.0 made args keyword-only and changed default padding from "reflect" to "zero" for some functions. This PR adjusts callsites in our tutorials and tests accordingly.
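
To illustrate why both changes matter at call sites, the sketch below defines a hypothetical helper, `pad_signal`, with keyword-only arguments and the two padding modes mentioned above. It is not librosa's API. It only shows that positional calls stop working once arguments become keyword-only, and that the two padding modes give different edge values.

```python
def pad_signal(samples, *, pad, mode):
    """Pad ``samples`` on both sides by ``pad`` values.

    ``pad`` and ``mode`` are keyword-only, so callers must name them.
    """
    samples = list(samples)
    if mode == "zero":
        edge = [0] * pad
        return edge + samples + edge
    if mode == "reflect":
        if pad >= len(samples):
            raise ValueError("reflect padding needs pad < len(samples)")
        left = samples[pad:0:-1]         # mirror around the first sample
        right = samples[-2:-pad - 2:-1]  # mirror around the last sample
        return left + samples + right
    raise ValueError(f"unknown mode: {mode!r}")


signal = [1, 2, 3, 4]
print(pad_signal(signal, pad=2, mode="reflect"))
print(pad_signal(signal, pad=2, mode="zero"))
```

A positional call such as `pad_signal(signal, 2, "reflect")` raises `TypeError`. Naming the padding mode explicitly at each call site also keeps results stable when a library changes its default.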
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2208
      
      Reviewed By: mthrok
      
      Differential Revision: D34099793
      
      Pulled By: hwangjeff
      
      fbshipit-source-id: 4e2642cdda8aae6d0a928befaf1bbb3873d229bc
  10. 04 Feb, 2022 1 commit
  11. 03 Feb, 2022 3 commits
  12. 02 Feb, 2022 5 commits