- 25 Jul, 2022 1 commit
-
-
Sean Kim authored
Summary: Previous issue: `--use-tmp-hub-dir` expected the temporary directories used to store large files to be deleted after each test case, but pytest only erases such directories after 3 full test sessions. This commit fixes the problem by manually deleting a new subdirectory created in each test case. https://github.com/pytorch/audio/pull/2565#discussion_r929007101 Pull Request resolved: https://github.com/pytorch/audio/pull/2569 Reviewed By: nateanl Differential Revision: D38117848 Pulled By: skim0514 fbshipit-source-id: 3767cb8df1238fd6218f6aaa58d5d583cea72699
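The cleanup approach described in this commit can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the code from the PR; the helper name `per_test_subdir` and the file name `model.bin` are hypothetical.

```python
import os
import shutil
import tempfile
from contextlib import contextmanager


@contextmanager
def per_test_subdir(base_dir):
    """Create a fresh subdirectory under base_dir and remove it on exit.

    Hypothetical helper: the real test suite may structure this differently.
    """
    path = tempfile.mkdtemp(dir=base_dir)
    try:
        yield path
    finally:
        # Delete explicitly instead of waiting for the test runner's own cleanup.
        shutil.rmtree(path, ignore_errors=True)


def demo():
    """Return (existed_during_test, exists_after_test) for one simulated test case."""
    base = tempfile.mkdtemp()
    try:
        with per_test_subdir(base) as sub:
            # Stand-in for a large downloaded file.
            with open(os.path.join(sub, "model.bin"), "wb") as f:
                f.write(b"\0" * 1024)
            existed = os.path.isdir(sub)
        return existed, os.path.exists(sub)
    finally:
        shutil.rmtree(base, ignore_errors=True)
```

Wrapping the deletion in `finally` means the subdirectory is removed even when the test body raises, so large files do not accumulate across test cases.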
-
- 22 Jul, 2022 1 commit
-
-
Zhaoheng Ni authored
Summary: - Add documentation page for `SourceSeparationBundle` and `CONVTASNET_BASE_LIBRI2MIX`. - Add citation of Libri2Mix dataset in the bundle documentation. - The URL in the integration test should be joined with a slash rather than `os.path.join`, which fails on Windows. Change it to an f-string. Pull Request resolved: https://github.com/pytorch/audio/pull/2559 Reviewed By: carolineechen Differential Revision: D38036116 Pulled By: nateanl fbshipit-source-id: 736732805191113955badfec3955e2e24e8f4836
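The Windows failure mentioned here can be reproduced on any platform with `ntpath`, the module that `os.path` resolves to on Windows. The sketch below is illustrative only; `BASE_URL` and the file name are placeholders, not the values used in the integration test.

```python
import ntpath

# Placeholder host, assumed for illustration only.
BASE_URL = "https://example.com/assets"


def url_with_fstring(name):
    # URLs always use forward slashes, regardless of the operating system.
    return f"{BASE_URL}/{name}"


def url_with_windows_join(name):
    # ntpath.join behaves like os.path.join on Windows and inserts a backslash.
    return ntpath.join(BASE_URL, name)


good = url_with_fstring("mixture.wav")
bad = url_with_windows_join("mixture.wav")
```

`os.path.join` is meant for filesystem paths, so its separator depends on the platform; building URLs with an f-string (or `urllib.parse.urljoin`) avoids that dependency.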
-
- 21 Jul, 2022 1 commit
-
-
Zhaoheng Ni authored
Summary: - Add `SourceSeparationBundle` class for source separation pipeline - Add `CONVTASNET_BASE_LIBRI2MIX` that is trained on Libri2Mix dataset. - Add integration test with example mixture audio and expected scale-invariant signal-to-distortion ratio (Si-SDR) score. The test computes the Si-SDR score with permutation-invariant training (PIT) criterion for all permutations of sources and uses the highest value as the final output. The test verifies that the score is equal to or larger than the expected value. Pull Request resolved: https://github.com/pytorch/audio/pull/2440 Reviewed By: mthrok Differential Revision: D37997646 Pulled By: nateanl fbshipit-source-id: c951bcbbe8b7ed9553cb8793d6dc1ef90d5a29fe
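The scoring step described above can be sketched in plain Python. This is a simplified illustration, not the test's actual implementation: it works on lists of floats rather than tensors, skips mean subtraction, averages the per-source scores within each permutation (an assumption), and the function names are hypothetical.

```python
import math
from itertools import permutations


def si_sdr(estimate, reference, eps=1e-8):
    """Scale-invariant SDR in dB between two equal-length signals."""
    dot = sum(e * r for e, r in zip(estimate, reference))
    ref_energy = sum(r * r for r in reference)
    # Project the estimate onto the reference to remove the scale ambiguity.
    scale = dot / (ref_energy + eps)
    target = [scale * r for r in reference]
    noise = [e - t for e, t in zip(estimate, target)]
    target_energy = sum(t * t for t in target)
    noise_energy = sum(n * n for n in noise)
    return 10 * math.log10((target_energy + eps) / (noise_energy + eps))


def best_permutation_si_sdr(estimates, references):
    """Try every assignment of estimates to references and keep the highest mean score."""
    best = -math.inf
    for perm in permutations(range(len(references))):
        scores = [si_sdr(estimates[i], references[p]) for i, p in enumerate(perm)]
        best = max(best, sum(scores) / len(scores))
    return best


# Two toy sources; the estimates come out in swapped order and with different gains.
references = [[1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, -1.0]]
estimates = [[0.0, 2.0, 0.0, -2.0], [0.5, 0.0, 0.5, 0.0]]
score = best_permutation_si_sdr(estimates, references)
```

Because a separation model has no fixed output order, searching over permutations keeps the score from being penalized when the sources are recovered correctly but emitted in a different order.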
-
- 27 Jun, 2022 1 commit
-
-
Zhaoheng Ni authored
Summary: In https://github.com/pytorch/audio/issues/2283, torchaudio's downloading function is updated to reduce code duplication. The links in `EMFORMER_RNNT_BASE_LIBRISPEECH` are updated, but the ones in prototype pipelines are not. This PR addresses it by updating the download links of `EMFORMER_RNNT_BASE_MUSTC` and `EMFORMER_RNNT_BASE_TEDLIUM3` in prototype. Corresponding integration tests are added as well. Pull Request resolved: https://github.com/pytorch/audio/pull/2444 Reviewed By: mthrok Differential Revision: D37389178 Pulled By: nateanl fbshipit-source-id: 46598dd71c95be47d1e1b54cef89ea51d280e17a
-
- 01 Jun, 2022 1 commit
-
-
Caroline Chen authored
Summary: Move CTC beam search decoder out of prototype to new `torchaudio.models.decoder` module. hwangjeff mthrok any thoughts on the new module + naming, and if we should move rnnt beam search here as well?? Pull Request resolved: https://github.com/pytorch/audio/pull/2410 Reviewed By: mthrok Differential Revision: D36784521 Pulled By: carolineechen fbshipit-source-id: a2ec52f86bba66e03327a9af0c5df8bbefcd67ed
-
- 15 May, 2022 1 commit
-
-
John Reese authored
Summary: Applies new import merging and sorting from µsort v1.0. When merging imports, µsort will make a best-effort to move associated comments to match merged elements, but there are known limitations due to the dynamic nature of Python and developer tooling. These changes should not produce any dangerous runtime changes, but may require touch-ups to satisfy linters and other tooling. Note that µsort uses case-insensitive, lexicographical sorting, which results in a different ordering compared to isort. This provides a more consistent sorting order, matching the case-insensitive order used when sorting import statements by module name, and ensures that "frog", "FROG", and "Frog" always sort next to each other. For details on µsort's sorting and merging semantics, see the user guide: https://usort.readthedocs.io/en/stable/guide.html#sorting Reviewed By: lisroach Differential Revision: D36402214 fbshipit-source-id: b641bfa9d46242188524d4ae2c44998922a62b4c
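The difference between case-sensitive and case-insensitive ordering can be seen with Python's built-in `sorted`. This only illustrates the ordering principle; it is not µsort's actual algorithm, and the name list is made up.

```python
names = ["frog", "Zebra", "FROG", "apple", "Frog"]

# Default string comparison orders by code point, so uppercase sorts before lowercase.
case_sensitive = sorted(names)

# Comparing case-folded keys keeps spellings of the same word adjacent.
case_insensitive = sorted(names, key=str.casefold)

print(case_sensitive)    # ['FROG', 'Frog', 'Zebra', 'apple', 'frog']
print(case_insensitive)  # ['apple', 'frog', 'FROG', 'Frog', 'Zebra']
```

In the case-sensitive result, "frog" is separated from "FROG" and "Frog" by unrelated names, which is the inconsistency the commit message refers to.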
-
- 26 Apr, 2022 1 commit
-
-
Caroline Chen authored
Summary: Add support for lexicon-free decoding based on [fairseq's](https://github.com/pytorch/fairseq/blob/main/examples/speech_recognition/new/decoders/flashlight_decoder.py#L53) implementation. Reached numerical parity with fairseq's decoder in offline experimentation. Follow-ups: - Add pretrained LM support for lexicon-free decoding - Add example in tutorial - Replace flashlight C++ source code with flashlight text submodule - [optional] fairseq compatibility test Pull Request resolved: https://github.com/pytorch/audio/pull/2342 Reviewed By: nateanl Differential Revision: D35856104 Pulled By: carolineechen fbshipit-source-id: b64286550984df906ebb747e82f6fb1f21948ac7
-
- 21 Apr, 2022 1 commit
-
-
hwangjeff authored
Summary: PyTorch Lite, which is becoming a standard for mobile PyTorch usage, does not support containers containing custom classes. Consequently, because TorchAudio's RNN-T decoder currently returns and accepts lists of `Hypothesis` namedtuples, it is not compatible with PyTorch Lite. This PR resolves said incompatibility by changing the underlying implementation of `Hypothesis` to tuple. Pull Request resolved: https://github.com/pytorch/audio/pull/2339 Reviewed By: nateanl Differential Revision: D35806529 Pulled By: hwangjeff fbshipit-source-id: 9cbae5504722390511d35e7f9966af2519ccede5
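The kind of change described here can be sketched as replacing a named tuple class with a plain tuple type alias plus accessor functions. The fields below (`tokens`, `score`) and the helper names are simplified assumptions for illustration and do not reflect the actual layout of TorchAudio's `Hypothesis`.

```python
from typing import List, NamedTuple, Tuple


# Before (illustrative): a custom class, which a restricted runtime may reject
# when it appears inside a container such as List[...].
class HypothesisNamedTuple(NamedTuple):
    tokens: List[int]
    score: float


# After (illustrative): a plain tuple alias built only from standard types.
Hypothesis = Tuple[List[int], float]


def get_tokens(hypo: Hypothesis) -> List[int]:
    # Accessor functions replace attribute access such as `hypo.tokens`.
    return hypo[0]


def get_score(hypo: Hypothesis) -> float:
    return hypo[1]


old_style = HypothesisNamedTuple(tokens=[3, 1, 4], score=-1.5)
new_style: Hypothesis = ([3, 1, 4], -1.5)
```

The trade-off is readability: positional access is less self-documenting than named fields, which is why small accessor helpers are a common companion to this kind of change.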
-
- 25 Mar, 2022 1 commit
-
-
Caroline Chen authored
Summary: add function to download pretrained files for LibriSpeech 3-gram/4-gram KenLM, tests, and updated tutorial Pull Request resolved: https://github.com/pytorch/audio/pull/2275 Reviewed By: mthrok Differential Revision: D35115418 Pulled By: carolineechen fbshipit-source-id: 83ff22380fce9c753bb4a7b7e3d89aa66c2831c0
-
- 22 Mar, 2022 1 commit
-
-
moto authored
Summary: In recent updates, torchaudio added features that download assets/models from download.pytorch.org/torchaudio. To reduce code duplication, the implementation uses utilities from ``torch.hub``, but there are still patterns repeated in implementing the fetch mechanism, notably cache and local file path handling. This commit introduces a utility function that handles download/cache/local path management and can be used for fetching pre-trained model data. Pull Request resolved: https://github.com/pytorch/audio/pull/2283 Reviewed By: carolineechen Differential Revision: D35050469 Pulled By: mthrok fbshipit-source-id: 219dd806f9a96c54d2d31e981c1bbe282772702b
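The download/cache/local-path pattern described above might look roughly like the sketch below. It is a hypothetical stand-in built on the standard library, not the utility added by this commit: the function name `fetch`, its parameters, and the asset key `models/example.pt` are all assumptions, and the real implementation relies on ``torch.hub`` utilities instead.

```python
import os
import tempfile
from urllib.request import urlretrieve

# Host named in the commit message; the scheme prefix is an assumption.
BASE_URL = "https://download.pytorch.org/torchaudio"


def fetch(key, cache_dir, download=urlretrieve):
    """Return a local path for `key`, downloading only when it is not cached.

    `download` is injectable so the caching logic can be exercised offline.
    """
    local_path = os.path.join(cache_dir, *key.split("/"))
    if not os.path.exists(local_path):
        os.makedirs(os.path.dirname(local_path), exist_ok=True)
        download(f"{BASE_URL}/{key}", local_path)
    return local_path


def demo():
    """Fetch the same hypothetical key twice using a fake downloader."""
    requested = []

    def fake_download(url, path):
        requested.append(url)
        with open(path, "wb") as f:
            f.write(b"weights")

    cache = tempfile.mkdtemp()
    first = fetch("models/example.pt", cache, download=fake_download)
    second = fetch("models/example.pt", cache, download=fake_download)
    return first == second, requested
```

Centralizing this logic in one function means each pipeline only supplies an asset key, while cache location and re-download behavior are decided in a single place.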
-
- 01 Feb, 2022 1 commit
-
-
hwangjeff authored
Summary: Moves ASR features out of `torchaudio.prototype`. Specifically, merges contents of `torchaudio.prototype.models` into `torchaudio.models` and contents of `torchaudio.prototype.pipelines` into `torchaudio.pipelines` and updates refs, tests, and docs accordingly. Pull Request resolved: https://github.com/pytorch/audio/pull/2187 Reviewed By: nateanl, mthrok Differential Revision: D33918092 Pulled By: hwangjeff fbshipit-source-id: f003f289a7e5d7d43f85b7c270b58bdf2ed6344c
-
- 26 Jan, 2022 1 commit
-
-
hwangjeff authored
Summary: Adds integration test for pretrained ASR pipeline `EMFORMER_RNNT_BASE_LIBRISPEECH`. Pull Request resolved: https://github.com/pytorch/audio/pull/2172 Reviewed By: carolineechen, nateanl Differential Revision: D33793324 Pulled By: hwangjeff fbshipit-source-id: d0613e2ab98fe5afa7b16ca39b67f0a0304d13fc
-
- 30 Dec, 2021 1 commit
-
-
Joao Gomes authored
Summary: cc mthrok Pull Request resolved: https://github.com/pytorch/audio/pull/2116 Reviewed By: mthrok Differential Revision: D33368453 Pulled By: jdsgomes fbshipit-source-id: 09cf3fe5ed6f771c2f16505633c0e59b0c27453c
-
- 23 Dec, 2021 1 commit
-
-
Joao Gomes authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2096 run: `arc lint --apply-patches --paths-cmd 'hg files -I "./**/*.py"'` Reviewed By: mthrok Differential Revision: D33297351 fbshipit-source-id: 7bf5956edf0717c5ca90219f72414ff4eeaf5aa8
-
- 04 Nov, 2021 1 commit
-
-
moto authored
This commit changes all the `torch.hub` network utility functions to be imported from `torchaudio._internal`, so that later we can replace the function within fbcode.
-
- 03 Nov, 2021 1 commit
-
-
moto authored
-
- 02 Nov, 2021 3 commits
- 27 Oct, 2021 1 commit
-
-
moto authored
-
- 25 Oct, 2021 1 commit
-
-
moto authored
-
- 22 Oct, 2021 1 commit
-
-
moto authored
- Make the test support other languages - Fetch test asset on-the-fly
-
- 21 Oct, 2021 1 commit
-
-
moto authored
* [BC-breaking] Remove unused dimension from pretrained Wav2Vec2 ASR. The Wav2Vec2 ASR pretrained weights originating from fairseq have extra dimensions that have nothing to do with the ASR task. https://github.com/pytorch/fairseq/blob/c5ff181125c7e6126b49a85e5ebdd5f5b6a07914/fairseq/data/dictionary.py#L18-L37 These are masked during the loss computation, as in https://github.com/pytorch/fairseq/blob/c5ff181125c7e6126b49a85e5ebdd5f5b6a07914/fairseq/criterions/ctc.py#L126-L128 This change removes them. * Use '-' for blank token representation.
-
- 15 Oct, 2021 2 commits
-
-
moto authored
Future work items: - length computation of GriffinLim - better way to make InverseMelScale work in inference_mode
-
moto authored
- Move wav2vec2 pretrained weights to `torchaudio.pipelines` namespace to align with #1872. - Split `Wav2Vec2PretrainedModelBundle` into `Wav2Vec2Bundle` (for pre-training model) and `Wav2Vec2ASRBundle` (for models fine-tuned for ASR). - Update base URL
-
- 08 Oct, 2021 1 commit
-
-
moto authored
-
- 06 Oct, 2021 2 commits
-
-
moto authored
Add pretrained weights from https://github.com/pytorch/fairseq/tree/main/examples/wav2vec#pre-trained-models - Wav2Vec 2.0 Base / Large / Large (LV-60) - XLSR-53
-
moto authored
This commit adds - HUBERT_LARGE - HUBERT_XLARGE - HUBERT_ASR_XLARGE
-
- 05 Oct, 2021 1 commit
-
-
moto authored
-