- 02 Feb, 2022 1 commit
-
-
hwangjeff authored
Summary: Rather than applying SentencePiece's `decode` to convert each hypothesis's token id sequence directly to an output string, we convert each token id sequence to word pieces and then join the word pieces ourselves. This preserves leading whitespace on output strings and therefore accounts for word breaks and continuations across token processor invocations, which is particularly useful when performing streaming ASR. https://user-images.githubusercontent.com/8345689/152093668-11fb775a-bf7b-4b1d-9516-9f8d5a9b6683.mov Compared with the previous behavior shown in https://github.com/pytorch/audio/issues/2093, the scheme here correctly constructs words comprising multiple pieces. Pull Request resolved: https://github.com/pytorch/audio/pull/2192 Reviewed By: mthrok Differential Revision: D33936622 Pulled By: hwangjeff fbshipit-source-id: e550980c7d4cac9e982315508f793a6b816752e9
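The joining scheme described in this summary can be illustrated with a minimal sketch. It is not the code from the pull request: the toy vocabulary `ID_TO_PIECE` and the helper `ids_to_text` are hypothetical stand-ins for a real SentencePiece model, and the only fact assumed about SentencePiece is that it marks word boundaries with the character U+2581.

```python
from typing import Dict, List

# SentencePiece marks the start of a word with U+2581 ("lower one eighth block").
WORD_BOUNDARY = "\u2581"

# Hypothetical toy vocabulary standing in for a trained SentencePiece model.
ID_TO_PIECE: Dict[int, str] = {0: "\u2581stream", 1: "ing", 2: "\u2581asr"}


def ids_to_text(token_ids: List[int], lstrip: bool = False) -> str:
    """Map token ids to word pieces, join them, and turn boundary markers into spaces.

    With lstrip=False the leading space is kept, so the caller can tell whether
    a chunk starts a new word or continues the previous one.
    """
    pieces = [ID_TO_PIECE[i] for i in token_ids]
    text = "".join(pieces).replace(WORD_BOUNDARY, " ")
    return text.lstrip() if lstrip else text


# Two streaming invocations: the second chunk begins mid-word ("ing").
chunk_outputs = [ids_to_text([0]), ids_to_text([1, 2])]
transcript = "".join(chunk_outputs)
```

Because the second chunk's output has no leading space, concatenating the per-chunk strings keeps "stream" and "ing" together as one word, while the leading space on the first chunk signals a word break.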
-
- 01 Feb, 2022 3 commits
-
-
hwangjeff authored
Summary: Missed a couple of spots in https://github.com/pytorch/audio/issues/2187. Pull Request resolved: https://github.com/pytorch/audio/pull/2189 Reviewed By: carolineechen, nateanl, mthrok Differential Revision: D33926342 Pulled By: hwangjeff fbshipit-source-id: e1324c0fe8f9be90ad3143d19cd61c3d53f02b06
-
hwangjeff authored
Summary: Moves ASR features out of `torchaudio.prototype`. Specifically, merges contents of `torchaudio.prototype.models` into `torchaudio.models` and contents of `torchaudio.prototype.pipelines` into `torchaudio.pipelines` and updates refs, tests, and docs accordingly. Pull Request resolved: https://github.com/pytorch/audio/pull/2187 Reviewed By: nateanl, mthrok Differential Revision: D33918092 Pulled By: hwangjeff fbshipit-source-id: f003f289a7e5d7d43f85b7c270b58bdf2ed6344c
-
hwangjeff authored
Summary: Adds a script for generating global feature statistics, along with a new feature statistics JSON file for the LibriSpeech RNN-T training recipe. Pull Request resolved: https://github.com/pytorch/audio/pull/2183 Reviewed By: mthrok Differential Revision: D33902377 Pulled By: hwangjeff fbshipit-source-id: ec347a685ae67aefc485084aac6ed2efd653250f
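For readers unfamiliar with global feature statistics, the following is a rough sketch of one way they could be computed. It is not the script added by this commit: the function name `global_stats`, the input layout (a list of frames per utterance), and the output keys `mean` and `stddev` are all assumptions made for illustration.

```python
import json
import math
from typing import Dict, Iterable, List


def global_stats(utterances: Iterable[List[List[float]]]) -> Dict[str, List[float]]:
    """Accumulate per-dimension mean and standard deviation over all frames.

    Each utterance is a list of frames; each frame is a list of feature values.
    Running sums are used so the whole dataset never has to sit in memory.
    """
    count = 0
    total: List[float] = []
    total_sq: List[float] = []
    for frames in utterances:
        for frame in frames:
            if not total:
                total = [0.0] * len(frame)
                total_sq = [0.0] * len(frame)
            for d, value in enumerate(frame):
                total[d] += value
                total_sq[d] += value * value
            count += 1
    if count == 0:
        raise ValueError("no frames provided")
    mean = [s / count for s in total]
    # Population variance: E[x^2] - E[x]^2, clamped at zero against rounding error.
    var = [max(sq / count - m * m, 0.0) for sq, m in zip(total_sq, mean)]
    return {"mean": mean, "stddev": [math.sqrt(v) for v in var]}


# Hypothetical toy data: two utterances with 2-dimensional features.
stats = global_stats([[[1.0, 2.0], [3.0, 2.0]], [[5.0, 2.0]]])
stats_json = json.dumps(stats)
```

The resulting JSON string can be written to a file and loaded at training time to normalize input features.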
-
- 27 Jan, 2022 1 commit
-
-
Zhaoheng Ni authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2178 Reviewed By: mthrok Differential Revision: D33797649 Pulled By: nateanl fbshipit-source-id: 7a8f54294e7b5bd4d343c8e361e747bfd8b5b603
-
- 30 Dec, 2021 1 commit
-
-
Joao Gomes authored
Summary: cc mthrok Pull Request resolved: https://github.com/pytorch/audio/pull/2116 Reviewed By: mthrok Differential Revision: D33368453 Pulled By: jdsgomes fbshipit-source-id: 09cf3fe5ed6f771c2f16505633c0e59b0c27453c
-
- 29 Dec, 2021 3 commits
-
-
hwangjeff authored
Summary: Regroup RNN-T components under `torchaudio.prototype.models` and `torchaudio.prototype.pipelines`. Updated docs: https://492321-90321822-gh.circle-artifacts.com/0/docs/prototype.html Pull Request resolved: https://github.com/pytorch/audio/pull/2110 Reviewed By: carolineechen, mthrok Differential Revision: D33354116 Pulled By: hwangjeff fbshipit-source-id: 9cf4afed548cb173d56211c16d31bcfa25a8e4cb
-
CodemodService Bot authored
Reviewed By: zertosh Differential Revision: D33347867 fbshipit-source-id: 7672f65392e363c0359de2d86e745782a09cf9dc
-
hwangjeff authored
Summary: Adds pretrained Emformer RNN-T inference pipeline that's capable of performing streaming and non-streaming ASR. Includes demo script that uses pipeline to alternately perform streaming and non-streaming ASR on LibriSpeech test samples (video below). https://user-images.githubusercontent.com/8345689/147590753-d5126557-d575-4551-8dfe-5977276cb4ad.mov Pull Request resolved: https://github.com/pytorch/audio/pull/2093 Reviewed By: mthrok Differential Revision: D33340776 Pulled By: hwangjeff fbshipit-source-id: fbb3b1d471b4e9f1b93fa9dea9c464154537a8ac
-
- 23 Dec, 2021 1 commit
-
-
Joao Gomes authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2096 run: `arc lint --apply-patches --paths-cmd 'hg files -I "./**/*.py"'` Reviewed By: mthrok Differential Revision: D33297351 fbshipit-source-id: 7bf5956edf0717c5ca90219f72414ff4eeaf5aa8
-
- 03 Dec, 2021 1 commit
-
-
hwangjeff authored
Summary: Add training recipe for RNN-T Emformer ASR model to examples directory. Pull Request resolved: https://github.com/pytorch/audio/pull/2052 Reviewed By: nateanl Differential Revision: D32814096 Pulled By: hwangjeff fbshipit-source-id: a5153044efc16cb39f0e6413369a6791637af76a
-