- 11 Feb, 2022 2 commits
hwangjeff authored
Summary: Adds SentencePiece model training script for LibriSpeech Emformer RNN-T example recipe; updates readme with references.
Pull Request resolved: https://github.com/pytorch/audio/pull/2218
Reviewed By: nateanl
Differential Revision: D34177295
Pulled By: hwangjeff
fbshipit-source-id: 9f32805af792fb8c6f834f2812e20104177a6c43
-
nateanl authored
Summary: Refactors the demo script so that it can apply RNN-T decoding using both `torchaudio.pipelines.EMFORMER_RNNT_BASE_LIBRISPEECH` and `torchaudio.prototype.pipelines.EMFORMER_RNNT_BASE_TEDLIUM3`, in both streaming and non-streaming modes. (The first hypothesis printed is the streaming prediction; the second is the non-streaming prediction.) The script converts each token ID sequence to word pieces and then joins the word pieces manually. This preserves leading whitespace on output strings, so word breaks and continuations are handled correctly across token processor invocations, which is particularly useful for streaming ASR.
https://user-images.githubusercontent.com/8653221/153627956-f0806f18-3c1c-44df-ac07-ec2def58a0cf.mov
Pull Request resolved: https://github.com/pytorch/audio/pull/2203
Reviewed By: carolineechen
Differential Revision: D34006388
Pulled By: nateanl
fbshipit-source-id: 3d31173ee10cdab8a2f5802570e22b50fcce5632
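The word-piece joining described in this commit can be illustrated with a minimal sketch. It is not the code from the pull request: the toy vocabulary, the `ids_to_text` helper, and the chunk boundaries are hypothetical, and the only assumption carried over from SentencePiece is that word-initial pieces start with the marker character U+2581.

```python
from typing import Dict, List

# SentencePiece marks the start of a word with U+2581.
WORD_BOUNDARY = "\u2581"

# Hypothetical toy vocabulary; a real recipe would load a trained SentencePiece model.
TOY_VOCAB: Dict[int, str] = {
    0: WORD_BOUNDARY + "hel",
    1: "lo",
    2: WORD_BOUNDARY + "wor",
    3: "ld",
}


def ids_to_text(token_ids: List[int], vocab: Dict[int, str] = TOY_VOCAB) -> str:
    """Map token IDs to word pieces and join them, keeping any leading space."""
    pieces = [vocab[i] for i in token_ids]
    return "".join(pieces).replace(WORD_BOUNDARY, " ")


# Simulated streaming: each inner list is the output of one decoding step.
chunks = [[0], [1, 2], [3]]

# Because leading spaces are kept, plain concatenation restores word breaks
# even when a word is split across chunks.
transcript = "".join(ids_to_text(chunk) for chunk in chunks)
print(repr(transcript))  # ' hello world'
```

If each chunk were stripped and the results joined with spaces instead, the same input would come out as "hel lo wor ld", which is the failure mode that keeping the leading whitespace avoids.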
-
- 10 Feb, 2022 1 commit
hwangjeff authored
Summary: Consolidates LibriSpeech and TED-LIUM Release 3 Emformer RNN-T training recipes in a single directory.
Pull Request resolved: https://github.com/pytorch/audio/pull/2212
Reviewed By: mthrok
Differential Revision: D34120104
Pulled By: hwangjeff
fbshipit-source-id: 29c6e27195d5998f76d67c35b718110e73529456
-