- 24 Jul, 2023 1 commit
Pingchuan Ma authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3489 Reviewed By: mthrok Differential Revision: D47726448 Pulled By: mpc001 fbshipit-source-id: 3d5aa7646c6bb816dcbbf70c61e98404bb148841
-
- 16 Jun, 2023 1 commit
Pingchuan Ma authored
Summary: This PR adds a data preparation recipe that uses the ultra face detector to extract full-face video. The resulting video output is then used as input for training and evaluating RNNT-based models for automatic speech recognition (ASR), visual speech recognition (VSR), and audio-visual ASR (AV-ASR) on the LRS3 dataset. This PR also updates the word error rate (WER) for AV-ASR LRS3 models and improves the code readability. Pull Request resolved: https://github.com/pytorch/audio/pull/3421 Reviewed By: mpc001 Differential Revision: D46799748 Pulled By: mthrok fbshipit-source-id: 97af3feac0592b240617faaffa4c0ac8cef614a9
-
- 25 May, 2023 1 commit
Pingchuan Ma authored
Summary: This PR adds an AV-ASR recipe, which contains sample implementations of training and evaluation pipelines for RNNT-based automatic, visual, and audio-visual speech recognition (ASR, VSR, AV-ASR) models on LRS3. The recipe includes both streaming and non-streaming modes. CC stavros99 xiaohui-zhang YumengTao mthrok nateanl hwangjeff Pull Request resolved: https://github.com/pytorch/audio/pull/3278 Reviewed By: nateanl Differential Revision: D46121550 Pulled By: mpc001 fbshipit-source-id: bb44b97ae25e87df2a73a707008be46af4ad0fc6
-
- 13 Apr, 2022 1 commit
hwangjeff authored
Summary: Adds Conformer RNN-T LibriSpeech training recipe to examples directory. Produces 30M-parameter model that achieves the following WER:

|                     |          WER |
|:-------------------:|-------------:|
| test-clean          |       0.0310 |
| test-other          |       0.0805 |
| dev-clean           |       0.0314 |
| dev-other           |       0.0827 |

Pull Request resolved: https://github.com/pytorch/audio/pull/2329 Reviewed By: xiaohui-zhang Differential Revision: D35578727 Pulled By: hwangjeff fbshipit-source-id: afa9146c5b647727b8605d104d928110a1d3976d
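
For readers unfamiliar with the metric, the WER figures above are fractions (0.0310 is 3.10%). The following is a minimal sketch of how word error rate is commonly computed: the word-level Levenshtein distance between reference and hypothesis, divided by the number of reference words. The function name `word_error_rate` and the whitespace tokenization are assumptions for illustration; this is not the evaluation code from the recipe, which may normalize text differently.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Hypothetical helper for illustration; tokenizes on whitespace only.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One dropped word out of six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

Corpus-level WER, as reported in tables like the one above, is usually the total edit distance over all utterances divided by the total number of reference words, rather than an average of per-utterance rates.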
-
- 16 Feb, 2022 1 commit
Zhaoheng Ni authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2237 Reviewed By: mthrok Differential Revision: D34267000 Pulled By: nateanl fbshipit-source-id: 4c264aea6cf3fba5d8728d5fe60f9f471815852d
-
- 11 Feb, 2022 1 commit
hwangjeff authored
Summary: Adds SentencePiece model training script for LibriSpeech Emformer RNN-T example recipe; updates readme with references. Pull Request resolved: https://github.com/pytorch/audio/pull/2218 Reviewed By: nateanl Differential Revision: D34177295 Pulled By: hwangjeff fbshipit-source-id: 9f32805af792fb8c6f834f2812e20104177a6c43