Commits · 66f661df0a102aeebbeaa5336599fbfa0467e5b0 · OpenDAS / Torchaudio

24 Jul, 2023 1 commit

Move examples/asr/avsr_rnnt to examples/avsr folder (#3489) · 66f661df

Pingchuan Ma authored Jul 24, 2023

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3489

Reviewed By: mthrok

Differential Revision: D47726448

Pulled By: mpc001

fbshipit-source-id: 3d5aa7646c6bb816dcbbf70c61e98404bb148841

16 Jun, 2023 1 commit

Add LRS3 data preparation (#3421) · 77cdd160

Pingchuan Ma authored Jun 16, 2023

Summary:
This PR adds a data preparation recipe that uses the ultra face detector to extract full-face video. The resulting video output is then used as input for training and evaluating RNNT-based models for automatic speech recognition (ASR), visual speech recognition (VSR), and audio-visual ASR (AV-ASR) on the LRS3 dataset.

This PR also updates the word error rate (WER) for the AV-ASR LRS3 models and improves code readability.

Pull Request resolved: https://github.com/pytorch/audio/pull/3421

Reviewed By: mpc001

Differential Revision: D46799748

Pulled By: mthrok

fbshipit-source-id: 97af3feac0592b240617faaffa4c0ac8cef614a9

06 Jun, 2023 1 commit

Fix style issue (#3410) · 27aa52fb

moto authored Jun 06, 2023

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3410

Differential Revision: D46496786

Pulled By: mthrok

fbshipit-source-id: e517b273c40b340f39ce7db7ab1be1c3eb5f2059

25 May, 2023 1 commit

Add LRS3 AV-ASR recipe (#3278) · c6624fa6

Pingchuan Ma authored May 25, 2023

Summary:
This PR adds an AV-ASR recipe, which contains sample implementations of training and evaluation pipelines for RNNT-based automatic, visual, and audio-visual speech recognition (ASR, VSR, AV-ASR) models on LRS3. This repository includes both streaming and non-streaming modes.

CC stavros99 xiaohui-zhang YumengTao mthrok nateanl hwangjeff

Pull Request resolved: https://github.com/pytorch/audio/pull/3278

Reviewed By: nateanl

Differential Revision: D46121550

Pulled By: mpc001

fbshipit-source-id: bb44b97ae25e87df2a73a707008be46af4ad0fc6
