1. 24 Jul, 2023 1 commit
  2. 16 Jun, 2023 1 commit
    • Pingchuan Ma's avatar
      Add LRS3 data preparation (#3421) · 77cdd160
      Pingchuan Ma authored
      Summary:
      This PR adds a data preparation recipe that uses the ultra face detector to extract full-face video. The resulting video output is then used as input for training and evaluating RNNT-based models for automatic speech recognition (ASR), visual speech recognition (VSR), and audio-visual ASR (AV-ASR) on the LRS3 dataset.
      
      This PR also updates the word error rate (WER) for AV-ASR LRS3 models and improves the code readability.
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/3421
      
      Reviewed By: mpc001
      
      Differential Revision: D46799748
      
      Pulled By: mthrok
      
      fbshipit-source-id: 97af3feac0592b240617faaffa4c0ac8cef614a9
      77cdd160
  3. 25 May, 2023 1 commit
    • Pingchuan Ma's avatar
      Add LRS3 AV-ASR recipe (#3278) · c6624fa6
      Pingchuan Ma authored
      Summary:
      This PR adds AV-ASR recipe which contains sample implementations of training and evaluation pipelines for RNNT based automatic, visual, and audio-visual (ASR, VSR, AV-ASR) models on LRS3. This repository includes both streaming/non-streaming modes.
      
      CC stavros99 xiaohui-zhang YumengTao mthrok nateanl hwangjeff
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/3278
      
      Reviewed By: nateanl
      
      Differential Revision: D46121550
      
      Pulled By: mpc001
      
      fbshipit-source-id: bb44b97ae25e87df2a73a707008be46af4ad0fc6
      c6624fa6
  4. 13 Apr, 2022 1 commit
    • hwangjeff's avatar
      Add Conformer RNN-T LibriSpeech training recipe (#2329) · c262758b
      hwangjeff authored
      Summary:
      Adds Conformer RNN-T LibriSpeech training recipe to examples directory.
      
      Produces 30M-parameter model that achieves the following WER:
      
      |                     |          WER |
      |:-------------------:|-------------:|
      | test-clean          |       0.0310 |
      | test-other          |       0.0805 |
      | dev-clean           |       0.0314 |
      | dev-other           |       0.0827 |
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2329
      
      Reviewed By: xiaohui-zhang
      
      Differential Revision: D35578727
      
      Pulled By: hwangjeff
      
      fbshipit-source-id: afa9146c5b647727b8605d104d928110a1d3976d
      c262758b
  5. 16 Feb, 2022 1 commit
  6. 11 Feb, 2022 1 commit