  1. 24 Jul, 2023 1 commit
  2. 16 Jun, 2023 1 commit
    • Add LRS3 data preparation (#3421) · 77cdd160
      Pingchuan Ma authored
      Summary:
      This PR adds a data preparation recipe that uses the ultra face detector to extract full-face video. The resulting video output is then used as input for training and evaluating RNNT-based models for automatic speech recognition (ASR), visual speech recognition (VSR), and audio-visual ASR (AV-ASR) on the LRS3 dataset.
      
      This PR also updates the word error rate (WER) reported for the AV-ASR LRS3 models and improves code readability.
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/3421
      
      Reviewed By: mpc001
      
      Differential Revision: D46799748
      
      Pulled By: mthrok
      
      fbshipit-source-id: 97af3feac0592b240617faaffa4c0ac8cef614a9