1. 12 Jul, 2023 3 commits
    • Use FFmpeg6 in build doc (#3475) · 989702b3
      moto authored
      Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3475
      
      Differential Revision: D47403772
      
      Pulled By: mthrok
      
      fbshipit-source-id: 5cdde521dbbbbf33856470a9dc79419b4a3a1683
    • Fix FFmpeg initialization logic (#3474) · 49e269ab
      Moto Hira authored
      Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3474
      
      Differential Revision: D47398447
      
      fbshipit-source-id: f77b685d54ddfc222b806475707d4a10239872f5
    • Support multiple FFmpeg versions (#3464) · 786066b4
      moto authored
      Summary:
      This commit introduces support for multiple FFmpeg versions in OSS binary distributions.
      
      Currently, torchaudio only works with FFmpeg 4, which is inconvenient both when installing and when linking at runtime.
      This commit allows FFmpeg 4, 5, or 6 to be picked at runtime, instead of looking only for v4.
      
      The way it works is that we compile the FFmpeg extension three times, once against each FFmpeg version, and ship all three.
      At runtime, we look for libavutil of a specific version and, when one is found, load the corresponding FFmpeg extension.
      The order of preference is 6, 5, then 4.
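      
      The selection described above can be sketched roughly as follows. This is a minimal illustration, not the code in this commit: the names `pick_ffmpeg_version` and `probe_libavutil` are hypothetical, and the probe assumes the usual shared-library naming on each platform. The libavutil major versions (56, 57, 58) are the ones shipped with FFmpeg 4, 5 and 6.

```python
import ctypes
import sys
from typing import Callable, Optional, Sequence

# libavutil major version shipped with each FFmpeg major release.
LIBAVUTIL_MAJOR = {6: 58, 5: 57, 4: 56}


def probe_libavutil(libavutil_major: int) -> bool:
    """Hypothetical probe: try to load libavutil of the given major version.

    Assumes conventional library file names; a real loader may search
    differently.
    """
    if sys.platform == "darwin":
        name = f"libavutil.{libavutil_major}.dylib"
    elif sys.platform == "win32":
        name = f"avutil-{libavutil_major}.dll"
    else:
        name = f"libavutil.so.{libavutil_major}"
    try:
        ctypes.CDLL(name)
    except OSError:
        return False
    return True


def pick_ffmpeg_version(
    has_libavutil: Callable[[int], bool] = probe_libavutil,
    preference: Sequence[int] = (6, 5, 4),
) -> Optional[int]:
    """Return the first FFmpeg major version whose libavutil is available.

    Versions are tried in order of preference; None means none was found.
    """
    for ffmpeg_major in preference:
        if has_libavutil(LIBAVUTIL_MAJOR[ffmpeg_major]):
            return ffmpeg_major
    return None
```

      The caller would then load the extension built against the returned version, or report that no supported FFmpeg was found when the result is `None`. Passing the probe as an argument keeps the preference logic testable without FFmpeg installed.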
      
      To make the build process simple and reproducible, we use pre-built FFmpeg binaries during the build.
      They are LGPL builds and are downloaded from S3 at build time, rather than being built every time.
      
      The use of pre-built binaries as scaffolding limits the systems that can build torchaudio, so this commit also introduces
      a single-FFmpeg-version support mode. Setting FFMPEG_ROOT during the build changes the way binaries are built
      so that they support only one specific version of FFmpeg.
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/3464
      
      Differential Revision: D47300223
      
      Pulled By: mthrok
      
      fbshipit-source-id: 560c7968315e4c8922afa11a4693f648c0356d04
  2. 11 Jul, 2023 4 commits
  3. 10 Jul, 2023 1 commit
  4. 07 Jul, 2023 3 commits
  5. 06 Jul, 2023 2 commits
  6. 05 Jul, 2023 4 commits
  7. 03 Jul, 2023 1 commit
  8. 28 Jun, 2023 2 commits
  9. 26 Jun, 2023 1 commit
  10. 21 Jun, 2023 2 commits
  11. 16 Jun, 2023 1 commit
    • Add LRS3 data preparation (#3421) · 77cdd160
      Pingchuan Ma authored
      Summary:
      This PR adds a data preparation recipe that uses the ultra face detector to extract full-face video. The resulting video output is then used as input for training and evaluating RNNT-based models for automatic speech recognition (ASR), visual speech recognition (VSR), and audio-visual ASR (AV-ASR) on the LRS3 dataset.
      
      This PR also updates the word error rate (WER) for AV-ASR LRS3 models and improves code readability.
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/3421
      
      Reviewed By: mpc001
      
      Differential Revision: D46799748
      
      Pulled By: mthrok
      
      fbshipit-source-id: 97af3feac0592b240617faaffa4c0ac8cef614a9
  12. 15 Jun, 2023 1 commit
    • Update forced alignment tutorial (#3440) · 18601691
      moto authored
      Summary:
      * Fix backtrack visualization (the coordinate was off by one)
      * Add note about the simplification and the new align API
      * Explicitly handle SOS and EOS
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/3440
      
      Reviewed By: xiaohui-zhang
      
      Differential Revision: D46761282
      
      Pulled By: mthrok
      
      fbshipit-source-id: b0b6c9754674e8e23543e9f002e29b55102c92f8
  13. 14 Jun, 2023 1 commit
  14. 13 Jun, 2023 2 commits
  15. 12 Jun, 2023 1 commit
  16. 09 Jun, 2023 3 commits
  17. 08 Jun, 2023 8 commits