1. 07 Jul, 2023 2 commits
    • Fix StreamWriter regression around RGB0/BGR0 (#3428) · 9210cba2
      moto authored
      Summary:
      - Add RGB0/BGR0 support to CPU encoder
      - Allow passing RGB/BGR when the expected format is RGB0/BGR0
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/3428
      
      Differential Revision: D47274370
      
      Pulled By: mthrok
      
      fbshipit-source-id: d34d940e04b07673bb86f518fe895c0735912444
    • Use pre-built binaries for ffmpeg extension (#3460) · f77c3e5b
      moto authored
      Summary:
      This commit changes the way the FFmpeg extension is built.
      
      Originally, the build process expected the FFmpeg binaries to be somehow available in the build environment.
      This made the build process unpredictable and prevented enabling the FFmpeg extension by default.
      
      The proposed change uses pre-built FFmpeg binaries as a build-time-only scaffold. These binaries are built in our CI job https://github.com/pytorch/audio/actions/workflows/ffmpeg.yml.
      
      This makes the build process more predictable and removes the need to build FFmpeg in our CI.
      Currently, it supports macOS (arm64, x86_64), unix (x86_64, aarch64) and Windows (amd64).
      The downside is that it no longer works on architectures not listed above.
      We could potentially work around this by searching for FFmpeg binaries available on the system (the old way) on
      these systems, but since they are not supported by PyTorch, the priority is low.
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/3460
      
      Differential Revision: D47261885
      
      Pulled By: mthrok
      
      fbshipit-source-id: 223a15e95c9140c95688af968beb35ff40354476
  2. 06 Jul, 2023 2 commits
  3. 05 Jul, 2023 4 commits
  4. 03 Jul, 2023 1 commit
  5. 28 Jun, 2023 2 commits
  6. 26 Jun, 2023 1 commit
  7. 21 Jun, 2023 2 commits
  8. 16 Jun, 2023 1 commit
    • Add LRS3 data preparation (#3421) · 77cdd160
      Pingchuan Ma authored
      Summary:
      This PR adds a data preparation recipe that uses the ultra face detector to extract full-face video. The resulting video output is then used as input for training and evaluating RNNT-based models for automatic speech recognition (ASR), visual speech recognition (VSR), and audio-visual ASR (AV-ASR) on the LRS3 dataset.
      
      This PR also updates the word error rate (WER) for AV-ASR LRS3 models and improves code readability.
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/3421
      
      Reviewed By: mpc001
      
      Differential Revision: D46799748
      
      Pulled By: mthrok
      
      fbshipit-source-id: 97af3feac0592b240617faaffa4c0ac8cef614a9
  9. 15 Jun, 2023 1 commit
    • Update forced alignment tutorial (#3440) · 18601691
      moto authored
      Summary:
      * Fix backtrack visualization (the coordinate was off by one)
      * Add note about the simplification and the new align API
      * Explicitly handle SOS and EOS
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/3440
      
      Reviewed By: xiaohui-zhang
      
      Differential Revision: D46761282
      
      Pulled By: mthrok
      
      fbshipit-source-id: b0b6c9754674e8e23543e9f002e29b55102c92f8
  10. 14 Jun, 2023 1 commit
  11. 13 Jun, 2023 2 commits
  12. 12 Jun, 2023 1 commit
  13. 09 Jun, 2023 3 commits
  14. 08 Jun, 2023 8 commits
  15. 07 Jun, 2023 2 commits
  16. 06 Jun, 2023 4 commits
  17. 05 Jun, 2023 1 commit
  18. 04 Jun, 2023 1 commit
  19. 03 Jun, 2023 1 commit