  1. 18 Mar, 2024 1 commit
  2. 10 Nov, 2023 1 commit
  3. 03 Oct, 2023 1 commit
  4. 29 Sep, 2023 2 commits
  5. 20 Sep, 2023 1 commit
  6. 19 Sep, 2023 1 commit
  7. 13 Sep, 2023 1 commit
  8. 08 Sep, 2023 1 commit
  9. 04 Sep, 2023 2 commits
  10. 21 Aug, 2023 1 commit
  11. 20 Aug, 2023 1 commit
  12. 15 Aug, 2023 1 commit
    • [BC-breaking] Update pre-built ffmpeg4 to 4.4.4 (#3561) · bf07ea6b
      moto authored
      Summary:
      In https://github.com/pytorch/audio/pull/3460, we switched the build process for the FFmpeg extension.
      Since it is complicated to install FFmpeg in some environments, pre-built binaries and their headers
      are downloaded at build time and used as scaffolding for the torchaudio build.
      
      Even though we did not change any code or the FFmpeg version, it turned out that this causes a segmentation
      fault on Ubuntu when using system Python and FFmpeg 4.4 installed via aptitude.
      While investigating the issue, I swapped the pre-built FFmpeg scaffolding with FFmpeg 4.4 from aptitude,
      and the segmentation fault did not happen. This indicates that it is a binary compatibility issue.
      
      Before https://github.com/pytorch/audio/issues/3460, each binary build job built FFmpeg 4.1.8 using the same compiler used to build torchaudio,
      but after https://github.com/pytorch/audio/issues/3460 the environments used to build FFmpeg 4.1.8 and torchaudio are different. My hypothesis is that
      this difference causes some ABI incompatibility when linking against FFmpeg 4.4. (Also, I don't remember well,
      but I read somewhere that 4.4 has a different ABI.)
      
      Through experiments, it turned out that upgrading the pre-built FFmpeg scaffolding to 4.4 resolves this.
      So this commit upgrades the pre-built FFmpeg 4 to 4.4.
      The potential (yet unconfirmed) downside is that torchaudio will no longer work with 4.1, 4.2, and 4.3.
      Since FFmpeg 4.4 is what Ubuntu 20.04 and 22.04 support by default, and Google Colab is also on 20.04,
      I think it is more important to support 4.4.
      
      Therefore we drop support for 4.1-4.3 from the normal build (and official distributions). Those who wish to
      use 4.1-4.3 can build torchaudio from source, linking against a specific FFmpeg.
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/3561
      
      Reviewed By: hwangjeff
      
      Differential Revision: D48340201
      
      Pulled By: mthrok
      
      fbshipit-source-id: 7ece82910f290c7cf83f58311c4cf6a384e8795e
  13. 10 Aug, 2023 1 commit
  14. 08 Aug, 2023 4 commits
  15. 04 Aug, 2023 1 commit
  16. 01 Aug, 2023 1 commit
  17. 31 Jul, 2023 2 commits
  18. 29 Jul, 2023 1 commit
    • Refactor compat (#3518) · 8497ee91
      moto authored
      Summary:
      The I/O functions in the `_compat` module were introduced there so that
      everything related to FFmpeg is in torchaudio.io and FFmpeg library
      initialization can be carried out in `torchaudio.io.__init__`.
      
      Now that this constraint is removed (all the initialization happens
      at `torchaudio._extension.__init__`) and `_compat` is only used by
      the FFmpeg dispatcher backend, we move the module to `torchaudio._backend`
      for better locality.
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/3518
      
      Reviewed By: huangruizhe
      
      Differential Revision: D47877412
      
      Pulled By: mthrok
      
      fbshipit-source-id: aa18c8cb6e5d5360950df5158c33c653e37c565f
  19. 28 Jul, 2023 2 commits
  20. 26 Jul, 2023 1 commit
  21. 25 Jul, 2023 2 commits
    • Update avsr recipe (#3493) · d4644793
      Pingchuan Ma authored
      Summary:
      This PR includes a few changes to the AV-ASR recipe: better results, a faster face detector (Mediapipe), renamed variables, a streamlined dataloader, and a few illustrated examples. These changes improve the usability of the recipe.
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/3493
      
      Reviewed By: mthrok
      
      Differential Revision: D47758072
      
      Pulled By: mpc001
      
      fbshipit-source-id: 4533587776f3a7a74f3f11b0ece773a0934bacdc
    • Update nvdec/nvenc tutorials (#3483) · 56e22664
      moto authored
      Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3483
      
      Differential Revision: D47725664
      
      Pulled By: mthrok
      
      fbshipit-source-id: e4249e1488fa7af8670be4a5077957912ff3420b
  22. 24 Jul, 2023 1 commit
  23. 18 Jul, 2023 1 commit
  24. 15 Jul, 2023 1 commit
  25. 05 Jul, 2023 1 commit
  26. 28 Jun, 2023 1 commit
  27. 26 Jun, 2023 1 commit
  28. 21 Jun, 2023 1 commit
  29. 16 Jun, 2023 1 commit
    • Add LRS3 data preparation (#3421) · 77cdd160
      Pingchuan Ma authored
      Summary:
      This PR adds a data preparation recipe that uses the ultra face detector to extract full-face video. The resulting video output is then used as input for training and evaluating RNNT-based models for automatic speech recognition (ASR), visual speech recognition (VSR), and audio-visual ASR (AV-ASR) on the LRS3 dataset.
      
      This PR also updates the word error rate (WER) for AV-ASR LRS3 models and improves the code readability.
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/3421
      
      Reviewed By: mpc001
      
      Differential Revision: D46799748
      
      Pulled By: mthrok
      
      fbshipit-source-id: 97af3feac0592b240617faaffa4c0ac8cef614a9
  30. 15 Jun, 2023 1 commit
    • Update forced alignment tutorial (#3440) · 18601691
      moto authored
      Summary:
      * Fix backtrack visualization (the coordinate was off by one)
      * Add note about the simplification and the new align API
      * Explicitly handle SOS and EOS
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/3440
      
      Reviewed By: xiaohui-zhang
      
      Differential Revision: D46761282
      
      Pulled By: mthrok
      
      fbshipit-source-id: b0b6c9754674e8e23543e9f002e29b55102c92f8
  31. 07 Jun, 2023 1 commit
  32. 06 Jun, 2023 1 commit