1. 26 Oct, 2023 1 commit
  2. 25 Oct, 2023 1 commit
  3. 12 Oct, 2023 1 commit
  4. 09 Oct, 2023 1 commit
  5. 12 Jul, 2023 1 commit
    • Support multiple FFmpeg versions (#3464) · 786066b4
      moto authored
      Summary:
      This commit introduces support for multiple FFmpeg versions for OSS binary distributions.
      
      Currently, torchaudio only works with FFmpeg 4, which is inconvenient at every step from installation to runtime linking.
      This commit allows torchaudio to pick FFmpeg 4, 5, or 6 at runtime instead of looking only for v4.
      
      To do this, we compile the FFmpeg extension three times, once against each FFmpeg version, and ship all three builds.
      At runtime, we look for a specific version of libavutil and, when one is found, load the corresponding FFmpeg extension.
      The order of preference is 6, 5, then 4.
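
The runtime lookup described above can be sketched roughly as follows. This is a minimal illustration, not the torchaudio implementation: the function name `find_ffmpeg_version`, the injectable `try_load` parameter, and the Linux-style `libavutil.so.N` names are assumptions made for the example. The mapping from FFmpeg 4/5/6 to libavutil major versions 56/57/58 follows FFmpeg's own library versioning.

```python
import ctypes

# FFmpeg major version -> libavutil major version (the soname suffix).
LIBAVUTIL_MAJOR = {6: 58, 5: 57, 4: 56}


def find_ffmpeg_version(try_load=ctypes.CDLL, preference=(6, 5, 4)):
    """Return the first FFmpeg major version whose libavutil can be loaded.

    `try_load` is a hypothetical hook (defaults to ctypes.CDLL) that must
    raise OSError when the library is not found. Library names assume Linux.
    """
    for ffmpeg_version in preference:
        soname = f"libavutil.so.{LIBAVUTIL_MAJOR[ffmpeg_version]}"
        try:
            try_load(soname)
        except OSError:
            continue  # this version is not installed, try the next one
        return ffmpeg_version
    return None  # no supported FFmpeg found
```

A caller would then load the extension module built against the returned version, or disable FFmpeg features when the result is `None`.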
      
      To keep the build process simple and reproducible, we use pre-built FFmpeg binaries during the build.
      They are LGPL builds and are downloaded from S3 at build time instead of being compiled on every build.
      
      Using pre-built binaries as scaffolding limits the systems that can build torchaudio, so this commit also introduces
      a single-FFmpeg-version support mode. Setting FFMPEG_ROOT during the build changes the way the binaries are built
      so that they support only one specific version of FFmpeg.
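
As a rough sketch of how a build script might branch on `FFMPEG_ROOT`: the function name `select_ffmpeg_build_mode` and the returned dictionary layout are hypothetical and exist only for illustration; only the `FFMPEG_ROOT` variable and the two modes come from the description above.

```python
import os


def select_ffmpeg_build_mode(environ=None):
    """Pick the build mode from the environment (hypothetical helper).

    If FFMPEG_ROOT is set, build against that single FFmpeg installation.
    Otherwise, build once per supported version using pre-built binaries.
    """
    environ = os.environ if environ is None else environ
    ffmpeg_root = environ.get("FFMPEG_ROOT")
    if ffmpeg_root:
        return {"mode": "single", "ffmpeg_root": ffmpeg_root, "versions": None}
    return {"mode": "multi", "ffmpeg_root": None, "versions": (4, 5, 6)}
```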
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/3464
      
      Differential Revision: D47300223
      
      Pulled By: mthrok
      
      fbshipit-source-id: 560c7968315e4c8922afa11a4693f648c0356d04
  6. 28 Apr, 2023 1 commit
    • Add cuctc decoder (#3096) · 0a1801ed
      Yuekai Zhang authored
      Summary:
      This PR implements a CUDA-based CTC prefix beam search decoder.
      
      Several benchmark results measured on a V100 are attached below:
      | decoder type | model | dataset | decoding time (s) | beam size | batch size | model unit | subsampling factor | vocab size |
      |---|---|---|---|---|---|---|---|---|
      | cuctc | conformer nemo | dev clean | 7.68 | 8 | 32 | bpe | 4 | 1000 |
      | cuctc | conformer nemo | dev clean (sort by length) | 1.6 | 8 | 32 | bpe | 4 | 1000 |
      | cuctc | wav2vec2.0 torchaudio | dev clean | 22 | 10 | 1 | char | 2 | 29 |
      | cuctc | conformer espnet | aishell1 test | 5 | 10 | 24 | char | 4 | 4233 |
      
      Note:
      1. The design parallelizes computation across the batch and vocab axes and loops sequentially over the frame axis. Compared with CPU implementations, it therefore favors shorter sequences and larger vocabularies.
      2. WER is the same as with CPU implementations. However, decoding with an LM is not supported yet.
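
To illustrate the loop structure described in note 1, here is a deliberately simplified sketch. It uses CTC greedy decoding rather than the prefix beam search implemented in this PR, and it is written in plain Python, so nothing actually runs in parallel. The point is only the shape of the computation: the outer loop over frames is sequential, while the work inside each frame is independent across the batch and vocab axes, which is the part a CUDA kernel could spread over threads. The function name and argument layout are assumptions for this example.

```python
def ctc_greedy_decode(scores, lengths, blank=0):
    """Greedy CTC decoding over nested lists shaped [batch][frame][vocab].

    `lengths[b]` is the number of valid frames of utterance `b`.
    """
    batch = len(scores)
    prev = [blank] * batch            # label chosen at the previous frame
    outputs = [[] for _ in range(batch)]

    # Sequential loop over the frame axis: frame t depends on frame t - 1.
    for t in range(max(lengths)):
        # Independent across batch and vocab: argmax over the vocab axis
        # for every utterance that still has a frame at position t.
        best = [
            max(range(len(scores[b][t])), key=scores[b][t].__getitem__)
            if t < lengths[b] else blank
            for b in range(batch)
        ]
        for b in range(batch):
            if t >= lengths[b]:
                continue
            # CTC collapse rule: drop blanks and repeated labels.
            if best[b] != blank and best[b] != prev[b]:
                outputs[b].append(best[b])
            prev[b] = best[b]
    return outputs
```

Because the frame loop is the only sequential part, the cost grows with sequence length, while a larger vocabulary or batch only adds work that can be parallelized.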
      
      Resolves: https://github.com/pytorch/audio/issues/2957.
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/3096
      
      Reviewed By: nateanl
      
      Differential Revision: D44709397
      
      Pulled By: mthrok
      
      fbshipit-source-id: 3078c54a2b44dc00eb4a81b4c657487eeff8c155
  7. 03 Apr, 2023 1 commit
  8. 17 Mar, 2023 1 commit
  9. 08 Feb, 2023 1 commit
    • Update the guard mechanism for FFmpeg-related features (#3028) · 98b3ac17
      moto authored
      Summary:
      Instead of raising an error when the lazy import happens, this approach allows the features to be imported and raises an error only when a feature is actually used.
      
      This makes it easy to adopt the same error mechanism across different modules. It is how it is already done for sox-related features.
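
A minimal sketch of this "fail on use, not on import" pattern is shown below. The decorator name `requires_module`, the function `load_with_ffmpeg`, and the module name `hypothetical_ffmpeg_extension` are all made up for illustration and are not torchaudio's actual internals.

```python
import functools
import importlib.util


def requires_module(module_name):
    """Decorator that defers the missing-dependency error to call time."""
    available = importlib.util.find_spec(module_name) is not None

    def decorator(func):
        if available:
            return func  # dependency present: use the real function

        @functools.wraps(func)
        def unavailable(*args, **kwargs):
            raise RuntimeError(
                f"{func.__name__} requires '{module_name}', which is not available."
            )

        return unavailable

    return decorator


# Defining (and importing) this function always succeeds, even when the
# hypothetical extension module is missing.
@requires_module("hypothetical_ffmpeg_extension")
def load_with_ffmpeg(path):
    return f"decoded {path}"
```

With this pattern, importing the package never fails because of the optional dependency. The error surfaces only when `load_with_ffmpeg` is called, and every guarded feature reports it in the same way.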
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/3028
      
      Reviewed By: xiaohui-zhang
      
      Differential Revision: D42966976
      
      Pulled By: mthrok
      
      fbshipit-source-id: 423dfe0b8a3970cd07f20e841c794c7f2809f993
  10. 30 Jan, 2023 1 commit
    • Add get_build_config ffmpeg utility function (#3014) · 635d8cff
      moto authored
      Summary:
      We often need to look at which FFmpeg was found and linked when debugging an issue.
      
      The version number alone is often not enough, and there is no easy way to find out where the library was found either.
      
      This commit adds a utility function that prints the build-time configuration.
      
      It helps to distinguish whether the linked FFmpeg is the one from the binary distribution built in CI or a locally built one.
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/3014
      
      Reviewed By: hwangjeff
      
      Differential Revision: D42794952
      
      Pulled By: mthrok
      
      fbshipit-source-id: 91ed358fde8cfe9d6d950f34742b1722e729cf4e
  11. 06 Jan, 2023 1 commit
    • Add utility functions to fetch available formats/devices/codecs/protocols. (#2958) · b6d147ad
      moto authored
      Summary:
      This commit adds utility functions that fetch the available/supported formats, devices, codecs, and protocols.
      
      These functions are mostly equivalent to commands like `ffmpeg -decoders`. However, the `ffmpeg` CLI can report different results if there are multiple installations of FFmpeg, and the CLI might not be available at all.
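
For comparison, the CLI-based approach that these functions replace might look like the sketch below. The helper name `list_decoders_via_cli` is an assumption for this example. It shows both weaknesses mentioned above: the result depends on whichever `ffmpeg` executable happens to be first on `PATH`, which is not necessarily the FFmpeg library linked at runtime, and the executable may be missing entirely.

```python
import shutil
import subprocess


def list_decoders_via_cli(executable="ffmpeg"):
    """Return the raw output of `ffmpeg -decoders`, or None without a CLI.

    The answer describes the executable found on PATH, not the FFmpeg
    libraries that a Python extension is linked against.
    """
    path = shutil.which(executable)
    if path is None:
        return None  # the CLI is not installed or not on PATH
    result = subprocess.run(
        [path, "-hide_banner", "-decoders"],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout
```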
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2958
      
      Reviewed By: hwangjeff
      
      Differential Revision: D42371640
      
      Pulled By: mthrok
      
      fbshipit-source-id: 96a96183815a126cb1adc97ab7754aef216fff6f
  12. 03 Oct, 2022 1 commit
  13. 27 Jun, 2022 1 commit
  14. 04 Jun, 2022 1 commit
    • Make FFmpeg log level configurable (#2439) · 877a88c5
      moto authored
      Summary:
      Undesired logs are one of the loudest UX complaints we get.
      Yet, loading media files involves uncertainty, which is
      difficult to debug without debug logs.
      
      This commit introduces utility functions to configure the logging level
      so that we can ask users to enable it when they encounter an issue,
      while defaulting to the non-verbose option.
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2439
      
      Reviewed By: hwangjeff, xiaohui-zhang
      
      Differential Revision: D36903763
      
      Pulled By: mthrok
      
      fbshipit-source-id: f4ddd9915b13197c2a2eb97e965005b8b5b8d987