1. 12 Oct, 2023 1 commit
  2. 11 Oct, 2023 1 commit
  3. 09 Oct, 2023 1 commit
  4. 12 Jul, 2023 1 commit
    • Support multiple FFmpeg versions (#3464) · 786066b4
      moto authored
      Summary:
      This commit introduces support for multiple FFmpeg versions for OSS binary distributions.
      
      Currently torchaudio only works with FFmpeg 4, which is inconvenient both when installing and when linking at runtime.
      This commit allows picking FFmpeg 4, 5 or 6 at runtime, instead of only looking for v4.
      
      The way it works is that we compile the FFmpeg extension three times, once against each FFmpeg version, and ship all three.
      At runtime, we look for libavutil of a specific version and, when one is found, load the corresponding FFmpeg extension.
      The order of preference is 6, 5, then 4.
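
      A minimal sketch of the lookup order described above, not the torchaudio implementation. `Candidate`, `pick_ffmpeg_version` and the `can_load` callback are hypothetical names, and the probe is injected so the example runs without FFmpeg installed. The sonames assume Linux naming, where FFmpeg 6, 5 and 4 ship libavutil 58, 57 and 56 respectively.

      ```c++
      #include <functional>
      #include <string>
      #include <vector>

      // Hypothetical pairing of an FFmpeg release series with its libavutil soname.
      struct Candidate {
        int ffmpeg_major;
        std::string libavutil;  // Linux-style soname (assumption)
      };

      // Returns the first FFmpeg major version whose libavutil can be loaded,
      // trying 6, then 5, then 4. Returns 0 when none is found.
      int pick_ffmpeg_version(
          const std::function<bool(const std::string&)>& can_load) {
        static const std::vector<Candidate> candidates = {
            {6, "libavutil.so.58"},
            {5, "libavutil.so.57"},
            {4, "libavutil.so.56"},
        };
        for (const auto& c : candidates) {
          if (can_load(c.libavutil)) {
            return c.ffmpeg_major;
          }
        }
        return 0;
      }
      ```

      In a real loader, `can_load` would wrap an attempt to open the library, and the returned version would select which of the three compiled extensions to load.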
      
      To make the build process simple and reproducible, we use pre-built binaries of FFmpeg during the build.
      They are LGPL builds, downloaded from S3 at build time instead of being built every time.
      
      Using pre-built binaries as scaffolding limits the systems that can build torchaudio, so this commit also introduces
      a single-FFmpeg-version support mode. Setting FFMPEG_ROOT during the build changes the way binaries are built
      so that only one specific version of FFmpeg is supported.
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/3464
      
      Differential Revision: D47300223
      
      Pulled By: mthrok
      
      fbshipit-source-id: 560c7968315e4c8922afa11a4693f648c0356d04
  5. 07 Jul, 2023 1 commit
    • Use pre-built binaries for ffmpeg extension (#3460) · f77c3e5b
      moto authored
      Summary:
      This commit changes the way the FFmpeg extension is built.
      
      Originally, the build process expected the FFmpeg binaries to be somehow available in the build environment.
      This made the build process unpredictable and prevented enabling the FFmpeg extension by default.
      
      The proposed change uses pre-built FFmpeg binaries as a build-time-only scaffold. They are built in our CI job https://github.com/pytorch/audio/actions/workflows/ffmpeg.yml.
      
      This makes the build process more predictable and removes the need to build FFmpeg in our CI.
      Currently, it supports macOS (arm64, x86_64), unix (x86_64, aarch64) and Windows (amd64).
      The downside is that it no longer works on architectures not listed above.
      We could potentially work around this by searching for the FFmpeg binaries available on the system (the old way) on
      these systems, but since they are not supported by PyTorch, the priority is low.
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/3460
      
      Differential Revision: D47261885
      
      Pulled By: mthrok
      
      fbshipit-source-id: 223a15e95c9140c95688af968beb35ff40354476
  6. 05 Jul, 2023 1 commit
  7. 03 Jun, 2023 1 commit
  8. 02 Jun, 2023 1 commit
  9. 01 Jun, 2023 1 commit
    • Use dlopen for FFmpeg (#3353) · b14ced1a
      moto authored
      Summary:
      This commit changes the way the FFmpeg extension is built and used.
      Instead of linking (LGPL) FFmpeg libraries to torchaudio at build time,
      it uses dlopen to search for and link them at run time.
      
      For dlopen-ing, we use PyTorch's `at::DynamicLibrary` class, which provides
      a portable wrapper.
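
      The run-time lookup can be illustrated with plain POSIX `dlopen`/`dlsym`. This is a sketch under stated assumptions: it does not use `at::DynamicLibrary`, it is POSIX-only, and `DynLib` and `can_load` are hypothetical names introduced for illustration.

      ```c++
      #include <dlfcn.h>

      #include <stdexcept>
      #include <string>

      // Minimal RAII wrapper over POSIX dlopen/dlsym (illustrative only).
      class DynLib {
       public:
        explicit DynLib(const std::string& name)
            : handle_(dlopen(name.c_str(), RTLD_LOCAL | RTLD_NOW)) {
          if (!handle_) {
            throw std::runtime_error("failed to load " + name);
          }
        }
        ~DynLib() { dlclose(handle_); }
        DynLib(const DynLib&) = delete;
        DynLib& operator=(const DynLib&) = delete;

        // Looks up a symbol by name; returns nullptr when it is missing.
        void* sym(const char* name) const { return dlsym(handle_, name); }

       private:
        void* handle_;
      };

      // Returns true when the named library can be opened at run time.
      bool can_load(const std::string& name) {
        try {
          DynLib lib(name);
          return true;
        } catch (const std::runtime_error&) {
          return false;
        }
      }
      ```

      Because the library is resolved only when `DynLib` is constructed, the binary that contains this code carries no link-time dependency on the library being opened.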
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/3353
      
      Differential Revision: D46059199
      
      Pulled By: mthrok
      
      fbshipit-source-id: 4493a5fd8a4c802178d20276522f5334d637307d
  10. 09 May, 2023 2 commits
  11. 25 Apr, 2023 1 commit
  12. 07 Apr, 2023 1 commit
  13. 03 Apr, 2023 1 commit
  14. 21 Mar, 2023 1 commit
    • Refactor the internal of StreamReader (#3188) · f6e3d070
      Moto Hira authored
      Summary:
      Pull Request resolved: https://github.com/pytorch/audio/pull/3188
      
      Refactor the process after decoding in StreamReader.
      
      The post-decode process consists of three parts:
      1. preprocessing using FilterGraph
      2. conversion to Tensor
      3. store in Buffer
      
      The FilterGraph class is a thin wrapper around the AVFilterGraph
      structure from FFmpeg and it is agnostic to media type. However,
      Tensor conversion and buffering consist of a number of different
      pieces of logic.
      
      Currently, the conversion process is abstracted away with a
      template, i.e. `template<typename Conversion> Buffer`, and the whole
      process is implemented in the Sink class, which consists of `FilterGraph`
      and `Buffer`. `Buffer` internally contains the Conversion logic, even
      though conversion logic and buffer have nothing in common and are better
      logically separated.
      
      The new implementation replaces the `Sink` class with the `IPostDecodeProcess`
      interface, which contains the three components.
      The different post-decode processes are implemented as template arguments of the
      actual implementation, i.e.
      
      ```c++
      template<typename Converter, typename Buffer>
      class ProcessImpl : public IPostDecodeProcess { /* ... */ };
      ```
      
      and stored as `unique_ptr<IPostDecodeProcess>` on `StreamProcessor`.
      (This is the [functionoid pattern](https://isocpp.org/wiki/faq/pointers-to-members#functionoids), which eliminates all the branching based on the media format.)
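
      A self-contained sketch of this structure, under stated assumptions: `Frame`, `DoublingConverter`, `VectorBuffer`, `make_process` and the method names are hypothetical stand-ins, and the FilterGraph step is omitted for brevity. Only the `IPostDecodeProcess` / `ProcessImpl<Converter, Buffer>` shape comes from the description above.

      ```c++
      #include <cstddef>
      #include <memory>
      #include <vector>

      // Stand-in for a decoded frame; the real code works with AVFrame*.
      struct Frame {
        int value;
      };

      // Media-agnostic interface held by the stream processor.
      struct IPostDecodeProcess {
        virtual ~IPostDecodeProcess() = default;
        virtual void process(const Frame& frame) = 0;
        virtual std::size_t num_buffered() const = 0;
      };

      // Concrete process: Converter and Buffer are fixed at compile time,
      // so process() contains no branching on the media format.
      template <typename Converter, typename Buffer>
      struct ProcessImpl : public IPostDecodeProcess {
        Converter converter;
        Buffer buffer;
        void process(const Frame& frame) override {
          buffer.push(converter.convert(frame));
        }
        std::size_t num_buffered() const override { return buffer.size(); }
      };

      // Hypothetical converter/buffer pair used only for illustration.
      struct DoublingConverter {
        int convert(const Frame& f) const { return f.value * 2; }
      };
      struct VectorBuffer {
        std::vector<int> data;
        void push(int v) { data.push_back(v); }
        std::size_t size() const { return data.size(); }
      };

      // The factory is the only place that names the concrete types.
      std::unique_ptr<IPostDecodeProcess> make_process() {
        return std::unique_ptr<IPostDecodeProcess>(
            new ProcessImpl<DoublingConverter, VectorBuffer>());
      }
      ```

      The caller holds only a `unique_ptr<IPostDecodeProcess>`; the choice of converter and buffer is made once, when the object is created.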
      
      Note:
      This implementation was not possible in the initial version of
      StreamReader, as there was no way of knowing the media attributes coming out
      of `AVFilterGraph`. https://github.com/pytorch/audio/pull/3155 and https://github.com/pytorch/audio/pull/3183
      added features to parse them properly, so we can finally make the post processing strongly-typed.
      
      Reviewed By: hwangjeff
      
      Differential Revision: D44242647
      
      fbshipit-source-id: 96b8c6c72a2b8af4fa86a9b02292c65078ee265b
  15. 17 Mar, 2023 1 commit
  16. 16 Mar, 2023 1 commit
    • Refactor Tensor conversion in StreamReader (#3170) · 014d7140
      moto authored
      Summary:
      Currently, when the Buffer converts AVFrame* to torch::Tensor,
      it checks the format each time a frame is passed, and then
      performs the conversion.
      
      This commit changes it so that the conversion operation is
      pre-instantiated at the time the output stream is configured.
      
      It introduces Converter implementations for various formats,
      and uses templates to embed them in the Buffer class.
      This way, branching such as if/switch is eliminated from
      the decoding path.
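
      A minimal sketch of embedding a converter in a buffer through a template parameter. The names `RawFrame`, `U8Converter`, `S16Converter` and the sample layouts are assumptions for illustration; the actual code converts `AVFrame*` to `torch::Tensor`.

      ```c++
      #include <cstddef>
      #include <cstdint>
      #include <vector>

      // Raw bytes of one decoded frame (stand-in for AVFrame).
      using RawFrame = std::vector<std::uint8_t>;

      // Unsigned 8-bit samples, mapped to [-1, 1).
      struct U8Converter {
        static std::vector<float> convert(const RawFrame& raw) {
          std::vector<float> out;
          for (std::uint8_t b : raw) {
            out.push_back((b - 128) / 128.0f);
          }
          return out;
        }
      };

      // Signed 16-bit little-endian samples, mapped to [-1, 1).
      struct S16Converter {
        static std::vector<float> convert(const RawFrame& raw) {
          std::vector<float> out;
          for (std::size_t i = 0; i + 1 < raw.size(); i += 2) {
            const auto s = static_cast<std::int16_t>(raw[i] | (raw[i + 1] << 8));
            out.push_back(s / 32768.0f);
          }
          return out;
        }
      };

      // The format is part of the type, so push() never inspects it at run time.
      template <typename Converter>
      class Buffer {
       public:
        void push(const RawFrame& raw) {
          const std::vector<float> s = Converter::convert(raw);
          samples_.insert(samples_.end(), s.begin(), s.end());
        }
        const std::vector<float>& samples() const { return samples_; }

       private:
        std::vector<float> samples_;
      };
      ```

      Choosing between `Buffer<U8Converter>` and `Buffer<S16Converter>` happens once, when the stream is set up; the per-frame path is then free of format checks.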
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/3170
      
      Reviewed By: xiaohui-zhang
      
      Differential Revision: D44048293
      
      Pulled By: mthrok
      
      fbshipit-source-id: 30d8b240a5695d7513f499ce17853f2f0ffcab9f
  17. 09 Mar, 2023 1 commit
    • Refactor StreamReader - let StreamProcessor own codec context (#3157) · a8f4e97b
      Moto Hira authored
      Summary:
      Pull Request resolved: https://github.com/pytorch/audio/pull/3157
      
      AVCodecContext plays a central role in decoding and encoding.
      Currently in StreamReader, the object is owned inside the Decoder class
      and is not accessible from other objects.
      
      This commit moves the ownership of AVCodecContext out of Decoder to the
      StreamProcessor class so that other components can access its fields.
      
      Also, the Decoder class, which is a very thin wrapper around the AVCodecContext
      object, is now absorbed into the StreamProcessor class.
      
      Reviewed By: xiaohui-zhang
      
      Differential Revision: D43924664
      
      fbshipit-source-id: e53254955d9ce16871e393bcd8bb2794ce6a51ff
  18. 06 Mar, 2023 1 commit
    • Refactor encoding process (#3146) · 8a9ab2a4
      Moto Hira authored
      Summary:
      After the series of simplifications, the audio/video encoding processes
      can be merged, which gets rid of the boilerplate code.
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/3146
      
      (Note: this ignores all push blocking failures!)
      
      Reviewed By: xiaohui-zhang
      
      Differential Revision: D43815640
      
      fbshipit-source-id: 2a14e372b2cc75db7eeabc27d855a24c3f7d5063
  19. 02 Mar, 2023 1 commit
  20. 01 Mar, 2023 1 commit
    • Extract image conversions into separate class (#3120) · 0bf00d20
      Moto Hira authored
      Summary:
      Pull Request resolved: https://github.com/pytorch/audio/pull/3120
      
      This commit extracts image conversion ops into the ImageTensorConverter class, and makes it independent of the OutputStream class.
      
      The ImageTensorConverter class implements a range-based for-loop interface, like
      
      ```c++
      for (auto const& frame : ImageTensorConverter::convert(...)) {
          post_process_with_avframe(frame);
      }
      ```
      
      This allows decoupling the encoder from image conversion.
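
      A range-based for-loop interface only requires `begin()`/`end()` and an iterator with `operator*`, `operator++` and `operator!=`. The sketch below shows that shape; `Frame`, `FrameRange` and `sum_frames` are hypothetical names, and integers stand in for images and `AVFrame` objects.

      ```c++
      #include <cstddef>
      #include <utility>
      #include <vector>

      // Stand-in for AVFrame.
      struct Frame {
        int value;
      };

      // Hypothetical range that yields one Frame per input image, on demand.
      class FrameRange {
       public:
        class Iterator {
         public:
          Iterator(const std::vector<int>* images, std::size_t pos)
              : images_(images), pos_(pos) {}
          // Conversion happens here, one frame at a time.
          Frame operator*() const { return Frame{(*images_)[pos_]}; }
          Iterator& operator++() {
            ++pos_;
            return *this;
          }
          bool operator!=(const Iterator& other) const {
            return pos_ != other.pos_;
          }

         private:
          const std::vector<int>* images_;
          std::size_t pos_;
        };

        explicit FrameRange(std::vector<int> images) : images_(std::move(images)) {}
        Iterator begin() const { return Iterator(&images_, 0); }
        Iterator end() const { return Iterator(&images_, images_.size()); }

       private:
        std::vector<int> images_;
      };

      // Consumer side: it sees only frames, not how they were produced.
      int sum_frames(const FrameRange& range) {
        int total = 0;
        for (auto const& frame : range) {
          total += frame.value;
        }
        return total;
      }
      ```

      The consumer depends only on the frame type, which is what lets an encoder be written without knowledge of the conversion.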
      
      Reviewed By: nateanl
      
      Differential Revision: D43666296
      
      fbshipit-source-id: 754efe677bc7695b3f138a6d076be2106e186b79
  21. 27 Feb, 2023 3 commits
  22. 24 Feb, 2023 2 commits
    • Cleanup ffmpeg bindings (#3095) · b46628ba
      moto authored
      Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3095
      
      Reviewed By: nateanl
      
      Differential Revision: D43544998
      
      Pulled By: mthrok
      
      fbshipit-source-id: 4359cdbbdbee53084016a84129cb3d65900b0457
    • Bind StreamReader/Writer with PyBind11 (#3091) · b012b452
      moto authored
      Summary:
      This commit is a clean-up and preparation for future
      development.
      
      We plan to pass around more complicated objects among
      StreamReader and StreamWriter, and TorchBind is not expressive enough
      for defining intermediate objects, so we use PyBind11 for binding
      StreamWriter.
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/3091
      
      Reviewed By: xiaohui-zhang
      
      Differential Revision: D43515714
      
      Pulled By: mthrok
      
      fbshipit-source-id: 9097bb104bbf8c1536a5fab6f87447c08b10a7f2
  23. 23 Feb, 2023 2 commits
    • Replace c10::Dict with std::map in StreamReader/Writer (#3092) · c3310018
      moto authored
      Summary:
      This commit is a clean-up and preparation for future development.
      
      We plan to pass around more complicated objects among StreamReader and StreamWriter, and TorchBind is not expressive enough for defining intermediate objects, so we want to use PyBind11 for binding StreamReader/Writer.
      
      PyBind11 converts a Python dict into std::map, while TorchBind converts it into c10::Dict. Because of this discrepancy, conversion from c10::Dict to std::map has to happen in multiple places, which makes the binding code thicker because it requires wrapper methods.
      
      Using std::map reduces the number of wrapper methods / conversions, because the same method can be bound for file-like object and the others.
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/3092
      
      Reviewed By: nateanl
      
      Differential Revision: D43524808
      
      Pulled By: mthrok
      
      fbshipit-source-id: f7467c66ccd37dbf4abc337bbb18ffaac21a0058
    • Remove Tensor binding from StreamReader (#3093) · d3c9295c
      mthrok authored
      Summary:
      Remove the Tensor input support from StreamReader.
      
      Follow-up to https://github.com/pytorch/audio/pull/3086
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/3093
      
      Reviewed By: xiaohui-zhang
      
      Differential Revision: D43526066
      
      Pulled By: mthrok
      
      fbshipit-source-id: 57ba4866c413649173e1c2c3b23ba7de3231b7bc
  24. 26 Jan, 2023 1 commit
    • Abstract away AVFormatContext from StreamReader/Writer constructor (#3007) · 7ea69e61
      Moto Hira authored
      Summary:
      Pull Request resolved: https://github.com/pytorch/audio/pull/3007
      
      Simplify the construction of StreamReader/Writer in C++.
      
      Currently these classes require client code to build AVFormatContext
      manually. This is tedious and not user-friendly.
      
      Some client code actually uses the same helper function that
      TorchAudio codebase uses.
      
      This commit moves the helper logic inside the constructors of
      StreamReader/Writer, so that the signatures of these constructors
      are easy to use and similar to the Python interface.
      
      Reviewed By: xiaohui-zhang
      
      Differential Revision: D42662520
      
      fbshipit-source-id: d95e5236810c48d7d9bd2d89c05d4f60a44b3ba1
  25. 29 Dec, 2022 1 commit