1. 21 Mar, 2023 6 commits
  2. 20 Mar, 2023 3 commits
    • Refactor StreamReader internals (#3184) · c17226a0
      Moto Hira authored
      Summary:
      Pull Request resolved: https://github.com/pytorch/audio/pull/3184
      
      Tweak the internals of StreamReader:
      1. Pass time_base to the Buffer class so that
          * frame_duration no longer needs to be passed separately
          * conversion of PTS to double can be delayed until the frame is popped
      2. Merge the `get_output_timebase` method into `get_output_stream_info`.
      3. If no filter description is provided, fill in a null filter at the top-level StreamReader.
      4. Expose the filter and filter description from the Sink class to get rid of the wrapper get methods.
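
Item 1 can be illustrated with a minimal Python sketch. The real Buffer is a C++ class; `FrameBuffer`, `push`, and `pop` are hypothetical names used only for illustration, and the 1/30000 time base is an assumed value. The buffer keeps the integer PTS together with the time base and converts to a floating-point timestamp only when a frame is popped.

```python
from collections import deque
from fractions import Fraction


class FrameBuffer:
    """Toy buffer that keeps PTS as integers (hypothetical, not the torchaudio class)."""

    def __init__(self, time_base: Fraction):
        # time_base is the duration of one PTS tick in seconds.
        self.time_base = time_base
        self._frames = deque()

    def push(self, pts: int, data):
        # Store the raw integer PTS; no floating-point conversion yet.
        self._frames.append((pts, data))

    def pop(self):
        pts, data = self._frames.popleft()
        # Convert to seconds only at the point of use.
        return float(pts * self.time_base), data


buf = FrameBuffer(Fraction(1, 30000))  # assumed time base
buf.push(3003, "frame-0")
timestamp, frame = buf.pop()
print(timestamp, frame)
```

Because the time base travels with the buffer, a separate frame duration argument is not needed in this sketch: it could be derived from the time base when required.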
      
      Reviewed By: nateanl
      
      Differential Revision: D44207976
      
      fbshipit-source-id: f25ac9be69c9897e9dcec0c6e978f29b83b166e8
    • Fix GPU memory leak on StreamReader (#3186) · 9533d300
      Moto Hira authored
      Summary:
      Pull Request resolved: https://github.com/pytorch/audio/pull/3186
      
      Fix the GPU memory leak introduced in https://github.com/pytorch/audio/pull/3183
      
      The HW frames context is owned by AVCodecContext.
      The removed `av_buffer_ref` call incremented the reference count unnecessarily
      and prevented AVCodecContext from freeing the resource.
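
The leak mechanism described above can be modeled with a toy reference counter. This is only a sketch: `RefCountedBuffer` and `run` are hypothetical names, and the class merely imitates the counting behavior of an FFmpeg buffer reference, not its actual API.

```python
class RefCountedBuffer:
    """Toy reference-counted resource (hypothetical stand-in for an AVBufferRef)."""

    def __init__(self):
        self.ref_count = 1  # the creator holds the first reference
        self.freed = False

    def ref(self):
        # Analogous in spirit to av_buffer_ref: one more holder.
        self.ref_count += 1
        return self

    def unref(self):
        # Analogous in spirit to av_buffer_unref: free when the last holder lets go.
        self.ref_count -= 1
        if self.ref_count == 0:
            self.freed = True


def run(extra_ref: bool) -> bool:
    frames_ctx = RefCountedBuffer()  # owned by the codec context
    if extra_ref:
        frames_ctx.ref()             # the unnecessary extra reference
    frames_ctx.unref()               # the owner releases its reference on teardown
    return frames_ctx.freed


print(run(extra_ref=True))   # False: the count never reaches zero (leak)
print(run(extra_ref=False))  # True: the owner's release frees the resource
```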
      
      (Note: this ignores all push blocking failures!)
      
      Reviewed By: nateanl
      
      Differential Revision: D44231876
      
      fbshipit-source-id: 9be2c33049dd02a3fa82a85271de7fb62e5b09ea
    • Support CUDA frame in FilterGraph (#3183) · c5b96558
      moto authored
      Summary:
      This commit adds CUDA frame support to FilterGraph.
      
      It initializes a CUDA frames context and attaches it to FilterGraph,
      so that CUDA frames can be processed in FilterGraph.
      
      As a result, it enables
      1. CUDA filters such as `scale_cuda`
      2. Proper retrieval of the pixel format coming out of FilterGraph when
         CUDA HW acceleration is enabled (currently it is reported as "cuda").
      
      Resolves https://github.com/pytorch/audio/issues/3159
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/3183
      
      Reviewed By: hwangjeff
      
      Differential Revision: D44183722
      
      Pulled By: mthrok
      
      fbshipit-source-id: 522d21039c361ddfaa87fa89cf49c19d210ac62f
  3. 17 Mar, 2023 4 commits
  4. 16 Mar, 2023 2 commits
    • Fix initialization of `get_trellis`. (#3172) · a6b34a5d
      jiyuntu-eero authored
      Summary:
      Fixes https://github.com/pytorch/audio/issues/3166. In the `get_trellis` method, the initialization assumes that the blank symbol has index 0. It should use `blank_id` instead.
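
A simplified pure-Python sketch of the idea follows. It is not the function from the repository (which operates on tensors); the list-based layout, the toy emission values, and the exact recurrence are assumptions made for illustration. The point is that every place that scores the blank symbol indexes with `blank_id` rather than a literal 0.

```python
import math


def get_trellis(emission, tokens, blank_id=0):
    """Simplified trellis over (frames x tokens); emission holds per-frame log-probabilities."""
    num_frames, num_tokens = len(emission), len(tokens)
    trellis = [[-math.inf] * num_tokens for _ in range(num_frames)]
    trellis[0][0] = 0.0
    for t in range(num_frames - 1):
        # First column: no token emitted yet, so only blank can be consumed.
        # Indexing with blank_id (not a literal 0) is the point of the fix.
        trellis[t + 1][0] = trellis[t][0] + emission[t][blank_id]
        for j in range(1, num_tokens):
            stay = trellis[t][j] + emission[t][blank_id]
            change = trellis[t][j - 1] + emission[t][tokens[j]]
            trellis[t + 1][j] = max(stay, change)
    return trellis


# Assumed toy data: 3 frames, 3 labels, blank at index 2.
emission = [[-1.0, -2.0, -0.5], [-1.5, -0.2, -3.0], [-0.1, -0.1, -0.1]]
tokens = [2, 1]
trellis = get_trellis(emission, tokens, blank_id=2)
print(trellis[1][0])  # scored with emission[0][2], not emission[0][0]
```

With the default `blank_id=0`, the same call would score the first column with the wrong label whenever the vocabulary places blank elsewhere.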
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/3172
      
      Reviewed By: mthrok
      
      Differential Revision: D44090889
      
      Pulled By: nateanl
      
      fbshipit-source-id: d119f4ded895d31aeefd59f8d975224870100264
    • Refactor Tensor conversion in StreamReader (#3170) · 014d7140
      moto authored
      Summary:
      Currently, when the Buffer converts AVFrame* to torch::Tensor,
      it checks the format each time a frame is passed and
      performs the conversion.
      
      This commit changes it so that the conversion operation is
      pre-instantiated at the time the output stream is configured.
      
      It introduces Converter implementations for various formats
      and uses templates to embed them in the Buffer class.
      This way, branching such as if/switch is eliminated from
      the decoding path.
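
The design can be sketched in Python as an analogy. The commit itself uses C++ templates; the `Buffer` class below, its converter table, and the scaling rules are hypothetical and only show the pattern of choosing the converter once at configuration time so that the per-frame path has no format branching.

```python
class Buffer:
    """Toy buffer that binds its converter once (hypothetical, not the torchaudio class)."""

    # Hypothetical converters keyed by sample format name.
    _CONVERTERS = {
        "u8": lambda samples: [(s - 128) / 128 for s in samples],
        "s16": lambda samples: [s / 32768 for s in samples],
        "flt": lambda samples: list(samples),
    }

    def __init__(self, fmt: str):
        # The format is inspected once, when the output stream is configured.
        self._convert = self._CONVERTERS[fmt]

    def push(self, samples):
        # Hot path: no if/switch on the format.
        return self._convert(samples)


buf = Buffer("s16")
print(buf.push([0, 16384, -32768]))
```

An unsupported format fails at construction time in this sketch (a `KeyError`), rather than in the middle of decoding.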
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/3170
      
      Reviewed By: xiaohui-zhang
      
      Differential Revision: D44048293
      
      Pulled By: mthrok
      
      fbshipit-source-id: 30d8b240a5695d7513f499ce17853f2f0ffcab9f
  5. 15 Mar, 2023 2 commits
    • Enhance UX on TorchAudio pages to improve awareness of doc versioning (#3167) · 92f2ea89
      Carl Parker authored
      Summary:
      - Boldface the version-selection UX and increase size by three percent.
      - Add text to breadcrumbs to indicate version and stability.
      - New `breadcrumbs.html` in `_templates` overrides Sphinx version.
      
      I create a new variable in `conf.py`, **version_stable**, which has the version number for the most-recent stable release. I define this variable in the **html_context** dictionary so that it is visible to the templates.
      
      I use this approach because I was not able to find any other way of discerning the current stable release during the build. Note that the `versions.html` file--which identifies the current stable release--appears to be available only in the **gh-pages** branch and so it is not available at build time.
      
      However, this means that someone will need to update `conf.py` whenever the current stable release changes.
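
The `conf.py` change described above might look roughly like the following sketch. The value "X.Y.Z" is a placeholder, not the real release number, and any other `html_context` entries in the actual file are omitted.

```python
# Sketch of the relevant part of a Sphinx conf.py.
# "X.Y.Z" is a placeholder; it must be updated by hand at each stable release.
version_stable = "X.Y.Z"

# Sphinx passes html_context entries to the HTML templates,
# so a template such as breadcrumbs.html can read version_stable.
html_context = {
    "version_stable": version_stable,
}
```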
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/3167
      
      Reviewed By: mthrok
      
      Differential Revision: D44112224
      
      Pulled By: carljparker
      
      fbshipit-source-id: e76f5cb6734a784d161342964459577aa9b64cac
    • Fix MFCC autograd test (#3169) · ee0b97f2
      Zhaoheng Ni authored
      Summary:
      The autograd test for the MFCC transform fails randomly. Fix it by increasing `nondet_tol` to `1e-10`.
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/3169
      
      Reviewed By: xiaohui-zhang, mthrok
      
      Differential Revision: D44069673
      
      Pulled By: nateanl
      
      fbshipit-source-id: addafefe381104e778b09bfbaafb322df1d9054c
  6. 14 Mar, 2023 2 commits
  7. 09 Mar, 2023 2 commits
    • Refactor StreamReader - let StreamProcessor own codec context (#3157) · a8f4e97b
      Moto Hira authored
      Summary:
      Pull Request resolved: https://github.com/pytorch/audio/pull/3157
      
      AVCodecContext plays a central role in decoding and encoding.
      Currently in StreamReader, the object is owned by the Decoder class
      and is not accessible from other objects.
      
      This commit moves the ownership of AVCodecContext out of Decoder to
      the StreamProcessor class so that other components can access its fields.
      
      Also, the Decoder class, which is a very thin wrapper around the AVCodecContext
      object, is now absorbed into the StreamProcessor class.
      
      Reviewed By: xiaohui-zhang
      
      Differential Revision: D43924664
      
      fbshipit-source-id: e53254955d9ce16871e393bcd8bb2794ce6a51ff
    • Remove private helper methods from StreamReader (#3156) · 430dd17c
      Moto Hira authored
      Summary:
      Pull Request resolved: https://github.com/pytorch/audio/pull/3156
      
      Remove helper methods that are not worthy of being private methods.
      
      Reviewed By: xiaohui-zhang
      
      Differential Revision: D43919385
      
      fbshipit-source-id: 2ce4efaf5ec9418076e78c7ce1f842e0dd7e3028
  8. 08 Mar, 2023 3 commits
    • Fix documentation of functional and transforms (#3134) · 85cb37e2
      cai525 authored
      Summary:
      Addresses #3101. The documentation for `power=1` should describe magnitude, not energy.
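
The distinction can be checked with a small worked example in plain Python. The helper `spectrogram_value` and the sample bin value are hypothetical; the sketch only illustrates the convention that the spectrogram value is the magnitude raised to `power`.

```python
# One complex STFT bin (assumed value).
stft_bin = 3 + 4j


def spectrogram_value(z: complex, power: float) -> float:
    # The spectrogram value is the magnitude raised to the given exponent.
    return abs(z) ** power


print(spectrogram_value(stft_bin, power=1))  # 5.0, the magnitude
print(spectrogram_value(stft_bin, power=2))  # 25.0, the power (energy)
```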
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/3134
      
      Reviewed By: mthrok
      
      Differential Revision: D43910652
      
      Pulled By: nateanl
      
      fbshipit-source-id: e0768438e819222a5dde6b86c5123ab0e8af59fb
    • Include format information after filter (#3155) · 146195d8
      moto authored
      Summary:
      This commit adds fields to OutputStream that show the result
      of filters, such as width and height after filtering.
      
      Before
      
      ```
      OutputStream(
          source_index=0,
          filter_description='fps=3,scale=width=320:height=320,format=pix_fmts=gray')
      ```
      
      After
      
      ```
      OutputVideoStream(
          source_index=0,
          filter_description='fps=3,scale=width=320:height=320,format=pix_fmts=gray',
          media_type='video',
          format='gray',
          width=320,
          height=320,
          frame_rate=3.0)
      ```
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/3155
      
      Reviewed By: nateanl
      
      Differential Revision: D43882399
      
      Pulled By: mthrok
      
      fbshipit-source-id: 620676b1a06f293fdd56de8203a11120f228fa2d
    • Support overwriting PTS in StreamWriter (#3135) · 8d2f6f8d
      moto authored
      Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3135
      
      Reviewed By: xiaohui-zhang
      
      Differential Revision: D43724273
      
      Pulled By: mthrok
      
      fbshipit-source-id: 9b52823618948945a26e57d5b3deccbf5f9268c1
  9. 07 Mar, 2023 5 commits
  10. 06 Mar, 2023 1 commit
    • Refactor encoding process (#3146) · 8a9ab2a4
      Moto Hira authored
      Summary:
      After the series of simplifications, the audio and video encoding processes
      can be merged, which gets rid of the boilerplate code.
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/3146
      
      (Note: this ignores all push blocking failures!)
      
      Reviewed By: xiaohui-zhang
      
      Differential Revision: D43815640
      
      fbshipit-source-id: 2a14e372b2cc75db7eeabc27d855a24c3f7d5063
  11. 04 Mar, 2023 2 commits
  12. 03 Mar, 2023 3 commits
  13. 02 Mar, 2023 5 commits