  7. 17 May, 2023 4 commits
    • Improve the performance of YUV420P frame conversion (#3342) · 72d3fe09
      moto authored
      Summary:
      This commit improves the performance of converting YUV420P frames from AVFrame to torch Tensor.
      
      It changes two things:
      1. Change the implementation of nearest-neighbor upsampling from `torch::nn::functional::interpolate` to manual data copy.
      2. Get rid of the intermediate UV plane copy.
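As an illustration of change 1, nearest-neighbor 2x upsampling of a chroma plane amounts to writing each source sample into a 2x2 block of the output. The following is a minimal pure-Python sketch, not the C++ code from this commit; the function name `upsample_nearest_2x` and the flat row-major list layout are assumptions made for the example.

```python
def upsample_nearest_2x(plane, height, width):
    """Nearest-neighbor 2x upsampling of a flat, row-major plane by plain copying.

    Hypothetical helper for illustration only; it is not part of torchaudio.
    """
    out_width = 2 * width
    out = [0] * (4 * height * width)
    for y in range(height):
        for x in range(width):
            value = plane[y * width + x]
            top = (2 * y) * out_width + 2 * x   # top-left of the 2x2 block
            bottom = top + out_width            # same column, next output row
            out[top] = out[top + 1] = value
            out[bottom] = out[bottom + 1] = value
    return out


# A 2x2 chroma plane becomes a 4x4 plane.
u_plane = [1, 2,
           3, 4]
full = upsample_nearest_2x(u_plane, height=2, width=2)
for row in range(4):
    print(full[row * 4:(row + 1) * 4])
```

Each source sample is read once and written four times, which is the kind of fixed access pattern that a general-purpose interpolation routine cannot assume.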
      
      The following table compares the time taken to process 30 seconds of YUV420P video at 25 FPS and 320x240 resolution. The measurements in each column are sorted in ascending order.
      
      Some observations:
      * `torch::nn::functional::interpolate` with the `torch::kNearest` option is not as fast as copying data manually.
      * Switching from `interpolate` to manual data copy reduces the variance.
      
      run | main | change 1 | changes 1+2 | improvement (main to 1+2)
      -- | -- | -- | -- | --
      1 | 0.452250583 | 0.417490125 | 0.40155375 | 11.21%
      2 | 0.462039958 | 0.42006675 | 0.401764125 | 13.05%
      3 | 0.463067666 | 0.42416 | 0.402651334 | 13.05%
      4 | 0.464228166 | 0.424545458 | 0.402985667 | 13.19%
      5 | 0.465777375 | 0.425629208 | 0.405604625 | 12.92%
      6 | 0.469628666 | 0.427044333 | 0.40628525 | 13.49%
      7 | 0.475935125 | 0.42805875 | 0.406412167 | 14.61%
      8 | 0.482277667 | 0.429921209 | 0.407279 | 15.55%
      9 | 0.496695208 | 0.431182792 | 0.442013791 | 11.01%
      10 | 0.546653625 | 0.541639584 | 0.4711585 | 13.81%
      
      All times are in seconds.
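The improvement column is consistent with the relative reduction in elapsed time, `(main - new) / main`. The summary does not state this formula, so treat it as an assumption; the sketch below simply checks it against run 1 of the table above.

```python
def improvement(baseline: float, candidate: float) -> float:
    """Relative reduction in elapsed time, as a percentage (assumed formula)."""
    return (baseline - candidate) / baseline * 100


# Run 1 of the 320x240 table: main vs. changes 1+2.
print(f"{improvement(0.452250583, 0.40155375):.2f}%")  # 11.21%
```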
      
      At a higher resolution, the improvement is smaller but still consistent.
      
      run | main | 1+2 | improvement
      -- | -- | -- | --
      1 | 4.032393 | 3.991784667 | 1.01%
      2 | 4.052248084 | 3.992672208 | 1.47%
      3 | 4.07705575 | 4.000541666 | 1.88%
      4 | 4.143954792 | 4.020671584 | 2.98%
      5 | 4.170711959 | 4.025753125 | 3.48%
      6 | 4.240229292 | 4.045504875 | 4.59%
      7 | 4.267384042 | 4.045588125 | 5.20%
      8 | 4.277025958 | 4.061980083 | 5.03%
      9 | 4.312192042 | 4.163251959 | 3.45%
      10 | 4.406109875 | 4.312560334 | 2.12%
      
      <details><summary>code</summary>
      
      ```python
      import time
      
      from torchaudio.io import StreamReader
      
      def test():
          r = StreamReader(src="testsrc=duration=30", format="lavfi")
          # r = StreamReader(src="testsrc=duration=30:size=1080x720", format="lavfi")
          r.add_video_stream(-1, filter_desc="format=yuv420p")
          t0 = time.monotonic()
          r.process_all_packets()
          elapsed = time.monotonic() - t0
          print(elapsed)
      
      for _ in range(10):
          test()
      ```
      </details>
      
      <details><summary>env</summary>
      
      ```
      PyTorch version: 2.1.0.dev20230325
      Is debug build: False
      CUDA used to build PyTorch: None
      ROCM used to build PyTorch: N/A
      
      OS: macOS 13.3.1 (arm64)
      GCC version: Could not collect
      Clang version: 14.0.6
      CMake version: version 3.22.1
      Libc version: N/A
      
      Python version: 3.9.16 (main, Mar  8 2023, 04:29:24)  [Clang 14.0.6 ] (64-bit runtime)
      Python platform: macOS-13.3.1-arm64-arm-64bit
      Is CUDA available: False
      CUDA runtime version: No CUDA
      CUDA_MODULE_LOADING set to: N/A
      GPU models and configuration: No CUDA
      Nvidia driver version: No CUDA
      cuDNN version: No CUDA
      HIP runtime version: N/A
      MIOpen runtime version: N/A
      Is XNNPACK available: True
      
      CPU:
      Apple M1
      
      Versions of relevant libraries:
      [pip3] torch==2.1.0.dev20230325
      [pip3] torchaudio==2.1.0a0+541b525
      [conda] pytorch                   2.1.0.dev20230325         py3.9_0    pytorch-nightly
      [conda] torchaudio                2.1.0a0+541b525           dev_0    <develop>
      ```
      
      </details>
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/3342
      
      Reviewed By: xiaohui-zhang
      
      Differential Revision: D45947716
      
      Pulled By: mthrok
      
      fbshipit-source-id: 17e5930f57544b4f2e48a9b2185464694a88ab68
    • Improve the performance of NV12 frame conversion (#3344) · c11661e0
      moto authored
      Summary:
      Similar to https://github.com/pytorch/audio/pull/3342, this commit improves the performance of NV12 frame conversion.
      
      It changes two things:
      
      - Change the implementation of nearest-neighbor upsampling from `torch::nn::functional::interpolate` to manual data copy.
      - Get rid of intermediate UV plane copy
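For context, NV12 stores the chroma samples as a single half-resolution plane of interleaved U and V bytes. The sketch below shows, under that layout, how both changes can be combined: each U/V pair is read once and copied straight into 2x2 blocks of the full-resolution outputs, with no separate deinterleaved UV buffer in between. It is a pure-Python illustration, not the C++ code from this commit, and `nv12_chroma_to_full` is a hypothetical name.

```python
def nv12_chroma_to_full(uv, height, width):
    """Expand an interleaved NV12 chroma plane to full-resolution U and V planes.

    `uv` holds `height * width` (U, V) byte pairs in row-major order.
    Hypothetical helper for illustration only; it is not part of torchaudio.
    """
    out_width = 2 * width
    u_full = bytearray(4 * height * width)
    v_full = bytearray(4 * height * width)
    for y in range(height):
        for x in range(width):
            src = 2 * (y * width + x)           # U at src, V at src + 1
            u, v = uv[src], uv[src + 1]
            for dy in (0, 1):                   # two output rows per source row
                dst = (2 * y + dy) * out_width + 2 * x
                u_full[dst] = u_full[dst + 1] = u
                v_full[dst] = v_full[dst + 1] = v
    return u_full, v_full


# One chroma row with two U/V pairs expands to two rows of four samples.
uv = bytes([10, 200, 20, 210])
u_full, v_full = nv12_chroma_to_full(uv, height=1, width=2)
print(list(u_full))
print(list(v_full))
```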
      
      With 320x240 video:
      
      run | main | pr | improvement
      -- | -- | -- | --
      1 | 0.600671417 | 0.464993125 | 22.59%
      2 | 0.638846084 | 0.456763542 | 28.50%
      3 | 0.64158175 | 0.458295333 | 28.57%
      4 | 0.649868584 | 0.455450583 | 29.92%
      5 | 0.612171333 | 0.462435625 | 24.46%
      6 | 0.6128095 | 0.456716166 | 25.47%
      7 | 0.632084583 | 0.463357083 | 26.69%
      8 | 0.610733083 | 0.46148625 | 24.44%
      9 | 0.613825834 | 0.4559555 | 25.72%
      10 | 0.653857458 | 0.455375375 | 30.36%
      
      All times are in seconds.
      
      With 1080x720 video:
      
      run | main | pr | improvement
      -- | -- | -- | --
      1 | 4.984154333 | 4.21090375 | 15.51%
      2 | 4.988090625 | 4.239649375 | 15.00%
      3 | 4.988896375 | 4.227277458 | 15.27%
      4 | 4.998186584 | 4.161077042 | 16.75%
      5 | 5.06180425 | 4.191672584 | 17.19%
      6 | 5.108769667 | 4.198468458 | 17.82%
      7 | 5.151363625 | 4.181942167 | 18.82%
      8 | 5.199527875 | 4.239319084 | 18.47%
      9 | 5.224903708 | 4.194901959 | 19.71%
      10 | 5.333422583 | 4.320925792 | 18.98%
      
      All times are in seconds.
      
      <details><summary>code</summary>
      
      ```python
      import time
      
      from torchaudio.io import StreamReader
      
      def test():
          r = StreamReader(src="testsrc=duration=30", format="lavfi")
          # r = StreamReader(src="testsrc=duration=30:size=1080x720", format="lavfi")
          r.add_video_stream(-1, filter_desc="format=nv12")
          t0 = time.monotonic()
          r.process_all_packets()
          elapsed = time.monotonic() - t0
          print(elapsed)
      
      for _ in range(10):
          test()
      ```
      </details>
      
      <details><summary>env</summary>
      
      ```
      PyTorch version: 2.1.0.dev20230325
      Is debug build: False
      CUDA used to build PyTorch: None
      ROCM used to build PyTorch: N/A
      
      OS: macOS 13.3.1 (arm64)
      GCC version: Could not collect
      Clang version: 14.0.6
      CMake version: version 3.22.1
      Libc version: N/A
      
      Python version: 3.9.16 (main, Mar  8 2023, 04:29:24)  [Clang 14.0.6 ] (64-bit runtime)
      Python platform: macOS-13.3.1-arm64-arm-64bit
      Is CUDA available: False
      CUDA runtime version: No CUDA
      CUDA_MODULE_LOADING set to: N/A
      GPU models and configuration: No CUDA
      Nvidia driver version: No CUDA
      cuDNN version: No CUDA
      HIP runtime version: N/A
      MIOpen runtime version: N/A
      Is XNNPACK available: True
      
      CPU:
      Apple M1
      
      Versions of relevant libraries:
      [pip3] torch==2.1.0.dev20230325
      [pip3] torchaudio==2.1.0a0+541b525
      [conda] pytorch                   2.1.0.dev20230325         py3.9_0    pytorch-nightly
      [conda] torchaudio                2.1.0a0+541b525           dev_0    <develop>
      ```
      
      </details>
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/3344
      
      Reviewed By: xiaohui-zhang
      
      Differential Revision: D45948511
      
      Pulled By: mthrok
      
      fbshipit-source-id: ae9b300cbcb4295f3f7470736f258280005a21e5
    • Fix for breadcrumbs displaying "Old version (stable)" on Nightly build (#3333) · 3ffd76c8
      Carl Parker authored
      Summary:
      Previously, `breadcrumbs.html` identified a nightly build version by the prefix "Nightly" which would normally be prepended to the version in `conf.py`. However, the version string is coming through without the "Nightly" prefix, so this change causes `breadcrumbs.html` to key on the substring "dev" instead.
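The difference between the two checks can be sketched in Python. The real check lives in the `breadcrumbs.html` template, so the function names below are hypothetical and only mirror the logic described above, using a dev-style version string of the shape quoted in this summary.

```python
def is_nightly_old(version: str) -> bool:
    """Old behaviour as described: look for the "Nightly" prefix."""
    return version.startswith("Nightly")


def is_nightly_new(version: str) -> bool:
    """New behaviour as described: key on the substring "dev"."""
    return "dev" in version


version = "2.0.0.dev20230309"
print(is_nightly_old(version))  # False
print(is_nightly_new(version))  # True
```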
      
      The "Nightly" prefix is apparently missing because the environment variable `BUILD_VERSION` is set, so `conf.py` uses its value instead of the version string imported from the `torchaudio` module itself. The module's version string actually appears to be incorrect; see below.
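The precedence described here can be sketched as follows. This is not the actual `conf.py`: the function name, the exact prefix format, and passing the environment as a dictionary are all assumptions made so the example is self-contained.

```python
def resolve_docs_version(module_version: str, env: dict) -> str:
    """Pick the version string shown in the docs (hypothetical logic).

    If BUILD_VERSION is set, it wins and carries no "Nightly" prefix;
    otherwise the module version is used with the prefix prepended.
    """
    build_version = env.get("BUILD_VERSION")
    if build_version:
        return build_version
    return f"Nightly {module_version}"


# With BUILD_VERSION set, the prefix never appears.
print(resolve_docs_version("2.0.0.dev20230309", {"BUILD_VERSION": "2.1.0.dev20230517"}))
# Without it, the prefix is prepended to the module version.
print(resolve_docs_version("2.0.0.dev20230309", {}))
```

The `BUILD_VERSION` value in the first call is made up for the example.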
      
      If I install torchaudio using
      
          conda install torchaudio -c pytorch-nightly
      
      then `torchaudio.__version__` returns the incorrect version string:
      
          2.0.0.dev20230309
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/3333
      
      Reviewed By: mthrok
      
      Differential Revision: D45926466
      
      Pulled By: carljparker
      
      fbshipit-source-id: d5516f2d9f1716c2400d3e9b285bd5d32b4b3a77
    • Add 420p10le CPU support to StreamReader (#3332) · c12f4734
      moto authored
      Summary:
      This commit adds support for decoding the YUV420P10LE format.
      
      The image tensor returned for this format has:
      - NCHW layout (C == 3)
      - int16 dtype
      - value range [0, 2^10)
      
      Note that the value range is different from what the "hevc_cuvid" decoder
      returns. The "hevc_cuvid" decoder uses the full range of int16 (internally,
      it's uint16) to express the color (with some intervals), but the values
      returned by the CPU "hevc" decoder are within [0, 2^10).
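Because the CPU path returns samples in [0, 2^10), downstream code has to scale by the 10-bit maximum rather than by the int16 or uint8 maximum. The sketch below shows that arithmetic on single samples; the helper names are hypothetical and not part of this commit, and the same operations would apply elementwise to a tensor.

```python
BIT_DEPTH = 10
MAX_VALUE = (1 << BIT_DEPTH) - 1  # 1023, the largest 10-bit sample


def to_unit_float(sample: int) -> float:
    """Map a 10-bit sample to the range [0.0, 1.0]."""
    return sample / MAX_VALUE


def to_8bit(sample: int) -> int:
    """Reduce a 10-bit sample to 8 bits by dropping the two low bits."""
    return sample >> (BIT_DEPTH - 8)


print(to_unit_float(1023))  # 1.0
print(to_8bit(1023))        # 255
print(to_8bit(512))         # 128
```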
      
      Addresses https://github.com/pytorch/audio/issues/3331
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/3332
      
      Reviewed By: hwangjeff
      
      Differential Revision: D45925097
      
      Pulled By: mthrok
      
      fbshipit-source-id: 4e669b65c030f388bba2fdbb8f00faf7e2981508