1. 22 Dec, 2022 1 commit
  2. 21 Dec, 2022 1 commit
    • Extract libsox integration from libtorchaudio (#2929) · 1706a72f
      moto authored
      Summary:
      This commit makes the following changes to the C++ library organization:
      - Move sox-related feature implementations from `libtorchaudio` to `libtorchaudio_sox`.
      - Remove the C++ implementations of `is_sox_available` and `is_ffmpeg_available`, since checking for the existence of `libtorchaudio_sox` and `libtorchaudio_ffmpeg` is now sufficient to determine availability. This makes `libtorchaudio_sox` and `libtorchaudio_ffmpeg` independent of `libtorchaudio`.
      - Move PyBind11-based bindings (`_torchaudio_sox`, `_torchaudio_ffmpeg`) into `torchaudio.lib` so that the built library structure is less cluttered.
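
      A minimal sketch of such an existence-based availability check, assuming the extension can be probed with the standard `importlib` machinery. The helper name `is_extension_available` is hypothetical and is not torchaudio's API.

      ```python
      import importlib.util


      def is_extension_available(module_name: str) -> bool:
          """Return True if `module_name` can be located.

          `find_spec` imports parent packages of a dotted name,
          but not the target module itself.
          """
          try:
              return importlib.util.find_spec(module_name) is not None
          except ImportError:
              # Raised when a parent package of a dotted name cannot be imported.
              return False
      ```

      Under this assumption, a call such as `is_extension_available("torchaudio.lib._torchaudio_sox")` would stand in for the removed `is_sox_available`; the module path is inferred from this commit message and may differ from the actual layout.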
      
      Background:
      Originally, `libsox` was the only C++ extension, and `libtorchaudio` was supposed to contain all the C++ code.
      Things are different now. We have a number of C++ extensions, and we need to make the code/build structure more modular.
      
      The new `libtorchaudio_sox` contains the implementations and `_torchaudio_sox` contains the PyBind11-based bindings.
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2929
      
      Reviewed By: hwangjeff
      
      Differential Revision: D42159594
      
      Pulled By: mthrok
      
      fbshipit-source-id: 1a0fbca9e4143137f6363fc001b2378ce6029aa7
  3. 20 Dec, 2022 1 commit
  4. 19 Dec, 2022 2 commits
    • Split extract_archive into dedicated functions. (#2927) · 5807078c
      moto authored
      Summary:
      `extract_archive` in `datasets.utils` does not distinguish the input type. It blindly treats the input as tar, then falls back to zip if that fails.
      
      This is an anti-pattern. All the dataset implementations know which archive type the downloaded files are.
      
      This commit splits the `extract_archive` function into dedicated functions, and makes each dataset use the correct one.
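
      A minimal sketch of what format-specific helpers could look like, using only the standard library. The names `extract_tar` and `extract_zip` and their signatures are assumptions made for illustration, not necessarily what the commit introduces.

      ```python
      import tarfile
      import zipfile
      from pathlib import Path
      from typing import List, Optional, Union

      PathLike = Union[str, Path]


      def _resolve_target(from_path: PathLike, to_path: Optional[PathLike]) -> Path:
          # Default to the directory that contains the archive.
          return Path(to_path) if to_path is not None else Path(from_path).parent


      def extract_tar(from_path: PathLike, to_path: Optional[PathLike] = None) -> List[str]:
          """Extract a tar archive and return its member names."""
          target = _resolve_target(from_path, to_path)
          with tarfile.open(from_path, "r") as tar:
              names = tar.getnames()
              # Archives from untrusted sources should be validated before extraction.
              tar.extractall(target)
          return names


      def extract_zip(from_path: PathLike, to_path: Optional[PathLike] = None) -> List[str]:
          """Extract a zip archive and return its member names."""
          target = _resolve_target(from_path, to_path)
          with zipfile.ZipFile(from_path, "r") as zf:
              names = zf.namelist()
              zf.extractall(target)
          return names
      ```

      With this split, a dataset that downloads a `.tar.gz` file calls the tar helper directly, so a corrupt archive surfaces as a tar error instead of triggering a misleading zip fallback.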
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2927
      
      Reviewed By: carolineechen
      
      Differential Revision: D42154069
      
      Pulled By: mthrok
      
      fbshipit-source-id: bc46cc2af26aa086ef49aa1f9a94b6dedb55f85e
    • Remove deprecated/unused functions from datasets.utils (#2926) · d744f33f
      moto authored
      Summary:
      `stream_url`, `download_url` and `validate_file` are not used and not listed in documentation (`download_url` is marked as deprecated) so remove them.
      
      This will also fix the failing bandit workflow.
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2926
      
      Reviewed By: carolineechen
      
      Differential Revision: D42153484
      
      Pulled By: mthrok
      
      fbshipit-source-id: 0fccdc7b7e0e40db8046e12f46eb68de57d838ca
  5. 17 Dec, 2022 1 commit
  6. 16 Dec, 2022 3 commits
  7. 15 Dec, 2022 1 commit
    • Switch to Nova MacOS Wheel (#2907) · 0be8423d
      DanilBaibak authored
      Summary:
      Switch to Nova MacOS and M1 Wheels. This PR is a step in migrating from CircleCI to the Nova workflow.
      
      - [x] Disable the CircleCI builds for MacOS Wheel.
      - [x] Disable the CircleCI builds for M1 Wheel.
      - [x] Enable the Nova workflow for MacOS Wheel.
      - [x] Enable the Nova workflow for M1 Wheel.
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2907
      
      Reviewed By: osalpekar, mthrok
      
      Differential Revision: D42040965
      
      Pulled By: DanilBaibak
      
      fbshipit-source-id: b87f028cf5686bf97265109591fb0a8c1190324c
  8. 13 Dec, 2022 2 commits
  9. 12 Dec, 2022 1 commit
    • Update precise seek behavior for t=0 (#2915) · cbd35438
      moto authored
      Summary:
      It was reported that when videos with invalid PTS values are fed to StreamReader, StreamReader returns only the last frame.
      
      https://github.com/pytorch/vision/blob/677fc939b21a8893f07db4c1f90482b648b6573f/test/assets/videos/RATRACE_wave_f_nm_np1_fr_goo_37.avi
      
      ```
      import torchaudio
      
      src = "RATRACE_wave_f_nm_np1_fr_goo_37.avi"
      
      streamer = torchaudio.io.StreamReader(src=src)
      streamer.add_basic_video_stream(frames_per_chunk=-1)
      streamer.process_all_packets()
      video, = streamer.pop_chunks()
      
      print(video.size(0))  # prints 1, but there are more than 70 frames
      ```
      
      Most frames are not returned because their PTS values are invalid. Every frame's PTS value is `-9223372036854775808`, so the internal mechanism discards them.
      
      The last frame is still output because drain mode uses a discard value of -1, which is interpreted as no discard.
      
      For the second issue, the discard behavior should be consistent across regular decoding and drain mode.
      
      For the first issue, although normal behavior is not guaranteed for such invalid input, we can support the case where one reads the video from the start (or seeks to t=0).
      
       ---
      
      This commit makes the following changes to address the two issues above.
      
      1. Define the discard_before_pts attribute on StreamProcessor, so that StreamProcessor is aware of the discard behavior without being told by StreamReader, and its behavior is consistent between regular decoding and drain mode.
      
         This removes the discard_before_pts computation that currently happens every time a frame is processed, so it should improve performance slightly.
      
      2. Change the meaning of discard_before_pts so that no discard happens when it is 0. With this change, negative values are no longer necessary, so they are now treated as undefined behavior (UB).
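
      The updated rule can be modeled in a few lines of Python. This is an illustrative sketch, not the C++ implementation: the function name `keep_frame` is hypothetical, and raising on negative values is one way to handle the case that the commit leaves undefined.

      ```python
      # FFmpeg's "no timestamp" marker (INT64_MIN), the value quoted in the report.
      AV_NOPTS_VALUE = -(2**63)


      def keep_frame(pts: int, discard_before_pts: int) -> bool:
          """Hypothetical model of the discard rule after this change."""
          if discard_before_pts < 0:
              # Negative values are undefined per the commit; this sketch rejects them.
              raise ValueError("discard_before_pts must be non-negative")
          if discard_before_pts == 0:
              # 0 means "no discard", so frames with invalid PTS survive.
              return True
          return pts >= discard_before_pts


      # Reading from the start keeps every frame, even with invalid PTS values.
      frames_pts = [AV_NOPTS_VALUE] * 3
      kept = [pts for pts in frames_pts if keep_frame(pts, 0)]
      ```

      In this model, any positive `discard_before_pts` still drops frames whose PTS is `AV_NOPTS_VALUE`, which is why seeking remains unsupported for such files.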
      
      Note:
         Even with these changes, seeking in videos with invalid PTS is not feasible. Client code can implement a fallback that decodes frames first and then discards the undesired ones.
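
      One possible shape for such a client-side fallback, sketched under two loudly stated assumptions: the stream has a constant frame rate, and the first decoded frame corresponds to t=0. The helper `drop_leading_frames` is hypothetical and works on any sliceable sequence.

      ```python
      import math
      from typing import Sequence, TypeVar

      T = TypeVar("T")


      def drop_leading_frames(frames: Sequence[T], start_sec: float, frame_rate: float) -> Sequence[T]:
          """Discard frames that fall before `start_sec`.

          Assumes a constant frame rate and that frames[0] is at t=0.
          """
          if start_sec < 0 or frame_rate <= 0:
              raise ValueError("start_sec must be >= 0 and frame_rate must be > 0")
          # Frame i sits at i / frame_rate; the small epsilon guards against
          # floating-point error such as 0.1 * 30 evaluating slightly above 3.
          first = math.ceil(start_sec * frame_rate - 1e-9)
          return frames[first:]
      ```

      Because the helper only slices along the first dimension, the same call should also work on a decoded video tensor whose first dimension indexes frames, as in the snippet above.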
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2915
      
      Reviewed By: nateanl
      
      Differential Revision: D41957784
      
      Pulled By: mthrok
      
      fbshipit-source-id: 2dafdbada5aa33bfc81c986306f80642ba6277df
  10. 11 Dec, 2022 1 commit
  11. 10 Dec, 2022 2 commits
  12. 09 Dec, 2022 5 commits
    • Fix integration test for WAV2VEC2_ASR_LARGE_LV60K_10M (#2910) · 90162812
      Zhaoheng Ni authored
      Summary:
      After https://github.com/pytorch/audio/issues/2873, the Wav2Vec2 models pre-trained on larger datasets perform better. This PR fixes the integration test of the bundle `WAV2VEC2_ASR_LARGE_LV60K_10M`, which previously predicted the word `CURIOUSITY` as `CURIOUSSITY` but now predicts `CURIOUSITY` correctly.
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2910
      
      Reviewed By: mthrok
      
      Differential Revision: D41881919
      
      Pulled By: nateanl
      
      fbshipit-source-id: 236fd00b983a5205c731f3efa31033a6b8257cab
    • Update author and maintainer info (#2911) · eb8b1bda
      moto authored
      Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2911
      
      Reviewed By: carolineechen
      
      Differential Revision: D41887854
      
      Pulled By: mthrok
      
      fbshipit-source-id: eb91773ec67b4cda2d70733df450956d83742509
    • Fix duplicated memory allocation in StreamWriter (#2906) · 90c456de
      Moto Hira authored
      Summary:
      Pull Request resolved: https://github.com/pytorch/audio/pull/2906
      
      The correct way to create an `AVFormatContext*` for output is to pass the address of an uninitialized `AVFormatContext*` pointer to the `avformat_alloc_output_context2` function.
      
      The current code pre-allocates the `AVFormatContext*` with `avformat_alloc_context`, and this allocated object is then lost inside `avformat_alloc_output_context2`.
      
      Reviewed By: xiaohui-zhang
      
      Differential Revision: D41865685
      
      fbshipit-source-id: 9a9dc83b5acfe9b450f191fe716c85ebb5a5d842
    • Fix wrong frame allocation in StreamWriter (#2905) · 3518df48
      Moto Hira authored
      Summary:
      Pull Request resolved: https://github.com/pytorch/audio/pull/2905
      
      In StreamWriter, if the tensor format is different from the encoding format, then a FilterGraph object is automatically inserted to convert the format.
      
      The FilterGraph object operates on AVFrames. The input AVFrame must be allocated by us, but the output AVFrame is filled by FilterGraph, so there is no need to allocate it.
      
      The output AVFrame is now used as input to the encoder regardless of whether FilterGraph was inserted. Thus the output AVFrame has to be manually allocated by us when FilterGraph is not used.
      
      The current code flips this condition: it incorrectly allocates the AVFrame when FilterGraph is present and does not allocate it otherwise.
      
      This commit fixes that.
      
      Reviewed By: xiaohui-zhang
      
      Differential Revision: D41866198
      
      fbshipit-source-id: 40799c147dc8166a979ecfb58ed8e502539a6aed
    • Toggle on/off ffmpeg test if needed (#2901) · ccda545c
      atalman authored
      Summary:
      Allow the ffmpeg test to be toggled on or off as needed.
      By default it is ON, so this should not affect any current tests.
      To keep it ON, no change is required.
      To toggle it OFF, use:
      ```
      smoke_test.py --no-ffmpeg
      ```
      
      This is intended for calls from builder, since we do not currently install ffmpeg.
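
      A minimal sketch of how such an opt-out flag can be wired with `argparse`, assuming the script stores the setting as a boolean that defaults to ON. The function name `parse_args` and the attribute name `ffmpeg` are illustrative assumptions, not necessarily what `smoke_test.py` uses.

      ```python
      import argparse


      def parse_args(argv=None):
          parser = argparse.ArgumentParser(description="Hypothetical smoke test options")
          # store_false makes the attribute default to True and flips it when the flag is given.
          parser.add_argument(
              "--no-ffmpeg",
              dest="ffmpeg",
              action="store_false",
              help="Skip the ffmpeg test",
          )
          return parser.parse_args(argv)
      ```

      With this wiring, running the script without arguments leaves the ffmpeg test enabled, and passing `--no-ffmpeg` disables it.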
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2901
      
      Reviewed By: carolineechen, mthrok
      
      Differential Revision: D41874976
      
      Pulled By: atalman
      
      fbshipit-source-id: c57b19f37c63a1f476f93a5211550e980e67d9c7
  13. 08 Dec, 2022 4 commits
  14. 07 Dec, 2022 3 commits
  15. 06 Dec, 2022 1 commit
  16. 04 Dec, 2022 1 commit
  17. 02 Dec, 2022 1 commit
  18. 30 Nov, 2022 2 commits
  19. 29 Nov, 2022 5 commits
  20. 28 Nov, 2022 2 commits