- 14 Aug, 2023 1 commit
-
-
moto authored
Summary: Move the actual I/O implementation to `_backend` submodule so that the existing `backend` submodule contains only what's related to legacy backend utilities. Pull Request resolved: https://github.com/pytorch/audio/pull/3549 Reviewed By: huangruizhe Differential Revision: D48253550 Pulled By: mthrok fbshipit-source-id: c23f1664458c723f63e134c7974b3f7cf17a1e98
-
- 10 Aug, 2023 1 commit
-
-
moto authored
Summary: * Move Backend implementations to separate files Pull Request resolved: https://github.com/pytorch/audio/pull/3547 Reviewed By: hwangjeff Differential Revision: D48233538 Pulled By: mthrok fbshipit-source-id: bcc63fc07a5dfcd48929f0a2fb64bfcb3282eb92
-
- 29 Jul, 2023 1 commit
-
-
moto authored
Summary: The I/O functions in _compat module was introduced there so that everything related to FFmpeg is in torchaudio.io and FFmpeg library initialization can be carried out in `torchaudio.io.__init__`. Now that this constraint is removed, (all the initialization happens at `torchaudio._extension.__init__`) and `_compat` is only used by FFmpeg dispatcher backend, we move the module to `torchaudio._backend` for better locality. Pull Request resolved: https://github.com/pytorch/audio/pull/3518 Reviewed By: huangruizhe Differential Revision: D47877412 Pulled By: mthrok fbshipit-source-id: aa18c8cb6e5d5360950df5158c33c653e37c565f
-
- 12 Jul, 2023 1 commit
-
-
moto authored
Summary: This commit introduces support for multiple FFmpeg versions for OSS binary distributions. Currently torchaudio only works with FFmpeg 4. This is inconvenient from installing to runtime linking. This commit allows to pick FFmpeg 4, 5 or 6 at runtime, instead of just looking for v4. The way it works is that we compile the FFmpeg extension three times with different FFmpeg and ship them. At runtime, we look for libavutil of specific version and when one is found, load the corresponding FFmpeg extension. The order of preference is 6, 5, then 4. To make the build process simple and reproducible, we use pre-built binaries of FFmpeg during the build. They are LGPL and downloaded from S3 at build time, instead of building every time. The use of pre-built binaries as scaffolding limits the system that can build torchaudio, so it also introduces single FFmpeg version support mode. setting FFMPEG_ROOT during the build will change the way binaries are built so that it will only support one specific version of FFmpeg. Pull Request resolved: https://github.com/pytorch/audio/pull/3464 Differential Revision: D47300223 Pulled By: mthrok fbshipit-source-id: 560c7968315e4c8922afa11a4693f648c0356d04
-
- 01 Jun, 2023 1 commit
-
-
moto authored
Summary: The arguments of TorchAudio's save function ("format", "bits_per_sample" and "encoding") are not one-to-one mapping to the arguments of FFmpeg encoding. For example, to use vorbis codec, FFmpeg expects "ogg" container/extension with "vorbis" encoder. It does not recognize "vorbis" extension like TorchAudio (libsox) does. This commit refactors the logic to parse/map the arguments. As a result it now properly works with vorbis and mp3 extension. Pull Request resolved: https://github.com/pytorch/audio/pull/3387 Reviewed By: hwangjeff Differential Revision: D46328787 Pulled By: mthrok fbshipit-source-id: 36f993952a062bfec58a8b51be6aa86297571f90
-
- 07 Apr, 2023 1 commit
-
-
moto authored
Summary: Follow up of https://github.com/pytorch/audio/issues/3243. Save compat module had different semantics than info and load, which requires different way of performing path normalization. Pull Request resolved: https://github.com/pytorch/audio/pull/3248 Reviewed By: hwangjeff Differential Revision: D44774997 Pulled By: mthrok fbshipit-source-id: 4b967ae3ca6b45850d455b8e95aaa31762c5457e
-
- 04 Apr, 2023 1 commit
-
-
moto authored
Summary: Recently, we added bunch of options to make StreamReader/Writer flexible. As a result, their methods have many number of arguments, and some of them have semantic grouping. For example, the arguments of ``StreamWriter.add_video_stream`` are roughly grouped as follow; - Information about input media format `frame_rate`, `width`, `height`, `format` - Information about encoder `encoder`, `encoder_option` - Information about codec configuration `codec_config` - Information about encode media format `encoder_format`, `encoder_frame_rate`, `encoder_width`, `encoder_height` - Information about additional processing `filter_desc` - Hardware acceleration `hw_accel` We do not know what arguments will be added in the future, but when we do, we want to keep them roughly grouped, by inserting the new argument somewhere in a middle without breaking backward compatibility. This commit puts most of them in keyword-only argument, so that we can rearrange them without breaking backward compatibility. Pull Request resolved: https://github.com/pytorch/audio/pull/3227 Reviewed By: hwangjeff Differential Revision: D44681620 Pulled By: mthrok fbshipit-source-id: b55f6168f4c2f3d0f59731b9bb0db4ae54e5a90f
-
- 01 Mar, 2023 1 commit
-
-
Zhaoheng Ni authored
Summary: `Dict` is not used. Fix styecheck by removing the import of `Dict`. Pull Request resolved: https://github.com/pytorch/audio/pull/3126 Reviewed By: mthrok Differential Revision: D43699410 Pulled By: nateanl fbshipit-source-id: 8d6b5335124903453387c488f96f297d6fe3c819
-
- 24 Feb, 2023 2 commits
-
-
moto authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3095 Reviewed By: nateanl Differential Revision: D43544998 Pulled By: mthrok fbshipit-source-id: 4359cdbbdbee53084016a84129cb3d65900b0457
-
moto authored
Summary: This commit is kind of clean up and preparation for future development. We plan to pass around more complicated objects among StreamReader and StreamWriter, and TorchBind is not expressive enough for defining intermediate object, so we use PyBind11 for binding StreamWriter. Pull Request resolved: https://github.com/pytorch/audio/pull/3091 Reviewed By: xiaohui-zhang Differential Revision: D43515714 Pulled By: mthrok fbshipit-source-id: 9097bb104bbf8c1536a5fab6f87447c08b10a7f2
-
- 15 Feb, 2023 1 commit
-
-
Jeff Hwang authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3058 Adds FFmpeg-based save function. Reviewed By: mthrok Differential Revision: D43264858 fbshipit-source-id: ae3f89012bc2520f3de11af65348ba8f77f0acff
-
- 22 Jan, 2023 1 commit
-
-
moto authored
Summary: This commit makes `StreamReader` report PTS (presentation time stamp) of the returned chunk as well. Example ```python from torchaudio.io import StreamReader s = StreamReader(...) s.add_video_stream(...) for (video_chunk, ) in s.stream(): # video_chunk is Torch tensor type but has extra attribute of PTS print(video_chunk.pts) # reports the PTS of the first frame of the video chunk. ``` For the backward compatibility, we introduce a `_ChunkTensor`, that is a composition of Tensor and metadata, but works like a normal tensor in PyTorch operations. The implementation of `_ChunkTensor` is based on [TrivialTensorViaComposition](https://github.com/albanD/subclass_zoo/blob/0eeb1d68fb59879029c610bc407f2997ae43ba0a/trivial_tensors.py#L83). It was also suggested to attach metadata directly to Tensor object, but the possibility to have the collision on torchaudio's metadata and new attributes introduced in PyTorch cannot be ignored, so we use Tensor subclass implementation. If any unexpected issue arise from metadata attribute name collision, client code can fetch the bare Tensor and continue. Pull Request resolved: https://github.com/pytorch/audio/pull/2975 Reviewed By: hwangjeff Differential Revision: D42526945 Pulled By: mthrok fbshipit-source-id: b4e9422e914ff328421b975120460f3001268f35
-
- 21 Dec, 2022 1 commit
-
-
moto authored
Summary: This commit makes the following changes to the C++ library organization - Move sox-related feature implementations from `libtorchaudio` to `libtorchaudio_sox`. - Remove C++ implementation of `is_sox_available` and `is_ffmpeg_available` as it is now sufficient to check the existence of `libtorchaudio_sox` and `libtorchaudio_ffmpeg` to check the availability. This makes `libtorchaudio_sox` and `libtorchaudio_ffmpeg` independent from `libtorchaudio`. - Move PyBind11-based bindings (`_torchaudio_sox`, `_torchaudio_ffmpeg`) into `torchaudio.lib` so that the built library structure is less cluttered. Background: Originally, when the `libsox` was the only C++ extension and `libtorchaudio` was supposed to contain all the C++ code. The things are different now. We have a bunch of C++ extensions and we need to make the code/build structure more modular. The new `libtorchaudio_sox` contains the implementations and `_torchaudio_sox` contains the PyBin11-based bindings. Pull Request resolved: https://github.com/pytorch/audio/pull/2929 Reviewed By: hwangjeff Differential Revision: D42159594 Pulled By: mthrok fbshipit-source-id: 1a0fbca9e4143137f6363fc001b2378ce6029aa7
-
- 10 Dec, 2022 1 commit
-
-
Zhaoheng Ni authored
Summary: The `src` or `dst` argument can be `str` or `file-like object`. Setting it to `str` in type annotation will confuse users that it only accepts `str` type. Pull Request resolved: https://github.com/pytorch/audio/pull/2913 Reviewed By: mthrok Differential Revision: D41896668 Pulled By: nateanl fbshipit-source-id: 1446a9f84186a0376cdbe4c61817fae4d5eaaab4
-
- 02 Nov, 2022 1 commit
-
-
hwangjeff authored
Summary: Partly addresses https://github.com/pytorch/audio/issues/2686 and https://github.com/pytorch/audio/issues/2356. Currently, when the buffer used for file object decoding is insufficiently large, `torchaudio.load` returns a shorter waveform than expected. To deal with this, the user is expected to increase the buffer size via `torchaudio.utils.sox_utils.get_buffer_size`, but this does not influence the buffer used by the FFMpeg fallback. To fix this, this PR introduces changes that apply the buffer size set for the SoX backend to FFMpeg. As a follow-up, we should see whether it's possible to programmatically detect that the buffer's too small and flag it to the user. Pull Request resolved: https://github.com/pytorch/audio/pull/2810 Reviewed By: mthrok Differential Revision: D40906978 Pulled By: hwangjeff fbshipit-source-id: 256fe1da8b21610b05bea9a0e043f484f9ea2e76
-
- 07 Oct, 2022 1 commit
-
-
hwangjeff authored
Summary: Modifies `info_audio` to compute and return number of frames if not found in stream info. This resolves the `num_frames == 0` issue for mp3 that's cited in https://github.com/pytorch/audio/issues/2524. Pull Request resolved: https://github.com/pytorch/audio/pull/2740 Reviewed By: nateanl Differential Revision: D40168639 Pulled By: nateanl fbshipit-source-id: bb45baa0f9cd56844315b04e40ab9835d825fc24
-
- 08 Jun, 2022 1 commit
-
-
moto authored
Summary: Add metadata, such as ID3 (https://github.com/pytorch/audio/commit/7d98db0567cb60fabcc173949b8c08e3a3487ac2)tag to `StreamReaderSourceAudioStream`. Pull Request resolved: https://github.com/pytorch/audio/pull/2461 Reviewed By: hwangjeff Differential Revision: D36985656 Pulled By: mthrok fbshipit-source-id: e66f9e6e980eb57c378cc643a8979b6b7813dae7
-
- 02 Jun, 2022 1 commit
-
-
moto authored
Summary: This commit add fallback mechanism to `info` and `load` functions of sox_io backend. If torchaudio is compiled to use FFmpeg, and runtime dependencies are properly loaded, in case `info` and `load` fail, it fallback to FFmpeg-based implementation. BC-breaking changes: - FFmpeg does not report the number of frames for MP3, this is because MP3 does not store the information of the number of frames. It can be estimated from the audio duration and sample rate, but it might be inaccurate, so we keep it 0. Depends on - https://github.com/pytorch/audio/issues/2416 - https://github.com/pytorch/audio/issues/2417 - https://github.com/pytorch/audio/issues/2418 - https://github.com/pytorch/audio/issues/2423 - https://github.com/pytorch/audio/issues/2427 Pull Request resolved: https://github.com/pytorch/audio/pull/2419 Reviewed By: carolineechen Differential Revision: D36740306 Pulled By: mthrok fbshipit-source-id: 9e2ad095b8b39e41404970de0d8d9b5aaa856c97
-