- 26 Oct, 2023 1 commit
-
-
moto-meta authored
Differential Revision: D50696105 Pull Request resolved: https://github.com/pytorch/audio/pull/3682
-
- 25 Oct, 2023 1 commit
-
-
moto-meta authored
Differential Revision: D50633306 Pull Request resolved: https://github.com/pytorch/audio/pull/3675
-
- 24 Oct, 2023 1 commit
-
-
moto-meta authored
Differential Revision: D50506299 Pull Request resolved: https://github.com/pytorch/audio/pull/3669
-
- 11 Oct, 2023 1 commit
-
-
moto-meta authored
Differential Revision: D50082877 Pull Request resolved: https://github.com/pytorch/audio/pull/3646
-
- 09 Oct, 2023 2 commits
-
-
moto authored
Addresses https://github.com/pytorch/audio/issues/3640
-
moto-meta authored
Differential Revision: D49965263 Pull Request resolved: https://github.com/pytorch/audio/pull/3639
-
- 20 Aug, 2023 1 commit
-
-
moto authored
Summary: It turned out that FFmpeg 5 installed via conda reports a video frame rate of -1. FFmpeg 4 and 6 are fine. This is either a regression in FFmpeg or in the underlying decoding library. Make the reference value adaptive. Pull Request resolved: https://github.com/pytorch/audio/pull/3568 Reviewed By: huangruizhe Differential Revision: D48499621 Pulled By: mthrok fbshipit-source-id: fb64187bcf0dc57b753cb6c05f04d436238f5c51
-
- 12 Jul, 2023 1 commit
-
-
moto authored
Summary: This commit introduces support for multiple FFmpeg versions for OSS binary distributions. Currently torchaudio only works with FFmpeg 4. This is inconvenient at every stage, from installation to runtime linking. This commit allows picking FFmpeg 4, 5 or 6 at runtime, instead of just looking for v4. The way it works is that we compile the FFmpeg extension three times, each against a different FFmpeg version, and ship all of them. At runtime, we look for libavutil of a specific version and, when one is found, load the corresponding FFmpeg extension. The order of preference is 6, 5, then 4. To make the build process simple and reproducible, we use pre-built binaries of FFmpeg during the build. They are LGPL and downloaded from S3 at build time, instead of being built every time. The use of pre-built binaries as scaffolding limits the systems that can build torchaudio, so this commit also introduces a single-FFmpeg-version support mode. Setting FFMPEG_ROOT during the build changes the way binaries are built so that they support only one specific version of FFmpeg. Pull Request resolved: https://github.com/pytorch/audio/pull/3464 Differential Revision: D47300223 Pulled By: mthrok fbshipit-source-id: 560c7968315e4c8922afa11a4693f648c0356d04
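The runtime selection described in the commit above can be sketched roughly as follows. This is only an illustration of the stated preference order (6, 5, then 4), not torchaudio's actual loader; `pick_ffmpeg_version` and the `has_libavutil` probe are hypothetical names introduced here. The mapping relies on FFmpeg 4, 5 and 6 shipping libavutil major versions 56, 57 and 58 respectively.

```python
from typing import Callable, Optional

# FFmpeg major version -> libavutil major version.
# FFmpeg 4.x ships libavutil 56, 5.x ships 57, and 6.x ships 58.
LIBAVUTIL_MAJOR = {6: 58, 5: 57, 4: 56}


def pick_ffmpeg_version(has_libavutil: Callable[[int], bool]) -> Optional[int]:
    """Return the preferred FFmpeg major version whose libavutil is present.

    `has_libavutil` is a hypothetical probe: given a libavutil major
    version, it reports whether that library can be found on this system.
    """
    for ffmpeg_major in (6, 5, 4):  # order of preference stated in the commit
        if has_libavutil(LIBAVUTIL_MAJOR[ffmpeg_major]):
            return ffmpeg_major
    return None  # no supported FFmpeg found


# Example: a system where only libavutil 57 and 56 are installed.
installed = {57, 56}
selected = pick_ffmpeg_version(lambda major: major in installed)
print(selected)  # 5
```

Passing the probe in as a callable keeps the preference logic separate from the platform-specific library lookup, which is the part that differs between operating systems.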
-
- 05 Jul, 2023 1 commit
-
-
moto authored
Summary: This reverts commit b7d3e89a. We will use pre-built binaries instead of dlopen. Pull Request resolved: https://github.com/pytorch/audio/pull/3456 Differential Revision: D47239681 Pulled By: mthrok fbshipit-source-id: 0446a62410d914081184fc20c386afa00b1e41b6
-
- 03 Jun, 2023 1 commit
-
-
Moto Hira authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3402 This is a second attempt at https://github.com/pytorch/audio/pull/3353. The basic logic to enable dlopen for FFmpeg libraries is the same. It uses `at::DynamicLibrary`, which allows compiling torchaudio without linking FFmpeg libraries. This time, the option `DLOPEN_FFMPEG` has been added to enable this feature, so that users have a way to disable it and keep using build-time linking. Please refer to stub.h for more technical detail. Differential Revision: D46403783 fbshipit-source-id: ca3db57ff6bdc50c8c225d22f12f3e76c6dc3f16
-
- 02 Jun, 2023 1 commit
-
-
Moto Hira authored
Differential Revision: D46059199 Original commit changeset: 4493a5fd8a4c Original Phabricator Diff: D46059199 fbshipit-source-id: 71cde3f8cd870d1ad9114e3e87cdd1ba564441c0
-
- 01 Jun, 2023 1 commit
-
-
moto authored
Summary: This commit changes the way the FFmpeg extension is built and used. Instead of linking (LGPL) FFmpeg libraries to torchaudio at build time, it uses dlopen to search for and link them at run time. For dlopen-ing, we use PyTorch's `at::DynamicLibrary` class, which provides a portable wrapper. Pull Request resolved: https://github.com/pytorch/audio/pull/3353 Differential Revision: D46059199 Pulled By: mthrok fbshipit-source-id: 4493a5fd8a4c802178d20276522f5334d637307d
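As a rough analogue of run-time linking, the idea can be sketched in Python with `ctypes`. The commit itself uses `at::DynamicLibrary` in C++; the helper name `load_first_available` and the candidate file names below are assumptions made for illustration, and real library names differ per platform and FFmpeg version.

```python
import ctypes
from typing import Iterable, Optional


def load_first_available(names: Iterable[str]) -> Optional[ctypes.CDLL]:
    """Try to dlopen each candidate in order; return the first that loads."""
    for name in names:
        try:
            return ctypes.CDLL(name)
        except OSError:
            # Library not found or not loadable; try the next candidate.
            continue
    return None


# Illustrative Linux-style candidates (assumed names, not an exhaustive list).
lib = load_first_available(
    ["libavutil.so.58", "libavutil.so.57", "libavutil.so.56"]
)
if lib is None:
    print("libavutil not found; FFmpeg-backed features would stay disabled")
else:
    # avutil_version() is part of libavutil's public C API and returns
    # an unsigned int encoded as major << 16 | minor << 8 | micro.
    lib.avutil_version.restype = ctypes.c_uint
    print("libavutil major version:", lib.avutil_version() >> 16)
```

Because nothing is linked at build time, a missing library surfaces as a recoverable condition at run time rather than as a load failure of the whole extension.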
-
- 09 May, 2023 1 commit
-
-
Moto Hira authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3296 Reviewed By: hwangjeff Differential Revision: D45503774 fbshipit-source-id: 806c22bd0f54fd0cea43d61ef3dbedd67ffeb012
-
- 03 Apr, 2023 1 commit
-
-
moto authored
Summary: Utility functions are only available to Python, so there is no need to use TorchBind for them. This should allow us to remove the link-whole flag when linking the `libtorchaudio_ffmpeg` part. Pull Request resolved: https://github.com/pytorch/audio/pull/3228 Reviewed By: nateanl Differential Revision: D44639560 Pulled By: mthrok fbshipit-source-id: 5116073ee8c5ab572c63ad123942c4826bfe1100
-
- 31 Mar, 2023 1 commit
-
-
moto authored
Summary: This commit adds the equivalent of the `qscale` option in FFmpeg to StreamWriter.CodecConfig. `qscale` enables variable bit rate. The following figure illustrates the difference between the currently available configs. From top to bottom: original, `compression_level=1`, `compression_level=9`, `bit_rate=192k`, `bit_rate=8k`, `qscale=9`, `qscale=1`. Pull Request resolved: https://github.com/pytorch/audio/pull/3224 Reviewed By: hwangjeff Differential Revision: D44563633 Pulled By: mthrok fbshipit-source-id: ff74cd803b5abf1222f087e3e46ba7d81a35f672
-
- 27 Mar, 2023 1 commit
-
-
hwangjeff authored
Summary: For `StreamWriter`,
* Renames arg `config` to `codec_config`.
* Renames struct `EncodingConfig` and dataclass `EncodeConfig` to `CodecConfig`.
* Adds docstrings for arg `codec_config`.
* Updates `chunk` to `frames` in `write_*_chunk` methods.

Pull Request resolved: https://github.com/pytorch/audio/pull/3203 Reviewed By: mthrok Differential Revision: D44350153 Pulled By: hwangjeff fbshipit-source-id: 1b940b1366a43ec0565c362bfcbf62744088b343
-
- 17 Mar, 2023 2 commits
-
-
moto authored
Summary: TODO: add cache release Pull Request resolved: https://github.com/pytorch/audio/pull/3178 Reviewed By: hwangjeff Differential Revision: D44136275 Pulled By: mthrok fbshipit-source-id: 4eaf646fe17a469e8bbbdf43441d5532f9f8461d
-
moto authored
Summary: Adds config object `EncodingConfig` and modifies `StreamWriter` to allow for passing in additional encoder configuration parameters, e.g. bit rate and compression level. Pull Request resolved: https://github.com/pytorch/audio/pull/3179 Pull Request resolved: https://github.com/pytorch/audio/pull/3164 Reviewed By: mthrok Differential Revision: D43861413 Pulled By: hwangjeff fbshipit-source-id: c1682cb2f6e682ab6f1a506511d2be7c7b254161
-
- 08 Mar, 2023 1 commit
-
-
moto authored
Summary: This commit adds fields to OutputStream that show the result of filters, such as width and height after filtering.

Before
```
OutputStream(
    source_index=0,
    filter_description='fps=3,scale=width=320:height=320,format=pix_fmts=gray')
```
After
```
OutputVideoStream(
    source_index=0,
    filter_description='fps=3,scale=width=320:height=320,format=pix_fmts=gray',
    media_type='video',
    format='gray',
    width=320,
    height=320,
    frame_rate=3.0)
```
Pull Request resolved: https://github.com/pytorch/audio/pull/3155 Reviewed By: nateanl Differential Revision: D43882399 Pulled By: mthrok fbshipit-source-id: 620676b1a06f293fdd56de8203a11120f228fa2d
-
- 24 Feb, 2023 2 commits
-
-
moto authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3095 Reviewed By: nateanl Differential Revision: D43544998 Pulled By: mthrok fbshipit-source-id: 4359cdbbdbee53084016a84129cb3d65900b0457
-
moto authored
Summary: This commit is a clean-up and preparation for future development. We plan to pass around more complicated objects among StreamReader and StreamWriter, and TorchBind is not expressive enough for defining intermediate objects, so we use PyBind11 for binding StreamWriter. Pull Request resolved: https://github.com/pytorch/audio/pull/3091 Reviewed By: xiaohui-zhang Differential Revision: D43515714 Pulled By: mthrok fbshipit-source-id: 9097bb104bbf8c1536a5fab6f87447c08b10a7f2
-
- 23 Feb, 2023 1 commit
-
-
moto authored
Summary: This commit is a clean-up and preparation for future development. We plan to pass around more complicated objects among StreamReader and StreamWriter, and TorchBind is not expressive enough for defining intermediate objects, so we want to use PyBind11 for binding StreamReader/Writer. PyBind11 converts a Python dict into std::map, while TorchBind converts it into c10::Dict. Because of this discrepancy, conversion from c10::Dict to std::map has to happen in multiple places, and this makes the binding code thicker because it requires wrapper methods. Using std::map reduces the number of wrapper methods and conversions, because the same method can be bound for file-like objects and the others. Pull Request resolved: https://github.com/pytorch/audio/pull/3092 Reviewed By: nateanl Differential Revision: D43524808 Pulled By: mthrok fbshipit-source-id: f7467c66ccd37dbf4abc337bbb18ffaac21a0058
-
- 27 Jan, 2023 1 commit
-
-
Moto Hira authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3013 Namespace clean up before publishing the torchaudio C++ API as prototype. Reviewed By: hwangjeff Differential Revision: D42699903 fbshipit-source-id: 8a9eed0390dfa4a152124b42f2b927dbdd3e23d2
-
- 26 Jan, 2023 1 commit
-
-
Moto Hira authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3007 Simplify the construction of StreamReader/Writer in C++. Currently these classes require client code to build AVFormatContext manually. This is tedious and not user friendly. Some client code actually uses the same helper function that the TorchAudio codebase uses. This commit moves the helper logic inside the constructors of StreamReader/Writer, so that the signatures of these constructors are easy to use and similar to the Python interface. Reviewed By: xiaohui-zhang Differential Revision: D42662520 fbshipit-source-id: d95e5236810c48d7d9bd2d89c05d4f60a44b3ba1
-
- 04 Jan, 2023 1 commit
-
-
moto authored
Summary: Currently, when iterating media data with StreamReader, using the for-loop is the only way with the public API. This does not support use cases like "Fetch one chunk after seek" well.

```python
s = StreamReader(src)
s.add_audio_stream(...)
s.seek(10)
chunk = None
for chunk, in s.stream():
    break
```

This commit makes `fill_buffer`, which is used by the iterative method, a public API so that one can do

```python
s.seek(10)
s.fill_buffer()
chunk, = s.pop_chunks()
```

---

Also, this commit moves the implementation to C++, which reduces the number of FFI boundary crossings. This improves the performance when the iteration is longer.

AVI (generated with `ffmpeg -hide_banner -f lavfi -t ${duration} -i testsrc "${file}.avi"`)

| Video Duration [sec] | Original [msec] | Fill Buffer C++ [msec] | One Go (reference) [msec] |
|----------------------|-----------------|------------------------|---------------------------|
| 1 | 18 | 18.4 | 16.6 |
| 5 | 44 | 42.6 | 35.1 |
| 10 | 75.3 | 74.4 | 60.9 |
| 30 | 200 | 195 | 158 |
| 60 | 423 | 382 | 343 |

MP4 (generated with `ffmpeg -hide_banner -f lavfi -t ${duration} -i testsrc "${file}.mp4"`)

| Video Duration [sec] | Original [msec] | Fill Buffer C++ [msec] | One Go (reference) [msec] |
|----------------------|-----------------|------------------------|---------------------------|
| 1 | 18.7 | 18.1 | 10.3 |
| 5 | 42.2 | 40.6 | 25.2 |
| 10 | 73.9 | 71.8 | 43.6 |
| 30 | 202 | 194 | 116 |
| 60 | 396 | 386 | 227 |

* Original (Python implementation)
```python
r = StreamReader(src)
r.add_video_stream(1, decoder_option={"threads": "1"})
for chunk, in r.stream():
    pass
```
* This (C++)
```python
r = StreamReader(src)
r.add_video_stream(1, decoder_option={"threads": "1"})
for chunk, in r.stream():
    pass
```
* Using `process_all_packets` (process all in one go)
```python
r = StreamReader(src)
r.add_video_stream(1, decoder_option={"threads": "1"})
r.process_all_packets()
```

Pull Request resolved: https://github.com/pytorch/audio/pull/2954 Reviewed By: carolineechen Differential Revision: D42349446 Pulled By: mthrok fbshipit-source-id: 9e4e37923e46299c3f43f4ad17a2a2b938b2b197
-
- 01 Sep, 2022 1 commit
-
-
moto authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2648 Reviewed By: nateanl Differential Revision: D38976874 Pulled By: mthrok fbshipit-source-id: 0541dea2a633d97000b4b8609ff6b83f6b82c864
-
- 12 Jul, 2022 1 commit
-
-
moto authored
Summary: A Python dictionary is bound to different types in TorchBind and PyBind. StreamReader has methods that receive and return dictionaries. This commit cleans up the treatment of dictionaries and consolidates the helper functions.
* The core implementation and TorchBind both use `c10::Dict`.
* The PyBind version uses `std::map` and converts it to `c10::Dict`.
* The helper functions to convert `std::map` <-> `c10::Dict` are consolidated in the pybind directory.
* The wrapper methods are implemented in the `pybind` dir.

Pull Request resolved: https://github.com/pytorch/audio/pull/2533 Reviewed By: hwangjeff Differential Revision: D37731866 Pulled By: mthrok fbshipit-source-id: 5a5cf1372668f7d3aacc0bb461bc69fa07212f3f
-
- 08 Jun, 2022 2 commits
-
-
moto authored
Summary: In https://github.com/pytorch/audio/issues/2461, the `metadata` field was added to StreamInfo. However, the value attached to this new field was source-level metadata, while each stream can have different metadata.
* source-level metadata: [AVFormatContext->metadata](https://ffmpeg.org/doxygen/4.1/structAVFormatContext.html#a3019a56080ed2e3297ff25bc2ff88adf)
* stream-level metadata: [AVFormatContext->streams[]->metadata](https://ffmpeg.org/doxygen/4.1/structAVStream.html#a50d250a128a3da9ce3d135e84213fb82)

This commit moves source-level metadata to a dedicated method, `get_metadata`, and fixes the stream-level `metadata` field so that it reports stream metadata.

Pull Request resolved: https://github.com/pytorch/audio/pull/2464 Reviewed By: hwangjeff, xiaohui-zhang Differential Revision: D36995452 Pulled By: mthrok fbshipit-source-id: 534be1f7feb07790a0ce8624c336cdb7b65a8697
-
moto authored
Summary: Add metadata, such as ID3 (https://github.com/pytorch/audio/commit/7d98db0567cb60fabcc173949b8c08e3a3487ac2) tag, to `StreamReaderSourceAudioStream`. Pull Request resolved: https://github.com/pytorch/audio/pull/2461 Reviewed By: hwangjeff Differential Revision: D36985656 Pulled By: mthrok fbshipit-source-id: e66f9e6e980eb57c378cc643a8979b6b7813dae7
-
- 21 May, 2022 1 commit
-
-
moto authored
Summary: This commit adds file-like object support to the Streaming API.

## Features
- File-like objects are expected to implement `read(self, n)`.
- Additionally, `seek(self, offset, whence)` is used if available.
- Without a `seek` method, some formats cannot be decoded properly.
- To work around this, one can use the existing `decoder` option to specify which decoder to use.
- The `decoder` and `decoder_option` arguments were added to the `add_basic_[audio|video]_stream` methods, similar to `add_[audio|video]_stream`.
- The order of the arguments is changed so that the arguments common to both audio and video come before the rest.
- Also, the `dtype` and `format` arguments were changed to make them consistent across the audio/video methods.

## Code structure
The approach is very similar to how file-like objects are supported in sox-based I/O. In the Streaming API, if the input src is a string, it is passed to the implementation bound with TorchBind; if the src has a `read` attribute, it is passed to the same implementation bound via PyBind11.

## Refactoring involved
- Extracted to https://github.com/pytorch/audio/issues/2402
- Some implementation in the original TorchBind surface layer is converted to a Wrapper class so that it can be re-used from the PyBind11 bindings. The wrapper class serves to simplify the binding.
- `add_basic_[audio|video]_stream` methods were removed from the C++ layer, as they were just constructing a string and passing it to the `add_[audio|video]_stream` method, which is simpler to do in Python.
- The original core Streamer implementation kept the use of types in the `c10` namespace to a minimum. All the `c10::optional` and `c10::Dict` were converted to their `std` equivalents at the binding layer. But since they work fine with PyBind11, the Streamer core methods deal with them directly.

## TODO:
- [x] Check if it is possible to stream MP4 (yuv420p) from S3 and directly decode (with/without HW decoding).

Pull Request resolved: https://github.com/pytorch/audio/pull/2400 Reviewed By: carolineechen Differential Revision: D36520073 Pulled By: mthrok fbshipit-source-id: a11d981bbe99b1ff0cc356e46264ac8e76614bc6
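The file-like contract and the string-versus-`read` dispatch described in the commit above can be sketched as follows. This is an illustration of the rule as stated in the summary, not the actual torchaudio code; `ReadOnlySource` and `classify_source` are hypothetical names introduced here.

```python
import io


class ReadOnlySource:
    """Minimal file-like object: implements only read(n), no seek."""

    def __init__(self, data: bytes):
        self._buf = io.BytesIO(data)

    def read(self, n: int) -> bytes:
        return self._buf.read(n)


def classify_source(src) -> str:
    """Mirror the dispatch rule described in the commit summary."""
    if isinstance(src, str):
        # A string src would go to the TorchBind-bound implementation.
        return "path"
    if hasattr(src, "read"):
        # An object with `read` would go to the PyBind11-bound one;
        # `seek` is optional but helps some formats decode properly.
        return "seekable file-like" if hasattr(src, "seek") else "file-like"
    raise TypeError("src must be a str or an object with a read method")


print(classify_source("video.mp4"))                # path
print(classify_source(io.BytesIO(b"\x00\x01")))    # seekable file-like
print(classify_source(ReadOnlySource(b"\x00")))    # file-like
```

For a source like `ReadOnlySource`, the commit notes that passing the `decoder` option is the workaround when a format cannot be probed without seeking.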
-