Commits · b2e07b58d4dd8028aa9bdc2207a4a624139044cf · OpenDAS / Torchaudio

17 Mar, 2023 2 commits

Update compatibility matrix (#3182) · b2e07b58

moto authored Mar 17, 2023

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3182

Reviewed By: nateanl

Differential Revision: D44167810

Pulled By: mthrok

fbshipit-source-id: 6ecbae54224ef7ba32835e4006aa5f2dc16b9acb

b2e07b58

Add EncodingConfig (#3179) · 9bb35070

moto authored Mar 16, 2023

Summary:
Adds config object `EncodingConfig` and modifies `StreamWriter` to allow for passing in additional encoder configuration parameters, e.g. bit rate and compression level.
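
A minimal sketch of the idea, not torchaudio's actual implementation: a small config object carries optional encoder settings, and the writer forwards only the ones that were set. Apart from the name `EncodingConfig` and the two example parameters mentioned in the summary, everything here (`ToyWriter`, `add_stream`, `encoder_options`) is a hypothetical stand-in.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class EncodingConfig:
    """Optional encoder settings; None means "use the encoder default"."""

    bit_rate: Optional[int] = None
    compression_level: Optional[int] = None


class ToyWriter:
    """Hypothetical stand-in for a stream writer; it only records options."""

    def __init__(self):
        self.streams = []

    def add_stream(self, sample_rate, config=None):
        # Forward only the fields the caller actually set.
        options = {}
        if config is not None:
            options = {k: v for k, v in asdict(config).items() if v is not None}
        self.streams.append({"sample_rate": sample_rate, "encoder_options": options})


writer = ToyWriter()
writer.add_stream(16000, EncodingConfig(bit_rate=128_000))
```

Grouping the settings in one object keeps the writer's signature stable as more encoder parameters are added.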

Pull Request resolved: https://github.com/pytorch/audio/pull/3179

Pull Request resolved: https://github.com/pytorch/audio/pull/3164

Reviewed By: mthrok

Differential Revision: D43861413

Pulled By: hwangjeff

fbshipit-source-id: c1682cb2f6e682ab6f1a506511d2be7c7b254161

9bb35070

16 Mar, 2023 2 commits

Fix initialization of `get_trellis`. (#3172) · a6b34a5d

jiyuntu-eero authored Mar 16, 2023

Summary:
Fix https://github.com/pytorch/audio/issues/3166. In `get_trellis` method, the index of blank symbol is regarded as 0 by default. It should be changed to `blank_id`.
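
A simplified, pure-Python sketch of the kind of initialization involved, assuming the first trellis column is the cumulative blank score per frame. The helper name `init_first_column` and the toy numbers are hypothetical; the real `get_trellis` operates on tensors.

```python
from itertools import accumulate


def init_first_column(emission, blank_id=0):
    """Return the cumulative blank score per frame (first trellis column).

    `emission` is a list of frames, each a list of per-label log-probabilities.
    """
    # Index with blank_id rather than a hard-coded 0.
    return list(accumulate(frame[blank_id] for frame in emission))


# Toy log-probabilities: 3 frames, 3 labels, blank stored at index 2.
emission = [
    [-0.1, -2.0, -3.0],
    [-1.5, -0.2, -2.5],
    [-2.0, -1.0, -0.3],
]
hard_coded = init_first_column(emission)  # treats label 0 as blank
with_blank = init_first_column(emission, blank_id=2)
```

When the blank symbol is not at index 0, the two results differ, which is the situation the fix addresses.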

Pull Request resolved: https://github.com/pytorch/audio/pull/3172

Reviewed By: mthrok

Differential Revision: D44090889

Pulled By: nateanl

fbshipit-source-id: d119f4ded895d31aeefd59f8d975224870100264

a6b34a5d

Refactor Tensor conversion in StreamReader (#3170) · 014d7140

moto authored Mar 15, 2023

Summary:
Currently, when the Buffer converts AVFrame* to torch::Tensor,
it checks the format each time a frame is passed and then
performs the conversion.

This commit changes it so that the conversion operation is
pre-instantiated at the time the output stream is configured.

It introduces Converter implementations for various formats
and uses templates to embed them in the Buffer class.
This way, branches such as if/switch are eliminated from
the decoding path.
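
The same idea can be sketched in Python as a rough analog of the C++ template approach: the converter is chosen once at configuration time, so the per-frame path contains no format check. The format names, converter functions, and this `Buffer` class are hypothetical illustrations, not the actual implementation.

```python
def convert_u8(frame):
    # Scale unsigned 8-bit samples to [0.0, 1.0].
    return [v / 255 for v in frame]


def convert_s16(frame):
    # Scale signed 16-bit samples to [-1.0, 1.0).
    return [v / 32768 for v in frame]


CONVERTERS = {"u8": convert_u8, "s16": convert_s16}


class Buffer:
    def __init__(self, fmt):
        # The format is inspected once, when the stream is configured.
        try:
            self._convert = CONVERTERS[fmt]
        except KeyError:
            raise ValueError(f"unsupported format: {fmt}") from None

    def push(self, frame):
        # Hot path: no per-frame format check.
        return self._convert(frame)


buf = Buffer("s16")
out = buf.push([0, 16384, -32768])
```

An unsupported format now fails at configuration time rather than on the first decoded frame.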

Pull Request resolved: https://github.com/pytorch/audio/pull/3170

Reviewed By: xiaohui-zhang

Differential Revision: D44048293

Pulled By: mthrok

fbshipit-source-id: 30d8b240a5695d7513f499ce17853f2f0ffcab9f

014d7140

15 Mar, 2023 2 commits

Enhance UX on TorchAudio pages to improve awareness of doc versioning (#3167) · 92f2ea89

Carl Parker authored Mar 15, 2023

Summary:
- Boldface the version-selection UX and increase size by three percent.
- Add text to breadcrumbs to indicate version and stability.
- New `breadcrumbs.html` in `_templates` overrides Sphinx version.

I create a new variable in `conf.py`, **version_stable**, which has the version number for the most-recent stable release. I define this variable in the **html_context** dictionary so that it is visible to the templates.

I use this approach because I was not able to find any other way of discerning the current stable release during the build. Note that the `versions.html` file--which identifies the current stable release--appears to be available only in the **gh-pages** branch and so it is not available at build time.

However, this means that someone will need to update `conf.py` whenever the current stable release changes.
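
A minimal sketch of what such a `conf.py` fragment could look like. The version string is a placeholder and not taken from the commit; `html_context` is the standard Sphinx option for passing values to templates.

```python
# conf.py (fragment)

# Most recent stable release; must be bumped by hand on each release.
# The value below is a placeholder, not taken from the commit.
version_stable = "X.Y.Z"

# Names in html_context are exposed to the HTML templates.
html_context = {
    "version_stable": version_stable,
}
```

A template such as `breadcrumbs.html` can then read the value as `{{ version_stable }}`.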

Pull Request resolved: https://github.com/pytorch/audio/pull/3167

Reviewed By: mthrok

Differential Revision: D44112224

Pulled By: carljparker

fbshipit-source-id: e76f5cb6734a784d161342964459577aa9b64cac

92f2ea89

Fix MFCC autograd test (#3169) · ee0b97f2

Zhaoheng Ni authored Mar 14, 2023

Summary:
The autograd test randomly fails for the MFCC transform. Fix it by increasing `nondet_tol` to `1e-10`.

Pull Request resolved: https://github.com/pytorch/audio/pull/3169

Reviewed By: xiaohui-zhang, mthrok

Differential Revision: D44069673

Pulled By: nateanl

fbshipit-source-id: addafefe381104e778b09bfbaafb322df1d9054c

ee0b97f2

14 Mar, 2023 2 commits

Add documentation introducing I/O backend revision (#3147) · 6a8ed4a2

hwangjeff authored Mar 14, 2023

Summary:
Adds documentation that introduces the forthcoming I/O backend revision and provides enablement directions for the current release.

Doc pages:
https://output.circle-artifacts.com/output/job/9c0e5a49-eaf4-404c-b910-ca1b18bb289b/artifacts/0/docs/torchaudio.html

Pull Request resolved: https://github.com/pytorch/audio/pull/3147

Reviewed By: mthrok

Differential Revision: D43824019

Pulled By: hwangjeff

fbshipit-source-id: ad21d60c7e8f69f64859c56a8ca75735ddc22e40

6a8ed4a2

Update compatibility matrix (#3168) · 10aec5bd

Zhaoheng Ni authored Mar 14, 2023

Summary:
Add `2.0.0` release to the compatibility matrix

Pull Request resolved: https://github.com/pytorch/audio/pull/3168

Reviewed By: mthrok

Differential Revision: D44059197

Pulled By: nateanl

fbshipit-source-id: a2830d059be90eddeab72b30e85cdfc393369bf8

10aec5bd

09 Mar, 2023 2 commits

Refactor StreamReader - let StreamProcessor own codec context (#3157) · a8f4e97b

Moto Hira authored Mar 09, 2023

Summary:
Pull Request resolved: https://github.com/pytorch/audio/pull/3157

AVCodecContext plays a central role in decoding and encoding.
Currently, in StreamReader, the object is owned by the Decoder class
and is not accessible from other objects.

This commit moves the ownership of AVCodecContext out of Decoder to the
StreamProcessor class so that other components can access its fields.

Also, the Decoder class, which is a very thin wrapper around the AVCodecContext
object, is now absorbed into the StreamProcessor class.

Reviewed By: xiaohui-zhang

Differential Revision: D43924664

fbshipit-source-id: e53254955d9ce16871e393bcd8bb2794ce6a51ff

a8f4e97b

Remove private helper methods from StreamReader (#3156) · 430dd17c

Moto Hira authored Mar 08, 2023

Summary:
Pull Request resolved: https://github.com/pytorch/audio/pull/3156

Remove helper methods that do not merit being separate private methods.

Reviewed By: xiaohui-zhang

Differential Revision: D43919385

fbshipit-source-id: 2ce4efaf5ec9418076e78c7ce1f842e0dd7e3028

430dd17c

08 Mar, 2023 3 commits

Fix documentation of functional and transforms (#3134) · 85cb37e2

cai525 authored Mar 08, 2023

Summary:
Address #3101. The documentation for `power=1` should represent magnitude instead of energy.

Pull Request resolved: https://github.com/pytorch/audio/pull/3134

Reviewed By: mthrok

Differential Revision: D43910652

Pulled By: nateanl

fbshipit-source-id: e0768438e819222a5dde6b86c5123ab0e8af59fb

85cb37e2

Include format information after filter (#3155) · 146195d8

moto authored Mar 08, 2023

Summary:
This commit adds fields to OutputStream that show the result
of filters, such as width and height after filtering.

Before

```
OutputStream(
    source_index=0,
    filter_description='fps=3,scale=width=320:height=320,format=pix_fmts=gray')
```

After

```
OutputVideoStream(
    source_index=0,
    filter_description='fps=3,scale=width=320:height=320,format=pix_fmts=gray',
    media_type='video',
    format='gray',
    width=320,
    height=320,
    frame_rate=3.0)
```
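
To illustrate why the extra fields are useful, here is a sketch that mirrors the printed fields with plain dataclasses. The class hierarchy shown is an assumption made for illustration; only the field names and values come from the output above.

```python
from dataclasses import dataclass


@dataclass
class OutputStream:
    source_index: int
    filter_description: str


@dataclass
class OutputVideoStream(OutputStream):
    # Fields describing the frames after the filter graph has run.
    media_type: str
    format: str
    width: int
    height: int
    frame_rate: float


info = OutputVideoStream(
    source_index=0,
    filter_description="fps=3,scale=width=320:height=320,format=pix_fmts=gray",
    media_type="video",
    format="gray",
    width=320,
    height=320,
    frame_rate=3.0,
)
# Downstream code can size buffers without re-parsing the filter string.
frame_shape = (info.height, info.width)
```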

Pull Request resolved: https://github.com/pytorch/audio/pull/3155

Reviewed By: nateanl

Differential Revision: D43882399

Pulled By: mthrok

fbshipit-source-id: 620676b1a06f293fdd56de8203a11120f228fa2d

146195d8

Support overwriting PTS in StreamWriter (#3135) · 8d2f6f8d

moto authored Mar 08, 2023

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3135

Reviewed By: xiaohui-zhang

Differential Revision: D43724273

Pulled By: mthrok

fbshipit-source-id: 9b52823618948945a26e57d5b3deccbf5f9268c1

8d2f6f8d

07 Mar, 2023 5 commits

Resolve the usage of deprecated method (#3149) · 3212a257

moto authored Mar 07, 2023

Summary:
FFmpeg 5 introduced a new API for channel configuration, and `channel_layout` is deprecated.

This commit fixes one of the deprecation messages.

Pull Request resolved: https://github.com/pytorch/audio/pull/3149

Reviewed By: nateanl

Differential Revision: D43874808

Pulled By: mthrok

fbshipit-source-id: 3e76e8c8f1f34758b1014a426e77260e663b18ee

3212a257

Use deterministic algorithms for filtfilt autograd tests (#3150) · 1923be04

Zhaoheng Ni authored Mar 07, 2023

Summary:
The `filtfilt` function uses `lfilter`, which calls the `conv_1d` operation internally. `conv_1d` is expected to have autograd test failures (see https://pytorch.org/docs/stable/generated/torch.use_deterministic_algorithms.html). This PR uses deterministic algorithms in the autograd tests to make the `filtfilt`-related tests pass.

Pull Request resolved: https://github.com/pytorch/audio/pull/3150

Reviewed By: mthrok

Differential Revision: D43872977

Pulled By: nateanl

fbshipit-source-id: c3d6ec281f34db8a7092526ccb245797bf2338da

1923be04

Fix LFCC autograd test (#3154) · 67a49f3c

Zhaoheng Ni authored Mar 07, 2023

Summary:
The autograd test randomly failed on Linux GPU machines. Increase `nondet_tol` to make it pass.

Pull Request resolved: https://github.com/pytorch/audio/pull/3154

Reviewed By: mthrok

Differential Revision: D43873028

Pulled By: nateanl

fbshipit-source-id: a6668c47967a085e5eafb00e2dd4e61b2b46412e

67a49f3c

Raise an error if StreamWriter is not opened (#3152) · 502d5811

Moto Hira authored Mar 07, 2023

Summary:
Pull Request resolved: https://github.com/pytorch/audio/pull/3152

In StreamWriter, if the destination is not opened when attempting to write data, it causes a segmentation fault.
This commit adds a guard so that it raises an error instead of segfaulting.
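
The guard pattern can be sketched as follows. `ToyWriter` and its error message are hypothetical; the actual check lives in the C++ implementation of StreamWriter.

```python
class ToyWriter:
    """Hypothetical writer used only to illustrate the guard."""

    def __init__(self):
        self._is_open = False
        self.written = []

    def open(self):
        self._is_open = True

    def write(self, chunk):
        # Fail loudly instead of touching an uninitialized destination.
        if not self._is_open:
            raise RuntimeError("Output is not opened.")
        self.written.append(chunk)


writer = ToyWriter()
try:
    writer.write(b"\x00\x01")
except RuntimeError as err:
    message = str(err)
```

A catchable exception lets the caller recover or report the mistake, which a crash of the whole process does not.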

Reviewed By: nateanl

Differential Revision: D43852649

fbshipit-source-id: aef5db7c1508f8a7db5834c2ab6de3cad09f9d60

502d5811

Fix Adam and AdamW initializers in wav2letter example (#3145) · cea12eaf

Maciej Torhan authored Mar 06, 2023

Summary:
The wav2letter example passes `momentum` to the `Adam` and `AdamW` initializers, which is not a valid parameter. To fix that, we need to add `beta_1` and `beta_2` to the arguments and use them in place of `momentum`. I also added `eps`, similar to the `Adadelta` initializer.
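
A sketch of how such arguments could be wired up, assuming command-line flags named `--beta-1`, `--beta-2`, and `--eps` (the flag names and helper functions are hypothetical). `torch.optim.Adam` and `torch.optim.AdamW` accept `betas` as a pair and `eps`, and have no `momentum` argument.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser()
    parser.add_argument("--optimizer", default="adam", choices=["adam", "adamw"])
    parser.add_argument("--learning-rate", type=float, default=1e-3)
    parser.add_argument("--beta-1", type=float, default=0.9)
    parser.add_argument("--beta-2", type=float, default=0.999)
    parser.add_argument("--eps", type=float, default=1e-8)
    return parser


def adam_kwargs(args):
    # Adam/AdamW take `betas` as a pair; there is no `momentum` argument.
    return {
        "lr": args.learning_rate,
        "betas": (args.beta_1, args.beta_2),
        "eps": args.eps,
    }


args = build_parser().parse_args(["--beta-1", "0.8"])
kwargs = adam_kwargs(args)
# With PyTorch installed: torch.optim.Adam(model.parameters(), **kwargs)
```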

Pull Request resolved: https://github.com/pytorch/audio/pull/3145

Reviewed By: mthrok

Differential Revision: D43847713

Pulled By: nateanl

fbshipit-source-id: 94f7c48232fabf520cfce81471694cb545d160c6

cea12eaf

06 Mar, 2023 1 commit

Refactor encoding process (#3146) · 8a9ab2a4

Moto Hira authored Mar 06, 2023

Summary:
After the series of simplifications, the audio and video encoding processes
can be merged, which gets rid of the boilerplate code.

Pull Request resolved: https://github.com/pytorch/audio/pull/3146

(Note: this ignores all push blocking failures!)

Reviewed By: xiaohui-zhang

Differential Revision: D43815640

fbshipit-source-id: 2a14e372b2cc75db7eeabc27d855a24c3f7d5063

8a9ab2a4

04 Mar, 2023 2 commits

Fix linux gpu tests (#3144) · b96a7ebb

Zhaoheng Ni authored Mar 04, 2023

Summary:
The environment variable `TORCHAUDIO_TEST_ALLOW_SKIP_IF_NO_MACOS` needs to be set when running the bash script.

Pull Request resolved: https://github.com/pytorch/audio/pull/3144

Reviewed By: mthrok

Differential Revision: D43807178

Pulled By: nateanl

fbshipit-source-id: 27c57d2efaed5519a12aa027967968895f357c67

b96a7ebb

Refactor audio conversion (#3143) · db4898f3

Moto Hira authored Mar 03, 2023

Summary:
Pull Request resolved: https://github.com/pytorch/audio/pull/3143

Similar to https://github.com/pytorch/audio/pull/3140,
only provide objects which are semantically related to the
operation performed by AudioConverter.

Reviewed By: xiaohui-zhang

Differential Revision: D43781012

fbshipit-source-id: 4795e20f56272af5cfda8a5f46083e60d1890c3e

db4898f3

03 Mar, 2023 3 commits

Simplify HW encoder object handling (#3138) · 26acdbff

moto authored Mar 03, 2023

Summary:
hw_device_ctx and hw_frame_ctx assigned to an AVCodecContext
object are owned by libavformat, and get freed in [av_codec_free](https://ffmpeg.org/doxygen/4.1/group__lavc__core.html#gaf869d0829ed607cec3a4a02a1c7026b3)
(actually in [avcodec_close](https://ffmpeg.org/doxygen/4.1/libavcodec_2utils_8c_source.html#l01069)),
so we do not need to keep the reference around.

Pull Request resolved: https://github.com/pytorch/audio/pull/3138

Reviewed By: nateanl

Differential Revision: D43738009

Pulled By: mthrok

fbshipit-source-id: 8c1f4217fa7b21dce872d12be9245056f3fc7537

26acdbff

Fix HW accelerated encoder (#3140) · 41e3b93d

Moto Hira authored Mar 03, 2023

Summary:
Pull Request resolved: https://github.com/pytorch/audio/pull/3140

https://github.com/pytorch/audio/pull/3120 introduced a regression in the GPU encoder.

This happened because the converter (the module that copies the input tensor to
a buffer) was previously given both the source AVPixelFormat (the expected channel
order of the input tensor) and the AVCodecContext (the encoding format), even
though the converter does not need to know about the encoding format.

This commit fixes the issue and makes sure that the converter does not receive
the codec context.

Reviewed By: nateanl

Differential Revision: D43759162

fbshipit-source-id: f5f191cb54ecc82bd882aececdcae16921250261

41e3b93d

Skip playback tests on linux gpu machine (#3141) · d359f887

Zhaoheng Ni authored Mar 03, 2023

Summary:
The `playback` function was added in https://github.com/pytorch/audio/issues/3026. It only supports macOS, so the tests should be skipped on other operating systems. This PR skips the tests on Linux GPU machines on CircleCI.

Pull Request resolved: https://github.com/pytorch/audio/pull/3141

Reviewed By: xiaohui-zhang, mthrok

Differential Revision: D43760546

Pulled By: nateanl

fbshipit-source-id: 606907127feee28a66f61baca000a8ef708f8086

d359f887

02 Mar, 2023 5 commits

Fix build (#3136) · 5fac7173

moto authored Mar 02, 2023

Summary:
Follow-up to https://github.com/pytorch/audio/issues/3130

Pull Request resolved: https://github.com/pytorch/audio/pull/3136

Reviewed By: hwangjeff

Differential Revision: D43732991

Pulled By: mthrok

fbshipit-source-id: 2e8cb56d96e22546645c82eca362b3c4dcf9c78f

5fac7173

Fix doc build (#3125) · 1ed38095

moto authored Mar 01, 2023

Summary:
Fix build_doc job

https://app.circleci.com/pipelines/github/pytorch/audio/15217/workflows/ce50b317-a59e-4741-b8d2-59129420deb8

- build.ffmpeg.html might not exist when IPython notebook is processed. Changing to main doc URL.
- Fix bash cell syntax in HW tutorial
- Fix C++ doc
- Fix duplicated target name in streamwriter tutorial

Pull Request resolved: https://github.com/pytorch/audio/pull/3125

Reviewed By: xiaohui-zhang

Differential Revision: D43724078

Pulled By: mthrok

fbshipit-source-id: ea7d46ec5e377cf2fbd7c3798df57da73750ac5c

1ed38095

Extract audio conversion into separate class (#3130) · 9133f2a0

Moto Hira authored Mar 01, 2023

Summary:
Pull Request resolved: https://github.com/pytorch/audio/pull/3130

Similar to https://github.com/pytorch/audio/pull/3120,
this adopts the generator-style slicing conversion in the audio encoding
process.

Reviewed By: nateanl

Differential Revision: D43685380

fbshipit-source-id: 3e95655783e5c5d768486f8af6e6b47b0072999b

9133f2a0

Fix PTS regression (#3131) · fbf05f28

Moto Hira authored Mar 01, 2023

Summary:
Pull Request resolved: https://github.com/pytorch/audio/pull/3131

In https://github.com/pytorch/audio/pull/3122, the intermediate `num_frames` variable
was removed.

PTS can still be incremented the same way, but the timing of the increment was wrong in #3122.
This commit fixes it.

Reviewed By: xiaohui-zhang

Differential Revision: D43712046

fbshipit-source-id: 2fe0082969296f4f3964e62e55b5325fcd45f4f9

fbf05f28

Update slicing conversion code (#3129) · 898db8c7

Moto Hira authored Mar 01, 2023

Summary:
Pull Request resolved: https://github.com/pytorch/audio/pull/3129

- Add step parameter to support audio slicing
- Rename to `SlicingTensorConverter` (`Generator` is too generic.)
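
A pure-Python sketch of generator-style slicing with a step, assuming `step` is the number of items handed out per slice (for example, 1 for one image at a time, or a frame size for audio samples). The function name `slices` is hypothetical and the actual C++ class operates on tensors.

```python
def slices(items, step=1):
    """Yield consecutive slices of `items`, `step` elements at a time."""
    if step <= 0:
        raise ValueError("step must be positive")
    for start in range(0, len(items), step):
        # The last slice may be shorter than `step`.
        yield items[start:start + step]


samples = list(range(10))
audio_chunks = list(slices(samples, step=4))
single_items = list(slices(samples[:3]))
```

Because the slices are produced lazily, the consumer can encode each one as it arrives without holding intermediate copies of the whole input.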

Reviewed By: xiaohui-zhang

Differential Revision: D43704926

fbshipit-source-id: c4bf0ff766e0ae1b5d46b159a6367492ef68f9cd

898db8c7

01 Mar, 2023 6 commits

Fix stylecheck in io (#3126) · b0faecb2

Zhaoheng Ni authored Mar 01, 2023

Summary:
`Dict` is not used. Fix the stylecheck by removing the import of `Dict`.

Pull Request resolved: https://github.com/pytorch/audio/pull/3126

Reviewed By: mthrok

Differential Revision: D43699410

Pulled By: nateanl

fbshipit-source-id: 8d6b5335124903453387c488f96f297d6fe3c819

b0faecb2

Tweak OutputStream implementation (#3122) · fce6180c

Moto Hira authored Mar 01, 2023

Summary:
Pull Request resolved: https://github.com/pytorch/audio/pull/3122

- Remove manual tracking of num_frames
- Remove unnecessary dispatch in AudioOutputStream

Reviewed By: nateanl

Differential Revision: D43685746

fbshipit-source-id: a7e62a81549fb62ad0caa3b741655eba3bc5e250

fce6180c

Extract image conversions into separate class (#3120) · 0bf00d20

Moto Hira authored Mar 01, 2023

Summary:
Pull Request resolved: https://github.com/pytorch/audio/pull/3120

This commit extracts image conversion ops into the ImageTensorConverter class and makes it independent of the OutputStream class.

The ImageTensorConverter class implements a range-based for-loop interface, like

```
for (auto const& frame : ImageTensorConverter::convert(...)) {
    post_process_with_avframe(frame);
}
```

This allows decoupling the encoder from image conversion.

Reviewed By: nateanl

Differential Revision: D43666296

fbshipit-source-id: 754efe677bc7695b3f138a6d076be2106e186b79

0bf00d20

Move I/O logging to C++ (#3123) · c9c8c7e1

Moto Hira authored Mar 01, 2023

Summary:
Pull Request resolved: https://github.com/pytorch/audio/pull/3123

Moving the I/O usage logging to C++, so that C++ usages are also covered.

Reviewed By: nateanl

Differential Revision: D43686567

fbshipit-source-id: ad357028dd69eedb8bc2a2482fe07e95757a3a62

c9c8c7e1

Fix windows tests (#3119) · 6a4a8200

Zhaoheng Ni authored Mar 01, 2023

Summary:
`sox` is not available on Windows machines. Add skip decorators to the sox related tests to skip running tests on Windows.
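
A minimal sketch of a skip decorator built on the standard `unittest.skipIf`. The availability check and the names `SOX_AVAILABLE` and `skip_if_no_sox` are hypothetical; torchaudio's own test utilities may decide this differently.

```python
import sys
import unittest

# Hypothetical availability check; the real test utilities may probe differently.
SOX_AVAILABLE = not sys.platform.startswith("win")


def skip_if_no_sox(test_item):
    """Skip the decorated test (or test class) when sox is unavailable."""
    return unittest.skipIf(not SOX_AVAILABLE, "sox is not available")(test_item)


class SoxEffectTest(unittest.TestCase):
    @skip_if_no_sox
    def test_requires_sox(self):
        self.assertTrue(SOX_AVAILABLE)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(SoxEffectTest)
result = unittest.TestResult()
suite.run(result)
```

A skipped test is reported as skipped rather than failed, so the run stays green on platforms without sox.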

Pull Request resolved: https://github.com/pytorch/audio/pull/3119

Reviewed By: mthrok

Differential Revision: D43682754

Pulled By: nateanl

fbshipit-source-id: f69987dac8232a3569be83f096b32389bd8bda81

6a4a8200

Remove redundant device arg from VideoOutputStream constructor (#3121) · af493e4e

Moto Hira authored Feb 28, 2023

Summary:
Pull Request resolved: https://github.com/pytorch/audio/pull/3121

After careful review, it turned out that the device arg in the VideoOutputStream
constructor and related helper functions can be replaced with
AVCodecContext::pix_fmt == AV_PIX_FMT_CUDA.

Reviewed By: xiaohui-zhang

Differential Revision: D43677801

fbshipit-source-id: f8f34f1aed46e223b44250d39cccc4cd26ecb458

af493e4e

28 Feb, 2023 3 commits

Decouple image conversion and OutputStream class (#3113) · 2381beec

Moto Hira authored Feb 28, 2023

Summary:
Pull Request resolved: https://github.com/pytorch/audio/pull/3113

Decouple the Tensor-to-AVFrame conversion process from the encoding process.

Reviewed By: nateanl

Differential Revision: D43628942

fbshipit-source-id: e698f3150292567dbc23e7d6795ad58265f24780

2381beec

Use null filter in case no filter is used (#3109) · fd24af00

Moto Hira authored Feb 27, 2023

Summary:
Pull Request resolved: https://github.com/pytorch/audio/pull/3109

Change the logic around StreamWriter preprocessing.
Currently, the absence of preprocessing is expressed as a `nullptr` value of `unique_ptr<FilterGraph>`.

This commit changes it to the `[a]null` filter, which is just a pass-through.
This makes the code a bit simpler and better prepares it for adding
filters for CUDA processing.
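
The design resembles the null-object pattern, sketched below in Python. The `Filter` class, `make_filter`, `encode`, and the toy `negate` filter are all hypothetical; in FFmpeg the pass-through role is played by the `null` (video) and `anull` (audio) filters.

```python
class Filter:
    def __init__(self, fn):
        self._fn = fn

    def process(self, frame):
        return self._fn(frame)


def make_filter(description=None):
    """Always return a Filter; with no description, return a pass-through."""
    if description is None:
        return Filter(lambda frame: frame)  # plays the role of the null filter
    if description == "negate":  # hypothetical toy filter
        return Filter(lambda frame: [-v for v in frame])
    raise ValueError(f"unknown filter: {description}")


def encode(frame, filt):
    # No `if filt is None` branch is needed here.
    return bytes(v & 0xFF for v in filt.process(frame))


encoded = encode([1, 2, 3], make_filter())
```

Every frame follows the same path through a filter, so later additions only need to change how the filter is constructed.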

Reviewed By: xiaohui-zhang

Differential Revision: D43593321

fbshipit-source-id: 9ca71c2c8bf652384a0f56b4c41b32d908f61201

fd24af00

Reduce code duplication in VideoOutputStream (#3108) · be3bd1ac

Moto Hira authored Feb 27, 2023

Summary:
Pull Request resolved: https://github.com/pytorch/audio/pull/3108

- Introduce process_frame method
- De-dupe validation logic

Reviewed By: xiaohui-zhang

Differential Revision: D43632390

fbshipit-source-id: 76b7ca0beb725acf686269c877a62e1256921b28

be3bd1ac

27 Feb, 2023 2 commits

Add SquimObjectiveBundle to prototype (#3103) · 46fae2fe

Zhaoheng Ni authored Feb 27, 2023

Summary:
Add pre-trained pipeline support for the `SquimObjective` model. The pre-trained model is trained on the DNS 2020 challenge dataset.

Pull Request resolved: https://github.com/pytorch/audio/pull/3103

Reviewed By: xiaohui-zhang, mthrok

Differential Revision: D43611794

Pulled By: nateanl

fbshipit-source-id: 0ac76a27e7027a43ffccb158385ddb2409b8526d

46fae2fe

Move OutputStream init logic and simplify interface (#3105) · bc61f109

Moto Hira authored Feb 27, 2023

Summary:
Pull Request resolved: https://github.com/pytorch/audio/pull/3105

Refactor the construction of Audio/VideoOutputStream

Reviewed By: nateanl

Differential Revision: D43613013

fbshipit-source-id: 0e112cb1bab2658be68a368099ed00ef318ea4f1

bc61f109