Commits · fa59855f9535c9ebbbbdb83e65a29578e89e0b68 · OpenDAS / Torchaudio

23 May, 2023 1 commit

Fix cuda test failure (#3363) · fa59855f

Zhaoheng Ni authored May 23, 2023

Summary:
Fix https://github.com/pytorch/audio/issues/3361

When adding FunctionalCUDAOnlyTest, the class should inherit from `TestBaseMixin` instead of `Functional`

Pull Request resolved: https://github.com/pytorch/audio/pull/3363

Reviewed By: atalman, osalpekar

Differential Revision: D46112084

Pulled By: nateanl

fbshipit-source-id: 67c6472fda98cb718e0fc53ab248beda745feab5

22 May, 2023 1 commit

Fix CPU kernel of forced_align function (#3354) · 8a893fb3

Zhaoheng Ni authored May 21, 2023

Summary:
Pull Request resolved: https://github.com/pytorch/audio/pull/3354

When start == 0, the first item, not the S-th item, of row t in backPtr_a should be 0.

Reviewed By: xiaohui-zhang

Differential Revision: D46059971

fbshipit-source-id: 89933134878513034eae033764b19f8562f24cb8

20 May, 2023 1 commit

[audio][PR] Add forced_align function to torchaudio (#3348) · e7935cff

Zhaoheng Ni authored May 19, 2023

Summary:
Pull Request resolved: https://github.com/pytorch/audio/pull/3348

The pull request adds a CTC-based forced alignment function that supports both CPU and CUDA devices. The function takes the CTC emissions and target labels as inputs and generates the corresponding labels for each frame.
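
As an illustration of what CTC forced alignment computes, the following is a minimal pure-Python Viterbi sketch, not the torchaudio kernel. The function name `forced_align_sketch`, its list-based inputs, and the toy emissions are assumptions made for this example; it takes per-frame log-probabilities and target labels and returns one label (blank included) per frame.

```python
import math


def forced_align_sketch(log_probs, targets, blank=0):
    """Return the best CTC path as one label per frame (blank included).

    log_probs: list of T frames, each a list of per-class log-probabilities.
    targets:   list of target label indices (must not contain `blank`).
    """
    num_frames = len(log_probs)
    # Extended sequence: blank, t1, blank, t2, ..., blank
    ext = [blank]
    for tok in targets:
        ext += [tok, blank]
    num_states = len(ext)

    neg_inf = -math.inf
    score = [[neg_inf] * num_states for _ in range(num_frames)]
    back = [[0] * num_states for _ in range(num_frames)]

    # A path may start in the leading blank or in the first target token.
    score[0][0] = log_probs[0][ext[0]]
    if num_states > 1:
        score[0][1] = log_probs[0][ext[1]]

    for t in range(1, num_frames):
        for s in range(num_states):
            # Stay in the same state, or advance from the previous one.
            candidates = [(score[t - 1][s], s)]
            if s >= 1:
                candidates.append((score[t - 1][s - 1], s - 1))
            # Skipping a blank is allowed only between two different tokens.
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                candidates.append((score[t - 1][s - 2], s - 2))
            best, prev = max(candidates)
            if best > neg_inf:
                score[t][s] = best + log_probs[t][ext[s]]
                back[t][s] = prev

    # A valid path ends in the trailing blank or in the last target token.
    if num_states == 1:
        ends = [0]
    else:
        ends = [num_states - 1, num_states - 2]
    s = max(ends, key=lambda i: score[num_frames - 1][i])
    if score[num_frames - 1][s] == neg_inf:
        raise ValueError("targets are too long to align with the emissions")

    # Follow the back-pointers from the last frame to the first.
    path = [blank] * num_frames
    for t in range(num_frames - 1, -1, -1):
        path[t] = ext[s]
        s = back[t][s]
    return path


# Toy emissions: 4 frames, 3 classes (index 0 is the blank).
probs = [[0.1, 0.8, 0.1], [0.1, 0.8, 0.1], [0.8, 0.1, 0.1], [0.1, 0.1, 0.8]]
emission = [[math.log(p) for p in frame] for frame in probs]
alignment = forced_align_sketch(emission, targets=[1, 2])  # [1, 1, 0, 2]
```

Repeated target tokens are only reachable through an intermediate blank, which is why the skip transition checks that the two neighbouring tokens differ.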

Reviewed By: vineelpratap, xiaohui-zhang

Differential Revision: D45867265

fbshipit-source-id: 3e25b06bf9bc8bb1bdcdc08de7f4434d912154cb

17 May, 2023 1 commit

Add 420p10le CPU support to StreamReader (#3332) · c12f4734

moto authored May 16, 2023

Summary:
This commit adds support for decoding the YUV420P010LE format.

The image tensor returned for this format has
- NCHW format (C == 3)
- int16 type
- value range [0, 2^10).

Note that the value range is different from what the "hevc_cuvid" decoder
returns. The "hevc_cuvid" decoder uses the full range of int16 (internally,
it's uint16) to express the color (with some intervals), but the values
returned by the CPU "hevc" decoder are within [0, 2^10).
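
To make the value range concrete, here is a small sketch in plain Python that rescales 10-bit samples in [0, 2^10) to 8-bit values or to floats in [0, 1]. The helper names are hypothetical and are not part of torchaudio.

```python
BIT_DEPTH = 10
MAX_VALUE = (1 << BIT_DEPTH) - 1  # 1023, the largest value in [0, 2^10)


def to_8bit(values):
    """Drop the two least significant bits to map [0, 1023] onto [0, 255]."""
    return [v >> (BIT_DEPTH - 8) for v in values]


def to_unit_range(values):
    """Scale 10-bit samples to floats in [0.0, 1.0]."""
    return [v / MAX_VALUE for v in values]


samples = [0, 512, 1023]
eight_bit = to_8bit(samples)         # [0, 128, 255]
normalized = to_unit_range(samples)  # starts at 0.0 and ends at 1.0
```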

Address https://github.com/pytorch/audio/issues/3331

Pull Request resolved: https://github.com/pytorch/audio/pull/3332

Reviewed By: hwangjeff

Differential Revision: D45925097

Pulled By: mthrok

fbshipit-source-id: 4e669b65c030f388bba2fdbb8f00faf7e2981508

10 May, 2023 2 commits

[BC-Breaking] Switch to the backend dispatcher (#3241) · 4463fbdf

moto authored May 10, 2023

Summary:
This commit makes the code use the backend dispatcher by default. Enabling the backend dispatcher gives the FFmpeg-based I/O implementation higher priority (if the corresponding FFmpeg is available), and allows individual function calls to specify the backend.
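
The selection logic described above can be sketched as follows. This is an illustration only: `select_backend` and `BACKEND_PRIORITY` are hypothetical names, and the order of the backends after FFmpeg is an assumption, since the commit message only states that FFmpeg comes first.

```python
# Assumed priority order; only "ffmpeg first" is stated in the commit message.
BACKEND_PRIORITY = ["ffmpeg", "sox", "soundfile"]


def select_backend(available, backend=None):
    """Pick a backend name from the set of available backends.

    If `backend` is given, it is used as an explicit per-call choice.
    Otherwise the first available backend in priority order is returned.
    """
    if backend is not None:
        if backend not in available:
            raise RuntimeError(f"Requested backend '{backend}' is not available")
        return backend
    for name in BACKEND_PRIORITY:
        if name in available:
            return name
    raise RuntimeError("No I/O backend is available")


default_choice = select_backend({"sox", "ffmpeg"})                  # "ffmpeg"
explicit_choice = select_backend({"sox", "ffmpeg"}, backend="sox")  # "sox"
```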

See also https://github.com/pytorch/audio/issues/2950

Pull Request resolved: https://github.com/pytorch/audio/pull/3241

Reviewed By: hwangjeff

Differential Revision: D44709068

Pulled By: mthrok

fbshipit-source-id: 43aac3433f78a681df6669e9ac46e8ecf3beb1be

[BC-Breaking] Update InverseMelScale solution (#3280) · 5a85a461

Zhaoheng Ni authored May 09, 2023

Summary:
Address https://github.com/pytorch/audio/issues/2643

- Replace `SGD` optimization with `torch.linalg.lstsq`, which is much faster.
- Add autograd test for `InverseMelScale`
- Update other tests

Pull Request resolved: https://github.com/pytorch/audio/pull/3280

Reviewed By: hwangjeff

Differential Revision: D45679988

Pulled By: nateanl

fbshipit-source-id: a42e8bff9dc0f38e47e0482fd8a2aad902eedd59

09 May, 2023 1 commit

Fix batch consistency test for InverseBarkScale (#3322) · 51cc1cbf

Zhaoheng Ni authored May 09, 2023

Summary:
The batch consistency test function should call `InverseBarkScale` instead of `InverseMelScale`.

Pull Request resolved: https://github.com/pytorch/audio/pull/3322

Reviewed By: mthrok

Differential Revision: D45691769

Pulled By: nateanl

fbshipit-source-id: 4a1ed80c4a56c3a847a49a8d02f8b5cbe4f09045

05 May, 2023 1 commit

Add SpecAugment transform (#3309) · 82febc59

Xiaohui Zhang authored May 05, 2023

Summary:
(2/2 of the previous https://github.com/pytorch/audio/pull/2360 which I accidentally closed)

The previous way of doing SpecAugment via Frequency/TimeMasking transforms has the following problems:
- Only zero masking can be done; masking by mean value is not supported.
- mask_along_axis is hard-coded to mask the 1st dimension and mask_along_axis_iid is hard-coded to mask the 2nd or 3rd dimension of the input tensor.
- For 3D spectrogram tensors where the first dimension is batch or channel, features from the same batch or different channels have to use the same mask, since mask_along_axis_iid only supports 4D tensors due to the above hard-coding.
- For 2D spectrogram tensors w/o a batch or channel dimension, Time/Frequency masking can't be applied at all, since mask_along_axis only supports 3D tensors due to the above hard-coding.
- It's not straightforward to apply multiple time/frequency masks with the current design. If we need N masks along the time/frequency axis, we need to sequentially apply N Frequency/TimeMasking transforms to the input tensors, and such an API is inconvenient. We need to introduce a separate SpecAugment transform to handle this.

To solve these issues, here we
- [done in the previous [PR](https://github.com/pytorch/audio/pull/3289)] Extend mask_along_axis_iid to support 3D+ tensors and mask_along_axis to support 2D+ tensors. Now both of them are able to mask one of the last two dimensions (where the time or frequency dimension lives) of the input tensor.
- [done in this PR] Introduce the SpecAugment transform.
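
To illustrate the idea of applying several time and frequency masks in one call, with either zero or mean-value masking, here is a minimal pure-Python sketch on a 2D (frequency x time) list. It is not the torchaudio implementation; the function names and argument names are chosen for this example only.

```python
import random


def mask_axis(spec, n_masks, max_width, axis, fill, rng):
    """Return a copy of a 2D spectrogram (freq x time) with `n_masks` masks.

    axis=0 masks whole frequency rows, axis=1 masks whole time columns.
    """
    out = [row[:] for row in spec]
    size = len(out) if axis == 0 else len(out[0])
    for _ in range(n_masks):
        # Each mask gets a random width (possibly 0) and a random start.
        width = rng.randint(0, min(max_width, size))
        start = rng.randint(0, size - width)
        for i in range(start, start + width):
            if axis == 0:
                out[i] = [fill] * len(out[i])
            else:
                for row in out:
                    row[i] = fill
    return out


def spec_augment_sketch(spec, n_time_masks, time_mask_param,
                        n_freq_masks, freq_mask_param,
                        zero_masking=True, seed=None):
    """Apply several time masks, then several frequency masks."""
    rng = random.Random(seed)
    if zero_masking:
        fill = 0.0
    else:
        # Mask with the mean value of the whole spectrogram instead of zero.
        flat = [v for row in spec for v in row]
        fill = sum(flat) / len(flat)
    out = mask_axis(spec, n_time_masks, time_mask_param, 1, fill, rng)
    return mask_axis(out, n_freq_masks, freq_mask_param, 0, fill, rng)


spec = [[float(10 * f + t + 1) for t in range(8)] for f in range(4)]
augmented = spec_augment_sketch(spec, 2, 3, 2, 1, zero_masking=True, seed=0)
mean_masked = spec_augment_sketch(spec, 2, 3, 2, 1, zero_masking=False, seed=0)
```

A single call replaces the chain of N separate masking transforms that the previous design required.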

Pull Request resolved: https://github.com/pytorch/audio/pull/3309

Reviewed By: nateanl

Differential Revision: D45592926

Pulled By: xiaohui-zhang

fbshipit-source-id: 97cd686dbb6c1c6ff604716b71a876e616aaf1a2

04 May, 2023 1 commit

Extend mask_along_axis{,_iid} (#3289) · 74bd971a

Xiaohui Zhang authored May 04, 2023

Summary:
(1/2 of the previous [PR](https://github.com/pytorch/audio/pull/2360) which I accidentally closed)

The previous way of doing SpecAugment via Frequency/TimeMasking transforms has the following problems:
- Only zero masking can be done; masking by mean value is not supported.
- mask_along_axis is hard-coded to mask the 1st dimension and mask_along_axis_iid is hard-coded to mask the 2nd or 3rd dimension of the input tensor.
- For 3D spectrogram tensors where the first dimension is batch or channel, features from the same batch or different channels have to use the same mask, since mask_along_axis_iid only supports 4D tensors due to the above hard-coding.
- For 2D spectrogram tensors w/o a batch or channel dimension, Time/Frequency masking can't be applied at all, since mask_along_axis only supports 3D tensors due to the above hard-coding.
- It's not straightforward to apply multiple time/frequency masks with the current design.

To solve these issues, here we
- Extend mask_along_axis_iid to support 3D tensors and mask_along_axis to support 2D tensors. Now both of them are able to mask one of the last two dimensions (where the time or frequency dimension lives) of the input tensor.

The introduction of SpecAugment transform will be done in another PR.

Pull Request resolved: https://github.com/pytorch/audio/pull/3289

Reviewed By: hwangjeff

Differential Revision: D45460357

Pulled By: xiaohui-zhang

fbshipit-source-id: 91bf448294799f13789d96a13d4bae2451461ef3

28 Apr, 2023 1 commit

Add cuctc decoder (#3096) · 0a1801ed

Yuekai Zhang authored Apr 28, 2023

Summary:
This PR implements a CUDA-based CTC prefix beam search decoder.

Several benchmark results on a V100 are attached below:

| decoder type | model | dataset | decoding time (secs) | beam size | batch size | model unit | subsampling times | vocab size |
|---|---|---|---|---|---|---|---|---|
| cuctc | conformer nemo | dev clean | 7.68 | 8 | 32 | bpe | 4 | 1000 |
| cuctc | conformer nemo | dev clean (sort by length) | 1.6 | 8 | 32 | bpe | 4 | 1000 |
| cuctc | wav2vec2.0 torchaudio | dev clean | 22 | 10 | 1 | char | 2 | 29 |
| cuctc | conformer espnet | aishell1 test | 5 | 10 | 24 | char | 4 | 4233 |

Note:
1. The design parallelizes computation over the batch and vocab axes and loops over the frame axis. It is therefore better suited to shorter sequences and larger vocab sizes than the CPU implementations.
2. WER is the same as the CPU implementations. However, it cannot decode with an LM yet.

Resolves: https://github.com/pytorch/audio/issues/2957.

Pull Request resolved: https://github.com/pytorch/audio/pull/3096

Reviewed By: nateanl

Differential Revision: D44709397

Pulled By: mthrok

fbshipit-source-id: 3078c54a2b44dc00eb4a81b4c657487eeff8c155

12 Apr, 2023 2 commits

Allow overwrite temp data in ffmpeg test (#3263) · cc7b8bd4

moto authored Apr 11, 2023

Summary:
When `TORCHAUDIO_TEST_TEMP_DIR` is set,
all the unit test temporary data are stored in the given directory.
Running unit tests multiple times reuses the
directory, so the temporary files from the
previous test runs are found there.

The FFmpeg save test writes reference data to the
temporary directory, but it is not given the
overwrite flag ("-y"), so it fails in such cases.

This commit fixes that.
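
As a rough sketch of the fix, a test helper that builds the `ffmpeg` command line can add the `-y` flag so that an existing output file is overwritten instead of causing a failure. The helper name and the example arguments are hypothetical; only the `-y` flag itself comes from the commit message.

```python
def build_ffmpeg_command(input_args, output_path, overwrite=True):
    """Build an argv list for ffmpeg, optionally allowing overwrite."""
    command = ["ffmpeg", "-hide_banner"]
    if overwrite:
        # Without "-y", ffmpeg refuses to replace an existing output file
        # when it cannot ask for confirmation.
        command.append("-y")
    command.extend(input_args)
    command.append(output_path)
    return command


# Example: generate a one-second sine wave as reference data.
command = build_ffmpeg_command(
    ["-f", "lavfi", "-i", "sine=frequency=440:duration=1"],
    "reference.wav",
)
```

The resulting list can be passed to `subprocess.run(command, check=True)` on a machine where `ffmpeg` is installed.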

Pull Request resolved: https://github.com/pytorch/audio/pull/3263

Reviewed By: hwangjeff

Differential Revision: D44859003

Pulled By: mthrok

fbshipit-source-id: 2db92fbdec1c015455f3779e10a18f7f1146166b

Specify backend directly in test (#3262) · 563e409c

moto authored Apr 11, 2023

Summary:
Preparation to land https://github.com/pytorch/audio/pull/3241

This commit applies a patch to make the sox_io TorchScript test pass when the dispatcher is enabled.

Pull Request resolved: https://github.com/pytorch/audio/pull/3262

Reviewed By: hwangjeff

Differential Revision: D44897513

Pulled By: mthrok

fbshipit-source-id: 9b65f705cd02324328a2bc1c414aa4b7ca0fed32

05 Apr, 2023 1 commit

Fix path-like object support in FFmpeg dispatcher (#3243) · d69e8857

moto authored Apr 05, 2023

Summary:
In dispatcher mode, the FFmpeg backend does not handle path-like objects, and the C++ implementation raises an error.

This commit fixes it by normalizing path-like objects to strings.
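
A minimal sketch of such a normalization step, using only the standard library, is shown below. The function name `normalize_uri` is hypothetical; it converts `str` and `os.PathLike` inputs to a plain string and leaves anything else (for example an open file object) untouched.

```python
import io
import os
from pathlib import Path


def normalize_uri(uri):
    """Convert path-like inputs to str before handing them to native code."""
    if isinstance(uri, (str, os.PathLike)):
        # os.fspath returns the underlying path of a path-like object.
        return os.fspath(uri)
    return uri


as_string = normalize_uri(Path("data") / "clip.wav")
file_obj = io.BytesIO(b"")
untouched = normalize_uri(file_obj)
```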

Pull Request resolved: https://github.com/pytorch/audio/pull/3243

Reviewed By: nateanl

Differential Revision: D44719280

Pulled By: mthrok

fbshipit-source-id: 9dae459e2a5fb4992b4ef53fe4829fe8c35b2edd

03 Apr, 2023 1 commit

Fix virtual function issue with CTC decoder (#3230) · 0c1e3253

moto authored Apr 03, 2023

Summary:
Currently, creating a CTCDecoder object by passing a language model to
the `lm` argument without assigning it to a variable elsewhere causes
`RuntimeError: Tried to call pure virtual function "LM::start"`.

According to discussions on PyBind11 (
https://github.com/pybind/pybind11/discussions/4013 and
https://github.com/pybind/pybind11/pull/2839
), this is because the Python object has been garbage-collected by the
time it is used by code implemented in C++. The C++ code attempts to call
methods defined in Python, which override the base pure virtual
function, but the object that provides this override has been
deleted by the garbage collector, as the original object is not
reference counted.

This commit fixes this by simply assigning the given `lm` object
as an attribute of the CTCDecoder class.
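
The lifetime problem can be imitated in pure Python. In the sketch below, all class names are hypothetical, and a `weakref` stands in for the C++ side, which uses the object without keeping it alive. Storing the object as an attribute, as the fix does, keeps it reachable.

```python
import gc
import weakref


class LM:
    """Stand-in for a Python class that overrides a C++ virtual method."""

    def start(self):
        return "start"


class NativeDecoder:
    """Stand-in for the C++ decoder: uses the LM but does not own it."""

    def __init__(self, lm):
        self._lm_ref = weakref.ref(lm)

    def decode(self):
        lm = self._lm_ref()
        if lm is None:
            # Mirrors the failure mode described in the commit message.
            raise RuntimeError('Tried to call pure virtual function "LM::start"')
        return lm.start()


class DecoderWithoutFix:
    def __init__(self, lm):
        self._native = NativeDecoder(lm)

    def decode(self):
        return self._native.decode()


class DecoderWithFix(DecoderWithoutFix):
    def __init__(self, lm):
        super().__init__(lm)
        # Keeping a strong reference prevents the LM from being collected.
        self.lm = lm


fixed = DecoderWithFix(LM())
result = fixed.decode()  # works: the decoder itself keeps the LM alive
```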

Address https://github.com/pytorch/audio/issues/3218

Pull Request resolved: https://github.com/pytorch/audio/pull/3230

Reviewed By: hwangjeff

Differential Revision: D44642989

Pulled By: mthrok

fbshipit-source-id: a90af828c7c576bc0eb505164327365ebaadc471

01 Apr, 2023 1 commit

Add AudioEffector (#3163) · a4036248

moto authored Mar 31, 2023

Summary:
This commit adds a new feature AudioEffector, which can be used to
apply various effects and codecs to waveforms in Tensor.

Under the hood it uses StreamWriter and StreamReader to apply
filters and encode/decode.

This is going to replace the deprecated `apply_codec` and
`apply_sox_effect_tensor` functions.

It can also perform online, chunk-by-chunk filtering.

Tutorial to follow.

closes https://github.com/pytorch/audio/issues/3161

Pull Request resolved: https://github.com/pytorch/audio/pull/3163

Reviewed By: hwangjeff

Differential Revision: D44576660

Pulled By: mthrok

fbshipit-source-id: 2c5cc87082ab431315d29d56d6ac9efaf4cf7aeb

30 Mar, 2023 2 commits

Support encode spec change in StreamWriter (#3207) · 1b648626

moto authored Mar 30, 2023

Summary:
This commit adds support for changing the spec of media
(such as sample rate, #channels, image size and frame rate)
on-the-fly at encoding time.

The motivation behind this addition is that certain media
formats support only a limited number of specs, and it is
cumbersome to require client code to change the spec
every time.

For example, OPUS supports only 48kHz sampling rate, and
vorbis only supports stereo.

To make it easy to work with media of different formats,
this commit makes it so that anything that's not compatible
with the format is automatically converted, and allows
users to specify the override.

A notable implementation detail is that, for sample format and
pixel format, the default value of the encoder takes precedence
over the source value, while for other attributes like sample rate and
#channels, the source value takes precedence as long as
it is supported.
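
The precedence rules can be sketched as two small resolver functions. These are illustrative only: the function names are hypothetical, and falling back to the closest supported sample rate is an assumption of this sketch, since the commit message does not say which supported value is picked.

```python
def resolve_sample_rate(src_rate, supported_rates, override=None):
    """Pick the sample rate to encode with.

    The source value wins as long as the encoder supports it.
    An empty `supported_rates` means the encoder has no restriction.
    """
    if override is not None:
        if supported_rates and override not in supported_rates:
            raise ValueError(f"{override} Hz is not supported by the encoder")
        return override
    if not supported_rates or src_rate in supported_rates:
        return src_rate
    # Assumption: fall back to the supported rate closest to the source.
    return min(supported_rates, key=lambda rate: abs(rate - src_rate))


def resolve_sample_fmt(src_fmt, encoder_default, override=None):
    """Pick the sample format; the encoder default wins over the source."""
    if override is not None:
        return override
    return encoder_default if encoder_default is not None else src_fmt


opus_rate = resolve_sample_rate(44100, [48000])         # 48000
kept_rate = resolve_sample_rate(16000, [8000, 16000])   # 16000
sample_fmt = resolve_sample_fmt("flt", "s16")           # "s16"
```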

Pull Request resolved: https://github.com/pytorch/audio/pull/3207

Reviewed By: nateanl

Differential Revision: D44439622

Pulled By: mthrok

fbshipit-source-id: 09524f201d485d201150481884a3e9e4d2aab081

Support changing the number of channels in StreamReader (#3216) · 4bc4ca75

moto authored Mar 29, 2023

Summary:
This commit adds `num_channels` argument,
which allows one to change the number of channels on-the-fly.

Pull Request resolved: https://github.com/pytorch/audio/pull/3216

Reviewed By: hwangjeff

Differential Revision: D44516925

Pulled By: mthrok

fbshipit-source-id: 3e5a11b3fdbb19071f712a8148e27aff60341df3

29 Mar, 2023 1 commit

Reduce io tests (#3217) · 09ccf7cc

Moto Hira authored Mar 29, 2023

Summary:
Pull Request resolved: https://github.com/pytorch/audio/pull/3217

This commit removes some tests for file-like objects from the StreamWriter test.

The rationale is that what happens after the output file is opened is the
same for file-like objects and regular files. Things like filter-graph and
encoder format changes do not affect how the encoded binary is written.

Reviewed By: hwangjeff

Differential Revision: D44518626

fbshipit-source-id: 821ec20deca92e5e5c85bf4d47997eed51735374

28 Mar, 2023 1 commit

Add additional filter graph option to StreamWriter (#3194) · 715eb34a

moto authored Mar 28, 2023

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3194

Reviewed By: hwangjeff

Differential Revision: D44283910

Pulled By: mthrok

fbshipit-source-id: 49125724896bf7190ec27f056b6bfef260019f8e

27 Mar, 2023 1 commit

Revise encoder config arg and docstrings (#3203) · b1de9f1a

hwangjeff authored Mar 27, 2023

Summary:
For `StreamWriter`,
* Renames arg `config` to `codec_config`.
* Renames struct `EncodingConfig` and dataclass `EncodeConfig` to `CodecConfig`.
* Adds docstrings for arg `codec_config`.
* Updates `chunk` to `frames` in `write_*_chunk` methods.

Pull Request resolved: https://github.com/pytorch/audio/pull/3203

Reviewed By: mthrok

Differential Revision: D44350153

Pulled By: hwangjeff

fbshipit-source-id: 1b940b1366a43ec0565c362bfcbf62744088b343

25 Mar, 2023 1 commit

Properly set #samples passed to encoder (#3204) · d8a37a21

moto authored Mar 25, 2023

Summary:
Some audio encoders expect a specific, exact number of samples, as described in `AVCodecContext.frame_size`.

The `AVFrame.nb_samples` is set for the frames passed to `AVFilterGraph`,
but frames coming out of the graph do not necessarily have the same number of samples.

This causes issues with encoding OPUS (among others).

This commit fixes it by inserting `asetnsamples` into the filter graph if a fixed number of samples is requested.
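
Conceptually, the inserted filter regroups samples into frames of a fixed size. The sketch below imitates that behaviour on plain Python lists; it is not FFmpeg code, and the function name is hypothetical. Padding the last frame with zeros follows the default behaviour of `asetnsamples`.

```python
def set_n_samples(chunks, frame_size, pad=True):
    """Regroup variable-length chunks into frames of `frame_size` samples."""
    buffered = []
    frames = []
    for chunk in chunks:
        buffered.extend(chunk)
        # Emit as many complete frames as the buffer currently holds.
        while len(buffered) >= frame_size:
            frames.append(buffered[:frame_size])
            buffered = buffered[frame_size:]
    if buffered:
        if pad:
            # Fill the final, incomplete frame with silence.
            buffered = buffered + [0] * (frame_size - len(buffered))
        frames.append(buffered)
    return frames


frames = set_n_samples([[1, 2, 3], [4, 5], [6, 7, 8, 9]], frame_size=4)
# [[1, 2, 3, 4], [5, 6, 7, 8], [9, 0, 0, 0]]
```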

Note:
It turned out that FFmpeg 4.1 has an issue with OPUS encoding. It does not properly discard some samples.
We should probably move the minimum required FFmpeg to 4.2, but I am not sure if we can enforce it via ABI.
A workaround would be to issue a warning if encoding OPUS with 4.1. (follow-up)

Pull Request resolved: https://github.com/pytorch/audio/pull/3204

Reviewed By: nateanl

Differential Revision: D44374668

Pulled By: mthrok

fbshipit-source-id: 10ef5333dc0677dfb83c8e40b78edd8ded1b21dc

23 Mar, 2023 3 commits

Support YUV444P in GPU decoder (#3199) · 3240de92

moto authored Mar 23, 2023

Summary:
With the support of CUDA filter in https://github.com/pytorch/audio/issues/3183, it is now possible to change the pixel format of CUDA frame.

This commit adds conversion for YUV444P format.

Pull Request resolved: https://github.com/pytorch/audio/pull/3199

Reviewed By: hwangjeff

Differential Revision: D44323928

Pulled By: mthrok

fbshipit-source-id: 6d9b205e7235df5f21e7d3e06166b3a169f1ae9f

Add SquimSubjective pre-trained pipeline (#3197) · 68fa1d3f

Zhaoheng Ni authored Mar 23, 2023

Summary:
The PR adds the pre-trained pipeline for the `SquimSubjective` model, which predicts the MOS score for speech enhancement tasks.

Pull Request resolved: https://github.com/pytorch/audio/pull/3197

Reviewed By: mthrok

Differential Revision: D44313244

Pulled By: nateanl

fbshipit-source-id: 905095ff77006e9f441faa826fc25d9d8681e8aa

Set "experimental" automatically when using native opus/vorbis encoder (#3192) · bf1214a9

moto authored Mar 23, 2023

Summary:
The OPUS and VORBIS encoders require the "strict=experimental" flag. This commit enables it automatically.

The rationale is that we typically care whether we can encode these formats at all, not how they are encoded. (This might become a concern when these encoders become more mature on the FFmpeg side and providing the flag results in unexpected behavior.)

Also, when writing high-level functions that use StreamWriter, if we do not set this flag, these high-level functions have to add new options to pass down to StreamWriter, which turned out to be very painful in https://github.com/pytorch/audio/issues/3163

Pull Request resolved: https://github.com/pytorch/audio/pull/3192

Reviewed By: nateanl

Differential Revision: D44275089

Pulled By: mthrok

fbshipit-source-id: 74a757b4b7fc8467c8c88ffcb54fbaf89d6e4384

22 Mar, 2023 1 commit

Fix oscillator bank test (#3196) · aa590a1b

moto authored Mar 22, 2023

Summary:
Follow up of https://github.com/pytorch/audio/pull/3083

Pull Request resolved: https://github.com/pytorch/audio/pull/3196

Reviewed By: nateanl

Differential Revision: D44308940

Pulled By: mthrok

fbshipit-source-id: e3ef27656e74c28ae78b767517d8e0ba3a9ac4a6

21 Mar, 2023 2 commits

Add SquimSubjective Model (#3189) · a8a16238

Zhaoheng Ni authored Mar 21, 2023

Summary:
Add the model architecture and factory functions for `SquimSubjective`, which predicts subjective evaluation metric scores (e.g. MOS) for speech enhancement tasks.

Pull Request resolved: https://github.com/pytorch/audio/pull/3189

Reviewed By: mthrok

Differential Revision: D44267255

Pulled By: nateanl

fbshipit-source-id: f8060398b14c625b38ea1bb2417f61aeaec3f1db

Use keyword arguments for librosa.filters.mel in HiFiGAN unit test (#3185) · 9640757f

Zhaoheng Ni authored Mar 21, 2023

Summary:
In librosa 0.10 release, positional arguments are deprecated (see https://github.com/librosa/librosa/pull/1521 for details). The PR fixes the HiFiGAN unit test by using keyword arguments for `librosa.filters.mel` function.

Pull Request resolved: https://github.com/pytorch/audio/pull/3185

Reviewed By: mthrok

Differential Revision: D44218852

Pulled By: nateanl

fbshipit-source-id: 6171f7bec6a2144917697c1d640e701d95ec60d7

20 Mar, 2023 1 commit

Support CUDA frame in FilterGraph (#3183) · c5b96558

moto authored Mar 20, 2023

Summary:
This commit adds CUDA frame support to FilterGraph.

It initializes and attaches a CUDA frames context to FilterGraph,
so that CUDA frames can be processed in FilterGraph.

As a result, it enables
1. CUDA filter support such as `scale_cuda`
2. Proper retrieval of the pixel format coming out of FilterGraph when
   CUDA HW acceleration is enabled. (currently it is reported as "cuda")

Resolves https://github.com/pytorch/audio/issues/3159

Pull Request resolved: https://github.com/pytorch/audio/pull/3183

Reviewed By: hwangjeff

Differential Revision: D44183722

Pulled By: mthrok

fbshipit-source-id: 522d21039c361ddfaa87fa89cf49c19d210ac62f

17 Mar, 2023 1 commit

Add EncodingConfig (#3179) · 9bb35070

moto authored Mar 16, 2023

Summary:
Adds config object `EncodingConfig` and modifies `StreamWriter` to allow for passing in additional encoder configuration parameters, e.g. bit rate and compression level.

Pull Request resolved: https://github.com/pytorch/audio/pull/3179

Pull Request resolved: https://github.com/pytorch/audio/pull/3164

Reviewed By: mthrok

Differential Revision: D43861413

Pulled By: hwangjeff

fbshipit-source-id: c1682cb2f6e682ab6f1a506511d2be7c7b254161

16 Mar, 2023 1 commit

Refactor Tensor conversion in StreamReader (#3170) · 014d7140

moto authored Mar 15, 2023

Summary:
Currently, when the Buffer converts AVFrame* to torch::Tensor,
it checks the format each time a frame is passed, and then
performs the conversion.

This commit changes it so that the conversion operation is
pre-instantiated at the time the output stream is configured.

It introduces Converter implementations for various formats,
and uses templates to embed them in the Buffer class.
This way, branching like if/switch is eliminated from
the decoding path.
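
The original change is implemented with C++ templates; the following Python analogue only illustrates the idea of choosing the converter once at configuration time instead of branching on the format for every frame. The class name, converter functions, and format keys are hypothetical.

```python
def _from_u8(samples):
    # Unsigned 8-bit audio is centred at 128.
    return [(v - 128) / 128 for v in samples]


def _from_s16(samples):
    return [v / 32768 for v in samples]


def _from_flt(samples):
    return [float(v) for v in samples]


CONVERTERS = {"u8": _from_u8, "s16": _from_s16, "flt": _from_flt}


class BufferSketch:
    """Buffer whose conversion routine is fixed when it is configured."""

    def __init__(self, sample_fmt):
        if sample_fmt not in CONVERTERS:
            raise ValueError(f"Unsupported sample format: {sample_fmt}")
        # The format is inspected exactly once, here.
        self._convert = CONVERTERS[sample_fmt]
        self.chunks = []

    def push(self, raw_frame):
        # No per-frame format check: just call the pre-selected converter.
        self.chunks.append(self._convert(raw_frame))


buffer = BufferSketch("s16")
buffer.push([0, 16384, -32768])  # stored as [0.0, 0.5, -1.0]
```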

Pull Request resolved: https://github.com/pytorch/audio/pull/3170

Reviewed By: xiaohui-zhang

Differential Revision: D44048293

Pulled By: mthrok

fbshipit-source-id: 30d8b240a5695d7513f499ce17853f2f0ffcab9f

15 Mar, 2023 1 commit

Fix MFCC autograd test (#3169) · ee0b97f2

Zhaoheng Ni authored Mar 14, 2023

Summary:
Autograd test randomly fails for MFCC transform. Fix it by increasing `nondet_tol` to `1e-10`.

Pull Request resolved: https://github.com/pytorch/audio/pull/3169

Reviewed By: xiaohui-zhang, mthrok

Differential Revision: D44069673

Pulled By: nateanl

fbshipit-source-id: addafefe381104e778b09bfbaafb322df1d9054c

08 Mar, 2023 2 commits

Include format information after filter (#3155) · 146195d8

moto authored Mar 08, 2023

Summary:
This commit adds fields to OutputStream, which show the result
of filters, such as width and height after filtering.

Before

```
OutputStream(
    source_index=0,
    filter_description='fps=3,scale=width=320:height=320,format=pix_fmts=gray')
```

After

```
OutputVideoStream(
    source_index=0,
    filter_description='fps=3,scale=width=320:height=320,format=pix_fmts=gray',
    media_type='video',
    format='gray',
    width=320,
    height=320,
    frame_rate=3.0)
```

Pull Request resolved: https://github.com/pytorch/audio/pull/3155

Reviewed By: nateanl

Differential Revision: D43882399

Pulled By: mthrok

fbshipit-source-id: 620676b1a06f293fdd56de8203a11120f228fa2d

Support overwriting PTS in StreamWriter (#3135) · 8d2f6f8d

moto authored Mar 08, 2023

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3135

Reviewed By: xiaohui-zhang

Differential Revision: D43724273

Pulled By: mthrok

fbshipit-source-id: 9b52823618948945a26e57d5b3deccbf5f9268c1

07 Mar, 2023 3 commits

Use deterministic algorithms for filtfilt autograd tests (#3150) · 1923be04

Zhaoheng Ni authored Mar 07, 2023

Summary:
`filtfilt` function uses `lfilter`, which calls `conv_1d` operation internally. `conv_1d` is expected to have autograd test failures (see https://pytorch.org/docs/stable/generated/torch.use_deterministic_algorithms.html). The PR uses deterministic algorithms in the autograd tests to make `filtfilt` related tests pass.

Pull Request resolved: https://github.com/pytorch/audio/pull/3150

Reviewed By: mthrok

Differential Revision: D43872977

Pulled By: nateanl

fbshipit-source-id: c3d6ec281f34db8a7092526ccb245797bf2338da

Fix LFCC autograd test (#3154) · 67a49f3c

Zhaoheng Ni authored Mar 07, 2023

Summary:
The autograd test randomly failed on GPU Linux machines. Increase `nondet_tol` to make it pass.

Pull Request resolved: https://github.com/pytorch/audio/pull/3154

Reviewed By: mthrok

Differential Revision: D43873028

Pulled By: nateanl

fbshipit-source-id: a6668c47967a085e5eafb00e2dd4e61b2b46412e

Raise an error if StreamWriter is not opened (#3152) · 502d5811

Moto Hira authored Mar 07, 2023

Summary:
Pull Request resolved: https://github.com/pytorch/audio/pull/3152

In StreamWriter, if the destination is not opened when attempting to write data, it causes a segmentation fault.
This commit adds a guard so that, instead of a segfault, it raises an error.
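
The guard pattern can be sketched in Python as follows. The class and its error message are hypothetical stand-ins, not the StreamWriter implementation; the point is that the open state is checked before any data is touched.

```python
class WriterSketch:
    """Toy writer that refuses to write before `open` is called."""

    def __init__(self):
        self._is_open = False
        self.written = []

    def open(self):
        self._is_open = True

    def close(self):
        self._is_open = False

    def write(self, chunk):
        # Guard: fail with a clear error instead of using an
        # uninitialized destination.
        if not self._is_open:
            raise RuntimeError("Output is not opened. Call `open` first.")
        self.written.append(chunk)


writer = WriterSketch()
try:
    writer.write([0.0, 0.1])
    error_message = None
except RuntimeError as err:
    error_message = str(err)

writer.open()
writer.write([0.0, 0.1])
writer.close()
```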

Reviewed By: nateanl

Differential Revision: D43852649

fbshipit-source-id: aef5db7c1508f8a7db5834c2ab6de3cad09f9d60

02 Mar, 2023 1 commit

Fix PTS regression (#3131) · fbf05f28

Moto Hira authored Mar 01, 2023

Summary:
Pull Request resolved: https://github.com/pytorch/audio/pull/3131

In https://github.com/pytorch/audio/pull/3122, the intermediate `num_frames` variable
is removed.

PTS can be incremented the same way, but the timing was wrong in #3122.
This commit fixes it.

Reviewed By: xiaohui-zhang

Differential Revision: D43712046

fbshipit-source-id: 2fe0082969296f4f3964e62e55b5325fcd45f4f9

01 Mar, 2023 1 commit

Fix windows tests (#3119) · 6a4a8200

Zhaoheng Ni authored Mar 01, 2023

Summary:
`sox` is not available on Windows machines. Add skip decorators to the sox related tests to skip running tests on Windows.
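
A skip decorator of this kind can be sketched with the standard `unittest` module. The decorator name and the test class below are hypothetical and do not reproduce the helpers used in the torchaudio test suite.

```python
import io
import sys
import unittest

IS_WINDOWS = sys.platform.startswith("win")

# Reusable decorator: skips the decorated test when running on Windows.
skip_on_windows = unittest.skipIf(IS_WINDOWS, "sox is not available on Windows")


class SoxEffectTest(unittest.TestCase):
    @skip_on_windows
    def test_sox_dependent_path(self):
        # Placeholder for a test body that would require sox.
        self.assertTrue(True)


def run_suite():
    """Run the example test case quietly and return the result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(SoxEffectTest)
    runner = unittest.TextTestRunner(stream=io.StringIO(), verbosity=0)
    return runner.run(suite)
```

On Windows the test is reported as skipped rather than failed, so the overall run still succeeds.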

Pull Request resolved: https://github.com/pytorch/audio/pull/3119

Reviewed By: mthrok

Differential Revision: D43682754

Pulled By: nateanl

fbshipit-source-id: f69987dac8232a3569be83f096b32389bd8bda81

27 Feb, 2023 1 commit

Add SquimObjectiveBundle to prototype (#3103) · 46fae2fe

Zhaoheng Ni authored Feb 27, 2023

Summary:
Add pre-trained pipeline support for `SquimObjective` model. The pre-trained model is trained on DNS 2020 challenge dataset.

Pull Request resolved: https://github.com/pytorch/audio/pull/3103

Reviewed By: xiaohui-zhang, mthrok

Differential Revision: D43611794

Pulled By: nateanl

fbshipit-source-id: 0ac76a27e7027a43ffccb158385ddb2409b8526d

25 Feb, 2023 1 commit

Fix unit tests for griffinlim and Spectrogram (#3099) · 75fc9a46

Zhaoheng Ni authored Feb 25, 2023

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3099

Reviewed By: mthrok

Differential Revision: D43596866

Pulled By: nateanl

fbshipit-source-id: 43a139bf8ebdf3261414e2855aefc3b53df298ac
