Commits · 6a4a82008560f00225096b004371db4b19dc991a · OpenDAS / Torchaudio

01 Mar, 2023 2 commits

Zhaoheng Ni authored Mar 01, 2023

Summary:
`sox` is not available on Windows machines. Add skip decorators to the sox related tests to skip running tests on Windows.
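
A minimal sketch of what such a skip decorator could look like, using only the standard `unittest` module. The decorator and test names here are hypothetical and are not taken from the torchaudio test suite.

```python
import sys
import unittest

# Hypothetical decorator name; the helper used in the actual test suite may differ.
skip_if_no_sox_on_windows = unittest.skipIf(
    sys.platform == "win32",
    "sox is not available on Windows",
)


class SoxDependentTest(unittest.TestCase):
    @skip_if_no_sox_on_windows
    def test_uses_sox(self):
        # Placeholder body; a real test would exercise a sox-backed function here.
        self.assertTrue(True)
```

On Windows the test is reported as skipped rather than failed; on other platforms it runs normally.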

Pull Request resolved: https://github.com/pytorch/audio/pull/3119

Reviewed By: mthrok

Differential Revision: D43682754

Pulled By: nateanl

fbshipit-source-id: f69987dac8232a3569be83f096b32389bd8bda81

6a4a8200

Remove redundant device arg from VideoOutputStream constructor (#3121) · af493e4e

Moto Hira authored Feb 28, 2023

Summary:
Pull Request resolved: https://github.com/pytorch/audio/pull/3121

After careful review, it turned out that the device arg in the VideoOutputStream
constructor and related helper functions can be replaced with a check of
AVCodecContext::pix_fmt == AV_PIX_FMT_CUDA.

Reviewed By: xiaohui-zhang

Differential Revision: D43677801

fbshipit-source-id: f8f34f1aed46e223b44250d39cccc4cd26ecb458

28 Feb, 2023 3 commits

Decouple image conversion and OutputStream class (#3113) · 2381beec

Moto Hira authored Feb 28, 2023

Summary:
Pull Request resolved: https://github.com/pytorch/audio/pull/3113

Decouple the Tensor to AVFrame conversion process from the encoding process.

Reviewed By: nateanl

Differential Revision: D43628942

fbshipit-source-id: e698f3150292567dbc23e7d6795ad58265f24780

Use null filter in case no filter is used (#3109) · fd24af00

Moto Hira authored Feb 27, 2023

Summary:
Pull Request resolved: https://github.com/pytorch/audio/pull/3109

Change the logic around StreamWriter preprocessing.
Currently, the absence of preprocessing is expressed by setting the `unique_ptr<FilterGraph>` to `nullptr`.

This commit changes it to the `[a]null` filter, which is just a pass-through.
This makes the code a bit simpler, and serves as better preparation for adding
filters for CUDA processing.
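
The idea can be sketched in Python (the actual change is in the C++ sources, which are not shown here). The function name and the `media_type` strings are hypothetical; `anull` and `null` are the FFmpeg pass-through filters for audio and video respectively.

```python
from typing import Optional


def resolve_filter_desc(media_type: str, filter_desc: Optional[str] = None) -> str:
    """Return a filter description, falling back to a pass-through filter.

    Hypothetical helper: callers always receive a usable description, so the
    downstream code never has to branch on "no filter graph".
    """
    if filter_desc:
        return filter_desc
    if media_type == "audio":
        return "anull"  # FFmpeg audio pass-through filter
    if media_type == "video":
        return "null"  # FFmpeg video pass-through filter
    raise ValueError(f"Unsupported media type: {media_type!r}")
```

With this shape, every stream goes through a filter graph, and the "no preprocessing" case is simply a graph that does nothing.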

Reviewed By: xiaohui-zhang

Differential Revision: D43593321

fbshipit-source-id: 9ca71c2c8bf652384a0f56b4c41b32d908f61201

Reduce code duplication in VideoOutputStream (#3108) · be3bd1ac

Moto Hira authored Feb 27, 2023

Summary:
Pull Request resolved: https://github.com/pytorch/audio/pull/3108

- Introduce process_frame method
- De-dupe validation logic

Reviewed By: xiaohui-zhang

Differential Revision: D43632390

fbshipit-source-id: 76b7ca0beb725acf686269c877a62e1256921b28

27 Feb, 2023 5 commits

Add SquimObjectiveBundle to prototype (#3103) · 46fae2fe

Zhaoheng Ni authored Feb 27, 2023

Summary:
Add pre-trained pipeline support for the `SquimObjective` model. The pre-trained model is trained on the DNS 2020 challenge dataset.

Pull Request resolved: https://github.com/pytorch/audio/pull/3103

Reviewed By: xiaohui-zhang, mthrok

Differential Revision: D43611794

Pulled By: nateanl

fbshipit-source-id: 0ac76a27e7027a43ffccb158385ddb2409b8526d

Move OutputStream init logic and simplify interface (#3105) · bc61f109

Moto Hira authored Feb 27, 2023

Summary:
Pull Request resolved: https://github.com/pytorch/audio/pull/3105

Refactor the construction of Audio/VideoOutputStream

Reviewed By: nateanl

Differential Revision: D43613013

fbshipit-source-id: 0e112cb1bab2658be68a368099ed00ef318ea4f1

Split Audio/VideoOutputStream source (#3106) · 5b0580ae

Moto Hira authored Feb 27, 2023

Summary:
Pull Request resolved: https://github.com/pytorch/audio/pull/3106

Refactor Audio/VideoOutputStream.

Reviewed By: nateanl

Differential Revision: D43613008

fbshipit-source-id: 36c62fe00903066982573866d07de4e79b34240d

Extract Encoder from OutputStream (#3104) · 5cac8de3

Moto Hira authored Feb 27, 2023

Summary:
Pull Request resolved: https://github.com/pytorch/audio/pull/3104

Continuation of StreamWriter refactoring

This commit extracts Encoder (+muxer) from OutputStream

Reviewed By: nateanl

Differential Revision: D43610887

fbshipit-source-id: 30a9862b1aabd5af331ce3f33a5815df1decbad1

Refactor StreamWriter and extract encoding process (#3100) · 23231033

Moto Hira authored Feb 27, 2023

Summary:
Pull Request resolved: https://github.com/pytorch/audio/pull/3100

Refactor StreamWriter and move OutputStream to a dedicated source, then
split it into separate audio/video classes.

Reviewed By: nateanl

Differential Revision: D43587337

fbshipit-source-id: 0fdbd1f56a7200dc6849e95eb9678854f5d933b8

25 Feb, 2023 1 commit

Fix unit tests for griffinlim and Spectrogram (#3099) · 75fc9a46

Zhaoheng Ni authored Feb 25, 2023

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3099

Reviewed By: mthrok

Differential Revision: D43596866

Pulled By: nateanl

fbshipit-source-id: 43a139bf8ebdf3261414e2855aefc3b53df298ac

24 Feb, 2023 5 commits

Add Wav2Vec2DataModule in self_supervised_learning training recipe (#3081) · fd778091

Vladislav Agafonov authored Feb 24, 2023

Summary:
Add `Wav2Vec2DataModule` in self_supervised_learning training recipe to support Wav2Vec2 pre-training.

Pull Request resolved: https://github.com/pytorch/audio/pull/3081

Reviewed By: mthrok

Differential Revision: D43579239

Pulled By: nateanl

fbshipit-source-id: 3e935eb9a18ef0259a58940ae466cbdc3baf8494

Add wav2vec2 loss function in self_supervised_learning training recipe (#3090) · c532f35c

Vladislav Agafonov authored Feb 24, 2023

Summary:
Add wav2vec2 loss function in the self_supervised_learning training recipe to support Wav2Vec2 pre-training.

Pull Request resolved: https://github.com/pytorch/audio/pull/3090

Reviewed By: mthrok

Differential Revision: D43579220

Pulled By: nateanl

fbshipit-source-id: 4b52792b518ddc5b01c9660c90ceb3c4ad1f0237

Cleanup ffmpeg bindings (#3095) · b46628ba

moto authored Feb 24, 2023

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3095

Reviewed By: nateanl

Differential Revision: D43544998

Pulled By: mthrok

fbshipit-source-id: 4359cdbbdbee53084016a84129cb3d65900b0457

Bind StreamReader/Writer with PyBind11 (#3091) · b012b452

moto authored Feb 24, 2023

Summary:
This commit is a cleanup and preparation for future
development.

We plan to pass around more complicated objects among
StreamReader and StreamWriter, and TorchBind is not expressive enough
for defining intermediate objects, so we use PyBind11 for binding
StreamWriter.

Pull Request resolved: https://github.com/pytorch/audio/pull/3091

Reviewed By: xiaohui-zhang

Differential Revision: D43515714

Pulled By: mthrok

fbshipit-source-id: 9097bb104bbf8c1536a5fab6f87447c08b10a7f2

Use autosummary for torchaudio.prototype.models documentation (#3084) · f6d1bc96

Zhaoheng Ni authored Feb 24, 2023

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3084

Reviewed By: mthrok

Differential Revision: D43550150

Pulled By: nateanl

fbshipit-source-id: 5c5e3d9461e375be202493e3399ff38ce5cd7690

23 Feb, 2023 5 commits

Replace c10::Dict with std::map in StreamReader/Writer (#3092) · c3310018

moto authored Feb 23, 2023

Summary:
This commit is a cleanup and preparation for future development.

We plan to pass around more complicated objects among StreamReader and StreamWriter, and TorchBind is not expressive enough for defining intermediate objects, so we want to use PyBind11 for binding StreamReader/Writer.

PyBind11 converts Python dict into std::map, while TorchBind converts it into c10::Dict. Because of this discrepancy, conversion from c10::Dict to std::map has to happen in multiple places, and this makes the binding code thicker because it requires extra wrapper methods.

Using std::map reduces the number of wrapper methods / conversions, because the same method can be bound for file-like objects and the others.

Pull Request resolved: https://github.com/pytorch/audio/pull/3092

Reviewed By: nateanl

Differential Revision: D43524808

Pulled By: mthrok

fbshipit-source-id: f7467c66ccd37dbf4abc337bbb18ffaac21a0058

Add TCPGen context-biasing Conformer RNN-T (#2890) · 1ed330b5

G. Sun authored Feb 23, 2023

Summary:
This commit adds the implementation of the tree-constrained pointer generator (TCPGen) for contextual biasing.

An example for Librispeech can be found in audio/examples/asr/librispeech_biasing.

Maintainer's note (mthrok):
It seems that TrieNode would be better typed as a tuple, but changing the implementation from list to tuple
without running the code could cause issues, so the code is left unchanged, though the annotation uses tuple.

Pull Request resolved: https://github.com/pytorch/audio/pull/2890

Reviewed By: nateanl

Differential Revision: D43171447

Pulled By: mthrok

fbshipit-source-id: 372bb077d997d720401dbf2dbfa131e6a958e37e

Remove Tensor binding from StreamReader (#3093) · d3c9295c

mthrok authored Feb 23, 2023

Summary:
Remove the Tensor input support from StreamReader

Follow up of https://github.com/pytorch/audio/pull/3086

Pull Request resolved: https://github.com/pytorch/audio/pull/3093

Reviewed By: xiaohui-zhang

Differential Revision: D43526066

Pulled By: mthrok

fbshipit-source-id: 57ba4866c413649173e1c2c3b23ba7de3231b7bc

Deprecate the use of Tensor as a means of passing byte string (#3086) · a26c2f27

moto authored Feb 22, 2023

Summary:
The same functionality can be achieved by passing io.BytesIO to the constructor.
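
A small sketch of the replacement pattern: wrap the encoded bytes in `io.BytesIO` and hand that file-like object to the constructor. The helper name `as_file_like` and the placeholder bytes are hypothetical, and the `StreamReader` call is left as a comment because it needs torchaudio built with FFmpeg.

```python
import io
from typing import BinaryIO, Union


def as_file_like(src: Union[bytes, bytearray, BinaryIO]) -> BinaryIO:
    """Wrap raw bytes in a file-like object; pass file-like objects through."""
    if isinstance(src, (bytes, bytearray)):
        return io.BytesIO(bytes(src))
    return src


# Placeholder bytes standing in for an encoded media file held in memory.
encoded = b"RIFF....WAVE"
fileobj = as_file_like(encoded)

# With torchaudio and FFmpeg installed, the object could then be passed on:
# reader = torchaudio.io.StreamReader(fileobj)
```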

Pull Request resolved: https://github.com/pytorch/audio/pull/3086

Reviewed By: nateanl

Differential Revision: D43500360

Pulled By: mthrok

fbshipit-source-id: 2c6f37d100f50553b283c75c04fe57c8f9c07dc9

Update CTCDecoder static build deprecation message (#3089) · 3b75b74f

moto authored Feb 22, 2023

Summary:
1. Fix spacing.
2. Move it to after successful import.
3. Add link to the announcement issue.
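
The second item can be illustrated with a generic sketch: emit the warning only once the import has succeeded, so users who cannot load the module see the import error alone. The function name and the use of `UserWarning` are assumptions for illustration, not the actual torchaudio code.

```python
import importlib
import warnings


def import_then_warn(module_name: str, message: str):
    """Import a module and warn only if the import succeeded."""
    # If this raises ImportError, the warning below is never emitted.
    module = importlib.import_module(module_name)
    warnings.warn(message, UserWarning, stacklevel=2)
    return module
```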

Pull Request resolved: https://github.com/pytorch/audio/pull/3089

Reviewed By: nateanl, xiaohui-zhang

Differential Revision: D43514075

Pulled By: mthrok

fbshipit-source-id: 3b2a24c65c63dab8c12c9c6aa1942a8354b2c0f1

22 Feb, 2023 3 commits

Rename SQUIM_OBJECTIVE model to SquimObjective (#3087) · b0155938

Zhaoheng Ni authored Feb 22, 2023

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3087

Reviewed By: xiaohui-zhang, mthrok

Differential Revision: D43509865

Pulled By: nateanl

fbshipit-source-id: 569cc2ee8edd9de0b7d255a1e1075ac812b26cc8

Fix ConformerWav2Vec2PretrainModel (#3085) · b35a5fcf

Zhaoheng Ni authored Feb 22, 2023

Summary:
Negative sampling should be applied to the unmasked features at the masked indices. This PR fixes the logic in ConformerWav2Vec2PretrainModel.

Pull Request resolved: https://github.com/pytorch/audio/pull/3085

Reviewed By: mthrok

Differential Revision: D43488570

Pulled By: nateanl

fbshipit-source-id: 3820400d50b74216bb98ca6a40dc6a7acca01564

Add objective metric estimation model for speech enhancement (#3042) · 3267c7ed

Zhaoheng Ni authored Feb 21, 2023

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3042

Reviewed By: mthrok

Differential Revision: D43405932

Pulled By: nateanl

fbshipit-source-id: 88f6dabae35565b699230e9909b8f68f4a57f5c7

21 Feb, 2023 1 commit

Fix contiguous error when backpropagating through lfilter (#3080) · 6ab1325a

Chin-Yun Yu authored Feb 21, 2023

Summary:
I encountered the following errors when using the filter with gradients being enabled.

```sh
Traceback (most recent call last):
  File "/home/ycy/working/audio/test_backward.py", line 20, in <module>
    loss.backward()
  File "/home/ycy/miniconda3/envs/nightly/lib/python3.10/site-packages/torch/_tensor.py", line 488, in backward
    torch.autograd.backward(
  File "/home/ycy/miniconda3/envs/nightly/lib/python3.10/site-packages/torch/autograd/__init__.py", line 197, in backward
    Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
RuntimeError: Expected input_signal_windows.is_contiguous() && a_coeff_flipped.is_contiguous() && padded_output_waveform.is_contiguous() to be true, but got false.  (Could this error message be improved?  If so, please report an enhancement request to PyTorch.)
```
This can happen if the output of lfilter is used by other operations.

### How to reproduce
The following script can reproduce the error on the stable and nightly versions.

```python
import torch
import torch.nn.functional as F
from torchaudio.functional import lfilter

a = torch.rand(250, 26, requires_grad=True)
b = torch.ones(250, 26, requires_grad=True)
x = torch.rand(250, 1024, requires_grad=True)
w = torch.eye(1024).unsqueeze(1)

y = lfilter(x, a, b, False)
y = F.conv_transpose1d(
    y.t().unsqueeze(0),
    w,
    stride=256,
).squeeze()
print(y.shape)
target = torch.ones_like(y)
loss = torch.nn.functional.mse_loss(y, target)
loss.backward()
```

### Cause

The inner call of differentiable IIR in the backward pass needs to ensure the input is contiguous. Adding a `contiguous()` call solves the problem.

Pull Request resolved: https://github.com/pytorch/audio/pull/3080

Reviewed By: xiaohui-zhang

Differential Revision: D43466612

Pulled By: mthrok

fbshipit-source-id: 375e0a147988656da47ac8397f7de6eae512a655

17 Feb, 2023 3 commits

Make lengths optional for speed functions and modules (#3072) · 5af309d3

hwangjeff authored Feb 16, 2023

Summary:
Makes lengths input optional for `torchaudio.functional.speed`, `torchaudio.transforms.Speed`, and `torchaudio.transforms.SpeedPerturbation`.

Pull Request resolved: https://github.com/pytorch/audio/pull/3072

Reviewed By: nateanl, mthrok

Differential Revision: D43371406

Pulled By: hwangjeff

fbshipit-source-id: ecb38bcc2bfff5c5a396a37eff238b22238e795a

Add py3.11 to windows nightly conda (#3071) · e663095c

atalman authored Feb 16, 2023

Summary:
Same as: https://github.com/pytorch/vision/pull/7263

Pull Request resolved: https://github.com/pytorch/audio/pull/3071

Reviewed By: weiwangmeta

Differential Revision: D43377741

Pulled By: atalman

fbshipit-source-id: 0dbe0aaa10b9a4bad713563e98642b1a65e9ac07

Add precondition check for contiguous emissions tensor (#3074) · 06b1cc9d

Daniel Walker authored Feb 16, 2023

Summary:
This PR adds a precondition check to the `CTCDecoder` that raises a helpful exception when called on a noncontiguous emissions tensor.

Currently, noncontiguous tensors can be passed into the CTCDecoder, which in turn passes the tensors to the backing Flashlight C++ library and results in undefined behavior, since Flashlight requires the tensors to be laid out in contiguous memory. The following code demonstrates the problem:

```python
import torch
from torchaudio.models.decoder import ctc_decoder

tokens = ['a', '-', '|']
decoder = ctc_decoder(lexicon=None, tokens=tokens)

emissions = torch.rand(len(tokens), 2)  # N x T contiguous
emissions = emissions.t()  # T x N noncontiguous

batch = emissions.unsqueeze(0)
result = decoder(batch)  # undefined behavior!!!
```

I stumbled on the issue accidentally when I noticed the decoder wasn't giving the expected results on my input only to realize, finally, that the tensor I had passed in was noncontiguous. In my case, Flashlight was iterating over unrelated segments of memory where it had expected to find a contiguous tensor. A precondition check will hopefully save others from making the same mistake.
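
A minimal sketch of the kind of precondition check described above. The function name and error message are hypothetical; the only thing assumed about the argument is that it exposes `is_contiguous()`, as `torch.Tensor` does.

```python
def check_emissions_contiguous(emissions):
    """Raise early instead of handing a noncontiguous tensor to native code.

    Hypothetical helper: `emissions` is expected to behave like a
    torch.Tensor, i.e. to provide an `is_contiguous()` method.
    """
    if not emissions.is_contiguous():
        raise RuntimeError(
            "emissions must be contiguous; call emissions.contiguous() first"
        )
    return emissions
```

In the reproduction script above, `emissions.t()` yields a noncontiguous view, so a check like this would fail fast; calling `.contiguous()` on the transposed tensor produces a compact copy that passes.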

Pull Request resolved: https://github.com/pytorch/audio/pull/3074

Reviewed By: nateanl, xiaohui-zhang

Differential Revision: D43376011

Pulled By: mthrok

fbshipit-source-id: 7c95aa8016d8f9f2d65b5b816a859b28ea4629f5

16 Feb, 2023 5 commits

Add guards to prevent ffmpeg failures during dispatcher import (#3073) · 85f8fc54

hwangjeff authored Feb 16, 2023

Summary:
With the introduction of the backend dispatcher, importing torchaudio fails when ffmpeg is not available. This PR adds guards to resolve these failures.
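
One common way to write such a guard is sketched below: attempt to load the optional dependency once, record whether it worked, and raise only when the feature is actually used. All names here are hypothetical and the loader is a stand-in that simulates a missing FFmpeg; this is not the code from the PR.

```python
def try_load(loader):
    """Run a loader for an optional dependency; return None if it is unavailable."""
    try:
        return loader()
    except (ImportError, OSError, RuntimeError):
        return None


def _load_ffmpeg_extension():
    # Stand-in for loading a native extension; simulates FFmpeg being absent.
    raise ImportError("FFmpeg libraries not found")


# Evaluated at import time, but never raises.
_ffmpeg_ext = try_load(_load_ffmpeg_extension)
FFMPEG_AVAILABLE = _ffmpeg_ext is not None


def get_ffmpeg_extension():
    """Fail at use time, with a clear message, instead of at import time."""
    if not FFMPEG_AVAILABLE:
        raise RuntimeError("FFmpeg backend requested but FFmpeg is not available")
    return _ffmpeg_ext
```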

Pull Request resolved: https://github.com/pytorch/audio/pull/3073

Reviewed By: NivekT, mthrok

Differential Revision: D43372870

Pulled By: hwangjeff

fbshipit-source-id: 7f6c2795430d7aeb742c2feb97984d5273f20aac

Fix DDP training in HuBERT recipes (#3068) · 2c9b3e59

Zhaoheng Ni authored Feb 16, 2023

Summary:
The `BucketizeBatchSampler` may return a different iter_list on different nodes if `shuffle` is `True`, which causes DDP training to hang forever.
`shuffle` in `DistributedSampler` only happens at initialization, which means it assigns the same subset to each replica in all training epochs. The PR fixes both issues.
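
Both problems can be avoided by deriving the shuffle from a shared seed plus the epoch number, so that every replica computes the same order within an epoch and a new order in the next one. The sketch below illustrates that idea in plain Python; the function names are hypothetical and this is not necessarily how the PR implements the fix.

```python
import random


def epoch_order(num_batches: int, seed: int, epoch: int) -> list:
    """Shuffle batch indices deterministically from (seed, epoch)."""
    order = list(range(num_batches))
    # Every rank uses the same seed and epoch, so every rank gets the same order.
    random.Random(seed + epoch).shuffle(order)
    return order


def batches_for_rank(num_batches, seed, epoch, rank, world_size):
    """Give each rank a disjoint, equally sized slice of the shared order."""
    order = epoch_order(num_batches, seed, epoch)
    # Drop the remainder so all ranks run the same number of steps.
    usable = len(order) - len(order) % world_size
    return order[rank:usable:world_size]
```

Equal step counts across ranks matter because a rank that runs out of batches early stops participating in gradient synchronization while the others wait for it.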

cc arlofaria

Pull Request resolved: https://github.com/pytorch/audio/pull/3068

Reviewed By: mthrok

Differential Revision: D43372110

Pulled By: nateanl

fbshipit-source-id: a162728406ae995e05d2a07cfc2444fb76cf345e

Update WER results for CTC n-gram decoding (#3070) · 11bdafc3

Zhaoheng Ni authored Feb 16, 2023

Summary:
In https://github.com/pytorch/audio/issues/2873, layer normalization is applied to waveforms for SSL models trained on large scale datasets. The word error rate is significantly reduced after the change. The PR updates the results for the affected models.

Without the change in https://github.com/pytorch/audio/issues/2873, here is the WER result table:
|                                                                                            Model | dev-clean | dev-other | test-clean | test-other |
|:------------------------------------------------------------------------------------------------|-----------:|-----------:|-----------:|-----------:|
| [WAV2VEC2_ASR_LARGE_LV60K_10M](https://pytorch.org/audio/main/generated/torchaudio.pipelines.WAV2VEC2_ASR_LARGE_LV60K_10M.html#torchaudio.pipelines.WAV2VEC2_ASR_LARGE_LV60K_10M) |        10.59|        15.62|        9.58|        16.33|
| [WAV2VEC2_ASR_LARGE_LV60K_100H](https://pytorch.org/audio/main/generated/torchaudio.pipelines.WAV2VEC2_ASR_LARGE_LV60K_100H.html#torchaudio.pipelines.WAV2VEC2_ASR_LARGE_LV60K_100H) |        2.80|        6.01|        2.82|        6.34|
| [WAV2VEC2_ASR_LARGE_LV60K_960H](https://pytorch.org/audio/main/generated/torchaudio.pipelines.WAV2VEC2_ASR_LARGE_LV60K_960H.html#torchaudio.pipelines.WAV2VEC2_ASR_LARGE_LV60K_960H) |        2.36|        4.43|        2.41|        4.96|
| [HUBERT_ASR_LARGE](https://pytorch.org/audio/main/generated/torchaudio.pipelines.HUBERT_ASR_LARGE.html#torchaudio.pipelines.HUBERT_ASR_LARGE) |        1.85|        3.46|        2.09|        3.89|
| [HUBERT_ASR_XLARGE](https://pytorch.org/audio/main/generated/torchaudio.pipelines.HUBERT_ASR_XLARGE.html#torchaudio.pipelines.HUBERT_ASR_XLARGE) |         2.21|        3.40|        2.26|        4.05|

After applying layer normalization, here is the updated result:
|                                                                                            Model | dev-clean | dev-other | test-clean | test-other |
|:------------------------------------------------------------------------------------------------|-----------:|-----------:|-----------:|-----------:|
| [WAV2VEC2_ASR_LARGE_LV60K_10M](https://pytorch.org/audio/main/generated/torchaudio.pipelines.WAV2VEC2_ASR_LARGE_LV60K_10M.html#torchaudio.pipelines.WAV2VEC2_ASR_LARGE_LV60K_10M) |        6.77|        10.03|        6.87|        10.51|
| [WAV2VEC2_ASR_LARGE_LV60K_100H](https://pytorch.org/audio/main/generated/torchaudio.pipelines.WAV2VEC2_ASR_LARGE_LV60K_100H.html#torchaudio.pipelines.WAV2VEC2_ASR_LARGE_LV60K_100H) |        2.19|        4.55|        2.32|        4.64|
| [WAV2VEC2_ASR_LARGE_LV60K_960H](https://pytorch.org/audio/main/generated/torchaudio.pipelines.WAV2VEC2_ASR_LARGE_LV60K_960H.html#torchaudio.pipelines.WAV2VEC2_ASR_LARGE_LV60K_960H) |        1.78|        3.51|        2.03|        3.68|
| [HUBERT_ASR_LARGE](https://pytorch.org/audio/main/generated/torchaudio.pipelines.HUBERT_ASR_LARGE.html#torchaudio.pipelines.HUBERT_ASR_LARGE) |        1.77|        3.32|        2.03|        3.68|
| [HUBERT_ASR_XLARGE](https://pytorch.org/audio/main/generated/torchaudio.pipelines.HUBERT_ASR_XLARGE.html#torchaudio.pipelines.HUBERT_ASR_XLARGE) |         1.73|        2.72|        1.90|        3.16|

Pull Request resolved: https://github.com/pytorch/audio/pull/3070

Reviewed By: mthrok

Differential Revision: D43365313

Pulled By: nateanl

fbshipit-source-id: 34a60ad2e5eb1299da64ef88ff0208ec8ec76e91

Add deprecation warning to decoder (#3055) · 6b2086cf

moto authored Feb 16, 2023

Summary:
Flashlight Text decoder is now available on PyPI and KenLM support is being added at
https://github.com/flashlight/text/pull/43

Once this work is merged, we can rely on the official distribution of Flashlight Text package, so we are adding deprecation warning.

Once the decoder is fully available, one can install it with

```
pip install flashlight-text
pip install git+https://github.com/kpu/kenlm.git
```

Pull Request resolved: https://github.com/pytorch/audio/pull/3055

Reviewed By: hwangjeff, nateanl

Differential Revision: D43239150

Pulled By: mthrok

fbshipit-source-id: 728cb208b8403100cd4ccd80c6295d454756b414

Introduce I/O backend dispatcher (#3015) · b799fcd6

hwangjeff authored Feb 16, 2023

Summary:
Adds I/O backend dispatcher that routes I/O requests to FFmpeg, SoX, or Soundfile backend, per library availability. It allows users to specify a backend mapped to a media library, i.e. one of `["ffmpeg", "sox", "soundfile"]`, to use via keyword argument, with FFmpeg being the default. Environment variable `TORCHAUDIO_USE_BACKEND_DISPATCHER` gates enablement of the dispatcher; specifically, if `TORCHAUDIO_USE_BACKEND_DISPATCHER` is explicitly set to `1`, importing TorchAudio makes it accessible via `torchaudio.info`, `torchaudio.load`, and `torchaudio.save`.
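
The routing logic described above can be sketched as follows. This is an illustrative model only: the function names are hypothetical, and the fallback order after FFmpeg (SoX, then Soundfile) is an assumption, since the summary only states that FFmpeg is the default.

```python
import os

# Assumed priority order; only "FFmpeg first" is stated in the summary.
BACKEND_PRIORITY = ["ffmpeg", "sox", "soundfile"]


def dispatcher_enabled(environ=os.environ) -> bool:
    """The dispatcher is active only when the variable is explicitly "1"."""
    return environ.get("TORCHAUDIO_USE_BACKEND_DISPATCHER") == "1"


def select_backend(available, backend=None) -> str:
    """Pick the requested backend, or the first available one by priority."""
    if backend is not None:
        if backend not in BACKEND_PRIORITY:
            raise ValueError(f"Unknown backend: {backend!r}")
        if backend not in available:
            raise RuntimeError(f"Backend {backend!r} is not available")
        return backend
    for name in BACKEND_PRIORITY:
        if name in available:
            return name
    raise RuntimeError("No I/O backend is available")
```

With the dispatcher enabled, a call along the lines of `torchaudio.load(path, backend="sox")` would correspond to `select_backend(available, backend="sox")` here; the keyword name `backend` is assumed from the description.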

Pull Request resolved: https://github.com/pytorch/audio/pull/3015

Reviewed By: mthrok

Differential Revision: D43258649

Pulled By: hwangjeff

fbshipit-source-id: 8f12e4e56b9fa3f0814dd3fed3e1783ab23a53a1

15 Feb, 2023 5 commits

Implement exp sigmoid (#3056) · 9db4bdf1

Cole Li authored Feb 15, 2023

Summary:
Pull Request resolved: https://github.com/pytorch/audio/pull/3056

Task #2 from https://github.com/pytorch/audio/issues/2835

Reviewed By: mthrok

Differential Revision: D42854156

fbshipit-source-id: e1b3bd992c91fedc55f30a814e16efd7c51e0c80

Enable broadcasting for inputs to convolve (#3061) · a49edea5

hwangjeff authored Feb 15, 2023

Summary:
Relaxes input dimension matching constraint on `convolve` to enable broadcasting for inputs.
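
To make the relaxed constraint concrete, the sketch below computes the output shape for two inputs whose leading dimensions only need to be broadcastable rather than identical. It assumes the usual broadcasting rules and a full convolution along the last dimension (output length `N + M - 1`); the helper names are hypothetical and no tensors are involved.

```python
from itertools import zip_longest


def broadcast_leading(a: tuple, b: tuple) -> tuple:
    """Broadcast two shapes, aligning dimensions from the right."""
    out = []
    for da, db in zip_longest(reversed(a), reversed(b), fillvalue=1):
        if da != db and da != 1 and db != 1:
            raise ValueError(f"Shapes {a} and {b} are not broadcastable")
        # A size-1 dimension stretches to match the other operand.
        out.append(db if da == 1 else da)
    return tuple(reversed(out))


def convolve_output_shape(x_shape: tuple, y_shape: tuple) -> tuple:
    """Shape of a full convolution over the last dimension with broadcasting."""
    lead = broadcast_leading(x_shape[:-1], y_shape[:-1])
    return lead + (x_shape[-1] + y_shape[-1] - 1,)
```

For example, inputs of shape `(3, 1, 10)` and `(4, 5)` have leading dimensions `(3, 1)` and `(4,)`, which broadcast to `(3, 4)`.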

Pull Request resolved: https://github.com/pytorch/audio/pull/3061

Reviewed By: mthrok

Differential Revision: D43298078

Pulled By: hwangjeff

fbshipit-source-id: a6cc36674754523b88390fac0a05f06562921319

Add FFmpeg compat save function (#3058) · fb932674

Jeff Hwang authored Feb 15, 2023

Summary:
Pull Request resolved: https://github.com/pytorch/audio/pull/3058

Adds FFmpeg-based save function.

Reviewed By: mthrok

Differential Revision: D43264858

fbshipit-source-id: ae3f89012bc2520f3de11af65348ba8f77f0acff

Update data augmentation tutorial to use new operators (#3062) · b9ef69d1

hwangjeff authored Feb 15, 2023

Summary:
Updates tutorial "Audio Data Augmentation" to use two of the newly introduced data augmentation operators in beta: `torchaudio.functional.fftconvolve` and `torchaudio.functional.add_noise`.

Pull Request resolved: https://github.com/pytorch/audio/pull/3062

Reviewed By: mthrok

Differential Revision: D43298120

Pulled By: hwangjeff

fbshipit-source-id: 09ca736a5c67242568515d600b7d31eab32c2df1

Tweak docs around IO (#3064) · 12e8cb97

moto authored Feb 15, 2023

Summary:
* Mention context manager in StreamWriter
* Add FFmpeg as optional dependency

Pull Request resolved: https://github.com/pytorch/audio/pull/3064

Reviewed By: hwangjeff

Differential Revision: D43307818

Pulled By: mthrok

fbshipit-source-id: 86339d973aba85e090f520e08af65b5d736e3d18

14 Feb, 2023 2 commits

Adding RC triggers for all build jobs (#3057) · b0af1406

Omkar Salpekar authored Feb 14, 2023

Summary:
Add triggers for RC branches and tags to all build workflows. This will ensure that the release-candidate builds will run with `CHANNEL=test`.

Pull Request resolved: https://github.com/pytorch/audio/pull/3057

Reviewed By: atalman

Differential Revision: D43279657

Pulled By: osalpekar

fbshipit-source-id: 5abf3994b9b4a4897f53c540bd1db6c3d624b3e0

Update ssl example (#3060) · ff01be0f

Zhaoheng Ni authored Feb 14, 2023

Summary:
- Rename the current `ssl` example to `self_supervised_learning`
- Add README to demonstrate how to run the recipe with hubert task

Pull Request resolved: https://github.com/pytorch/audio/pull/3060

Reviewed By: mthrok

Differential Revision: D43287868

Pulled By: nateanl

fbshipit-source-id: 10352682485ef147ca32f4c4c9f9cde995444aa0
