Commits · cd80976e0b7d5a4ceab5f777549c2a1f6188a737 · OpenDAS / Torchaudio

07 Aug, 2023 1 commit

Make target_lengths/input_lengths in forced_align optional (#3533) · cd80976e

moto authored Aug 07, 2023

Summary:
Currently, the `torchaudio.functional.forced_align` function requires explicit input and target lengths.
When performing non-batched alignment, these can be inferred from the sizes of the Tensors.
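
The inference can be sketched in plain Python. This is not the code from the PR: the helper name `infer_lengths`, the assumed shape layouts `(batch, time, classes)` and `(batch, target_len)`, and the batch-size-1 restriction are all illustrative assumptions.

```python
def infer_lengths(log_probs_shape, targets_shape, input_lengths=None, target_lengths=None):
    """Fill in missing lengths from shapes (hypothetical helper, not torchaudio code)."""
    batch, num_frames, _num_classes = log_probs_shape  # assumed (batch, time, classes)
    _, num_targets = targets_shape                      # assumed (batch, target_len)
    if (input_lengths is None or target_lengths is None) and batch != 1:
        # Assumption: with padding in a real batch, sizes alone are not enough.
        raise ValueError("lengths can only be inferred when the batch size is 1")
    if input_lengths is None:
        input_lengths = [num_frames]
    if target_lengths is None:
        target_lengths = [num_targets]
    return input_lengths, target_lengths


print(infer_lengths((1, 50, 29), (1, 12)))
```

Explicitly passed lengths are returned unchanged, so callers that already supply them are unaffected.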

Pull Request resolved: https://github.com/pytorch/audio/pull/3533

Reviewed By: nateanl

Differential Revision: D48111041

Pulled By: mthrok

fbshipit-source-id: fbf07124d3959c5cc5533dcd86296851587082fb

cd80976e

31 Jul, 2023 1 commit

Migrate torch.norm to torch.linalg.vector_norm (#3522) · 8a2e12d3

moto authored Jul 31, 2023

Summary:
torch.norm is now deprecated.
The usages in torchaudio all appear to be vector norms, so this commit replaces them with `torch.linalg.vector_norm`.

Resolves https://github.com/pytorch/audio/issues/3484

Pull Request resolved: https://github.com/pytorch/audio/pull/3522

Reviewed By: huangruizhe

Differential Revision: D47926659

Pulled By: mthrok

fbshipit-source-id: f7428cf0168109a3d340b8784adc99bb5f781084

8a2e12d3

28 Jul, 2023 1 commit

Amend amp_to_db docstring (#3519) · 61cbf791

moto authored Jul 28, 2023

Summary:
Context: https://github.com/pytorch/audio/issues/3448

The documentation of amplitude_to_DB is ambiguous about how cut-off values are computed when the input tensor is 3D.

This commit clarifies that.
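
For orientation, a minimal plain-Python sketch of what a `top_db` cut-off does on a flat list of values. The function name and parameters here are illustrative assumptions; how the maximum is scoped for a 3D tensor is exactly what the docstring change documents and is not reproduced here.

```python
import math


def amplitude_to_db(values, multiplier=20.0, amin=1e-10, ref=1.0, top_db=None):
    """Convert amplitudes to dB and optionally clamp to `top_db` below the peak."""
    ref_db = multiplier * math.log10(max(ref, amin))
    db = [multiplier * math.log10(max(v, amin)) - ref_db for v in values]
    if top_db is not None:
        # The cut-off is relative to the largest value in the scope considered.
        floor = max(db) - top_db
        db = [max(d, floor) for d in db]
    return db


print(amplitude_to_db([1.0, 0.1, 0.001], top_db=40.0))
```

The result depends on which values share a maximum, which is why the scope of the maximum matters for multi-channel input.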

Closes: https://github.com/pytorch/audio/issues/3448

Pull Request resolved: https://github.com/pytorch/audio/pull/3519

Reviewed By: huangruizhe

Differential Revision: D47875505

Pulled By: mthrok

fbshipit-source-id: e06bb997e7a27e2abe35c8e2ac91ddfbded4e641

61cbf791

25 Jul, 2023 1 commit

Fix typo in melscale_fbank (#3487) · 135cb7ba

moto authored Jul 25, 2023

Summary:
Resolves https://github.com/pytorch/audio/issues/3486

Pull Request resolved: https://github.com/pytorch/audio/pull/3487

Differential Revision: D47724733

Pulled By: mthrok

fbshipit-source-id: 26f5641a8271a7e50c4a33861d09b0c8274b29e4

135cb7ba

12 Jul, 2023 1 commit

Fix resampling to support dynamic input lengths for onnx exports. (#3473) · a3b6bfb6

Bogdan Teleaga authored Jul 12, 2023

Summary:
This is a port of https://github.com/adefossez/julius/pull/17 for torchaudio.

Not sure if it's possible/desirable to add tests to test the functionality of ONNX exports, but I did a quick test on my machine to ensure this works. The logic is a bit simpler compared to the other PR because the torchaudio version does not support the additional flags available in julius.

Pull Request resolved: https://github.com/pytorch/audio/pull/3473

Differential Revision: D47401988

Pulled By: mthrok

fbshipit-source-id: 62fa1e4388923f6a62cef2c0f902a79ea179cec4

a3b6bfb6

11 Jul, 2023 1 commit

Fix doc style (#3468) · 18b20f77

moto authored Jul 11, 2023

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3468

Differential Revision: D47368070

Pulled By: mthrok

fbshipit-source-id: 9b5d57b0cb861a2556a1903121f526f8011a0e2d

18b20f77

05 Jul, 2023 1 commit

Update forced_align method to only support batch Tensors (#3433) · cc164478

Zhaoheng Ni authored Jul 05, 2023

Summary:
Pull Request resolved: https://github.com/pytorch/audio/pull/3433

The current design of forced_align accepts a 2D Tensor for `log_probs` and a 1D Tensor for `targets`. To keep the API simple, this PR changes it to support only batch Tensors (a 3D Tensor for `log_probs` and a 2D Tensor for `targets`).

Reviewed By: mthrok

Differential Revision: D46657526

fbshipit-source-id: af17ec3f92f1a2c46dba91c6db2488a11de36f89

cc164478

13 Jun, 2023 1 commit

Fix build doc (#3435) · 0f682c77

moto authored Jun 13, 2023

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3435

Reviewed By: nateanl

Differential Revision: D46659362

Pulled By: mthrok

fbshipit-source-id: ffa033ad6759de6fd958b63ac51a4a1153ffb45d

0f682c77

07 Jun, 2023 1 commit

Fix style to prep #3414 (#3415) · 47716772

moto authored Jun 07, 2023

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3415

Differential Revision: D46526437

Pulled By: mthrok

fbshipit-source-id: f78d19c19d7e68f67712412de35d9ed50f47263b

47716772

06 Jun, 2023 2 commits

Revert D46126226: Update forced_align method to only support batch Tensors · bbc13b9a

Moto Hira authored Jun 06, 2023

Differential Revision:
D46126226

Original commit changeset: 42cb52b19d91

Original Phabricator Diff: D46126226

fbshipit-source-id: 372b2526d9e196e37e014f1556bf117d29bb1ac6

bbc13b9a

Update forced_align method to only support batch Tensors (#3365) · 5f17d81c

Zhaoheng Ni authored Jun 06, 2023

Summary:
Pull Request resolved: https://github.com/pytorch/audio/pull/3365

Reviewed By: vineelpratap

Differential Revision: D46126226

fbshipit-source-id: 42cb52b19d91bbff7dc040ccf60350545d75b3a2

5f17d81c

02 Jun, 2023 1 commit

[BC-Breaking] Remove compute_kaldi_pitch (#3368) · 5bbbb1d5

moto authored Jun 02, 2023

Summary:
This commit removes compute_kaldi_pitch function and the underlying Kaldi integration from torchaudio.

Kaldi pitch function was added in a short period of time by integrating the original Kaldi implementation, instead of reimplementing it in PyTorch.

The Kaldi integration employed a hack which replaces the base vector/matrix implementation of Kaldi with PyTorch Tensor so that there is only one blas library within torchaudio.

Recently, we have been making torchaudio leaner, and we have not seen wide adoption of the kaldi_pitch feature, so we decided to remove it.

See some of the discussion https://github.com/pytorch/audio/issues/1269

Pull Request resolved: https://github.com/pytorch/audio/pull/3368

Differential Revision: D46406176

Pulled By: mthrok

fbshipit-source-id: ee5e24d825188f379979ddccd680c7323b119b1e

5bbbb1d5

01 Jun, 2023 2 commits

Fix apply_codec to use named file (#3397) · 1dfac469

moto authored Jun 01, 2023

Summary:
Follow-up to https://github.com/pytorch/audio/issues/3386. The intended change was to use the path of a temporary file instead of a file-like object.

Pull Request resolved: https://github.com/pytorch/audio/pull/3397

Reviewed By: hwangjeff

Differential Revision: D46346189

Pulled By: mthrok

fbshipit-source-id: 44da799c6587bcb63a118a6313b7299bad742a40

1dfac469

Update and deprecate apply_codec function (#3386) · d6dd497c

moto authored May 31, 2023

Summary:
To prepare for the upcoming removal of file-like object support from sox_io backend,
this commit changes apply_codec function to use tempfile.

The `apply_codec` function is now deprecated, and users are encouraged to use `torchaudio.io.AudioEffector`.
We will not remove the function itself, but will remove the entry from the doc.
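
The tempfile approach can be sketched with the standard library alone. This is an illustration of the pattern, not the code in `apply_codec`: `process_via_named_file` and `read_back` are hypothetical names, and `read_back` merely stands in for an encoder or decoder that accepts a file path.

```python
import os
import tempfile


def process_via_named_file(payload: bytes, process_path) -> bytes:
    """Write `payload` to a named temporary file and pass its *path* along."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        path = os.path.join(tmp_dir, "audio.bin")
        with open(path, "wb") as f:
            f.write(payload)
        # The callee receives a path string, not a file-like object.
        return process_path(path)


def read_back(path: str) -> bytes:
    # Stand-in for a codec round trip: simply reads the file again.
    with open(path, "rb") as f:
        return f.read()


print(process_via_named_file(b"\x00\x01\x02", read_back))
```

The temporary directory, and the file inside it, is removed when the `with` block exits.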

Pull Request resolved: https://github.com/pytorch/audio/pull/3386

Reviewed By: hwangjeff

Differential Revision: D46330610

Pulled By: mthrok

fbshipit-source-id: 3071bdefa05b4cbb9f00629bef50f0981eae89b4

d6dd497c

24 May, 2023 1 commit

Resolve lint issue on LaTeX (#3366) · 8690e6ec

moto authored May 23, 2023

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3366

Reviewed By: nateanl

Differential Revision: D46136238

Pulled By: mthrok

fbshipit-source-id: 3432f5d007293831bab21460a79ae26b1bbc81a8

8690e6ec

22 May, 2023 1 commit

Update forced_align document (#3357) · c0702338

Zhaoheng Ni authored May 22, 2023

Summary:
- Fix latex formula rendering issue
- Add `devices` and `properties` tags
- Fix grammar

Pull Request resolved: https://github.com/pytorch/audio/pull/3357

Reviewed By: mthrok

Differential Revision: D46068633

Pulled By: nateanl

fbshipit-source-id: 80cb84508396fbcaf81c068228d46a24bb63b975

c0702338

20 May, 2023 1 commit

[audio][PR] Add forced_align function to torchaudio (#3348) · e7935cff

Zhaoheng Ni authored May 19, 2023

Summary:
Pull Request resolved: https://github.com/pytorch/audio/pull/3348

The pull request adds a CTC-based forced alignment function that supports both CPU and CUDA devices. The function takes the CTC emissions and target labels as inputs and generates the corresponding labels for each frame.
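
To make the idea concrete, here is a small plain-Python Viterbi sketch of CTC forced alignment for a single sequence. It is not the code added by the PR; the function name `ctc_forced_align`, the list-of-lists input, and `blank=0` are assumptions made for illustration.

```python
import math


def ctc_forced_align(log_probs, targets, blank=0):
    """Return the most likely per-frame label path that collapses to `targets`."""
    num_frames = len(log_probs)
    # Interleave blanks: blank, t1, blank, t2, ..., blank
    ext = [blank]
    for tok in targets:
        ext += [tok, blank]
    num_states = len(ext)
    neg_inf = -math.inf
    score = [[neg_inf] * num_states for _ in range(num_frames)]
    back = [[0] * num_states for _ in range(num_frames)]
    score[0][0] = log_probs[0][ext[0]]
    if num_states > 1:
        score[0][1] = log_probs[0][ext[1]]
    for t in range(1, num_frames):
        for s in range(num_states):
            cands = [(score[t - 1][s], s)]  # stay in the same state
            if s >= 1:
                cands.append((score[t - 1][s - 1], s - 1))  # advance by one
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                cands.append((score[t - 1][s - 2], s - 2))  # skip a blank
            best, prev = max(cands)
            if best == neg_inf:
                continue  # state not reachable at this frame
            score[t][s] = best + log_probs[t][ext[s]]
            back[t][s] = prev
    # A valid path ends in the final blank or the final token.
    s = num_states - 1
    if num_states > 1 and score[-1][num_states - 2] > score[-1][s]:
        s = num_states - 2
    if score[-1][s] == neg_inf:
        raise ValueError("targets cannot be aligned to this many frames")
    path = [blank] * num_frames
    for t in range(num_frames - 1, -1, -1):
        path[t] = ext[s]
        s = back[t][s]
    return path


emissions = [[math.log(p) for p in row]
             for row in [[0.1, 0.8, 0.1], [0.8, 0.1, 0.1], [0.1, 0.1, 0.8]]]
print(ctc_forced_align(emissions, [1, 2]))
```

Each frame receives either a target label or the blank, and the sequence of labels is constrained to follow the target order.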

Reviewed By: vineelpratap, xiaohui-zhang

Differential Revision: D45867265

fbshipit-source-id: 3e25b06bf9bc8bb1bdcdc08de7f4434d912154cb

e7935cff

04 May, 2023 1 commit

Extend mask_along_axis{,_iid} (#3289) · 74bd971a

Xiaohui Zhang authored May 04, 2023

Summary:
(1/2 of the previous [PR](https://github.com/pytorch/audio/pull/2360) which I accidentally closed)

The previous way of doing SpecAugment via Frequency/TimeMasking transforms has the following problems:
- Only zero masking can be done; masking by mean value is not supported.
- mask_along_axis is hard-coded to mask the 1st dimension and mask_along_axis_iid is hard-coded to mask the 2nd or 3rd dimension of the input tensor.
- For 3D spectrogram tensors where the first dimension is batch or channel, features from the same batch or different channels have to use the same mask, because mask_along_axis_iid only supports 4D tensors due to the above hard-coding.
- For 2D spectrogram tensors without a batch or channel dimension, time/frequency masking can't be applied at all, since mask_along_axis only supports 3D tensors due to the above hard-coding.
- It's not straightforward to apply multiple time/frequency masks with the current design.

To solve these issues, here we
- Extend mask_along_axis_iid to support 3D tensors and mask_along_axis to support 2D tensors. Now both of them are able to mask one of the last two dimensions (where the time or frequency dimension lives) of the input tensor.
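
A rough plain-Python sketch of masking one of the two dimensions of a 2D `(freq, time)` spectrogram follows. The function name, the nested-list representation, and the exact way the span is sampled are assumptions for illustration, not the torchaudio implementation.

```python
import random


def mask_along_axis_2d(spec, mask_param, mask_value, axis, rng=random):
    """Mask a random contiguous span along `axis` of a 2D list of lists."""
    if axis not in (0, 1):
        raise ValueError("axis must be 0 (freq) or 1 (time) for a 2D input")
    size = len(spec) if axis == 0 else len(spec[0])
    if mask_param > size:
        raise ValueError("mask_param must not exceed the size of the masked axis")
    width = int(rng.random() * mask_param)      # span width in [0, mask_param)
    start = int(rng.random() * (size - width))  # chosen so the span fits
    out = [row[:] for row in spec]              # leave the input untouched
    for i, row in enumerate(out):
        for j in range(len(row)):
            idx = i if axis == 0 else j
            if start <= idx < start + width:
                row[j] = mask_value
    return out


spec = [[1.0] * 6 for _ in range(4)]
for row in mask_along_axis_2d(spec, mask_param=4, mask_value=0.0, axis=1):
    print(row)
```

Passing `mask_value` explicitly also covers the mean-value masking case mentioned above, since the caller can supply the mean of the spectrogram.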

The introduction of SpecAugment transform will be done in another PR.

Pull Request resolved: https://github.com/pytorch/audio/pull/3289

Reviewed By: hwangjeff

Differential Revision: D45460357

Pulled By: xiaohui-zhang

fbshipit-source-id: 91bf448294799f13789d96a13d4bae2451461ef3

74bd971a

08 Mar, 2023 1 commit

Fix documentation of functional and transforms (#3134) · 85cb37e2

cai525 authored Mar 08, 2023

Summary:
Address #3101. The documentation for `power=1` should represent magnitude instead of energy.
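
The distinction is easy to check on a single complex STFT bin. The helper name below is hypothetical; it only shows the arithmetic of raising the bin's absolute value to `power`.

```python
def spectrogram_value(stft_bin: complex, power: float) -> float:
    """Return |stft_bin| ** power for one STFT bin."""
    return abs(stft_bin) ** power


z = 3 + 4j
print(spectrogram_value(z, 1.0))  # magnitude
print(spectrogram_value(z, 2.0))  # power (energy)
```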

Pull Request resolved: https://github.com/pytorch/audio/pull/3134

Reviewed By: mthrok

Differential Revision: D43910652

Pulled By: nateanl

fbshipit-source-id: e0768438e819222a5dde6b86c5123ab0e8af59fb

85cb37e2

17 Feb, 2023 1 commit

Make lengths optional for speed functions and modules (#3072) · 5af309d3

hwangjeff authored Feb 16, 2023

Summary:
Makes lengths input optional for `torchaudio.functional.speed`, `torchaudio.transforms.Speed`, and `torchaudio.transforms.SpeedPerturbation`.

Pull Request resolved: https://github.com/pytorch/audio/pull/3072

Reviewed By: nateanl, mthrok

Differential Revision: D43371406

Pulled By: hwangjeff

fbshipit-source-id: ecb38bcc2bfff5c5a396a37eff238b22238e795a

5af309d3

15 Feb, 2023 1 commit

Enable broadcasting for inputs to convolve (#3061) · a49edea5

hwangjeff authored Feb 15, 2023

Summary:
Relaxes input dimension matching constraint on `convolve` to enable broadcasting for inputs.

Pull Request resolved: https://github.com/pytorch/audio/pull/3061

Reviewed By: mthrok

Differential Revision: D43298078

Pulled By: hwangjeff

fbshipit-source-id: a6cc36674754523b88390fac0a05f06562921319

a49edea5

24 Jan, 2023 1 commit

Move data augmentation functions out of prototype (#3001) · 41b88314

hwangjeff authored Jan 23, 2023

Summary:
Moves `add_noise`, `fftconvolve`, `convolve`, `speed`, `preemphasis`, and `deemphasis` out of `torchaudio.prototype.functional` and into `torchaudio.functional`.

Pull Request resolved: https://github.com/pytorch/audio/pull/3001

Reviewed By: mthrok

Differential Revision: D42688971

Pulled By: hwangjeff

fbshipit-source-id: 43280bd3ffeccddae57f1092ac45afb64dd426cc

41b88314

12 Jan, 2023 1 commit

Refactor extension modules initialization (#2968) · 5dfe0b22

mthrok authored Jan 12, 2023

Summary:
* Refactor _extension module so that
  * the implementation of initialization logic and its execution are separated.
    * logic goes to `_extension.utils`
    * the execution is at `_extension.__init__`
    * global variables are defined and modified in `__init__`.
* Replace `is_sox_available()` with `_extension._SOX_INITIALIZED`
* Replace `is_kaldi_available()` with `_extension._IS_KALDI_AVAILABLE`
* Move `requires_sox()` and `requires_kaldi()` to break the circular dependency between `_extension` and `_internal.module_utils`.
* Merge the sox-related initialization logic in `_extension.utils` module.

Pull Request resolved: https://github.com/pytorch/audio/pull/2968

Reviewed By: hwangjeff

Differential Revision: D42387251

Pulled By: mthrok

fbshipit-source-id: 0c3245dfab53f9bc1b8a83ec2622eb88ec96673f

5dfe0b22

16 Dec, 2022 1 commit

Rename resampling_method options (#2922) · e6bebe6a

Caroline Chen authored Dec 16, 2022

Summary:
resolves https://github.com/pytorch/audio/issues/2891

Rename `resampling_method` options to more accurately describe what is happening. Previously the methods were set to `sinc_interpolation` and `kaiser_window`, which can be confusing as both options actually use sinc interpolation methodology, but differ in the window function used. As a result, rename `sinc_interpolation` to `sinc_interp_hann` and `kaiser_window` to `sinc_interp_kaiser`. Using an old option will raise a warning, and those options will be deprecated in 2 releases. The numerical behavior is unchanged.
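
One way such a rename with a warning could look is sketched below. The helper name and the choice of `DeprecationWarning` are assumptions; only the four option names come from the description above.

```python
import warnings

# Old option name -> new option name, as listed in the commit message.
_RENAMED = {
    "sinc_interpolation": "sinc_interp_hann",
    "kaiser_window": "sinc_interp_kaiser",
}
_VALID = set(_RENAMED.values())


def normalize_resampling_method(name: str) -> str:
    """Map an old option to its new name (with a warning) or validate a new one."""
    if name in _RENAMED:
        new_name = _RENAMED[name]
        warnings.warn(
            f'"{name}" is deprecated; use "{new_name}" instead.',
            DeprecationWarning,
            stacklevel=2,
        )
        return new_name
    if name not in _VALID:
        raise ValueError(f"unknown resampling_method: {name!r}")
    return name


print(normalize_resampling_method("sinc_interp_kaiser"))
```

Because old names are only translated, the numerical path taken afterwards is the same for an old name and its replacement.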

Pull Request resolved: https://github.com/pytorch/audio/pull/2922

Reviewed By: mthrok

Differential Revision: D42083619

Pulled By: carolineechen

fbshipit-source-id: 9a9a7ea2d2daeadc02d53dddfd26afe249459e70

e6bebe6a

14 Nov, 2022 1 commit

Move bark spectrogram to prototype (#2843) · 7819f3f6

Caroline Chen authored Nov 14, 2022

Summary:
follow up to https://github.com/pytorch/audio/issues/2823
- move bark spectrogram to prototype
- decrease autograd test tolerance (passing on circle ci)
- add diagram for bark fbanks

cc jdariasl

Pull Request resolved: https://github.com/pytorch/audio/pull/2843

Reviewed By: nateanl

Differential Revision: D41199522

Pulled By: carolineechen

fbshipit-source-id: 8e6c2e20fb7b14f39477683b3c6ed8356359a213

7819f3f6

10 Nov, 2022 1 commit

BarkSpectrogram (#2823) · b326bc49

Julián D. Arias-Londoño authored Nov 10, 2022

Summary:
I have added a BarkScale transform, which can transform a regular Spectrogram into a BarkSpectrogram, similar to MelScale. ahmed-fau opened this request in December 2021 as https://github.com/pytorch/audio/issues/2103. The new functionality includes three different well-known approximations of the Bark scale.
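
The commit message does not name the three approximations. As an illustration only, the sketch below uses three formulas commonly cited for the Hz-to-Bark conversion (Traunmüller, Schroeder, Wang); the function name and option strings are assumptions.

```python
import math


def hz_to_bark(freq_hz: float, approximation: str = "traunmuller") -> float:
    """Convert a frequency in Hz to Bark using one of three common formulas."""
    if approximation == "wang":
        return 6.0 * math.asinh(freq_hz / 600.0)
    if approximation == "schroeder":
        return 7.0 * math.asinh(freq_hz / 650.0)
    if approximation == "traunmuller":
        bark = 26.81 * freq_hz / (1960.0 + freq_hz) - 0.53
        # Low- and high-end corrections from Traunmüller's formulation.
        if bark < 2.0:
            bark += 0.15 * (2.0 - bark)
        elif bark > 20.1:
            bark += 0.22 * (bark - 20.1)
        return bark
    raise ValueError(f"unknown approximation: {approximation!r}")


for name in ("traunmuller", "schroeder", "wang"):
    print(name, round(hz_to_bark(1000.0, name), 3))
```

All three mappings grow roughly linearly at low frequencies and logarithmically at high frequencies, which is what makes them usable for building a filter bank in the same way as the mel scale.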

Pull Request resolved: https://github.com/pytorch/audio/pull/2823

Reviewed By: nateanl

Differential Revision: D41162100

Pulled By: carolineechen

fbshipit-source-id: b2670c4972e49c9ef424da5d5982576f7a4df831

b326bc49

08 Nov, 2022 1 commit

Enable log probs input for rnnt loss (#2798) · ca478823

Caroline Chen authored Nov 08, 2022

Summary:
Add `fused_log_softmax` argument (default/current behavior = True) to rnnt loss.

When it is set to `False`, the caller must apply `log_softmax` to the logits before passing them to the rnnt loss function.

The following should produce the same output:
```
rnnt_loss(logits, targets, logit_lengths, target_lengths, fused_log_softmax=True)
```

```
log_probs = torch.nn.functional.log_softmax(logits, dim=-1)
rnnt_loss(log_probs, targets, logit_lengths, target_lengths, fused_log_softmax=False)
```
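
For reference, `log_softmax` itself is a small computation. The plain-Python version below operates on a single list of logits and is only meant to show what the second snippet applies before calling the loss; it is not how torch implements it.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)  # subtract the max so exp() cannot overflow
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_sum for x in logits]


log_probs = log_softmax([1.0, 2.0, 3.0])
print(sum(math.exp(lp) for lp in log_probs))
```

The exponentiated outputs sum to one (up to rounding), and adding a constant to every logit leaves the result unchanged.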

testing -- unit tests + get same results on the conformer rnnt recipe

Pull Request resolved: https://github.com/pytorch/audio/pull/2798

Reviewed By: xiaohui-zhang

Differential Revision: D41083523

Pulled By: carolineechen

fbshipit-source-id: e15442ceed1f461bbf06b724aa0561ff8827ad61

ca478823

15 Sep, 2022 1 commit

Consolidate bibliography / reference (#2676) · 476ab9ab

moto authored Sep 14, 2022

Summary:
Preparation for the adoption of `autosummary`.

Replace `:footcite:` with `:cite:` and introduce dedicated reference page, as `:footcite:` does not work well with `autosummary`.

Example:

https://output.circle-artifacts.com/output/job/4da47ba6-d9c7-418e-b5b0-e9f8a146a6c3/artifacts/0/docs/datasets.html#cmuarctic

https://output.circle-artifacts.com/output/job/4da47ba6-d9c7-418e-b5b0-e9f8a146a6c3/artifacts/0/docs/references.html

Pull Request resolved: https://github.com/pytorch/audio/pull/2676

Reviewed By: carolineechen

Differential Revision: D39509431

Pulled By: mthrok

fbshipit-source-id: e6003dd01ec3eff3d598054690f61de8ee31ac9a

476ab9ab

16 Aug, 2022 1 commit

Use double quotes for string in functional and transforms (#2618) · 7ac3e2e2

Zhaoheng Ni authored Aug 16, 2022

Summary:
To make the code consistent, we should use double quotation marks for all strings. This PR makes such changes in functional and transforms.

Pull Request resolved: https://github.com/pytorch/audio/pull/2618

Reviewed By: carolineechen

Differential Revision: D38744137

Pulled By: nateanl

fbshipit-source-id: 74213a24d9f66c306cc92019d77dcb2a877f94bd

7ac3e2e2

03 Aug, 2022 1 commit

An implementation of the ITU-R BS.1770-4 loudness recommendation (#2472) · 946b180a

bshall authored Aug 03, 2022

Summary:
I took a stab at implementing the ITU-R BS.1770-4 loudness recommendation (closes https://github.com/pytorch/audio/issues/1205). To give some more details:
- I've implemented K-weighting following csteinmetz1 instead of BrechtDeMan since it fit well with torchaudio's already implemented filters (`treble_biquad` and `highpass_biquad`).
- I've added four audio files to test compliance with the recommendation. These are linked in [this pdf](https://www.itu.int/dms_pub/itu-r/opb/rep/R-REP-BS.2217-2-2016-PDF-E.pdf). There are many more test files there but I didn't want to bog down the assets directory with too many files. Let me know if I should add or remove anything.
- I've kept many of the constants internal to the function (e.g. the block duration, overlap, and the absolute threshold gamma). I'm not sure if these should be exposed in the signature.
- I've implemented support for up to 5 channels (following both csteinmetz1 and BrechtDeMan). The recommendation includes weights for up to 24 channels. Is there any convention for how many channels to support?
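
As a rough illustration of how the channel weights enter the measurement, the sketch below computes an ungated loudness value from per-channel mean squares using the BS.1770 formula. It deliberately omits the K-weighting filter and the block gating discussed above, and the function name is an assumption.

```python
import math

# BS.1770 weights for left, right, centre, left surround, right surround.
CHANNEL_WEIGHTS = [1.0, 1.0, 1.0, 1.41, 1.41]


def ungated_loudness(channels):
    """Loudness from weighted mean squares; no K-weighting, no gating."""
    if not 1 <= len(channels) <= len(CHANNEL_WEIGHTS):
        raise ValueError("this sketch supports 1 to 5 channels")
    weighted = 0.0
    for weight, samples in zip(CHANNEL_WEIGHTS, channels):
        mean_square = sum(s * s for s in samples) / len(samples)
        weighted += weight * mean_square
    # Note: digital silence gives log10(0) and raises ValueError here.
    return -0.691 + 10.0 * math.log10(weighted)


print(ungated_loudness([[1.0, -1.0, 1.0, -1.0]]))
```

Supporting more channels would mean extending the weight table, which is the open question raised in the last bullet.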

I hope this is helpful! Looking forward to hearing from you.

Pull Request resolved: https://github.com/pytorch/audio/pull/2472

Reviewed By: hwangjeff

Differential Revision: D38389155

Pulled By: carolineechen

fbshipit-source-id: fcc86d864c04ab2bedaa9acd941ebc4478ca6904

946b180a

28 Jul, 2022 1 commit

Add Union normalization parameter on spectrogram and inverse spectrogram (#2554) · 0fde7c57

Sean Kim authored Jul 28, 2022

Summary:
Allow the `normalized` parameter to accept a str, enabling `frame_length`-based normalization that aligns with the torch implementation of stft. Addresses issue https://github.com/pytorch/audio/issues/2104

Pull Request resolved: https://github.com/pytorch/audio/pull/2554

Reviewed By: carolineechen, mthrok

Differential Revision: D38247554

Pulled By: skim0514

fbshipit-source-id: c243c7a6b8fda2a1e565cef4600f7c5a06baf602

0fde7c57

27 Jul, 2022 1 commit

Replace assert with raise (#2579) · 0f4e1e8c

Piyush Soni authored Jul 27, 2022

Summary:
`assert` is not executed when running in optimized mode.

This commit replaces all instances of "assert" in /fbcode/pytorch/audio/torchaudio/functional/functional.py
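
The difference can be seen in a small example. The function names and the `n_fft` check are hypothetical; they are not taken from `functional.py`.

```python
def check_n_fft_assert(n_fft: int) -> int:
    # This check disappears entirely when Python runs with -O.
    assert n_fft > 0, "n_fft must be positive"
    return n_fft


def check_n_fft_raise(n_fft: int) -> int:
    # This check always runs, regardless of optimization flags.
    if n_fft <= 0:
        raise ValueError(f"n_fft must be positive, got {n_fft}")
    return n_fft


try:
    check_n_fft_raise(0)
except ValueError as err:
    print(err)
```

An explicit `raise` also lets the function choose a specific exception type instead of the generic `AssertionError`.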

Pull Request resolved: https://github.com/pytorch/audio/pull/2579

Reviewed By: mthrok

Differential Revision: D38158280

fbshipit-source-id: f8d7fca1c8f9b3955c6ca312b16947eb12894d81

0f4e1e8c

25 Jul, 2022 1 commit

[BC-breaking] Fix momentum in transforms.GriffinLim (#2568) · 1634ed01

proxyphi authored Jul 25, 2022

Summary:
The momentum in the GriffinLim transform is modified before being passed
to the functional, causing inconsistency between the functional and the transform.

Fix this by passing it through unchanged in the transform.

Fixes https://github.com/pytorch/audio/issues/2567

Pull Request resolved: https://github.com/pytorch/audio/pull/2568

Reviewed By: nateanl

Differential Revision: D38117632

Pulled By: mthrok

fbshipit-source-id: 99754be4b3b6dea45ba115aaea9fb6d7285bc2c9

1634ed01

21 Jul, 2022 1 commit

fix resample (#2561) · c18a103b

Sean Kim authored Jul 21, 2022

Summary:
Added back device in case of tensor creation

Pull Request resolved: https://github.com/pytorch/audio/pull/2561

Reviewed By: mthrok

Differential Revision: D38035351

Pulled By: skim0514

fbshipit-source-id: bdea07cbb34d0aa487187cded1a5636da6623d96

c18a103b

20 Jul, 2022 1 commit

Speed up resample with kernel generation modification (#2553) · 5c6e602c

Sean Kim authored Jul 20, 2022

Summary:
Modification from pull request https://github.com/pytorch/audio/issues/2415 to improve resample.

Benchmarked at an 89% time reduction, tested in comparison to the original resample method.

Pull Request resolved: https://github.com/pytorch/audio/pull/2553

Reviewed By: carolineechen

Differential Revision: D37997533

Pulled By: skim0514

fbshipit-source-id: ef4b719450ac26794db6ea01f9882509f4fda5cf

5c6e602c

12 Jul, 2022 1 commit

Fix docstring (#2540) · 05d2580a

Zhaoheng Ni authored Jul 11, 2022

Summary:
The docstring of `apply_beamforming` produces a warning when building the documentation page. Fix it in this PR.

Pull Request resolved: https://github.com/pytorch/audio/pull/2540

Reviewed By: mthrok

Differential Revision: D37763745

Pulled By: nateanl

fbshipit-source-id: 0e9f1e098865af032b00ac56d918cb9d2ffc5024

05d2580a

13 Jun, 2022 1 commit

[AutoAccept][Codemod][FBSourceBlackLinter] Daily `arc lint --take BLACK` · 71ed457e

CodemodService FBSourceBlackLinterBot authored Jun 13, 2022

Reviewed By: ivanmurashko

Differential Revision: D37103342

fbshipit-source-id: adc908c790a413384bd88a75d3c2b4b0974c6674

71ed457e

10 Jun, 2022 1 commit

Modifying Pitchshift for faster resampling (#2441) · df2262b5

Sean Kim authored Jun 10, 2022

Summary:
Split the existing Pitchshift into multiple helper functions in order to cache the kernel and speed up the overall process, addressing https://github.com/pytorch/audio/issues/2359.
Existing unit tests pass.

edit: functional and transforms unit test pass. Adopted lazy initialization to avoid BC-breaking.
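
The lazy-initialization idea can be sketched as follows. The class, its attributes, and the placeholder kernel are hypothetical and stand in for the real resampling kernel; only the pattern of building once on first use and reusing afterwards reflects the description above.

```python
class CachedKernelTransform:
    """Builds an expensive kernel on first call and reuses it afterwards."""

    def __init__(self, orig_freq: int, new_freq: int):
        self.orig_freq = orig_freq
        self.new_freq = new_freq
        self._kernel = None  # not built in __init__, so construction stays cheap
        self.builds = 0      # counter kept only to demonstrate the caching

    def _build_kernel(self):
        self.builds += 1
        # Placeholder for an expensive computation.
        return [self.new_freq / self.orig_freq] * 4

    def __call__(self, samples):
        if self._kernel is None:
            self._kernel = self._build_kernel()
        scale = self._kernel[0]
        return [s * scale for s in samples]


transform = CachedKernelTransform(orig_freq=2, new_freq=1)
transform([2.0])
transform([4.0])
print(transform.builds)
```

Because the constructor signature does not change, existing callers keep working, which is the sense in which lazy initialization avoids a BC-breaking change.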

Pull Request resolved: https://github.com/pytorch/audio/pull/2441

Reviewed By: carolineechen

Differential Revision: D36905582

Pulled By: skim0514

fbshipit-source-id: 6780db3ac8a29d59017a6abe7e82ce1fd17aaac2

df2262b5

02 Jun, 2022 1 commit

Remove mad (#2428) · d2ecba98

moto authored Jun 02, 2022

Summary:
Remove the code related to libmad, which had been disabled in https://github.com/pytorch/audio/issues/2354

In https://github.com/pytorch/audio/issues/2419, we moved MP3 decoding to ffmpeg, but CI tests were still using libmad.
This commit completely removes libmad from torchaudio.

This is a BC-breaking change, as the `apply_sox_effects_file` function can no longer handle MP3 and cannot fall back to ffmpeg.
The workaround for this is to use `torchaudio.load` then `apply_sox_effects_tensor`.

Pull Request resolved: https://github.com/pytorch/audio/pull/2428

Reviewed By: carolineechen

Differential Revision: D36851805

Pulled By: mthrok

fbshipit-source-id: f98795c59a1ac61cef511f2bbeac37f7c3c69d55

d2ecba98

23 May, 2022 1 commit

Add assertion checks to multi-channel functions (#2401) · 38e530d7

Zhaoheng Ni authored May 23, 2022

Summary:
- The multi-channel functions only support complex-valued tensors for spectrogram and PSD matrices.
- The mask can be real-valued or complex-valued, hence there is no explicit assertion for mask.
- The shapes of input Tensors need to be verified before the computation. For example, the shape of the PSD matrix must be `(..., freq, channel, channel)`, the shape of the mask must be `(..., freq, time)`, etc.
- The autograd unittest of `apply_beamforming` has wrong dimensions for beamform_weights, as detected by the assertion check. Fix it in this PR.
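
A shape check of the kind described in the list above might look like the sketch below. It works on plain shape tuples, and the function name and error messages are assumptions rather than the checks actually added to torchaudio.

```python
def check_psd_shape(shape) -> None:
    """Validate that a shape looks like (..., freq, channel, channel)."""
    if len(shape) < 3:
        raise ValueError(
            f"expected at least 3 dimensions (..., freq, channel, channel), got {len(shape)}"
        )
    if shape[-1] != shape[-2]:
        raise ValueError(
            f"the last two dimensions must match (channel, channel), got {tuple(shape[-2:])}"
        )


check_psd_shape((257, 4, 4))  # passes silently
try:
    check_psd_shape((257, 4, 3))
except ValueError as err:
    print(err)
```

Failing early with the expected layout in the message is what exposed the wrong `beamform_weights` dimensions in the existing unit test.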

Pull Request resolved: https://github.com/pytorch/audio/pull/2401

Reviewed By: carolineechen

Differential Revision: D36597689

Pulled By: nateanl

fbshipit-source-id: 6ad1adebe3726851cc1d865650bdf177a98985f6

38e530d7