Commits · 06301c0a0bbe554b10f2418d6d0482eaf37ba475 · OpenDAS / Torchaudio

10 Aug, 2023 1 commit

Add Frechet distance function (#3545) · 06301c0a

Jeff Hwang authored Aug 10, 2023

Summary:
Pull Request resolved: https://github.com/pytorch/audio/pull/3545

Adds a function for computing the Fréchet distance between two multivariate normal distributions.
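For reference, the Fréchet distance between N(mu_x, Sigma_x) and N(mu_y, Sigma_y) is commonly written in squared form as ||mu_x - mu_y||^2 + Tr(Sigma_x + Sigma_y - 2 (Sigma_x Sigma_y)^(1/2)). The sketch below is a minimal illustration restricted to diagonal covariances, so no matrix square root is needed. The function name `frechet_distance_diag` and its argument layout are hypothetical and are not the signature added by this commit.

```python
import math
from typing import Sequence


def frechet_distance_diag(
    mu_x: Sequence[float],
    var_x: Sequence[float],
    mu_y: Sequence[float],
    var_y: Sequence[float],
) -> float:
    """Squared Frechet distance between two Gaussians with diagonal covariance.

    Hypothetical helper for illustration only. `var_x` and `var_y` hold the
    diagonal entries (variances) of the two covariance matrices.
    """
    if not (len(mu_x) == len(var_x) == len(mu_y) == len(var_y)):
        raise ValueError("all inputs must have the same length")
    # Squared Euclidean distance between the means.
    mean_term = sum((a - b) ** 2 for a, b in zip(mu_x, mu_y))
    # For diagonal covariances, Tr(Sx + Sy - 2 (Sx Sy)^(1/2)) reduces to
    # the sum over dimensions of (sqrt(vx) - sqrt(vy))^2.
    cov_term = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(var_x, var_y))
    return mean_term + cov_term
```

Identical distributions give a distance of zero. The general (non-diagonal) case additionally requires a matrix square root of the covariance product.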

Reviewed By: mthrok

Differential Revision: D48126102

fbshipit-source-id: e4e122b831e1e752037c03f5baa9451e81ef1697


07 Aug, 2023 1 commit

Add merge_tokens / TokenSpan (#3535) · 30668afb

moto authored Aug 07, 2023

Summary:
This commit adds the `merge_tokens` function, which removes repeated tokens from the CTC token sequences returned by `forced_align`.

Resolving repeated tokens is a necessary and almost universal step, so it makes sense to have such a helper function in torchaudio.
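To illustrate what resolving repeated tokens involves, here is a minimal sketch that collapses runs of identical frame-level tokens into spans and drops the blank token. The `Span` class and `merge_repeated` function are hypothetical stand-ins written for this example; they are not the `TokenSpan` / `merge_tokens` API added by the commit, and the blank index of 0 is an assumption.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Span:
    """Hypothetical token span: token id and frame range [start, end)."""

    token: int
    start: int
    end: int


def merge_repeated(tokens: Sequence[int], blank: int = 0) -> List[Span]:
    """Collapse runs of equal tokens into spans, skipping blank runs."""
    spans: List[Span] = []
    start = 0
    for i in range(1, len(tokens) + 1):
        # A run ends at the end of the sequence or when the token changes.
        if i == len(tokens) or tokens[i] != tokens[start]:
            if tokens[start] != blank:
                spans.append(Span(tokens[start], start, i))
            start = i
    return spans
```

For the frame-level sequence `[0, 1, 1, 0, 2, 2, 2, 0]`, this yields one span for token 1 covering frames 1 to 3 and one span for token 2 covering frames 4 to 7 (end exclusive).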

Pull Request resolved: https://github.com/pytorch/audio/pull/3535

Reviewed By: huangruizhe

Differential Revision: D48111202

Pulled By: mthrok

fbshipit-source-id: 25354bfa210aa5c03f8c1d3e201f253ca3761b24


05 Jun, 2023 1 commit

Clean-up ComputeKaldiPitch residue (#3403) · c076d1a8

moto authored Jun 05, 2023

Summary:
Follow-up to https://github.com/pytorch/audio/pull/3368

Remove files and lines no longer used.

Pull Request resolved: https://github.com/pytorch/audio/pull/3403

Differential Revision: D46441462

Pulled By: mthrok

fbshipit-source-id: 11b881ec4b24fa0d625c6aee9f4bd91f637f9923


22 May, 2023 1 commit

Add doc for forced_align (#3355) · 011f7f3d

Zhaoheng Ni authored May 21, 2023

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3355

Reviewed By: xiaohui-zhang

Differential Revision: D46060254

Pulled By: nateanl

fbshipit-source-id: c2e44f994739755daf049fe350dd24a987a9cc29


24 Jan, 2023 1 commit

Move data augmentation functions out of prototype (#3001) · 41b88314

hwangjeff authored Jan 23, 2023

Summary:
Moves `add_noise`, `fftconvolve`, `convolve`, `speed`, `preemphasis`, and `deemphasis` out of `torchaudio.prototype.functional` and into `torchaudio.functional`.

Pull Request resolved: https://github.com/pytorch/audio/pull/3001

Reviewed By: mthrok

Differential Revision: D42688971

Pulled By: hwangjeff

fbshipit-source-id: 43280bd3ffeccddae57f1092ac45afb64dd426cc


14 Nov, 2022 1 commit

Move bark spectrogram to prototype (#2843) · 7819f3f6

Caroline Chen authored Nov 14, 2022

Summary:
follow up to https://github.com/pytorch/audio/issues/2823
- move bark spectrogram to prototype
- decrease autograd test tolerance (passing on circle ci)
- add diagram for bark fbanks

cc jdariasl

Pull Request resolved: https://github.com/pytorch/audio/pull/2843

Reviewed By: nateanl

Differential Revision: D41199522

Pulled By: carolineechen

fbshipit-source-id: 8e6c2e20fb7b14f39477683b3c6ed8356359a213


10 Nov, 2022 1 commit

BarkSpectrogram (#2823) · b326bc49

Julián D. Arias-Londoño authored Nov 10, 2022

Summary:
I have added the BarkScale transform, which can transform a regular Spectrogram into a BarkSpectrogram, similar to MelScale. ahmed-fau opened this request in December 2021 (https://github.com/pytorch/audio/issues/2103). The new functionality includes three different well-known approximations of the Bark scale.
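The commit message does not name the three approximations. The sketch below assumes they are the Hz-to-Bark forms commonly attributed to Traunmüller, Schroeder, and Wang; the function name `hz_to_bark` and the string keys are illustrative choices for this example.

```python
import math


def hz_to_bark(freq: float, bark_scale: str = "traunmuller") -> float:
    """Convert a frequency in Hz to Bark using one of three approximations.

    The choice of formulas is an assumption; the commit log only says
    "three different well-known approximations".
    """
    if bark_scale == "wang":
        return 6.0 * math.asinh(freq / 600.0)
    if bark_scale == "schroeder":
        return 7.0 * math.asinh(freq / 650.0)
    if bark_scale == "traunmuller":
        bark = 26.81 * freq / (1960.0 + freq) - 0.53
        # Corrections applied at the low and high ends of the scale.
        if bark < 2.0:
            bark += 0.15 * (2.0 - bark)
        elif bark > 20.1:
            bark += 0.22 * (bark - 20.1)
        return bark
    raise ValueError(f"unknown bark_scale: {bark_scale!r}")
```

All three mappings are monotonically increasing in frequency, which is what allows filter-bank edges to be spaced uniformly on the Bark axis and mapped back to Hz.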

Pull Request resolved: https://github.com/pytorch/audio/pull/2823

Reviewed By: nateanl

Differential Revision: D41162100

Pulled By: carolineechen

fbshipit-source-id: b2670c4972e49c9ef424da5d5982576f7a4df831


20 Sep, 2022 1 commit

Adopt `:autosummary:` in `torchaudio.functional` module doc (#2693) · ad15bc71

moto authored Sep 20, 2022

Summary:
https://output.circle-artifacts.com/output/job/b23174d2-5cee-4ee9-be39-3228b9ae4abe/artifacts/0/docs/functional.html

<img width="1133" alt="Screen Shot 2022-09-20 at 11 19 23 AM" src="https://user-images.githubusercontent.com/855818/191152824-96c5b16c-bd38-4656-b1ae-0b58699dbd62.png">

Pull Request resolved: https://github.com/pytorch/audio/pull/2693

Reviewed By: carolineechen

Differential Revision: D39650930

Pulled By: mthrok

fbshipit-source-id: 28b5b03d21b922e37e02bfddda2bf1dea696cc18


15 Sep, 2022 1 commit

Consolidate bibliography / reference (#2676) · 476ab9ab

moto authored Sep 14, 2022

Summary:
Preparation for the adoption of `autosummary`.

Replace `:footcite:` with `:cite:` and introduce dedicated reference page, as `:footcite:` does not work well with `autosummary`.

Example:

https://output.circle-artifacts.com/output/job/4da47ba6-d9c7-418e-b5b0-e9f8a146a6c3/artifacts/0/docs/datasets.html#cmuarctic

https://output.circle-artifacts.com/output/job/4da47ba6-d9c7-418e-b5b0-e9f8a146a6c3/artifacts/0/docs/references.html

Pull Request resolved: https://github.com/pytorch/audio/pull/2676

Reviewed By: carolineechen

Differential Revision: D39509431

Pulled By: mthrok

fbshipit-source-id: e6003dd01ec3eff3d598054690f61de8ee31ac9a


03 Aug, 2022 1 commit

An implementation of the ITU-R BS.1770-4 loudness recommendation (#2472) · 946b180a

bshall authored Aug 03, 2022

Summary:
I took a stab at implementing the ITU-R BS.1770-4 loudness recommendation (closes https://github.com/pytorch/audio/issues/1205). To give some more details:
- I've implemented K-weighting following csteinmetz1 instead of BrechtDeMan since it fit well with torchaudio's already implemented filters (`treble_biquad` and `highpass_biquad`).
- I've added four audio files to test compliance with the recommendation. These are linked in [this pdf](https://www.itu.int/dms_pub/itu-r/opb/rep/R-REP-BS.2217-2-2016-PDF-E.pdf). There are many more test files there but I didn't want to bog down the assets directory with too many files. Let me know if I should add or remove anything.
- I've kept many of the constants internal to the function (e.g. the block duration, overlap, and the absolute threshold gamma). I'm not sure if these should be exposed in the signature.
- I've implemented support for up to 5 channels (following both csteinmetz1 and BrechtDeMan). The recommendation includes weights for up to 24 channels. Is there any convention for how many channels to support?
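The gating stage that these constants control can be sketched as follows. BS.1770-4 measures loudness over 400 ms blocks with 75% overlap, discards blocks below an absolute threshold of -70 LKFS, then discards blocks more than 10 LU below the loudness of the remaining ones. The sketch assumes K-weighting and block segmentation have already been done and takes per-block, per-channel mean-square energies as input; the function name `gated_loudness` and this input layout are hypothetical, not the signature from the PR.

```python
import math
from typing import Sequence

ABSOLUTE_GATE = -70.0  # LKFS, absolute threshold
RELATIVE_GATE = -10.0  # LU, offset applied to the absolute-gated loudness
CHANNEL_WEIGHTS = (1.0, 1.0, 1.0, 1.41, 1.41)  # L, R, C, Ls, Rs


def _loudness(power: float) -> float:
    """Loudness in LKFS of a weighted mean-square power."""
    return -0.691 + 10.0 * math.log10(power)


def gated_loudness(block_energies: Sequence[Sequence[float]]) -> float:
    """Integrated loudness from K-weighted block energies.

    block_energies[j][i] is the mean square of channel i in block j
    (assumed precomputed; at most 5 channels in the order above).
    """
    for block in block_energies:
        if len(block) > len(CHANNEL_WEIGHTS):
            raise ValueError("at most 5 channels are supported in this sketch")
    # Channel-weighted power per block.
    powers = [
        sum(g * z for g, z in zip(CHANNEL_WEIGHTS, block)) for block in block_energies
    ]
    # Absolute gate: drop silent and very quiet blocks.
    above_abs = [p for p in powers if p > 0.0 and _loudness(p) > ABSOLUTE_GATE]
    if not above_abs:
        return -math.inf
    # Relative gate: threshold sits 10 LU below the absolute-gated loudness.
    threshold = _loudness(sum(above_abs) / len(above_abs)) + RELATIVE_GATE
    gated = [p for p in above_abs if _loudness(p) > threshold]
    return _loudness(sum(gated) / len(gated))
```

The loudest block always passes the relative gate, so the final average is never taken over an empty list.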

I hope this is helpful! Looking forward to hearing from you.

Pull Request resolved: https://github.com/pytorch/audio/pull/2472

Reviewed By: hwangjeff

Differential Revision: D38389155

Pulled By: carolineechen

fbshipit-source-id: fcc86d864c04ab2bedaa9acd941ebc4478ca6904


26 Feb, 2022 1 commit

Add apply_beamforming to torchaudio.functional (#2232) · 9c56ffb4

Zhaoheng Ni authored Feb 25, 2022

Summary:
This PR adds the ``apply_beamforming`` method to ``torchaudio.functional``.
The method applies the beamforming weights to the multi-channel noisy spectrum to obtain the single-channel enhanced spectrum.
The input arguments are the complex-valued beamforming weight Tensor and the multi-channel noisy spectrum.
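As an illustration of the operation described, the enhanced spectrum at each time-frequency bin is the inner product of the conjugated weights with the channel vector. The sketch below uses nested Python lists instead of Tensors; the function name `beamform` and the index layout (`weights[f][c]`, `spectrum[c][f][t]`) are assumptions made for this example.

```python
from typing import List


def beamform(
    weights: List[List[complex]],
    spectrum: List[List[List[complex]]],
) -> List[List[complex]]:
    """Apply per-frequency beamforming weights to a multi-channel spectrum.

    weights[f][c] and spectrum[c][f][t] give
    enhanced[f][t] = sum_c conj(weights[f][c]) * spectrum[c][f][t].
    """
    n_channels = len(spectrum)
    n_freq = len(weights)
    n_time = len(spectrum[0][0])
    return [
        [
            sum(
                weights[f][c].conjugate() * spectrum[c][f][t]
                for c in range(n_channels)
            )
            for t in range(n_time)
        ]
        for f in range(n_freq)
    ]
```

With equal real weights of 0.5 on two channels, the output is simply the average of the two channel spectra.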

Pull Request resolved: https://github.com/pytorch/audio/pull/2232

Reviewed By: mthrok

Differential Revision: D34474561

Pulled By: nateanl

fbshipit-source-id: 2910251a8f111e65375dfb50495b6a415113f06d


25 Feb, 2022 5 commits

Add rtf_power method to torchaudio.functional (#2231) · ea74813d

Zhaoheng Ni authored Feb 25, 2022

Summary:
This PR adds ``rtf_power`` method to ``torchaudio.functional``.
The method computes the relative transfer function (RTF) or the steering vector by [the power iteration method](https://onlinelibrary.wiley.com/doi/abs/10.1002/zamm.19290090206).
[This paper](https://arxiv.org/pdf/2011.15003.pdf) describes the power iteration method in English.
The input arguments are the complex-valued power spectral density (PSD) matrix of the target speech, the PSD matrix of noise, an int or one-hot Tensor indicating the reference channel, and the number of iterations.
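The core of the power iteration method can be sketched on a single small matrix. The example below starts from a one-hot reference-channel vector and repeatedly multiplies by the matrix, normalizing each time, which converges toward the dominant eigenvector. It deliberately omits how the noise PSD matrix enters the computation and takes one already-prepared matrix as input; the function names and the normalization choice are assumptions for this example, not the method's actual implementation.

```python
import math
from typing import List

Matrix = List[List[complex]]
Vector = List[complex]


def matvec(matrix: Matrix, vec: Vector) -> Vector:
    """Multiply a square matrix by a vector."""
    return [sum(a * v for a, v in zip(row, vec)) for row in matrix]


def power_iteration(matrix: Matrix, reference_channel: int = 0, n_iter: int = 3) -> Vector:
    """Estimate the dominant eigenvector, starting from a one-hot vector."""
    vec: Vector = [
        1.0 + 0j if i == reference_channel else 0j for i in range(len(matrix))
    ]
    for _ in range(n_iter):
        vec = matvec(matrix, vec)
        norm = math.sqrt(sum(abs(x) ** 2 for x in vec))
        if norm == 0.0:
            raise ValueError("iteration collapsed to the zero vector")
        # Normalize to unit length so the iterate does not overflow.
        vec = [x / norm for x in vec]
    return vec
```

For a rank-1 matrix `v v^H`, a single iteration already returns a vector proportional to `v`, which is the situation a clean single-source PSD matrix approximates.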

Pull Request resolved: https://github.com/pytorch/audio/pull/2231

Reviewed By: mthrok

Differential Revision: D34474503

Pulled By: nateanl

fbshipit-source-id: 47011427ec4373f808755f0e8eff1efca57655eb


Add rtf_evd method to torchaudio.functional (#2230) · 86fe4fa7

Zhaoheng Ni authored Feb 25, 2022

Summary:
This PR adds `rtf_evd` method to `torchaudio.functional`.
The method computes the relative transfer function (RTF) or the steering vector by eigenvalue decomposition.
The input argument is the power spectral density (PSD) matrix of the target speech.

Pull Request resolved: https://github.com/pytorch/audio/pull/2230

Reviewed By: mthrok

Differential Revision: D34474188

Pulled By: nateanl

fbshipit-source-id: 888df4b187608ed3c2b7271b34d2231cdabb0134


Add mvdr_weights_rtf to torchaudio.functional (#2229) · 3566ffc5

Zhaoheng Ni authored Feb 25, 2022

Summary:
This PR adds ``mvdr_weights_rtf`` method to ``torchaudio.functional``.
It computes the MVDR weight matrix based on the solution that applies relative transfer function (RTF). See [the paper](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.725.673&rep=rep1&type=pdf) for the reference.
The input arguments are the complex-valued RTF Tensor of the target speech, the power spectral density (PSD) matrix of noise, and an int or one-hot Tensor indicating the reference channel.
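For orientation, the textbook RTF-based MVDR solution for a frequency bin $f$ is shown below, where $\mathbf{v}(f)$ is the RTF (steering) vector and $\mathbf{\Phi}_{\mathbf{NN}}(f)$ is the noise PSD matrix. The symbols are chosen for this note, and the implementation may apply an additional scaling tied to the reference channel that is not shown here.

```latex
\mathbf{w}_{\mathrm{MVDR}}(f) =
  \frac{\mathbf{\Phi}_{\mathbf{NN}}^{-1}(f)\,\mathbf{v}(f)}
       {\mathbf{v}^{\mathsf{H}}(f)\,\mathbf{\Phi}_{\mathbf{NN}}^{-1}(f)\,\mathbf{v}(f)}
```

The denominator enforces the distortionless constraint $\mathbf{w}^{\mathsf{H}}(f)\,\mathbf{v}(f) = 1$.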

Pull Request resolved: https://github.com/pytorch/audio/pull/2229

Reviewed By: mthrok

Differential Revision: D34474119

Pulled By: nateanl

fbshipit-source-id: 2d6f62cd0858f29ed6e4e03c23dcc11c816204e2


Add mvdr_weights_souden to torchaudio.functional (#2228) · 5d06a369

Zhaoheng Ni authored Feb 25, 2022

Summary:
This PR adds ``mvdr_weights_souden`` method to ``torchaudio.functional``.
It computes the MVDR weight matrix based on the solution proposed by [Souden et al.](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.725.673&rep=rep1&type=pdf).
The input arguments are the complex-valued power spectral density (PSD) matrix of the target speech, the PSD matrix of noise, and an int or one-hot Tensor indicating the reference channel.
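For orientation, the Souden-style MVDR solution for a frequency bin $f$ is usually written as below, where $\mathbf{\Phi}_{\mathbf{SS}}(f)$ and $\mathbf{\Phi}_{\mathbf{NN}}(f)$ are the speech and noise PSD matrices and $\mathbf{u}$ is the one-hot vector selecting the reference channel. The symbols are chosen for this note rather than taken from the PR.

```latex
\mathbf{w}_{\mathrm{MVDR}}(f) =
  \frac{\mathbf{\Phi}_{\mathbf{NN}}^{-1}(f)\,\mathbf{\Phi}_{\mathbf{SS}}(f)}
       {\operatorname{Tr}\!\left(\mathbf{\Phi}_{\mathbf{NN}}^{-1}(f)\,\mathbf{\Phi}_{\mathbf{SS}}(f)\right)}
  \,\mathbf{u}
```

This form needs only the two PSD matrices and the reference channel, with no explicit steering vector.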

Pull Request resolved: https://github.com/pytorch/audio/pull/2228

Reviewed By: mthrok

Differential Revision: D34474018

Pulled By: nateanl

fbshipit-source-id: 725df812f8f6e6cc81cc37e8c3cb0da2ab3b74fb


Add psd method to torchaudio.functional (#2227) · 07bd1aa3

Zhaoheng Ni authored Feb 25, 2022

Summary:
This PR adds ``psd`` method to ``torchaudio.functional``.
It computes the power spectral density (PSD) matrix of the complex-valued spectrum.
The method also supports normalization of the time-frequency mask.
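A minimal sketch of the computation for a single frequency bin: the PSD matrix is the (optionally mask-weighted) sum over time frames of the outer product of the channel vector with its conjugate. The function name `psd_matrix`, the list-based layout (`frames[t][c]`), and the `eps` guard are assumptions for this example rather than the method's actual signature.

```python
from typing import List, Optional, Sequence


def psd_matrix(
    frames: Sequence[Sequence[complex]],
    mask: Optional[Sequence[float]] = None,
    normalize_mask: bool = True,
    eps: float = 1e-15,
) -> List[List[complex]]:
    """PSD matrix of one frequency bin from frames[t][c] (time, channel)."""
    n_channels = len(frames[0])
    if mask is None:
        # Without a mask, every frame contributes with weight 1.
        weights = [1.0] * len(frames)
    else:
        weights = list(mask)
        if normalize_mask:
            # Normalize the mask along time so the weights sum to about 1.
            total = sum(weights) + eps
            weights = [m / total for m in weights]
    psd = [[0j] * n_channels for _ in range(n_channels)]
    for w, x in zip(weights, frames):
        for i in range(n_channels):
            for j in range(n_channels):
                # Outer product x x^H, accumulated over time.
                psd[i][j] += w * x[i] * x[j].conjugate()
    return psd
```

The result is Hermitian by construction: `psd[i][j]` is the complex conjugate of `psd[j][i]`.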

Pull Request resolved: https://github.com/pytorch/audio/pull/2227

Reviewed By: mthrok

Differential Revision: D34473908

Pulled By: nateanl

fbshipit-source-id: c1cfc584085d77881b35d41d76d39b26fca1dda9


10 Nov, 2021 1 commit
- [BC-Breaking] Remove deprecated create_fb_matrix (#1998) · 22379d14
  Krishna Kalyan authored Nov 10, 2021
  
04 Nov, 2021 1 commit
- Add Sphinx-gallery to doc (#1967) · a3363539
  moto authored Nov 04, 2021
  
28 Oct, 2021 1 commit
- Remove F.complex_norm and T.ComplexNorm (#1942) · ab50909d
  S Harish authored Oct 28, 2021
  
27 Oct, 2021 1 commit
- Remove deprecated F.angle (#1935) · 1d3dcdbd
  S Harish authored Oct 27, 2021
  
26 Oct, 2021 1 commit
- Remove deprecated `F.magphase` (#1934) · d35ea80e
  S Harish authored Oct 26, 2021
  
01 Sep, 2021 1 commit
- Add edit_distance to documentation with a new category Metric (#1743) · d579d4b2
  yangarbiter authored Sep 01, 2021
  
20 Aug, 2021 2 commits
- Add sections to transforms docs (#1720) · ecfaac11
  Caroline Chen authored Aug 20, 2021
  
- Add basic filtfilt implementation (#1681) · 496b381a
  hwangjeff authored Aug 20, 2021
```
* Add basic filtfilt implementation

* Add filtfilt to functional package; add tests
Co-authored-by: V G <vladislav.goncharenko@phystech.edu>
```
19 Aug, 2021 1 commit
- Move RNNT Loss out of prototype (#1711) · 2c115821
  Caroline Chen authored Aug 19, 2021
  
14 Aug, 2021 1 commit
- Add doc for InverseSpectrogram (#1706) · ee74056f
  nateanl authored Aug 14, 2021
  
02 Aug, 2021 1 commit

Add melscale_fbanks and deprecate create_fb_matrix (#1653) · 83dc5ec7

Joel Frank authored Aug 02, 2021

- Renamed torchaudio.functional.create_fb_matrix to torchaudio.functional.melscale_fbanks.
- Added interface with a warning for create_fb_matrix
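The rename-plus-warning pattern described in these two points can be sketched generically as below. The names `new_name` and `old_name` and the placeholder body are hypothetical; they stand in for the renamed function and its deprecated alias and are not the code from this commit.

```python
import warnings
from typing import Any, Tuple


def new_name(*args: Any, **kwargs: Any) -> Tuple[tuple, dict]:
    """Placeholder for the renamed function; it just echoes its arguments."""
    return args, kwargs


def old_name(*args: Any, **kwargs: Any) -> Tuple[tuple, dict]:
    """Deprecated alias that warns, then forwards to the new function."""
    warnings.warn(
        "old_name is deprecated; use new_name instead.",
        DeprecationWarning,
        stacklevel=2,  # point the warning at the caller, not at this wrapper
    )
    return new_name(*args, **kwargs)
```

Keeping the old name as a thin forwarding wrapper lets existing callers keep working for a deprecation period before the alias is removed.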


29 Jul, 2021 1 commit
- Add LFCC feature to transforms (#1611) · 86370639
  Joel Frank authored Jul 29, 2021
```
Summary:
- Add linear_fbank method
- Add LFCC in transforms
```
16 Jul, 2021 1 commit
- Add PitchShift to functional and transform (#1629) · f5dbb002
  nateanl authored Jul 16, 2021
  
03 Jun, 2021 1 commit

Update docs (#1550) · 0166a851

moto authored Jun 03, 2021

* Use `bibtex` for paper citations.
  * add `override.css` for fixing back reference.
  * wav2vec2
  * wav2letter
  * convtasnet
  * deepspeech
  * rnnt-loss
  * griffinlim
* Fix broken references in `filtering`.
* Fix note in soundfile backends.
* Tweak wav2vec2 example.
* Removes unused `pytorch_theme.css`


02 Jun, 2021 1 commit
- Reformat resample docs (#1548) · a87b33db
  Caroline Chen authored Jun 02, 2021
  
22 Mar, 2021 1 commit
- Move resample to functional and add librosa comparison (#1402) · 14dd917e
  Caroline Chen authored Mar 22, 2021
```
This PR additionally adds batching to kaldi compliance resample interface.
```
01 Mar, 2021 1 commit
- Add subcategories to functional documentation (#1325) · 53af9779
  moto authored Mar 01, 2021
  
26 Feb, 2021 1 commit
- Fixes #1314 (#1316) · 457148ea
  Vincent QB authored Feb 26, 2021
  
12 Feb, 2021 1 commit
- Add compute_kaldi_pitch to doc (#1260) · 4f9b5520
  moto authored Feb 12, 2021
  
04 Dec, 2020 1 commit

[Doc] Add missing modules and minor fixes (#1022) · 2a02d7f5

Krishna Kalyan authored Dec 04, 2020



* Add griffinlim and DB_to_amplitude
* Fix Dataset docstring
* Fix other formatting
Co-authored-by: krishnakalyan3 <skalyan@cloudera.com>


06 Nov, 2020 1 commit
- [Doc] Group filtering in functional.rst (#1005) · 4b4b8bf6
  moto authored Nov 06, 2020
  
30 Jul, 2020 1 commit

Remove istft (#841) · dab7f64b

Jeremy Chen authored Jul 30, 2020



* `istft` has been migrated to `pytorch`, and `torchaudio.functional.istft` was deprecated in the 0.6.0 release. This PR removes it.
Co-authored-by: Jeremy Chen <jeremyyy@fb.com>


03 Jun, 2020 1 commit

Add Bass with Biquad (#661) · a466b3c2

jimchen90 authored Jun 03, 2020



* Add bass with biquad

* Update functional.py

Add the normalization coefficients

* Update test_sox_compatibility.py

In the test_sox_compatibility.py file, I add two bass tests: one sets gain = 30, atol = 1e-4; the other sets gain = 40, atol = 1.5e-4. The details can be seen in pytorch#676

* Update torchscript_consistency_impl.py

Add torchscript test

* Add flake8 test
Co-authored-by: Ji Chen <jimchen90@devfair0160.h2.fair>


02 Jun, 2020 1 commit

Add flanger to functional.py (#651) · 9e27cf3d

Bhargav Kathivarapu authored Jun 02, 2020



* Add flanger to functional
Signed-off-by: Bhargav Kathivarapu <bhargavkathivarapu31@gmail.com>

* Add random seed
Signed-off-by: Bhargav Kathivarapu <bhargavkathivarapu31@gmail.com>

* fix flanger
Signed-off-by: Bhargav Kathivarapu <bhargavkathivarapu31@gmail.com>

* shape

* Change bool arguments to strings
Signed-off-by: Bhargav Kathivarapu <bhargavkathivarapu31@gmail.com>

* Refactor tests
Signed-off-by: Bhargav Kathivarapu <bhargavkathivarapu31@gmail.com>
Co-authored-by: Vincent QB <vincentqb@users.noreply.github.com>
