Commits · ca478823c1d40c2d9b5ebf1908ed2f87ddf8a894 · OpenDAS / Torchaudio

08 Nov, 2022 1 commit

Enable log probs input for rnnt loss (#2798) · ca478823

Caroline Chen authored Nov 08, 2022

Summary:
Add `fused_log_softmax` argument (default/current behavior = True) to rnnt loss.

When it is set to `False`, call `log_softmax` on the logits before passing them to the rnnt loss function.

The following should produce the same output:
```
rnnt_loss(logits, targets, logit_lengths, target_lengths, fused_log_softmax=True)
```

```
log_probs = torch.nn.functional.log_softmax(logits, dim=-1)
rnnt_loss(log_probs, targets, logit_lengths, target_lengths, fused_log_softmax=False)
```
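
Why the two calls agree can be illustrated with a toy, pure-Python transducer loss. This is only a sketch under stated assumptions: `rnnt_loss_sketch`, its single-utterance nested-list input of shape `(T, U + 1, V)`, and the `blank=0` default are hypothetical and are not torchaudio's implementation.

```python
import math

def log_softmax(row):
    """Numerically stable log-softmax over one list of scores."""
    m = max(row)
    lse = m + math.log(sum(math.exp(v - m) for v in row))
    return [v - lse for v in row]

def logaddexp(a, b):
    """log(exp(a) + exp(b)) that tolerates -inf inputs."""
    if a == -math.inf:
        return b
    if b == -math.inf:
        return a
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))

def rnnt_loss_sketch(x, targets, blank=0, fused_log_softmax=True):
    """Toy transducer loss for one utterance; x[t][u] is a list of V scores."""
    if fused_log_softmax:
        # "Fused" mode: the loss normalizes raw logits itself.
        x = [[log_softmax(scores) for scores in row] for row in x]
    T, U = len(x), len(targets)
    alpha = [[-math.inf] * (U + 1) for _ in range(T)]
    alpha[0][0] = 0.0
    for t in range(T):
        for u in range(U + 1):
            if t > 0:  # arrive by emitting blank at (t - 1, u)
                alpha[t][u] = logaddexp(alpha[t][u], alpha[t - 1][u] + x[t - 1][u][blank])
            if u > 0:  # arrive by emitting the label targets[u - 1] at (t, u - 1)
                alpha[t][u] = logaddexp(alpha[t][u], alpha[t][u - 1] + x[t][u - 1][targets[u - 1]])
    # A final blank from the last lattice node ends the alignment.
    return -(alpha[T - 1][U] + x[T - 1][U][blank])

logits = [[[0.1, 0.6, 0.3], [0.5, 0.2, 0.9]],
          [[0.4, 0.4, 0.2], [1.0, 0.0, 0.5]]]
targets = [1]

fused = rnnt_loss_sketch(logits, targets, fused_log_softmax=True)
log_probs = [[log_softmax(scores) for scores in row] for row in logits]
unfused = rnnt_loss_sketch(log_probs, targets, fused_log_softmax=False)
```

In this sketch the flag only decides who performs the normalization, so `fused` and `unfused` agree up to floating-point error.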

Testing: unit tests pass, and the conformer rnnt recipe gives the same results.

Pull Request resolved: https://github.com/pytorch/audio/pull/2798

Reviewed By: xiaohui-zhang

Differential Revision: D41083523

Pulled By: carolineechen

fbshipit-source-id: e15442ceed1f461bbf06b724aa0561ff8827ad61

ca478823

28 Jul, 2022 1 commit

Add Union normalization parameter on spectrogram and inverse spectrogram (#2554) · 0fde7c57

Sean Kim authored Jul 28, 2022

Summary:
Allow the `normalized` parameter to accept a str in addition to a bool, enabling frame_length-based normalization that aligns with the torch implementation of stft. Addresses issue https://github.com/pytorch/audio/issues/2104
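
One way a bool-or-str option like this could be resolved is sketched below. The helper names, the accepted strings `"frame_length"` and `"window"`, and the choice that `True` maps to window-based scaling are assumptions for illustration; the `frame_length ** -0.5` factor follows what `torch.stft` documents for `normalized=True`.

```python
from typing import Tuple, Union

def resolve_normalization(normalized: Union[bool, str]) -> Tuple[bool, bool]:
    """Return (frame_length_norm, window_norm) flags for a bool-or-str option."""
    if isinstance(normalized, str):
        if normalized not in ("frame_length", "window"):
            raise ValueError(f"unsupported normalization mode: {normalized!r}")
        return (normalized == "frame_length", normalized == "window")
    # Assumption: a plain True selects window-based scaling, False disables scaling.
    return (False, bool(normalized))

def frame_length_scale(frame_length: int) -> float:
    """Scale factor that torch.stft documents for normalized=True."""
    return frame_length ** -0.5

flags_for_str = resolve_normalization("frame_length")
flags_for_bool = resolve_normalization(True)
```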

Pull Request resolved: https://github.com/pytorch/audio/pull/2554

Reviewed By: carolineechen, mthrok

Differential Revision: D38247554

Pulled By: skim0514

fbshipit-source-id: c243c7a6b8fda2a1e565cef4600f7c5a06baf602

0fde7c57

01 Jun, 2022 1 commit

Move Seed to Setup (#2425) · ac82bdc4

Sean Kim authored Jun 01, 2022

Summary:
Brings in the move-seed change from the previously opened https://github.com/pytorch/audio/issues/2267. Organizes the seed into utils.

Pull Request resolved: https://github.com/pytorch/audio/pull/2425

Reviewed By: carolineechen, nateanl

Differential Revision: D36787599

Pulled By: skim0514

fbshipit-source-id: 37a0d632d13d4336a830c4b98bdb04828ed88c20

ac82bdc4

15 May, 2022 1 commit

[codemod][usort] apply import merging for fbcode (8 of 11) · d62875cc

John Reese authored May 15, 2022

Summary:
Applies new import merging and sorting from µsort v1.0.

When merging imports, µsort will make a best-effort to move associated
comments to match merged elements, but there are known limitations due to
the dynamic nature of Python and developer tooling. These changes should
not produce any dangerous runtime changes, but may require touch-ups to
satisfy linters and other tooling.

Note that µsort uses case-insensitive, lexicographical sorting, which
results in a different ordering compared to isort. This provides a more
consistent sorting order, matching the case-insensitive order used when
sorting import statements by module name, and ensures that "frog", "FROG",
and "Frog" always sort next to each other.
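
The effect of case-insensitive ordering can be seen with plain Python sorting. This is only an illustration using `sorted` with `str.casefold`; it is not µsort's actual sort key, which also takes import categories and module structure into account.

```python
names = ["frog", "Zebra", "FROG", "apple", "Frog"]

# Default string comparison is case-sensitive: uppercase letters sort first,
# so the three spellings of "frog" end up separated.
case_sensitive = sorted(names)

# Folding case in the key keeps every spelling of "frog" adjacent.
# Python's sort is stable, so equal keys keep their input order.
case_insensitive = sorted(names, key=str.casefold)
```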

For details on µsort's sorting and merging semantics, see the user guide:
https://usort.readthedocs.io/en/stable/guide.html#sorting

Reviewed By: lisroach

Differential Revision: D36402214

fbshipit-source-id: b641bfa9d46242188524d4ae2c44998922a62b4c

d62875cc

10 May, 2022 1 commit

Add diagonal_loading optional to rtf_power (#2369) · da1e83cc

Zhaoheng Ni authored May 10, 2022

Summary:
When computing the MVDR beamforming weights using the power iteration method, diagonal loading can be applied to the PSD matrix of noise to improve robustness. This is also applicable to computing the RTF matrix (See https://github.com/espnet/espnet/blob/master/espnet2/enh/layers/beamformer.py#L614 as an example). It also keeps the function consistent with the current `torchaudio.transforms.MVDR` module.

This PR adds the `diagonal_loading` argument, with `True` as the default value, to `torchaudio.functional.rtf_power`.
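
Diagonal loading in general can be sketched as adding a small multiple of the trace to the diagonal before a matrix is inverted. The function name, the `reg` and `eps` parameters, and their defaults below are hypothetical and chosen for illustration; they are not taken from the torchaudio source.

```python
def diagonal_loading_sketch(matrix, reg=1e-7, eps=1e-8):
    """Return a copy of a square matrix with reg * trace + eps added to its diagonal."""
    n = len(matrix)
    # The trace of a Hermitian PSD matrix is real, so only the real part is used.
    trace = sum(matrix[i][i].real for i in range(n))
    loading = reg * trace + eps
    return [
        [matrix[i][j] + (loading if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]

psd_noise = [[2.0, 1.0], [1.0, 2.0]]
loaded = diagonal_loading_sketch(psd_noise, reg=0.1, eps=0.0)
```

Only the diagonal grows, which moves the matrix away from singularity while leaving the cross-channel terms untouched.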

Pull Request resolved: https://github.com/pytorch/audio/pull/2369

Reviewed By: carolineechen

Differential Revision: D36204130

Pulled By: nateanl

fbshipit-source-id: 93a58d5c2107841a16c4e32f0c16ab0d6b2d9420

da1e83cc

26 Feb, 2022 1 commit

Add apply_beamforming to torchaudio.functional (#2232) · 9c56ffb4

Zhaoheng Ni authored Feb 25, 2022

Summary:
This PR adds ``apply_beamforming`` method to ``torchaudio.functional``.
The method applies the beamforming weight to the multi-channel noisy spectrum to obtain the single-channel enhanced spectrum.
The input arguments are the complex-valued beamforming weight Tensor and the multi-channel noisy spectrum.
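
The operation can be sketched in pure Python as a conjugate-weighted sum over channels. The index layout (`weights[freq][channel]`, `specgram[channel][freq][time]`) and the function name are assumptions made for this illustration.

```python
def apply_beamforming_sketch(weights, specgram):
    """Combine channels: out[f][t] = sum_c conj(weights[f][c]) * specgram[c][f][t]."""
    n_channels = len(specgram)
    n_freqs = len(weights)
    n_frames = len(specgram[0][0])
    return [
        [
            sum(weights[f][c].conjugate() * specgram[c][f][t] for c in range(n_channels))
            for t in range(n_frames)
        ]
        for f in range(n_freqs)
    ]

# Two channels, one frequency bin, two time frames.
specgram = [[[1 + 1j, 2 + 0j]], [[3 - 1j, 0 + 2j]]]
averaged = apply_beamforming_sketch([[0.5 + 0j, 0.5 + 0j]], specgram)
rotated = apply_beamforming_sketch([[1j, 0j]], specgram)
```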

Pull Request resolved: https://github.com/pytorch/audio/pull/2232

Reviewed By: mthrok

Differential Revision: D34474561

Pulled By: nateanl

fbshipit-source-id: 2910251a8f111e65375dfb50495b6a415113f06d

9c56ffb4

25 Feb, 2022 5 commits

Add rtf_power method to torchaudio.functional (#2231) · ea74813d

Zhaoheng Ni authored Feb 25, 2022

Summary:
This PR adds ``rtf_power`` method to ``torchaudio.functional``.
The method computes the relative transfer function (RTF) or the steering vector by [the power iteration method](https://onlinelibrary.wiley.com/doi/abs/10.1002/zamm.19290090206).
[This paper](https://arxiv.org/pdf/2011.15003.pdf) describes the power iteration method in English.
The input arguments are, in order, the complex-valued power spectral density (PSD) matrix of the target speech, the PSD matrix of noise, an int or one-hot Tensor indicating the reference channel, and the number of iterations.
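
For readers unfamiliar with the technique, a generic power iteration is sketched below: repeatedly multiplying a vector by a matrix and renormalizing converges to the dominant eigenvector. This is the textbook method only, not the `rtf_power` implementation; the one-hot start vector merely mirrors the idea of a reference channel, and the function name is hypothetical.

```python
import math

def power_iteration_sketch(matrix, reference=0, n_iter=50):
    """Approximate the dominant eigenvector of a square matrix."""
    n = len(matrix)
    # Start from a one-hot vector selecting the reference index.
    vec = [1.0 if i == reference else 0.0 for i in range(n)]
    for _ in range(n_iter):
        vec = [sum(matrix[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(abs(v) ** 2 for v in vec))
        vec = [v / norm for v in vec]
    return vec

matrix = [[2.0, 1.0], [1.0, 2.0]]  # eigenvalues 3 and 1
vec = power_iteration_sketch(matrix)
# The Rayleigh quotient of the unit vector estimates the dominant eigenvalue.
rayleigh = sum(vec[i] * matrix[i][j] * vec[j] for i in range(2) for j in range(2))
```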

Pull Request resolved: https://github.com/pytorch/audio/pull/2231

Reviewed By: mthrok

Differential Revision: D34474503

Pulled By: nateanl

fbshipit-source-id: 47011427ec4373f808755f0e8eff1efca57655eb

ea74813d

Add rtf_evd method to torchaudio.functional (#2230) · 86fe4fa7

Zhaoheng Ni authored Feb 25, 2022

Summary:
This PR adds `rtf_evd` method to `torchaudio.functional`.
The method computes the relative transfer function (RTF) or the steering vector by eigenvalue decomposition.
The input argument is the power spectral density (PSD) matrix of the target speech.

Pull Request resolved: https://github.com/pytorch/audio/pull/2230

Reviewed By: mthrok

Differential Revision: D34474188

Pulled By: nateanl

fbshipit-source-id: 888df4b187608ed3c2b7271b34d2231cdabb0134

86fe4fa7

Add mvdr_weights_rtf to torchaudio.functional (#2229) · 3566ffc5

Zhaoheng Ni authored Feb 25, 2022

Summary:
This PR adds ``mvdr_weights_rtf`` method to ``torchaudio.functional``.
It computes the MVDR weight matrix based on the solution that applies relative transfer function (RTF). See [the paper](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.725.673&rep=rep1&type=pdf) for the reference.
The input arguments are, in order, the complex-valued RTF Tensor of the target speech, the power spectral density (PSD) matrix of noise, and an int or one-hot Tensor indicating the reference channel.
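
For orientation, the RTF-based MVDR solution is commonly written as follows. The notation is an assumption of this note rather than part of the commit: $\boldsymbol{v}(f)$ is the RTF (steering) vector and $\mathbf{\Phi}_{\mathbf{NN}}(f)$ the noise PSD matrix at frequency bin $f$.

```latex
\boldsymbol{w}_{\mathrm{MVDR}}(f) =
  \frac{\mathbf{\Phi}_{\mathbf{NN}}^{-1}(f)\,\boldsymbol{v}(f)}
       {\boldsymbol{v}^{\mathsf{H}}(f)\,\mathbf{\Phi}_{\mathbf{NN}}^{-1}(f)\,\boldsymbol{v}(f)}
```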

Pull Request resolved: https://github.com/pytorch/audio/pull/2229

Reviewed By: mthrok

Differential Revision: D34474119

Pulled By: nateanl

fbshipit-source-id: 2d6f62cd0858f29ed6e4e03c23dcc11c816204e2

3566ffc5

Add mvdr_weights_souden to torchaudio.functional (#2228) · 5d06a369

Zhaoheng Ni authored Feb 25, 2022

Summary:
This PR adds ``mvdr_weights_souden`` method to ``torchaudio.functional``.
It computes the MVDR weight matrix based on the solution proposed by [``Souden et al.``](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.725.673&rep=rep1&type=pdf).
The input arguments are, in order, the complex-valued power spectral density (PSD) matrix of the target speech, the PSD matrix of noise, and an int or one-hot Tensor indicating the reference channel.
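
For orientation, the Souden-style solution is commonly written as follows. The notation is an assumption of this note: $\mathbf{\Phi}_{\mathbf{SS}}(f)$ and $\mathbf{\Phi}_{\mathbf{NN}}(f)$ are the speech and noise PSD matrices at frequency bin $f$, and $\boldsymbol{u}$ is the one-hot vector selecting the reference channel.

```latex
\boldsymbol{w}_{\mathrm{MVDR}}(f) =
  \frac{\mathbf{\Phi}_{\mathbf{NN}}^{-1}(f)\,\mathbf{\Phi}_{\mathbf{SS}}(f)}
       {\operatorname{trace}\!\left(\mathbf{\Phi}_{\mathbf{NN}}^{-1}(f)\,\mathbf{\Phi}_{\mathbf{SS}}(f)\right)}
  \,\boldsymbol{u}
```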

Pull Request resolved: https://github.com/pytorch/audio/pull/2228

Reviewed By: mthrok

Differential Revision: D34474018

Pulled By: nateanl

fbshipit-source-id: 725df812f8f6e6cc81cc37e8c3cb0da2ab3b74fb

5d06a369

Add psd method to torchaudio.functional (#2227) · 07bd1aa3

Zhaoheng Ni authored Feb 25, 2022

Summary:
This PR adds ``psd`` method to ``torchaudio.functional``.
It computes the power spectral density (PSD) matrix of the complex-valued spectrum.
The method also supports normalization of the time-frequency mask.
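
A minimal sketch of a mask-weighted PSD estimate for a single frequency bin follows. The function name, the `specgram[channel][time]` layout, the `normalize` flag, and the `eps` default are assumptions for illustration and may differ from the actual function.

```python
def psd_sketch(specgram, mask=None, normalize=True, eps=1e-10):
    """PSD matrix for one frequency bin: sum_t w[t] * x[a][t] * conj(x[b][t])."""
    n_channels = len(specgram)
    n_frames = len(specgram[0])
    weights = [1.0] * n_frames if mask is None else list(mask)
    if mask is not None and normalize:
        # Normalize the mask over time so the weights sum to (almost) one.
        total = sum(weights) + eps
        weights = [w / total for w in weights]
    return [
        [
            sum(weights[t] * specgram[a][t] * specgram[b][t].conjugate() for t in range(n_frames))
            for b in range(n_channels)
        ]
        for a in range(n_channels)
    ]

spec = [[1 + 0j, 1j], [1 + 0j, 1 + 0j]]  # two channels, two frames
plain = psd_sketch(spec)
masked = psd_sketch(spec, mask=[1.0, 0.0])
```

The result is Hermitian by construction: entry `[b][a]` is the complex conjugate of entry `[a][b]`.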

Pull Request resolved: https://github.com/pytorch/audio/pull/2227

Reviewed By: mthrok

Differential Revision: D34473908

Pulled By: nateanl

fbshipit-source-id: c1cfc584085d77881b35d41d76d39b26fca1dda9

07bd1aa3

29 Dec, 2021 1 commit

Add parameter p to TimeMasking (#2090) · 1ec7ff73

hwangjeff authored Dec 29, 2021

Summary:
Adds parameter `p` to `TimeMasking` to allow for enforcing an upper bound on the proportion of time steps that it can mask. This behavior is consistent with the specifications provided in the SpecAugment paper (https://arxiv.org/abs/1904.08779).
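
The upper bound can be sketched as clamping the maximum mask width to a fraction `p` of the number of time steps. The function below is a hypothetical pure-Python illustration on a 1-D list of frames, not the transform's actual code.

```python
import random

def time_mask_sketch(frames, mask_param, p=1.0, mask_value=0.0, rng=None):
    """Mask one random span of time steps, at most min(mask_param, int(len * p)) wide."""
    rng = rng or random.Random()
    n_frames = len(frames)
    max_width = min(mask_param, int(n_frames * p))
    width = int(rng.random() * max_width)
    start = int(rng.random() * (n_frames - width))
    masked = list(frames)  # work on a copy so the input is not modified
    for t in range(start, start + width):
        masked[t] = mask_value
    return masked

frames = [1.0] * 100
# mask_param alone would allow up to 80 masked steps; p=0.2 caps it at 20.
widths = [
    time_mask_sketch(frames, mask_param=80, p=0.2, rng=random.Random(seed)).count(0.0)
    for seed in range(50)
]
```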

Pull Request resolved: https://github.com/pytorch/audio/pull/2090

Reviewed By: carolineechen

Differential Revision: D33344772

Pulled By: hwangjeff

fbshipit-source-id: 6ff65f5304e489fa1c23e15c3d96b9946229fdcf

1ec7ff73

23 Dec, 2021 1 commit

Apply arc lint to pytorch audio (#2096) · 5859923a

Joao Gomes authored Dec 23, 2021

Summary:
Pull Request resolved: https://github.com/pytorch/audio/pull/2096

run: `arc lint --apply-patches --paths-cmd 'hg files -I "./**/*.py"'`

Reviewed By: mthrok

Differential Revision: D33297351

fbshipit-source-id: 7bf5956edf0717c5ca90219f72414ff4eeaf5aa8

5859923a

04 Nov, 2021 1 commit
- Doc fixes (#1982) · c670898c
  Caroline Chen authored Nov 04, 2021
  
  c670898c
03 Nov, 2021 1 commit
- [BC-Breaking] Drop pseudo complex support from phase_vocoder / TimeStretch (#1957) · d3e146fd
  moto authored Nov 03, 2021
```
Following the plan #1337, this commit drops the support for pseudo complex type from `F.phase_vocoder` and `T.TimeStretch`.
```
  d3e146fd
28 Oct, 2021 1 commit
- Remove F.complex_norm and T.ComplexNorm (#1942) · ab50909d
  S Harish authored Oct 28, 2021
  
  ab50909d
13 Oct, 2021 1 commit
- [BC-Breaking] Ensure integer input frequencies for resample (#1857) · 25a8adf6
  Caroline Chen authored Oct 13, 2021
  
  25a8adf6
20 Aug, 2021 1 commit

Add basic filtfilt implementation (#1681) · 496b381a

hwangjeff authored Aug 20, 2021



* Add basic filtfilt implementation

* Add filtfilt to functional package; add tests
Co-authored-by: V G <vladislav.goncharenko@phystech.edu>

496b381a

19 Aug, 2021 1 commit
- Move RNNT Loss out of prototype (#1711) · 2c115821
  Caroline Chen authored Aug 19, 2021
  
  2c115821
10 Aug, 2021 1 commit
- Add batch support to lfilter (#1638) · 8094751f
  Chin-Yun Yu authored Aug 11, 2021
  
  8094751f
02 Aug, 2021 1 commit

Add melscale_fbanks and deprecate create_fb_matrix (#1653) · 83dc5ec7

Joel Frank authored Aug 02, 2021

- Renamed torchaudio.functional.create_fb_matrix to torchaudio.functional.melscale_fbanks.
- Added interface with a warning for create_fb_matrix
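
A rename with a warning interface of this kind is often implemented as a thin wrapper. The sketch below uses hypothetical stand-in functions with placeholder bodies and assumes `DeprecationWarning` as the category; the real signatures and warning text are not reproduced here.

```python
import warnings

def melscale_fbanks_sketch(n_freqs, n_mels):
    """Stand-in for the renamed function; returns only the expected output shape."""
    return (n_freqs, n_mels)

def create_fb_matrix_sketch(n_freqs, n_mels):
    """Old name kept as a wrapper that warns and forwards to the new name."""
    warnings.warn(
        "create_fb_matrix_sketch is deprecated; use melscale_fbanks_sketch instead.",
        DeprecationWarning,
        stacklevel=2,
    )
    return melscale_fbanks_sketch(n_freqs, n_mels)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    old_result = create_fb_matrix_sketch(201, 128)
```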

83dc5ec7

21 Jul, 2021 1 commit
- Add filterbanks support to lfilter (#1587) · aa0dd03b
  Chin-Yun Yu authored Jul 22, 2021
  
  aa0dd03b
16 Jul, 2021 1 commit
- Add PitchShift to functional and transform (#1629) · f5dbb002
  nateanl authored Jul 16, 2021
  
  f5dbb002
25 Jun, 2021 1 commit
- Add edit_distance · 6bfd83b4
  yangarbiter authored Jun 25, 2021
  
  6bfd83b4
04 Jun, 2021 1 commit
- Migrate resample tests from kaldi to functional (#1520) · 15a7f78c
  Caroline Chen authored Jun 03, 2021
  
  15a7f78c
01 Jun, 2021 1 commit
- Ensure resampling identity is unchanged (#1537) · fad19fab
  Caroline Chen authored Jun 01, 2021
  
  fad19fab
22 May, 2021 1 commit

fbsync (#1524) · ae9560da

parmeet authored May 22, 2021

* Remove `class FunctionalComplex` header accidentally re-introduced in #1490

ae9560da

11 May, 2021 1 commit
- Add warning for non-integer resampling frequencies (#1490) · 4b2de71f
  Caroline Chen authored May 11, 2021
  
  4b2de71f
06 May, 2021 1 commit
- Merge test classes for complex (#1491) · 7d45851d
  moto authored May 06, 2021
  
  7d45851d
03 May, 2021 1 commit

Ensure axis masking operations are not in-place (#1481) · 7fd5fce4

Caroline Chen authored May 03, 2021

It was reported in #1478 that spectrogram masking operations were done in-place and modified the original input tensors. This PR fixes this behavior and adds tests to ensure that the input tensor is not changed.
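
The difference between in-place and out-of-place masking can be sketched with nested lists. The function name and the 2-D `rows[freq][time]` layout are hypothetical; the point is only that the result is built from a copy.

```python
def mask_time_span_sketch(rows, start, end, mask_value=0.0):
    """Return a masked copy of a 2-D list; the input rows are left untouched."""
    masked = [list(row) for row in rows]  # copy each row before writing to it
    for row in masked:
        for t in range(start, min(end, len(row))):
            row[t] = mask_value
    return masked

spectrogram = [[1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0]]
result = mask_time_span_sketch(spectrogram, start=1, end=3)
```

A test for this behavior compares the input before and after the call, which is the kind of check the commit message describes.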

7fd5fce4

26 Apr, 2021 1 commit
- Run functional tests on GPU as well as CPU (#1475) · b5d80279
  Mark Saroufim authored Apr 26, 2021
  
  b5d80279
19 Apr, 2021 1 commit
- Refactor functional test (#1463) · b059f087
  dhthompson authored Apr 19, 2021
```
- Put functional test logic into one place, `functional_impl.py`
- Tidy imports
```
  b059f087
06 Apr, 2021 1 commit

Refactors functional test (#1435) · e9726f08

steveplazafb authored Apr 06, 2021

Merges lfilter and spectrogram classes together in the common implementation and modifies the cpu and gpu test definitions accordingly

e9726f08

02 Apr, 2021 1 commit

Make `F.phase_vocoder` and `T.TimeStretch` handle complex dtype (#1410) · 0433b7aa

moto authored Apr 02, 2021

1. `F.phase_vocoder` accepts Tensor with complex dtype.
    * The implementation path has been updated from #758 so that both input types share the same code path: the input Tensor is converted internally to complex dtype and all operations are performed on it.
    * Adopted `torch.polar` for simpler Tensor generation from magnitude and angle.
2. Updated tests
    * librosa compatibility test for complex dtype and pseudo complex dtype
        * Extracted the output shape check test and moved it to functional so that it is tested on all combinations of `{CPU | CUDA} x {complex64 | complex128}`
    * TorchScript compatibility test for `F.phase_vocoder` and `T.TimeStretch`.
    * batch consistency test for `T.TimeStretch`.

0433b7aa

15 Mar, 2021 1 commit
- Add backprop support to lfilter (#1310) · 2a3d52ff
  chin yun yu authored Mar 16, 2021
  
  2a3d52ff
05 Mar, 2021 1 commit
- Add test for validating lfilter shape (#1360) · e868d24c
  Aobo Yang authored Mar 05, 2021
  
  e868d24c
06 Jan, 2021 1 commit
- Fix nan gradient by using native complex abs op (#1013) · a7797d5c
  moto authored Jan 06, 2021
  
  a7797d5c
05 Aug, 2020 1 commit

[CI] Run unit test with non-editable installation (#845) · 9ba02d5b

moto authored Aug 04, 2020

We have been running unit tests with an editable installation (i.e. `python setup.py develop`), with which we missed issues like #842.

This change makes the installation in CI non-editable and changes the test directory structure so that the source code does not shadow the installed version of `torchaudio`. With a plain `pytest test`, `pytest` modifies `sys.path` and prepends the checked-out repository, which shadows the installed version.

To remedy this, the whole test suite has been moved from `./test` to `./test/torchaudio_unittest`. This gives our test code a nice module structure and lets each test module use absolute imports, which makes it possible again to run tests with `python -m unittest torchaudio_unittest/XXX.py`.

This change does not affect the regular development process (`python setup.py develop` && `pytest test`)
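
The shadowing effect can be reproduced in isolation. The sketch below writes two copies of a hypothetical module `shadow_demo` into temporary directories standing in for the installed package and the checked-out repository, then shows which copy an import picks up when the checkout is prepended to `sys.path`.

```python
import importlib
import sys
import tempfile
from pathlib import Path

def which_copy_wins():
    """Import a module that exists in two directories and report which one loaded."""
    with tempfile.TemporaryDirectory() as installed, tempfile.TemporaryDirectory() as checkout:
        Path(installed, "shadow_demo.py").write_text("ORIGIN = 'installed'\n")
        Path(checkout, "shadow_demo.py").write_text("ORIGIN = 'checkout'\n")
        saved_path = list(sys.path)
        try:
            sys.path.append(installed)    # stands in for site-packages
            sys.path.insert(0, checkout)  # prepended, as described above
            importlib.invalidate_caches()
            sys.modules.pop("shadow_demo", None)
            return importlib.import_module("shadow_demo").ORIGIN
        finally:
            # Restore the interpreter state so the demo has no side effects.
            sys.path[:] = saved_path
            sys.modules.pop("shadow_demo", None)

winner = which_copy_wins()
```

Because the import system searches `sys.path` in order, the prepended directory takes precedence over the installed copy.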

9ba02d5b

11 Jun, 2020 1 commit

Get rid of dynamic test suite generation (#716) · 08217121

moto authored Jun 11, 2020

`type` used in `common_utils` generates test class definitions in `common_utils`, which
modifies the module state after it is imported. This is an anti-pattern.
This PR gets rid of the related utility functions and defines the test suites manually.
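
The contrast can be sketched with `unittest`. All names below are hypothetical; the first helper mimics generating classes with `type` and injecting them into a namespace, while the explicit classes show the manual style.

```python
import unittest

class AdditionTestImpl:
    """Shared test logic; concrete classes supply the `dtype` attribute."""
    dtype = None

    def test_add(self):
        self.assertEqual(self.dtype(1) + self.dtype(2), self.dtype(3))

def define_test_suites(namespace):
    """Dynamic style: builds classes with type() and mutates the given namespace."""
    for dtype in (int, float):
        name = f"Test{dtype.__name__.title()}Dynamic"
        namespace[name] = type(name, (AdditionTestImpl, unittest.TestCase), {"dtype": dtype})

# Manual style: every test class is visible in the source and to tooling.
class TestInt(AdditionTestImpl, unittest.TestCase):
    dtype = int

class TestFloat(AdditionTestImpl, unittest.TestCase):
    dtype = float

suite = unittest.TestSuite()
for case in (TestInt, TestFloat):
    suite.addTests(unittest.defaultTestLoader.loadTestsFromTestCase(case))
result = unittest.TestResult()
suite.run(result)
```

The manual definitions cost a few extra lines per class but avoid hidden changes to module state at import time.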

08217121