- 08 Nov, 2022 1 commit
-
-
Caroline Chen authored
Summary: Add a `fused_log_softmax` argument (default `True`, which is the current behavior) to the RNN-T loss. When it is set to `False`, the caller must apply `log_softmax` to the logits before passing them to the loss function. The following two calls should produce the same output:

```
rnnt_loss(logits, targets, logit_lengths, target_lengths, fused_log_softmax=True)
```

```
log_probs = torch.nn.functional.log_softmax(logits, dim=-1)
rnnt_loss(log_probs, targets, logit_lengths, target_lengths, fused_log_softmax=False)
```

Testing: unit tests, plus the same results on the conformer RNN-T recipe.

Pull Request resolved: https://github.com/pytorch/audio/pull/2798 Reviewed By: xiaohui-zhang Differential Revision: D41083523 Pulled By: carolineechen fbshipit-source-id: e15442ceed1f461bbf06b724aa0561ff8827ad61
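The fused/unfused equivalence can be illustrated without torch. The sketch below is a toy: `toy_loss` is a hypothetical stand-in (a plain negative log-likelihood), not the RNN-T loss, and `log_softmax` here is a hand-written pure-Python version. It only shows why normalizing inside the loss and normalizing beforehand give the same number.

```python
import math


def log_softmax(row):
    """Numerically stable log-softmax over one list of floats."""
    peak = max(row)
    log_norm = peak + math.log(sum(math.exp(v - peak) for v in row))
    return [v - log_norm for v in row]


def toy_loss(rows, targets, fused_log_softmax=True):
    """Hypothetical stand-in for a loss with a fused log-softmax switch.

    This is a plain negative log-likelihood, NOT the RNN-T loss.
    """
    if fused_log_softmax:
        # Normalize inside the loss: the caller passes raw logits.
        rows = [log_softmax(r) for r in rows]
    # Otherwise the caller is expected to pass log-probabilities already.
    return -sum(r[t] for r, t in zip(rows, targets))


logits = [[2.0, 0.5, -1.0], [0.1, 0.2, 3.0]]
targets = [0, 2]

fused = toy_loss(logits, targets, fused_log_softmax=True)
log_probs = [log_softmax(r) for r in logits]
unfused = toy_loss(log_probs, targets, fused_log_softmax=False)
```

Passing raw logits with the switch set to `False` would silently produce a different value, which is why the caller-side `log_softmax` matters.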
-
- 28 Jul, 2022 1 commit
-
-
Sean Kim authored
Summary: Allow the `normalized` parameter to accept a str, enabling `frame_length`-based normalization that aligns with the torch implementation of stft. Addresses issue https://github.com/pytorch/audio/issues/2104 Pull Request resolved: https://github.com/pytorch/audio/pull/2554 Reviewed By: carolineechen, mthrok Differential Revision: D38247554 Pulled By: skim0514 fbshipit-source-id: c243c7a6b8fda2a1e565cef4600f7c5a06baf602
-
- 01 Jun, 2022 1 commit
-
-
Sean Kim authored
Summary: Brings in the move-seed commit from the previously opened change https://github.com/pytorch/audio/issues/2267. Organizes the seed handling into utils. Pull Request resolved: https://github.com/pytorch/audio/pull/2425 Reviewed By: carolineechen, nateanl Differential Revision: D36787599 Pulled By: skim0514 fbshipit-source-id: 37a0d632d13d4336a830c4b98bdb04828ed88c20
-
- 15 May, 2022 1 commit
-
-
John Reese authored
Summary: Applies new import merging and sorting from µsort v1.0. When merging imports, µsort will make a best-effort to move associated comments to match merged elements, but there are known limitations due to the dynamic nature of Python and developer tooling. These changes should not produce any dangerous runtime changes, but may require touch-ups to satisfy linters and other tooling. Note that µsort uses case-insensitive, lexicographical sorting, which results in a different ordering compared to isort. This provides a more consistent sorting order, matching the case-insensitive order used when sorting import statements by module name, and ensures that "frog", "FROG", and "Frog" always sort next to each other. For details on µsort's sorting and merging semantics, see the user guide: https://usort.readthedocs.io/en/stable/guide.html#sorting Reviewed By: lisroach Differential Revision: D36402214 fbshipit-source-id: b641bfa9d46242188524d4ae2c44998922a62b4c
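The ordering idea can be seen with Python's built-in `sorted`. This is only an illustration with made-up names; it does not call µsort or isort, and the case-sensitive result is plain code-point ordering, not a claim about isort's output.

```python
names = ["frog", "Zebra", "FROG", "apple", "Frog"]

# Plain code-point ordering: all uppercase-initial names come first.
case_sensitive = sorted(names)

# Case-insensitive ordering: the three spellings of "frog" stay adjacent.
# sorted() is stable, so equal keys keep their input order.
case_insensitive = sorted(names, key=str.casefold)
```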
-
- 10 May, 2022 1 commit
-
-
Zhaoheng Ni authored
Summary: When computing the MVDR beamforming weights using the power iteration method, diagonal loading can be applied to the PSD matrix of noise to improve robustness. This also applies to computing the RTF matrix (see https://github.com/espnet/espnet/blob/master/espnet2/enh/layers/beamformer.py#L614 as an example), and it keeps the function consistent with the current `torchaudio.transforms.MVDR` module. This PR adds the `diagonal_loading` argument, with `True` as the default value, to `torchaudio.functional.rtf_power`. Pull Request resolved: https://github.com/pytorch/audio/pull/2369 Reviewed By: carolineechen Differential Revision: D36204130 Pulled By: nateanl fbshipit-source-id: 93a58d5c2107841a16c4e32f0c16ab0d6b2d9420
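A minimal pure-Python sketch of diagonal loading, assuming the loading amount is a small multiple of the trace plus a constant. The function name and the `reg` and `eps` parameters are illustrative choices for this sketch, not the signature of `rtf_power`.

```python
def diagonal_loading(mat, reg=1e-7, eps=1e-8):
    """Return a copy of a square matrix with a small value added to its diagonal.

    The loading amount scales with the trace; `reg` and `eps` are
    illustrative defaults chosen for this sketch.
    """
    n = len(mat)
    trace = sum(mat[i][i] for i in range(n))
    loading = trace.real * reg + eps
    return [
        [mat[i][j] + (loading if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]


# A singular 2x2 "noise PSD" matrix: its determinant is exactly zero.
psd_noise = [[1.0 + 0j, 1.0 + 0j], [1.0 + 0j, 1.0 + 0j]]

# Exaggerated loading so the effect is visible in the determinant.
loaded = diagonal_loading(psd_noise, reg=0.1, eps=0.0)
det = loaded[0][0] * loaded[1][1] - loaded[0][1] * loaded[1][0]
```

After loading, the matrix is no longer singular, so it can be inverted or used in a linear solve.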
-
- 26 Feb, 2022 1 commit
-
-
Zhaoheng Ni authored
Summary: This PR adds ``apply_beamforming`` method to ``torchaudio.functional``. The method applies the beamforming weights to the multi-channel noisy spectrum to obtain the single-channel enhanced spectrum. The input arguments are the complex-valued beamforming weight Tensor and the multi-channel noisy spectrum. Pull Request resolved: https://github.com/pytorch/audio/pull/2232 Reviewed By: mthrok Differential Revision: D34474561 Pulled By: nateanl fbshipit-source-id: 2910251a8f111e65375dfb50495b6a415113f06d
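As a rough sketch of the operation, assuming the common convention that the enhanced spectrum is the conjugated weights summed against the channels at each frequency bin and frame. The function name and the nested-list layouts are hypothetical and use plain Python complex numbers instead of Tensors.

```python
def apply_beamforming_sketch(weights, specgram):
    """Combine channels into one spectrum.

    weights[f][c]     : complex weight for frequency bin f, channel c
    specgram[c][f][t] : complex multi-channel spectrum
    returns out[f][t] : complex single-channel spectrum
    """
    n_channels = len(specgram)
    n_freqs = len(specgram[0])
    n_frames = len(specgram[0][0])
    return [
        [
            sum(
                weights[f][c].conjugate() * specgram[c][f][t]
                for c in range(n_channels)
            )
            for t in range(n_frames)
        ]
        for f in range(n_freqs)
    ]


# One frequency bin, two channels, two frames; equal weights average the channels.
weights = [[0.5 + 0j, 0.5 + 0j]]
specgram = [[[1 + 1j, 2 + 0j]], [[3 - 1j, 0 + 2j]]]
enhanced = apply_beamforming_sketch(weights, specgram)
```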
-
- 25 Feb, 2022 5 commits
-
-
Zhaoheng Ni authored
Summary: This PR adds ``rtf_power`` method to ``torchaudio.functional``. The method computes the relative transfer function (RTF) or the steering vector by [the power iteration method](https://onlinelibrary.wiley.com/doi/abs/10.1002/zamm.19290090206). [This paper](https://arxiv.org/pdf/2011.15003.pdf) describes the power iteration method in English. The input arguments are the complex-valued power spectral density (PSD) matrix of the target speech, PSD matrix of noise, int or one-hot Tensor to indicate the reference channel, number of iterations, respectively. Pull Request resolved: https://github.com/pytorch/audio/pull/2231 Reviewed By: mthrok Differential Revision: D34474503 Pulled By: nateanl fbshipit-source-id: 47011427ec4373f808755f0e8eff1efca57655eb
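For readers unfamiliar with power iteration, here is a generic sketch on a small real matrix: repeatedly multiply a vector by the matrix and renormalize, and the vector converges toward the dominant eigenvector. It does not reproduce how `rtf_power` combines the two PSD matrices or the reference channel; the function name and the example matrix are made up.

```python
def power_iteration(mat, start, n_iter=50):
    """Approximate the dominant eigenvector of a square matrix."""
    vec = list(start)
    for _ in range(n_iter):
        # Matrix-vector product.
        vec = [sum(row[j] * vec[j] for j in range(len(vec))) for row in mat]
        # Renormalize to unit length so the values neither blow up nor vanish.
        norm = sum(abs(v) ** 2 for v in vec) ** 0.5
        vec = [v / norm for v in vec]
    return vec


# Eigenvalues are 2 and 1, so the iteration converges toward [1, 0].
mat = [[2.0, 0.0], [0.0, 1.0]]
dominant = power_iteration(mat, [1.0, 1.0])
```

More iterations give a closer approximation, at the cost of more matrix-vector products.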
-
Zhaoheng Ni authored
Summary: This PR adds `rtf_evd` method to `torchaudio.functional`. The method computes the relative transfer function (RTF) or the steering vector by eigenvalue decomposition. The input argument is the power spectral density (PSD) matrix of the target speech. Pull Request resolved: https://github.com/pytorch/audio/pull/2230 Reviewed By: mthrok Differential Revision: D34474188 Pulled By: nateanl fbshipit-source-id: 888df4b187608ed3c2b7271b34d2231cdabb0134
-
Zhaoheng Ni authored
Summary: This PR adds ``mvdr_weights_rtf`` method to ``torchaudio.functional``. It computes the MVDR weight matrix based on the solution that applies relative transfer function (RTF). See [the paper](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.725.673&rep=rep1&type=pdf) for the reference. The input arguments are the complex-valued RTF Tensor of the target speech, power spectral density (PSD) matrix of noise, int or one-hot Tensor to indicate the reference channel, respectively. Pull Request resolved: https://github.com/pytorch/audio/pull/2229 Reviewed By: mthrok Differential Revision: D34474119 Pulled By: nateanl fbshipit-source-id: 2d6f62cd0858f29ed6e4e03c23dcc11c816204e2
-
Zhaoheng Ni authored
Summary: This PR adds ``mvdr_weights_souden`` method to ``torchaudio.functional``. It computes the MVDR weight matrix based on the solution proposed by [``Souden et al.``](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.725.673&rep=rep1&type=pdf). The input arguments are the complex-valued power spectral density (PSD) matrix of the target speech, PSD matrix of noise, int or one-hot Tensor to indicate the reference channel, respectively. Pull Request resolved: https://github.com/pytorch/audio/pull/2228 Reviewed By: mthrok Differential Revision: D34474018 Pulled By: nateanl fbshipit-source-id: 725df812f8f6e6cc81cc37e8c3cb0da2ab3b74fb
-
Zhaoheng Ni authored
Summary: This PR adds ``psd`` method to ``torchaudio.functional``. It computes the power spectral density (PSD) matrix of the complex-valued spectrum. The method also supports normalization of the time-frequency mask. Pull Request resolved: https://github.com/pytorch/audio/pull/2227 Reviewed By: mthrok Differential Revision: D34473908 Pulled By: nateanl fbshipit-source-id: c1cfc584085d77881b35d41d76d39b26fca1dda9
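A minimal sketch of a PSD matrix for a single frequency bin, assuming it is the (optionally mask-weighted) sum over frames of the outer product of the channel vector with its conjugate. The function name, the `eps` parameter, and the choice to simply sum frames when no mask is given are assumptions of this sketch.

```python
def psd_sketch(specgram, mask=None, eps=1e-15):
    """PSD matrix for one frequency bin.

    specgram[c][t] : complex spectrum for channel c, frame t
    mask[t]        : optional real-valued time mask in [0, 1]
    returns psd[c][e] = sum_t w[t] * X[c][t] * conj(X[e][t])
    """
    n_channels = len(specgram)
    n_frames = len(specgram[0])
    if mask is None:
        weights = [1.0] * n_frames
    else:
        # Normalize the mask so its weights sum to (almost exactly) one.
        total = sum(mask) + eps
        weights = [m / total for m in mask]
    return [
        [
            sum(
                weights[t] * specgram[c][t] * specgram[e][t].conjugate()
                for t in range(n_frames)
            )
            for e in range(n_channels)
        ]
        for c in range(n_channels)
    ]


specgram = [[1 + 0j, 0 + 1j], [1 + 0j, 1 + 0j]]  # 2 channels, 2 frames
full = psd_sketch(specgram)
masked = psd_sketch(specgram, mask=[1.0, 0.0])  # keep only the first frame
```

The result is Hermitian by construction: `psd[c][e]` is the conjugate of `psd[e][c]`.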
-
- 29 Dec, 2021 1 commit
-
-
hwangjeff authored
Summary: Adds parameter `p` to `TimeMasking` to allow for enforcing an upper bound on the proportion of time steps that it can mask. This behavior is consistent with the specifications provided in the SpecAugment paper (https://arxiv.org/abs/1904.08779). Pull Request resolved: https://github.com/pytorch/audio/pull/2090 Reviewed By: carolineechen Differential Revision: D33344772 Pulled By: hwangjeff fbshipit-source-id: 6ff65f5304e489fa1c23e15c3d96b9946229fdcf
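The effect of `p` can be sketched in pure Python, assuming the bound is enforced by capping the maximum mask length at `int(n_frames * p)`. The function name, the list-based input, and the `rng` argument are hypothetical; this is not the `TimeMasking` implementation.

```python
import random


def time_mask_sketch(frames, mask_param, p=1.0, mask_value=0.0, rng=None):
    """Mask one random run of time steps and return a new list.

    The run length is at most min(mask_param, int(len(frames) * p)).
    """
    rng = rng or random.Random()
    n_frames = len(frames)
    max_len = min(mask_param, int(n_frames * p))
    length = rng.randrange(max_len + 1)           # 0..max_len inclusive
    start = rng.randrange(n_frames - length + 1)  # run stays inside the signal
    return [
        mask_value if start <= i < start + length else x
        for i, x in enumerate(frames)
    ]


frames = list(range(1, 101))  # nonzero values so masked steps are easy to count
masked = time_mask_sketch(frames, mask_param=80, p=0.1, rng=random.Random(0))
n_masked = sum(1 for x in masked if x == 0.0)
```

With 100 time steps and `p=0.1`, at most 10 steps are masked even though `mask_param` alone would allow up to 80.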
-
- 23 Dec, 2021 1 commit
-
-
Joao Gomes authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2096 run: `arc lint --apply-patches --paths-cmd 'hg files -I "./**/*.py"'` Reviewed By: mthrok Differential Revision: D33297351 fbshipit-source-id: 7bf5956edf0717c5ca90219f72414ff4eeaf5aa8
-
- 04 Nov, 2021 1 commit
-
-
Caroline Chen authored
-
- 03 Nov, 2021 1 commit
-
-
moto authored
Following the plan in #1337, this commit drops support for the pseudo complex type from `F.phase_vocoder` and `T.TimeStretch`.
-
- 28 Oct, 2021 1 commit
-
-
S Harish authored
-
- 13 Oct, 2021 1 commit
-
-
Caroline Chen authored
-
- 20 Aug, 2021 1 commit
-
-
hwangjeff authored
* Add basic filtfilt implementation
* Add filtfilt to functional package; add tests

Co-authored-by: V G <vladislav.goncharenko@phystech.edu>
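For context, forward-backward filtering runs a filter over the signal, reverses the result, filters again, and reverses back, which cancels the phase shift of a single pass. The sketch below is pure Python with zero initial conditions and no edge padding; the function names are hypothetical and it is not the implementation added in this commit.

```python
def lfilter_sketch(signal, a_coeffs, b_coeffs):
    """Direct-form IIR filter with zero initial conditions."""
    a0 = a_coeffs[0]
    out = []
    for n in range(len(signal)):
        # Feed-forward part: weighted sum of current and past inputs.
        acc = sum(b * signal[n - k] for k, b in enumerate(b_coeffs) if n - k >= 0)
        # Feedback part: subtract weighted past outputs.
        acc -= sum(
            a * out[n - k]
            for k, a in enumerate(a_coeffs)
            if k >= 1 and n - k >= 0
        )
        out.append(acc / a0)
    return out


def filtfilt_sketch(signal, a_coeffs, b_coeffs):
    """Filter forward, then backward, to cancel the phase shift."""
    forward = lfilter_sketch(signal, a_coeffs, b_coeffs)
    backward = lfilter_sketch(forward[::-1], a_coeffs, b_coeffs)
    return backward[::-1]


# A two-tap moving average applied to a centered impulse.
impulse = [0.0, 0.0, 1.0, 0.0, 0.0]
one_pass = lfilter_sketch(impulse, a_coeffs=[1.0], b_coeffs=[0.5, 0.5])
smoothed = filtfilt_sketch(impulse, a_coeffs=[1.0], b_coeffs=[0.5, 0.5])
```

The single pass shifts the energy to the right of the impulse, while the forward-backward result is symmetric around it.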
-
- 19 Aug, 2021 1 commit
-
-
Caroline Chen authored
-
- 10 Aug, 2021 1 commit
-
-
Chin-Yun Yu authored
-
- 02 Aug, 2021 1 commit
-
-
Joel Frank authored
- Renamed torchaudio.functional.create_fb_matrix to torchaudio.functional.melscale_fbanks.
- Added interface with a warning for create_fb_matrix
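A rename that keeps the old name as a warning wrapper usually looks something like the sketch below. Both functions are toy stand-ins with a `_sketch` suffix and a trivial body, and the choice of `DeprecationWarning` is an assumption; the commit message only says "a warning".

```python
import warnings


def melscale_fbanks_sketch(n_freqs, n_mels):
    """Hypothetical stand-in for the renamed function; returns only a shape."""
    return (n_freqs, n_mels)


def create_fb_matrix_sketch(n_freqs, n_mels):
    """Old name kept as a thin wrapper that warns, then forwards the call."""
    warnings.warn(
        "create_fb_matrix_sketch is deprecated; use melscale_fbanks_sketch instead.",
        DeprecationWarning,  # assumed category for this sketch
        stacklevel=2,        # point the warning at the caller, not the wrapper
    )
    return melscale_fbanks_sketch(n_freqs, n_mels)


# Calling the old name still works, but emits exactly one warning.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    shape = create_fb_matrix_sketch(201, 128)
```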
-
- 21 Jul, 2021 1 commit
-
-
Chin-Yun Yu authored
-
- 16 Jul, 2021 1 commit
-
-
nateanl authored
-
- 25 Jun, 2021 1 commit
-
-
yangarbiter authored
-
- 04 Jun, 2021 1 commit
-
-
Caroline Chen authored
-
- 01 Jun, 2021 1 commit
-
-
Caroline Chen authored
-
- 22 May, 2021 1 commit
-
-
parmeet authored
* Remove `class FunctionalComplex` header accidentally re-introduced in #1490
-
- 11 May, 2021 1 commit
-
-
Caroline Chen authored
-
- 06 May, 2021 1 commit
-
-
moto authored
-
- 03 May, 2021 1 commit
-
-
Caroline Chen authored
It was reported in #1478 that spectrogram masking operations were done in-place and modified the original input tensors. This PR fixes this behavior and adds tests to ensure that the input tensor is not changed.
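The difference between the two behaviors, reduced to plain Python lists. The function names are hypothetical; with Tensors the same idea amounts to working on a copy (or using an out-of-place operation) instead of writing into the input.

```python
def mask_in_place(values, start, end, mask_value=0.0):
    """Problematic pattern: overwrites the caller's data."""
    for i in range(start, end):
        values[i] = mask_value
    return values


def mask_copy(values, start, end, mask_value=0.0):
    """Safer pattern: copy first, then mask the copy."""
    result = list(values)
    for i in range(start, end):
        result[i] = mask_value
    return result


original = [1.0, 2.0, 3.0, 4.0]
masked = mask_copy(original, 1, 3)  # original is left untouched

scratch = [1.0, 2.0, 3.0, 4.0]
mask_in_place(scratch, 1, 3)        # scratch itself is now modified
```

A regression test for this kind of bug typically keeps a copy of the input and asserts that the input still equals it after the call.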
-
- 26 Apr, 2021 1 commit
-
-
Mark Saroufim authored
-
- 19 Apr, 2021 1 commit
-
-
dhthompson authored
- Put functional test logic into one place, `functional_impl.py`
- Tidy imports
-
- 06 Apr, 2021 1 commit
-
-
steveplazafb authored
Merges the lfilter and spectrogram classes into the common implementation and updates the CPU and GPU test definitions accordingly.
-
- 02 Apr, 2021 1 commit
-
-
moto authored
1. `F.phase_vocoder` accepts Tensor with complex dtype.
    * The implementation path has been updated from #758 so that they share the same code path by internally converting the input Tensor to complex dtype and performing all operations on top of it.
    * Adopted `torch.polar` for simpler Tensor generation from magnitude and angle.
2. Updated tests
    * librosa compatibility test for complex dtype and pseudo complex dtype
    * Extracted the output shape check test and moved it to functional so that it is tested on all combinations of `{CPU | CUDA} x {complex64 | complex128}`
    * TorchScript compatibility test for `F.phase_vocoder` and `T.TimeStretch`.
    * Batch consistency test for `T.TimeStretch`.
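`torch.polar(abs, angle)` builds a complex value from magnitude and phase as `abs * cos(angle) + 1j * abs * sin(angle)`. The sketch below shows the same construction for a single scalar with the standard library's `cmath.rect`, used here as a stand-in so the example runs without torch.

```python
import cmath
import math

magnitude, angle = 2.0, math.pi / 2

# Scalar analogue of building a complex value from magnitude and angle.
value = cmath.rect(magnitude, angle)

# The same value written out explicitly from its real and imaginary parts.
explicit = complex(magnitude * math.cos(angle), magnitude * math.sin(angle))
```

Both forms give a value with modulus 2 pointing along the imaginary axis, up to floating-point rounding.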
-
- 15 Mar, 2021 1 commit
-
-
chin yun yu authored
-
- 05 Mar, 2021 1 commit
-
-
Aobo Yang authored
-
- 06 Jan, 2021 1 commit
-
-
moto authored
-
- 05 Aug, 2020 1 commit
-
-
moto authored
We have been running unit tests against an editable installation (i.e. `python setup.py develop`), which caused us to miss issues like #842. This change makes the installation in CI non-editable and changes the test directory structure so that the source code does not shadow the installed version of `torchaudio`. With a plain `pytest test`, `pytest` modifies `sys.path` by prepending the checked-out repository, which shadows the installed version. To remedy this, the whole test suite has been moved from `./test` to `./test/torchaudio_unittest`. This gives our test code a proper module structure and lets each test module use absolute imports, which makes it possible again to run tests with `python -m unittest torchaudio_unittest/XXX.py`. This change does not affect the regular development process (`python setup.py develop` && `pytest test`).
-
- 11 Jun, 2020 1 commit
-
-
moto authored
The `type` call used in `common_utils` generates test class definitions in `common_utils`, which modifies the module state after it is imported. This is an anti-pattern. This PR gets rid of the related utility functions and defines the test suites manually.
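A rough illustration of the two styles, using made-up class names. The first builds a test class at runtime with `type`; the second spells the class out so it is visible in the source. This is a sketch of the general pattern, not the removed utility functions.

```python
import unittest


class FunctionalTestImpl:
    """Hypothetical shared test logic; subclasses supply `device`."""

    device = None

    def test_device_is_set(self):
        self.assertIn(self.device, ("cpu", "cuda"))


# Generated at runtime: the class exists only after this statement runs,
# and a helper doing this for another module would mutate that module's state.
GeneratedCPUTest = type(
    "GeneratedCPUTest",
    (FunctionalTestImpl, unittest.TestCase),
    {"device": "cpu"},
)


# Defined manually: a little more typing, but explicit and easy to search for.
class ManualCPUTest(FunctionalTestImpl, unittest.TestCase):
    device = "cpu"


suite = unittest.defaultTestLoader.loadTestsFromTestCase(ManualCPUTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Both classes run the same shared test; the manual version trades brevity for definitions that tools and readers can find directly.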
-