"vscode:/vscode.git/clone" did not exist on "8e64140e35c15a626d199a0dfdd9cc7f956ab6cc"
- 08 Nov, 2022 1 commit
-
-
Caroline Chen authored
Summary: Add `fused_log_softmax` argument (default/current behavior = True) to rnnt loss. If setting it to `False`, call `log_softmax` on the logits prior to passing it in to the rnnt loss function. The following should produce the same output: ``` rnnt_loss(logits, targets, logit_lengths, target_lengths, fused_log_softmax=True) ``` ``` log_probs = torch.nn.functional.log_softmax(logits, dim=-1) rnnt_loss(log_probs, targets, logit_lengths, target_lengths, fused_log_softmax=False) ``` testing -- unit tests + get same results on the conformer rnnt recipe Pull Request resolved: https://github.com/pytorch/audio/pull/2798 Reviewed By: xiaohui-zhang Differential Revision: D41083523 Pulled By: carolineechen fbshipit-source-id: e15442ceed1f461bbf06b724aa0561ff8827ad61
-
- 03 Aug, 2022 1 commit
-
-
bshall authored
Summary: I took a stab at implementing the ITU-R BS.1770-4 loudness recommendation (closes https://github.com/pytorch/audio/issues/1205). To give some more details: - I've implemented K-weighting following csteinmetz1 instead of BrechtDeMan since it fit well with torchaudio's already implemented filters (`treble_biquad` and `highpass_biquad`). - I've added four audio files to test compliance with the recommendation. These are linked in [this pdf](https://www.itu.int/dms_pub/itu-r/opb/rep/R-REP-BS.2217-2-2016-PDF-E.pdf). There are many more test files there but I didn't want to bog down the assets directory with too many files. Let me know if I should add or remove anything. - I've kept many of the constant internal to the function (e.g. the block duration, overlap, and the absolute threshold gamma). I'm not sure if these should be exposed in the signature. - I've implemented support for up to 5 channels (following both csteinmetz1 and BrechtDeMan). The recommendation includes weights for up to 24 channels. Is there any convention for how many channels to support? I hope this is helpful! looking forward to hearing from you. Pull Request resolved: https://github.com/pytorch/audio/pull/2472 Reviewed By: hwangjeff Differential Revision: D38389155 Pulled By: carolineechen fbshipit-source-id: fcc86d864c04ab2bedaa9acd941ebc4478ca6904
-
- 28 Jul, 2022 1 commit
-
-
Sean Kim authored
Summary: Add str to normalized parameter to enable frame_length based normalization to align with torch implementation of stft. Addresses issue https://github.com/pytorch/audio/issues/2104 Pull Request resolved: https://github.com/pytorch/audio/pull/2554 Reviewed By: carolineechen, mthrok Differential Revision: D38247554 Pulled By: skim0514 fbshipit-source-id: c243c7a6b8fda2a1e565cef4600f7c5a06baf602
-
- 23 Jun, 2022 1 commit
-
-
Summary: Meta: **If you take no action, this diff will be automatically accepted on 2022-06-23.** (To remove yourself from auto-accept diffs and just let them all land, add yourself to [this Butterfly rule](https://www.internalfb.com/butterfly/rule/904302247110220)) Produced by `tools/arcanist/lint/codemods/black-fbsource`. #nocancel Rules run: - CodemodTransformerSimpleShell Config Oncall: [lint](https://our.intern.facebook.com/intern/oncall3/?shortname=lint) CodemodConfig: [CodemodConfigFBSourceBlackLinter](https://www.internalfb.com/code/www/flib/intern/codemod_service/config/fbsource_arc_f/CodemodConfigFBSourceBlackLinter.php) ConfigType: php Sandcastle URL: https://www.internalfb.com/intern/sandcastle/job/13510799586951394/ This diff was automatically created with CodemodService. To learn more about CodemodService, check out the [CodemodService wiki](https://fburl.com/CodemodService). _____ ## Questions / Comments / Feedback? **[Click here to give feedback about this diff](https://www.internalfb.com/codemod_service/feedback?sandcastle_job_id=13510799586951394).** * Returning back to author or abandoning this diff will only cause the diff to be regenerated in the future. * Do **NOT** post in the CodemodService Feedback group about this specific diff. drop-conflicts Reviewed By: adamjernst Differential Revision: D37375235 fbshipit-source-id: 3d7eb39e5c0539a78d1412f37562dec90b0fc759
-
- 03 Jun, 2022 1 commit
-
-
Sean Kim authored
Summary: For test files where applicable, removed manual seeds where applicable. Refactoring https://github.com/pytorch/audio/issues/2267 Pull Request resolved: https://github.com/pytorch/audio/pull/2436 Reviewed By: carolineechen Differential Revision: D36896854 Pulled By: skim0514 fbshipit-source-id: 7b4dd8a8dbfbef271f5cc56564dc83a760407e6c
-
- 01 Jun, 2022 1 commit
-
-
Sean Kim authored
Summary: Bringing in move seed commit from previous open commit https://github.com/pytorch/audio/issues/2267. Organizes seed to utils. Pull Request resolved: https://github.com/pytorch/audio/pull/2425 Reviewed By: carolineechen, nateanl Differential Revision: D36787599 Pulled By: skim0514 fbshipit-source-id: 37a0d632d13d4336a830c4b98bdb04828ed88c20
-
- 23 May, 2022 1 commit
-
-
Zhaoheng Ni authored
Summary: - The multi-channel functions only support complex-valued tensors for spectrogram and PSD matrices. - The mask can be real-valued or complex-valued, hence there is no explicit assertion for mask. - The shape of input Tensors need to be verified before the computation. For example, the shape of PSD matrix must be `(..., freq, channel, channel)`, the shape of the mask must be `(..., freq, time)`, etc. - The autograd unittest of `apply_beamforming` has wrong dimensions for beamform_weights detected by the assertion check. FIx it in this PR. Pull Request resolved: https://github.com/pytorch/audio/pull/2401 Reviewed By: carolineechen Differential Revision: D36597689 Pulled By: nateanl fbshipit-source-id: 6ad1adebe3726851cc1d865650bdf177a98985f6
-
- 15 May, 2022 1 commit
-
-
John Reese authored
Summary: Applies new import merging and sorting from µsort v1.0. When merging imports, µsort will make a best-effort to move associated comments to match merged elements, but there are known limitations due to the diynamic nature of Python and developer tooling. These changes should not produce any dangerous runtime changes, but may require touch-ups to satisfy linters and other tooling. Note that µsort uses case-insensitive, lexicographical sorting, which results in a different ordering compared to isort. This provides a more consistent sorting order, matching the case-insensitive order used when sorting import statements by module name, and ensures that "frog", "FROG", and "Frog" always sort next to each other. For details on µsort's sorting and merging semantics, see the user guide: https://usort.readthedocs.io/en/stable/guide.html#sorting Reviewed By: lisroach Differential Revision: D36402214 fbshipit-source-id: b641bfa9d46242188524d4ae2c44998922a62b4c
-
- 10 May, 2022 1 commit
-
-
Zhaoheng Ni authored
Summary: When computing the MVDR beamforming weights using the power iteration method, the PSD matrix of noise can be applied with diagonal loading to improve the robustness. This is also applicable to computing the RTF matrix (See https://github.com/espnet/espnet/blob/master/espnet2/enh/layers/beamformer.py#L614 as an example). This also aligns with current `torchaudio.transforms.MVDR` module to keep the consistency. This PR adds the `diagonal_loading` argument with `True` as default value to `torchaudio.functional.rtf_power`. Pull Request resolved: https://github.com/pytorch/audio/pull/2369 Reviewed By: carolineechen Differential Revision: D36204130 Pulled By: nateanl fbshipit-source-id: 93a58d5c2107841a16c4e32f0c16ab0d6b2d9420
-
- 08 Apr, 2022 1 commit
-
-
moto authored
Summary: Add badges of supported properties and devices to functionals and transforms. This commit adds `.. devices::` and `.. properties::` directives to sphinx. APIs with these directives will have badges (based off of shields.io) which link to the page with description of these features. Continuation of https://github.com/pytorch/audio/issues/2316 Excluded dtypes for further improvement, and actually added badges to most of functional/transforms. Pull Request resolved: https://github.com/pytorch/audio/pull/2321 Reviewed By: hwangjeff Differential Revision: D35489063 Pulled By: mthrok fbshipit-source-id: f68a70ebb22df29d5e9bd171273bd19007a81762
-
- 26 Feb, 2022 1 commit
-
-
Zhaoheng Ni authored
Summary: This PR adds ``apply_beamforming`` method to ``torchaudio.functional``. The method employs the beamforming weight to the multi-channel noisy spectrum to obtain the single-channel enhanced spectrum. The input arguments are the complex-valued beamforming weight Tensor and the multi-channel noisy spectrum. Pull Request resolved: https://github.com/pytorch/audio/pull/2232 Reviewed By: mthrok Differential Revision: D34474561 Pulled By: nateanl fbshipit-source-id: 2910251a8f111e65375dfb50495b6a415113f06d
-
- 25 Feb, 2022 5 commits
-
-
Zhaoheng Ni authored
Summary: This PR adds ``rtf_power`` method to ``torchaudio.functional``. The method computes the relative transfer function (RTF) or the steering vector by [the power iteration method](https://onlinelibrary.wiley.com/doi/abs/10.1002/zamm.19290090206). [This paper](https://arxiv.org/pdf/2011.15003.pdf) describes the power iteration method in English. The input arguments are the complex-valued power spectral density (PSD) matrix of the target speech, PSD matrix of noise, int or one-hot Tensor to indicate the reference channel, number of iterations, respectively. Pull Request resolved: https://github.com/pytorch/audio/pull/2231 Reviewed By: mthrok Differential Revision: D34474503 Pulled By: nateanl fbshipit-source-id: 47011427ec4373f808755f0e8eff1efca57655eb
-
Zhaoheng Ni authored
Summary: This PR adds `rtf_evd` method to `torchaudio.functional`. The method computes the relative transfer function (RTF) or the steering vector by eigenvalue decomposition. The input argument is the power spectral density (PSD) matrix of the target speech. Pull Request resolved: https://github.com/pytorch/audio/pull/2230 Reviewed By: mthrok Differential Revision: D34474188 Pulled By: nateanl fbshipit-source-id: 888df4b187608ed3c2b7271b34d2231cdabb0134
-
Zhaoheng Ni authored
Summary: This PR adds ``mvdr_weights_rtf`` method to ``torchaudio.functional``. It computes the MVDR weight matrix based on the solution that applies relative transfer function (RTF). See [the paper](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.725.673&rep=rep1&type=pdf) for the reference. The input arguments are the complex-valued RTF Tensor of the target speech, power spectral density (PSD) matrix of noise, int or one-hot Tensor to indicate the reference channel, respectively. Pull Request resolved: https://github.com/pytorch/audio/pull/2229 Reviewed By: mthrok Differential Revision: D34474119 Pulled By: nateanl fbshipit-source-id: 2d6f62cd0858f29ed6e4e03c23dcc11c816204e2
-
Zhaoheng Ni authored
Summary: This PR adds ``mvdr_weights_souden`` method to ``torchaudio.functional``. It computes the MVDR weight matrix based on the solution proposed by [``Souden et, al.``](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.725.673&rep=rep1&type=pdf). The input arguments are the complex-valued power spectral density (PSD) matrix of the target speech, PSD matrix of noise, int or one-hot Tensor to indicate the reference channel, respectively. Pull Request resolved: https://github.com/pytorch/audio/pull/2228 Reviewed By: mthrok Differential Revision: D34474018 Pulled By: nateanl fbshipit-source-id: 725df812f8f6e6cc81cc37e8c3cb0da2ab3b74fb
-
Zhaoheng Ni authored
Summary: This PR adds ``psd`` method to ``torchaudio.functional``. It computes the power spectral density (PSD) matrix of the complex-valued spectrum. The method also supports normalization of Time-Frequency mask. Pull Request resolved: https://github.com/pytorch/audio/pull/2227 Reviewed By: mthrok Differential Revision: D34473908 Pulled By: nateanl fbshipit-source-id: c1cfc584085d77881b35d41d76d39b26fca1dda9
-
- 17 Feb, 2022 1 commit
-
-
Zhaoheng Ni authored
Summary: In batch_consistency tests, the `assert_batch_consistency` method only accepts single Tensor, which is not applicable to some methods. For example, `lfilter` and `filtfilt` requires three Tensors as the arguments, hence they don't follow `assert_batch_consistency` in the tests. This PR refactors the test to accept a tuple of Tensors which have `batch` dimension. For the other arguments like `int` or `str`, they are given as `*args` after the tuple. Pull Request resolved: https://github.com/pytorch/audio/pull/2245 Reviewed By: mthrok Differential Revision: D34273035 Pulled By: nateanl fbshipit-source-id: 0096b4f062fb4e983818e5374bed6efc7b15b056
-
- 16 Feb, 2022 2 commits
-
-
Zhaoheng Ni authored
Summary: In torchscript_consistency tests, the `func` in each test method only accepts one `tensor` as the argument, for the other arguments of `F.xyz` method, they need to be defined inside the `func`. If there is no `Tensor` argument in `F.xzy`, the tests use a `dummy` tensor which is not used anywhere. In this PR, we refactor ``_assert_consistency`` and ``_assert_consistency_complex`` to accept a tuple of inputs instead of just one `tensor`. Pull Request resolved: https://github.com/pytorch/audio/pull/2246 Reviewed By: carolineechen Differential Revision: D34273057 Pulled By: nateanl fbshipit-source-id: a3900edb3b2c58638e513e1490279d771ebc3d0b
-
Zhaoheng Ni authored
Summary: In autograd tests, to guarantee the precision, the dtype of Tensors are converted to `torch.float64` if they are real. However, the complex dtype is not considered. This PR adds `self.complex_dtype` support to the inputs. Pull Request resolved: https://github.com/pytorch/audio/pull/2244 Reviewed By: mthrok Differential Revision: D34272998 Pulled By: nateanl fbshipit-source-id: e8698a74d7b8d99ee0fcb5f5cb5f2ffc8c80b9b5
-
- 09 Feb, 2022 1 commit
-
-
hwangjeff authored
Summary: Yesterday's release of librosa 0.9.0 made args keyword-only and changed default padding from "reflect" to "zero" for some functions. This PR adjusts callsites in our tutorials and tests accordingly. Pull Request resolved: https://github.com/pytorch/audio/pull/2208 Reviewed By: mthrok Differential Revision: D34099793 Pulled By: hwangjeff fbshipit-source-id: 4e2642cdda8aae6d0a928befaf1bbb3873d229bc
-
- 29 Dec, 2021 1 commit
-
-
hwangjeff authored
Summary: Adds parameter `p` to `TimeMasking` to allow for enforcing an upper bound on the proportion of time steps that it can mask. This behavior is consistent with the specifications provided in the SpecAugment paper (https://arxiv.org/abs/1904.08779). Pull Request resolved: https://github.com/pytorch/audio/pull/2090 Reviewed By: carolineechen Differential Revision: D33344772 Pulled By: hwangjeff fbshipit-source-id: 6ff65f5304e489fa1c23e15c3d96b9946229fdcf
-
- 23 Dec, 2021 1 commit
-
-
Joao Gomes authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2096 run: `arc lint --apply-patches --paths-cmd 'hg files -I "./**/*.py"'` Reviewed By: mthrok Differential Revision: D33297351 fbshipit-source-id: 7bf5956edf0717c5ca90219f72414ff4eeaf5aa8
-
- 04 Nov, 2021 1 commit
-
-
Caroline Chen authored
-
- 03 Nov, 2021 2 commits
-
-
moto authored
Following the plan #1337, this commit drops the support for pseudo complex type from `F.phase_vocoder` and `T.TimeStretch`.
-
moto authored
Following the plan #1337, this commit drops the support for pseudo complex type from `F.spectrogram` and `T.Spectrogram`. It also deprecates the use of `return_complex` argument.
-
- 28 Oct, 2021 1 commit
-
-
S Harish authored
-
- 13 Oct, 2021 1 commit
-
-
Caroline Chen authored
-
- 02 Sep, 2021 1 commit
-
-
jayleverett authored
* put output tensor on device in `get_whitenoise()` * Update `get_spectrogram()` so that window uses same device as waveform * put window on proper device in `test_griffinlim()`
-
- 27 Aug, 2021 1 commit
-
-
moto authored
Introduce a helper function `torch_script` that performs scripting in the recommended way.
-
- 20 Aug, 2021 1 commit
-
-
hwangjeff authored
* Add basic filtfilt implementation * Add filtfilt to functional package; add tests Co-authored-by:V G <vladislav.goncharenko@phystech.edu>
-
- 19 Aug, 2021 1 commit
-
-
Caroline Chen authored
-
- 11 Aug, 2021 1 commit
-
-
nateanl authored
- Provide InverseSpectrogram module that corresponds to Spectrogram module - Add length parameter to the forward method in transforms Co-authored-by:
dgenzel <dgenzel@fb.com> Co-authored-by:
Zhaoheng Ni <zni@fb.com>
-
- 10 Aug, 2021 1 commit
-
-
Chin-Yun Yu authored
-
- 02 Aug, 2021 1 commit
-
-
Joel Frank authored
- Renamed torchaudio.functional.create_fb_matrix to torchaudio.functional.melscale_fbanks. - Added interface with a warning for create_fb_matrix
-
- 29 Jul, 2021 1 commit
-
-
Joel Frank authored
Summary: - Add linear_fbank method - Add LFCC in transforms
-
- 21 Jul, 2021 1 commit
-
-
Chin-Yun Yu authored
-
- 16 Jul, 2021 1 commit
-
-
nateanl authored
-
- 25 Jun, 2021 1 commit
-
-
yangarbiter authored
-
- 04 Jun, 2021 2 commits
-
-
moto authored
* [BC-Breaking] Default to native complex type when returning raw spectrogram Part of https://github.com/pytorch/audio/issues/1337 . - This code changes the return type of spectrogram to be native complex dtype, when (and only when) returning raw (complex-valued) spectrogram. - Change `return_complex=False` to `return_complex=True` in spectrogram ops. - `return_complex` is only effective when `power` is `None`. It is ignored for cases where `power` is not `None`. Because the returned Tensor is power spectrogram, which is real-valued Tensors.
-
Caroline Chen authored
-