- 16 Sep, 2022 2 commits
-
-
moto authored
Summary: * Adopts `:autosummary:` in decoder module doc * Hide the constructor signature of `CTCDecoder` as `ctc_decoder` function is the one client code is supposed to be using. * Introduce `children` property to `CTCDecoderLMState` otherwise it does not show up in the doc. https://output.circle-artifacts.com/output/job/7aac5eb9-7d2d-4f63-bcdf-83a6f40b4e5a/artifacts/0/docs/models.decoder.html <img width="748" alt="Screen Shot 2022-09-16 at 5 23 22 PM" src="https://user-images.githubusercontent.com/855818/190592409-0c2ec8a4-d2cf-4d76-a965-8a570faaeb1a.png"> https://output.circle-artifacts.com/output/job/7aac5eb9-7d2d-4f63-bcdf-83a6f40b4e5a/artifacts/0/docs/generated/torchaudio.models.decoder.CTCDecoder.html#torchaudio.models.decoder.CTCDecoder <img width="723" alt="Screen Shot 2022-09-16 at 5 23 53 PM" src="https://user-images.githubusercontent.com/855818/190592501-3fad1e07-ae3e-44f5-93be-f33181025390.png"> Pull Request resolved: https://github.com/pytorch/audio/pull/2684 Reviewed By: carolineechen Differential Revision: D39574272 Pulled By: mthrok fbshipit-source-id: d977660bd46f5cf98c535adbf2735be896b28773
-
moto authored
Summary: This commit adopts the `:autosummary:` directive for the `torchaudio.io` module. It adds a table of contents at the `torchaudio.io` level. https://output.circle-artifacts.com/output/job/282089d1-c120-4d22-809f-0e0ac0947c37/artifacts/0/docs/io.html <img width="1094" alt="Screen Shot 2022-09-16 at 7 33 32 AM" src="https://user-images.githubusercontent.com/855818/190520248-27e469f8-7689-4dc2-b591-7b3f08bb4dff.png"> https://output.circle-artifacts.com/output/job/282089d1-c120-4d22-809f-0e0ac0947c37/artifacts/0/docs/generated/torchaudio.io.StreamReader.html#torchaudio.io.StreamReader <img width="1108" alt="Screen Shot 2022-09-16 at 7 33 59 AM" src="https://user-images.githubusercontent.com/855818/190520292-d090fed0-2f18-4961-b9f3-9e4808fd437e.png"> Pull Request resolved: https://github.com/pytorch/audio/pull/2681 Reviewed By: carolineechen Differential Revision: D39560459 Pulled By: mthrok fbshipit-source-id: 3de5f22b8d8d0834dfd8bac8619fbfaa44c5f4dd
-
- 15 Sep, 2022 3 commits
-
-
moto authored
Summary: Previous versions of Sphinx reported the wrong path for the return class. This issue is fixed in the latest Sphinx, which allows us to remove the patch we apply in `conf.py`. This is essential for the adoption of `:autosummary:`, as it won't render correctly with the patch. https://output.circle-artifacts.com/output/job/19d93ede-08de-4b9e-9d66-67ca5dab964e/artifacts/0/docs/pipelines.html Pull Request resolved: https://github.com/pytorch/audio/pull/2678 Reviewed By: carolineechen Differential Revision: D39509447 Pulled By: mthrok fbshipit-source-id: e104bc6a87f32cba6c549a9fe8f2d1e489ee27e4
-
moto authored
Summary: Follows the change related to the move to the Linux Foundation. (We are still pinning the theme version so that our customization does not break randomly.) Pull Request resolved: https://github.com/pytorch/audio/pull/2679 Reviewed By: carolineechen Differential Revision: D39531566 Pulled By: mthrok fbshipit-source-id: 64353577d05f9dbda00dd9d10b9ebcedddfdce5b
-
moto authored
Summary: Preparation for the adoption of `autosummary`. Replace `:footcite:` with `:cite:` and introduce a dedicated reference page, as `:footcite:` does not work well with `autosummary`. Example: https://output.circle-artifacts.com/output/job/4da47ba6-d9c7-418e-b5b0-e9f8a146a6c3/artifacts/0/docs/datasets.html#cmuarctic https://output.circle-artifacts.com/output/job/4da47ba6-d9c7-418e-b5b0-e9f8a146a6c3/artifacts/0/docs/references.html Pull Request resolved: https://github.com/pytorch/audio/pull/2676 Reviewed By: carolineechen Differential Revision: D39509431 Pulled By: mthrok fbshipit-source-id: e6003dd01ec3eff3d598054690f61de8ee31ac9a
-
- 14 Sep, 2022 4 commits
-
-
Zhaoheng Ni authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2675 Reviewed By: carolineechen Differential Revision: D39515996 Pulled By: nateanl fbshipit-source-id: 5824375f6a758af21b6ad6c635dd06081663644f
-
moto authored
Summary: Currently, the way feature badges are generated assumes that the documentation pages and the supported features page are at the same level relative to the root. This does not work once we introduce `:autosummary:`, which generates individual documentation pages one level below. This commit changes it so that links to the supported features page are properly relative to the level of the documentation page. There is no change in appearance from this commit. Pull Request resolved: https://github.com/pytorch/audio/pull/2677 Reviewed By: carolineechen Differential Revision: D39507451 Pulled By: mthrok fbshipit-source-id: f18da4201f0eb747586be21c8bd9a958217aebc2
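The relative-link idea described in this commit can be illustrated with a minimal sketch. It is not the code from the PR: the helper `relative_link` and the page names `supported_features.html` and `io.html` are assumptions made for illustration only.

```python
import posixpath


def relative_link(page: str, target: str) -> str:
    """Return the link to `target` as seen from `page`.

    Both arguments are paths relative to the docs root. They are anchored
    at "/" so the result does not depend on the current working directory.
    """
    page_dir = posixpath.join("/", posixpath.dirname(page))
    return posixpath.relpath(posixpath.join("/", target), start=page_dir)


# Hypothetical page names, used only for illustration.
top_level = relative_link("io.html", "supported_features.html")
nested = relative_link(
    "generated/torchaudio.io.StreamReader.html", "supported_features.html"
)
print(top_level)
print(nested)
```

A page at the root links to the target directly, while a page generated one level below needs a `../` prefix, which is the case a fixed same-level link would miss.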
-
Caroline Chen authored
Summary: Modifications to CTC decoder LM docstrings on top of https://github.com/pytorch/audio/issues/2657 Pull Request resolved: https://github.com/pytorch/audio/pull/2658 Reviewed By: mthrok Differential Revision: D39468921 Pulled By: carolineechen fbshipit-source-id: c5497cc2fa22fb98a304d037e27c91bf68a9ad6a
-
Caroline Chen authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2673 Reviewed By: mthrok Differential Revision: D39507612 Pulled By: carolineechen fbshipit-source-id: 3a9ee53f72cabd6e3085c76867017be4a6ed7f53
-
- 13 Sep, 2022 1 commit
-
-
Zhaoheng Ni authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2669 Reviewed By: carolineechen, mthrok Differential Revision: D39433560 Pulled By: nateanl fbshipit-source-id: 5b652b31c00badb37b27a32ac25b422a5bcc74cb
-
- 12 Sep, 2022 1 commit
-
-
Caroline Chen authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2668 Reviewed By: nateanl, mthrok Differential Revision: D39433671 Pulled By: carolineechen fbshipit-source-id: 3545a5b4019832861c34fd8c05e5f8600fd80d5c
-
- 07 Sep, 2022 1 commit
-
-
moto authored
Summary: 1. Override class `__module__` attribute in `conf.py` so that no manual override is necessary 2. Fix SourceSeparationBundle member attribute Pull Request resolved: https://github.com/pytorch/audio/pull/2656 Reviewed By: carolineechen Differential Revision: D39293053 Pulled By: mthrok fbshipit-source-id: 2b8d6be1aee517d0e692043c26ac2438a787adc6
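As a rough sketch of what overriding `__module__` can look like (not the actual `conf.py` code from the PR), the snippet below rewrites the attribute for objects re-exported from a public module. The helper `override_module` and the module names `pkg.models` and `pkg.models._impl` are hypothetical.

```python
import types


def override_module(public_module: types.ModuleType, names: list) -> None:
    """Point `__module__` of each re-exported object at the public module."""
    for name in names:
        obj = getattr(public_module, name)
        obj.__module__ = public_module.__name__


# Hypothetical stand-ins for a private implementation module and the
# public module that re-exports its classes.
impl = types.ModuleType("pkg.models._impl")
public = types.ModuleType("pkg.models")


class Decoder:
    pass


Decoder.__module__ = impl.__name__  # where the class "really" lives
public.Decoder = Decoder            # re-exported under the public name

override_module(public, ["Decoder"])
print(Decoder.__module__)
```

With the attribute rewritten in one place, documentation tools that read `__module__` see the public path without a manual override per class.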
-
- 24 Aug, 2022 1 commit
-
-
moto authored
Summary: This commit adds the FFmpeg-based encoder class StreamWriter. StreamWriter is essentially the counterpart of the StreamReader class, and it supports: * Encoding audio / still image / video * Exporting to local file / streaming protocol / devices, etc. * File-like object support (in a later commit) * HW video encoding (in a later commit) See also: https://fburl.com/gslide/z85kn5a9 (Meta internal) Pull Request resolved: https://github.com/pytorch/audio/pull/2628 Reviewed By: nateanl Differential Revision: D38816650 Pulled By: mthrok fbshipit-source-id: a9343b0d55755e186971dc96fb86eb52daa003c8
-
- 22 Aug, 2022 1 commit
-
-
moto authored
Summary: The minor release fixes some gallery issues, which allows us to remove some of the customization we had in https://github.com/pytorch/audio/issues/2629 https://output.circle-artifacts.com/output/job/553a9b98-8260-4cb4-a681-20ef97d2c33e/artifacts/0/docs/pipelines.html#torchaudio.pipelines.Wav2Vec2ASRBundle Pull Request resolved: https://github.com/pytorch/audio/pull/2638 Reviewed By: carolineechen, nateanl Differential Revision: D38909097 Pulled By: mthrok fbshipit-source-id: 78346d93b54fca2a19b28991c224324ef53221c9
-
- 18 Aug, 2022 2 commits
-
-
moto authored
Summary: This commit fixes the issue with the recent Sphinx-Gallery update. Also it pins the versions of Sphinx-related packages. Before: <img width="256" alt="Screen Shot 2022-08-17 at 10 02 23 PM" src="https://user-images.githubusercontent.com/855818/185140952-28f2d98a-b586-424c-a003-b69089f48eb9.png"> After: https://user-images.githubusercontent.com/855818/185271889-bd4f86a0-986b-43bb-8121-bd77750d74f0.mov Pull Request resolved: https://github.com/pytorch/audio/pull/2629 Reviewed By: carolineechen Differential Revision: D38816417 Pulled By: mthrok fbshipit-source-id: 11ee3f9121d9a302772ee1f461dacae52eb28852
-
moto authored
Summary: Resolves the following warning ``` /torchaudio/docs/source/transforms.rst:94: WARNING: Title underline too short. :hidden:`Loudness` ----------------- ``` Pull Request resolved: https://github.com/pytorch/audio/pull/2627 Reviewed By: carolineechen Differential Revision: D38814802 Pulled By: mthrok fbshipit-source-id: 5dfaf2d7bae22dba0f4a14f04ca63f28d6b2a749
-
- 15 Aug, 2022 2 commits
-
-
moto authored
Summary: The link to the version selector was an absolute link, which was a trap when reviewing gh-pages deployments from a fork. This commit changes it to a relative link. Pull Request resolved: https://github.com/pytorch/audio/pull/2605 Test Plan: - https://mthrok.github.io/audio/main/index.html -> click version selector -> https://mthrok.github.io/audio/versions.html - https://mthrok.github.io/audio/0.12.1/index.html -> click version selector -> https://pytorch.org/audio/versions.html Reviewed By: carolineechen, nateanl Differential Revision: D38695645 Pulled By: mthrok fbshipit-source-id: 91132ac19b8c61f39d304a162435b9c6599ef2b2
-
Zhaoheng Ni authored
Summary: `ctc_decoder` has become beta; remove it from the prototype documentation. Pull Request resolved: https://github.com/pytorch/audio/pull/2617 Reviewed By: hwangjeff Differential Revision: D38706869 Pulled By: nateanl fbshipit-source-id: 41679f4e65a584b6b882af4551a50123f1dcef02
-
- 11 Aug, 2022 1 commit
-
-
hwangjeff authored
Summary: Adds function `add_noise`, which computes and returns the sum of a waveform and scaled noise. Pull Request resolved: https://github.com/pytorch/audio/pull/2608 Reviewed By: nateanl Differential Revision: D38557141 Pulled By: hwangjeff fbshipit-source-id: 1457fa213f43ca5b4333d3c7580971655d4260a0
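A minimal pure-Python sketch of "waveform plus scaled noise" follows. It is not the torchaudio implementation: the name `add_noise_sketch`, the use of plain lists, and the choice to derive the scale from a target signal-to-noise ratio in dB are assumptions made for illustration.

```python
import math


def add_noise_sketch(waveform, noise, snr_db):
    """Return waveform + scale * noise, with scale chosen to hit `snr_db`.

    Assumes two equal-length sequences of floats and non-zero noise energy.
    """
    if len(waveform) != len(noise):
        raise ValueError("waveform and noise must have the same length")
    signal_energy = sum(x * x for x in waveform)
    noise_energy = sum(n * n for n in noise)
    if noise_energy == 0:
        raise ValueError("noise must have non-zero energy")
    # Solve signal_energy / (scale**2 * noise_energy) == 10 ** (snr_db / 10).
    scale = math.sqrt(signal_energy / (noise_energy * 10 ** (snr_db / 10)))
    return [x + scale * n for x, n in zip(waveform, noise)]


waveform = [1.0, -1.0, 1.0, -1.0]
noise = [0.5, 0.5, -0.5, -0.5]
mixed = add_noise_sketch(waveform, noise, snr_db=0.0)
print(mixed)
```

At 0 dB the scaled noise carries the same energy as the waveform; larger `snr_db` values shrink the noise contribution.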
-
- 05 Aug, 2022 1 commit
-
-
hwangjeff authored
Summary: Adds functions `convolve` and `fftconvolve`, which compute the convolution of two tensors along their trailing dimension. The former performs the convolution directly, whereas the latter performs it using FFT. Pull Request resolved: https://github.com/pytorch/audio/pull/2602 Reviewed By: nateanl, mthrok Differential Revision: D38450771 Pulled By: hwangjeff fbshipit-source-id: b2d1e063ba21eafeddf317d60749e7120b14292b
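The difference between the direct and the FFT-based approach can be sketched in pure Python for 1-D sequences. This is an illustration under stated assumptions, not the torchaudio code: the names `convolve_direct` and `convolve_fft` are hypothetical, inputs are non-empty lists rather than tensors, and the FFT is a simple radix-2 recursion with zero-padding to a power of two.

```python
import cmath


def convolve_direct(x, y):
    """Full convolution by summing all pairwise products (O(N * M))."""
    out = [0.0] * (len(x) + len(y) - 1)
    for i, xi in enumerate(x):
        for j, yj in enumerate(y):
            out[i + j] += xi * yj
    return out


def _fft(values, inverse=False):
    """Recursive radix-2 FFT; len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return list(values)
    even = _fft(values[0::2], inverse)
    odd = _fft(values[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def convolve_fft(x, y):
    """Full convolution via pointwise multiplication in the frequency domain."""
    size = len(x) + len(y) - 1
    n = 1
    while n < size:  # zero-pad to the next power of two
        n *= 2
    fx = _fft([complex(v) for v in x] + [0j] * (n - len(x)))
    fy = _fft([complex(v) for v in y] + [0j] * (n - len(y)))
    product = [a * b for a, b in zip(fx, fy)]
    result = _fft(product, inverse=True)
    # The unnormalised inverse transform needs a 1/n factor.
    return [v.real / n for v in result[:size]]


x = [1.0, 2.0, 3.0]
y = [0.0, 1.0, 0.5]
direct = convolve_direct(x, y)
via_fft = convolve_fft(x, y)
print(direct)
```

Both routes give the same result up to floating-point error; the FFT route tends to pay off when both inputs are long.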
-
- 03 Aug, 2022 2 commits
-
-
Sean Kim authored
Summary: Add new model pretrained weights and tests Pull Request resolved: https://github.com/pytorch/audio/pull/2601 Reviewed By: carolineechen, nateanl Differential Revision: D38396673 Pulled By: skim0514 fbshipit-source-id: e06f97d28508543bc18e671344386a947bc870c1
-
bshall authored
Summary: I took a stab at implementing the ITU-R BS.1770-4 loudness recommendation (closes https://github.com/pytorch/audio/issues/1205). To give some more details: - I've implemented K-weighting following csteinmetz1 instead of BrechtDeMan since it fit well with torchaudio's already implemented filters (`treble_biquad` and `highpass_biquad`). - I've added four audio files to test compliance with the recommendation. These are linked in [this pdf](https://www.itu.int/dms_pub/itu-r/opb/rep/R-REP-BS.2217-2-2016-PDF-E.pdf). There are many more test files there but I didn't want to bog down the assets directory with too many files. Let me know if I should add or remove anything. - I've kept many of the constants internal to the function (e.g. the block duration, overlap, and the absolute threshold gamma). I'm not sure if these should be exposed in the signature. - I've implemented support for up to 5 channels (following both csteinmetz1 and BrechtDeMan). The recommendation includes weights for up to 24 channels. Is there any convention for how many channels to support? I hope this is helpful! Looking forward to hearing from you. Pull Request resolved: https://github.com/pytorch/audio/pull/2472 Reviewed By: hwangjeff Differential Revision: D38389155 Pulled By: carolineechen fbshipit-source-id: fcc86d864c04ab2bedaa9acd941ebc4478ca6904
-
- 29 Jul, 2022 1 commit
-
-
Zhaoheng Ni authored
Summary: - The "speech + noise" mixture still has a high SNR, which can't show the effectiveness of MVDR beamforming. To make the task more challenging, amplify the noise waveform to reduce the SNR of the mixture speech. - Show the Si-SNR score of the mixture speech when visualizing the mixture spectrogram. - Fix the figure in the `rtf_power` subsection. - The description of enhanced spectrogram by `rtf_power` is wrong. Correct it to `rtf_power`. - Print PESQ, STOI, and SDR metric scores. Pull Request resolved: https://github.com/pytorch/audio/pull/2527 Reviewed By: mthrok Differential Revision: D38190218 Pulled By: nateanl fbshipit-source-id: 39562850a67f58a16e0a2866ed95f78c3f4dc7de
-
- 28 Jul, 2022 1 commit
-
-
Sean Kim authored
Summary: Add tutorial Python file. Draft PR; will continue to modify according to feedback. Future plan: modify spectrogram and bottom audio design and work on finding the best audio track and segments. Pull Request resolved: https://github.com/pytorch/audio/pull/2572 Reviewed By: carolineechen, nateanl, mthrok Differential Revision: D38234001 Pulled By: skim0514 fbshipit-source-id: fe9207864f354dec5cf5ff52bf7d9ddcf4a001d5
-
- 26 Jul, 2022 1 commit
-
-
Sean Kim authored
Summary: Created new branch and brought in commits due to rebasing issues, resolved conflicts on new branch, close old branch. Pull Request resolved: https://github.com/pytorch/audio/pull/2565 Reviewed By: nateanl, mthrok Differential Revision: D38131189 Pulled By: skim0514 fbshipit-source-id: 96531480cf50562944abb28d70879f21b4609f15
-
- 25 Jul, 2022 1 commit
-
-
moto authored
Summary: This commit fixes the build_docs job timeout by pinning `resampy=0.2.2`. For some mysterious reason, `resampy=0.3.1` causes a slowdown of unrelated code. https://github.com/bmcfee/resampy/issues/106 Pull Request resolved: https://github.com/pytorch/audio/pull/2543 Reviewed By: carolineechen Differential Revision: D38115003 Pulled By: mthrok fbshipit-source-id: 67cd1c73dd4adb3091e0b88aaf5c31de0dd4b87e
-
- 22 Jul, 2022 1 commit
-
-
Zhaoheng Ni authored
Summary: - Add documentation page for `SourceSeparationBundle` and `CONVTASNET_BASE_LIBRI2MIX`. - Add citation of the Libri2Mix dataset in the bundle documentation. - The URL in the integration test should be joined with a slash instead of `os.path.join`, as the latter will fail on Windows. Change it to an f-string. Pull Request resolved: https://github.com/pytorch/audio/pull/2559 Reviewed By: carolineechen Differential Revision: D38036116 Pulled By: nateanl fbshipit-source-id: 736732805191113955badfec3955e2e24e8f4836
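The Windows problem with `os.path.join` can be reproduced on any platform with `ntpath`, the module that backs `os.path` on Windows. The base URL and file name below are placeholders, not the ones used in the integration test.

```python
import ntpath

# Placeholder values for illustration only.
base = "https://example.com/models"
name = "model.pt"

# ntpath is what os.path resolves to on Windows; it joins with a backslash,
# which produces an invalid URL.
windows_joined = ntpath.join(base, name)

# An f-string always uses a forward slash, regardless of platform.
url = f"{base}/{name}"

print(windows_joined)
print(url)
```

URLs are not filesystem paths, so building them with string formatting (or `urllib.parse.urljoin`) keeps the separator independent of the operating system.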
-
- 19 Jul, 2022 1 commit
-
-
Sean Kim authored
Summary: Factory functions have been added to the HDemucs class, and the implementation is tested in the test files. Pull Request resolved: https://github.com/pytorch/audio/pull/2547 Reviewed By: carolineechen Differential Revision: D37948600 Pulled By: skim0514 fbshipit-source-id: 7ac4e4a71519450cfbbc24ff7d7e70521f676040
-
- 12 Jul, 2022 2 commits
-
-
moto authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2313 Reviewed By: carolineechen, nateanl Differential Revision: D37799552 Pulled By: mthrok fbshipit-source-id: 12e27fccb7098f3142e9ca0b748c71325cd324ee
-
Sean Kim authored
Summary: Draft PR with initial model implementation with minor changes from previous implementation Pull Request resolved: https://github.com/pytorch/audio/pull/2506 Reviewed By: nateanl Differential Revision: D37762671 Pulled By: skim0514 fbshipit-source-id: b7dc0a6ef725d6ae6d76c23c882623f7d339977c
-
- 07 Jul, 2022 1 commit
-
-
moto authored
Summary: Following the formatter changes that happened in fbcode, this commit updates the linter config. Pull Request resolved: https://github.com/pytorch/audio/pull/2389 Reviewed By: hwangjeff Differential Revision: D37659649 Pulled By: mthrok fbshipit-source-id: 1c52ff93f0b10cb2e7303d2ad13b2d65ffccfcb0
-
- 27 Jun, 2022 1 commit
-
-
Zhaoheng Ni authored
Summary: This PR adds two dataset classes of VoxCeleb1 corpus. - `VoxCeleb1Identification` Each data sample contains the waveform, sample rate, speaker id, and the file id. - `VoxCeleb1Verification` Each data sample contains a pair of waveforms, sample rate, the label indicating if they are from the same speaker, and the file ids. Pull Request resolved: https://github.com/pytorch/audio/pull/2349 Reviewed By: carolineechen Differential Revision: D35927921 Pulled By: nateanl fbshipit-source-id: 3e07ddd329178777698841565053eb59befe6449
-
- 21 Jun, 2022 1 commit
-
-
Sean Kim authored
Summary: Create dataset handler and tests for new dataset. Manually tested and unit tested to test validity. Pre-commit ran for style checks. Pull Request resolved: https://github.com/pytorch/audio/pull/2484 Reviewed By: carolineechen, nateanl Differential Revision: D37250556 Pulled By: skim0514 fbshipit-source-id: d2c8d73d22fd9d7282026265676f3eab1e178d51
-
- 20 Jun, 2022 1 commit
-
-
Caroline Chen authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2480 Reviewed By: nateanl Differential Revision: D37249571 Pulled By: carolineechen fbshipit-source-id: caefeec4253c91f2579655a0c1735edaeed51be9
-
- 08 Jun, 2022 2 commits
-
-
moto authored
Summary: https://output.circle-artifacts.com/output/job/75187a52-b0d8-4cac-89f3-24e10889a36a/artifacts/0/docs/hw_acceleration_tutorial.html 1. Update the HW decoding tutorial to include file-like objects 2. Add a note about unseekable objects in the streaming API tutorial Pull Request resolved: https://github.com/pytorch/audio/pull/2408 Reviewed By: hwangjeff Differential Revision: D36632702 Pulled By: mthrok fbshipit-source-id: 17be2fb8528cb1d2d1ee11901b6a95c512466feb
-
moto authored
Summary: The Streaming API tutorial has gotten long, so this commit splits it into two. Pull Request resolved: https://github.com/pytorch/audio/pull/2446 Reviewed By: hwangjeff Differential Revision: D36987513 Pulled By: mthrok fbshipit-source-id: 13e3aad74c0d0e654c39c0eeceffca1a00b0dac4
-
- 04 Jun, 2022 1 commit
-
-
moto authored
Summary: Undesired logs are one of the loudest UX complaints we get. Yet, loading media files involves uncertainty that is difficult to debug without debug logs. This commit introduces utility functions to configure the logging level so that we can ask users to enable it when they encounter an issue, while defaulting to the non-verbose option. Pull Request resolved: https://github.com/pytorch/audio/pull/2439 Reviewed By: hwangjeff, xiaohui-zhang Differential Revision: D36903763 Pulled By: mthrok fbshipit-source-id: f4ddd9915b13197c2a2eb97e965005b8b5b8d987
-
- 01 Jun, 2022 2 commits
-
-
Zhaoheng Ni authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2411 Reviewed By: carolineechen Differential Revision: D36663904 Pulled By: nateanl fbshipit-source-id: c6a7dd530c9cfbb58b7121ebe02db6ae293cc2d0
-
Caroline Chen authored
Summary: Move CTC beam search decoder out of prototype to new `torchaudio.models.decoder` module. hwangjeff mthrok any thoughts on the new module + naming, and if we should move rnnt beam search here as well?? Pull Request resolved: https://github.com/pytorch/audio/pull/2410 Reviewed By: mthrok Differential Revision: D36784521 Pulled By: carolineechen fbshipit-source-id: a2ec52f86bba66e03327a9af0c5df8bbefcd67ed
-
- 24 May, 2022 1 commit
-
-
moto authored
Summary: Follow-up of https://github.com/pytorch/audio/issues/2407, the <script> was not properly closed on pages other than tutorials Pull Request resolved: https://github.com/pytorch/audio/pull/2409 Reviewed By: carolineechen Differential Revision: D36632668 Pulled By: mthrok fbshipit-source-id: 9c0409a8011d77f8689e2dcdc1bd9844d3d31f79
-