- 18 Jul, 2023 1 commit
-
-
moto authored
Summary: Now that GPU video decoders are available in doc CI, we run the tutorials with GPU decoders. Pull Request resolved: https://github.com/pytorch/audio/pull/3478 Differential Revision: D47519672 Pulled By: mthrok fbshipit-source-id: 2f95243100e9c75e17c2b4d306da164f0e31f8f2
-
- 15 Jul, 2023 1 commit
-
-
moto authored
Summary: The nightly builds support FFmpeg versions 4, 5, and 6. Pull Request resolved: https://github.com/pytorch/audio/pull/3480 Differential Revision: D47482841 Pulled By: mthrok fbshipit-source-id: 88267f5e83ddc7b1e866b35e57a87b985e2c78c9
-
- 05 Jul, 2023 1 commit
-
-
Zhaoheng Ni authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3433 The current design of `forced_align` accepts a 2D Tensor for `log_probs` and a 1D Tensor for `targets`. To keep the API simple, this PR changes it to support only batched Tensors (a 3D Tensor for `log_probs` and a 2D Tensor for `targets`). Reviewed By: mthrok Differential Revision: D46657526 fbshipit-source-id: af17ec3f92f1a2c46dba91c6db2488a11de36f89
-
- 28 Jun, 2023 1 commit
-
-
moto authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3449 Differential Revision: D47094402 Pulled By: mthrok fbshipit-source-id: 43e6994604f0e6c06a5f19c5e8599e2ce12ae622
-
- 26 Jun, 2023 1 commit
-
-
moto authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3442 Differential Revision: D46797481 Pulled By: mthrok fbshipit-source-id: 3513037cbb8f2edb70fdab0fec5c7c554a697abe
-
- 21 Jun, 2023 1 commit
-
-
Xiaohui Zhang authored
Summary: Splitting the multilingual example part into another tutorial. Pull Request resolved: https://github.com/pytorch/audio/pull/3443 Reviewed By: mthrok Differential Revision: D46802844 Pulled By: xiaohui-zhang fbshipit-source-id: a7093053cac8b79d650d4f665db7fde2d8254998
-
- 15 Jun, 2023 1 commit
-
-
moto authored
Summary: * Fix backtrack visualization (the coordinate was off by one). * Add note about the simplification and the new align API * Explicitly handle SOS and EOS Pull Request resolved: https://github.com/pytorch/audio/pull/3440 Reviewed By: xiaohui-zhang Differential Revision: D46761282 Pulled By: mthrok fbshipit-source-id: b0b6c9754674e8e23543e9f002e29b55102c92f8
-
- 07 Jun, 2023 1 commit
-
-
moto authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3415 Differential Revision: D46526437 Pulled By: mthrok fbshipit-source-id: f78d19c19d7e68f67712412de35d9ed50f47263b
-
- 02 Jun, 2023 2 commits
-
-
moto authored
Summary: This commit removes the `compute_kaldi_pitch` function and the underlying Kaldi integration from torchaudio. The Kaldi pitch function was added in a short period of time by integrating the original Kaldi implementation, instead of reimplementing it in PyTorch. The Kaldi integration employed a hack that replaces Kaldi's base vector/matrix implementation with PyTorch Tensor so that there is only one BLAS library within torchaudio. We are now making torchaudio leaner, and the kaldi_pitch feature has not seen wide adoption, so we decided to remove it. See some of the discussion at https://github.com/pytorch/audio/issues/1269 Pull Request resolved: https://github.com/pytorch/audio/pull/3368 Differential Revision: D46406176 Pulled By: mthrok fbshipit-source-id: ee5e24d825188f379979ddccd680c7323b119b1e
-
moto authored
Summary: Replace sox_effects with `torchaudio.io.AudioEffector` 1. To showcase the new and better feature 2. To prepare for the upcoming removal of file-like object support Pull Request resolved: https://github.com/pytorch/audio/pull/3375 Reviewed By: nateanl Differential Revision: D46379016 Pulled By: mthrok fbshipit-source-id: 70f24b62494204949f327f6ac6c49f315c9ee315
-
- 31 May, 2023 1 commit
-
-
Jeff Hwang authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3379 Fixes `RNNTBeamSearch.infer`'s docstring and removes unused import from tutorial. Reviewed By: mthrok Differential Revision: D46227174 fbshipit-source-id: 7c1c3f05a6476cb0437622dea6f3ae6cb3ea9468
-
- 26 May, 2023 2 commits
-
-
atalman authored
Summary: This reverts commit d38a7854. This is a temporary revert to unblock the unit test migration from CircleCI to GitHub. Pull Request resolved: https://github.com/pytorch/audio/pull/3377 Reviewed By: mthrok Differential Revision: D46230498 Pulled By: atalman fbshipit-source-id: 000d8a9ca00750fc1ca61f4c2cdd6e930a5ce46d
-
Lakshmi Krishnan authored
Summary: This commit fixes the following issues affecting streaming decoding quality: 1. The `init_b` hypothesis is only regenerated from the blank token if no initial hypotheses are provided. 2. Allows the decoder to receive the top-K hypotheses to continue decoding from, instead of using just the top hypothesis at each decoding step. This dramatically affects decoding quality, especially for speech with long pauses and disfluencies. 3. Fixes some minor errors in shape checking for length. This also means that the resulting output is the entire transcript up until that time step, instead of just the incremental change in transcript. Pull Request resolved: https://github.com/pytorch/audio/pull/3295 Reviewed By: nateanl Differential Revision: D46216113 Pulled By: hwangjeff fbshipit-source-id: 8f7efae28dcca4a052f434ca55a2795c9e5ec0b0
-
- 23 May, 2023 1 commit
-
-
Xiaohui Zhang authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3356 Move the forced aligner tutorial to torchaudio, with some formatting changes. Reviewed By: mthrok Differential Revision: D46060238 fbshipit-source-id: d90e7db5669a58d1e9ef5c2ec3c6d175b4e394ec
-
- 21 May, 2023 2 commits
-
-
Moto Hira authored
Differential Revision: D45960556 Original commit changeset: 93f2271f7130 Original Phabricator Diff: D45960556 fbshipit-source-id: d22883fbcf9c5f2bb5d49274bcc194bdffaca72a
-
Xiaohui Zhang authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3351 Move the forced aligner tutorial to torchaudio, with some formatting changes. Reviewed By: vineelpratap, nateanl Differential Revision: D45960556 fbshipit-source-id: 93f2271f71307404e6a7732385cf7d646dc8ceaa
-
- 16 May, 2023 1 commit
-
-
moto authored
Summary: This commit upgrades the version of FFmpeg that the TorchAudio binary distribution is compiled against to 5.0.4. FFmpeg 5.0 was released in Jan 2022, and many package managers provide a version of FFmpeg v5. Conda-forge lists 5.1 for all the platforms TorchAudio supports. https://anaconda.org/conda-forge/ffmpeg Pull Request resolved: https://github.com/pytorch/audio/pull/3298 Reviewed By: hwangjeff Differential Revision: D45865599 Pulled By: mthrok fbshipit-source-id: d95638eb80daaf477a710a992f4ead9b9009bb9b
-
- 10 May, 2023 2 commits
-
-
moto authored
Summary: https://output.circle-artifacts.com/output/job/fbfa6d9a-5014-42ac-8e77-c1e9565747e8/artifacts/0/docs/tutorials/effector_tutorial.html Pull Request resolved: https://github.com/pytorch/audio/pull/3226 Reviewed By: nateanl Differential Revision: D45402724 Pulled By: mthrok fbshipit-source-id: bc9d1bc071f6f5062b9cc35d743b4a3016306262
-
moto authored
Summary: This commit is preparation for landing the dispatcher switch in https://github.com/pytorch/audio/issues/3241 Making the FFmpeg backend the default causes some issues in tutorials, so this commit disables it. The IO tutorial will be updated after https://github.com/pytorch/audio/issues/3241 lands to accommodate the change. Since the IO tutorial needs to mention the migration-related changes, I also updated the IO documentation to cover the migration work so that it's easy to redirect. Pull Request resolved: https://github.com/pytorch/audio/pull/3285 Reviewed By: nateanl Differential Revision: D45671237 Pulled By: mthrok fbshipit-source-id: cb541f6bd93cd9920019b8ec83210ea69d34f133
-
- 05 May, 2023 1 commit
-
-
Zhaoheng Ni authored
Summary: Add scatter plots for STOI, PESQ, Si-SDR, and MOS scores to demonstrate the performance of `SquimObjective` and `SquimSubjective` models and how close they are to the ground truths. Pull Request resolved: https://github.com/pytorch/audio/pull/3313 Reviewed By: hwangjeff Differential Revision: D45620311 Pulled By: nateanl fbshipit-source-id: cb58ffd3744df4749b9385876da8de0cffd93557
-
- 29 Apr, 2023 1 commit
-
-
Zhaoheng Ni authored
Summary: The PR adds a tutorial that demonstrates how to use pre-trained `TorchAudio-SQUIM` pipelines to estimate objective and subjective metric scores (PESQ, STOI, Si-SDR, MOS). Pull Request resolved: https://github.com/pytorch/audio/pull/3279 Reviewed By: hwangjeff Differential Revision: D45415404 Pulled By: nateanl fbshipit-source-id: abcaeadcca0eabc2dca53b607eac6257a701c903
-
- 31 Mar, 2023 1 commit
-
-
Nouran Ali authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3222 Reviewed By: nateanl Differential Revision: D44539424 Pulled By: mthrok fbshipit-source-id: 8fbcb5f9918c9930c939bcd448493fa5cf604545
-
- 29 Mar, 2023 1 commit
-
-
moto authored
Summary: Part of the StreamWriter tutorial warns about corrupted AAC audio output, but this is no longer relevant, so this commit deletes it. Pull Request resolved: https://github.com/pytorch/audio/pull/3214 Reviewed By: nateanl Differential Revision: D44504030 Pulled By: mthrok fbshipit-source-id: 4d26d582e9fb87d4e6fa674c05fe3192bc223eef
-
- 28 Mar, 2023 1 commit
-
-
nateanl authored
Summary: Fix https://github.com/pytorch/audio/issues/3211 Pull Request resolved: https://github.com/pytorch/audio/pull/3212 Reviewed By: mthrok Differential Revision: D44472523 Pulled By: nateanl fbshipit-source-id: eb519b0045e7518ad13863a53271745a80d89a21
-
- 16 Mar, 2023 1 commit
-
-
jiyuntu-eero authored
Summary: Fix https://github.com/pytorch/audio/issues/3166. In the `get_trellis` method, the index of the blank symbol is assumed to be 0. It should be `blank_id` instead. Pull Request resolved: https://github.com/pytorch/audio/pull/3172 Reviewed By: mthrok Differential Revision: D44090889 Pulled By: nateanl fbshipit-source-id: d119f4ded895d31aeefd59f8d975224870100264
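To illustrate why the blank index matters, here is a minimal pure-Python sketch of a trellis computation. It is not the tutorial's code: the table layout, the list-based `emission` format, and the sample values are assumptions chosen for illustration. The point is only that the "stay" transition must read the blank label's score from `blank_id` rather than from a hard-coded column 0.

```python
import math


def get_trellis(emission, tokens, blank_id=0):
    """Best-path scores for aligning `tokens` to the frames of `emission`.

    emission: list of frames; each frame is a list of log-probabilities per label.
    tokens:   list of label indices to align.
    Entry [t][j] of the result is the best score after consuming t frames
    and emitting the first j tokens.
    """
    num_frames, num_tokens = len(emission), len(tokens)
    trellis = [[-math.inf] * (num_tokens + 1) for _ in range(num_frames + 1)]
    trellis[0][0] = 0.0
    for t in range(num_frames):
        for j in range(num_tokens + 1):
            # Stay on the same token count: this frame emits the blank label.
            stay = trellis[t][j] + emission[t][blank_id]
            # Advance: this frame emits the next token.
            advance = trellis[t][j - 1] + emission[t][tokens[j - 1]] if j > 0 else -math.inf
            trellis[t + 1][j] = max(stay, advance)
    return trellis


# Illustrative data: three labels, with the blank at index 2 instead of 0.
emission = [[-1.0, -2.0, -0.5], [-0.1, -3.0, -2.0], [-2.0, -2.0, -0.2]]
score = get_trellis(emission, tokens=[0], blank_id=2)[-1][-1]
wrong = get_trellis(emission, tokens=[0], blank_id=0)[-1][-1]
```

With the correct `blank_id`, the best path is blank, token, blank. Treating index 0 as blank scores every frame against the token's own column and gives a much lower total.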
-
- 02 Mar, 2023 1 commit
-
-
moto authored
Summary: Fix build_doc job https://app.circleci.com/pipelines/github/pytorch/audio/15217/workflows/ce50b317-a59e-4741-b8d2-59129420deb8 - build.ffmpeg.html might not exist when IPython notebook is processed. Changing to main doc URL. - Fix bash cell syntax in HW tutorial - Fix C++ doc - Fix duplicated target name in streamwriter tutorial Pull Request resolved: https://github.com/pytorch/audio/pull/3125 Reviewed By: xiaohui-zhang Differential Revision: D43724078 Pulled By: mthrok fbshipit-source-id: ea7d46ec5e377cf2fbd7c3798df57da73750ac5c
-
- 15 Feb, 2023 1 commit
-
-
hwangjeff authored
Summary: Updates tutorial "Audio Data Augmentation" to use two of the newly introduced data augmentation operators in beta: `torchaudio.functional.fftconvolve` and `torchaudio.functional.add_noise`. Pull Request resolved: https://github.com/pytorch/audio/pull/3062 Reviewed By: mthrok Differential Revision: D43298120 Pulled By: hwangjeff fbshipit-source-id: 09ca736a5c67242568515d600b7d31eab32c2df1
-
- 30 Jan, 2023 1 commit
-
-
Yan Li authored
Summary: Currently, running this tutorial on a CUDA device produces a few errors, for these reasons: - The source audio waveform is not properly moved to the GPU. The `to()` method is not in-place for Tensors, so we need to assign the return value of the method call to the variable (otherwise the Tensor would still be on the CPU). - When performing further analysis and display of the output audio, we need to move the Tensors back from the GPU to the CPU. This is because some of the functions we call require the Tensor to be on the CPU (e.g. `stft()` and `bss_eval_sources()`). Pull Request resolved: https://github.com/pytorch/audio/pull/3017 Reviewed By: mthrok Differential Revision: D42828526 Pulled By: nateanl fbshipit-source-id: c28bc855e79e3363a011f4a35a69aae1764e7762
-
- 17 Jan, 2023 1 commit
-
-
Zhaoheng Ni authored
Summary: The mel spectrograms in the TTS tutorial are upside down. The PR fixes it by using `origin="lower"` in imshow. Pull Request resolved: https://github.com/pytorch/audio/pull/2989 Reviewed By: mthrok Differential Revision: D42538349 Pulled By: nateanl fbshipit-source-id: 4388103a49bdfabf1705c1f979d44ecedd5c910a
-
- 13 Jan, 2023 1 commit
-
-
moto authored
Summary: Per the suggestion by nateanl, this adds a visualization of the features fed to ASR. Pull Request resolved: https://github.com/pytorch/audio/pull/2974 Reviewed By: nateanl Differential Revision: D42484088 Pulled By: mthrok fbshipit-source-id: 2c839492869416554eac04aa06cd12078db21bd7
-
- 30 Dec, 2022 1 commit
-
-
moto authored
Summary: Artifact: [subtractive_synthesis_tutorial](https://output.circle-artifacts.com/output/job/4c1ce33f-834d-48e0-ba89-2e91acdcb572/artifacts/0/docs/tutorials/subtractive_synthesis_tutorial.html) Pull Request resolved: https://github.com/pytorch/audio/pull/2934 Reviewed By: carolineechen Differential Revision: D42284945 Pulled By: mthrok fbshipit-source-id: d255b8e8e2a601a19bc879f9e1c38edbeebaf9b3
-
- 17 Dec, 2022 1 commit
-
-
moto authored
Summary: Adds filter design tutorial, which demonstrates `sinc_impulse_response` and `frequency_impulse_response`. Example: - [filter_design_tutorial](https://output.circle-artifacts.com/output/job/bd22c615-9215-4b17-a52c-b171a47f646c/artifacts/0/docs/tutorials/filter_design_tutorial.html) Pull Request resolved: https://github.com/pytorch/audio/pull/2894 Reviewed By: xiaohui-zhang Differential Revision: D42117658 Pulled By: mthrok fbshipit-source-id: f7dd04980e8557bb6f0e0ec26ac2c7f53314ea16
-
- 16 Dec, 2022 1 commit
-
-
Caroline Chen authored
Summary: Resolves https://github.com/pytorch/audio/issues/2891 Rename `resampling_method` options to more accurately describe what is happening. Previously the methods were set to `sinc_interpolation` and `kaiser_window`, which can be confusing as both options actually use sinc interpolation, but differ in the window function used. As a result, rename `sinc_interpolation` to `sinc_interp_hann` and `kaiser_window` to `sinc_interp_kaiser`. Using an old option will raise a warning, and those options will be deprecated in 2 releases. The numerical behavior is unchanged. Pull Request resolved: https://github.com/pytorch/audio/pull/2922 Reviewed By: mthrok Differential Revision: D42083619 Pulled By: carolineechen fbshipit-source-id: 9a9a7ea2d2daeadc02d53dddfd26afe249459e70
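The rename-with-warning pattern described above can be sketched roughly as follows. `normalize_resampling_method` is a hypothetical helper written for illustration, not torchaudio's code, and the use of `DeprecationWarning` is an assumption; only the old and new option names come from the summary.

```python
import warnings

# Old option name -> new option name, as listed in the commit summary.
_RENAMED_METHODS = {
    "sinc_interpolation": "sinc_interp_hann",
    "kaiser_window": "sinc_interp_kaiser",
}
_VALID_METHODS = set(_RENAMED_METHODS.values())


def normalize_resampling_method(name):
    """Hypothetical helper: map an old option name to its new name, with a warning."""
    if name in _RENAMED_METHODS:
        new_name = _RENAMED_METHODS[name]
        # The warning category is an assumption; the summary only says "a warning".
        warnings.warn(
            f'"{name}" is deprecated; use "{new_name}" instead.',
            DeprecationWarning,
            stacklevel=2,
        )
        return new_name
    if name not in _VALID_METHODS:
        raise ValueError(f"Unknown resampling method: {name!r}")
    return name
```

Because the old names map onto the new ones before any computation, callers using either spelling would get the same numerical result, which matches the summary's note that behavior is unchanged.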
-
- 29 Nov, 2022 1 commit
-
-
moto authored
Summary: This commit adds the tutorial for additive synthesis, using torchaudio's prototype DSP ops. [Review here](https://output.circle-artifacts.com/output/job/3dc83322-832a-4272-9c13-df752c97b660/artifacts/0/docs/tutorials/additive_synthesis_tutorial.html) Pull Request resolved: https://github.com/pytorch/audio/pull/2877 Reviewed By: carolineechen Differential Revision: D41585425 Pulled By: mthrok fbshipit-source-id: b81283b90e4779c8054fd030a1d8c3d39d676bbd
-
- 28 Nov, 2022 1 commit
-
-
moto authored
Summary: This commit adds a tutorial for `oscillator_bank` and `adsr_envelope`, which will be a basis for DDSP. - [Review here](https://output.circle-artifacts.com/output/job/cf1d3001-88e5-418b-8cf8-ae22b4445dba/artifacts/0/docs/tutorials/oscillator_tutorial.html) Pull Request resolved: https://github.com/pytorch/audio/pull/2862 Reviewed By: carolineechen Differential Revision: D41559503 Pulled By: mthrok fbshipit-source-id: 3f1689186db7d246de14f228fc2f91bf37db98cd
-
- 17 Oct, 2022 1 commit
-
-
moto authored
Summary: * Refactor benchmark script * Rename `time` variable to avoid a potential conflict with the `time` module * Fix `beta` parameter in benchmark (it was not used previously) * Use `timeit` module for benchmark * Add plot * Move the comment on the result to the end * Add link to an explanation of aliasing https://output.circle-artifacts.com/output/job/20b57d2f-3614-4161-a18e-e0c1a537739c/artifacts/0/docs/tutorials/audio_resampling_tutorial.html Pull Request resolved: https://github.com/pytorch/audio/pull/2773 Reviewed By: carolineechen Differential Revision: D40421337 Pulled By: mthrok fbshipit-source-id: b402f84d4517695daeca75fb84ad876ef9354b3a
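As a rough illustration of two of the items above (benchmarking with `timeit`, and not naming a result variable `time`), a minimal sketch might look like this. The `benchmark` helper and the `moving_sum` workload are hypothetical stand-ins, not the tutorial's script.

```python
import timeit


def benchmark(func, *args, number=100, repeat=3):
    """Return the best average time per call, in seconds."""
    timings = timeit.repeat(lambda: func(*args), number=number, repeat=repeat)
    return min(timings) / number


def moving_sum(values, width):
    # Stand-in workload; replace with the resampling call being measured.
    return [sum(values[i : i + width]) for i in range(len(values) - width + 1)]


# The result is stored in `elapsed`, not `time`, so the `time` module is never shadowed.
elapsed = benchmark(moving_sum, list(range(1000)), 8)
print(f"{elapsed * 1e6:.1f} microseconds per call")
```

Taking the minimum over repeats is a common choice because slower runs usually reflect interference from other processes rather than the code under test.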
-
- 14 Oct, 2022 2 commits
-
-
moto authored
Summary: In the StreamWriter basic usage tutorial, matplotlib is used to generate raster images of waveforms, and the figure is deliberately left unshown in the resulting tutorial by using the ``sphinx_gallery_defer_figures`` command. It turned out that this figure is shown in the next code block executed by Sphinx Gallery, so it ends up in a totally unrelated place. https://pytorch.org/audio/main/tutorials/audio_feature_extractions_tutorial.html This commit fixes it by closing the figure. Pull Request resolved: https://github.com/pytorch/audio/pull/2771 Reviewed By: nateanl Differential Revision: D40382076 Pulled By: mthrok fbshipit-source-id: 015f2bab8492d3b4fbe70e1174c7776a5aa2679a
-
nateanl authored
Summary: The separation is applied to chunks of audio to avoid OOM. The combination of consecutive chunks is described in the graph. In the last audio chunk, there is no future chunk to combine with, hence the overlap on the right side doesn't need to be faded. Pull Request resolved: https://github.com/pytorch/audio/pull/2769 Reviewed By: carolineechen Differential Revision: D40358382 Pulled By: nateanl fbshipit-source-id: ec8be895d7a67acb257e2693b64922397163ed5e
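The chunking idea can be sketched in plain Python as below. This is an illustration under stated assumptions, not the tutorial's implementation: `process_in_chunks` is a hypothetical helper, the fade shape is assumed to be linear, and `process` is assumed to return a chunk of the same length it receives. Note that the first chunk gets no fade-in and the last chunk gets no fade-out, which is the case the commit addresses.

```python
def process_in_chunks(signal, process, chunk_len, overlap):
    """Apply `process` chunk by chunk and cross-fade the overlapping regions."""
    if chunk_len < 1 or overlap < 0 or 2 * overlap > chunk_len:
        raise ValueError("need chunk_len >= 1 and 0 <= overlap <= chunk_len / 2")
    n = len(signal)
    output = [0.0] * n
    hop = chunk_len - overlap
    start = 0
    while start < n:
        end = min(start + chunk_len, n)
        is_first, is_last = start == 0, end == n
        chunk = process(signal[start:end])
        for i, value in enumerate(chunk):
            weight = 1.0
            if not is_first and i < overlap:
                # Linear fade-in over the region shared with the previous chunk.
                weight = (i + 1) / (overlap + 1)
            if not is_last and i >= len(chunk) - overlap:
                # Linear fade-out over the region shared with the next chunk.
                weight = 1.0 - (i - (len(chunk) - overlap) + 1) / (overlap + 1)
            output[start + i] += weight * value
        if is_last:
            break
        start += hop
    return output


# With an identity "model", cross-fading should reproduce the input.
signal = [float(i) for i in range(11)]
restored = process_in_chunks(signal, lambda chunk: list(chunk), chunk_len=4, overlap=1)
```

The fade-in and fade-out weights at each overlapping sample sum to one, so a process that leaves the audio unchanged yields the original signal. If the last chunk were also faded out, its tail would be attenuated with nothing to compensate.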
-
- 13 Oct, 2022 2 commits
-
-
Caroline Chen authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2762 Reviewed By: mthrok Differential Revision: D40332603 Pulled By: carolineechen fbshipit-source-id: 2de51265adc81b4728f4d6798d287bd2eccf5251
-
moto authored
Summary: Adding and updating author information. Pull Request resolved: https://github.com/pytorch/audio/pull/2764 Reviewed By: carolineechen Differential Revision: D40332427 Pulled By: mthrok fbshipit-source-id: 4f04c7351386c122e3b0a45c2ed1757a04b7dc9a
-