- 06 Sep, 2022 3 commits
-
-
Ravi Makhija authored
Summary: This PR addresses the bug raised in issue https://github.com/pytorch/audio/issues/2634. Previously, the Box-Muller transform was used to generate Gaussian variates for dithering from `torch.rand` uniform variates, but it was implemented incorrectly (e.g. the same uniform variate was used for both inputs to the transform, rather than two different uniform variates), which led to a different (non-Gaussian) distribution. This PR instead uses `torch.randn` to generate the Gaussian variates. Pull Request resolved: https://github.com/pytorch/audio/pull/2639 Reviewed By: mthrok Differential Revision: D39101144 Pulled By: carolineechen fbshipit-source-id: 691e49679f6598ef0a1675f6f4ee721ef32215fd
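Illustration (not part of the commit): a minimal standard-library sketch of why reusing one uniform variate breaks the Box-Muller transform. The function names are hypothetical, and Python's `random` module stands in for `torch.rand`; the actual fix simply calls `torch.randn`.

```python
import math
import random


def box_muller(u1, u2):
    """Map two uniform variates in (0, 1] to one standard normal variate."""
    return math.sqrt(-2.0 * math.log(u1)) * math.cos(2.0 * math.pi * u2)


def sample_correct(rng):
    # Two independent uniform draws, as the transform requires.
    # 1.0 - random() lies in (0, 1], which keeps log() finite.
    return box_muller(1.0 - rng.random(), 1.0 - rng.random())


def sample_buggy(rng):
    # One uniform draw reused for both arguments: the output is a
    # deterministic function of a single variate and is not Gaussian.
    u = 1.0 - rng.random()
    return box_muller(u, u)


rng = random.Random(0)
correct = [sample_correct(rng) for _ in range(20000)]
buggy = [sample_buggy(rng) for _ in range(20000)]

# With a shared variate, negative outputs need cos(2*pi*u) < 0, i.e.
# u in (0.25, 0.75), where sqrt(-2*ln(u)) < 1.67. The lower tail is cut off.
print("min (two variates):", min(correct))
print("min (one variate): ", min(buggy))
```

The truncated lower tail is only one visible symptom; the point is that the shared-variate output cannot follow a normal distribution.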
-
Caroline Chen authored
Summary: Adding support for metadata mode, requested in https://github.com/pytorch/audio/issues/2539, by adding a public `get_metadata()` function in the dataset. This function can be used directly by users to fetch metadata for individual dataset indices, or users can subclass the dataset and override `__getitem__` with `get_metadata` to create a dataset class that directly handles metadata mode. Pull Request resolved: https://github.com/pytorch/audio/pull/2653 Reviewed By: nateanl, mthrok Differential Revision: D39105114 Pulled By: carolineechen fbshipit-source-id: 6f26f1402a053dffcfcc5d859f87271ed5923348
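Illustration (not part of the commit): a rough sketch of the subclassing pattern described above. `ToyDataset`, its entry layout, and `_load` are hypothetical stand-ins; the real dataset class and the fields returned by `get_metadata()` are not shown in this log.

```python
class ToyDataset:
    """Hypothetical stand-in for a dataset that exposes get_metadata()."""

    def __init__(self, entries):
        # Each entry is assumed to be (relative_path, sample_rate, transcript).
        self._entries = list(entries)

    def get_metadata(self, n):
        """Return the metadata for item n without decoding any audio."""
        return self._entries[n]

    def _load(self, path):
        # Placeholder for real audio decoding.
        return [0.0] * 4

    def __getitem__(self, n):
        path, sample_rate, transcript = self.get_metadata(n)
        return self._load(path), sample_rate, transcript

    def __len__(self):
        return len(self._entries)


class ToyMetadataDataset(ToyDataset):
    """Metadata mode: indexing returns metadata only."""

    def __getitem__(self, n):
        return self.get_metadata(n)


entries = [("a.wav", 16000, "hello")]
print(ToyDataset(entries)[0])          # waveform placeholder plus metadata
print(ToyMetadataDataset(entries)[0])  # metadata only
```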
-
Peter Albert authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2655 Removed obsolete example and the corresponding test Reviewed By: mthrok Differential Revision: D39260253 fbshipit-source-id: 0bde71ffd75dd0c94a5cc4a9940f4648a5d61bd7
-
- 02 Sep, 2022 1 commit
-
-
moto authored
Summary: This commit adds CUDA hardware encoding to StreamWriter. For certain video formats, it can encode video directly from a CUDA Tensor, without needing to move the data to the host CPU. Pull Request resolved: https://github.com/pytorch/audio/pull/2505 Reviewed By: hwangjeff Differential Revision: D37446830 Pulled By: mthrok fbshipit-source-id: eee6424f01a99a3b611dcad45ed58f86cba4672a
-
- 01 Sep, 2022 1 commit
-
-
moto authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2648 Reviewed By: nateanl Differential Revision: D38976874 Pulled By: mthrok fbshipit-source-id: 0541dea2a633d97000b4b8609ff6b83f6b82c864
-
- 26 Aug, 2022 3 commits
-
-
pbialecki authored
Summary: CC atalman Pull Request resolved: https://github.com/pytorch/audio/pull/2623 Reviewed By: hwangjeff, nateanl Differential Revision: D39036432 Pulled By: atalman fbshipit-source-id: cd74a1bf8f74e31bd2c32c80d32c617f4b1766e8
-
Omkar Salpekar authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2650 Reviewed By: mehtanirav Differential Revision: D39040559 Pulled By: osalpekar fbshipit-source-id: df39e23d7c246728793aab969b8dc1070af88d75
-
Caroline Chen authored
Summary: `bg_iterator` was deprecated in 0.11 because it was known to cause issues (deadlocks) without providing a speedup. This commit removes the uses of `bg_iterator` in torchaudio examples. Resolves https://github.com/pytorch/audio/issues/2642 Pull Request resolved: https://github.com/pytorch/audio/pull/2645 Reviewed By: nateanl Differential Revision: D38954292 Pulled By: carolineechen fbshipit-source-id: 2333ab5228c2b8511ff532057543aaf9d02b2789
-
- 25 Aug, 2022 1 commit
-
-
Omkar Salpekar authored
Summary: Calling the reusable workflow introduced in https://github.com/pytorch/test-infra/pull/546 to build conda binaries on linux. Pull Request resolved: https://github.com/pytorch/audio/pull/2626 Reviewed By: mehtanirav Differential Revision: D39028057 Pulled By: osalpekar fbshipit-source-id: d74ea3771967d0ee2b0ad28a8f811a95145b2183
-
- 24 Aug, 2022 1 commit
-
-
moto authored
Summary: This commit adds the FFmpeg-based encoder class StreamWriter. StreamWriter is essentially the counterpart of the StreamReader class, and it supports: * Encoding audio / still image / video * Exporting to local file / streaming protocol / devices etc. * File-like object support (in a later commit) * HW video encoding (in a later commit) See also: https://fburl.com/gslide/z85kn5a9 (Meta internal) Pull Request resolved: https://github.com/pytorch/audio/pull/2628 Reviewed By: nateanl Differential Revision: D38816650 Pulled By: mthrok fbshipit-source-id: a9343b0d55755e186971dc96fb86eb52daa003c8
-
- 23 Aug, 2022 2 commits
-
-
Ravi Makhija authored
Summary: Added example for LFCC transform as mentioned in issue https://github.com/pytorch/audio/issues/1564. Pull Request resolved: https://github.com/pytorch/audio/pull/2640 Reviewed By: carolineechen Differential Revision: D38908975 Pulled By: nateanl fbshipit-source-id: ffdd994390db7f27556b011a8050a65eef9cd09d
-
Omkar Salpekar authored
Summary: As part of Project Nova, we are consolidating CI/CD workflows and infra, making them reusable across PyTorch ecosystem libraries. https://github.com/pytorch/test-infra/pull/460 introduces a general-purpose reusable workflow to build linux wheels for python libraries. This PR introduces a caller workflow that triggers the reusable workflow. Details around modular env setup, passing input args across workflows, etc. are still being worked out. Using reusable workflow defined in https://github.com/pytorch/test-infra/pull/506 Pull Request resolved: https://github.com/pytorch/audio/pull/2548 Reviewed By: osalpekar Differential Revision: D38947733 Pulled By: mehtanirav fbshipit-source-id: 03ab88cef973a092f5c5d1ff8c74ec7ae7e46d01
-
- 22 Aug, 2022 2 commits
-
-
moto authored
Summary: The minor release fixes some gallery issues, which allows us to remove some of the customization we had in https://github.com/pytorch/audio/issues/2629 https://output.circle-artifacts.com/output/job/553a9b98-8260-4cb4-a681-20ef97d2c33e/artifacts/0/docs/pipelines.html#torchaudio.pipelines.Wav2Vec2ASRBundle Pull Request resolved: https://github.com/pytorch/audio/pull/2638 Reviewed By: carolineechen, nateanl Differential Revision: D38909097 Pulled By: mthrok fbshipit-source-id: 78346d93b54fca2a19b28991c224324ef53221c9
-
Ravi Makhija authored
Summary: Added example for Loudness transform (implemented in PR https://github.com/pytorch/audio/issues/2472) as mentioned in issue https://github.com/pytorch/audio/issues/1564. Pull Request resolved: https://github.com/pytorch/audio/pull/2641 Reviewed By: nateanl Differential Revision: D38907782 Pulled By: carolineechen fbshipit-source-id: fd2bcc4bac3095a626ea9cf36cb70cb2bf003d63
-
- 20 Aug, 2022 1 commit
-
-
Ravi Makhija authored
Summary: Added example for MFCC transform as mentioned in issue https://github.com/pytorch/audio/issues/1564. Note: Python formatter package `black` uses double quotes for the string dict keys (e.g. in `melkwargs` for this example). Please let me know if there is a different linter/format/convention that is preferred! Pull Request resolved: https://github.com/pytorch/audio/pull/2637 Reviewed By: carolineechen Differential Revision: D38873729 Pulled By: nateanl fbshipit-source-id: 2e8fe2930671e7c5d02c0c37cf1ca5cc8c5079e3
-
- 19 Aug, 2022 2 commits
-
-
Moto Hira authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2636 At the early stage of the torchaudio extension module, the `torchaudio/csrc/pybind` directory was created so that all the code defining the Python interface would be placed there and there would be only one extension module, called `torchaudio._torchaudio`. However, the codebase has evolved so that separate extensions are defined for each feature (third party dependency) for the sake of a more modular file organization. What is left in `csrc/pybind` is the libsox Python bindings. This commit moves them under `csrc/sox`. Follow-up: rename `torchaudio._torchaudio` to `torchaudio._torchaudio_sox`. Reviewed By: carolineechen Differential Revision: D38829253 fbshipit-source-id: 3554af45a2beb0f902810c5548751264e093f28d
-
moto authored
Summary: Update compatibility matrix Pull Request resolved: https://github.com/pytorch/audio/pull/2633 Reviewed By: nateanl Differential Revision: D38827670 Pulled By: mthrok fbshipit-source-id: 5c66bf60a06e37919ee725a5f4adf571e6c89100
-
- 18 Aug, 2022 6 commits
-
-
moto authored
Summary: * Use download_asset * Remove notes around nightly * Print versions first * Remove duplicated import Pull Request resolved: https://github.com/pytorch/audio/pull/2631 Reviewed By: carolineechen Differential Revision: D38830395 Pulled By: mthrok fbshipit-source-id: c9259df33562defe249734d1ed074dac0fddc2f6
-
Ravi Makhija authored
Summary: Added example for InverseMelScale transform as mentioned in issue https://github.com/pytorch/audio/issues/1564. Pull Request resolved: https://github.com/pytorch/audio/pull/2635 Reviewed By: carolineechen Differential Revision: D38830318 Pulled By: nateanl fbshipit-source-id: fd26a700d495f6755db0767625aa8577cb89bd83
-
moto authored
Summary: Google Colab now has torchaudio 0.12 pre-installed. This commit removes the note about nightly build. Pull Request resolved: https://github.com/pytorch/audio/pull/2632 Reviewed By: carolineechen Differential Revision: D38827632 Pulled By: mthrok fbshipit-source-id: ac769780868b741c3012357d589ec0019d9af6eb
-
moto authored
Summary: Resolves the following warnings ``` /torchaudio/docs/source/tutorials/asr_inference_with_ctc_decoder_tutorial.rst:195: WARNING: Unexpected indentation. /torchaudio/docs/source/tutorials/asr_inference_with_ctc_decoder_tutorial.rst:446: WARNING: Unexpected indentation. /torchaudio/docs/source/tutorials/audio_io_tutorial.rst:559: WARNING: Content block expected for the "note" directive; none found. /torchaudio/docs/source/tutorials/mvdr_tutorial.rst:338: WARNING: Bullet list ends without a blank line; unexpected unindent. ``` Pull Request resolved: https://github.com/pytorch/audio/pull/2630 Reviewed By: nateanl Differential Revision: D38816632 Pulled By: mthrok fbshipit-source-id: 135ded4e064d136be67ce24439e96f5e9c9ce635
-
moto authored
Summary: This commit fixes the issue with the recent Sphinx-Gallery update. Also it pins the versions of Sphinx-related packages. Before: <img width="256" alt="Screen Shot 2022-08-17 at 10 02 23 PM" src="https://user-images.githubusercontent.com/855818/185140952-28f2d98a-b586-424c-a003-b69089f48eb9.png"> After: https://user-images.githubusercontent.com/855818/185271889-bd4f86a0-986b-43bb-8121-bd77750d74f0.mov Pull Request resolved: https://github.com/pytorch/audio/pull/2629 Reviewed By: carolineechen Differential Revision: D38816417 Pulled By: mthrok fbshipit-source-id: 11ee3f9121d9a302772ee1f461dacae52eb28852
-
moto authored
Summary: Resolves the following warning ``` /torchaudio/docs/source/transforms.rst:94: WARNING: Title underline too short. :hidden:`Loudness` ----------------- ``` Pull Request resolved: https://github.com/pytorch/audio/pull/2627 Reviewed By: carolineechen Differential Revision: D38814802 Pulled By: mthrok fbshipit-source-id: 5dfaf2d7bae22dba0f4a14f04ca63f28d6b2a749
-
- 16 Aug, 2022 4 commits
-
-
Zhaoheng Ni authored
Summary: To make the code consistent, we should use double quotation marks for all strings. This PR makes such changes in functional and transforms. Pull Request resolved: https://github.com/pytorch/audio/pull/2618 Reviewed By: carolineechen Differential Revision: D38744137 Pulled By: nateanl fbshipit-source-id: 74213a24d9f66c306cc92019d77dcb2a877f94bd
-
Ravi Makhija authored
Summary: Added example for AmplitudeToDB transform as mentioned in issue https://github.com/pytorch/audio/issues/1564. Pull Request resolved: https://github.com/pytorch/audio/pull/2615 Reviewed By: carolineechen Differential Revision: D38743117 Pulled By: nateanl fbshipit-source-id: bf0f760299f4777a4bca65da86359faa00b16207
-
Ravi Makhija authored
Summary: Added example for MelScale transform as mentioned in issue https://github.com/pytorch/audio/issues/1564. Pull Request resolved: https://github.com/pytorch/audio/pull/2616 Reviewed By: carolineechen Differential Revision: D38743145 Pulled By: nateanl fbshipit-source-id: e24ca92f5317a0ea5a141418bf084b12cfb22486
-
Andrey Talman authored
Summary: Similar to https://github.com/pytorch/vision/pull/6218 Fixing MacOS builds Pull Request resolved: https://github.com/pytorch/audio/pull/2622 Reviewed By: weiwangmeta Differential Revision: D38722983 Pulled By: atalman fbshipit-source-id: 4cef85c97dc270fc812bc289592c4f3815f73c85
-
- 15 Aug, 2022 3 commits
-
-
Andrey Talman authored
Summary: Same as: https://github.com/pytorch/vision/pull/6422 Testing: ``` export ANACONDA_PATH=$(conda info --base)/bin echo $ANACONDA_PATH /opt/homebrew/Caskroom/miniconda/base/bin $ANACONDA_PATH/anaconda -V anaconda Command line client (version 1.10.0) ``` Failure: https://github.com/pytorch/audio/runs/7837085749?check_suite_focus=true Pull Request resolved: https://github.com/pytorch/audio/pull/2621 Reviewed By: weiwangmeta, seemethere Differential Revision: D38714324 Pulled By: atalman fbshipit-source-id: 55342cf69006e9250403c955202846bab4516f3e
-
moto authored
Summary: The link to the version selector was an absolute link, which was a trap when reviewing gh-pages deployments from a fork. This commit changes it to a relative link. Pull Request resolved: https://github.com/pytorch/audio/pull/2605 Test Plan: - https://mthrok.github.io/audio/main/index.html -> click version selector -> https://mthrok.github.io/audio/versions.html - https://mthrok.github.io/audio/0.12.1/index.html -> click version selector -> https://pytorch.org/audio/versions.html Reviewed By: carolineechen, nateanl Differential Revision: D38695645 Pulled By: mthrok fbshipit-source-id: 91132ac19b8c61f39d304a162435b9c6599ef2b2
-
Zhaoheng Ni authored
Summary: `ctc_decoder` has become beta, so remove it from the prototype documentation. Pull Request resolved: https://github.com/pytorch/audio/pull/2617 Reviewed By: hwangjeff Differential Revision: D38706869 Pulled By: nateanl fbshipit-source-id: 41679f4e65a584b6b882af4551a50123f1dcef02
-
- 12 Aug, 2022 1 commit
-
-
Andrey Talman authored
Summary: Introducing the pytorch-cuda metapackage. Same as: https://github.com/pytorch/vision/pull/6371 Following PR: https://github.com/pytorch/builder/pull/1094 Adds a CUDA metapackage called pytorch-cuda. This way we can make sure to install the correct version of the CUDA dependencies without depending on conda-forge. Pull Request resolved: https://github.com/pytorch/audio/pull/2612 Reviewed By: hwangjeff, seemethere, nateanl Differential Revision: D38633332 Pulled By: atalman fbshipit-source-id: 78a6115bb252ebdb6d66a57d7d2c4a4978ddb501
-
- 11 Aug, 2022 1 commit
-
-
hwangjeff authored
Summary: Adds function `add_noise`, which computes and returns the sum of a waveform and scaled noise. Pull Request resolved: https://github.com/pytorch/audio/pull/2608 Reviewed By: nateanl Differential Revision: D38557141 Pulled By: hwangjeff fbshipit-source-id: 1457fa213f43ca5b4333d3c7580971655d4260a0
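Illustration (not part of the commit): a minimal sketch of "waveform plus scaled noise" on plain Python lists. It assumes the noise is scaled to hit a target signal-to-noise ratio in dB. The name `add_noise_sketch` and its signature are hypothetical and do not reproduce the `add_noise` API added by the PR.

```python
import math


def add_noise_sketch(waveform, noise, snr_db):
    """Return waveform + scale * noise, with scale chosen to reach snr_db.

    Assumes both inputs are equal-length, non-silent lists of floats.
    """
    if len(waveform) != len(noise):
        raise ValueError("waveform and noise must have the same length")
    signal_power = sum(s * s for s in waveform) / len(waveform)
    noise_power = sum(n * n for n in noise) / len(noise)
    # SNR (linear) = signal_power / (scale**2 * noise_power)
    scale = math.sqrt(signal_power / (noise_power * 10.0 ** (snr_db / 10.0)))
    return [s + scale * n for s, n in zip(waveform, noise)]


clean = [1.0, -1.0, 1.0, -1.0]
noise = [0.5, 0.5, -0.5, -0.5]
print(add_noise_sketch(clean, noise, snr_db=20.0))
```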
-
- 10 Aug, 2022 3 commits
-
-
hwangjeff authored
Summary: https://github.com/pytorch/audio/issues/2535 modified the Conformer RNN-T Lightning module to accept a SentencePiece model instance rather than a file path. This PR makes changes to account for this in the train script. Pull Request resolved: https://github.com/pytorch/audio/pull/2611 Reviewed By: carolineechen Differential Revision: D38578892 Pulled By: hwangjeff fbshipit-source-id: ec3b9823ad30ffb730baa13d10d8b79020866aac
-
Kunal Upadya authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2609 Converted argument validations in torchaudio/functional/filtering from assert based validation to the preferred if-then raise validation. Added specific error messages in all cases. Reviewed By: mthrok Differential Revision: D38515029 fbshipit-source-id: 6c644a042f86c6feb2bbe8bd02fdb484fe27fae9
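Illustration (not part of the commit): a small sketch of the assert-to-raise conversion pattern. The function `apply_gain_checked` and its accepted range are hypothetical examples, not code from `torchaudio/functional/filtering`.

```python
def apply_gain_checked(samples, gain_db):
    """Scale samples by a gain in dB, validating the argument explicitly."""
    # Before: `assert -120.0 <= gain_db <= 120.0`
    # (no message, and skipped entirely when Python runs with -O).
    if not -120.0 <= gain_db <= 120.0:
        raise ValueError(f"gain_db must be in [-120, 120] dB. Found: {gain_db}")
    factor = 10.0 ** (gain_db / 20.0)
    return [s * factor for s in samples]


print(apply_gain_checked([0.5, -0.5], 0.0))
```

An explicit `raise` keeps the check active in optimized runs and gives the caller a specific exception type and message.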
-
Sean Kim authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2607 Reviewed By: carolineechen, nateanl Differential Revision: D38522606 Pulled By: skim0514 fbshipit-source-id: 2c38b8dcb343bcf624bfda1bfa2afd91abf2e668
-
- 09 Aug, 2022 1 commit
-
-
Caroline Chen authored
Summary: Expose flashlight's LM and LMState classes to support decoding with custom language models, including NN LMs. The `ctc_decoder` API is as follows: - To decode with KenLM, pass the KenLM language model path to the `lm` variable. - To decode with a custom LM, create a Python class that subclasses `CTCDecoderLM` and pass it to the `lm` variable. Additionally, create a file of LM words listed in order of the LM index, with one word per line, and pass the file to `lm_path`. - To decode without a language model, set `lm` to `None` (default). Validated against the fairseq w2l decoder on a sample LibriSpeech dataset and LM. Code for validation can be found [here](https://github.com/facebookresearch/fairseq/compare/main...carolineechen:fairseq:ctc-decoder). Also added unit tests to validate custom implementations of ZeroLM and KenLM, and also using a biased LM. Follow-ups: - Train a simple LM on LibriSpeech and demonstrate usage in the tutorial or examples directory. cc jacobkahn Pull Request resolved: https://github.com/pytorch/audio/pull/2528 Reviewed By: mthrok Differential Revision: D38243802 Pulled By: carolineechen fbshipit-source-id: 445e78f6c20bda655aabf819fc0f771fe68c73d7
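Illustration (not part of the commit): a minimal sketch of producing the LM word file mentioned in the summary, with one word per line in LM-index order. The helper `write_lm_word_list` and the index-to-word mapping are hypothetical; only the file layout comes from the description.

```python
def write_lm_word_list(index_to_word, path):
    """Write words to `path`, one per line, ordered by their LM index."""
    indices = sorted(index_to_word)
    # Assumption: LM indices are contiguous and start at 0.
    if indices != list(range(len(indices))):
        raise ValueError("LM indices must be contiguous and start at 0")
    with open(path, "w", encoding="utf-8") as f:
        for index in indices:
            f.write(index_to_word[index] + "\n")
```

For example, `write_lm_word_list({1: "world", 0: "hello"}, "lm_words.txt")` writes `hello` on the first line and `world` on the second.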
-
- 08 Aug, 2022 1 commit
-
-
Caroline Chen authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2606 Reviewed By: nateanl Differential Revision: D38502666 Pulled By: carolineechen fbshipit-source-id: 1e279996fff3621835a07882c63328856fe38f3a
-
- 05 Aug, 2022 3 commits
-
-
hwangjeff authored
Summary: Adds functions `convolve` and `fftconvolve`, which compute the convolution of two tensors along their trailing dimension. The former performs the convolution directly, whereas the latter performs it using FFT. Pull Request resolved: https://github.com/pytorch/audio/pull/2602 Reviewed By: nateanl, mthrok Differential Revision: D38450771 Pulled By: hwangjeff fbshipit-source-id: b2d1e063ba21eafeddf317d60749e7120b14292b
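Illustration (not part of the commit): a standard-library sketch contrasting direct and FFT-based convolution on 1-D Python lists. The names `convolve_direct` and `convolve_fft` and the hand-rolled radix-2 FFT are assumptions for illustration; the functions added by the PR operate on tensors along the trailing dimension.

```python
import cmath


def convolve_direct(x, y):
    """Full 1-D convolution computed directly from the definition."""
    out = [0.0] * (len(x) + len(y) - 1)
    for i, xi in enumerate(x):
        for j, yj in enumerate(y):
            out[i + j] += xi * yj
    return out


def _fft(values, inverse=False):
    """Recursive radix-2 FFT (unnormalized); len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return list(values)
    even = _fft(values[0::2], inverse)
    odd = _fft(values[1::2], inverse)
    sign = 1 if inverse else -1
    result = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        result[k] = even[k] + twiddle
        result[k + n // 2] = even[k] - twiddle
    return result


def convolve_fft(x, y):
    """Full 1-D convolution via zero-padding, FFT, pointwise product, inverse FFT."""
    out_len = len(x) + len(y) - 1
    size = 1
    while size < out_len:
        size *= 2
    fx = _fft([complex(v) for v in x] + [0j] * (size - len(x)))
    fy = _fft([complex(v) for v in y] + [0j] * (size - len(y)))
    product = [a * b for a, b in zip(fx, fy)]
    # The inverse transform is unnormalized, so divide by the FFT size.
    return [v.real / size for v in _fft(product, inverse=True)[:out_len]]


print(convolve_direct([1.0, 2.0, 3.0], [0.0, 1.0, 0.5]))
print(convolve_fft([1.0, 2.0, 3.0], [0.0, 1.0, 0.5]))
```

The two agree up to floating-point error; the direct form costs O(N*M) multiplications, while the FFT form costs O(K log K) for padded length K.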
-
Caroline Chen authored
Summary: ``words`` field of CTCHypothesis is empty if no lexicon is provided, which produces confusing output (see issue https://github.com/pytorch/audio/issues/2584) when following our tutorial example with lexicon free usage. This PR adds a note in both docs and tutorial. Followup: determine if we want to modify the behavior of ``words`` in the lexicon free case. One option is to merge and then split the generated tokens by the input silent token to populate the words field, but this is tricky since the meaning of a "word" in the lexicon free case can be vague and not all languages have whitespaces between words, etc Pull Request resolved: https://github.com/pytorch/audio/pull/2603 Reviewed By: mthrok Differential Revision: D38459709 Pulled By: carolineechen fbshipit-source-id: d64ff186df4633f00e94c64afeaa6a50cebf2934
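Illustration (not part of the commit): a sketch of the follow-up option described above, merging the generated tokens and splitting them on the silent token. The helper name and the `"|"` silent token are assumptions; the PR itself only adds a note and does not change how ``words`` is populated.

```python
def tokens_to_words(tokens, sil_token="|"):
    """Merge character tokens and split on the silent token, dropping empty pieces."""
    merged = "".join(tokens)
    return [word for word in merged.split(sil_token) if word]


tokens = ["|", "h", "e", "l", "l", "o", "|", "w", "o", "r", "l", "d", "|"]
print(tokens_to_words(tokens))
```

As the summary notes, this only makes sense when the silent token actually marks word boundaries, which is not the case for every language.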
-
Ravi Makhija authored
Summary: Added example for `SlidingWindowCmn` transform as mentioned in issue https://github.com/pytorch/audio/issues/1564 Pull Request resolved: https://github.com/pytorch/audio/pull/2600 Reviewed By: mthrok Differential Revision: D38395579 Pulled By: carolineechen fbshipit-source-id: 44c5b7181789eedcaaa1d80149d5a1ab8de4c0ba
-