- 26 Aug, 2022 2 commits
-
-
Omkar Salpekar authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2650 Reviewed By: mehtanirav Differential Revision: D39040559 Pulled By: osalpekar fbshipit-source-id: df39e23d7c246728793aab969b8dc1070af88d75
-
Caroline Chen authored
Summary: `bg_iterator` was deprecated in 0.11 because it was known to have issues (deadlock) without speed up. Remove instances of `bg_iterator` used in torchaudio examples. Resolves https://github.com/pytorch/audio/issues/2642 Pull Request resolved: https://github.com/pytorch/audio/pull/2645 Reviewed By: nateanl Differential Revision: D38954292 Pulled By: carolineechen fbshipit-source-id: 2333ab5228c2b8511ff532057543aaf9d02b2789
-
- 25 Aug, 2022 1 commit
-
-
Omkar Salpekar authored
Summary: Calling the reusable workflow introduced in https://github.com/pytorch/test-infra/pull/546 to build conda binaries on linux. Pull Request resolved: https://github.com/pytorch/audio/pull/2626 Reviewed By: mehtanirav Differential Revision: D39028057 Pulled By: osalpekar fbshipit-source-id: d74ea3771967d0ee2b0ad28a8f811a95145b2183
-
- 24 Aug, 2022 1 commit
-
-
moto authored
Summary: This commit adds the FFmpeg-based encoder class StreamWriter. StreamWriter is essentially the counterpart of the StreamReader class, and it supports: * Encoding audio / still image / video * Exporting to local file / streaming protocol / devices etc... * File-like object support (in later commit) * HW video encoding (in later commit) See also: https://fburl.com/gslide/z85kn5a9 (Meta internal) Pull Request resolved: https://github.com/pytorch/audio/pull/2628 Reviewed By: nateanl Differential Revision: D38816650 Pulled By: mthrok fbshipit-source-id: a9343b0d55755e186971dc96fb86eb52daa003c8
-
- 23 Aug, 2022 2 commits
-
-
Ravi Makhija authored
Summary: Added example for LFCC transform as mentioned in issue https://github.com/pytorch/audio/issues/1564. Pull Request resolved: https://github.com/pytorch/audio/pull/2640 Reviewed By: carolineechen Differential Revision: D38908975 Pulled By: nateanl fbshipit-source-id: ffdd994390db7f27556b011a8050a65eef9cd09d
-
Omkar Salpekar authored
Summary: As part of Project Nova, we are consolidating CI/CD workflows and infra, making them reusable across PyTorch ecosystem libraries. https://github.com/pytorch/test-infra/pull/460 introduces a general-purpose reusable workflow to build linux wheels for python libraries. This PR introduces a caller workflow that triggers the reusable workflow. Details around modular env setup, passing input args across workflows, etc. are still being worked out. Using reusable workflow defined in https://github.com/pytorch/test-infra/pull/506 Pull Request resolved: https://github.com/pytorch/audio/pull/2548 Reviewed By: osalpekar Differential Revision: D38947733 Pulled By: mehtanirav fbshipit-source-id: 03ab88cef973a092f5c5d1ff8c74ec7ae7e46d01
-
- 22 Aug, 2022 2 commits
-
-
moto authored
Summary: The minor release fixes some gallery issues, which allows us to remove some of the customization we had in https://github.com/pytorch/audio/issues/2629 https://output.circle-artifacts.com/output/job/553a9b98-8260-4cb4-a681-20ef97d2c33e/artifacts/0/docs/pipelines.html#torchaudio.pipelines.Wav2Vec2ASRBundle Pull Request resolved: https://github.com/pytorch/audio/pull/2638 Reviewed By: carolineechen, nateanl Differential Revision: D38909097 Pulled By: mthrok fbshipit-source-id: 78346d93b54fca2a19b28991c224324ef53221c9
-
Ravi Makhija authored
Summary: Added example for Loudness transform (implemented in PR https://github.com/pytorch/audio/issues/2472) as mentioned in issue https://github.com/pytorch/audio/issues/1564. Pull Request resolved: https://github.com/pytorch/audio/pull/2641 Reviewed By: nateanl Differential Revision: D38907782 Pulled By: carolineechen fbshipit-source-id: fd2bcc4bac3095a626ea9cf36cb70cb2bf003d63
-
- 20 Aug, 2022 1 commit
-
-
Ravi Makhija authored
Summary: Added example for MFCC transform as mentioned in issue https://github.com/pytorch/audio/issues/1564. Note: Python formatter package `black` uses double quotes for the string dict keys (e.g. in `melkwargs` for this example). Please let me know if there is a different linter/format/convention that is preferred! Pull Request resolved: https://github.com/pytorch/audio/pull/2637 Reviewed By: carolineechen Differential Revision: D38873729 Pulled By: nateanl fbshipit-source-id: 2e8fe2930671e7c5d02c0c37cf1ca5cc8c5079e3
-
- 19 Aug, 2022 2 commits
-
-
Moto Hira authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2636 At the early stage of the torchaudio extension module, the `torchaudio/csrc/pybind` directory was created so that all the code defining the Python interface would be placed there and there would be only one extension module, called `torchaudio._torchaudio`. However, the codebase has evolved so that separate extensions are defined for each feature (third-party dependency) for the sake of a more modular file organization. What is left in `csrc/pybind` is the libsox Python bindings. This commit moves them under `csrc/sox`. Follow-up: rename `torchaudio._torchaudio` to `torchaudio._torchaudio_sox`. Reviewed By: carolineechen Differential Revision: D38829253 fbshipit-source-id: 3554af45a2beb0f902810c5548751264e093f28d
-
moto authored
Summary: Update compatibility matrix Pull Request resolved: https://github.com/pytorch/audio/pull/2633 Reviewed By: nateanl Differential Revision: D38827670 Pulled By: mthrok fbshipit-source-id: 5c66bf60a06e37919ee725a5f4adf571e6c89100
-
- 18 Aug, 2022 6 commits
-
-
moto authored
Summary: * Use download_asset * Remove notes around nightly * Print versions first * Remove duplicated import Pull Request resolved: https://github.com/pytorch/audio/pull/2631 Reviewed By: carolineechen Differential Revision: D38830395 Pulled By: mthrok fbshipit-source-id: c9259df33562defe249734d1ed074dac0fddc2f6
-
Ravi Makhija authored
Summary: Added example for InverseMelScale transform as mentioned in issue https://github.com/pytorch/audio/issues/1564. Pull Request resolved: https://github.com/pytorch/audio/pull/2635 Reviewed By: carolineechen Differential Revision: D38830318 Pulled By: nateanl fbshipit-source-id: fd26a700d495f6755db0767625aa8577cb89bd83
-
moto authored
Summary: Google Colab now has torchaudio 0.12 pre-installed. This commit removes the note about nightly build. Pull Request resolved: https://github.com/pytorch/audio/pull/2632 Reviewed By: carolineechen Differential Revision: D38827632 Pulled By: mthrok fbshipit-source-id: ac769780868b741c3012357d589ec0019d9af6eb
-
moto authored
Summary: Resolves the following warnings ``` /torchaudio/docs/source/tutorials/asr_inference_with_ctc_decoder_tutorial.rst:195: WARNING: Unexpected indentation. /torchaudio/docs/source/tutorials/asr_inference_with_ctc_decoder_tutorial.rst:446: WARNING: Unexpected indentation. /torchaudio/docs/source/tutorials/audio_io_tutorial.rst:559: WARNING: Content block expected for the "note" directive; none found. /torchaudio/docs/source/tutorials/mvdr_tutorial.rst:338: WARNING: Bullet list ends without a blank line; unexpected unindent. ``` Pull Request resolved: https://github.com/pytorch/audio/pull/2630 Reviewed By: nateanl Differential Revision: D38816632 Pulled By: mthrok fbshipit-source-id: 135ded4e064d136be67ce24439e96f5e9c9ce635
-
moto authored
Summary: This commit fixes the issue with the recent Sphinx-Gallery update. Also it pins the versions of Sphinx-related packages. Before: <img width="256" alt="Screen Shot 2022-08-17 at 10 02 23 PM" src="https://user-images.githubusercontent.com/855818/185140952-28f2d98a-b586-424c-a003-b69089f48eb9.png"> After: https://user-images.githubusercontent.com/855818/185271889-bd4f86a0-986b-43bb-8121-bd77750d74f0.mov Pull Request resolved: https://github.com/pytorch/audio/pull/2629 Reviewed By: carolineechen Differential Revision: D38816417 Pulled By: mthrok fbshipit-source-id: 11ee3f9121d9a302772ee1f461dacae52eb28852
-
moto authored
Summary: Resolves the following warning ``` /torchaudio/docs/source/transforms.rst:94: WARNING: Title underline too short. :hidden:`Loudness` ----------------- ``` Pull Request resolved: https://github.com/pytorch/audio/pull/2627 Reviewed By: carolineechen Differential Revision: D38814802 Pulled By: mthrok fbshipit-source-id: 5dfaf2d7bae22dba0f4a14f04ca63f28d6b2a749
-
- 16 Aug, 2022 4 commits
-
-
Zhaoheng Ni authored
Summary: To make the code consistent, we should use double quotation marks for all strings. This PR makes such changes in functional and transforms. Pull Request resolved: https://github.com/pytorch/audio/pull/2618 Reviewed By: carolineechen Differential Revision: D38744137 Pulled By: nateanl fbshipit-source-id: 74213a24d9f66c306cc92019d77dcb2a877f94bd
-
Ravi Makhija authored
Summary: Added example for AmplitudeToDB transform as mentioned in issue https://github.com/pytorch/audio/issues/1564. Pull Request resolved: https://github.com/pytorch/audio/pull/2615 Reviewed By: carolineechen Differential Revision: D38743117 Pulled By: nateanl fbshipit-source-id: bf0f760299f4777a4bca65da86359faa00b16207
-
Ravi Makhija authored
Summary: Added example for MelScale transform as mentioned in issue https://github.com/pytorch/audio/issues/1564. Pull Request resolved: https://github.com/pytorch/audio/pull/2616 Reviewed By: carolineechen Differential Revision: D38743145 Pulled By: nateanl fbshipit-source-id: e24ca92f5317a0ea5a141418bf084b12cfb22486
-
Andrey Talman authored
Summary: Similar to https://github.com/pytorch/vision/pull/6218 Fixing MacOS builds Pull Request resolved: https://github.com/pytorch/audio/pull/2622 Reviewed By: weiwangmeta Differential Revision: D38722983 Pulled By: atalman fbshipit-source-id: 4cef85c97dc270fc812bc289592c4f3815f73c85
-
- 15 Aug, 2022 3 commits
-
-
Andrey Talman authored
Summary: Same as: https://github.com/pytorch/vision/pull/6422 Testing: ``` export ANACONDA_PATH=$(conda info --base)/bin echo $ANACONDA_PATH /opt/homebrew/Caskroom/miniconda/base/bin $ANACONDA_PATH/anaconda -V anaconda Command line client (version 1.10.0) ``` Failure: https://github.com/pytorch/audio/runs/7837085749?check_suite_focus=true Pull Request resolved: https://github.com/pytorch/audio/pull/2621 Reviewed By: weiwangmeta, seemethere Differential Revision: D38714324 Pulled By: atalman fbshipit-source-id: 55342cf69006e9250403c955202846bab4516f3e
-
moto authored
Summary: The link to the version selector was an absolute link, which was a trap when reviewing a gh-pages deployment from a fork. This commit changes it to a relative link. Pull Request resolved: https://github.com/pytorch/audio/pull/2605 Test Plan: - https://mthrok.github.io/audio/main/index.html -> click version selector -> https://mthrok.github.io/audio/versions.html - https://mthrok.github.io/audio/0.12.1/index.html -> click version selector -> https://pytorch.org/audio/versions.html Reviewed By: carolineechen, nateanl Differential Revision: D38695645 Pulled By: mthrok fbshipit-source-id: 91132ac19b8c61f39d304a162435b9c6599ef2b2
-
Zhaoheng Ni authored
Summary: `ctc_decoder` has become beta, remove it from prototype documents. Pull Request resolved: https://github.com/pytorch/audio/pull/2617 Reviewed By: hwangjeff Differential Revision: D38706869 Pulled By: nateanl fbshipit-source-id: 41679f4e65a584b6b882af4551a50123f1dcef02
-
- 12 Aug, 2022 1 commit
-
-
Andrey Talman authored
Summary: Introducing the pytorch-cuda metapackage. Same as: https://github.com/pytorch/vision/pull/6371 Following PR: https://github.com/pytorch/builder/pull/1094 Adds a CUDA metapackage called pytorch-cuda. This way we can make sure to install the correct version of the CUDA dependencies and don't depend on conda-forge. Pull Request resolved: https://github.com/pytorch/audio/pull/2612 Reviewed By: hwangjeff, seemethere, nateanl Differential Revision: D38633332 Pulled By: atalman fbshipit-source-id: 78a6115bb252ebdb6d66a57d7d2c4a4978ddb501
-
- 11 Aug, 2022 1 commit
-
-
hwangjeff authored
Summary: Adds function `add_noise`, which computes and returns the sum of a waveform and scaled noise. Pull Request resolved: https://github.com/pytorch/audio/pull/2608 Reviewed By: nateanl Differential Revision: D38557141 Pulled By: hwangjeff fbshipit-source-id: 1457fa213f43ca5b4333d3c7580971655d4260a0
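The idea of summing a waveform with scaled noise can be sketched in plain Python. This is not the torchaudio implementation: the helper name `add_scaled_noise`, the list-based inputs, and the choice to derive the scale from a target signal-to-noise ratio in dB are all assumptions made for illustration.

```python
import math


def add_scaled_noise(waveform, noise, snr_db):
    """Return waveform + scale * noise (hypothetical helper, not the torchaudio API).

    Assumption: the scale is chosen so that the ratio of signal energy to
    scaled-noise energy equals 10 ** (snr_db / 10).
    """
    if len(waveform) != len(noise):
        raise ValueError(
            f"waveform and noise must have the same length, "
            f"got {len(waveform)} and {len(noise)}"
        )
    signal_energy = sum(x * x for x in waveform)
    noise_energy = sum(n * n for n in noise)
    if noise_energy == 0:
        raise ValueError("noise must contain at least one non-zero sample")

    # Solve signal_energy / (scale**2 * noise_energy) == 10 ** (snr_db / 10).
    scale = math.sqrt(signal_energy / (noise_energy * 10 ** (snr_db / 10)))
    return [x + scale * n for x, n in zip(waveform, noise)]
```

With equal-energy inputs and `snr_db=0` the noise is added unscaled; raising `snr_db` shrinks the noise contribution.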
-
- 10 Aug, 2022 3 commits
-
-
hwangjeff authored
Summary: https://github.com/pytorch/audio/issues/2535 modified the Conformer RNN-T Lightning module to accept a SentencePiece model instance rather than a file path. This PR makes changes to account for this in the train script. Pull Request resolved: https://github.com/pytorch/audio/pull/2611 Reviewed By: carolineechen Differential Revision: D38578892 Pulled By: hwangjeff fbshipit-source-id: ec3b9823ad30ffb730baa13d10d8b79020866aac
-
Kunal Upadya authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2609 Converted argument validations in torchaudio/functional/filtering from assert based validation to the preferred if-then raise validation. Added specific error messages in all cases. Reviewed By: mthrok Differential Revision: D38515029 fbshipit-source-id: 6c644a042f86c6feb2bbe8bd02fdb484fe27fae9
-
Sean Kim authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2607 Reviewed By: carolineechen, nateanl Differential Revision: D38522606 Pulled By: skim0514 fbshipit-source-id: 2c38b8dcb343bcf624bfda1bfa2afd91abf2e668
-
- 09 Aug, 2022 1 commit
-
-
Caroline Chen authored
Summary: Expose flashlight's LM and LMState classes to support decoding with custom language models, including NN LMs. The `ctc_decoder` API is as follows: - To decode with KenLM, pass the KenLM language model path to the `lm` variable - To decode with a custom LM, create a Python class that subclasses `CTCDecoderLM`, and pass the class to the `lm` variable. Additionally, create a file of LM words listed in order of the LM index, with one word per line, and pass the file to `lm_path`. - To decode without a language model, set `lm` to `None` (default) Validated against the fairseq w2l decoder on a sample LibriSpeech dataset and LM. Code for validation can be found [here](https://github.com/facebookresearch/fairseq/compare/main...carolineechen:fairseq:ctc-decoder). Also added unit tests to validate custom implementations of ZeroLM and KenLM, and also using a biased LM. Follow ups: - Train a simple LM on LibriSpeech and demonstrate usage in tutorial or examples directory cc jacobkahn Pull Request resolved: https://github.com/pytorch/audio/pull/2528 Reviewed By: mthrok Differential Revision: D38243802 Pulled By: carolineechen fbshipit-source-id: 445e78f6c20bda655aabf819fc0f771fe68c73d7
-
- 08 Aug, 2022 1 commit
-
-
Caroline Chen authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2606 Reviewed By: nateanl Differential Revision: D38502666 Pulled By: carolineechen fbshipit-source-id: 1e279996fff3621835a07882c63328856fe38f3a
-
- 05 Aug, 2022 4 commits
-
-
hwangjeff authored
Summary: Adds functions `convolve` and `fftconvolve`, which compute the convolution of two tensors along their trailing dimension. The former performs the convolution directly, whereas the latter performs it using FFT. Pull Request resolved: https://github.com/pytorch/audio/pull/2602 Reviewed By: nateanl, mthrok Differential Revision: D38450771 Pulled By: hwangjeff fbshipit-source-id: b2d1e063ba21eafeddf317d60749e7120b14292b
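The difference between the direct and the FFT-based approach can be illustrated on plain Python lists. This is a minimal sketch, not the torchaudio code: the function names `convolve_direct` and `convolve_fft`, the "full" output length, the power-of-two zero padding, and the restriction to non-empty 1-D real inputs are assumptions of this example.

```python
import cmath


def convolve_direct(x, y):
    """Full convolution by direct summation; output length is len(x) + len(y) - 1."""
    out = [0.0] * (len(x) + len(y) - 1)
    for i, xi in enumerate(x):
        for j, yj in enumerate(y):
            out[i + j] += xi * yj
    return out


def _fft(a, invert=False):
    """Recursive radix-2 FFT; len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even = _fft(a[0::2], invert)
    odd = _fft(a[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def convolve_fft(x, y):
    """Same result as convolve_direct, computed as a product in the frequency domain."""
    n_out = len(x) + len(y) - 1
    size = 1
    while size < n_out:  # zero-pad to a power of two so circular wrap-around cannot occur
        size *= 2
    fx = _fft([complex(v) for v in x] + [0j] * (size - len(x)))
    fy = _fft([complex(v) for v in y] + [0j] * (size - len(y)))
    product = [a * b for a, b in zip(fx, fy)]
    inverse = _fft(product, invert=True)
    # The unnormalized inverse transform needs a 1/size factor.
    return [(v / size).real for v in inverse[:n_out]]
```

The direct version costs O(len(x) * len(y)) multiplications, while the FFT version costs O(N log N) in the padded length, which is why the FFT route tends to pay off for long inputs. The two agree up to floating-point rounding.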
-
Caroline Chen authored
Summary: ``words`` field of CTCHypothesis is empty if no lexicon is provided, which produces confusing output (see issue https://github.com/pytorch/audio/issues/2584) when following our tutorial example with lexicon free usage. This PR adds a note in both docs and tutorial. Followup: determine if we want to modify the behavior of ``words`` in the lexicon free case. One option is to merge and then split the generated tokens by the input silent token to populate the words field, but this is tricky since the meaning of a "word" in the lexicon free case can be vague and not all languages have whitespaces between words, etc Pull Request resolved: https://github.com/pytorch/audio/pull/2603 Reviewed By: mthrok Differential Revision: D38459709 Pulled By: carolineechen fbshipit-source-id: d64ff186df4633f00e94c64afeaa6a50cebf2934
-
Ravi Makhija authored
Summary: Added example for `SlidingWindowCmn` transform as mentioned in issue https://github.com/pytorch/audio/issues/1564 Pull Request resolved: https://github.com/pytorch/audio/pull/2600 Reviewed By: mthrok Differential Revision: D38395579 Pulled By: carolineechen fbshipit-source-id: 44c5b7181789eedcaaa1d80149d5a1ab8de4c0ba
-
Ravi Makhija authored
Summary: Added example for Vad transform as mentioned in issue https://github.com/pytorch/audio/issues/1564 Pull Request resolved: https://github.com/pytorch/audio/pull/2598 Reviewed By: mthrok Differential Revision: D38432103 Pulled By: carolineechen fbshipit-source-id: 8f7e26c48d4ffb6bfe55bba6f9c7ee915e6edaef
-
- 04 Aug, 2022 1 commit
-
-
Omkar Vichare authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2599 Bootcamp task T127107566. Replacing assert statements with if ... then raise so the code can be run in optimized mode. Reviewed By: mthrok Differential Revision: D38370108 fbshipit-source-id: 74eaf5b72c511b62ddbb8e0e3b0ed638ad49e4f2
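Why the distinction matters can be shown with a small, self-contained sketch. Python drops `assert` statements when code is compiled with optimization (as with `python -O`), so assert-based validation silently disappears, while an explicit `if ... raise` survives. The functions `check_assert` and `check_raise` and the helper `load_checks` are hypothetical names for this illustration and are not taken from torchaudio.

```python
SOURCE = '''
def check_assert(x):
    assert x > 0, "x must be positive"
    return x

def check_raise(x):
    if x <= 0:
        raise ValueError(f"x must be positive, got {x}")
    return x
'''


def load_checks(optimize):
    """Compile SOURCE at the given optimization level and return its namespace.

    optimize=0 keeps assert statements; optimize=1 strips them, which mirrors
    running the interpreter with the -O flag.
    """
    namespace = {}
    exec(compile(SOURCE, "<validation-demo>", "exec", optimize=optimize), namespace)
    return namespace


normal = load_checks(0)
optimized = load_checks(1)

# In optimized mode the assert is gone, so the invalid value passes through.
silently_accepted = optimized["check_assert"](-1)

# The explicit check still rejects the value in optimized mode.
try:
    optimized["check_raise"](-1)
    explicit_check_fired = False
except ValueError:
    explicit_check_fired = True
```

An explicit `raise` also lets each check carry a specific exception type and message, rather than a bare `AssertionError`.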
-
- 03 Aug, 2022 2 commits
-
-
Sean Kim authored
Summary: Add new model pretrained weights and tests Pull Request resolved: https://github.com/pytorch/audio/pull/2601 Reviewed By: carolineechen, nateanl Differential Revision: D38396673 Pulled By: skim0514 fbshipit-source-id: e06f97d28508543bc18e671344386a947bc870c1
-
bshall authored
Summary: I took a stab at implementing the ITU-R BS.1770-4 loudness recommendation (closes https://github.com/pytorch/audio/issues/1205). To give some more details: - I've implemented K-weighting following csteinmetz1 instead of BrechtDeMan since it fit well with torchaudio's already implemented filters (`treble_biquad` and `highpass_biquad`). - I've added four audio files to test compliance with the recommendation. These are linked in [this pdf](https://www.itu.int/dms_pub/itu-r/opb/rep/R-REP-BS.2217-2-2016-PDF-E.pdf). There are many more test files there but I didn't want to bog down the assets directory with too many files. Let me know if I should add or remove anything. - I've kept many of the constants internal to the function (e.g. the block duration, overlap, and the absolute threshold gamma). I'm not sure if these should be exposed in the signature. - I've implemented support for up to 5 channels (following both csteinmetz1 and BrechtDeMan). The recommendation includes weights for up to 24 channels. Is there any convention for how many channels to support? I hope this is helpful! Looking forward to hearing from you. Pull Request resolved: https://github.com/pytorch/audio/pull/2472 Reviewed By: hwangjeff Differential Revision: D38389155 Pulled By: carolineechen fbshipit-source-id: fcc86d864c04ab2bedaa9acd941ebc4478ca6904
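The gating stage that the constants above refer to can be sketched as follows. This is an illustration of the two-stage gating described in the recommendation, not the code from this PR: the function name `integrated_loudness` is hypothetical, and the input is assumed to be precomputed mean-square values of K-weighted audio, one list of per-channel values per gating block (the filtering and the 400 ms overlapping block segmentation are not shown).

```python
import math

# Channel weights for L, R, C, Ls, Rs as given in the recommendation.
CHANNEL_WEIGHTS = [1.0, 1.0, 1.0, 1.41, 1.41]
ABSOLUTE_THRESHOLD = -70.0  # LKFS
RELATIVE_OFFSET = -10.0  # LU below the absolutely-gated loudness


def _loudness(power):
    """Loudness in LKFS of a weighted mean-square power."""
    return -0.691 + 10.0 * math.log10(power)


def integrated_loudness(block_mean_squares):
    """Gated loudness from per-block, per-channel mean squares.

    block_mean_squares[j][i] is the mean square of K-weighted channel i in
    gating block j (an assumption of this sketch).
    """
    powers = [
        sum(g * z for g, z in zip(CHANNEL_WEIGHTS, block))
        for block in block_mean_squares
    ]
    # Stage 1: drop blocks at or below the absolute threshold (and silent blocks).
    powers = [p for p in powers if p > 0 and _loudness(p) > ABSOLUTE_THRESHOLD]
    if not powers:
        return float("-inf")
    # Stage 2: drop blocks more than 10 LU below the stage-1 loudness.
    relative_threshold = _loudness(sum(powers) / len(powers)) + RELATIVE_OFFSET
    gated = [p for p in powers if _loudness(p) > relative_threshold]
    return _loudness(sum(gated) / len(gated))
```

Because the weights list is zipped against each block, a stereo input simply uses the first two weights; supporting more than five channels would only require extending `CHANNEL_WEIGHTS`.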
-
- 02 Aug, 2022 1 commit
-
-
Eli Uriegas authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2581 Also removes spurious lines of code that were erroring out silently Signed-off-by: Eli Uriegas <eliuriegas@fb.com> Test Plan: Imported from OSS Reviewed By: carolineechen Differential Revision: D38336705 Pulled By: seemethere fbshipit-source-id: 700a969a4bace7d9ca94a9db908b29f383b7d94e
-
- 01 Aug, 2022 1 commit
-
-
Ravi Makhija authored
Summary: Added example for [Vol transform](https://pytorch.org/audio/stable/transforms.html#torchaudio.transforms.Vol) as mentioned in this issue https://github.com/pytorch/audio/issues/1564. Also made a minor edit to the docstring for `class Vol` to fix a grammar typo and use more common verbiage. Pull Request resolved: https://github.com/pytorch/audio/pull/2597 Reviewed By: nateanl, mthrok Differential Revision: D38316433 Pulled By: carolineechen fbshipit-source-id: 0be8fc505800a59acdab843813767acfdeac8243
-