- 24 May, 2023 1 commit
-
-
moto authored
Summary: Follow-up https://github.com/pytorch/audio/issues/3045 - Revert the removal of HW acceleration doc - comment out FFmpeg CLI test run Pull Request resolved: https://github.com/pytorch/audio/pull/3349 Reviewed By: nateanl Differential Revision: D46121899 Pulled By: mthrok fbshipit-source-id: dfc030a69f05addec73637cfb6a720c184e37323
-
- 23 May, 2023 1 commit
-
-
Xiaohui Zhang authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3356 move the forced aligner tutorial to torchaudio, with some formatting changes Reviewed By: mthrok Differential Revision: D46060238 fbshipit-source-id: d90e7db5669a58d1e9ef5c2ec3c6d175b4e394ec
-
- 22 May, 2023 1 commit
-
-
Zhaoheng Ni authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3355 Reviewed By: xiaohui-zhang Differential Revision: D46060254 Pulled By: nateanl fbshipit-source-id: c2e44f994739755daf049fe350dd24a987a9cc29
-
- 19 May, 2023 1 commit
-
-
moto authored
Summary: This commit adds a step to build FFmpeg with the GPU decoder in the build_doc job so that we can use the GPU decoder/encoder in the documentation. Pull Request resolved: https://github.com/pytorch/audio/pull/3045 Reviewed By: nateanl Differential Revision: D45965739 Pulled By: mthrok fbshipit-source-id: c167eb3ef347860a51efa906068fa2daa556f017
-
- 17 May, 2023 1 commit
-
-
Carl Parker authored
Summary: Previously, `breadcrumbs.html` identified a nightly build version by the prefix "Nightly", which would normally be prepended to the version in `conf.py`. However, the version string is coming through without the "Nightly" prefix, so this change causes `breadcrumbs.html` to key on the substring "dev" instead. The reason we aren't getting "Nightly" is apparently that the environment variable `BUILD_VERSION` is available, so `conf.py` is using the value of that env var instead of the version string imported from the `torchaudio` module itself, which actually appears to be incorrect; see below. If I install torchaudio using `conda install torchaudio -c pytorch-nightly`, then `torchaudio.__version__` returns the incorrect version string: `2.0.0.dev20230309` Pull Request resolved: https://github.com/pytorch/audio/pull/3333 Reviewed By: mthrok Differential Revision: D45926466 Pulled By: carljparker fbshipit-source-id: d5516f2d9f1716c2400d3e9b285bd5d32b4b3a77
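The version-selection logic described above can be sketched roughly as follows. This is a minimal Python illustration, not the actual `conf.py` or `breadcrumbs.html` code (the real check lives in a template); the helper names `resolve_version` and `is_nightly` are hypothetical.

```python
from typing import Mapping


def resolve_version(env: Mapping[str, str], module_version: str) -> str:
    """Prefer BUILD_VERSION from the environment, else fall back to the module version.

    Hypothetical helper mirroring the behaviour the commit message attributes to conf.py.
    """
    return env.get("BUILD_VERSION") or module_version


def is_nightly(version: str) -> bool:
    """Treat any version string containing "dev" as a nightly build."""
    return "dev" in version


# Example with the version string quoted in the commit message.
version = resolve_version({"BUILD_VERSION": "2.0.0.dev20230309"}, "2.0.0")
nightly = is_nightly(version)
```

Keying on the substring rather than a "Nightly" prefix means the check works regardless of which source supplied the version string.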
-
- 16 May, 2023 2 commits
-
-
moto authored
Summary: This commit upgrades the version of FFmpeg that the TorchAudio binary distribution is compiled against to 5.0.4. FFmpeg 5.0 was released in Jan 2022, and many package managers provide a version of FFmpeg v5. Conda-forge lists 5.1 for all the platforms TorchAudio supports. https://anaconda.org/conda-forge/ffmpeg Pull Request resolved: https://github.com/pytorch/audio/pull/3298 Reviewed By: hwangjeff Differential Revision: D45865599 Pulled By: mthrok fbshipit-source-id: d95638eb80daaf477a710a992f4ead9b9009bb9b
-
moto authored
Summary: TorchAudio has migrated the CTC decoder to flashlight-text, and the code related to the CTC decoder was removed in https://github.com/pytorch/audio/issues/3236. This commit cleans up the residue: it removes the third-party libraries used for the CTC decoder and the mention of the environment variable for the CTC decoder. Pull Request resolved: https://github.com/pytorch/audio/pull/3339 Reviewed By: nateanl Differential Revision: D45920878 Pulled By: mthrok fbshipit-source-id: 8d93e64138697781570e5b0b1c9f86e1a7923a89
-
- 11 May, 2023 1 commit
-
-
moto authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3325 Reviewed By: hwangjeff Differential Revision: D45759434 Pulled By: mthrok fbshipit-source-id: f3b1127fcf3b23beeab61fb7ff18f1b89b11ddc6
-
- 10 May, 2023 2 commits
-
-
moto authored
Summary: https://output.circle-artifacts.com/output/job/fbfa6d9a-5014-42ac-8e77-c1e9565747e8/artifacts/0/docs/tutorials/effector_tutorial.html Pull Request resolved: https://github.com/pytorch/audio/pull/3226 Reviewed By: nateanl Differential Revision: D45402724 Pulled By: mthrok fbshipit-source-id: bc9d1bc071f6f5062b9cc35d743b4a3016306262
-
moto authored
Summary: This commit is preparation for landing the dispatcher switch in https://github.com/pytorch/audio/issues/3241. Making the FFmpeg backend the default causes some issues in the tutorials, so this commit disables it. The IO tutorial will be updated after https://github.com/pytorch/audio/issues/3241 has landed to accommodate the change. Since it is necessary to mention the migration-related changes in the IO tutorial, I also updated the IO documentation to include the migration work so that it's easy to redirect. Pull Request resolved: https://github.com/pytorch/audio/pull/3285 Reviewed By: nateanl Differential Revision: D45671237 Pulled By: mthrok fbshipit-source-id: cb541f6bd93cd9920019b8ec83210ea69d34f133
-
- 29 Apr, 2023 1 commit
-
-
Zhaoheng Ni authored
Summary: The PR adds a tutorial that demonstrates how to use pre-trained `TorchAudio-SQUIM` pipelines to estimate objective and subjective metric scores (PESQ, STOI, Si-SDR, MOS). Pull Request resolved: https://github.com/pytorch/audio/pull/3279 Reviewed By: hwangjeff Differential Revision: D45415404 Pulled By: nateanl fbshipit-source-id: abcaeadcca0eabc2dca53b607eac6257a701c903
-
- 28 Apr, 2023 1 commit
-
-
Yuekai Zhang authored
Summary: This PR implements a CUDA-based CTC prefix beam search decoder. Several benchmark results on a V100 are attached below:

| decoder type | model | dataset | decoding time (s) | beam size | batch size | model unit | subsampling times | vocab size |
|--------------|-------|---------|-------------------|-----------|------------|------------|-------------------|------------|
| cuctc | conformer nemo | dev clean | 7.68 | 8 | 32 | bpe | 4 | 1000 |
| cuctc | conformer nemo | dev clean (sort by length) | 1.6 | 8 | 32 | bpe | 4 | 1000 |
| cuctc | wav2vec2.0 torchaudio | dev clean | 22 | 10 | 1 | char | 2 | 29 |
| cuctc | conformer espnet | aishell1 test | 5 | 10 | 24 | char | 4 | 4233 |

Note:
1. The design parallelizes computation across the batch and vocab axes and loops over the frame axis, so compared with CPU implementations it favors shorter sequences and larger vocabularies.
2. WER is the same as with CPU implementations. However, it cannot decode with an LM yet.

Resolves: https://github.com/pytorch/audio/issues/2957. Pull Request resolved: https://github.com/pytorch/audio/pull/3096 Reviewed By: nateanl Differential Revision: D44709397 Pulled By: mthrok fbshipit-source-id: 3078c54a2b44dc00eb4a81b4c657487eeff8c155
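For readers unfamiliar with the algorithm, CTC prefix beam search without an LM can be sketched on the CPU as below. This is a plain-Python reference sketch, not the CUDA implementation from this PR; the function names and the input layout (a list of per-frame log-probability lists) are assumptions made for illustration. The outer loop runs over frames, which is the axis the PR iterates sequentially, while the inner loop over tokens corresponds to the vocab axis that the CUDA decoder parallelizes.

```python
import math
from collections import defaultdict

NEG_INF = float("-inf")


def logsumexp(*xs):
    """Numerically stable log(sum(exp(x))) that tolerates -inf inputs."""
    m = max(xs)
    if m == NEG_INF:
        return NEG_INF
    return m + math.log(sum(math.exp(x - m) for x in xs))


def ctc_prefix_beam_search(log_probs, beam_size, blank=0):
    """Return (best token sequence, its log probability).

    log_probs: list of frames, each a list of per-token log probabilities.
    Each beam maps a prefix to (log p ending in blank, log p ending in non-blank).
    """
    beams = {(): (0.0, NEG_INF)}
    for frame in log_probs:  # sequential over the frame axis
        nxt = defaultdict(lambda: (NEG_INF, NEG_INF))
        for prefix, (p_b, p_nb) in beams.items():
            for token, lp in enumerate(frame):  # vocab axis
                if token == blank:
                    # Blank keeps the prefix and moves mass to "ends in blank".
                    b, nb = nxt[prefix]
                    nxt[prefix] = (logsumexp(b, p_b + lp, p_nb + lp), nb)
                elif prefix and token == prefix[-1]:
                    # Repeated token collapses unless a blank separated the two.
                    b, nb = nxt[prefix]
                    nxt[prefix] = (b, logsumexp(nb, p_nb + lp))
                    ext = prefix + (token,)
                    b, nb = nxt[ext]
                    nxt[ext] = (b, logsumexp(nb, p_b + lp))
                else:
                    # A new token extends the prefix from either state.
                    ext = prefix + (token,)
                    b, nb = nxt[ext]
                    nxt[ext] = (b, logsumexp(nb, p_b + lp, p_nb + lp))
        # Keep only the highest-scoring prefixes.
        ranked = sorted(nxt.items(), key=lambda kv: logsumexp(*kv[1]), reverse=True)
        beams = dict(ranked[:beam_size])
    best_prefix, best_scores = max(beams.items(), key=lambda kv: logsumexp(*kv[1]))
    return list(best_prefix), logsumexp(*best_scores)


# Two frames, vocabulary {0: blank, 1: "a"}, each frame with P(blank)=0.4, P(a)=0.6.
frames = [[math.log(0.4), math.log(0.6)]] * 2
tokens, score = ctc_prefix_beam_search(frames, beam_size=4)
```

In this toy input, the paths "a-blank", "blank-a", and "a-a" all collapse to the single-token output, so its probability is 0.24 + 0.24 + 0.36 = 0.84.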
-
- 11 Apr, 2023 1 commit
-
-
moto authored
Summary: GCC should not be used when building FFmpeg for torchaudio, as torchaudio uses MSVC (cl.exe) Pull Request resolved: https://github.com/pytorch/audio/pull/3257 Reviewed By: nateanl Differential Revision: D44835169 Pulled By: mthrok fbshipit-source-id: 038c70caae58cec47dd2d6d08b8244c193104eda
-
- 10 Apr, 2023 1 commit
-
-
Zhaoheng Ni authored
Summary: - Add citations of [`TorchAudio-Squim`](https://arxiv.org/abs/2304.01448) publication. - Update descriptions in the `SQUIM_OBJECTIVE` and `SQUIM_SUBJECTIVE` pipelines. Pull Request resolved: https://github.com/pytorch/audio/pull/3254 Reviewed By: hwangjeff Differential Revision: D44802015 Pulled By: nateanl fbshipit-source-id: ca08298ec1eafefdd671ff2e010ef18f7372f9f8
-
- 01 Apr, 2023 1 commit
-
-
moto authored
Summary: This commit adds a new feature AudioEffector, which can be used to apply various effects and codecs to waveforms in Tensor. Under the hood it uses StreamWriter and StreamReader to apply filters and encode/decode. This is going to replace the deprecated `apply_codec` and `apply_sox_effect_tensor` functions. It can also perform online, chunk-by-chunk filtering. Tutorial to follow. closes https://github.com/pytorch/audio/issues/3161 Pull Request resolved: https://github.com/pytorch/audio/pull/3163 Reviewed By: hwangjeff Differential Revision: D44576660 Pulled By: mthrok fbshipit-source-id: 2c5cc87082ab431315d29d56d6ac9efaf4cf7aeb
-
- 27 Mar, 2023 1 commit
-
-
hwangjeff authored
Summary: For `StreamWriter`, * Renames arg `config` to `codec_config`. * Renames struct `EncodingConfig` and dataclass `EncodeConfig` to `CodecConfig`. * Adds docstrings for arg `codec_config`. * Updates `chunk` to `frames` in `write_*_chunk` methods. Pull Request resolved: https://github.com/pytorch/audio/pull/3203 Reviewed By: mthrok Differential Revision: D44350153 Pulled By: hwangjeff fbshipit-source-id: 1b940b1366a43ec0565c362bfcbf62744088b343
-
- 23 Mar, 2023 2 commits
-
-
Zhaoheng Ni authored
Summary: The PR adds the pre-trained pipeline for `SquimSubjective` model which predicts MOS score for speech enhancement task. Pull Request resolved: https://github.com/pytorch/audio/pull/3197 Reviewed By: mthrok Differential Revision: D44313244 Pulled By: nateanl fbshipit-source-id: 905095ff77006e9f441faa826fc25d9d8681e8aa
-
Zhaoheng Ni authored
Summary: In the nightly documentation, "Prototype Factory Functions of Beta Models" is listed as an individual section, which is not correct. <img width="310" alt="image" src="https://user-images.githubusercontent.com/8653221/227262349-604b99e8-1b20-4b19-9711-81e7b6cfa62e.png"> After the PR, the section outlook is fixed <img width="285" alt="image" src="https://user-images.githubusercontent.com/8653221/227262893-b938d81e-6c4b-432a-833c-95981bca5e65.png"> Pull Request resolved: https://github.com/pytorch/audio/pull/3202 Reviewed By: mthrok Differential Revision: D44338663 Pulled By: nateanl fbshipit-source-id: 09f591b9e4af66ebf34fb423bd5c30d4630f0b88
-
- 21 Mar, 2023 3 commits
-
-
Zhaoheng Ni authored
Summary: Add model architecture and factory functions for `SquimSubjective` which predicts subjective evaluation metric scores (e.g. MOS) for speech enhancement task. Pull Request resolved: https://github.com/pytorch/audio/pull/3189 Reviewed By: mthrok Differential Revision: D44267255 Pulled By: nateanl fbshipit-source-id: f8060398b14c625b38ea1bb2417f61aeaec3f1db
-
moto authored
Summary: To suppress local warning of flake8 <120 Pull Request resolved: https://github.com/pytorch/audio/pull/3191 Reviewed By: nateanl Differential Revision: D44263027 Pulled By: mthrok fbshipit-source-id: b3e48dba21fc5c9813f07e624a93f38a68956c6e
-
Zhaoheng Ni authored
Summary: In librosa 0.10 release, positional arguments are deprecated (see https://github.com/librosa/librosa/pull/1521 for details). The PR fixes the HiFiGAN unit test by using keyword arguments for `librosa.filters.mel` function. Pull Request resolved: https://github.com/pytorch/audio/pull/3185 Reviewed By: mthrok Differential Revision: D44218852 Pulled By: nateanl fbshipit-source-id: 6171f7bec6a2144917697c1d640e701d95ec60d7
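The effect of dropping positional arguments can be illustrated with a small stand-in. The `mel` function below is a hypothetical placeholder with a keyword-only signature, not `librosa.filters.mel` itself, and the parameter values are arbitrary; it only shows why call sites have to switch to keyword arguments.

```python
def mel(*, sr: int, n_fft: int, n_mels: int = 128) -> tuple:
    """Hypothetical stand-in with keyword-only parameters (note the bare `*`)."""
    return (sr, n_fft, n_mels)


# A positional call is rejected once the parameters are keyword-only.
try:
    mel(16000, 400, 80)
    positional_ok = True
except TypeError:
    positional_ok = False

# The keyword form keeps working across both calling conventions.
keyword_result = mel(sr=16000, n_fft=400, n_mels=80)
```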
-
- 17 Mar, 2023 2 commits
-
-
moto authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3182 Reviewed By: nateanl Differential Revision: D44167810 Pulled By: mthrok fbshipit-source-id: 6ecbae54224ef7ba32835e4006aa5f2dc16b9acb
-
moto authored
Summary: Adds config object `EncodingConfig` and modifies `StreamWriter` to allow for passing in additional encoder configuration parameters, e.g. bit rate and compression level. Pull Request resolved: https://github.com/pytorch/audio/pull/3179 Pull Request resolved: https://github.com/pytorch/audio/pull/3164 Reviewed By: mthrok Differential Revision: D43861413 Pulled By: hwangjeff fbshipit-source-id: c1682cb2f6e682ab6f1a506511d2be7c7b254161
-
- 15 Mar, 2023 1 commit
-
-
Carl Parker authored
Summary: - Boldface the version-selection UX and increase size by three percent. - Add text to breadcrumbs to indicate version and stability. - New `breadcrumbs.html` in `_templates` overrides Sphinx version. I create a new variable in `conf.py`, **version_stable**, which has the version number for the most-recent stable release. I define this variable in the **html_context** dictionary so that it is visible to the templates. I use this approach because I was not able to find any other way of discerning the current stable release during the build. Note that the `versions.html` file--which identifies the current stable release--appears to be available only in the **gh-pages** branch and so it is not available at build time. However, this means that someone will need to update `conf.py` whenever the current stable release changes. Pull Request resolved: https://github.com/pytorch/audio/pull/3167 Reviewed By: mthrok Differential Revision: D44112224 Pulled By: carljparker fbshipit-source-id: e76f5cb6734a784d161342964459577aa9b64cac
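The `conf.py` approach described above might look roughly like this. It is a sketch rather than the actual file: the value `"X.Y.Z"` is a placeholder, and only the `version_stable` variable and the `html_context` dictionary come from the commit message.

```python
# Hypothetical excerpt of a Sphinx conf.py.
# "X.Y.Z" is a placeholder; it must be updated by hand whenever the
# current stable release changes, as the commit message notes.
version_stable = "X.Y.Z"

# Sphinx passes html_context entries to the HTML templates, so
# breadcrumbs.html can read `version_stable` at build time.
html_context = {
    "version_stable": version_stable,
}
```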
-
- 14 Mar, 2023 2 commits
-
-
hwangjeff authored
Summary: Adds documentation that introduces forthcoming I/O backend revision and provides enablement directions for the current release. Doc pages: https://output.circle-artifacts.com/output/job/9c0e5a49-eaf4-404c-b910-ca1b18bb289b/artifacts/0/docs/torchaudio.html Pull Request resolved: https://github.com/pytorch/audio/pull/3147 Reviewed By: mthrok Differential Revision: D43824019 Pulled By: hwangjeff fbshipit-source-id: ad21d60c7e8f69f64859c56a8ca75735ddc22e40
-
Zhaoheng Ni authored
Summary: Add `2.0.0` release to the compatibility matrix Pull Request resolved: https://github.com/pytorch/audio/pull/3168 Reviewed By: mthrok Differential Revision: D44059197 Pulled By: nateanl fbshipit-source-id: a2830d059be90eddeab72b30e85cdfc393369bf8
-
- 08 Mar, 2023 1 commit
-
-
moto authored
Summary: This commit adds fields to OutputStream that show the result of filters, such as width and height after filtering.

Before

```
OutputStream(
    source_index=0,
    filter_description='fps=3,scale=width=320:height=320,format=pix_fmts=gray')
```

After

```
OutputVideoStream(
    source_index=0,
    filter_description='fps=3,scale=width=320:height=320,format=pix_fmts=gray',
    media_type='video',
    format='gray',
    width=320,
    height=320,
    frame_rate=3.0)
```

Pull Request resolved: https://github.com/pytorch/audio/pull/3155 Reviewed By: nateanl Differential Revision: D43882399 Pulled By: mthrok fbshipit-source-id: 620676b1a06f293fdd56de8203a11120f228fa2d
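The shape of the new output can be modeled with plain dataclasses. The classes below are a hypothetical sketch that mirrors the field names printed in the "After" example; they are not the torchaudio definitions, and the base/subclass split is an assumption.

```python
from dataclasses import dataclass


@dataclass
class OutputStream:
    """Fields assumed to be common to every output stream (hypothetical)."""
    source_index: int
    filter_description: str
    media_type: str
    format: str


@dataclass
class OutputVideoStream(OutputStream):
    """Video-specific fields reflecting the result of the filter graph."""
    width: int
    height: int
    frame_rate: float


info = OutputVideoStream(
    source_index=0,
    filter_description="fps=3,scale=width=320:height=320,format=pix_fmts=gray",
    media_type="video",
    format="gray",
    width=320,
    height=320,
    frame_rate=3.0,
)
```

With these fields a caller can read the post-filter frame size directly instead of parsing `filter_description`.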
-
- 02 Mar, 2023 1 commit
-
-
moto authored
Summary: Fix build_doc job https://app.circleci.com/pipelines/github/pytorch/audio/15217/workflows/ce50b317-a59e-4741-b8d2-59129420deb8 - build.ffmpeg.html might not exist when IPython notebook is processed. Changing to main doc URL. - Fix bash cell syntax in HW tutorial - Fix C++ doc - Fix duplicated target name in streamwriter tutorial Pull Request resolved: https://github.com/pytorch/audio/pull/3125 Reviewed By: xiaohui-zhang Differential Revision: D43724078 Pulled By: mthrok fbshipit-source-id: ea7d46ec5e377cf2fbd7c3798df57da73750ac5c
-
- 27 Feb, 2023 1 commit
-
-
Zhaoheng Ni authored
Summary: Add pre-trained pipeline support for `SquimObjective` model. The pre-trained model is trained on DNS 2020 challenge dataset. Pull Request resolved: https://github.com/pytorch/audio/pull/3103 Reviewed By: xiaohui-zhang, mthrok Differential Revision: D43611794 Pulled By: nateanl fbshipit-source-id: 0ac76a27e7027a43ffccb158385ddb2409b8526d
-
- 24 Feb, 2023 2 commits
-
-
moto authored
Summary: This commit is a cleanup and preparation for future development. We plan to pass around more complicated objects among StreamReader and StreamWriter, and TorchBind is not expressive enough for defining intermediate objects, so we use PyBind11 for binding StreamWriter. Pull Request resolved: https://github.com/pytorch/audio/pull/3091 Reviewed By: xiaohui-zhang Differential Revision: D43515714 Pulled By: mthrok fbshipit-source-id: 9097bb104bbf8c1536a5fab6f87447c08b10a7f2
-
Zhaoheng Ni authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3084 Reviewed By: mthrok Differential Revision: D43550150 Pulled By: nateanl fbshipit-source-id: 5c5e3d9461e375be202493e3399ff38ce5cd7690
-
- 22 Feb, 2023 1 commit
-
-
Zhaoheng Ni authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3042 Reviewed By: mthrok Differential Revision: D43405932 Pulled By: nateanl fbshipit-source-id: 88f6dabae35565b699230e9909b8f68f4a57f5c7
-
- 15 Feb, 2023 1 commit
-
-
moto authored
Summary: * Mention context manager in StreamWriter * Add FFmpeg as optional dependency Pull Request resolved: https://github.com/pytorch/audio/pull/3064 Reviewed By: hwangjeff Differential Revision: D43307818 Pulled By: mthrok fbshipit-source-id: 86339d973aba85e090f520e08af65b5d736e3d18
-
- 14 Feb, 2023 2 commits
-
-
moto authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3053 Reviewed By: nateanl Differential Revision: D43238766 Pulled By: mthrok fbshipit-source-id: 4f82878b1c97b0e6a35af75855849b86200e6061
-
Zhaoheng Ni authored
Summary: replicate of https://github.com/pytorch/audio/issues/2644 Pull Request resolved: https://github.com/pytorch/audio/pull/2880 Reviewed By: mthrok Differential Revision: D41633911 Pulled By: nateanl fbshipit-source-id: 73cf145d75c389e996aafe96571ab86dc21f86e5
-
- 11 Feb, 2023 1 commit
-
-
moto authored
Summary: Per https://github.com/pytorch/audio/issues/3040 and https://github.com/pytorch/audio/issues/3041, it turned out that Google Colab now has FFmpeg with the GPU decoder/encoder preinstalled, and installing FFmpeg manually corrupts the environment. This commit updates the tutorial by extracting the how-to-install part and moving it to the installation/build section. closes https://github.com/pytorch/audio/issues/3041 closes https://github.com/pytorch/audio/issues/3040 Pull Request resolved: https://github.com/pytorch/audio/pull/3050 Reviewed By: nateanl Differential Revision: D43166054 Pulled By: mthrok fbshipit-source-id: 32667f292a796344d5fcde86e8231e15ad904e58
-
- 09 Feb, 2023 1 commit
-
-
moto authored
Summary: - Add documentation - Tweak docstring - Fix import Pull Request resolved: https://github.com/pytorch/audio/pull/3051 Reviewed By: weiwangmeta, atalman, nateanl Differential Revision: D43166081 Pulled By: mthrok fbshipit-source-id: 7d77aa34a6318a64824626cff8372f8b9aebf6f9
-
- 07 Feb, 2023 1 commit
-
-
moto authored
Summary: Add a section about installation/build https://output.circle-artifacts.com/output/job/f121cd38-68f3-47a3-ac29-c7b0cfe94c77/artifacts/0/docs/installation.html <img width="1102" alt="Screenshot 2023-02-06 at 6 13 50 PM" src="https://user-images.githubusercontent.com/855818/217108551-622b117b-209e-4776-b5d6-d6934c8126a4.png"> https://output.circle-artifacts.com/output/job/f121cd38-68f3-47a3-ac29-c7b0cfe94c77/artifacts/0/docs/build.html <img width="1072" alt="Screenshot 2023-02-06 at 6 13 57 PM" src="https://user-images.githubusercontent.com/855818/217108568-c125cdc2-9d6a-4c1d-a155-2cee40c9dac6.png"> Pull Request resolved: https://github.com/pytorch/audio/pull/3038 Reviewed By: hwangjeff, nateanl Differential Revision: D43083469 Pulled By: mthrok fbshipit-source-id: e0b5b76dbf706552dd60ae26ea40ebc98627e3b0
-
- 01 Feb, 2023 1 commit
-
-
moto authored
Summary: Adding C++ documentation. (C++ APIs are categorized as prototype, though it's used by Python beta APIs.) https://output.circle-artifacts.com/output/job/69654229-a99e-4b15-9ce0-7bc6bcf01101/artifacts/0/docs/libtorchaudio.html <img width="1202" alt="Screenshot 2023-01-31 at 11 48 47 AM" src="https://user-images.githubusercontent.com/855818/215828167-d23032f8-9e40-4413-b5b1-5cbd12d705e9.png"> Pull Request resolved: https://github.com/pytorch/audio/pull/2994 Reviewed By: hwangjeff Differential Revision: D42876621 Pulled By: mthrok fbshipit-source-id: d8b8d610b87ec766501baa88b7506368a9905a6a
-
- 27 Jan, 2023 1 commit
-
-
hwangjeff authored
Summary: Moves `AddNoise`, `Convolve`, `FFTConvolve`, `Speed`, `SpeedPerturbation`, `Deemphasis`, and `Preemphasis` out of `torchaudio.prototype.transforms` and into `torchaudio.transforms`. Pull Request resolved: https://github.com/pytorch/audio/pull/3009 Reviewed By: xiaohui-zhang, mthrok Differential Revision: D42730322 Pulled By: hwangjeff fbshipit-source-id: 43739ac31437150d3127e51eddc0f0bba5facb15
-