- 14 Aug, 2023 1 commit
-
-
moto authored
Summary: * Merge backend doc into torchaudio toplevel doc * Update backend, dispatcher, installation doc Pull Request resolved: https://github.com/pytorch/audio/pull/3555 Reviewed By: huangruizhe Differential Revision: D48326812 Pulled By: mthrok fbshipit-source-id: cc0d7326eacfebd341323b5d613ca1777255748b
-
- 11 Aug, 2023 1 commit
-
-
moto authored
Summary: `torchaudio.info` returns `AudioMetaData`. It should be exposed as a public API, without referring to the `backend` submodule. Pull Request resolved: https://github.com/pytorch/audio/pull/3556 Reviewed By: huangruizhe Differential Revision: D48267349 Pulled By: mthrok fbshipit-source-id: 6ccc0c32bf62fbdcb71495fc7d8d4cc29891538a
-
- 10 Aug, 2023 1 commit
-
-
Jeff Hwang authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3545 Adds function for computing the Fréchet distance between two multivariate normal distributions. Reviewed By: mthrok Differential Revision: D48126102 fbshipit-source-id: e4e122b831e1e752037c03f5baa9451e81ef1697
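For context, the closed form usually quoted for two Gaussians N(mu_x, S_x) and N(mu_y, S_y) is ||mu_x - mu_y||^2 + tr(S_x + S_y - 2 (S_x S_y)^(1/2)). The sketch below is not the function added in this commit. It covers only the special case of diagonal covariances, where the trace term reduces to a sum of squared differences of standard deviations. The name `frechet_distance_sq_diag` and its argument layout are assumptions made for illustration.

```python
import math


def frechet_distance_sq_diag(mu_x, var_x, mu_y, var_y):
    """Squared Frechet distance between two Gaussians with DIAGONAL covariance.

    Hypothetical helper for illustration. `mu_*` are per-dimension means and
    `var_*` are per-dimension variances (the diagonals of the covariances).
    """
    if not (len(mu_x) == len(var_x) == len(mu_y) == len(var_y)):
        raise ValueError("all inputs must have the same length")

    # Squared Euclidean distance between the means.
    mean_term = sum((a - b) ** 2 for a, b in zip(mu_x, mu_y))

    # For diagonal covariances the matrices commute, so
    # tr(S_x + S_y - 2 (S_x S_y)^(1/2)) = sum(v_x + v_y - 2 sqrt(v_x v_y)).
    cov_term = sum(vx + vy - 2.0 * math.sqrt(vx * vy) for vx, vy in zip(var_x, var_y))

    return mean_term + cov_term
```

Full covariance matrices need a matrix square root (or the eigenvalues of S_x S_y), which this sketch deliberately leaves out.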
-
- 07 Aug, 2023 2 commits
-
-
moto authored
Summary: Port the MMS FA model from tutorial to the library with post-processing module. Pull Request resolved: https://github.com/pytorch/audio/pull/3521 Reviewed By: huangruizhe Differential Revision: D48038285 Pulled By: mthrok fbshipit-source-id: 571cf0fceaaab4790983be2719f1a85805b814f5
-
moto authored
Summary: This commit adds the `merge_tokens` function, which removes repeated tokens from CTC token sequences returned from `forced_align`. Resolving repeated tokens is a necessary and almost universal step, so it makes sense to have such a helper function in torchaudio. Pull Request resolved: https://github.com/pytorch/audio/pull/3535 Reviewed By: huangruizhe Differential Revision: D48111202 Pulled By: mthrok fbshipit-source-id: 25354bfa210aa5c03f8c1d3e201f253ca3761b24
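A minimal sketch of the idea behind collapsing repeated CTC tokens, in plain Python. This is not the `merge_tokens` implementation: the function name `collapse_repeats`, the `(token, start, end)` tuple layout, and the assumption that the blank token has index 0 are all illustrative.

```python
from itertools import groupby


def collapse_repeats(tokens, blank=0):
    """Merge runs of identical frame-level tokens into (token, start, end) spans.

    Hypothetical helper for illustration. `end` is exclusive, and runs of the
    blank token are dropped after being used to separate repeated labels.
    """
    spans = []
    pos = 0
    for token, run in groupby(tokens):
        length = sum(1 for _ in run)  # number of consecutive frames with this token
        if token != blank:
            spans.append((token, pos, pos + length))
        pos += length
    return spans


# Frame-level alignment: blank, "1" over two frames, blank, "1" again, then "2" twice.
print(collapse_repeats([0, 1, 1, 0, 1, 2, 2, 0]))
# [(1, 1, 3), (1, 4, 5), (2, 5, 7)]
```

The blank between the two runs of token 1 is what keeps them as separate spans, which is why blanks are removed only after grouping.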
-
- 03 Aug, 2023 1 commit
-
-
moto authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3527 Reviewed By: huangruizhe Differential Revision: D48008822 Pulled By: mthrok fbshipit-source-id: 4beae2956dfd1f00534832b70a1bf0897cba7812
-
- 01 Aug, 2023 2 commits
-
-
Yuekai Zhang authored
Summary: Add a separate tutorial for cuctc. Resolves https://github.com/pytorch/audio/issues/3096 Pull Request resolved: https://github.com/pytorch/audio/pull/3297 Reviewed By: huangruizhe Differential Revision: D47928400 Pulled By: mthrok fbshipit-source-id: 8c16492fb4d007b6ea7969ba77c866a51749c0ec
-
hwangjeff authored
Summary: Adds pre-trained VGGish inference pipeline ported from https://github.com/harritaylor/torchvggish and https://github.com/tensorflow/models/tree/master/research/audioset. Pull Request resolved: https://github.com/pytorch/audio/pull/3491 Reviewed By: mthrok Differential Revision: D47738130 Pulled By: hwangjeff fbshipit-source-id: 859c1ff1ec1b09dae4e26586169544571657cc67
-
- 31 Jul, 2023 1 commit
-
-
moto authored
Summary: - Set global matplotlib rc params - Fix style check - Fix and update FA tutorial plots - Add av-asr index cards Pull Request resolved: https://github.com/pytorch/audio/pull/3515 Reviewed By: huangruizhe Differential Revision: D47894156 Pulled By: mthrok fbshipit-source-id: b40d8d31f12ffc2b337e35e632afc216e9d59a6e
-
- 28 Jul, 2023 3 commits
-
-
moto authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3517 Reviewed By: huangruizhe Differential Revision: D47858452 Pulled By: mthrok fbshipit-source-id: 62ee6c8bb2669dd70f8ca25703a04dc8a9d19aec
-
Zhaoheng Ni authored
Summary: This PR moves the `SquimObjective` and `SquimSubjective` models, along with the corresponding factory functions and pre-trained pipelines, out of prototype and into the core directory. They will be included in the next official release. Pull Request resolved: https://github.com/pytorch/audio/pull/3512 Reviewed By: mthrok Differential Revision: D47837434 Pulled By: nateanl fbshipit-source-id: d0639f29079f7e1afc30f236849e530c8cadffd8
-
Pingchuan Ma authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3511 Reviewed By: mthrok Differential Revision: D47852108 Pulled By: mpc001 fbshipit-source-id: c0ecb4b5bcc8670013dcbe1164e3929f5793c8aa
-
- 27 Jul, 2023 1 commit
-
-
moto authored
Summary: This commit updates the way libsox is integrated into torchaudio. 1. We stop statically linking libsox, so torchaudio will not ship libsox. 2. We link libsox dynamically. Users are expected to install libsox by themselves. 3. We use a stub library to build torchaudio. Pull Request resolved: https://github.com/pytorch/audio/pull/3497 Differential Revision: D47803706 Pulled By: mthrok fbshipit-source-id: 31b05495d81069186fa52d67beea360cc7e817a8
-
- 25 Jul, 2023 2 commits
-
-
moto authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3483 Differential Revision: D47725664 Pulled By: mthrok fbshipit-source-id: e4249e1488fa7af8670be4a5077957912ff3420b
-
Pingchuan Ma authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3492 Reviewed By: mthrok Differential Revision: D47755638 Pulled By: mpc001 fbshipit-source-id: 729efdb2a69b5656dbc0b70dd623c1509123d3aa
-
- 18 Jul, 2023 1 commit
-
-
moto authored
Summary: Now that GPU video decoders are available in doc CI, we run the tutorials with GPU decoders. Pull Request resolved: https://github.com/pytorch/audio/pull/3478 Differential Revision: D47519672 Pulled By: mthrok fbshipit-source-id: 2f95243100e9c75e17c2b4d306da164f0e31f8f2
-
- 15 Jul, 2023 1 commit
-
-
moto authored
Summary: The nightly builds support FFmpeg version 4, 5 and 6. Pull Request resolved: https://github.com/pytorch/audio/pull/3480 Differential Revision: D47482841 Pulled By: mthrok fbshipit-source-id: 88267f5e83ddc7b1e866b35e57a87b985e2c78c9
-
- 12 Jul, 2023 1 commit
-
-
moto authored
Summary: This commit introduces support for multiple FFmpeg versions for OSS binary distributions. Currently torchaudio only works with FFmpeg 4, which is inconvenient at every stage from installation to runtime linking. This commit allows picking FFmpeg 4, 5 or 6 at runtime, instead of just looking for v4. The way it works is that we compile the FFmpeg extension three times, each against a different FFmpeg version, and ship all of them. At runtime, we look for libavutil of a specific version and, when one is found, load the corresponding FFmpeg extension. The order of preference is 6, 5, then 4. To make the build process simple and reproducible, we use pre-built binaries of FFmpeg during the build. They are LGPL and downloaded from S3 at build time, instead of being built every time. The use of pre-built binaries as scaffolding limits the systems that can build torchaudio, so this commit also introduces a single FFmpeg version support mode. Setting FFMPEG_ROOT during the build changes the way binaries are built so that they support only one specific version of FFmpeg. Pull Request resolved: https://github.com/pytorch/audio/pull/3464 Differential Revision: D47300223 Pulled By: mthrok fbshipit-source-id: 560c7968315e4c8922afa11a4693f648c0356d04
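The runtime selection described above can be sketched roughly as follows. This is an illustration, not torchaudio's loader: `pick_ffmpeg_version`, `probe_linux`, and the Linux-style `libavutil.so.N` file name are assumptions. The mapping relies on the libavutil major versions that ship with FFmpeg 4, 5 and 6 (56, 57 and 58 respectively).

```python
import ctypes

# libavutil major version shipped with each FFmpeg major release.
LIBAVUTIL_MAJOR = {6: 58, 5: 57, 4: 56}


def pick_ffmpeg_version(is_available, preference=(6, 5, 4)):
    """Return the first FFmpeg major version whose libavutil can be found.

    `is_available` is a hypothetical probe: it takes a libavutil major version
    and returns True if that library can be loaded on this system.
    """
    for ffmpeg_major in preference:
        if is_available(LIBAVUTIL_MAJOR[ffmpeg_major]):
            return ffmpeg_major
    return None  # no supported FFmpeg found


def probe_linux(libavutil_major):
    """Example probe (assumption: Linux shared-object naming)."""
    try:
        ctypes.CDLL(f"libavutil.so.{libavutil_major}")
        return True
    except OSError:
        return False
```

Calling `pick_ffmpeg_version(probe_linux)` would then report which of the three bundled extensions to load, or `None` if no matching libavutil is installed. Passing the probe as an argument keeps the preference logic testable without FFmpeg present.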
-
- 11 Jul, 2023 1 commit
-
-
moto authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3469 Differential Revision: D47368140 Pulled By: mthrok fbshipit-source-id: d82ddb91ae1f6612298486fb8401f95c48db5620
-
- 28 Jun, 2023 1 commit
-
-
Pingchuan Ma authored
Summary: Include Conformer/Emformer RNN-T ASR/VSR/AV-ASR link to index.rst Pull Request resolved: https://github.com/pytorch/audio/pull/3441 Differential Revision: D47094158 Pulled By: mthrok fbshipit-source-id: 9ab42ac2bf52a5ce488003897ffba2f10a6ca941
-
- 21 Jun, 2023 2 commits
-
-
Jeff Hwang authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3427 Adds transform `ChromaSpectrogram` for generating chromagrams from waveforms as well as transform `ChromaScale` for generating chromagrams from linear-frequency spectrograms. Reviewed By: mthrok Differential Revision: D46547418 fbshipit-source-id: 250f298b8e11d8cf82f05536c29d51cf8d77a960
-
Xiaohui Zhang authored
Summary: Splitting the multilingual example part into another tutorial. Pull Request resolved: https://github.com/pytorch/audio/pull/3443 Reviewed By: mthrok Differential Revision: D46802844 Pulled By: xiaohui-zhang fbshipit-source-id: a7093053cac8b79d650d4f665db7fde2d8254998
-
- 08 Jun, 2023 1 commit
-
-
Jeff Hwang authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3395 Adds chroma filter bank function `chroma_filterbank` to `torchaudio.prototype.functional`. Reviewed By: mthrok Differential Revision: D46307672 fbshipit-source-id: c5d8104a8bb03da70d0629b5cc224e0d897148d5
-
- 07 Jun, 2023 1 commit
-
-
moto authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3415 Differential Revision: D46526437 Pulled By: mthrok fbshipit-source-id: f78d19c19d7e68f67712412de35d9ed50f47263b
-
- 05 Jun, 2023 1 commit
-
-
moto authored
Summary: Follow up of: https://github.com/pytorch/audio/pull/3368 Remove files and lines no longer used. Pull Request resolved: https://github.com/pytorch/audio/pull/3403 Differential Revision: D46441462 Pulled By: mthrok fbshipit-source-id: 11b881ec4b24fa0d625c6aee9f4bd91f637f9923
-
- 26 May, 2023 1 commit
-
-
atalman authored
Summary: This reverts commit d38a7854. This is a temporary revert to unblock the unit test migration from CircleCI to GitHub. Pull Request resolved: https://github.com/pytorch/audio/pull/3377 Reviewed By: mthrok Differential Revision: D46230498 Pulled By: atalman fbshipit-source-id: 000d8a9ca00750fc1ca61f4c2cdd6e930a5ce46d
-
- 24 May, 2023 2 commits
-
-
moto authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3367 Reviewed By: nateanl Differential Revision: D46148139 Pulled By: mthrok fbshipit-source-id: 50f297ac69bb95562976eb452e4e382b8c064c3c
-
moto authored
Summary: Follow-up https://github.com/pytorch/audio/issues/3045 - Revert the removal of HW acceleration doc - comment out FFmpeg CLI test run Pull Request resolved: https://github.com/pytorch/audio/pull/3349 Reviewed By: nateanl Differential Revision: D46121899 Pulled By: mthrok fbshipit-source-id: dfc030a69f05addec73637cfb6a720c184e37323
-
- 23 May, 2023 1 commit
-
-
Xiaohui Zhang authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3356 move the forced aligner tutorial to torchaudio, with some formatting changes Reviewed By: mthrok Differential Revision: D46060238 fbshipit-source-id: d90e7db5669a58d1e9ef5c2ec3c6d175b4e394ec
-
- 22 May, 2023 1 commit
-
-
Zhaoheng Ni authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3355 Reviewed By: xiaohui-zhang Differential Revision: D46060254 Pulled By: nateanl fbshipit-source-id: c2e44f994739755daf049fe350dd24a987a9cc29
-
- 19 May, 2023 1 commit
-
-
moto authored
Summary: This commit adds a step to build FFmpeg with the GPU decoder in the build_doc job so that we can use the GPU decoder/encoder in the documentation. Pull Request resolved: https://github.com/pytorch/audio/pull/3045 Reviewed By: nateanl Differential Revision: D45965739 Pulled By: mthrok fbshipit-source-id: c167eb3ef347860a51efa906068fa2daa556f017
-
- 17 May, 2023 1 commit
-
-
Carl Parker authored
Summary: Previously, `breadcrumbs.html` identified a nightly build version by the prefix "Nightly" which would normally be prepended to the version in `conf.py`. However, the version string is coming through without the "Nightly" prefix, so this change causes `breadcrumbs.html` to key on the substring "dev" instead. The reason we aren't getting "Nightly" is apparently because the environment variable BUILD_VERSION is available, so `conf.py` is using the value of that env var instead of the version string imported from the `torchaudio` module itself, which actually appears to be incorrect; see below. If I install torchaudio using conda install torchaudio -c pytorch-nightly then `torchaudio.__version__` returns the incorrect version string: 2.0.0.dev20230309 Pull Request resolved: https://github.com/pytorch/audio/pull/3333 Reviewed By: mthrok Differential Revision: D45926466 Pulled By: carljparker fbshipit-source-id: d5516f2d9f1716c2400d3e9b285bd5d32b4b3a77
-
- 16 May, 2023 2 commits
-
-
moto authored
Summary: This commit upgrades the version of FFmpeg that the TorchAudio binary distribution is compiled against to 5.0.4. FFmpeg 5.0 was released in Jan 2022, and many package managers provide a version of FFmpeg v5. Conda-forge lists 5.1 for all the platforms TorchAudio supports: https://anaconda.org/conda-forge/ffmpeg Pull Request resolved: https://github.com/pytorch/audio/pull/3298 Reviewed By: hwangjeff Differential Revision: D45865599 Pulled By: mthrok fbshipit-source-id: d95638eb80daaf477a710a992f4ead9b9009bb9b
-
moto authored
Summary: TorchAudio has migrated the CTC decoder to flashlight-text, and the code related to the CTC decoder was removed in https://github.com/pytorch/audio/issues/3236. This commit cleans up the residue, removing the third-party libraries used for the CTC decoder and the mention of the environment variable for the CTC decoder. Pull Request resolved: https://github.com/pytorch/audio/pull/3339 Reviewed By: nateanl Differential Revision: D45920878 Pulled By: mthrok fbshipit-source-id: 8d93e64138697781570e5b0b1c9f86e1a7923a89
-
- 11 May, 2023 1 commit
-
-
moto authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3325 Reviewed By: hwangjeff Differential Revision: D45759434 Pulled By: mthrok fbshipit-source-id: f3b1127fcf3b23beeab61fb7ff18f1b89b11ddc6
-
- 10 May, 2023 2 commits
-
-
moto authored
Summary: https://output.circle-artifacts.com/output/job/fbfa6d9a-5014-42ac-8e77-c1e9565747e8/artifacts/0/docs/tutorials/effector_tutorial.html Pull Request resolved: https://github.com/pytorch/audio/pull/3226 Reviewed By: nateanl Differential Revision: D45402724 Pulled By: mthrok fbshipit-source-id: bc9d1bc071f6f5062b9cc35d743b4a3016306262
-
moto authored
Summary: This commit is in preparation for landing the dispatcher switch in https://github.com/pytorch/audio/issues/3241. Making the FFmpeg backend the default causes some issues in tutorials, so this commit disables it. The IO tutorial will be updated after https://github.com/pytorch/audio/issues/3241 lands, to accommodate the change. Since it is necessary to mention the migration-related changes in the IO tutorial, I also update the IO documentation to include the migration work so that it's easy to redirect. Pull Request resolved: https://github.com/pytorch/audio/pull/3285 Reviewed By: nateanl Differential Revision: D45671237 Pulled By: mthrok fbshipit-source-id: cb541f6bd93cd9920019b8ec83210ea69d34f133
-
- 29 Apr, 2023 1 commit
-
-
Zhaoheng Ni authored
Summary: The PR adds a tutorial that demonstrates how to use pre-trained `TorchAudio-SQUIM` pipelines to estimate objective and subjective metric scores (PESQ, STOI, Si-SDR, MOS). Pull Request resolved: https://github.com/pytorch/audio/pull/3279 Reviewed By: hwangjeff Differential Revision: D45415404 Pulled By: nateanl fbshipit-source-id: abcaeadcca0eabc2dca53b607eac6257a701c903
-
- 28 Apr, 2023 1 commit
-
-
Yuekai Zhang authored
Summary: This PR implements a CUDA-based CTC prefix beam search decoder. Several benchmark results obtained on a V100 are attached below:

| decoder type | model | dataset | decoding time (secs) | beam size | batch size | model unit | subsampling times | vocab size |
|--------------|-------|---------|----------------------|-----------|------------|------------|-------------------|------------|
| cuctc | conformer nemo | dev clean | 7.68 | 8 | 32 | bpe | 4 | 1000 |
| cuctc | conformer nemo | dev clean (sort by length) | 1.6 | 8 | 32 | bpe | 4 | 1000 |
| cuctc | wav2vec2.0 torchaudio | dev clean | 22 | 10 | 1 | char | 2 | 29 |
| cuctc | conformer espnet | aishell1 test | 5 | 10 | 24 | char | 4 | 4233 |

Note: 1. The design parallelizes computation over the batch and vocab axes and loops over the frames axis, so compared with CPU implementations it favors shorter sequence lengths and larger vocab sizes. 2. WER is the same as with the CPU implementations. However, it can't decode with an LM yet. Resolves: https://github.com/pytorch/audio/issues/2957. Pull Request resolved: https://github.com/pytorch/audio/pull/3096 Reviewed By: nateanl Differential Revision: D44709397 Pulled By: mthrok fbshipit-source-id: 3078c54a2b44dc00eb4a81b4c657487eeff8c155
-
- 11 Apr, 2023 1 commit
-
-
moto authored
Summary: GCC should not be used when building FFmpeg for torchaudio, as torchaudio uses MSVC (cl.exe) Pull Request resolved: https://github.com/pytorch/audio/pull/3257 Reviewed By: nateanl Differential Revision: D44835169 Pulled By: mthrok fbshipit-source-id: 038c70caae58cec47dd2d6d08b8244c193104eda
-