- 07 Aug, 2023 2 commits
-
-
moto authored
Summary: This commit adds the `merge_tokens` function, which removes repeated tokens from CTC token sequences returned from `forced_align`. Resolving repeated tokens is a necessary and almost universal step, so it makes sense to have such a helper function in torchaudio. Pull Request resolved: https://github.com/pytorch/audio/pull/3535 Reviewed By: huangruizhe Differential Revision: D48111202 Pulled By: mthrok fbshipit-source-id: 25354bfa210aa5c03f8c1d3e201f253ca3761b24
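The idea of collapsing repeated CTC tokens can be illustrated with a minimal pure-Python sketch. This is not the torchaudio implementation: the names `Span` and `merge_repeated` are hypothetical, the blank index is assumed to be 0, and scores are left out for brevity.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Span:
    """Hypothetical container for one merged token (not torchaudio's type)."""
    token: int
    start: int  # first frame index (inclusive)
    end: int    # last frame index (exclusive)


def merge_repeated(tokens: List[int], blank: int = 0) -> List[Span]:
    """Collapse runs of identical frame-level tokens and drop blank runs."""
    spans: List[Span] = []
    start = 0
    for i in range(1, len(tokens) + 1):
        # A run ends at the end of the sequence or when the token changes.
        if i == len(tokens) or tokens[i] != tokens[start]:
            if tokens[start] != blank:
                spans.append(Span(tokens[start], start, i))
            start = i
    return spans


if __name__ == "__main__":
    frames = [0, 1, 1, 0, 2, 2, 2, 0, 2]
    for span in merge_repeated(frames):
        print(span)
```

Because blanks separate the runs, the two occurrences of token 2 in the example stay distinct spans rather than being merged into one.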
-
moto authored
Summary: Currently, the `torchaudio.functional.forced_align` function requires full information on input/target lengths. When performing non-batched alignment, these can be inferred from the sizes of the Tensors. Pull Request resolved: https://github.com/pytorch/audio/pull/3533 Reviewed By: nateanl Differential Revision: D48111041 Pulled By: mthrok fbshipit-source-id: fbf07124d3959c5cc5533dcd86296851587082fb
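A rough sketch of how such inference might look, working on plain shape tuples instead of Tensors. The helper name `infer_lengths` is hypothetical, and the assumed layouts (emission as `(batch, frames, classes)`, targets as `(batch, tokens)`) are assumptions of this sketch, not taken from the commit.

```python
from typing import List, Optional, Sequence, Tuple


def infer_lengths(
    emission_shape: Tuple[int, int, int],
    target_shape: Tuple[int, int],
    input_lengths: Optional[Sequence[int]] = None,
    target_lengths: Optional[Sequence[int]] = None,
) -> Tuple[List[int], List[int]]:
    """Fill in missing lengths when the batch holds a single sequence."""
    batch, num_frames, _ = emission_shape
    target_batch, num_tokens = target_shape
    if batch != target_batch:
        raise ValueError("emission and targets must have the same batch size")
    if input_lengths is None:
        if batch != 1:
            raise ValueError("input_lengths is required for batched input")
        # Non-batched: the whole time axis is valid.
        input_lengths = [num_frames]
    if target_lengths is None:
        if batch != 1:
            raise ValueError("target_lengths is required for batched input")
        # Non-batched: every target token is valid.
        target_lengths = [num_tokens]
    return list(input_lengths), list(target_lengths)


if __name__ == "__main__":
    print(infer_lengths((1, 50, 29), (1, 7)))
```

With padding in a real batch, the valid lengths cannot be recovered from the shape alone, which is why the sketch still insists on explicit lengths when the batch size is larger than one.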
-
- 04 Aug, 2023 2 commits
-
-
Jeff Hwang authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3531 Revises VGGish pipeline to accept arbitrary state dict function to accommodate loading weights from any source. Reviewed By: mthrok Differential Revision: D48056390 fbshipit-source-id: 2767699b58442ad132b518b4a6435f2772a637c3
-
moto authored
Summary: - Simplify the step to generate token-level alignment Pull Request resolved: https://github.com/pytorch/audio/pull/3529 Reviewed By: huangruizhe Differential Revision: D48066787 Pulled By: mthrok fbshipit-source-id: 452c243d278e508926a59894928e280fea76dcc6
-
- 03 Aug, 2023 2 commits
-
-
moto authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3527 Reviewed By: huangruizhe Differential Revision: D48008822 Pulled By: mthrok fbshipit-source-id: 4beae2956dfd1f00534832b70a1bf0897cba7812
-
hwangjeff authored
Summary: Increases numerical tolerance on Conformer RNN-T TorchScript consistency tests to resolve CI test failures. Pull Request resolved: https://github.com/pytorch/audio/pull/3525 Reviewed By: mthrok Differential Revision: D48000613 Pulled By: hwangjeff fbshipit-source-id: 1d35ba58055a8346dc40e2b67f37ccfd2e015894
-
- 02 Aug, 2023 1 commit
-
-
moto authored
Summary: When passing int16 type tensor to `save(backend="sox")`, the resulting file should be 16-bit signed PCM, but instead is 32-bit signed PCM. Resolves https://github.com/pytorch/audio/issues/3304 Pull Request resolved: https://github.com/pytorch/audio/pull/3524 Reviewed By: huangruizhe Differential Revision: D47941090 Pulled By: mthrok fbshipit-source-id: 2622b31eb1cbf03969f67ab2b2adec6e2ba677c4
-
- 01 Aug, 2023 3 commits
-
-
Yuekai Zhang authored
Summary: Add a separate tutorial for cuctc. Resolve https://github.com/pytorch/audio/issues/3096 Pull Request resolved: https://github.com/pytorch/audio/pull/3297 Reviewed By: huangruizhe Differential Revision: D47928400 Pulled By: mthrok fbshipit-source-id: 8c16492fb4d007b6ea7969ba77c866a51749c0ec
-
moto authored
Summary: torch.nn.utils.weight_norm is deprecated. Replacing it with the new API. Pull Request resolved: https://github.com/pytorch/audio/pull/3523 Reviewed By: huangruizhe Differential Revision: D47932384 Pulled By: mthrok fbshipit-source-id: 344abfa12bd11da779f7fd13b74a1e009a582b52
-
hwangjeff authored
Summary: Adds pre-trained VGGish inference pipeline ported from https://github.com/harritaylor/torchvggish and https://github.com/tensorflow/models/tree/master/research/audioset. Pull Request resolved: https://github.com/pytorch/audio/pull/3491 Reviewed By: mthrok Differential Revision: D47738130 Pulled By: hwangjeff fbshipit-source-id: 859c1ff1ec1b09dae4e26586169544571657cc67
-
- 31 Jul, 2023 2 commits
-
-
moto authored
Summary: torch.norm is now deprecated. The usages in torchaudio seem to be vector norms, so replacing them with torch.linalg.vector_norm. Resolves https://github.com/pytorch/audio/issues/3484 Pull Request resolved: https://github.com/pytorch/audio/pull/3522 Reviewed By: huangruizhe Differential Revision: D47926659 Pulled By: mthrok fbshipit-source-id: f7428cf0168109a3d340b8784adc99bb5f781084
-
moto authored
Summary: - Set global matplotlib rc params - Fix style check - Fix and update FA tutorial plots - Add av-asr index cards Pull Request resolved: https://github.com/pytorch/audio/pull/3515 Reviewed By: huangruizhe Differential Revision: D47894156 Pulled By: mthrok fbshipit-source-id: b40d8d31f12ffc2b337e35e632afc216e9d59a6e
-
- 29 Jul, 2023 1 commit
-
-
moto authored
Summary: The I/O functions in the `_compat` module were introduced there so that everything related to FFmpeg is in torchaudio.io and FFmpeg library initialization can be carried out in `torchaudio.io.__init__`. Now that this constraint is removed (all the initialization happens at `torchaudio._extension.__init__`) and `_compat` is only used by the FFmpeg dispatcher backend, we move the module to `torchaudio._backend` for better locality. Pull Request resolved: https://github.com/pytorch/audio/pull/3518 Reviewed By: huangruizhe Differential Revision: D47877412 Pulled By: mthrok fbshipit-source-id: aa18c8cb6e5d5360950df5158c33c653e37c565f
-
- 28 Jul, 2023 5 commits
-
-
moto authored
Summary: Context: https://github.com/pytorch/audio/issues/3448 The documentation of amplitude_to_DB is ambiguous on how cut-off values are computed when the input tensor is 3D. This commit clarifies that. Closes: https://github.com/pytorch/audio/issues/3448 Pull Request resolved: https://github.com/pytorch/audio/pull/3519 Reviewed By: huangruizhe Differential Revision: D47875505 Pulled By: mthrok fbshipit-source-id: e06bb997e7a27e2abe35c8e2ac91ddfbded4e641
-
moto authored
Summary: In https://github.com/pytorch/audio/issues/2419, we added ffmpeg as a fallback for the sox_io backend. This was a workaround for the issue caused by the removal of libmad. Now that we have introduced the `backend` argument to I/O functions, and libsox integration has moved to dynamic binding where users can use libsox with libmad integration, we do not need the workaround. This commit is based on reverting https://github.com/pytorch/audio/issues/2416 (fd7ace17). Pull Request resolved: https://github.com/pytorch/audio/pull/3516 Reviewed By: huangruizhe Differential Revision: D47855272 Pulled By: mthrok fbshipit-source-id: 5af73af7865f6e545ccb052d478e86588ff2a014
-
moto authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3517 Reviewed By: huangruizhe Differential Revision: D47858452 Pulled By: mthrok fbshipit-source-id: 62ee6c8bb2669dd70f8ca25703a04dc8a9d19aec
-
Zhaoheng Ni authored
Summary: This PR moves the `SquimObjective` and `SquimSubjective` models and the corresponding factory functions and pre-trained pipelines out of prototype and into the core directory. They will be included in the next official release. Pull Request resolved: https://github.com/pytorch/audio/pull/3512 Reviewed By: mthrok Differential Revision: D47837434 Pulled By: nateanl fbshipit-source-id: d0639f29079f7e1afc30f236849e530c8cadffd8
-
Pingchuan Ma authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3511 Reviewed By: mthrok Differential Revision: D47852108 Pulled By: mpc001 fbshipit-source-id: c0ecb4b5bcc8670013dcbe1164e3929f5793c8aa
-
- 27 Jul, 2023 3 commits
-
-
moto authored
Summary: Removes residual from https://github.com/pytorch/audio/issues/3497 Pull Request resolved: https://github.com/pytorch/audio/pull/3514 Differential Revision: D47838049 Pulled By: mthrok fbshipit-source-id: c4b00aba9f4cc887ec595f04d7a2dd673c63b975
-
moto authored
Summary: This commit updates the way libsox is integrated into torchaudio: 1. We stop statically linking libsox, so torchaudio will not ship libsox. 2. We link libsox dynamically. Users are expected to install libsox by themselves. 3. We use a stub library to build torchaudio. Pull Request resolved: https://github.com/pytorch/audio/pull/3497 Differential Revision: D47803706 Pulled By: mthrok fbshipit-source-id: 31b05495d81069186fa52d67beea360cc7e817a8
-
moto authored
Summary: Since the libsox and ffmpeg extensions now depend on external libraries, their initialization processes might cause unrecoverable issues, such as a segfault. This commit adds environment variables to disable them so that importing torchaudio won't attempt to load these libraries. Pull Request resolved: https://github.com/pytorch/audio/pull/3500 Differential Revision: D47808178 Pulled By: mthrok fbshipit-source-id: 80c1c6b5f4bc608d4e209473702680db093c95ee
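The general pattern of gating an optional extension behind an environment variable might look like the sketch below. The variable name `MYLIB_USE_SOX` and the helpers `extension_enabled` and `maybe_load` are hypothetical and chosen only for illustration; the commit message does not name the actual variables, so this does not describe torchaudio's real switches.

```python
import os
from typing import Callable, Mapping, Optional, TypeVar

T = TypeVar("T")

# Values that are treated as "disabled" in this sketch (an assumption).
_FALSY = {"0", "false", "off", "no"}


def extension_enabled(name: str, environ: Optional[Mapping[str, str]] = None) -> bool:
    """Return False only when the variable is explicitly set to a falsy value."""
    env = os.environ if environ is None else environ
    return env.get(name, "1").strip().lower() not in _FALSY


def maybe_load(
    name: str,
    loader: Callable[[], T],
    environ: Optional[Mapping[str, str]] = None,
) -> Optional[T]:
    """Call the loader unless the extension was disabled via the environment."""
    if not extension_enabled(name, environ):
        # Skipping the loader entirely avoids touching the external library.
        return None
    return loader()


if __name__ == "__main__":
    # Hypothetical variable name, for illustration only.
    handle = maybe_load("MYLIB_USE_SOX", lambda: "loaded", {"MYLIB_USE_SOX": "0"})
    print(handle)
```

Checking the variable before the loader runs matters here: a crash inside a native library cannot be caught with `try`/`except`, so the only safe option is not to attempt the load at all.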
-
- 26 Jul, 2023 3 commits
-
-
Pingchuan Ma authored
Summary: This PR moves video loading outside detector during pre-processing. Pull Request resolved: https://github.com/pytorch/audio/pull/3498 Reviewed By: mthrok Differential Revision: D47811044 Pulled By: mpc001 fbshipit-source-id: f17839b695b13d3cf2d9db343d7e9a0202eea7d5
-
moto authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3499 Differential Revision: D47803654 Pulled By: mthrok fbshipit-source-id: 2b916fa66d84c91c01b4dfe6dd5ee3501159f451
-
moto authored
Summary: Add scheduled doc update job so that docs are updated at least once a day. Pull Request resolved: https://github.com/pytorch/audio/pull/3496 Differential Revision: D47795577 Pulled By: mthrok fbshipit-source-id: aba5376ec51f07560014d250a16fef8b8a11b43e
-
- 25 Jul, 2023 7 commits
-
-
moto authored
Summary: In preparation for https://github.com/pytorch/audio/pull/3082 Disable those FFmpeg tests that depend on sox CLI. These tests need to be updated or removed so as not to use sox CLI. Auto-skip some sox tests if decoder/encoder are not available Pull Request resolved: https://github.com/pytorch/audio/pull/3494 Differential Revision: D47761948 Pulled By: mthrok fbshipit-source-id: 3a48d7f280f8376a48d223947dd41a7cdc8cbc30
-
moto authored
Summary: - Fix condition to add new commit to gh-pages - Allow to deploy docs from workflow dispatch Pull Request resolved: https://github.com/pytorch/audio/pull/3495 Differential Revision: D47767443 Pulled By: mthrok fbshipit-source-id: 9ca858868f3e822e532c21cde9d7499af9891a51
-
Pingchuan Ma authored
Summary: This PR includes a few changes to the AV-ASR recipe: better results, a faster face detector (Mediapipe), renamed variables, a streamlined dataloader, and a few illustrated examples. These changes were made to improve the usability of the recipe. Pull Request resolved: https://github.com/pytorch/audio/pull/3493 Reviewed By: mthrok Differential Revision: D47758072 Pulled By: mpc001 fbshipit-source-id: 4533587776f3a7a74f3f11b0ece773a0934bacdc
-
moto authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3483 Differential Revision: D47725664 Pulled By: mthrok fbshipit-source-id: e4249e1488fa7af8670be4a5077957912ff3420b
-
moto authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3490 Differential Revision: D47757316 Pulled By: mthrok fbshipit-source-id: cfb376be29980f9e452f291c4fa25780e9f85a97
-
moto authored
Summary: Resolves https://github.com/pytorch/audio/issues/3486 Pull Request resolved: https://github.com/pytorch/audio/pull/3487 Differential Revision: D47724733 Pulled By: mthrok fbshipit-source-id: 26f5641a8271a7e50c4a33861d09b0c8274b29e4
-
Pingchuan Ma authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3492 Reviewed By: mthrok Differential Revision: D47755638 Pulled By: mpc001 fbshipit-source-id: 729efdb2a69b5656dbc0b70dd623c1509123d3aa
-
- 24 Jul, 2023 1 commit
-
-
Pingchuan Ma authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3489 Reviewed By: mthrok Differential Revision: D47726448 Pulled By: mpc001 fbshipit-source-id: 3d5aa7646c6bb816dcbbf70c61e98404bb148841
-
- 18 Jul, 2023 1 commit
-
-
moto authored
Summary: Now that GPU video decoders are available in doc CI, we run the tutorials with GPU decoders. Pull Request resolved: https://github.com/pytorch/audio/pull/3478 Differential Revision: D47519672 Pulled By: mthrok fbshipit-source-id: 2f95243100e9c75e17c2b4d306da164f0e31f8f2
-
- 17 Jul, 2023 1 commit
-
-
moto authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3467 Differential Revision: D47482388 Pulled By: mthrok fbshipit-source-id: abff36491dc28b83270673860d6457a084b1327d
-
- 15 Jul, 2023 2 commits
-
-
moto authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3476 Differential Revision: D47494211 Pulled By: mthrok fbshipit-source-id: 230bbf0a271b070d1dea34146d0d466e666cccdc
-
moto authored
Summary: The nightly builds support FFmpeg version 4, 5 and 6. Pull Request resolved: https://github.com/pytorch/audio/pull/3480 Differential Revision: D47482841 Pulled By: mthrok fbshipit-source-id: 88267f5e83ddc7b1e866b35e57a87b985e2c78c9
-
- 14 Jul, 2023 1 commit
-
-
moto authored
Summary: When using the GPU decoder in some environments, attempting to read the output formats from the filter graph caused an issue in which the software pixel format cannot be determined. We do not know the exact cause, but when it happens, the input link of the buffer sink does not have a HW frames context. Since currently no filter can convert the pixel format of a CUDA frame, we resort to the HW frames context of the output link of the buffer source. Environments in which this was observed: Env1 - OS: Fedora 36 (x86_64) - GCC 12.2.1 - Python 3.10.12 - GPU: GeForce RTX 3070 Ti Laptop GPU - FFmpeg: 5.1.3 - nv-codec-header: n11.1.5.2 - CUDA: 12.1 Env2 - Ubuntu 20.04.4 LTS (x86_64) - GCC 9.4.0 - Python 3.11.3 - GPU: Quadro GV100 - FFmpeg: 5.1.3 - nv-codec-header: n11.1.5.2 - CUDA: 11.4 Pull Request resolved: https://github.com/pytorch/audio/pull/3479 Differential Revision: D47482407 Pulled By: mthrok fbshipit-source-id: 1c53096b27824453b260138ab64e1948afeeefc7
-
- 13 Jul, 2023 2 commits
-
-
Omkar Salpekar authored
Summary: Reintroduce a conda environment within which we will do all deps installation, audio builds, and test runs. This conda environment will use the Python version set by the GHA job. Previously this just defaulted to the system Python 3.10, which was the default inside the container. Pull Request resolved: https://github.com/pytorch/audio/pull/3477 Reviewed By: mthrok Differential Revision: D47414572 Pulled By: osalpekar fbshipit-source-id: 80760f82c7726205b29812d576e498db2a7a80a0
-
Moto Hira authored
Differential Revision: D47402174 Original commit changeset: 00c0719ab184 Original Phabricator Diff: D47402174 fbshipit-source-id: b1f6ea4cc3ecef3f72a87bf2f67bf9644c847546
-
- 12 Jul, 2023 1 commit
-
-
moto authored
Summary: - FFmpeg 6 deprecated attributes - Guard CUDA specific functions not used in CPU builds Pull Request resolved: https://github.com/pytorch/audio/pull/3471 Differential Revision: D47402174 Pulled By: mthrok fbshipit-source-id: 00c0719ab1849b50c0b56b03d8fb38bc7aa74538
-