- 12 Oct, 2022 2 commits
-
-
Zhaoheng Ni authored
Summary: This PR improves Wav2Vec2/HuBERT model pre-training. - Initializing the positional embedding and transformer modules is essential to pre-training. The accuracy on unmasked frames should be higher than on masked frames, since predicting them is the easier task, but without the initialization the accuracy on masked frames is higher. Performance was compared after two epochs on 16 GPUs. - With model initialization, the accuracies of masked/unmasked frames are 0.08/0.11. - Without model initialization, the accuracies of masked/unmasked frames are 0.06/0.04. - After adding the model initialization, the gradient overflows easily (i.e. becomes `nan`). In the paper [Self-Supervised Learning for speech recognition with Intermediate layer supervision](https://arxiv.org/abs/2112.08778), the authors propose a simple but effective method to mitigate the overflow: scale down the product of query and key and subtract its maximum value (subtracting a constant does not change the output of softmax). This guarantees that the value does not overflow. - In the original fairseq, the mask indices are generated by `numpy.random.choice`. This PR replaces `torch.multinomial` with `torch.randperm`. (cc carolineechen). Other improvements to the training scripts will be included in a separate PR. Pull Request resolved: https://github.com/pytorch/audio/pull/2716 Reviewed By: xiaohui-zhang Differential Revision: D39832189 Pulled By: nateanl fbshipit-source-id: f4d2a473a79ad63add2dd16624bd155d5ce4de27
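The remark that subtracting a constant does not change the output of softmax can be illustrated with a minimal sketch. It is written in plain Python rather than PyTorch and is not the code from the PR; the function names `naive_softmax` and `stable_softmax` are hypothetical. In Python, `math.exp` raises `OverflowError` for large inputs, whereas floating-point tensors would typically produce `inf` or `nan` instead.

```python
import math


def naive_softmax(scores):
    # Exponentiates the raw scores; overflows when a score is large.
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def stable_softmax(scores):
    # Subtracting the maximum makes every exponent <= 0, so exp() stays in (0, 1].
    # The shift cancels between numerator and denominator, so the result is unchanged.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


small = [1.0, 2.0, 3.0]
large = [1000.0, 1001.0, 1002.0]  # same pairwise differences as `small`

try:
    naive_softmax(large)
    naive_overflowed = False
except OverflowError:
    naive_overflowed = True

print(naive_overflowed)
print(stable_softmax(small))
print(stable_softmax(large))
```

Both calls to `stable_softmax` see the same shifted inputs, so they return the same probabilities, while the naive version fails on the large scores.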
-
Caroline Chen authored
Summary: A couple of CircleCI unit tests are failing during the HuBERT xlarge TorchScript test, which has been known to fail on Windows in the past (#65776). This PR disables this test on CircleCI. cc atalman Pull Request resolved: https://github.com/pytorch/audio/pull/2758 Reviewed By: mthrok Differential Revision: D40290535 Pulled By: carolineechen fbshipit-source-id: 5c5fb43434a517b6c439a8cb8e853015d1550a57
-
- 11 Oct, 2022 4 commits
-
-
atalman authored
Summary: Increase inactivity timeout for binary build jobs Pull Request resolved: https://github.com/pytorch/audio/pull/2754 Reviewed By: carolineechen Differential Revision: D40275368 Pulled By: atalman fbshipit-source-id: 5e682bb78bda640d615f874fbdf0e650b5a38ee0
-
Caroline Chen authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2751 Reviewed By: nateanl Differential Revision: D40267874 Pulled By: carolineechen fbshipit-source-id: 4e45a02c650ed65c05cde82289a400a3be877927
-
atalman authored
Summary: Fix windows python 3.8 loading path Pull Request resolved: https://github.com/pytorch/audio/pull/2747 Reviewed By: nateanl Differential Revision: D40264326 Pulled By: nateanl fbshipit-source-id: f4a24757de7b48c63a7481034eb11fc3ff174327
-
Zhaoheng Ni authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2738 Reviewed By: carolineechen Differential Revision: D40238099 Pulled By: nateanl fbshipit-source-id: c5cc94c2a348a6ef34c04b8dd26114ecb874d73e
-
- 10 Oct, 2022 2 commits
-
-
Zhaoheng Ni authored
Summary: Besides the unit test, the PR also addresses these issues: - The original `LibriMix` dataset only supports "min" mode, which means the audio length is the minimum of all clean sources. It is the default for the source separation task. Users may also want to use "max" mode, which allows for end-to-end separation and recognition. The PR adds a ``mode`` argument to let users decide which dataset they want to use. - If the task is ``"enh_both"``, the target should be the audio in ``mix_clean`` instead of the separate clean sources. The PR fixes it to use ``mix_clean`` as the target. Pull Request resolved: https://github.com/pytorch/audio/pull/2659 Reviewed By: carolineechen Differential Revision: D40229227 Pulled By: nateanl fbshipit-source-id: fc07e0d88a245e1367656d3767cf98168a799235
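The difference between the two modes can be sketched as follows. This is an illustration only, not the torchaudio dataset code: `align_sources` is a hypothetical helper, sources are plain lists of samples, and zero-padding the shorter sources in "max" mode is an assumption of this sketch.

```python
def align_sources(sources, mode="min"):
    """Bring all sources to a common length (hypothetical helper)."""
    if mode not in ("min", "max"):
        raise ValueError(f"mode must be 'min' or 'max', got {mode!r}")
    lengths = [len(s) for s in sources]
    if mode == "min":
        # Truncate every source to the shortest one.
        target = min(lengths)
        return [list(s[:target]) for s in sources]
    # Assumption: in "max" mode, shorter sources are padded with zeros.
    target = max(lengths)
    return [list(s) + [0.0] * (target - len(s)) for s in sources]


s1 = [0.1, 0.2, 0.3, 0.4]
s2 = [0.5, 0.5]

min_sources = align_sources([s1, s2], mode="min")
max_sources = align_sources([s1, s2], mode="max")

print([len(s) for s in min_sources])
print([len(s) for s in max_sources])
```

With "min" mode both sources end up two samples long; with "max" mode both are four samples long and the second one ends in padding.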
-
Zhaoheng Ni authored
Summary: The docstring of `wav2vec2` argument is wrong. Fix it in this PR. Pull Request resolved: https://github.com/pytorch/audio/pull/2746 Reviewed By: carolineechen Differential Revision: D40225995 Pulled By: nateanl fbshipit-source-id: 770e9c928ebebd7b6307e181601eb64625d668da
-
- 09 Oct, 2022 1 commit
-
-
Caroline Chen authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2732 Reviewed By: nateanl Differential Revision: D40186996 Pulled By: nateanl fbshipit-source-id: a0ad325b7153c9e580dad2c515730dadbe8840c4
-
- 08 Oct, 2022 1 commit
-
-
moto authored
Summary: * Add HW encoding to HW tutorial https://colab.research.google.com/drive/1DDah_IaGULEO66CfQWltRqaVheBkiXdN#scrollTo=eXzKSVrHk1vS Pull Request resolved: https://github.com/pytorch/audio/pull/2739 Reviewed By: hwangjeff Differential Revision: D40197086 Pulled By: hwangjeff fbshipit-source-id: 1780a5419f6705f7c24ba96bd46c3310438af7db
-
- 07 Oct, 2022 3 commits
-
-
hwangjeff authored
Summary: Updates sox info docstring to account for mp3 frame count handling fix introduced in https://github.com/pytorch/audio/issues/2740. Pull Request resolved: https://github.com/pytorch/audio/pull/2742 Reviewed By: nateanl Differential Revision: D40189846 Pulled By: nateanl fbshipit-source-id: d6371418d7d4867dd0b97ee72ebf846d5c93dc30
-
hwangjeff authored
Summary: Modifies `info_audio` to compute and return number of frames if not found in stream info. This resolves the `num_frames == 0` issue for mp3 that's cited in https://github.com/pytorch/audio/issues/2524. Pull Request resolved: https://github.com/pytorch/audio/pull/2740 Reviewed By: nateanl Differential Revision: D40168639 Pulled By: nateanl fbshipit-source-id: bb45baa0f9cd56844315b04e40ab9835d825fc24
-
moto authored
Summary: Specifying multiple objects in the `:minigallery:` directive shows duplicated tutorials. This commit fixes it by listing tutorials based on the module used. https://output.circle-artifacts.com/output/job/c3da2a22-40d5-4e2d-b73a-28b39e712817/artifacts/0/docs/io.html Before: <img width="694" alt="Screen Shot 2022-10-07 at 7 04 35 AM" src="https://user-images.githubusercontent.com/855818/194427092-ca1202e7-0731-4c18-b48b-24923d692a4a.png"> After: <img width="648" alt="Screen Shot 2022-10-07 at 7 03 14 AM" src="https://user-images.githubusercontent.com/855818/194426950-5b780458-2bf0-43ef-b020-fcbbfdf8d41b.png"> Pull Request resolved: https://github.com/pytorch/audio/pull/2736 Reviewed By: carolineechen Differential Revision: D40160247 Pulled By: carolineechen fbshipit-source-id: 547496f9b569ff7a4d70db97e90f3ea503344477
-
- 06 Oct, 2022 3 commits
-
-
moto authored
Summary: Add a tutorial for basic usage of torchaudio.io.StreamWriter. https://output.circle-artifacts.com/output/job/55d9a495-af7a-483c-84cb-de9a08cfd2f3/artifacts/0/docs/tutorials/streamwriter_basic_tutorial.html Pull Request resolved: https://github.com/pytorch/audio/pull/2698 Reviewed By: carolineechen Differential Revision: D40133007 Pulled By: carolineechen fbshipit-source-id: 141f692c32343981bfb228357f21562ffe36f623
-
atalman authored
Summary: Fix the torchaudio library load path on Windows with Python 3.8. Fixes: https://github.com/pytorch/audio/issues/2726 Fixes the following issue: ``` >>> import torchaudio Traceback (most recent call last): File "<stdin>", line 1, in <module> File "C:\Users\atalman\miniconda3\envs\mywin38\lib\site-packages\torchaudio\__init__.py", line 1, in <module> from torchaudio import ( # noqa: F401 File "C:\Users\atalman\miniconda3\envs\mywin38\lib\site-packages\torchaudio\_extension.py", line 128, in <module> _init_extension() File "C:\Users\atalman\miniconda3\envs\mywin38\lib\site-packages\torchaudio\_extension.py", line 98, in _init_extension _load_lib("libtorchaudio") File "C:\Users\atalman\miniconda3\envs\mywin38\lib\site-packages\torchaudio\_extension.py", line 52, in _load_lib torch.ops.load_library(path) File "C:\Users\atalman\miniconda3\envs\mywin38\lib\site-packages\torch\_ops.py", line 573, in load_library ctypes.CDLL(path) File "C:\Users\atalman\miniconda3\envs\mywin38\lib\ctypes\__init__.py", line 373, in __init__ self._handle = _dlopen(self._name, mode) FileNotFoundError: Could not find module 'C:\Users\atalman\miniconda3\envs\mywin38\Lib\site-packages\torchaudio\lib\libtorchaudio.pyd' (or one of its dependencies). Try using the full path with constructor syntax. >>> ``` The error is caused by DLLs not being found in the conda environment directory ``` C:\Users\atalman\miniconda3\envs\mywin38\bin\ ``` While this directory is set correctly in PATH, it is ignored with Python 3.8. Please refer to: https://stackoverflow.com/questions/59330863/cant-import-dll-module-in-python Pull Request resolved: https://github.com/pytorch/audio/pull/2735 Reviewed By: carolineechen Differential Revision: D40112293 Pulled By: carolineechen fbshipit-source-id: c7fc9bb49fc3ec4a2855c6ea473f36808103ed1e
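Since Python 3.8, Windows no longer consults PATH when resolving the dependencies of extension modules or of DLLs loaded through `ctypes`; directories have to be registered with `os.add_dll_directory`. The following is a minimal sketch of that mechanism, not necessarily what the PR does. `register_dll_dirs` is a hypothetical helper, and using the `bin` directory under `sys.prefix` is an assumption based on the path shown in the summary.

```python
import os
import sys
from pathlib import Path


def register_dll_dirs(candidates):
    """Register existing directories for DLL lookup (hypothetical helper).

    Returns the list of directories that were registered. On platforms
    without os.add_dll_directory (non-Windows, or Python < 3.8) it does nothing.
    """
    if sys.platform != "win32" or not hasattr(os, "add_dll_directory"):
        return []
    registered = []
    for candidate in candidates:
        path = Path(candidate).resolve()  # add_dll_directory needs an absolute path
        if path.is_dir():
            os.add_dll_directory(str(path))
            registered.append(str(path))
    return registered


# Assumption: the conda environment keeps its DLLs under <prefix>\bin.
dll_dirs = register_dll_dirs([Path(sys.prefix) / "bin"])
print(dll_dirs)
```

The registration has to run before the library that depends on those DLLs is loaded.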
-
Ivan Zaitsev authored
Summary: The goal is to reduce the number of job failures due to timeouts, see https://app.circleci.com/pipelines/github/pytorch/audio/12882/workflows/f99da1a5-32e6-4bac-8ceb-fbf36d693e2d/jobs/936363?invite=true#step-105-105 for example. Pull Request resolved: https://github.com/pytorch/audio/pull/2734 Reviewed By: weiwangmeta, atalman Differential Revision: D40077578 fbshipit-source-id: 573f43a4d088a7086fa6925ac5ba1fdd1e8f39ec
-
- 05 Oct, 2022 1 commit
-
-
moto authored
Summary: * Port downstream change https://github.com/pytorch/tutorials/pull/2060 * Fix inter-tutorial links and references Pull Request resolved: https://github.com/pytorch/audio/pull/2733 Reviewed By: hwangjeff Differential Revision: D40086902 Pulled By: hwangjeff fbshipit-source-id: 00b04c6a1b68fb9fadd52b610b26ecaab15d52d8
-
- 03 Oct, 2022 3 commits
-
-
moto authored
Summary: https://output.circle-artifacts.com/output/job/213c71c8-c9b5-4516-af92-a2f8dab2c9fd/artifacts/0/docs/tutorials/streamwriter_advanced.html Pull Request resolved: https://github.com/pytorch/audio/pull/2708 Reviewed By: carolineechen Differential Revision: D40013310 Pulled By: mthrok fbshipit-source-id: 7226b021ce2fe951b3bf0bd41e93a6bbcf696124
-
moto authored
Summary: Adopt `:autosummary:` to various modules * torchaudio.compliance.kaldi * torchaudio.sox_effects * torchaudio.utils Pull Request resolved: https://github.com/pytorch/audio/pull/2664 Reviewed By: nateanl Differential Revision: D39841873 Pulled By: mthrok fbshipit-source-id: ff4fa6976324fca5f35b737b715f976e2a722bac
-
Zhaoheng Ni authored
Summary: The MuST-C reference is added in https://github.com/pytorch/audio/pull/2689. This PR adds the citation to the RNNT pipeline documentation. Pull Request resolved: https://github.com/pytorch/audio/pull/2728 Reviewed By: carolineechen Differential Revision: D39990882 Pulled By: nateanl fbshipit-source-id: 011057952dd8aa30a4cb7c7af0ac75123e329d7e
-
- 01 Oct, 2022 1 commit
-
-
Sergii Dymchenko authored
Summary: The file looks hopelessly outdated. Pull Request resolved: https://github.com/pytorch/audio/pull/2730 Reviewed By: mthrok Differential Revision: D39993805 Pulled By: kit1980 fbshipit-source-id: f5ad97c83873061175455cc7b129ec71a9ec3d7d
-
- 29 Sep, 2022 1 commit
-
-
atalman authored
Summary: Cuda 10.2 deprecation, migration of unit tests from cuda 10.2 to cuda 11.6 Pull Request resolved: https://github.com/pytorch/audio/pull/2724 Reviewed By: weiwangmeta Differential Revision: D39912484 Pulled By: atalman fbshipit-source-id: e760b630375eae94384cda68d24f83ef46ada6d9
-
- 28 Sep, 2022 3 commits
-
-
atalman authored
Summary: Revert this for now until docker is updated Pull Request resolved: https://github.com/pytorch/audio/pull/2723 Reviewed By: nateanl Differential Revision: D39900382 Pulled By: atalman fbshipit-source-id: f8701e359bc11e8f9f3a29144f7e7da336a470da
-
Andrey Talman authored
Summary: Removing cuda102 Pull Request resolved: https://github.com/pytorch/audio/pull/2715 Reviewed By: hwangjeff Differential Revision: D39823444 Pulled By: atalman fbshipit-source-id: c11d798ab86cf9a6d5ed3804958b4a0c2f8a87ff
-
Ivan Zaitsev authored
Summary: Example job that was failing previously: https://app.circleci.com/pipelines/github/pytorch/audio/12796/workflows/ae96794a-6df4-4a2a-84df-ada7a7250045/jobs/927709 The failure: ``` "Detected that PyTorch and TorchAudio were compiled with different CUDA versions. " RuntimeError: Detected that PyTorch and TorchAudio were compiled with different CUDA versions. PyTorch has CUDA version 11.7 whereas TorchAudio has CUDA version 11.6. Please install the TorchAudio version that matches your PyTorch version. ``` The failing Windows job uses this install command: ``` pip install $(ls ~/workspace/torchaudio*.whl) -f "https://download.pytorch.org/whl/${UPLOAD_CHANNEL}/torch_${UPLOAD_CHANNEL}.html" # expands to: pip install /c/Users/circleci/workspace/torchaudio-0.13.0.dev20220927+cu116-cp37-cp37m-win_amd64.whl -f https://download.pytorch.org/whl/nightly/torch_nightly.html ``` The Linux job (which succeeds) uses a different "-f" (find links) URL that includes the specific CUDA version: https://app.circleci.com/pipelines/github/pytorch/audio/12809/workflows/aadca2ab-5a00-4a0a-ab6a-4a1b7a503713/jobs/927861 Command: ``` pip install $(ls ~/workspace/torchaudio*.whl) -f "https://download.pytorch.org/whl/${UPLOAD_CHANNEL}/${CU_VERSION}/torch_${UPLOAD_CHANNEL}.html" # expands to: pip install /root/workspace/torchaudio-0.13.0.dev20220927+cu116-cp37-cp37m-linux_x86_64.whl -f https://download.pytorch.org/whl/nightly/cu116/torch_nightly.html ``` This PR makes the Windows installation match the Linux one. Testing: * verified command manually on Circle CI: ``` >>> import torch >>> import torchaudio C:\tools\miniconda3\lib\site-packages\torchaudio\compliance\kaldi.py:22: UserWarning: Failed to initialize NumPy: numpy.core.multiarray failed to import (Triggered internally at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\torch\csrc\utils\tensor_numpy.cpp:77.)
EPSILON = torch.tensor(torch.finfo(torch.float).eps) C:\tools\miniconda3\lib\site-packages\torchaudio\backend\utils.py:62: UserWarning: No audio backend is available. warnings.warn("No audio backend is available.") ``` Co-authored: weiwangmeta Pull Request resolved: https://github.com/pytorch/audio/pull/2721 Reviewed By: hwangjeff Differential Revision: D39870805 Pulled By: izaitsevfb fbshipit-source-id: 2957cba4f53d00783a5c07099f24050ce15e7d1c
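The difference between the two find-links URLs quoted in the summary can be reproduced with a small sketch. `find_links_url` is a hypothetical helper written for illustration; the URL layout is taken from the two commands above.

```python
def find_links_url(channel, cu_version=None):
    """Build the pip find-links URL (hypothetical helper).

    Without cu_version this gives the generic index that the failing Windows
    job used; with cu_version it gives the CUDA-specific index of the Linux job.
    """
    base = "https://download.pytorch.org/whl"
    if cu_version:
        return f"{base}/{channel}/{cu_version}/torch_{channel}.html"
    return f"{base}/{channel}/torch_{channel}.html"


generic_url = find_links_url("nightly")
cuda_url = find_links_url("nightly", "cu116")

print(generic_url)  # https://download.pytorch.org/whl/nightly/torch_nightly.html
print(cuda_url)     # https://download.pytorch.org/whl/nightly/cu116/torch_nightly.html
```

The CUDA-specific index restricts pip to PyTorch wheels built for the same CUDA version as the TorchAudio wheel being installed, which is what avoids the 11.7 versus 11.6 mismatch reported above.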
-
- 27 Sep, 2022 2 commits
-
-
Omkar Salpekar authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2718 Original commit changeset: 7e222d80ca07 Original Phabricator Diff: D39756852 (https://github.com/pytorch/audio/commit/7ba7cf4d24a2967b8fa4aaff437116524281f8fd) Reviewed By: weiwangmeta Differential Revision: D39839899 fbshipit-source-id: f5605eb9882f7c7f0008e88338ab711131b29404
-
Omkar Salpekar authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2706 Reviewed By: kit1980 Differential Revision: D39786253 Pulled By: osalpekar fbshipit-source-id: 2a0c427f57e5c70ff1cf419b7e0c2316e5f0e16c
-
- 26 Sep, 2022 2 commits
-
-
Andrey Talman authored
Summary: The conda version on CircleCI prints the following message: ``` ==> WARNING: A newer version of conda exists. <== current version: 4.6.14 latest version: 4.14.0 ``` and the install step then fails with this error: ``` + /c/tools/miniconda3/Scripts/conda.exe install -v -y -c pytorch-nightly -c nvidia pytorch numpy ffmpeg pytorch-cuda=11.6 Collecting package metadata: ...working... done Solving environment: ...working... Too long with no output (exceeded 30m0s): context deadline exceeded ``` This PR should update the conda version running on the system and allow us to install pytorch and run some tests. Pull Request resolved: https://github.com/pytorch/audio/pull/2704 Reviewed By: weiwangmeta Differential Revision: D39820037 Pulled By: atalman fbshipit-source-id: 4a82a7a6cbe3dc1a5807ac669e2fa79f454037fa
-
Andrey Talman authored
Summary: Remove linux wheel from circleci Pull Request resolved: https://github.com/pytorch/audio/pull/2714 Reviewed By: weiwangmeta Differential Revision: D39816121 Pulled By: atalman fbshipit-source-id: a3c99b530896888d7b4271d8b3f27f3c986b3480
-
- 24 Sep, 2022 2 commits
-
-
hwangjeff authored
Summary: `torch.version.cuda` can return a string of form X.X or X.X.X. This PR modifies the CUDA version check to account for this. Pull Request resolved: https://github.com/pytorch/audio/pull/2710 Reviewed By: carolineechen, nateanl Differential Revision: D39796810 Pulled By: hwangjeff fbshipit-source-id: b483bd8200195844d65d0caddebaf1b10f939b64
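A minimal sketch of a version comparison that tolerates both string forms is shown below. It is not the code from the PR: the helper names are hypothetical, and comparing only the major and minor components is an assumption of this sketch.

```python
def cuda_major_minor(version):
    """Parse 'X.X' or 'X.X.X' into a (major, minor) tuple (hypothetical helper)."""
    parts = version.strip().split(".")
    if len(parts) not in (2, 3) or not all(p.isdigit() for p in parts):
        raise ValueError(f"Unexpected CUDA version string: {version!r}")
    return int(parts[0]), int(parts[1])


def check_cuda_match(torch_cuda, audio_cuda):
    """Raise if the two versions differ in major or minor (assumed policy)."""
    if cuda_major_minor(torch_cuda) != cuda_major_minor(audio_cuda):
        raise RuntimeError(
            f"PyTorch has CUDA version {torch_cuda} whereas "
            f"TorchAudio has CUDA version {audio_cuda}."
        )


print(cuda_major_minor("11.6"))    # (11, 6)
print(cuda_major_minor("11.6.2"))  # (11, 6)
check_cuda_match("11.6.2", "11.6")  # patch component is ignored, no error
```

Ignoring the patch component means that "11.6" and "11.6.2" are treated as the same CUDA version.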
-
hwangjeff authored
Summary: Adds check to ensure that TorchAudio and PyTorch versions use the same CUDA version. Pull Request resolved: https://github.com/pytorch/audio/pull/2707 Reviewed By: mthrok Differential Revision: D39791154 Pulled By: hwangjeff fbshipit-source-id: de00889c7bac897c6b8762502f9d37797016b71d
-
- 23 Sep, 2022 3 commits
-
-
Alex Beloi authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2700 ATT for pytorch/audio Reviewed By: mthrok Differential Revision: D39707243 fbshipit-source-id: 1dc2a5a9fe913a9071e6df679e39d632b75212fb
-
Omkar Salpekar authored
Summary: This does 2 things: Comments out Linux Wheels-related jobs in CircleCI so that they are not run on nightlies/releases. Adds a GHA workflow that calls the build workflow in pytorch/test-infra. Testing: Verified that the builds are triggered by this workflow, and all builds are green: https://github.com/pytorch/audio/actions/runs/3109635749/jobs/5040029155 Pull Request resolved: https://github.com/pytorch/audio/pull/2702 Reviewed By: seemethere Differential Revision: D39756852 Pulled By: osalpekar fbshipit-source-id: 7e222d80ca0720e3be43b929f1e55f5c0166b947
-
moto authored
Summary: Since new tutorials for StreamWriter are being added, there are more tutorials for media IO than for the rest. So this commit introduces a sub-index for IO tutorials. Pull Request resolved: https://github.com/pytorch/audio/pull/2703 Reviewed By: carolineechen Differential Revision: D39769049 Pulled By: mthrok fbshipit-source-id: 19a3981bc624fdce1d5d703c67e28a751a15e812
-
- 22 Sep, 2022 2 commits
-
-
moto authored
Summary: * Introduce the mini-index at `torchaudio.datasets` page. * Standardize the format of return type docstring. https://output.circle-artifacts.com/output/job/989328b2-0270-4958-b577-19cf749af3fd/artifacts/0/docs/datasets.html <img width="936" alt="Screen Shot 2022-09-21 at 6 56 52 PM" src="https://user-images.githubusercontent.com/855818/191475141-a97f2bea-705f-49bc-8c34-6ec869e76793.png"> https://output.circle-artifacts.com/output/job/989328b2-0270-4958-b577-19cf749af3fd/artifacts/0/docs/generated/torchaudio.datasets.CMUDict.html#torchaudio.datasets.CMUDict <img width="1069" alt="Screen Shot 2022-09-21 at 6 57 32 PM" src="https://user-images.githubusercontent.com/855818/191475293-e3302528-27ea-4212-9c12-fd6d900fdf3e.png"> Pull Request resolved: https://github.com/pytorch/audio/pull/2692 Reviewed By: carolineechen Differential Revision: D39687463 Pulled By: mthrok fbshipit-source-id: 4175fc15388817d2fe76206188618dd1576281df
-
moto authored
Summary: * Fix Sphinx warning * Update asset management Pull Request resolved: https://github.com/pytorch/audio/pull/2701 Reviewed By: carolineechen Differential Revision: D39714126 Pulled By: mthrok fbshipit-source-id: a5b04cfbf8bedce67c621b6bfe1dcd975b343313
-
- 21 Sep, 2022 4 commits
-
-
Caroline Chen authored
Summary: Add metadata mode for the following SUPERB benchmark datasets - QUESST14 - Fluent Speech Commands - VoxCeleb1 follow ups: - Add metadata mode for LibriMix -- waiting for unit tests to merge - Add IEMOCAP + SNIPS datasets Pull Request resolved: https://github.com/pytorch/audio/pull/2697 Reviewed By: mthrok Differential Revision: D39666809 Pulled By: carolineechen fbshipit-source-id: 3a8f07627acceed70f960f47e694efad75b108c2
-
moto authored
Summary: * Introduce the mini-index at `torchaudio.pipelines` page. * Add introductions * Update pipeline tutorials https://output.circle-artifacts.com/output/job/ccc57d95-1930-45c9-b967-c8d477d35f29/artifacts/0/docs/pipelines.html <img width="1163" alt="Screen Shot 2022-09-20 at 1 23 29 PM" src="https://user-images.githubusercontent.com/855818/191167049-98324e93-2e16-41db-8538-3b5b54eb8224.png"> <img width="1115" alt="Screen Shot 2022-09-20 at 1 23 49 PM" src="https://user-images.githubusercontent.com/855818/191167071-4770f594-2540-43a4-a01c-e983bf59220f.png"> https://output.circle-artifacts.com/output/job/ccc57d95-1930-45c9-b967-c8d477d35f29/artifacts/0/docs/generated/torchaudio.pipelines.RNNTBundle.html#torchaudio.pipelines.RNNTBundle <img width="1108" alt="Screen Shot 2022-09-20 at 1 24 18 PM" src="https://user-images.githubusercontent.com/855818/191167123-51b33a5f-c30c-46bc-b002-b05d2d0d27b7.png"> Pull Request resolved: https://github.com/pytorch/audio/pull/2689 Reviewed By: carolineechen Differential Revision: D39691253 Pulled By: mthrok fbshipit-source-id: ddf5fdadb0b64cf2867b6271ba53e8e8c0fa7e49
-
moto authored
Summary: In https://github.com/pytorch/audio/issues/2694 CMakeLists.txt was not properly updated, so the tests are failing. This commit fixes it. Pull Request resolved: https://github.com/pytorch/audio/pull/2699 Reviewed By: carolineechen Differential Revision: D39687409 Pulled By: mthrok fbshipit-source-id: 2e14f3c478f1f8a112a03839f2dbcca51215fed7
-
Moto Hira authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2694 This commit adds Tensor type as input to `StreamReader`. The Tensor is interpreted as byte string buffer. Reviewed By: hwangjeff Differential Revision: D39467630 fbshipit-source-id: 6369eed5e16fbb657568bf6bb80d703483d72f8e
-