- 23 Dec, 2021 3 commits
-
-
moto authored
Summary: Follow-up of https://github.com/pytorch/audio/issues/2086 The CI job to download the third party code and cache daily has not been properly updated. Pull Request resolved: https://github.com/pytorch/audio/pull/2095 Reviewed By: hwangjeff Differential Revision: D33291738 Pulled By: mthrok fbshipit-source-id: 6fc61f76b35c6f032085eda9d6053eefd2a1e0a9
-
Moto Hira authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2094 Reviewed By: nateanl Differential Revision: D33288439 fbshipit-source-id: 385e0e4257755dbaf143287f612e19bede189757
-
hwangjeff authored
Summary: Adds implementation of Conformer module. Adapted from sravyapopuri388's implementation for fairseq at https://github.com/fairinternal/fairseq-py/pull/2770. Pull Request resolved: https://github.com/pytorch/audio/pull/2068 Reviewed By: mthrok Differential Revision: D33236957 Pulled By: hwangjeff fbshipit-source-id: 382d99394996ff5249522b5899e1a4b4a95de9e6
-
- 22 Dec, 2021 2 commits
-
-
Joao Gomes authored
Summary: - Deprecates data utils (with a warning that they will be removed in v0.12) - Replaces all usages of `torchaudio.datasets.utils.download_url` with `torch.hub.download_url_to_file` - Replaces all MD5 hashes with SHA256 hashes. Addresses https://github.com/pytorch/audio/issues/1883 Pull Request resolved: https://github.com/pytorch/audio/pull/2073 Reviewed By: mthrok Differential Revision: D33241756 Pulled By: jdsgomes fbshipit-source-id: 49388ec5965bfc91d9a1d8d0786eeafb2969f6cf
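As a rough illustration of the SHA256 check mentioned in this commit, here is a minimal sketch using only the standard library. The helper name `sha256_matches` and the chunk size are assumptions made for this example and are not part of torchaudio. (`torch.hub.download_url_to_file` itself accepts a `hash_prefix` argument for a similar purpose.)

```python
import hashlib


def sha256_matches(path: str, expected_prefix: str, chunk_size: int = 1 << 20) -> bool:
    """Return True if the SHA256 hex digest of the file starts with expected_prefix.

    Hypothetical helper for illustration; reads the file in chunks so that
    large dataset archives do not have to fit in memory.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest().startswith(expected_prefix)
```

Comparing against a prefix rather than the full digest keeps the stored checksums short, at the cost of a slightly weaker check.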
-
Joao Gomes authored
Summary: After discussing with Moto Hira, we decided to revert linting exemptions introduced previously in order to keep the entire audio project as formatted as possible, to reduce the time we spend on formatting discussion. Pull Request resolved: https://github.com/pytorch/audio/pull/2087 Reviewed By: mthrok Differential Revision: D33236949 Pulled By: jdsgomes fbshipit-source-id: e13079f532c4534d8a168059b0ded6fa375fdecf
-
- 21 Dec, 2021 3 commits
-
-
Moto Hira authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2092 Reviewed By: carolineechen Differential Revision: D33169110 fbshipit-source-id: e422ad93efe50b91f1ac5d572dc82768c1000c05
-
moto authored
Summary: 1. Reorder Audio display so that audios are playable from browser in doc 2. Add link to function documentations https://470342-90321822-gh.circle-artifacts.com/0/docs/tutorials/audio_data_augmentation_tutorial.html Pull Request resolved: https://github.com/pytorch/audio/pull/2082 Reviewed By: carolineechen Differential Revision: D33227725 Pulled By: mthrok fbshipit-source-id: c7ee360b6f9b84c8e0a9b72193b98487d03b57ab
-
moto authored
Summary: ## Bug description When 24 bits-per-sample audio is loaded via a file-like object, the loaded Tensor is wrong. It is fine if the audio is loaded from a local file. ## The cause of the bug The core of sox's decoding mechanism is the `sox_read` function, one of whose parameters is the maximum number of samples to decode from the given buffer. https://fossies.org/dox/sox-14.4.2/formats_8c.html#a2a4f0194a0f919d4f38c57b81aa2c06f The `sox_read` function is called in the callback of what is called the `drain` effect, and this callback receives the output buffer and its size in bytes. The previous implementation passed this size value as the argument of `sox_read` for the maximum number of samples to read. Since the buffer size in bytes is larger than the number of samples that fit in the buffer, the `sox_read` function always consumed the entire buffer. (This behavior is not wrong except when the input is 24 bits-per-sample and a file-like object.) When the input is read from a file-like object, inside the drain callback, new data are fetched via Python's `read` method and loaded into a fixed-size memory region. The size of this memory region can be adjusted via `torchaudio.utils.sox_utils.set_buffer_size`, but the default value is 8096. If the input format is 24 bits-per-sample, the end of the memory region does not necessarily correspond to the end of a valid sample. When `sox_read` consumes all the data in the buffer region, the data at the end introduces some unexpected values. This causes the aforementioned bug. ## Fix Pass a proper (better estimated) maximum number of decodable samples to `sox_read`. Pull Request resolved: https://github.com/pytorch/audio/pull/2084 Reviewed By: carolineechen Differential Revision: D33236947 Pulled By: mthrok fbshipit-source-id: 171d9b7945f81db54f98362a68b20f2f95bb11a4
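The misalignment described above can be checked with simple arithmetic. The sketch below is not the actual C++ change; it only assumes packed samples (3 bytes per 24-bit sample) and the 8096-byte buffer size quoted in the commit message. The function names are hypothetical.

```python
def max_whole_samples(buffer_size_bytes: int, bits_per_sample: int) -> int:
    """Number of complete samples that fit in the buffer (assumes packed samples)."""
    return buffer_size_bytes // (bits_per_sample // 8)


def trailing_bytes(buffer_size_bytes: int, bits_per_sample: int) -> int:
    """Bytes left at the end of the buffer that do not form a complete sample."""
    return buffer_size_bytes % (bits_per_sample // 8)


buffer_size = 8096  # value quoted in the commit message
for bits in (16, 24, 32):
    print(bits, max_whole_samples(buffer_size, bits), trailing_bytes(buffer_size, bits))
```

Under these assumptions, only the 24-bit case leaves a partial sample at the end of the buffer (2 trailing bytes), which is consistent with the bug appearing only for 24 bits-per-sample input.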
-
- 20 Dec, 2021 3 commits
-
-
moto authored
Summary: Previously, sox-related third-party source code was archived at `third_party/sox/archives`. Recently, KenLM-related third-party source code was added, and it is archived at `third_party/archives`. This PR changes the sox archive location to `third_party/archives`, so that all the archives are cached at the same location. Pull Request resolved: https://github.com/pytorch/audio/pull/2086 Reviewed By: carolineechen Differential Revision: D33236927 Pulled By: mthrok fbshipit-source-id: 2f2aa5f4b386fefb46d7c98f7179c04995219f3c
-
Joao Gomes authored
Summary: The urls for this dataset seem to have changed so I am updating to the new location Pull Request resolved: https://github.com/pytorch/audio/pull/2074 Reviewed By: mthrok Differential Revision: D33234996 Pulled By: jdsgomes fbshipit-source-id: e09c35a122e8227fcce7fa97aeeeea312cb89173
-
moto authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2085 Reviewed By: carolineechen Differential Revision: D33235225 Pulled By: mthrok fbshipit-source-id: 47fe9ec4c93a26322b3a362202ddd3c4654c3f8c
-
- 18 Dec, 2021 1 commit
-
-
moto authored
Summary: After all the C++ code from https://github.com/pytorch/audio/issues/2072 is added, this commit will enable decoder/KenLM integration in the build process. Pull Request resolved: https://github.com/pytorch/audio/pull/2078 Reviewed By: carolineechen Differential Revision: D33198183 Pulled By: mthrok fbshipit-source-id: 9d7fa76151d06fbbac3785183c7c2ff9862d3128
-
- 17 Dec, 2021 4 commits
-
-
Caroline Chen authored
Summary: part of https://github.com/pytorch/audio/issues/2072 -- splitting up the PR for easier review Add C++ files for binding CTC decoder functionality for Python Note: the code here will not be compiled until the build process is changed Pull Request resolved: https://github.com/pytorch/audio/pull/2079 Reviewed By: mthrok Differential Revision: D33196286 Pulled By: carolineechen fbshipit-source-id: 9fe4a8635b60ebfb594918bab00f5c3dccf96bd2
-
Caroline Chen authored
Summary: part of https://github.com/pytorch/audio/issues/2072 -- splitting up the PR for easier review Add C++ files from [flashlight](https://github.com/flashlight/flashlight) that are needed for building CTC decoder w/ Lexicon and KenLM support Note: the code here will not be compiled until the build process is changed (future PR) Pull Request resolved: https://github.com/pytorch/audio/pull/2075 Reviewed By: mthrok Differential Revision: D33186825 Pulled By: carolineechen fbshipit-source-id: 5b69eea7634f3fae686471d988422942bb784cd9
-
moto authored
Summary: Add KenLM and its dependencies required for static build (`zlib`, `bzip2`, `lzma` and `boost-thread`). KenLM and its dependencies are built, but since no corresponding code on the torchaudio side is changed, the resulting torchaudio extension module is not changed. (Therefore, as long as the build process passes on CI, this PR should be good to go.) Pull Request resolved: https://github.com/pytorch/audio/pull/2076 Reviewed By: carolineechen Differential Revision: D33189980 Pulled By: mthrok fbshipit-source-id: 6096113128b939f3cf70990c99aacc4aaa954584
-
moto authored
Summary: Similar to https://github.com/pytorch/audio/issues/2040, this commit refactors the part of CMakeLists.txt that defines the extension module so that a second extension can be added easily. Pull Request resolved: https://github.com/pytorch/audio/pull/2077 Reviewed By: carolineechen Differential Revision: D33189998 Pulled By: mthrok fbshipit-source-id: dc562ce5360332479a7493c21a2930c6fcc6be84
-
- 15 Dec, 2021 1 commit
-
-
Joao Gomes authored
Summary: In order to align with the internal configuration and also torchvision, we decided to exclude sphinx-gallery examples from the lint checks. cc NicolasHug mthrok Pull Request resolved: https://github.com/pytorch/audio/pull/2071 Reviewed By: NicolasHug Differential Revision: D33091124 Pulled By: jdsgomes fbshipit-source-id: ffda2dde9115f0590cbde7785007cf811caca7ef
-
- 11 Dec, 2021 1 commit
-
-
Andrey Talman authored
Summary: cc peterjc123 maxluk nbcsm guyang3532 gunandrose4u smartcat2010 mszhanyi Pull Request resolved: https://github.com/pytorch/audio/pull/2067 Reviewed By: seemethere Differential Revision: D33032607 Pulled By: atalman fbshipit-source-id: a5767e9af27690d3a7ab762ddf30178b3069cd35
-
- 10 Dec, 2021 3 commits
-
-
Zhaoheng Ni authored
Summary: The unit test failures seem to be caused by [conda 4.11](https://github.com/conda/conda/issues/11096). Removing the conda update line fixes the issue. Pull Request resolved: https://github.com/pytorch/audio/pull/2069 Reviewed By: carolineechen Differential Revision: D33023851 Pulled By: nateanl fbshipit-source-id: 73246189d4ccc541e366a5367f532a5b456af8f8
-
nateanl authored
Summary: The PR adds PyTorch Lightning based training script for HuBERT Base model. There are two iterations of pre-training and 1 iteration of ASR fine-tuning on LibriSpeech dataset. Pull Request resolved: https://github.com/pytorch/audio/pull/2000 Reviewed By: carolineechen Differential Revision: D33021467 Pulled By: nateanl fbshipit-source-id: 77fe5a751943b56b63d5f1fb4e6ef35946e081db
-
Joao Gomes authored
Summary: Following up on [this comment](https://github.com/pytorch/audio/pull/2056#issuecomment-988356439), I am separating the config changes from the formatting. cc NicolasHug mthrok Pull Request resolved: https://github.com/pytorch/audio/pull/2066 Reviewed By: mthrok Differential Revision: D32990377 Pulled By: jdsgomes fbshipit-source-id: 67a6251a51901702ad10ae43c35609a09cbf5c5c
-
- 08 Dec, 2021 1 commit
-
-
moto authored
Summary: Part of https://github.com/pytorch/audio/issues/1986. Splitting the PR for easier review. Add a `Decoder` class that manages the `AVCodecContext` resource and processes input `AVPacket`. For the overall architecture, see https://github.com/mthrok/audio/blob/ffmpeg/torchaudio/csrc/ffmpeg/README.md. Note: Without a change to the build process, the code added here won't be compiled. The build process will be updated later. Needs to be imported after https://github.com/pytorch/audio/issues/2041. Pull Request resolved: https://github.com/pytorch/audio/pull/2042 Reviewed By: carolineechen Differential Revision: D32933294 Pulled By: mthrok fbshipit-source-id: e443debadb44d491462fb641cd5b7b20c413b5b9
-
- 07 Dec, 2021 1 commit
-
-
moto authored
Summary: Part of https://github.com/pytorch/audio/issues/1986. Splitting the PR for easier review. Add wrapper classes that automatically release memory allocated by ffmpeg libraries. For the overall architecture, see https://github.com/mthrok/audio/blob/ffmpeg/torchaudio/csrc/ffmpeg/README.md. Note: Without a change to the build process, the code added here won't be compiled. The build process will be updated later. - [x] Needs to be imported after updating TARGETS file. Pull Request resolved: https://github.com/pytorch/audio/pull/2041 Reviewed By: carolineechen Differential Revision: D32688964 Pulled By: mthrok fbshipit-source-id: 165bef5b292dbedae4e9599d53fb2a3f06978db8
-
- 04 Dec, 2021 1 commit
-
-
moto authored
Summary: (See https://github.com/pytorch/audio/issues/2038 description for the overall goal.) This commit turns the part that defines `libtorchaudio` into a function so that it becomes easy to define libraries in the same way as `libtorchaudio`. Built on top of https://github.com/pytorch/audio/issues/2039 Pull Request resolved: https://github.com/pytorch/audio/pull/2040 Reviewed By: hwangjeff Differential Revision: D32851990 Pulled By: mthrok fbshipit-source-id: a8206c62b076bc0849ada1a66c7502ae5ea35e28
-
- 03 Dec, 2021 5 commits
-
-
moto authored
Summary: While updating the documentation in release/0.10, a HIP error was raised. https://app.circleci.com/pipelines/github/pytorch/audio/8577/workflows/02c6ff44-a042-4f9a-8fb8-573a231f60db/jobs/452639 This happens because `pip install torchaudio -f https://...` defaults to the ROCm version while `build_doc` is supposed to pick the CPU version. Adding the suffix `+cpu` should resolve the issue. It is validated on https://github.com/pytorch/audio/pull/2060 https://app.circleci.com/pipelines/github/pytorch/audio/8584/workflows/25ae26e5-273f-46f8-805d-ffc7b6b8eb58/jobs/453337 Pull Request resolved: https://github.com/pytorch/audio/pull/2060 Reviewed By: carolineechen Differential Revision: D32846765 Pulled By: mthrok fbshipit-source-id: e6b3b32646388b8c4ba864639f8b62d8b9d39844
-
moto authored
Summary: (See https://github.com/pytorch/audio/issues/2038 description for the overall goal.) This PR cleans up CMake customization logic for `libtorchaudio`. It introduces base variables LIBTORCHAUDIO_INCLUDE_DIRS, LIBTORCHAUDIO_LINK_LIBRARIES and LIBTORCHAUDIO_COMPILE_DEFINITIONS, which are respectively used when calling `target_include_directories`, `target_link_libraries` and `target_compile_definitions`. The customization logic only modifies these variables. The original implementation called these functions multiple times (once per customization logic), and it was getting difficult to understand the customization logic. Pull Request resolved: https://github.com/pytorch/audio/pull/2039 Reviewed By: carolineechen, nateanl Differential Revision: D32683004 Pulled By: mthrok fbshipit-source-id: 4d41f21692ac139b1185a6ab69eb45d881ee7e73
-
Joao Gomes authored
Summary: Addresses https://github.com/pytorch/audio/issues/1493 cc mthrok hwangjeff Pull Request resolved: https://github.com/pytorch/audio/pull/2034 Reviewed By: hwangjeff Differential Revision: D32807006 Pulled By: mthrok fbshipit-source-id: badf148646c5f768328c5a4e51bd6016b0be46f3
-
hwangjeff authored
Summary: Add training recipe for RNN-T Emformer ASR model to examples directory. Pull Request resolved: https://github.com/pytorch/audio/pull/2052 Reviewed By: nateanl Differential Revision: D32814096 Pulled By: hwangjeff fbshipit-source-id: a5153044efc16cb39f0e6413369a6791637af76a
-
Yi Zhang authored
Summary: 1. Stop and disable the Windows upgrade, which is the major cause of the CUDA installation failure. https://app.circleci.com/pipelines/github/pytorch/audio/8458/workflows/feb65e3b-1093-4724-b849-1a2ac166f354/jobs/441331 For more details, please check out https://github.com/pytorch/pytorch/issues/64536 2. Print the log when the CUDA installation fails. Pull Request resolved: https://github.com/pytorch/audio/pull/2032 Reviewed By: mthrok Differential Revision: D32816145 Pulled By: malfet fbshipit-source-id: 44a2ef0dd4c43469472a6e518ed64841e2dcd5bb
-
- 02 Dec, 2021 1 commit
-
-
moto authored
Summary: (This is a part of a refactor series, followed up by https://github.com/pytorch/audio/issues/2039 and https://github.com/pytorch/audio/issues/2040. The goal is to make it easy to add a new library artifact alongside `libtorchaudio`, as in https://github.com/pytorch/audio/pull/2048/commits/4ced990849e60f6d19e87ae22819b04d1726648e https://github.com/pytorch/audio/issues/2048 .) We plan to add prototype/beta third-party library integrations, which could be unstable (segfaults, missing dynamic library dependencies, etc.). If we add such integrations into the existing libtorchaudio, in the worst case, it will prevent users from even running `import torchaudio`. Instead, we would like to separate the prototype/beta integrations into separate libraries, so that such issues do not impact all users but only users who attempt to use these prototype/beta features. Say a prototype feature `foo` is added in `torchaudio.prototype.foo`. The following initialization procedure will achieve the above mechanism. 1. Place the library file `libtorchaudio_foo` in `torchaudio/lib`. 2. In `torchaudio.prototype.foo.__init__.py`, load `libtorchaudio_foo`. Note: The approach will be slightly different for fbcode, because of how buck deploys C++ libraries and the standardized environment, but the code change here is still applicable. Pull Request resolved: https://github.com/pytorch/audio/pull/2038 Reviewed By: carolineechen, nateanl Differential Revision: D32682900 Pulled By: mthrok fbshipit-source-id: 0f402a92a366fba8c2894a0fe01f47f8cdd51376
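The two-step initialization above could be sketched roughly as follows. This is an illustration only, not torchaudio's actual loading code: the helper name `load_optional_library`, the `lib<name>.*` file naming, and the use of `ctypes.CDLL` as the default loader are all assumptions made for this example.

```python
import ctypes
from pathlib import Path
from typing import Callable, Optional


def load_optional_library(
    lib_dir: Path,
    name: str,
    loader: Callable[[str], object] = ctypes.CDLL,
) -> Optional[object]:
    """Load `lib<name>.*` from lib_dir if it exists; return None if it is absent.

    Hypothetical helper. A real package would likely use its framework's own
    loader instead of ctypes.CDLL, which is only a stand-in here.
    """
    candidates = sorted(lib_dir.glob(f"lib{name}.*"))
    if not candidates:
        return None
    # Any load error surfaces here, i.e. only when this feature is imported.
    return loader(str(candidates[0]))
```

Calling such a helper from the `__init__.py` of the prototype subpackage, rather than from the top-level package, is what would keep a broken optional library from affecting a plain `import` of the main package.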
-
- 30 Nov, 2021 2 commits
-
-
hwangjeff authored
Summary: Our Griffin-Lim autograd tests take a long time to run. This PR adjusts some parameters to shorten the run time. For one of the four tests: Before: ``` test/torchaudio_unittest/transforms/autograd_cpu_test.py . [100%] ======================== 1 passed in 517.35s (0:08:37) ========================= ``` After: ``` test/torchaudio_unittest/transforms/autograd_cpu_test.py . [100%] ======================== 1 passed in 104.59s (0:01:44) ========================= ``` Pull Request resolved: https://github.com/pytorch/audio/pull/2037 Reviewed By: mthrok Differential Revision: D32726213 Pulled By: hwangjeff fbshipit-source-id: c785323ab380aea4b63fb1683b557c8ae842f54e
-
moto authored
Summary: Resolves https://github.com/pytorch/audio/issues/2049, https://github.com/pytorch/audio/issues/1940 Pull Request resolved: https://github.com/pytorch/audio/pull/2050 Reviewed By: nateanl Differential Revision: D32712513 Pulled By: mthrok fbshipit-source-id: e1db81786bcca67605ff765d27e0527e20967d1c
-
- 24 Nov, 2021 3 commits
-
-
Yi Zhang authored
Summary: Similar to https://github.com/pytorch/vision/pull/4788. Make sure the workflow can download the right PyTorch build (CPU or CUDA) in case the nightly build is not ready on that day. https://app.circleci.com/pipelines/github/pytorch/audio/8427/workflows/11a80738-bcdd-45e3-b37f-328be36c60ee/jobs/438285?invite=true#step-107-542 Pull Request resolved: https://github.com/pytorch/audio/pull/2026 Reviewed By: hwangjeff, nateanl Differential Revision: D32634926 Pulled By: mthrok fbshipit-source-id: 30d6349a0a2ce174b789a5888b1c8e0544a23a37
-
Caroline Chen authored
Summary: The previous way of detecting the merger and labels given a commit hash no longer works with ShipIt, as PRs are closed rather than merged and are not associated with a commit hash. To work around this, update the script to get the merger ("Pulled By: ") and PR number from the commit message, and then collect labels from the corresponding PR. Pull Request resolved: https://github.com/pytorch/audio/pull/2030 Reviewed By: mthrok Differential Revision: D32634870 Pulled By: carolineechen fbshipit-source-id: a8fcfc5912871d3cca056de43ab25b5d0acb2226
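A minimal sketch of extracting the merger and PR number from a commit message in the format shown throughout this log. The regular expressions and the function name `parse_commit_message` are assumptions for illustration and are not taken from the actual script.

```python
import re
from typing import Optional, Tuple

# Patterns assumed from the commit messages listed in this log.
_PR_PATTERN = re.compile(
    r"Pull Request resolved: https://github\.com/pytorch/audio/pull/(\d+)"
)
_MERGER_PATTERN = re.compile(r"Pulled By: (\S+)")


def parse_commit_message(message: str) -> Tuple[Optional[int], Optional[str]]:
    """Return (PR number, merger); either is None if the field is missing."""
    pr = _PR_PATTERN.search(message)
    merger = _MERGER_PATTERN.search(message)
    return (
        int(pr.group(1)) if pr else None,
        merger.group(1) if merger else None,
    )
```

Some commits in this log carry no "Pulled By:" field, which is why the merger is treated as optional; the PR number could then be used to look up labels on the corresponding pull request.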
-
hwangjeff authored
Summary: Adds beam search decoder for RNN-T implementation ``torchaudio.prototype.RNNT`` that is TorchScript-able and supports both streaming and non-streaming inference. Pull Request resolved: https://github.com/pytorch/audio/pull/2028 Reviewed By: mthrok Differential Revision: D32627919 Pulled By: hwangjeff fbshipit-source-id: aab99e346d6514a3207a9fb69d4b42978b4cdbbd
-
- 23 Nov, 2021 2 commits
-
-
moto authored
Summary: - Remove unnecessary content list - Remove legacy description Pull Request resolved: https://github.com/pytorch/audio/pull/2029 Reviewed By: carolineechen Differential Revision: D32629917 Pulled By: mthrok fbshipit-source-id: bc9a9366c681bcf8b74907c2a6459c73fb6a7424
-
moto authored
Summary: The sox_effects test in `concurrent.futures.ThreadPoolExecutor` started failing a couple of days ago. While this is being investigated, skip the test. Pull Request resolved: https://github.com/pytorch/audio/pull/2025 Reviewed By: nateanl Differential Revision: D32615933 Pulled By: mthrok fbshipit-source-id: 4f7301c0d3c0d11f687011e42e06d9c87ce4197f
-
- 22 Nov, 2021 3 commits
-
-
Zhaoheng Ni authored
Summary: Allow users to use `torch.cfloat` dtype input for the MVDR module. It internally converts the spectrogram into `torch.cdouble` and outputs the tensor with the original dtype of the spectrogram. Pull Request resolved: https://github.com/pytorch/audio/pull/2024 Reviewed By: carolineechen Differential Revision: D32594051 Pulled By: nateanl fbshipit-source-id: e32609ccdc881b36300d579c90daba41c9234b46
-
Albert Villanova del Moral authored
Summary: Fix minor typo in docs. Pull Request resolved: https://github.com/pytorch/audio/pull/2012 Reviewed By: nateanl Differential Revision: D32562618 Pulled By: mthrok fbshipit-source-id: 79262a14d9b10381249602a63f400232031abaa2
-
Zhaoheng Ni authored
Summary: Division first, multiplication second. This helps avoid the value overflow issue. It also helps the ``stv_evd`` solution pass the gradient check. Pull Request resolved: https://github.com/pytorch/audio/pull/2004 Reviewed By: mthrok Differential Revision: D32539827 Pulled By: nateanl fbshipit-source-id: 70a386608324bb6e1b1c7238c78d403698590f22
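The effect of operation order on overflow can be illustrated with plain Python floats. This is only a sketch of the general numerical idea, not the MVDR code changed in this commit; the function names and the input values are chosen purely for illustration.

```python
import math


def multiply_then_divide(a: float, b: float, c: float) -> float:
    """The intermediate product a * b may overflow before the division."""
    return (a * b) / c


def divide_then_multiply(a: float, b: float, c: float) -> float:
    """Dividing first keeps the intermediate value within range."""
    return (a / c) * b


a = b = c = 1e200
print(math.isinf(multiply_then_divide(a, b, c)))  # intermediate product overflows
print(divide_then_multiply(a, b, c))              # stays finite
```

With these inputs the product `a * b` exceeds the largest representable double, so the first ordering yields infinity, while the second ordering yields a finite result.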
-