- 20 May, 2023 1 commit
-
-
Zhaoheng Ni authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3348 The pull request adds a CTC-based forced alignment function that supports both CPU and CUDA deviced. The function takes the CTC emissions and target labels as inputs and generates the corresponding labels for each frame. Reviewed By: vineelpratap, xiaohui-zhang Differential Revision: D45867265 fbshipit-source-id: 3e25b06bf9bc8bb1bdcdc08de7f4434d912154cb
-
- 28 Apr, 2023 1 commit
-
-
Yuekai Zhang authored
Summary: This PR implements a CUDA based ctc prefix beam search decoder. Attach serveral benchmark results using V100 below: |decoder type| model |datasets | decoding time (secs)| beam size | batch size | model unit | subsampling times | vocab size | |--------------|---------|------|-----------------|------------|-------------|------------|-----------------------|------------| | cuctc | conformer nemo |dev clean |7.68s | 8 | 32 | bpe | 4 | 1000| | cuctc | conformer nemo |dev clean (sort by length) |1.6s | 8 | 32 | bpe | 4 | 1000| | cuctc | wav2vec2.0 torchaudio |dev clean |22s | 10 | 1 | char | 2 | 29| | cuctc | conformer espnet |aishell1 test | 5s | 10 | 24 | char | 4 | 4233| Note: 1. The design is to parallel computation through batch and vocab axis, for loop the frames axis. So it's more friendly with smaller sequence lengths, larger vocab size comparing with CPU implementations. 2. WER is the same as CPU implementations. However, it can't decode with LM now. Resolves: https://github.com/pytorch/audio/issues/2957. Pull Request resolved: https://github.com/pytorch/audio/pull/3096 Reviewed By: nateanl Differential Revision: D44709397 Pulled By: mthrok fbshipit-source-id: 3078c54a2b44dc00eb4a81b4c657487eeff8c155
-
- 05 Apr, 2023 1 commit
-
-
moto authored
Summary: Following https://github.com/pytorch/audio/pull/3232, static build of flashlight-text has been disabled and removed from nightly build. This commit removes the related source/build from torchaudio code base. Pull Request resolved: https://github.com/pytorch/audio/pull/3236 Reviewed By: jacobkahn Differential Revision: D44712539 Pulled By: mthrok fbshipit-source-id: a201c89b5046f224526309cd4e17a5105e58a949
-
- 14 Feb, 2023 1 commit
-
-
Zhaoheng Ni authored
Summary: replicate of https://github.com/pytorch/audio/issues/2644 Pull Request resolved: https://github.com/pytorch/audio/pull/2880 Reviewed By: mthrok Differential Revision: D41633911 Pulled By: nateanl fbshipit-source-id: 73cf145d75c389e996aafe96571ab86dc21f86e5
-
- 09 Feb, 2023 1 commit
-
-
moto authored
Summary: Commit b4c66d1f broke all the CIs. The new policy changes the timestamp of configuration files of third party libraries, which triggers re-configuration which requires extra tools. This commit fixes it by reverting the old behavior. Also this adds guard for older cmake versions. Pull Request resolved: https://github.com/pytorch/audio/pull/3046 Reviewed By: atalman Differential Revision: D43133536 Pulled By: mthrok fbshipit-source-id: 357055c8c1b53e593b8b7880f2045e13512c7a8f
-
- 08 Feb, 2023 1 commit
-
-
moto authored
Summary: Currently, for each third party library checked out with ExternalProject_Add, the following warning is shown. This commit set the policy so that the warning is not shown. ``` CMake Warning (dev) at ci_env/lib/python3.10/site-packages/cmake/data/share/cmake-3.25/Modules/ExternalProject.cmake:3075 (message): The DOWNLOAD_EXTRACT_TIMESTAMP option was not given and policy CMP0135 is not set. The policy's OLD behavior will be used. When using a URL download, the timestamps of extracted files should preferably be that of the time of extraction, otherwise code that depends on the extracted contents might not be rebuilt if the URL changes. The OLD behavior preserves the timestamps from the archive instead, but this is usually not what you want. Update your project to the NEW behavior or specify the DOWNLOAD_EXTRACT_TIMESTAMP option with a value of true to avoid this robustness issue. ``` Pull Request resolved: https://github.com/pytorch/audio/pull/3044 Reviewed By: xiaohui-zhang Differential Revision: D43110818 Pulled By: mthrok fbshipit-source-id: d2e20c9fdbbeeedb5ad546fe32dbda28c5bdd431
-
- 12 Jan, 2023 1 commit
-
-
moto authored
Summary: Following the change in PyTorch core. https://github.com/pytorch/pytorch/commit/87e4a087784c805312a2b48bb063d2400df26c5e Pull Request resolved: https://github.com/pytorch/audio/pull/2973 Reviewed By: xiaohui-zhang Differential Revision: D42462709 Pulled By: mthrok fbshipit-source-id: 60c2aa3d63fe25d8e0b7aa476404e7a55d6eb87f
-
- 04 Jan, 2023 1 commit
-
-
moto authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2866 Reviewed By: carolineechen Differential Revision: D42349474 Pulled By: mthrok fbshipit-source-id: 31455184031fff52719ef829e40bb1e09e11b0e7
-
- 29 Dec, 2022 1 commit
-
-
moto authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2930 Reviewed By: carolineechen, nateanl Differential Revision: D42280966 Pulled By: mthrok fbshipit-source-id: f9d5f1dc7c1a62d932fb2020aafb63734f2bf405
-
- 29 Jul, 2022 1 commit
-
-
moto authored
Summary: This commit enables CTC decoder on Windows. The functionality seems to work fine. The tests are passing, the decoding tutorial runs fine. The only difference to the Linux/macOS version is that loading model in XZ compression format is not supported.  Pull Request resolved: https://github.com/pytorch/audio/pull/2587 Reviewed By: carolineechen, nateanl Differential Revision: D38276490 Pulled By: mthrok fbshipit-source-id: f2203b2235c5bbb0220fe560aaaf0e1d5530347a
-
- 28 Jul, 2022 1 commit
-
-
moto authored
Summary: Extract the helper functions for defining library and extension so that they can be reused for building flashlight library and binding in https://github.com/pytorch/audio/issues/2580. Pull Request resolved: https://github.com/pytorch/audio/pull/2585 Reviewed By: carolineechen Differential Revision: D38233407 Pulled By: mthrok fbshipit-source-id: 96f7c62a8b70bb3ff5caede9730165d54a55272f
-
- 02 Jun, 2022 1 commit
-
-
moto authored
Summary: Remove the code related to libmad, which had been disabled in https://github.com/pytorch/audio/issues/2354 In https://github.com/pytorch/audio/issues/2419, we mp3 decoding to ffmpeg. But CI tests were still using libmad. This commit completely removes libmad from torchaudio. This is BC-breaking change as `apply_sox_effects_file` function cannot handle MP3, and it cannot fallback to ffmpeg. The workaround for this is to use `torchaudio.load` then `apply_sox_effects_tensor`. Pull Request resolved: https://github.com/pytorch/audio/pull/2428 Reviewed By: carolineechen Differential Revision: D36851805 Pulled By: mthrok fbshipit-source-id: f98795c59a1ac61cef511f2bbeac37f7c3c69d55
-
- 13 May, 2022 1 commit
-
-
moto authored
Summary: This commit moves the Streaming API out of prototype module. * The related classes are renamed as following - `Streamer` -> `StreamReader`. - `SourceStream` -> `StreamReaderSourceStream` - `SourceAudioStream` -> `StreamReaderSourceAudioStream` - `SourceVideoStream` -> `StreamReaderSourceVideoStream` - `OutputStream` -> `StreamReaderOutputStream` This change is preemptive measurement for the possibility to add `StreamWriter` API. * Replace BUILD_FFMPEG build arg with USE_FFMPEG We are not building FFmpeg, so USE_FFMPEG is more appropriate --- After https://github.com/pytorch/audio/issues/2377 Remaining TODOs: (different PRs) - [ ] Introduce `is_ffmpeg_binding_available` function. - [ ] Refactor C++ code: - Rename `Streamer` to `StreamReader`. - Rename `streamer.[h|cpp]` to `stream_reader.[h|cpp]`. - Rename `prototype.cpp` to `stream_reader_binding.cpp`. - Introduce `stream_reader` directory. - [x] Enable FFmpeg in smoke test (https://github.com/pytorch/audio/issues/2381) Pull Request resolved: https://github.com/pytorch/audio/pull/2378 Reviewed By: carolineechen Differential Revision: D36359299 Pulled By: mthrok fbshipit-source-id: 6a57b702996af871e577fb7addbf3522081c1328
-
- 28 Apr, 2022 1 commit
-
-
moto authored
Summary: libmad integration should be enabled only from source-build Pull Request resolved: https://github.com/pytorch/audio/pull/2354 Reviewed By: nateanl Differential Revision: D36012035 Pulled By: mthrok fbshipit-source-id: adeda8cbfd418f96245909cae6862b648a6915a7
-
- 05 Jan, 2022 1 commit
-
-
moto authored
Summary: Update ffmpeg discovery logic Previously the build process used pkg-config to locate an installation of ffmpeg, which does not work well Windows/CentOS. This commit update the discovery process to use the custom FindFFMPEG.cmake adopted from Kitware/VTK repository with addition of conda environment. The custom discovery logic can support Windows and CentOS. Pull Request resolved: https://github.com/pytorch/audio/pull/2124 Reviewed By: carolineechen Differential Revision: D33429564 Pulled By: mthrok fbshipit-source-id: 6cb50c1d8c58f51e0f3f3af5c5b541aa3a699bba
-
- 30 Dec, 2021 1 commit
-
-
moto authored
Summary: This PR adds `BUILD_FFMPEG` switch to torchaudio build process so that features related to ffmpeg are built. The flag is false by default, so no CI jobs or development flow are affected. This is because handling the dependencies around ffmpeg is a bit tricky. Currently, the CMake file uses `pkg-config` to find an ffmpeg installation in the system. This works fine for both conda-based installation and system-managed installation (like `apt`). In subsequent PRs, I will find a solution that works for local development and binary distributions. Pull Request resolved: https://github.com/pytorch/audio/pull/2048 Reviewed By: hwangjeff, nateanl Differential Revision: D33367260 Pulled By: mthrok fbshipit-source-id: 94517acecb62bd6d4e96d4b7cbc3ab3c2a25706c
-
- 18 Dec, 2021 1 commit
-
-
moto authored
Summary: After all the C++ code from https://github.com/pytorch/audio/issues/2072 are added, this commit will enable decoder/KenLM integration in the build process. Pull Request resolved: https://github.com/pytorch/audio/pull/2078 Reviewed By: carolineechen Differential Revision: D33198183 Pulled By: mthrok fbshipit-source-id: 9d7fa76151d06fbbac3785183c7c2ff9862d3128
-
- 17 Dec, 2021 1 commit
-
-
moto authored
Summary: Add KenLM and its dependencies required for static build (`zlib`, `bzip2`, `lzma` and `boost-thread`). The KenLM and its dependencies are build but since no corresponding code on torchaudio side is changed, the resulting torchaudio extension module is not changed. (therefore, as long as build process passes on CI this PR should be good to go.) Pull Request resolved: https://github.com/pytorch/audio/pull/2076 Reviewed By: carolineechen Differential Revision: D33189980 Pulled By: mthrok fbshipit-source-id: 6096113128b939f3cf70990c99aacc4aaa954584
-
- 06 Oct, 2021 1 commit
-
-
moto authored
-
- 29 Sep, 2021 1 commit
-
-
Caroline Chen authored
-
- 24 Sep, 2021 1 commit
-
-
Yi Zhang authored
This commit fixes the local build on Windows with CUDA.
-
- 16 Sep, 2021 1 commit
-
-
moto authored
* Split `libtorchaudio` and `_torchaudio` This change extract the core implementation from `_torchaudio` to `libtorchaudio`, so that `libtorchaudio` is reusable in TorchScript-based app. `_torchaudio` is a wrapper around `libtorchaudio` and only provides PyBind11-based features. (currently file-like object support in I/O) * Removed `BUILD_LIBTORCHAUDIO` option When invoking `cmake`, `libtorchaudio` is always built, so this option is removed. The new assumptions around the library discoverability - In regular OSS workflow (`pip`/`conda`-based binary installation), both `libtorchaudio` and `_torchaudio` are present. In this case,`libtorchaudio` has to be loaded manually with `torch.ops.load_library` and/or `torch.classes.load_library` otherwise importing `_torchaudio` would not be able to resolve the symbols defined in `libtorchaudio`. - When `torchaudio` is deployed with PEX format (single zip file) - We expect that`libtorchaudio.so` exists as a file in some search path configured by client code. - `_torchaudio` is still importable and because we do not know where `libtorchaudio` will exist, we will let the dynamic loader resolve the dependency from `_torchaudio` to `libtorchaudio`, which should work as long as `libtorchaudio` is in a library search path (search path is not modifiable from already-running Python process).
-
- 13 Sep, 2021 1 commit
-
-
Michael Melesse authored
* fix build error on ROCM * Update CMakeLists.txt Co-authored-by:
Nikita Shulga <nikita.shulga@gmail.com> * address comments and fix cuda detction on rocm Co-authored-by:
Nikita Shulga <nikita.shulga@gmail.com>
-
- 30 Aug, 2021 1 commit
-
-
Nikita Shulga authored
Needed to support CUDA builds on CPU machine Parse `TORCH_CUDA_ARCH_LIST` as new-CUDA-language Cmake-3.18+ style [CMAKE_CUDA_ARCHITECTURES](https://cmake.org/cmake/help/latest/prop_tgt/CUDA_ARCHITECTURES.html#prop_tgt:CUDA_ARCHITECTURES)
-
- 26 Aug, 2021 1 commit
-
-
moto authored
* Default to BUILD_SOX=1 in non-Windows systems Since the adaptation of CMake and restricting to the static linking of libsox, the build process has become much robust with libsox integration enabled. This commit makes it default behavior to build libsox integration in non-Windows systems. The build process still checks BUILD_SOX env var so, setting `BUILD_SOX=0` disables it.
-
- 19 Aug, 2021 1 commit
-
-
Caroline Chen authored
-
- 28 Jun, 2021 1 commit
-
-
Caroline Chen authored
-
- 06 May, 2021 1 commit
-
-
Caroline Chen authored
-
- 02 Apr, 2021 1 commit
-
-
Michael Melesse authored
-
- 09 Feb, 2021 1 commit
-
-
moto authored
-
- 08 Feb, 2021 2 commits
- 04 Feb, 2021 1 commit
-
-
moto authored
* Switch to cmake for build * Hide symbols
-