- 07 Jun, 2023 1 commit
-
-
moto authored
Summary: To investigate https://github.com/pytorch/audio/issues/3411 Pull Request resolved: https://github.com/pytorch/audio/pull/3418 Differential Revision: D46535891 Pulled By: mthrok fbshipit-source-id: b90bba399eb54f9f0ae073bd590cd8a46054ed7e
-
- 05 Jun, 2023 1 commit
-
-
moto authored
Summary: Follow up of: https://github.com/pytorch/audio/pull/3368 Remove files and lines no longer used. Pull Request resolved: https://github.com/pytorch/audio/pull/3403 Differential Revision: D46441462 Pulled By: mthrok fbshipit-source-id: 11b881ec4b24fa0d625c6aee9f4bd91f637f9923
-
- 03 Jun, 2023 1 commit
-
-
Moto Hira authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3402 This is a second attempt at https://github.com/pytorch/audio/pull/3353. The basic logic to enable dlopen for FFmpeg libraries is the same. It uses `at::DynamicLibrary`, which allows torchaudio to be compiled without linking FFmpeg libraries. This time, an option to enable this feature, DLOPEN_FFMPEG, has been added, so that users have a way to disable the feature and keep using build-time linking. Please refer to stub.h for more technical detail. Differential Revision: D46403783 fbshipit-source-id: ca3db57ff6bdc50c8c225d22f12f3e76c6dc3f16
-
- 20 May, 2023 1 commit
-
-
Zhaoheng Ni authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3348 The pull request adds a CTC-based forced alignment function that supports both CPU and CUDA devices. The function takes the CTC emissions and target labels as inputs and generates the corresponding labels for each frame. Reviewed By: vineelpratap, xiaohui-zhang Differential Revision: D45867265 fbshipit-source-id: 3e25b06bf9bc8bb1bdcdc08de7f4434d912154cb
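To make the input/output contract above concrete, here is a minimal pure-Python sketch of CTC forced alignment using textbook Viterbi decoding. It is not the CPU/CUDA implementation added by the pull request; the function name `ctc_forced_align`, the list-of-lists emission layout, and the choice of index 0 for the blank label are assumptions made for illustration.

```python
import math


def ctc_forced_align(emission, targets, blank=0):
    """Return the most likely per-frame labels that collapse to `targets`.

    emission: list of T frames, each a list of per-class log-probabilities.
    targets:  list of label indices (without blanks).
    Hypothetical helper for illustration only.
    """
    if not targets:
        return [blank] * len(emission)
    if not emission:
        raise ValueError("cannot align targets to an empty emission")

    # Interleave blanks: [blank, y1, blank, y2, ..., yL, blank]
    ext = [blank]
    for label in targets:
        ext += [label, blank]

    num_frames, num_states = len(emission), len(ext)
    score = [[-math.inf] * num_states for _ in range(num_frames)]
    back = [[0] * num_states for _ in range(num_frames)]

    # A path may start in the leading blank or in the first label.
    score[0][0] = emission[0][ext[0]]
    score[0][1] = emission[0][ext[1]]

    for t in range(1, num_frames):
        for s in range(num_states):
            # Allowed predecessors: stay, advance by one state, or skip a
            # blank between two different labels.
            candidates = [s]
            if s >= 1:
                candidates.append(s - 1)
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                candidates.append(s - 2)
            prev = max(candidates, key=lambda p: score[t - 1][p])
            score[t][s] = score[t - 1][prev] + emission[t][ext[s]]
            back[t][s] = prev

    # A valid path ends in the trailing blank or in the last label.
    last = num_frames - 1
    s = num_states - 1 if score[last][-1] >= score[last][-2] else num_states - 2
    if score[last][s] == -math.inf:
        raise ValueError("targets are too long to align to the emission")

    # Walk the back-pointers to recover one label per frame.
    path = []
    for t in range(last, -1, -1):
        path.append(ext[s])
        s = back[t][s]
    path.reverse()
    return path
```

The result has one label per frame, with the blank index marking frames that carry no target label.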
-
- 28 Apr, 2023 1 commit
-
-
Yuekai Zhang authored
Summary: This PR implements a CUDA-based CTC prefix beam search decoder. Several benchmark results using V100 are attached below:

| decoder type | model | datasets | decoding time (secs) | beam size | batch size | model unit | subsampling times | vocab size |
|--------------|---------|------|-----------------|------------|-------------|------------|-----------------------|------------|
| cuctc | conformer nemo | dev clean | 7.68 | 8 | 32 | bpe | 4 | 1000 |
| cuctc | conformer nemo | dev clean (sort by length) | 1.6 | 8 | 32 | bpe | 4 | 1000 |
| cuctc | wav2vec2.0 torchaudio | dev clean | 22 | 10 | 1 | char | 2 | 29 |
| cuctc | conformer espnet | aishell1 test | 5 | 10 | 24 | char | 4 | 4233 |

Note:
1. The design parallelizes computation along the batch and vocab axes and loops over the frame axis. Compared with CPU implementations, it is therefore better suited to shorter sequences and larger vocabularies.
2. WER is the same as with the CPU implementations. However, decoding with an LM is not supported yet.

Resolves: https://github.com/pytorch/audio/issues/2957. Pull Request resolved: https://github.com/pytorch/audio/pull/3096 Reviewed By: nateanl Differential Revision: D44709397 Pulled By: mthrok fbshipit-source-id: 3078c54a2b44dc00eb4a81b4c657487eeff8c155
-
- 05 Apr, 2023 1 commit
-
-
moto authored
Summary: Following https://github.com/pytorch/audio/pull/3232, static build of flashlight-text has been disabled and removed from nightly build. This commit removes the related source/build from torchaudio code base. Pull Request resolved: https://github.com/pytorch/audio/pull/3236 Reviewed By: jacobkahn Differential Revision: D44712539 Pulled By: mthrok fbshipit-source-id: a201c89b5046f224526309cd4e17a5105e58a949
-
- 04 Apr, 2023 1 commit
-
-
moto authored
Summary: As we migrate to upstream flashlight-text and KenLM, this PR disables building the CTC decoder by default. This will stop shipping the flashlight-text and KenLM bundle in the torchaudio binary. Ref: https://github.com/pytorch/audio/issues/3088 cc jacobkahn Pull Request resolved: https://github.com/pytorch/audio/pull/3232 Reviewed By: hwangjeff Differential Revision: D44650872 Pulled By: mthrok fbshipit-source-id: 2415623abaf3cafa181135db5112d3c711137cd7
-
- 14 Feb, 2023 1 commit
-
-
Zhaoheng Ni authored
Summary: replicate of https://github.com/pytorch/audio/issues/2644 Pull Request resolved: https://github.com/pytorch/audio/pull/2880 Reviewed By: mthrok Differential Revision: D41633911 Pulled By: nateanl fbshipit-source-id: 73cf145d75c389e996aafe96571ab86dc21f86e5
-
- 09 Feb, 2023 1 commit
-
-
DanilBaibak authored
Summary: We don't need the presence of physical HW to compile with CUDA. This is a follow up PR regarding `USE_ROCM` for issue https://github.com/pytorch/audio/issues/2979. Pull Request resolved: https://github.com/pytorch/audio/pull/3008 Reviewed By: malfet Differential Revision: D42708862 Pulled By: DanilBaibak fbshipit-source-id: 90cedc80a2d180ca1e0912ad5b644398182417b8
-
- 23 Jan, 2023 1 commit
-
-
Nikita Shulga authored
Summary: We don't need the presence of physical HW to compile with CUDA. Likely one of the causes of https://github.com/pytorch/audio/issues/2979 (i.e. in CircleCI builds USE_CUDA was defined by the CI environment, so nobody ever checked the default, but this is not the case in Nova builds) Pull Request resolved: https://github.com/pytorch/audio/pull/3005 Test Plan: Check that `compute.cu` is mentioned in builds, for example see https://github.com/pytorch/audio/actions/runs/3990295262/jobs/6843771056#step:9:829
```
[193/202] /usr/local/cuda-11.6/bin/nvcc -forward-unknown-to-host-compiler -DINCLUDE_KALDI -DUSE_C10D_GLOO -DUSE_C10D_NCCL -DUSE_CUDA -DUSE_DISTRIBUTED -DUSE_RPC -DUSE_TENSORPIPE -Dlibtorchaudio_EXPORTS -I/__w/audio/audio/pytorch/audio -I/__w/audio/audio/pytorch/audio/third_party/kaldi/src -I/__w/audio/audio/pytorch/audio/third_party/kaldi/submodule/src -isystem=/__w/_temp/conda_environment_3990295262/lib/python3.7/site-packages/torch/include -isystem=/__w/_temp/conda_environment_3990295262/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -isystem=/usr/local/cuda-11.6/include -DONNX_NAMESPACE=onnx_c2 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -gencode arch=compute_50,code=compute_50 -Xcudafe --diag_suppress=cc_clobber_ignored,--diag_suppress=integer_sign_change,--diag_suppress=useless_using_declaration,--diag_suppress=set_but_not_used,--diag_suppress=field_without_dll_interface,--diag_suppress=base_class_has_different_dll_interface,--diag_suppress=dll_interface_conflict_none_assumed,--diag_suppress=dll_interface_conflict_dllexport_assumed,--diag_suppress=implicit_return_from_non_void_function,--diag_suppress=unsigned_compare_with_zero,--diag_suppress=declared_but_not_referenced,--diag_suppress=bad_friend_decl --expt-relaxed-constexpr --expt-extended-lambda -O3 -DNDEBUG -Xcompiler=-fPIC -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++17 -MD -MT torchaudio/csrc/CMakeFiles/libtorchaudio.dir/rnnt/gpu/compute.cu.o -MF torchaudio/csrc/CMakeFiles/libtorchaudio.dir/rnnt/gpu/compute.cu.o.d -x cu -c /__w/audio/audio/pytorch/audio/torchaudio/csrc/rnnt/gpu/compute.cu -o torchaudio/csrc/CMakeFiles/libtorchaudio.dir/rnnt/gpu/compute.cu.o
```
Reviewed By: mthrok Differential Revision: D42687455 Pulled By: malfet fbshipit-source-id: c37ad58cc62439d1268865e9bf0bcb97079a529f
-
- 21 Dec, 2022 1 commit
-
-
moto authored
Summary: This commit makes the following changes to the C++ library organization
- Move sox-related feature implementations from `libtorchaudio` to `libtorchaudio_sox`.
- Remove the C++ implementations of `is_sox_available` and `is_ffmpeg_available`, as it is now sufficient to check the existence of `libtorchaudio_sox` and `libtorchaudio_ffmpeg` to determine availability. This makes `libtorchaudio_sox` and `libtorchaudio_ffmpeg` independent from `libtorchaudio`.
- Move PyBind11-based bindings (`_torchaudio_sox`, `_torchaudio_ffmpeg`) into `torchaudio.lib` so that the built library structure is less cluttered.

Background: Originally, `libsox` was the only C++ extension and `libtorchaudio` was supposed to contain all the C++ code. Things are different now. We have a bunch of C++ extensions and we need to make the code/build structure more modular. The new `libtorchaudio_sox` contains the implementations and `_torchaudio_sox` contains the PyBind11-based bindings. Pull Request resolved: https://github.com/pytorch/audio/pull/2929 Reviewed By: hwangjeff Differential Revision: D42159594 Pulled By: mthrok fbshipit-source-id: 1a0fbca9e4143137f6363fc001b2378ce6029aa7
-
- 12 Aug, 2022 1 commit
-
-
Andrey Talman authored
Summary: Introducing pytorch-cuda metapackage. Same as: https://github.com/pytorch/vision/pull/6371 Following PR: https://github.com/pytorch/builder/pull/1094 Adds a CUDA metapackage called pytorch-cuda. This way we can make sure to install the correct version of the CUDA dependencies and don't depend on conda-forge. Pull Request resolved: https://github.com/pytorch/audio/pull/2612 Reviewed By: hwangjeff, seemethere, nateanl Differential Revision: D38633332 Pulled By: atalman fbshipit-source-id: 78a6115bb252ebdb6d66a57d7d2c4a4978ddb501
-
- 29 Jul, 2022 1 commit
-
-
moto authored
Summary: This commit enables the CTC decoder on Windows. The functionality seems to work fine. The tests are passing, and the decoding tutorial runs fine. The only difference from the Linux/macOS version is that loading a model in XZ compression format is not supported. Pull Request resolved: https://github.com/pytorch/audio/pull/2587 Reviewed By: carolineechen, nateanl Differential Revision: D38276490 Pulled By: mthrok fbshipit-source-id: f2203b2235c5bbb0220fe560aaaf0e1d5530347a
-
- 28 Jul, 2022 1 commit
-
-
moto authored
Summary: This commit gets rid of our copy of the CTC decoder code and replaces it with the upstream Flashlight-Text repo. Pull Request resolved: https://github.com/pytorch/audio/pull/2580 Reviewed By: carolineechen Differential Revision: D38244906 Pulled By: mthrok fbshipit-source-id: d274240fc67675552d19ff35e9a363b9b9048721
-
- 02 Jun, 2022 1 commit
-
-
moto authored
Summary: Remove the code related to libmad, which had been disabled in https://github.com/pytorch/audio/issues/2354. In https://github.com/pytorch/audio/issues/2419, we moved MP3 decoding to ffmpeg, but CI tests were still using libmad. This commit completely removes libmad from torchaudio. This is a BC-breaking change, as the `apply_sox_effects_file` function cannot handle MP3 and cannot fall back to ffmpeg. The workaround for this is to use `torchaudio.load` then `apply_sox_effects_tensor`. Pull Request resolved: https://github.com/pytorch/audio/pull/2428 Reviewed By: carolineechen Differential Revision: D36851805 Pulled By: mthrok fbshipit-source-id: f98795c59a1ac61cef511f2bbeac37f7c3c69d55
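The workaround named in the message can be sketched roughly as below. This assumes the function names in the commit refer to `torchaudio.sox_effects.apply_effects_tensor` and that the installed torchaudio version still ships the sox effects module; the helper name `apply_effects_to_mp3` is hypothetical.

```python
def apply_effects_to_mp3(path, effects):
    """Decode an MP3 with `torchaudio.load`, then apply sox effects to the tensor.

    Hypothetical helper: returns the (waveform, sample_rate) pair produced
    by the effect chain.
    """
    # Imported lazily so the sketch can be defined without torchaudio installed.
    import torchaudio

    # Decoding happens here instead of inside the sox file-based function.
    waveform, sample_rate = torchaudio.load(path)
    # Effects are given as lists of strings, e.g. [["rate", "16000"]].
    return torchaudio.sox_effects.apply_effects_tensor(waveform, sample_rate, effects)
```

A call might look like `apply_effects_to_mp3("speech.mp3", [["rate", "16000"], ["channels", "1"]])`, where the file name is a placeholder.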
-
- 21 May, 2022 1 commit
-
-
moto authored
Summary: This commit adds file-like object support to Streaming API.

## Features
- File-like objects are expected to implement `read(self, n)`.
- Additionally `seek(self, offset, whence)` is used if available.
- Without `seek` method, some formats cannot be decoded properly.
- To work around this, one can use the existing `decoder` option to tell what decoder it should use.
- The set of `decoder` and `decoder_option` arguments were added to `add_basic_[audio|video]_stream` method, similar to `add_[audio|video]_stream`.
- So as to have the arguments common to both audio and video in front of the rest of the arguments, the order of the arguments is changed.
- Also `dtype` and `format` arguments were changed to make them consistent across audio/video methods.

## Code structure
The approach is very similar to how file-like object is supported in sox-based I/O. In Streaming API, if the input src is a string, it is passed to the implementation bound with TorchBind; if the src has a `read` attribute, it is passed to the same implementation bound via PyBind11.

## Refactoring involved
- Extracted to https://github.com/pytorch/audio/issues/2402
- Some implementation in the original TorchBind surface layer is converted to Wrapper class so that they can be re-used from PyBind11 bindings. The wrapper class serves to simplify the binding.
- `add_basic_[audio|video]_stream` methods were removed from C++ layer as it was just constructing string and passing it to `add_[audio|video]_stream` method, which is simpler to do in Python.
- The original core Streamer implementation kept the use of types in `c10` namespace minimum. All the `c10::optional` and `c10::Dict` were converted to the equivalents of `std` at binding layer. But since they work fine with PyBind11, Streamer core methods deal with them directly.

## TODO:
- [x] Check if it is possible to stream MP4 (yuv420p) from S3 and directly decode (with/without HW decoding).

Pull Request resolved: https://github.com/pytorch/audio/pull/2400 Reviewed By: carolineechen Differential Revision: D36520073 Pulled By: mthrok fbshipit-source-id: a11d981bbe99b1ff0cc356e46264ac8e76614bc6
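The file-like protocol and the string/`read` dispatch described above can be illustrated with a small standalone sketch. It does not call torchaudio; `ReadOnlySource` and `describe_source` are hypothetical names, and the returned strings only label which branch a source would take.

```python
import io


class ReadOnlySource:
    """Hypothetical wrapper exposing only `read(n)`, like a non-seekable stream."""

    def __init__(self, raw):
        self._raw = raw

    def read(self, n):
        # Return up to `n` bytes; an empty bytes object signals end of stream.
        return self._raw.read(n)


def describe_source(src):
    """Classify `src` the way the message describes the dispatch."""
    if isinstance(src, str):
        # Strings (paths/URLs) go to the string-based binding.
        return "path"
    if hasattr(src, "read"):
        # Objects with `read` go to the file-like binding; `seek` is optional.
        return "seekable file-like" if hasattr(src, "seek") else "read-only file-like"
    raise TypeError("src must be a string or an object with a `read` method")
```

For a source that lacks `seek`, the message suggests passing the `decoder` option explicitly, since some formats cannot be probed without seeking.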
-
- 13 May, 2022 1 commit
-
-
moto authored
Summary: This commit moves the Streaming API out of prototype module.

* The related classes are renamed as follows
  - `Streamer` -> `StreamReader`.
  - `SourceStream` -> `StreamReaderSourceStream`
  - `SourceAudioStream` -> `StreamReaderSourceAudioStream`
  - `SourceVideoStream` -> `StreamReaderSourceVideoStream`
  - `OutputStream` -> `StreamReaderOutputStream`

  This change is a preemptive measure for the possibility of adding a `StreamWriter` API.
* Replace BUILD_FFMPEG build arg with USE_FFMPEG

  We are not building FFmpeg, so USE_FFMPEG is more appropriate

---

After https://github.com/pytorch/audio/issues/2377

Remaining TODOs: (different PRs)
- [ ] Introduce `is_ffmpeg_binding_available` function.
- [ ] Refactor C++ code:
  - Rename `Streamer` to `StreamReader`.
  - Rename `streamer.[h|cpp]` to `stream_reader.[h|cpp]`.
  - Rename `prototype.cpp` to `stream_reader_binding.cpp`.
  - Introduce `stream_reader` directory.
- [x] Enable FFmpeg in smoke test (https://github.com/pytorch/audio/issues/2381)

Pull Request resolved: https://github.com/pytorch/audio/pull/2378 Reviewed By: carolineechen Differential Revision: D36359299 Pulled By: mthrok fbshipit-source-id: 6a57b702996af871e577fb7addbf3522081c1328
-
- 28 Apr, 2022 1 commit
-
-
moto authored
Summary: libmad integration should be enabled only from source-build Pull Request resolved: https://github.com/pytorch/audio/pull/2354 Reviewed By: nateanl Differential Revision: D36012035 Pulled By: mthrok fbshipit-source-id: adeda8cbfd418f96245909cae6862b648a6915a7
-
- 30 Dec, 2021 1 commit
-
-
moto authored
Summary: This PR adds `BUILD_FFMPEG` switch to torchaudio build process so that features related to ffmpeg are built. The flag is false by default, so no CI jobs or development flow are affected. This is because handling the dependencies around ffmpeg is a bit tricky. Currently, the CMake file uses `pkg-config` to find an ffmpeg installation in the system. This works fine for both conda-based installation and system-managed installation (like `apt`). In subsequent PRs, I will find a solution that works for local development and binary distributions. Pull Request resolved: https://github.com/pytorch/audio/pull/2048 Reviewed By: hwangjeff, nateanl Differential Revision: D33367260 Pulled By: mthrok fbshipit-source-id: 94517acecb62bd6d4e96d4b7cbc3ab3c2a25706c
-
- 23 Dec, 2021 1 commit
-
-
Joao Gomes authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2096 run: `arc lint --apply-patches --paths-cmd 'hg files -I "./**/*.py"'` Reviewed By: mthrok Differential Revision: D33297351 fbshipit-source-id: 7bf5956edf0717c5ca90219f72414ff4eeaf5aa8
-
- 18 Dec, 2021 1 commit
-
-
moto authored
Summary: After all the C++ code from https://github.com/pytorch/audio/issues/2072 is added, this commit will enable decoder/KenLM integration in the build process. Pull Request resolved: https://github.com/pytorch/audio/pull/2078 Reviewed By: carolineechen Differential Revision: D33198183 Pulled By: mthrok fbshipit-source-id: 9d7fa76151d06fbbac3785183c7c2ff9862d3128
-
- 17 Dec, 2021 1 commit
-
-
moto authored
Summary: Add KenLM and its dependencies required for static build (`zlib`, `bzip2`, `lzma` and `boost-thread`). KenLM and its dependencies are built, but since no corresponding code on the torchaudio side is changed, the resulting torchaudio extension module is not changed. (Therefore, as long as the build process passes on CI, this PR should be good to go.) Pull Request resolved: https://github.com/pytorch/audio/pull/2076 Reviewed By: carolineechen Differential Revision: D33189980 Pulled By: mthrok fbshipit-source-id: 6096113128b939f3cf70990c99aacc4aaa954584
-
- 30 Nov, 2021 1 commit
-
-
moto authored
Summary: Resolves https://github.com/pytorch/audio/issues/2049, https://github.com/pytorch/audio/issues/1940 Pull Request resolved: https://github.com/pytorch/audio/pull/2050 Reviewed By: nateanl Differential Revision: D32712513 Pulled By: mthrok fbshipit-source-id: e1db81786bcca67605ff765d27e0527e20967d1c
-
- 06 Oct, 2021 2 commits
- 20 Sep, 2021 1 commit
-
-
moto authored
Make the structure of library files somewhat similar to PyTorch core, which has the following pattern
```
torch/_C.so
torch/lib/libc10.so
torch/lib/libtorch.so
...
```
```
torchaudio/_torchaudio.so
torchaudio/lib/libtorchaudio.so
```
-
- 16 Sep, 2021 1 commit
-
-
moto authored
* Split `libtorchaudio` and `_torchaudio`

  This change extracts the core implementation from `_torchaudio` to `libtorchaudio`, so that `libtorchaudio` is reusable in TorchScript-based apps. `_torchaudio` is a wrapper around `libtorchaudio` and only provides PyBind11-based features (currently file-like object support in I/O).
* Removed `BUILD_LIBTORCHAUDIO` option

  When invoking `cmake`, `libtorchaudio` is always built, so this option is removed.

The new assumptions around library discoverability:
- In the regular OSS workflow (`pip`/`conda`-based binary installation), both `libtorchaudio` and `_torchaudio` are present. In this case, `libtorchaudio` has to be loaded manually with `torch.ops.load_library` and/or `torch.classes.load_library`; otherwise importing `_torchaudio` would not be able to resolve the symbols defined in `libtorchaudio`.
- When `torchaudio` is deployed in PEX format (single zip file):
  - We expect that `libtorchaudio.so` exists as a file in some search path configured by client code.
  - `_torchaudio` is still importable, and because we do not know where `libtorchaudio` will exist, we will let the dynamic loader resolve the dependency from `_torchaudio` to `libtorchaudio`, which should work as long as `libtorchaudio` is in a library search path (the search path is not modifiable from an already-running Python process).
-
- 13 Sep, 2021 1 commit
-
-
Michael Melesse authored
* fix build error on ROCM
* Update CMakeLists.txt

  Co-authored-by: Nikita Shulga <nikita.shulga@gmail.com>
* address comments and fix cuda detection on rocm

  Co-authored-by: Nikita Shulga <nikita.shulga@gmail.com>
-
- 30 Aug, 2021 1 commit
-
-
Nikita Shulga authored
Needed to support CUDA builds on a CPU machine. Parse `TORCH_CUDA_ARCH_LIST` as new-CUDA-language CMake-3.18+ style [CMAKE_CUDA_ARCHITECTURES](https://cmake.org/cmake/help/latest/prop_tgt/CUDA_ARCHITECTURES.html#prop_tgt:CUDA_ARCHITECTURES)
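As an illustration of the kind of translation involved, the sketch below converts a `TORCH_CUDA_ARCH_LIST`-style string into a CMake `CUDA_ARCHITECTURES`-style list. It is not the code from this commit: the function name is hypothetical, only numeric entries such as `8.6` or `8.6+PTX` are handled (named architectures are not), and mapping a missing `+PTX` to the `-real` suffix is an assumption based on CMake's documented suffix semantics.

```python
import re


def torch_arch_list_to_cmake(arch_list):
    """Convert e.g. "3.5;5.0 8.6+PTX" into "35-real;50-real;86".

    Hypothetical helper for illustration only.
    """
    archs = []
    # Entries may be separated by semicolons or whitespace.
    for item in re.split(r"[;\s]+", arch_list.strip()):
        if not item:
            continue
        match = re.fullmatch(r"(\d+)\.(\d+)(\+PTX)?", item)
        if match is None:
            raise ValueError(f"unsupported architecture entry: {item!r}")
        major, minor, ptx = match.groups()
        # In CMake, an entry without a suffix builds both real (SASS) and
        # virtual (PTX) code, while "-real" builds SASS only.
        archs.append(f"{major}{minor}" + ("" if ptx else "-real"))
    return ";".join(archs)
```

The resulting string could then be passed to CMake as `-DCMAKE_CUDA_ARCHITECTURES=...`.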
-
- 26 Aug, 2021 1 commit
-
-
moto authored
* Default to BUILD_SOX=1 in non-Windows systems

  Since the adoption of CMake and the restriction to static linking of libsox, the build process has become much more robust with libsox integration enabled. This commit makes it the default behavior to build libsox integration on non-Windows systems. The build process still checks the BUILD_SOX env var, so setting `BUILD_SOX=0` disables it.
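The default-plus-override behavior described above might look like the following sketch. The helper names and the accepted spellings of true/false values are assumptions; the actual build script may parse the variable differently.

```python
import os
import platform


def get_build_flag(name, default, environ=None):
    """Read a boolean build switch such as BUILD_SOX from the environment.

    Hypothetical helper: returns `default` when the variable is unset.
    """
    environ = os.environ if environ is None else environ
    value = environ.get(name)
    if value is None:
        return default
    value = value.strip().lower()
    if value in ("1", "true", "on", "yes"):
        return True
    if value in ("0", "false", "off", "no"):
        return False
    raise ValueError(f"unexpected value for {name}: {value!r}")


def default_build_sox(system=None):
    """libsox integration defaults to enabled everywhere except Windows."""
    return (system or platform.system()) != "Windows"
```

Under these assumptions, `get_build_flag("BUILD_SOX", default_build_sox())` yields the effective setting, and exporting `BUILD_SOX=0` turns the integration off.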
-
- 19 Aug, 2021 1 commit
-
-
Caroline Chen authored
-
- 28 Jun, 2021 2 commits
-
-
Caroline Chen authored
-
Caroline Chen authored
-
- 06 May, 2021 1 commit
-
-
Caroline Chen authored
-
- 02 Apr, 2021 1 commit
-
-
Michael Melesse authored
-
- 05 Mar, 2021 1 commit
-
-
Caroline Chen authored
-
- 03 Mar, 2021 1 commit
-
-
Caroline Chen authored
-
- 09 Feb, 2021 1 commit
-
-
moto authored
-
- 04 Feb, 2021 1 commit
-
-
moto authored
* Switch to cmake for build
* Hide symbols
-
- 12 Jan, 2021 1 commit
-
-
moto authored
With this change, `BUILD_TRANSDUCER=1 python setup.py build_ext` now sees `-D_GLIBCXX_USE_CXX11_ABI=` in the compilation command. (Note: sox is C-only, so it is not relevant to the sox build process.)

See also:
- https://github.com/pytorch/text/pull/931
- https://stackoverflow.com/a/55406930
-