Commits · 36f5010b9b61092cf03d2e5f39708b1ec2f0caeb · OpenDAS / Torchaudio

26 Oct, 2023 1 commit

Swap decoder/encoder implementation · 36f5010b

moto-meta authored Oct 26, 2023

Differential Revision: D50677606

Pull Request resolved: https://github.com/pytorch/audio/pull/3681

25 Oct, 2023 1 commit

Prep for restructure (#3676) · 7c988b43

moto authored Oct 25, 2023

Add torio top-level directory. It's not part of the package yet.

12 Oct, 2023 1 commit

Simplify the logic to initialize FFmpeg · f62367a6

moto-meta authored Oct 12, 2023

Differential Revision: D50193749

Pull Request resolved: https://github.com/pytorch/audio/pull/3650

09 Oct, 2023 1 commit

Migrate to src-layout · ec13a815

moto-meta authored Oct 09, 2023

Differential Revision: D49965263

Pull Request resolved: https://github.com/pytorch/audio/pull/3639

12 Jul, 2023 1 commit

Support multiple FFmpeg versions (#3464) · 786066b4

moto authored Jul 11, 2023

Summary:
This commit introduces support for multiple FFmpeg versions for OSS binary distributions.

Currently torchaudio only works with FFmpeg 4, which is inconvenient at every stage from installation to runtime linking.
This commit allows picking FFmpeg 4, 5 or 6 at runtime, instead of looking only for v4.

We compile the FFmpeg extension three times, once against each supported FFmpeg version, and ship all three builds.
At runtime, we look for a specific version of libavutil and, when one is found, load the corresponding FFmpeg extension.
The order of preference is 6, 5, then 4.
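
The lookup described above can be sketched roughly as follows. This is an illustration only, not the code from this commit: the function names, the injectable `can_load` probe and the per-platform file names are assumptions. The mapping from FFmpeg major release to libavutil major version (4 to 56, 5 to 57, 6 to 58) follows FFmpeg's own versioning.

```python
import ctypes
import sys
from typing import Callable, Optional

# libavutil major version that ships with each FFmpeg major release.
LIBAVUTIL_MAJOR = {6: 58, 5: 57, 4: 56}


def _library_name(major: int) -> str:
    """Hypothetical helper: platform-specific file name of libavutil."""
    if sys.platform == "darwin":
        return f"libavutil.{major}.dylib"
    if sys.platform == "win32":
        return f"avutil-{major}.dll"
    return f"libavutil.so.{major}"


def _can_load(name: str) -> bool:
    """Return True if the dynamic loader can open the named library."""
    try:
        ctypes.CDLL(name)
        return True
    except OSError:
        return False


def pick_ffmpeg_version(
    can_load: Callable[[str], bool] = _can_load,
) -> Optional[int]:
    """Return the first FFmpeg major version whose libavutil is loadable.

    Versions are tried in the order of preference 6, 5, then 4.
    Returns None when no supported libavutil is found.
    """
    for ffmpeg_major in (6, 5, 4):
        if can_load(_library_name(LIBAVUTIL_MAJOR[ffmpeg_major])):
            return ffmpeg_major
    return None
```

The caller would then load the extension that was compiled against the returned version. Passing the probe in as an argument keeps the preference order testable without any FFmpeg installed.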

To make the build process simple and reproducible, we use pre-built binaries of FFmpeg during the build.
They are LGPL builds, downloaded from S3 at build time rather than compiled from source on every build.

Using pre-built binaries as scaffolding limits the systems that can build torchaudio, so this commit also introduces
a single-FFmpeg-version support mode. Setting FFMPEG_ROOT during the build changes the way the binaries are built
so that they support only one specific version of FFmpeg.

Pull Request resolved: https://github.com/pytorch/audio/pull/3464

Differential Revision: D47300223

Pulled By: mthrok

fbshipit-source-id: 560c7968315e4c8922afa11a4693f648c0356d04

28 Apr, 2023 1 commit

Add cuctc decoder (#3096) · 0a1801ed

Yuekai Zhang authored Apr 28, 2023

Summary:
This PR implements a CUDA-based CTC prefix beam search decoder.

Several benchmark results measured on a V100 are attached below:
| decoder type | model | dataset | decoding time (secs) | beam size | batch size | model unit | subsampling times | vocab size |
|--------------|-------|---------|----------------------|-----------|------------|------------|-------------------|------------|
| cuctc | conformer nemo | dev clean | 7.68 | 8 | 32 | bpe | 4 | 1000 |
| cuctc | conformer nemo | dev clean (sort by length) | 1.6 | 8 | 32 | bpe | 4 | 1000 |
| cuctc | wav2vec2.0 torchaudio | dev clean | 22 | 10 | 1 | char | 2 | 29 |
| cuctc | conformer espnet | aishell1 test | 5 | 10 | 24 | char | 4 | 4233 |

Note:
1. The design parallelizes computation along the batch and vocab axes and loops sequentially over the frame axis. Compared with CPU implementations, it is therefore better suited to shorter sequences and larger vocabularies.
2. WER is the same as with CPU implementations. However, it cannot decode with an LM yet.

Resolves: https://github.com/pytorch/audio/issues/2957.

Pull Request resolved: https://github.com/pytorch/audio/pull/3096

Reviewed By: nateanl

Differential Revision: D44709397

Pulled By: mthrok

fbshipit-source-id: 3078c54a2b44dc00eb4a81b4c657487eeff8c155

03 Apr, 2023 1 commit

Migrate the binding of FFmpeg utils to PyBind11 (#3228) · 61c31bc0

moto authored Apr 03, 2023

Summary:
Utility functions are only available to Python, so there is no need to use TorchBind for them.
This should allow us to remove the link-whole flag when linking the `libtorchaudio_ffmpeg` part.

Pull Request resolved: https://github.com/pytorch/audio/pull/3228

Reviewed By: nateanl

Differential Revision: D44639560

Pulled By: mthrok

fbshipit-source-id: 5116073ee8c5ab572c63ad123942c4826bfe1100

17 Mar, 2023 1 commit

Cache HW device context (#3178) · 0c8c138c

moto authored Mar 17, 2023

Summary:
TODO: add cache release

Pull Request resolved: https://github.com/pytorch/audio/pull/3178

Reviewed By: hwangjeff

Differential Revision: D44136275

Pulled By: mthrok

fbshipit-source-id: 4eaf646fe17a469e8bbbdf43441d5532f9f8461d

08 Feb, 2023 1 commit

Update the guard mechanism for FFmpeg-related features (#3028) · 98b3ac17

moto authored Feb 08, 2023

Summary:
Instead of raising an error when the lazy import happens, this approach allows the features to be imported and raises an error only when a feature is actually used.

This makes it easy to adopt the same error mechanism across different modules. It is how it is already done for sox-related features.
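
A minimal sketch of such a guard, assuming a stand-in object is returned when the optional extension cannot be imported. The names `_FailOnUse` and `import_or_stub` are hypothetical and are not taken from torchaudio.

```python
import importlib
from typing import Any


class _FailOnUse:
    """Stand-in for a missing module: importing works, using it fails."""

    def __init__(self, feature: str, reason: str) -> None:
        self._feature = feature
        self._reason = reason

    def _fail(self) -> None:
        raise RuntimeError(
            f"{self._feature} is not available: {self._reason}"
        )

    def __getattr__(self, name: str) -> Any:
        # Only called for attributes that were not found normally.
        # Keep dunder lookups well-behaved for tools like copy or hasattr.
        if name.startswith("__"):
            raise AttributeError(name)
        self._fail()

    def __call__(self, *args: Any, **kwargs: Any) -> Any:
        self._fail()


def import_or_stub(module_name: str, feature: str) -> Any:
    """Import a module, or return a stub that raises on first use."""
    try:
        return importlib.import_module(module_name)
    except ImportError as err:
        return _FailOnUse(feature, str(err))
```

With this pattern, importing the guarded module never fails on its own; the `RuntimeError` surfaces only at the call site that needs the missing feature, and the same stub can be reused by any module with optional dependencies.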

Pull Request resolved: https://github.com/pytorch/audio/pull/3028

Reviewed By: xiaohui-zhang

Differential Revision: D42966976

Pulled By: mthrok

fbshipit-source-id: 423dfe0b8a3970cd07f20e841c794c7f2809f993

30 Jan, 2023 1 commit

Add get_build_config ffmpeg utility function (#3014) · 635d8cff

moto authored Jan 29, 2023

Summary:
When debugging an issue, we often need to check which FFmpeg was found and linked.

The version number alone is often not enough, but there is no easy way to find where the library was found either.

This commit adds a utility function that prints the build-time configuration.

It helps to distinguish whether the linked FFmpeg is the one from the binary distribution built in CI or a locally built one.

Pull Request resolved: https://github.com/pytorch/audio/pull/3014

Reviewed By: hwangjeff

Differential Revision: D42794952

Pulled By: mthrok

fbshipit-source-id: 91ed358fde8cfe9d6d950f34742b1722e729cf4e

06 Jan, 2023 1 commit

Add utility functions to fetch available formats/devices/codecs/protocols. (#2958) · b6d147ad

moto authored Jan 06, 2023

Summary:
This commit adds utility functions that fetch the available/supported formats/devices/codecs.

These functions are mostly the same as commands like `ffmpeg -decoders`. However, the `ffmpeg` CLI can report different results if there are multiple installations of FFmpeg, or the CLI might not be available at all.

Pull Request resolved: https://github.com/pytorch/audio/pull/2958

Reviewed By: hwangjeff

Differential Revision: D42371640

Pulled By: mthrok

fbshipit-source-id: 96a96183815a126cb1adc97ab7754aef216fff6f

03 Oct, 2022 1 commit

Adopt :autosummary: to multiple modules (#2664) · ef1ba56f

moto authored Oct 03, 2022

Summary:
Adopt `:autosummary:` to various modules

    * torchaudio.compliance.kaldi
    * torchaudio.sox_effects
    * torchaudio.utils

Pull Request resolved: https://github.com/pytorch/audio/pull/2664

Reviewed By: nateanl

Differential Revision: D39841873

Pulled By: mthrok

fbshipit-source-id: ff4fa6976324fca5f35b737b715f976e2a722bac

27 Jun, 2022 1 commit

Add utility function to fetch FFmpeg library versions (#2467) · 4ba7dc38

moto authored Jun 27, 2022

Summary:
Follow-up to https://github.com/pytorch/audio/issues/2464. Add a utility function to fetch the versions of the FFmpeg libraries.

Pull Request resolved: https://github.com/pytorch/audio/pull/2467

Reviewed By: carolineechen

Differential Revision: D37028006

Pulled By: mthrok

fbshipit-source-id: 72adce1e6b43985760ce55b715b0e59af5244fdb

04 Jun, 2022 1 commit

Make FFmpeg log level configurable (#2439) · 877a88c5

moto authored Jun 03, 2022

Summary:
Undesired logs are one of the loudest UX complaints we get.
Yet loading media files involves uncertainty, which is
difficult to debug without debug logs.

This commit introduces utility functions to configure the logging level
so that we can ask users to enable it when they encounter an issue,
while defaulting to the non-verbose option.
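
For reference, FFmpeg's libavutil expresses log levels as integers. The sketch below lists them as an enum and adds a small name-to-level parser; the names `FFmpegLogLevel` and `parse_log_level` are hypothetical helpers for illustration and are not part of this commit.

```python
from enum import IntEnum


class FFmpegLogLevel(IntEnum):
    """Integer log levels as defined in FFmpeg's libavutil/log.h."""

    QUIET = -8
    PANIC = 0
    FATAL = 8
    ERROR = 16
    WARNING = 24
    INFO = 32
    VERBOSE = 40
    DEBUG = 48
    TRACE = 56


def parse_log_level(name: str) -> int:
    """Translate a case-insensitive level name into FFmpeg's integer."""
    return int(FFmpegLogLevel[name.strip().upper()])
```

Assuming the setter introduced here takes an integer level, a user asked to enable debug logs could pass `parse_log_level("debug")` to it and restore the previous level afterwards.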

Pull Request resolved: https://github.com/pytorch/audio/pull/2439

Reviewed By: hwangjeff, xiaohui-zhang

Differential Revision: D36903763

Pulled By: mthrok

fbshipit-source-id: f4ddd9915b13197c2a2eb97e965005b8b5b8d987
