Commits · 575d221ec4b394fcc8e5bb8147d0e7ba012afe56 · OpenDAS / Torchaudio

22 Dec, 2021 1 commit

Revert linting exemptions introduced in #2071 (#2087) · 575d221e

Joao Gomes authored Dec 22, 2021

Summary:
After discussing with Moto Hira, we decided to revert the linting exemptions
introduced previously, in order to keep the entire audio project as consistently
formatted as possible and to reduce the time we spend on formatting discussions.

Pull Request resolved: https://github.com/pytorch/audio/pull/2087

Reviewed By: mthrok

Differential Revision: D33236949

Pulled By: jdsgomes

fbshipit-source-id: e13079f532c4534d8a168059b0ded6fa375fdecf


21 Dec, 2021 3 commits

Clean up CTC decoder binding code (#2092) · 4c2edd21

Moto Hira authored Dec 21, 2021

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2092

Reviewed By: carolineechen

Differential Revision: D33169110

fbshipit-source-id: e422ad93efe50b91f1ac5d572dc82768c1000c05


Update audio augmentation tutorial (#2082) · 3a03d8c0

moto authored Dec 20, 2021

Summary:
1. Reorder the audio displays so that the audio is playable from the browser in the docs
2. Add links to the function documentation

https://470342-90321822-gh.circle-artifacts.com/0/docs/tutorials/audio_data_augmentation_tutorial.html

Pull Request resolved: https://github.com/pytorch/audio/pull/2082

Reviewed By: carolineechen

Differential Revision: D33227725

Pulled By: mthrok

fbshipit-source-id: c7ee360b6f9b84c8e0a9b72193b98487d03b57ab


Fix load behavior for 24-bit input (#2084) · 4554d242

moto authored Dec 20, 2021

Summary:
## bug description

When 24 bits-per-sample audio is loaded via a file-like object,
the loaded Tensor is wrong. It is fine when the audio is loaded
from a local file.

## The cause of the bug

The core of sox's decoding mechanism is the `sox_read` function,
one of whose parameters is the maximum number of samples to decode
from the given buffer.

https://fossies.org/dox/sox-14.4.2/formats_8c.html#a2a4f0194a0f919d4f38c57b81aa2c06f

The `sox_read` function is called in the callback of the so-called
`drain` effect, and this callback receives the output buffer and its
size in bytes. The previous implementation passed this size value as
the argument of `sox_read` for the maximum number of samples to
read. Since the buffer size in bytes is larger than the number of
samples that fit in the buffer, `sox_read` always consumed the entire
buffer. (This behavior is harmless except when the input is
24 bits-per-sample and read from a file-like object.)

When the input is read from a file-like object, new data are fetched
inside the drain callback via Python's `read` method and loaded into
a fixed-size memory region. The size of this memory region
can be adjusted via `torchaudio.utils.sox_utils.set_buffer_size`,
but the default value is 8096.

If the input format is 24 bits-per-sample, the end of the memory region
does not necessarily correspond to the end of a valid sample.
When `sox_read` consumes all the data in the buffer region, the incomplete
data at the end introduces some unexpected values.
This causes the aforementioned bug.

## Fix

Pass a proper (better estimated) maximum number of decodable samples to
`sox_read`.
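
To make the misalignment concrete, the sketch below counts how many complete samples fit in a byte buffer. It is not the code from the PR: the helper name `whole_samples` is hypothetical, and the only input taken from the description is the buffer size quoted above. It merely illustrates why a size in bytes cannot stand in for a number of samples.

```python
from typing import Tuple


def whole_samples(buffer_bytes: int, bits_per_sample: int) -> Tuple[int, int]:
    """Return (complete samples, leftover bytes) for a raw byte buffer."""
    bytes_per_sample = bits_per_sample // 8
    return divmod(buffer_bytes, bytes_per_sample)


if __name__ == "__main__":
    # Buffer size as quoted in the description; the widths are examples.
    for bits in (16, 24, 32):
        samples, leftover = whole_samples(8096, bits)
        print(f"{bits}-bit: {samples} samples, {leftover} leftover bytes")
```

Of the three widths, only the 24-bit case leaves leftover bytes, which is consistent with the observation that the end of the memory region need not coincide with the end of a valid sample.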

Pull Request resolved: https://github.com/pytorch/audio/pull/2084

Reviewed By: carolineechen

Differential Revision: D33236947

Pulled By: mthrok

fbshipit-source-id: 171d9b7945f81db54f98362a68b20f2f95bb11a4


20 Dec, 2021 3 commits

Standardize the location of third-party source code (#2086) · 2476dd2d

moto authored Dec 20, 2021

Summary:
Previously, sox-related third-party source code was archived at
`third_party/sox/archives`.
Recently, KenLM-related third-party source code was added, and
it is archived at `third_party/archives`.

This PR changes the sox archive location to `third_party/archives`,
so that all the archives are cached at the same location.

Pull Request resolved: https://github.com/pytorch/audio/pull/2086

Reviewed By: carolineechen

Differential Revision: D33236927

Pulled By: mthrok

fbshipit-source-id: 2f2aa5f4b386fefb46d7c98f7179c04995219f3c


Update URLs for libritts (#2074) · f3f23e42

Joao Gomes authored Dec 20, 2021

Summary:
The URLs for this dataset seem to have changed, so I am updating them to the new location.

Pull Request resolved: https://github.com/pytorch/audio/pull/2074

Reviewed By: mthrok

Differential Revision: D33234996

Pulled By: jdsgomes

fbshipit-source-id: e09c35a122e8227fcce7fa97aeeeea312cb89173


Remove unnecessary sources from KenLM build (#2085) · db5ac7de

moto authored Dec 20, 2021

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2085

Reviewed By: carolineechen

Differential Revision: D33235225

Pulled By: mthrok

fbshipit-source-id: 47fe9ec4c93a26322b3a362202ddd3c4654c3f8c


18 Dec, 2021 1 commit

Add FL Decoder / KenLM integration to build process (#2078) · 246dd52a

moto authored Dec 18, 2021

Summary:
Now that all the C++ code from https://github.com/pytorch/audio/issues/2072 has been added, this commit enables the decoder/KenLM integration in the build process.

Pull Request resolved: https://github.com/pytorch/audio/pull/2078

Reviewed By: carolineechen

Differential Revision: D33198183

Pulled By: mthrok

fbshipit-source-id: 9d7fa76151d06fbbac3785183c7c2ff9862d3128


17 Dec, 2021 4 commits

Add C++ files for CTC decoder bindings (#2079) · 396d16cb

Caroline Chen authored Dec 17, 2021

Summary:
part of https://github.com/pytorch/audio/issues/2072 -- splitting up the PR for easier review

Add C++ files for binding CTC decoder functionality for Python

Note: the code here will not be compiled until the build process is changed

Pull Request resolved: https://github.com/pytorch/audio/pull/2079

Reviewed By: mthrok

Differential Revision: D33196286

Pulled By: carolineechen

fbshipit-source-id: 9fe4a8635b60ebfb594918bab00f5c3dccf96bd2


Add C++ files for CTC decoder (#2075) · 24baa243

Caroline Chen authored Dec 17, 2021

Summary:
part of https://github.com/pytorch/audio/issues/2072 -- splitting up the PR for easier review

Add C++ files from [flashlight](https://github.com/flashlight/flashlight) that are needed for building CTC decoder w/ Lexicon and KenLM support

Note: the code here will not be compiled until the build process is changed (future PR)

Pull Request resolved: https://github.com/pytorch/audio/pull/2075

Reviewed By: mthrok

Differential Revision: D33186825

Pulled By: carolineechen

fbshipit-source-id: 5b69eea7634f3fae686471d988422942bb784cd9


Add static build of KenLM (#2076) · adc559a8

moto authored Dec 17, 2021

Summary:
Add KenLM and its dependencies required for static build (`zlib`, `bzip2`, `lzma` and `boost-thread`).

KenLM and its dependencies are built, but since no corresponding code on the torchaudio side is changed, the resulting torchaudio extension module is unchanged. (Therefore, as long as the build process passes on CI, this PR should be good to go.)

Pull Request resolved: https://github.com/pytorch/audio/pull/2076

Reviewed By: carolineechen

Differential Revision: D33189980

Pulled By: mthrok

fbshipit-source-id: 6096113128b939f3cf70990c99aacc4aaa954584


Introduce helper function to define extension (#2077) · c02faf04

moto authored Dec 17, 2021

Summary:
Similar to https://github.com/pytorch/audio/issues/2040, this commit refactors the part of CMakeLists.txt
that defines the extension module, so that a second extension can be added
easily.

Pull Request resolved: https://github.com/pytorch/audio/pull/2077

Reviewed By: carolineechen

Differential Revision: D33189998

Pulled By: mthrok

fbshipit-source-id: dc562ce5360332479a7493c21a2930c6fcc6be84


15 Dec, 2021 1 commit

Excluding sphinx-gallery examples (#2071) · dba00177

Joao Gomes authored Dec 15, 2021

Summary:
In order to align with the internal configuration and with torchvision, we decided to exclude sphinx-gallery examples from the lint checks.

cc NicolasHug mthrok

Pull Request resolved: https://github.com/pytorch/audio/pull/2071

Reviewed By: NicolasHug

Differential Revision: D33091124

Pulled By: jdsgomes

fbshipit-source-id: ffda2dde9115f0590cbde7785007cf811caca7ef


11 Dec, 2021 1 commit

Add CUDA-11.5 builds to torchaudio (#2067) · 0a701058

Andrey Talman authored Dec 10, 2021

Summary:
cc peterjc123 maxluk nbcsm guyang3532 gunandrose4u smartcat2010 mszhanyi

Pull Request resolved: https://github.com/pytorch/audio/pull/2067

Reviewed By: seemethere

Differential Revision: D33032607

Pulled By: atalman

fbshipit-source-id: a5767e9af27690d3a7ab762ddf30178b3069cd35


10 Dec, 2021 3 commits

Fix CircleCI test failures (#2069) · 71c2ae77

Zhaoheng Ni authored Dec 10, 2021

Summary:
The unit test failures seem to be caused by [conda 4.11](https://github.com/conda/conda/issues/11096).
Removing the conda update line fixes the issue.

Pull Request resolved: https://github.com/pytorch/audio/pull/2069

Reviewed By: carolineechen

Differential Revision: D33023851

Pulled By: nateanl

fbshipit-source-id: 73246189d4ccc541e366a5367f532a5b456af8f8


Add bucketize sampler and dataset for HuBERT Base model training pipeline (#2000) · ddb9fb5b

nateanl authored Dec 10, 2021

Summary:
The PR adds a PyTorch Lightning based training script for the HuBERT Base model. There are two iterations of pre-training and one iteration of ASR fine-tuning on the LibriSpeech dataset.

Pull Request resolved: https://github.com/pytorch/audio/pull/2000

Reviewed By: carolineechen

Differential Revision: D33021467

Pulled By: nateanl

fbshipit-source-id: 77fe5a751943b56b63d5f1fb4e6ef35946e081db


OSS config for lint checks (#2066) · 7d092896

Joao Gomes authored Dec 10, 2021

Summary:
Following up on [this comment](https://github.com/pytorch/audio/pull/2056#issuecomment-988356439), I am separating the config changes from the formatting.

cc NicolasHug  mthrok

Pull Request resolved: https://github.com/pytorch/audio/pull/2066

Reviewed By: mthrok

Differential Revision: D32990377

Pulled By: jdsgomes

fbshipit-source-id: 67a6251a51901702ad10ae43c35609a09cbf5c5c


08 Dec, 2021 1 commit

Add decoder class (#2042) · 34e1d24f

moto authored Dec 07, 2021

Summary:
Part of https://github.com/pytorch/audio/issues/1986. Splitting the PR for easier review.

Add a `Decoder` class that manages the `AVCodecContext` resource and processes input `AVPacket`s.
For the overall architecture, see https://github.com/mthrok/audio/blob/ffmpeg/torchaudio/csrc/ffmpeg/README.md.

Note: Without a change to build process, the code added here won't be compiled. The build process will be updated later.
Needs to be imported after https://github.com/pytorch/audio/issues/2041.

Pull Request resolved: https://github.com/pytorch/audio/pull/2042

Reviewed By: carolineechen

Differential Revision: D32933294

Pulled By: mthrok

fbshipit-source-id: e443debadb44d491462fb641cd5b7b20c413b5b9


07 Dec, 2021 1 commit

Add wrapper classes that manage memories allocated by ffmpeg (#2041) · e0280cf5

moto authored Dec 07, 2021

Summary:
Part of https://github.com/pytorch/audio/issues/1986. Splitting the PR for easier review.

Add wrapper classes that automatically release memory allocated by the ffmpeg libraries.
For the overall architecture, see https://github.com/mthrok/audio/blob/ffmpeg/torchaudio/csrc/ffmpeg/README.md.

Note: Without a change to build process, the code added here won't be compiled. The build process will be updated later.
- [x] Needs to be imported after updating TARGETS file.

Pull Request resolved: https://github.com/pytorch/audio/pull/2041

Reviewed By: carolineechen

Differential Revision: D32688964

Pulled By: mthrok

fbshipit-source-id: 165bef5b292dbedae4e9599d53fb2a3f06978db8


04 Dec, 2021 1 commit

Refactor and functionize the library definition (#2040) · 5a2d114d

moto authored Dec 03, 2021

Summary:
(See https://github.com/pytorch/audio/issues/2038 description for the overall goal.)

This commit turns the part that defines `libtorchaudio` into a function
so that it becomes easy to define libraries in the same way as `libtorchaudio`.

Built on top of https://github.com/pytorch/audio/issues/2039

Pull Request resolved: https://github.com/pytorch/audio/pull/2040

Reviewed By: hwangjeff

Differential Revision: D32851990

Pulled By: mthrok

fbshipit-source-id: a8206c62b076bc0849ada1a66c7502ae5ea35e28


03 Dec, 2021 5 commits

Add "+cpu" suffix to doc build job (#2060) · 70c7e7e1

moto authored Dec 03, 2021

Summary:
While updating the documentation in release/0.10, a HIP error was raised.
https://app.circleci.com/pipelines/github/pytorch/audio/8577/workflows/02c6ff44-a042-4f9a-8fb8-573a231f60db/jobs/452639

This happens because `pip install torchaudio -f https://...` defaults to
the ROCm version, while `build_doc` is supposed to pick the CPU version.

Adding the suffix `+cpu` should resolve the issue.

It is validated on https://github.com/pytorch/audio/pull/2060 https://app.circleci.com/pipelines/github/pytorch/audio/8584/workflows/25ae26e5-273f-46f8-805d-ffc7b6b8eb58/jobs/453337

Pull Request resolved: https://github.com/pytorch/audio/pull/2060

Reviewed By: carolineechen

Differential Revision: D32846765

Pulled By: mthrok

fbshipit-source-id: e6b3b32646388b8c4ba864639f8b62d8b9d39844


Clean up libtorchaudio customization logic (#2039) · a401dcb8

moto authored Dec 03, 2021

Summary:
(See https://github.com/pytorch/audio/issues/2038 description for the overall goal.)
This PR cleans up CMake customization logic for `libtorchaudio`.

It introduces base variables LIBTORCHAUDIO_INCLUDE_DIRS,
LIBTORCHAUDIO_LINK_LIBRARIES and LIBTORCHAUDIO_COMPILE_DEFINITIONS,
which are respectively used when calling `target_include_directories`,
`target_link_libraries` and `target_compile_definitions`.

The customization logic only modifies these variables.

The original implementation called these functions multiple times
(once per customization), which was making the customization logic
difficult to understand.

Pull Request resolved: https://github.com/pytorch/audio/pull/2039

Reviewed By: carolineechen, nateanl

Differential Revision: D32683004

Pulled By: mthrok

fbshipit-source-id: 4d41f21692ac139b1185a6ab69eb45d881ee7e73


Adding warnings in mu_law* for the wrong input type (#2034) · 338d38a2

Joao Gomes authored Dec 03, 2021

Summary:
Addresses  https://github.com/pytorch/audio/issues/1493

cc mthrok hwangjeff

Pull Request resolved: https://github.com/pytorch/audio/pull/2034

Reviewed By: hwangjeff

Differential Revision: D32807006

Pulled By: mthrok

fbshipit-source-id: badf148646c5f768328c5a4e51bd6016b0be46f3


Add training recipe for RNN-T Emformer ASR model (#2052) · 7ac525e7

hwangjeff authored Dec 03, 2021

Summary:
Add training recipe for RNN-T Emformer ASR model to examples directory.

Pull Request resolved: https://github.com/pytorch/audio/pull/2052

Reviewed By: nateanl

Differential Revision: D32814096

Pulled By: hwangjeff

fbshipit-source-id: a5153044efc16cb39f0e6413369a6791637af76a


improve cuda installation on windows (#2032) · 4b11eee8

Yi Zhang authored Dec 02, 2021

Summary:
1. Stop and disable Windows Update, which is the major reason for the CUDA installation failures
    https://app.circleci.com/pipelines/github/pytorch/audio/8458/workflows/feb65e3b-1093-4724-b849-1a2ac166f354/jobs/441331
    For more details, see https://github.com/pytorch/pytorch/issues/64536
2. Print the log when the CUDA installation fails

Pull Request resolved: https://github.com/pytorch/audio/pull/2032

Reviewed By: mthrok

Differential Revision: D32816145

Pulled By: malfet

fbshipit-source-id: 44a2ef0dd4c43469472a6e518ed64841e2dcd5bb


02 Dec, 2021 1 commit

Refactor the library loading mechanism (#2038) · 9114e636

moto authored Dec 02, 2021

Summary:
(This is part of a refactoring series, followed up by https://github.com/pytorch/audio/issues/2039 and https://github.com/pytorch/audio/issues/2040.
The goal is to make it easy to add a new library artifact alongside `libtorchaudio`, as in https://github.com/pytorch/audio/pull/2048/commits/4ced990849e60f6d19e87ae22819b04d1726648e https://github.com/pytorch/audio/issues/2048 .)

We plan to add prototype/beta third-party library integrations,
which could be unstable (segfaults, missing dynamic library dependencies, etc.).

If we add such integrations into the existing libtorchaudio,
in the worst case, it will prevent users from even running `import torchaudio`.

Instead, we would like to separate the prototype/beta integrations
into separate libraries, so that such issues affect only the users
who attempt to use these prototype/beta features.

Say, a prototype feature `foo` is added in `torchaudio.prototype.foo`.
The following initialization procedure will achieve the above mechanism.

1. Place the library file `libtorchaudio_foo` in `torchaudio/lib`.
2. In `torchaudio.prototype.foo.__init__.py`, load the `libtorchaudio_foo`.

Note:
The approach will be slightly different for fbcode, because of how buck deploys
C++ libraries and standardized environment, but the code change here is still
applicable.
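
A minimal sketch of the two-step procedure above. The helper `load_feature_library`, its parameters, and the file-name pattern are assumptions made for illustration and are not taken from the PR. The loader is passed in so that the sketch runs without PyTorch; in a real package it would plausibly be `torch.ops.load_library`.

```python
from pathlib import Path
from typing import Callable, Optional


def load_feature_library(
    lib_dir: Path, name: str, loader: Callable[[str], None]
) -> Optional[Path]:
    """Load `<lib_dir>/<name>.*` if present and return its path, else None.

    Hypothetical helper: intended to be called from the feature's own
    `__init__.py`, so a broken library only affects users of that feature.
    """
    candidates = sorted(p for p in lib_dir.glob(f"{name}.*") if p.is_file())
    if not candidates:
        return None  # library not built; the core package still imports
    loader(str(candidates[0]))
    return candidates[0]
```

Because the call lives in the prototype subpackage rather than in the top-level package, a failure while loading `libtorchaudio_foo` would surface only when `torchaudio.prototype.foo` is imported.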

Pull Request resolved: https://github.com/pytorch/audio/pull/2038

Reviewed By: carolineechen, nateanl

Differential Revision: D32682900

Pulled By: mthrok

fbshipit-source-id: 0f402a92a366fba8c2894a0fe01f47f8cdd51376


30 Nov, 2021 2 commits

Revise Griffin-Lim transform test to reduce execution time (#2037) · 96b1fa72

hwangjeff authored Nov 30, 2021

Summary:
Our Griffin-Lim autograd tests take a long time to run. This PR adjusts some parameters to shorten the run time.

For one of the four tests:
Before:
```
test/torchaudio_unittest/transforms/autograd_cpu_test.py . [100%]

======================== 1 passed in 517.35s (0:08:37) =========================
```

After:
```
test/torchaudio_unittest/transforms/autograd_cpu_test.py . [100%]

======================== 1 passed in 104.59s (0:01:44) =========================
```

Pull Request resolved: https://github.com/pytorch/audio/pull/2037

Reviewed By: mthrok

Differential Revision: D32726213

Pulled By: hwangjeff

fbshipit-source-id: c785323ab380aea4b63fb1683b557c8ae842f54e


Allow whitespace as TORCH_CUDA_ARCH_LIST delimiter (#2050) · e83d4177

moto authored Nov 30, 2021

Summary:
Resolves https://github.com/pytorch/audio/issues/2049, https://github.com/pytorch/audio/issues/1940
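
One way to accept both delimiters is to split on semicolons and whitespace together. This is a sketch rather than the code from the PR: `parse_arch_list` is a hypothetical name, and treating `;` as the previously supported delimiter is an assumption.

```python
import re
from typing import List


def parse_arch_list(value: str) -> List[str]:
    """Split a TORCH_CUDA_ARCH_LIST-style string on ';' or whitespace."""
    # Assumption: entries never contain ';' or whitespace themselves.
    return [item for item in re.split(r"[;\s]+", value.strip()) if item]
```

For example, `parse_arch_list("3.5;5.0 6.0")` yields the three entries `3.5`, `5.0` and `6.0` regardless of which delimiter separates them.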

Pull Request resolved: https://github.com/pytorch/audio/pull/2050

Reviewed By: nateanl

Differential Revision: D32712513

Pulled By: mthrok

fbshipit-source-id: e1db81786bcca67605ff765d27e0527e20967d1c


24 Nov, 2021 3 commits

improve installing nightly pytorch (#2026) · fce431cd

Yi Zhang authored Nov 24, 2021

Summary:
Similar to https://github.com/pytorch/vision/pull/4788
Make sure the workflow can download the right PyTorch build (CPU or CUDA) in case the nightly build is not ready on that day.

https://app.circleci.com/pipelines/github/pytorch/audio/8427/workflows/11a80738-bcdd-45e3-b37f-328be36c60ee/jobs/438285?invite=true#step-107-542

![image](https://user-images.githubusercontent.com/16190118/142969390-142df7ec-6040-40c1-9a02-17d43f5de05e.png)

Pull Request resolved: https://github.com/pytorch/audio/pull/2026

Reviewed By: hwangjeff, nateanl

Differential Revision: D32634926

Pulled By: mthrok

fbshipit-source-id: 30d6349a0a2ce174b789a5888b1c8e0544a23a37


Update script for getting PR merger and labels (#2030) · 9392c9e0

Caroline Chen authored Nov 23, 2021

Summary:
The previous way of detecting the merger and labels given a commit hash no longer works with ShipIt, as PRs are closed and not merged and are not associated with a commit hash. To work around this, update the script to get the merger (pulled by: ) and PR number from the commit hash message, and then collect labels from the corresponding PR.
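
The extraction step described above can be sketched with two regular expressions over the commit message. The function name `parse_commit_message` and the patterns are illustrative assumptions, not the contents of the actual script; collecting labels from the PR is a separate step that is not shown.

```python
import re
from typing import Optional, Tuple

# Trailer formats as they appear in the commit messages of this log.
PR_PATTERN = re.compile(r"Pull Request resolved: \S+/pull/(\d+)")
MERGER_PATTERN = re.compile(r"Pulled By: (\S+)")


def parse_commit_message(message: str) -> Tuple[Optional[int], Optional[str]]:
    """Return (PR number, merger) found in a commit message, or None for each."""
    pr = PR_PATTERN.search(message)
    merger = MERGER_PATTERN.search(message)
    return (
        int(pr.group(1)) if pr else None,
        merger.group(1) if merger else None,
    )
```

Returning `None` for a missing trailer lets the caller decide how to handle commits that did not go through this import flow.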

Pull Request resolved: https://github.com/pytorch/audio/pull/2030

Reviewed By: mthrok

Differential Revision: D32634870

Pulled By: carolineechen

fbshipit-source-id: a8fcfc5912871d3cca056de43ab25b5d0acb2226


Add RNN-T beam search decoder (#2028) · 60a85b50

hwangjeff authored Nov 23, 2021

Summary:
Adds beam search decoder for RNN-T implementation ``torchaudio.prototype.RNNT`` that is TorchScript-able and supports both streaming and non-streaming inference.

Pull Request resolved: https://github.com/pytorch/audio/pull/2028

Reviewed By: mthrok

Differential Revision: D32627919

Pulled By: hwangjeff

fbshipit-source-id: aab99e346d6514a3207a9fb69d4b42978b4cdbbd


23 Nov, 2021 2 commits

Update datasets document (#2029) · 9c9aef88

moto authored Nov 23, 2021

Summary:
- Remove unnecessary content list
- Remove legacy description

Pull Request resolved: https://github.com/pytorch/audio/pull/2029

Reviewed By: carolineechen

Differential Revision: D32629917

Pulled By: mthrok

fbshipit-source-id: bc9a9366c681bcf8b74907c2a6459c73fb6a7424


Temporarily skip threadpool test (#2025) · 05ae795a

moto authored Nov 23, 2021

Summary:
The sox_effects test that runs in `concurrent.futures.ThreadPoolExecutor` started failing a couple of days ago. Skipping the test while this is investigated.
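
For illustration, a temporary skip of this kind can be expressed with the standard `unittest.skip` decorator. The test class, the test name, and the placeholder workload below are hypothetical and do not reproduce the actual torchaudio test.

```python
import unittest
from concurrent.futures import ThreadPoolExecutor


class ThreadPoolSmokeTest(unittest.TestCase):
    @unittest.skip("Temporarily disabled while the failure is investigated")
    def test_in_thread_pool(self):
        # Placeholder workload; the real test applies sox effects instead.
        with ThreadPoolExecutor(max_workers=2) as pool:
            results = list(pool.map(abs, [-1, -2, -3]))
        self.assertEqual(results, [1, 2, 3])
```

A skipped test is still reported by the runner, which keeps it visible until it is re-enabled.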

Pull Request resolved: https://github.com/pytorch/audio/pull/2025

Reviewed By: nateanl

Differential Revision: D32615933

Pulled By: mthrok

fbshipit-source-id: 4f7301c0d3c0d11f687011e42e06d9c87ce4197f


22 Nov, 2021 3 commits

Relax dtype for MVDR (#2024) · 392a03c8

Zhaoheng Ni authored Nov 22, 2021

Summary:
Allow users to use `torch.cfloat` dtype input for the MVDR module. It internally converts the spectrogram into `torch.cdouble` and outputs the tensor with the original dtype of the spectrogram.

Pull Request resolved: https://github.com/pytorch/audio/pull/2024

Reviewed By: carolineechen

Differential Revision: D32594051

Pulled By: nateanl

fbshipit-source-id: e32609ccdc881b36300d579c90daba41c9234b46


Fix minor typo (#2012) · 358354aa

Albert Villanova del Moral authored Nov 22, 2021

Summary:
Fix minor typo in docs.

Pull Request resolved: https://github.com/pytorch/audio/pull/2012

Reviewed By: nateanl

Differential Revision: D32562618

Pulled By: mthrok

fbshipit-source-id: 79262a14d9b10381249602a63f400232031abaa2


Improve MVDR stability (#2004) · fb2f9538

Zhaoheng Ni authored Nov 22, 2021

Summary:
Division first, multiplication second. This helps avoid the value overflow issue. It also helps the ``stv_evd`` solution pass the gradient check.
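
The effect of the ordering can be seen with plain floating-point scalars. This is only an illustration with hypothetical function names and made-up values; the PR itself operates on tensors inside the MVDR module.

```python
import math


def multiply_then_divide(a: float, b: float, c: float) -> float:
    """The intermediate product a * b may overflow before the division."""
    return (a * b) / c


def divide_then_multiply(a: float, b: float, c: float) -> float:
    """Dividing first keeps the intermediate value in range."""
    return (a / c) * b


if __name__ == "__main__":
    a = b = c = 1e200
    print(math.isinf(multiply_then_divide(a, b, c)))  # overflowed intermediate
    print(divide_then_multiply(a, b, c))
```

Both orderings are mathematically equivalent, but only the second keeps every intermediate value representable for these inputs.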

Pull Request resolved: https://github.com/pytorch/audio/pull/2004

Reviewed By: mthrok

Differential Revision: D32539827

Pulled By: nateanl

fbshipit-source-id: 70a386608324bb6e1b1c7238c78d403698590f22


19 Nov, 2021 3 commits

Disable SPHINXOPT=-W for local env (#2013) · 3ff46bfa

moto authored Nov 19, 2021

Summary:
With the introduction of tutorials, the turnaround time for the doc build
has become longer. By default, the tutorials are not built, but SPHINXOPT=-W
treats this as an error.

This commit disables the option for local builds while keeping it
for CI.

Pull Request resolved: https://github.com/pytorch/audio/pull/2013

Reviewed By: carolineechen

Differential Revision: D32538952

Pulled By: mthrok

fbshipit-source-id: eae4ffd87100dff466f91abfe26a82aa702d605a


Inplace initialisation of RNN weights (#2010) · e076f37c

krishnakalyan3 authored Nov 19, 2021

Summary:
Ref: https://github.com/pytorch/audio/issues/1993

Pull Request resolved: https://github.com/pytorch/audio/pull/2010

Reviewed By: mthrok

Differential Revision: D32539511

Pulled By: nateanl

fbshipit-source-id: e99be963123cbc039d79bdb514450b7e8f5a84fc


Update to xavier_uniform and avoid legacy data.uniform_ initialization (#2018) · b3830bac

krishnakalyan3 authored Nov 19, 2021

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2018

Reviewed By: hwangjeff, mthrok

Differential Revision: D32540498

Pulled By: nateanl

fbshipit-source-id: f104fbf84c0f489906b6555f21f931b60fedb59e


18 Nov, 2021 1 commit

Add Emformer RNN-T model (#2003) · 78ce7010

hwangjeff authored Nov 18, 2021

Summary:
Adds streaming-capable recurrent neural network transducer (RNN-T) model that uses Emformer for its transcription network. Includes two factory functions — one that allows for building a custom model, and one that builds a preconfigured base model.

Pull Request resolved: https://github.com/pytorch/audio/pull/2003

Reviewed By: nateanl

Differential Revision: D32440879

Pulled By: hwangjeff

fbshipit-source-id: 601cb1de368427f25e3b7d120e185960595d2360
