1. 27 Sep, 2021 1 commit
    • Enable audio windows cuda tests (#1777) · d98c8847
      Yi Zhang authored
      * enable windows cuda tests
      
      * add this dir
      
      * minor change
      
      * vs integration
      
      * Update cuda_install.bat
      
      * add logs
      
      * minor change
      
      * minor change
      
      * cp vision conda activate
      
      * mv vc_env_helper.bat
      
      * minor change
      
      * exit if cuda not available
      
      * install numpy
      
      * import CMakeLists
      
      * check cuda
      
      * minor change
      
      * change windows GPU image from previous to stable
      
      * set libtorch audio suffix as pyd on Windows
      
      * reduce changes
      
      * check env settings
  2. 26 Sep, 2021 1 commit
  3. 25 Sep, 2021 1 commit
  4. 24 Sep, 2021 4 commits
    • [BC-Breaking] Split pretraining and finetuning factory functions (#1783) · b2e9f1e4
      moto authored
      * [BC-Breaking] Split pretraining and finetuning factory functions
      
      Previously, the wav2vec2 factory functions only generated the fine-tuning
      architecture used in the wav2vec2 paper for the ASR task, that is, the
      pre-training architecture plus a Linear module. They did not provide a
      straightforward way to generate architectures for pre-training.
      
      The goal of the original implementation was to allow inference of
      wav2vec2 in non-Python environments via TorchScript. Now we would like to
      expand it to pre-training/fine-tuning and the HuBERT model as well.
      
      Therefore, we need factory functions for both pre-training and
      fine-tuning. This commit introduces new factory functions, with separate
      functions for pre-training and fine-tuning.
      
      1. New functions for ASR fine-tuning.

      We introduce `wav2vec2_asr_XXX` functions, which generate the architecture
      used for the fine-tuning task in the wav2vec2 paper. *1
      
      2. Re-purpose the old functions.

      The existing functions, `wav2vec2_XXX`, now generate the architecture with
      the pre-training modules only (no Linear module).
      
      Note
      *1 This architecture is just one way to define an architecture for
      fine-tuning; it is not a universal definition. The new `wav2vec2_asr_XXX`
      functions are designed to provide these specific fine-tuning configurations;
      they are not meant to support generic architectures for downstream tasks.
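
      To illustrate the split, here is a minimal sketch, assuming the base-size
      variants are named `wav2vec2_base` / `wav2vec2_asr_base` and that the ASR
      variants take the output dimension as `aux_num_out` (exact names and
      signatures may differ at this commit):

      ```
      import torch
      from torchaudio.models import wav2vec2_base, wav2vec2_asr_base

      # Pre-training architecture only: no task-specific Linear head.
      pretrain_model = wav2vec2_base()

      # Fine-tuning architecture for ASR: adds a Linear head mapping encoder
      # features to `aux_num_out` output classes (e.g. characters).
      asr_model = wav2vec2_asr_base(aux_num_out=32)

      waveform = torch.randn(1, 16000)   # one second of dummy audio at 16 kHz
      emission, _ = asr_model(waveform)  # shape: (batch, frame, aux_num_out)
      ```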
    • Fix build on Windows with CUDA (#1787) · cf0adb28
      Yi Zhang authored
      This commit fixes the local build on Windows with CUDA.
    • nateanl · 8d83a2f4
    • Yi Zhang · 56a010b0
  5. 23 Sep, 2021 1 commit
  6. 22 Sep, 2021 3 commits
    • [BC-Breaking] Move fine-tune specific module out of wav2vec2 encoder (#1782) · 40f2a085
      moto authored
      Previously, the Linear module (called `readout`, used only for an ASR
      fine-tuning task) was placed in the encoder module. Conceptually, the
      encoder has nothing to do with a module specific to a fine-tuning /
      downstream task.
      
      The problems here are that:
      1. The encoder can also be used in the pre-training phase, in which such a
      module should not be present.
      2. The choice of a Linear module is arbitrary, and a hard-coded module
      structure in the encoder is inconvenient for users.
      
      Therefore, this commit moves the Linear module out of the encoder and places
      it as the `aux` attribute of `Wav2Vec2Model`. (As a result, `Wav2Vec2Model`
      has `feature_extractor`, `encoder` and `aux` attributes.)
      
      An alternative approach is to define another module and place `Wav2Vec2Model`
      and the aux module alongside each other, but that would introduce a new class
      we need to maintain.
      The expected use of `aux` is only for 1. loading the pre-trained parameters
      published by `fairseq` (and its variations from HF) and 2. creating the same
      model architectures for comparison experiments.
      The newly introduced class would not be general enough for downstream
      adaptations, where there will be a bunch of different, more complicated
      models (e.g. s3prl).

      Therefore, based on the minimalistic approach, we put them inside `Wav2Vec2Model`.
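
      A minimal sketch of the resulting structure, assuming `aux` is `None` for
      the pre-training variants and a Linear head for the ASR variants (factory
      names as in #1783; details may differ at this commit):

      ```
      from torchaudio.models import wav2vec2_base, wav2vec2_asr_base

      # ASR fine-tuning model: the Linear head lives in `aux`, not in the encoder.
      asr_model = wav2vec2_asr_base(aux_num_out=32)
      print(asr_model.feature_extractor)  # convolutional feature extractor
      print(asr_model.encoder)            # transformer encoder, no readout inside
      print(asr_model.aux)                # Linear head applied after the encoder

      # Pre-training model: no task-specific head attached.
      pretrain_model = wav2vec2_base()
      assert pretrain_model.aux is None
      ```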
    • Fix HF model integration (#1781) · e9cab8f8
      moto authored
      * Fix HF model integration
      
      Previously, when testing wav2vec models from HF transformers, all the models
      were instantiated as the `Wav2Vec2ForCTC` class, while some of them were
      supposed to be `Wav2Vec2Model`.

      Fixing this revealed that the model importer could not correctly handle
      `Wav2Vec2Model` imports.

      This PR fixes these issues.
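
      For context, a minimal sketch of the import path in question, assuming the
      `import_huggingface_model` utility as the entry point (the exact location
      of the helper may differ at this commit):

      ```
      from transformers import Wav2Vec2Model, Wav2Vec2ForCTC
      from torchaudio.models.wav2vec2.utils import import_huggingface_model

      # A CTC fine-tuned checkpoint: carries a task-specific head.
      ctc = import_huggingface_model(
          Wav2Vec2ForCTC.from_pretrained("facebook/wav2vec2-base-960h"))

      # A pre-trained-only checkpoint must be loaded as Wav2Vec2Model rather
      # than Wav2Vec2ForCTC; this distinction is what the fix addresses.
      pretrained = import_huggingface_model(
          Wav2Vec2Model.from_pretrained("facebook/wav2vec2-base"))
      ```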
    • Update reference from master to main elsewhere (#1784) · 1b4b82e0
      moto authored

      Summary: Update fairseq reference from master to main elsewhere
      
      Reviewed By: alexeib
      
      Differential Revision: D30938472
      
      fbshipit-source-id: 243b98550207f241c9d3265bf3d4060350aaf0a8
      Co-authored-by: Diana Liskovich <dianaml@fb.com>
  7. 21 Sep, 2021 1 commit
  8. 20 Sep, 2021 3 commits
    • [BC-Breaking] Update `extract_features` of Wav2Vec2Model (#1776) · 78b08c26
      moto authored
      * [BC-Breaking] Update `extract_features` of Wav2Vec2Model
      
      Originally, the `extract_features` method returned the result from
      the convolutional feature extractor module.

      The features commonly used in downstream tasks are the outputs from
      intermediate layers of the transformer block in the encoder.

      This commit updates the behavior of `extract_features` so that such
      features can be retrieved selectively.
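
      A minimal usage sketch of the updated behavior, assuming the signature
      `extract_features(waveform, lengths=None, num_layers=None)` returning one
      Tensor per transformer layer (the signature may differ at this commit):

      ```
      import torch
      from torchaudio.models import wav2vec2_base

      model = wav2vec2_base()
      waveform = torch.randn(1, 16000)

      # Retrieve the outputs of the first four transformer layers instead of
      # the convolutional feature extractor output.
      features, lengths = model.extract_features(waveform, num_layers=4)
      print(len(features))      # 4: one Tensor per requested layer
      print(features[0].shape)  # (batch, frame, feature)
      ```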
    • Put libtorchaudio in lib directory (#1773) · 599a82b7
      moto authored
      Make the structure of the library files somewhat similar to PyTorch core, which has the following pattern:
      
      ```
      torch/_C.so
      torch/lib/libc10.so
      torch/lib/libtorch.so
      ...
      ```
      
      Correspondingly, `torchaudio` now uses:

      ```
      torchaudio/_torchaudio.so
      torchaudio/lib/libtorchaudio.so
      ```
    • Move MVDR and PSD modules to transforms (#1771) · ac97ad82
      nateanl authored
  9. 17 Sep, 2021 3 commits
  10. 16 Sep, 2021 1 commit
    • Split extension into custom impl and Python wrapper libraries (#1752) · 0f822179
      moto authored
      * Split `libtorchaudio` and `_torchaudio`
      
      This change extracts the core implementation from `_torchaudio` into
      `libtorchaudio`, so that `libtorchaudio` is reusable in TorchScript-based
      apps.

      `_torchaudio` is a wrapper around `libtorchaudio` and only provides the
      PyBind11-based features (currently, file-like object support in I/O).
      
      * Removed `BUILD_LIBTORCHAUDIO` option
      
      When invoking `cmake`, `libtorchaudio` is always built, so this option is removed.
      
      The new assumptions around library discoverability are:
      
      - In the regular OSS workflow (`pip`/`conda`-based binary installation), both `libtorchaudio` and `_torchaudio` are present.
          In this case, `libtorchaudio` has to be loaded manually with `torch.ops.load_library` and/or `torch.classes.load_library`; otherwise importing `_torchaudio` would not be able to resolve the symbols defined in `libtorchaudio`.
      - When `torchaudio` is deployed in the PEX format (a single zip file):
        - We expect `libtorchaudio.so` to exist as a file in some search path configured by the client code.
        - `_torchaudio` is still importable, and because we do not know where `libtorchaudio` will exist, we let the dynamic loader resolve the dependency from `_torchaudio` to `libtorchaudio`, which should work as long as `libtorchaudio` is in a library search path (the search path is not modifiable from an already-running Python process).
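
      A minimal sketch of the manual loading step described in the first bullet,
      assuming the library is shipped under `torchaudio/lib/` as in #1773 (the
      path resolution here is illustrative, not the package's actual loader):

      ```
      import os.path

      import torch
      import torchaudio

      # Illustrative path resolution: locate libtorchaudio inside the installed
      # torchaudio package (the file extension differs per platform).
      lib_path = os.path.join(
          os.path.dirname(torchaudio.__file__), "lib", "libtorchaudio.so")

      # Register the custom operators and TorchBind classes defined in
      # libtorchaudio before the symbols are needed by `_torchaudio`.
      torch.ops.load_library(lib_path)
      torch.classes.load_library(lib_path)
      ```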
  11. 15 Sep, 2021 4 commits
  12. 13 Sep, 2021 1 commit
  13. 07 Sep, 2021 2 commits
    • Update the way to access libsox global config (#1755) · e11d27ce
      moto authored
      * Update the way to access libsox global config
      
      Preparation for splitting `libtorchaudio` and `_torchaudio`.
      
      When two libraries are compiled separately and each of them does
      `#include <sox.h>` independently, two copies of libsox's static global
      variables (`sox_globals_t`) are created.

      Our code should refer to the same instance. To achieve this, `_torchaudio`
      should access the global variable defined in `libtorchaudio` via the custom
      utility functions, and it should not use `sox_get_globals` directly.
    • Extract PyBind11 feature implementations (#1739) · 2a67fcc1
      moto authored
      This PR moves the code related to PyBind11 to the dedicated directory `torchaudio/csrc/pybind`.

      Before, the features related to PyBind11 (I/O for file-like objects) were implemented in `torchaudio/csrc/sox`, and the binding was defined in `torchaudio/csrc/pybind.cpp`. We used the macro definition `TORCH_API_INCLUDE_EXTENSION_H` to turn the feature on/off, in addition to including/excluding `torchaudio/csrc/pybind.cpp` in the list of compiled sources.

      Previously, for the C++ example, one had to rebuild libtorchaudio separately; by splitting them completely at compile time, it should be conceptually possible to distribute libtorchaudio within the torchaudio Python package and reuse it for the C++ example.
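
      For reference, a minimal sketch of the file-like object I/O that this
      PyBind11 layer backs, assuming the sox_io backend accepts any object with
      a `read` method (illustrative usage, not a spec of the binding):

      ```
      import io

      import torchaudio

      # File-like object support is the PyBind11-backed feature: torchaudio.load
      # accepts an in-memory buffer in addition to a file path.
      with open("test.wav", "rb") as f:
          buffer = io.BytesIO(f.read())

      waveform, sample_rate = torchaudio.load(buffer)
      ```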
  14. 02 Sep, 2021 6 commits
  15. 01 Sep, 2021 1 commit
  16. 31 Aug, 2021 4 commits
  17. 30 Aug, 2021 3 commits