Commits · eb8e8dc84fa6b3e3174bfdc82b19035a624f7c3d · OpenDAS / Torchaudio

28 Dec, 2021 1 commit

Add Sphinx gallery automatically (#2101) · eb8e8dc8

moto authored Dec 28, 2021

Summary:
This commit updates the documentation configuration so that if an API (function or class) is used in tutorials, links to those tutorials are added automatically.

It also adds `py:func:` so that it's easy to jump from tutorials to the API reference.

Note: the use of `py:func:` is not required for an API to be recognized by Sphinx-gallery.

* https://482162-90321822-gh.circle-artifacts.com/0/docs/transforms.html#feature-extractions

<img width="776" alt="Screen Shot 2021-12-24 at 12 41 43 PM" src="https://user-images.githubusercontent.com/855818/147367407-cd86f114-7177-426a-b5ee-a25af17ae476.png">

* https://482162-90321822-gh.circle-artifacts.com/0/docs/transforms.html#mvdr

<img width="769" alt="Screen Shot 2021-12-24 at 12 42 31 PM" src="https://user-images.githubusercontent.com/855818/147367422-01fd245f-2f25-4875-a206-910e17ae0161.png">
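For context, Sphinx-gallery's back-reference feature is usually enabled through two keys in `sphinx_gallery_conf`. The fragment below is a minimal sketch of what such a `conf.py` excerpt might look like, not the configuration from this commit; all directory names are illustrative assumptions.

```python
# Hypothetical excerpt of a docs conf.py (all paths are assumptions).
sphinx_gallery_conf = {
    # Where the tutorial scripts live and where the rendered gallery goes.
    "examples_dirs": ["../../examples/tutorials"],
    "gallery_dirs": ["tutorials"],
    # Directory in which Sphinx-gallery writes one "<object>.examples"
    # file for each API object it finds used in a tutorial.
    "backreferences_dir": "gen_modules/backreferences",
    # Only objects that belong to these modules get back-references.
    "doc_module": ("torchaudio",),
}
```

With back-references generated, an API page can pull in the matching tutorial thumbnails, for example through Sphinx-gallery's `minigallery` directive.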

Pull Request resolved: https://github.com/pytorch/audio/pull/2101

Reviewed By: hwangjeff

Differential Revision: D33311283

Pulled By: mthrok

fbshipit-source-id: e0c124d2a761e0f8d81c3d14c4ffc836ffffe288

eb8e8dc8

23 Dec, 2021 3 commits

Add Python CTC decoder API (#2089) · a76b0066

Caroline Chen authored Dec 23, 2021

Summary:
Part of https://github.com/pytorch/audio/issues/2072 -- splitting up PR for easier review

This PR adds the Python decoder API and a basic README.

Pull Request resolved: https://github.com/pytorch/audio/pull/2089

Reviewed By: mthrok

Differential Revision: D33299818

Pulled By: carolineechen

fbshipit-source-id: 778ec3692331e95258d3734f0d4ab60b6618ddbc

a76b0066

Apply arc lint to pytorch audio (#2096) · 5859923a

Joao Gomes authored Dec 23, 2021

Summary:
Pull Request resolved: https://github.com/pytorch/audio/pull/2096

run: `arc lint --apply-patches --paths-cmd 'hg files -I "./**/*.py"'`

Reviewed By: mthrok

Differential Revision: D33297351

fbshipit-source-id: 7bf5956edf0717c5ca90219f72414ff4eeaf5aa8

5859923a

Introduce Conformer (#2068) · 1b17b011

hwangjeff authored Dec 22, 2021

Summary:
Adds implementation of Conformer module.

Adapted from sravyapopuri388's implementation for fairseq at https://github.com/fairinternal/fairseq-py/pull/2770.

Pull Request resolved: https://github.com/pytorch/audio/pull/2068

Reviewed By: mthrok

Differential Revision: D33236957

Pulled By: hwangjeff

fbshipit-source-id: 382d99394996ff5249522b5899e1a4b4a95de9e6

1b17b011

24 Nov, 2021 1 commit

Add RNN-T beam search decoder (#2028) · 60a85b50

hwangjeff authored Nov 23, 2021

Summary:
Adds beam search decoder for RNN-T implementation ``torchaudio.prototype.RNNT`` that is TorchScript-able and supports both streaming and non-streaming inference.

Pull Request resolved: https://github.com/pytorch/audio/pull/2028

Reviewed By: mthrok

Differential Revision: D32627919

Pulled By: hwangjeff

fbshipit-source-id: aab99e346d6514a3207a9fb69d4b42978b4cdbbd

60a85b50

23 Nov, 2021 1 commit

Update datasets document (#2029) · 9c9aef88

moto authored Nov 23, 2021

Summary:
- Remove unnecessary content list
- Remove legacy description

Pull Request resolved: https://github.com/pytorch/audio/pull/2029

Reviewed By: carolineechen

Differential Revision: D32629917

Pulled By: mthrok

fbshipit-source-id: bc9a9366c681bcf8b74907c2a6459c73fb6a7424

9c9aef88

18 Nov, 2021 1 commit

Add Emformer RNN-T model (#2003) · 78ce7010

hwangjeff authored Nov 18, 2021

Summary:
Adds streaming-capable recurrent neural network transducer (RNN-T) model that uses Emformer for its transcription network. Includes two factory functions — one that allows for building a custom model, and one that builds a preconfigured base model.

Pull Request resolved: https://github.com/pytorch/audio/pull/2003

Reviewed By: nateanl

Differential Revision: D32440879

Pulled By: hwangjeff

fbshipit-source-id: 601cb1de368427f25e3b7d120e185960595d2360

78ce7010

10 Nov, 2021 1 commit
- [BC-Breaking] Remove deprecated create_fb_matrix (#1998) · 22379d14
  Krishna Kalyan authored Nov 10, 2021
  
  22379d14
05 Nov, 2021 4 commits

Update documentation top page (#1988) · e7ea820e

moto authored Nov 05, 2021

- Add link to index page on left
- Package Reference -> API Reference
- Update description.

e7ea820e

Port MVDR tutorial (#1983) · b9247022
moto authored Nov 05, 2021

b9247022
Port audio manipulation tutorial (#1970) · 8f061987
moto authored Nov 05, 2021

8f061987

Refactor tutorial organization (#1987) · 6cf84866

moto authored Nov 05, 2021

* Refactor tutorial organization

* Merge tutorial subdirectories under examples/gallery/tutorials
* Do not use index.rst generated by Sphinx-gallery
* Instead use flat structure so that all the tutorials are listed in left menu
* Use `_assets` dir for artifacts of tutorials

6cf84866

04 Nov, 2021 5 commits

Port TTS tutorial (#1973) · b3c2cfce
moto authored Nov 04, 2021

b3c2cfce
Fix colab URL (#1981) · a6bcd291
moto authored Nov 04, 2021

a6bcd291

Add Colab/Download/Github link similar to tutorials (#1969) · 7c9402f1

moto authored Nov 04, 2021

This commit adds colab/download/source links to tutorials, like in the `pytorch/tutorials` repo.

Since the upstream `pytorch-sphinx-theme` does not provide the interface for this,
a hack to overwrite the URL is added.

This hack might stop working if there is some update in `pytorch-sphinx-theme`.

7c9402f1

[DOC] Default to not build gallery (#1977) · 5898edba

moto authored Nov 04, 2021

With the introduction of the TTS tutorial (#1973), building the documentation takes
more than a couple of minutes. This commit makes the doc build process skip the
tutorials by default.

To build tutorials one can use environment variable `BUILD_GALLERY=1`,
and set `GALLERY_PATTERN=...` to filter the tutorials to build.

This `GALLERY_PATTERN` is the same approach as in the `tutorials` repo.

https://github.com/pytorch/tutorials/blob/cbf2238df0e78d84c15bd94288966d2f4b2e83ae/conf.py#L75-L83

This commit also dynamically parses the subdirectories of `examples/gallery`, so that when a new category of examples is added, it is picked up automatically.
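The behavior described above could be sketched roughly as follows. This is not the code from the commit: `gallery_settings` is a hypothetical helper, treating only the value `1` as "enabled" is an assumption, and the returned keys (`examples_dirs`, `gallery_dirs`, `filename_pattern`) are standard Sphinx-gallery options chosen for illustration.

```python
import os
from pathlib import Path


def gallery_settings(gallery_root, environ=None):
    """Return Sphinx-gallery settings, or None when the gallery is disabled."""
    environ = os.environ if environ is None else environ
    # Tutorials are skipped unless BUILD_GALLERY=1 is set explicitly.
    if environ.get("BUILD_GALLERY") != "1":
        return None
    root = Path(gallery_root)
    # Every subdirectory of the gallery root is treated as one category,
    # so a newly added category needs no configuration change.
    categories = sorted(p.name for p in root.iterdir() if p.is_dir())
    settings = {
        "examples_dirs": [str(root / name) for name in categories],
        "gallery_dirs": categories,
    }
    # GALLERY_PATTERN, when set, restricts which tutorial files are executed.
    pattern = environ.get("GALLERY_PATTERN")
    if pattern:
        settings["filename_pattern"] = pattern
    return settings
```

In a `conf.py`, the result would be assigned to `sphinx_gallery_conf` only when it is not `None`, and the Sphinx-gallery extension would be left out of `extensions` otherwise.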

5898edba

Add Sphinx-gallery to doc (#1967) · a3363539
moto authored Nov 04, 2021

a3363539

03 Nov, 2021 1 commit
- Add wav2vec2 ASR English pretrained model from voxpopuli (#1956) · f2eec77b
  moto authored Nov 03, 2021
  
  f2eec77b
02 Nov, 2021 3 commits
- Add citation information in the documentation (#1962) · 8a93717c
  yangarbiter authored Nov 02, 2021
  
  8a93717c
- Add wav2vec2 ASR Italian pretrained model from voxpopuli (#1954) · 5c8541b7
  moto authored Nov 02, 2021
  
  5c8541b7
- Add wav2vec2 ASR German pretrained model from voxpopuli (#1953) · e15431b7
  moto authored Nov 01, 2021
  e15431b7
29 Oct, 2021 1 commit
- Improve backend and transforms docs (#1944) · 0f8014f5
  Caroline Chen authored Oct 29, 2021
  
  0f8014f5
28 Oct, 2021 1 commit
- Remove F.complex_norm and T.ComplexNorm (#1942) · ab50909d
  S Harish authored Oct 28, 2021
  
  ab50909d
27 Oct, 2021 2 commits
- Remove deprecated F.angle (#1935) · 1d3dcdbd
  S Harish authored Oct 27, 2021
  
  1d3dcdbd
- Add wav2vec2 ASR Spanish pretrained model from voxpopuli (#1924) · 3a599315
  moto authored Oct 26, 2021
  
  3a599315
26 Oct, 2021 1 commit
- Remove deprecated `F.magphase` (#1934) · d35ea80e
  S Harish authored Oct 26, 2021
  
  d35ea80e
25 Oct, 2021 1 commit
- Add pretrained French ASR from voxpopuli (#1919) · cbf267c3
  moto authored Oct 25, 2021
  
  cbf267c3
18 Oct, 2021 2 commits

Update models/pipelines doc (#1894) · 420e84ee

moto authored Oct 18, 2021

1. Override the return type so that Sphinx shows the exported symbols.
   (output model types and input torch.nn.Module)
2. Tweak docs for Tacotron2TTSBundle interfaces
3. Fix for HUBERT_ASR_XLARGE

420e84ee

Update intersphinx inventory (#1893) · 955cdbdc

moto authored Oct 18, 2021

Resolve the following warnings emitted when running `make clean html`.

```
parsing bibtex file /torchaudio/docs/source/refs.bib... parsed 26 entries
loading intersphinx inventory from https://docs.python.org/objects.inv...
loading intersphinx inventory from https://docs.scipy.org/doc/numpy/objects.inv...
loading intersphinx inventory from https://pytorch.org/docs/stable/objects.inv...
intersphinx inventory has moved: https://docs.python.org/objects.inv -> https://docs.python.org/3/objects.inv
intersphinx inventory has moved: https://docs.scipy.org/doc/numpy/objects.inv -> https://numpy.org/doc/stable/objects.inv
```
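A fix for these warnings typically amounts to pointing `intersphinx_mapping` at the new locations. The excerpt below is a sketch based on the URLs in the log above, not the diff from this commit; the mapping keys (`python`, `numpy`, `torch`) are assumptions.

```python
# Hypothetical conf.py excerpt; the keys are illustrative names.
# With None as the second element, Sphinx fetches "objects.inv"
# from the base URL given as the first element.
intersphinx_mapping = {
    "python": ("https://docs.python.org/3/", None),
    "numpy": ("https://numpy.org/doc/stable/", None),
    "torch": ("https://pytorch.org/docs/stable/", None),
}
```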

955cdbdc

16 Oct, 2021 1 commit
- Add SpecAugment figure/citation (#1887) · 9e3778d2
  moto authored Oct 16, 2021
  
  9e3778d2
15 Oct, 2021 5 commits
- Add TTS bundle/pipelines (#1872) · e885204e
  moto authored Oct 15, 2021

  Future work items:
  - length computation of GriffinLim
  - better way to make InverseMelScale work in inference_mode

  e885204e
- Remove factory functions of tacotron2 and wavernn (#1874) · 6b8f378b
  moto authored Oct 15, 2021
  
  6b8f378b
- Add sample rate to Wav2Vec2 bundle (#1878) · 5600bd25
  moto authored Oct 15, 2021
  
  5600bd25
- Put pretrained weights to subsection (#1879) · 6c074666
  moto authored Oct 15, 2021
  
  6c074666
- Move wav2vec2 pretrained models to pipelines module (#1876) · fad855cd
  moto authored Oct 15, 2021

  - Move wav2vec2 pretrained weights to `torchaudio.pipelines` namespace to align with #1872.
  - Split `Wav2Vec2PretrainedModelBundle` into `Wav2Vec2Bundle` (for pre-training model) and `Wav2Vec2ASRBundle` (for models fine-tuned for ASR).
  - Update base URL

  fad855cd
08 Oct, 2021 2 commits
- Update Tacotron2 docs (#1840) · 78c382ee
  hwangjeff authored Oct 08, 2021
  
  78c382ee
- Add customization support to wav2vec2 labels (#1834) · fd7fcf93
  moto authored Oct 07, 2021
  
  fd7fcf93
07 Oct, 2021 3 commits

Merge factory functions of pre-training model and fine-tuned model (#1830) · 274ada80

moto authored Oct 07, 2021

This commit merges the wav2vec2/HuBERT factory functions for pre-training and fine-tuning. In #1829, we added parameters to customize aspects of the models that are not part of the architecture, and `aux_num_out` falls into this category, so it is no longer necessary to have separate functions. This concludes the wav2vec2/HuBERT API update in release 0.10.

Summary of BC-breaking changes to the wav2vec2 APIs between 0.9 and 0.10 (once this commit is incorporated):
1. `Wav2Vec2Model.extract_features`
In 0.9, it returned the output of the `FeatureExtractor` module. In 0.10, it returns the list of outputs from the intermediate layers of the `TransformerEncoder` block.
2. `wav2vec2_base(num_out: int)` -> `wav2vec2_base(<dropout_params:float>, aux_num_out: Optional[int]=None)`
    - `num_out` was renamed to `aux_num_out` and made optional. If it is omitted, the resulting model does not have the linear layer for fine-tuning.
    - Added dropout parameters.
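
The effect of the optional `aux_num_out` can be illustrated with a toy stand-in. The sketch below does not use torchaudio; `SketchModel` and `sketch_base` are hypothetical names, and the model only records its configuration instead of building real layers.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SketchModel:
    """Stand-in for a model; it only records how it was configured."""

    encoder_dropout: float
    aux_out_features: Optional[int]  # None means "no fine-tuning head"

    @property
    def has_aux_layer(self) -> bool:
        return self.aux_out_features is not None


def sketch_base(
    encoder_dropout: float = 0.1,
    aux_num_out: Optional[int] = None,
) -> SketchModel:
    """A single factory covers both the pre-training and fine-tuning cases."""
    return SketchModel(
        encoder_dropout=encoder_dropout,
        aux_out_features=aux_num_out,
    )


pretraining_model = sketch_base()               # no linear layer attached
finetuning_model = sketch_base(aux_num_out=32)  # linear layer with 32 outputs
```

Because the head is controlled by one optional argument, a separate fine-tuning factory is no longer needed.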

274ada80

[doc] List all the pre-trained models on right bar (#1828) · 60aeb78a
moto authored Oct 07, 2021

60aeb78a

Make the core wav2vec2 factory function public (#1829) · 31a69c36

moto authored Oct 06, 2021

This commit makes the following changes:
1. Make the factory function with full customizability public.
    i.e. `_get_model(...) -> wav2vec2_model(...)`.
2. Change the other architecture-specific factory functions so that they accept parameters not related to the model architecture (such as dropout).
    i.e. `wav2vec2_base() -> wav2vec2_base(encoder_projection_dropout, encoder_attention_dropout, encoder_ff_interm_dropout, ...)`

### Why?

While adding the pre-trained weight support, I realized that separating the APIs for model construction and pre-trained weight support keeps the code organization simple, thanks to a good separation of concerns. As mentioned in #1821, in this framework,
  1. Model implementation is responsible for computation logic,
  2. factory functions are responsible for customizability and model construction,
  3. and pre-trained weight API is responsible for constructing a model and loading pre-trained weights along with the complementary information (such as pre-processing and class labels).

(note: for simple models, combining 1 and 2 is also okay.)

This means that factory functions have to support all the customizability required by the pre-trained weight API. The current implementation imports an internal function, as in `from .model import Wav2Vec2Model, _get_model`, which is a bit strange.

This PR rectifies it by making the mother factory function public.
This also clarifies the purpose of having the other factory functions as public API: they are just syntactic sugar for constructing an untrained model with a specific architecture. So this commit also adds supplemental parameters to them.
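
The three responsibilities listed above can be sketched with toy classes. This is an illustration of the pattern only, not torchaudio code: `ToyModel`, `toy_model`, `toy_base`, and `ToyBundle` are hypothetical names, and the "weights" are a plain dict rather than downloaded tensors.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


class ToyModel:
    """1. Model implementation: owns the computation logic only."""

    def __init__(self, num_layers: int, dropout: float):
        self.num_layers = num_layers
        self.dropout = dropout
        self.weights: Dict[str, float] = {}

    def load_state_dict(self, state: Dict[str, float]) -> None:
        self.weights = dict(state)


def toy_model(num_layers: int, dropout: float) -> ToyModel:
    """2a. Public, fully customizable factory (the "mother" factory)."""
    return ToyModel(num_layers, dropout)


def toy_base(dropout: float = 0.1) -> ToyModel:
    """2b. Sugar: fixed architecture, non-architectural options exposed."""
    return toy_model(num_layers=12, dropout=dropout)


@dataclass(frozen=True)
class ToyBundle:
    """3. Pre-trained weight API: builds the model, loads weights,
    and carries complementary information such as class labels."""

    params: Dict[str, float]
    labels: Tuple[str, ...]
    state: Dict[str, float]

    def get_model(self) -> ToyModel:
        # Only the public factory is used; no private helper is imported.
        model = toy_model(**self.params)
        model.load_state_dict(self.state)
        return model


bundle = ToyBundle(
    params={"num_layers": 12, "dropout": 0.0},
    labels=("a", "b"),
    state={"w": 1.0},
)
model = bundle.get_model()
```

The point of the design is visible in `ToyBundle.get_model`: because the general factory is public, the pre-trained weight layer never has to reach into private module internals.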

31a69c36