Commits · 67cdf8828f9bed16fe0a0c93fccd7cb63e9f10df · OpenDAS / Torchaudio

29 Dec, 2021 2 commits

Reorganize RNN-T components in prototype module (#2110) · 67cdf882

hwangjeff authored Dec 29, 2021

Summary:
Regroup RNN-T components under `torchaudio.prototype.models` and `torchaudio.prototype.pipelines`.

Updated docs: https://492321-90321822-gh.circle-artifacts.com/0/docs/prototype.html

Pull Request resolved: https://github.com/pytorch/audio/pull/2110

Reviewed By: carolineechen, mthrok

Differential Revision: D33354116

Pulled By: hwangjeff

fbshipit-source-id: 9cf4afed548cb173d56211c16d31bcfa25a8e4cb

Update prototype documentations (#2108) · 10cce198

moto authored Dec 28, 2021

Summary:
### Change list

* Split the documentation of prototypes
* Add a new API reference section dedicated to prototypes.
* Hide the signature of the KenLMLexiconDecoder constructor. (cc carolineechen)
  * https://489516-90321822-gh.circle-artifacts.com/0/docs/prototype.ctc_decoder.html#torchaudio.prototype.ctc_decoder.KenLMLexiconDecoder
* Hide the signature of the RNNT constructor. (cc hwangjeff)
  * https://489516-90321822-gh.circle-artifacts.com/0/docs/prototype.rnnt.html#torchaudio.prototype.RNNT
* Tweak CTC tutorial
  * Replace hyperlinks to API reference with backlinks
  * Add `progress=False` to download

### Follow-up

The RNNT decoder and the CTC decoder each return their own `Hypothesis` class. When I tried to add the CTC decoder's `Hypothesis` to the documentation, the build process complained that the name is ambiguous.
I think each `Hypothesis` class could be put inside its decoder (if TorchScript supports it), or the classes could be given different names, but in that case the interface of each `Hypothesis` has to be generic enough.
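
The nesting option can be sketched in plain Python. The two decoder classes below are hypothetical stand-ins, not the torchaudio implementations, and their fields are assumptions; whether TorchScript accepts nested classes is left open, as noted above. The point is only that nesting gives each `Hypothesis` a distinct qualified name, which a documentation cross-reference could use to tell them apart.

```python
from dataclasses import dataclass
from typing import List


class RNNTDecoderSketch:
    """Hypothetical stand-in for an RNN-T decoder (not the torchaudio class)."""

    @dataclass
    class Hypothesis:
        tokens: List[int]  # assumed field, for illustration only
        score: float


class CTCDecoderSketch:
    """Hypothetical stand-in for a CTC decoder (not the torchaudio class)."""

    @dataclass
    class Hypothesis:
        words: List[str]  # assumed field, for illustration only
        score: float


# Both classes are still named "Hypothesis", but their qualified names differ.
rnnt_name = RNNTDecoderSketch.Hypothesis.__qualname__
ctc_name = CTCDecoderSketch.Hypothesis.__qualname__
print(rnnt_name)
print(ctc_name)
```

With this layout the two result types no longer need a shared interface, since each is referenced through its own decoder.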

### Before

https://pytorch.org/audio/main/prototype.html

<img width="1390" alt="Screen Shot 2021-12-28 at 1 05 53 PM" src="https://user-images.githubusercontent.com/855818/147594425-6c7f8126-ab76-4edc-a616-a00901e7e9ef.png">

### After

https://489516-90321822-gh.circle-artifacts.com/0/docs/prototype.html

<img width="1202" alt="Screen Shot 2021-12-28 at 8 37 35 PM" src="https://user-images.githubusercontent.com/855818/147619281-8152b1ae-e127-40b2-a944-dc11b114b629.png">

https://489516-90321822-gh.circle-artifacts.com/0/docs/prototype.rnnt.html

<img width="1415" alt="Screen Shot 2021-12-28 at 8 38 27 PM" src="https://user-images.githubusercontent.com/855818/147619331-077b55b5-c5e9-47ab-bfe6-873e41c738c8.png">

https://489516-90321822-gh.circle-artifacts.com/0/docs/prototype.ctc_decoder.html

<img width="1417" alt="Screen Shot 2021-12-28 at 8 39 04 PM" src="https://user-images.githubusercontent.com/855818/147619364-63df3457-a4b2-4223-973f-f4301bd45280.png">

Pull Request resolved: https://github.com/pytorch/audio/pull/2108

Reviewed By: hwangjeff, carolineechen, nateanl

Differential Revision: D33340816

Pulled By: mthrok

fbshipit-source-id: 870edfadbe41d6f8abaf78fdb7017b3980dfe187

28 Dec, 2021 1 commit

Add ASR CTC inference tutorial (#2106) · 133d0065

Caroline Chen authored Dec 28, 2021

Summary:
Demonstrate usage of the CTC beam search decoder with lexicon constraint and KenLM support on a LibriSpeech sample, using a pretrained wav2vec2 model.

rendered: https://485200-90321822-gh.circle-artifacts.com/0/docs/tutorials/asr_inference_with_ctc_decoder_tutorial.html

follow-ups:
- incorporate `nbest`
- demonstrate customizability of different beam search parameters
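
For orientation, the CTC collapse rule that CTC decoders build on can be sketched in a few lines. This is greedy (best-path) decoding, a deliberately simpler technique than the lexicon-constrained beam search with KenLM that the tutorial demonstrates. The function name, the label set, and the blank index are assumptions made for illustration.

```python
from typing import List, Sequence


def greedy_ctc_decode(
    emissions: Sequence[Sequence[float]],
    labels: Sequence[str],
    blank: int = 0,
) -> str:
    """Pick the best label per frame, merge repeats, then drop blanks."""
    # Index of the highest-scoring label in each frame.
    best_path = [max(range(len(frame)), key=lambda i: frame[i]) for frame in emissions]

    decoded: List[str] = []
    previous = None
    for index in best_path:
        # A label is emitted only when it differs from the previous frame
        # and is not the blank symbol.
        if index != previous and index != blank:
            decoded.append(labels[index])
        previous = index
    return "".join(decoded)


labels = ["-", "a", "b"]  # index 0 is the assumed blank symbol
emissions = [
    [0.10, 0.80, 0.10],  # a
    [0.10, 0.70, 0.20],  # a again (merged with the previous frame)
    [0.90, 0.05, 0.05],  # blank
    [0.20, 0.70, 0.10],  # a (kept, because a blank separates it)
    [0.10, 0.20, 0.70],  # b
]
result = greedy_ctc_decode(emissions, labels)
print(result)
```

A beam search keeps several candidate prefixes per frame instead of a single best path, which is what makes lexicon and language-model constraints possible.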

Pull Request resolved: https://github.com/pytorch/audio/pull/2106

Reviewed By: mthrok

Differential Revision: D33340946

Pulled By: carolineechen

fbshipit-source-id: 0ab838375d96a035d54ed5b5bd9ab4dc8d19adb7

05 Nov, 2021 4 commits

Update documentation top page (#1988) · e7ea820e

moto authored Nov 05, 2021

- Add link to index page on left
- Package Reference -> API Reference
- Update description.

Port MVDR tutorial (#1983) · b9247022

moto authored Nov 05, 2021

Port audio manipulation tutorial (#1970) · 8f061987

moto authored Nov 05, 2021

Refactor tutorial organization (#1987) · 6cf84866

moto authored Nov 05, 2021

* Refactor tutorial organization

* Merge tutorial subdirectories under examples/gallery/tutorials
* Do not use index.rst generated by Sphinx-gallery
* Instead use flat structure so that all the tutorials are listed in left menu
* Use `_assets` dir for artifacts of tutorials

04 Nov, 2021 2 commits
- Port TTS tutorial (#1973) · b3c2cfce
  moto authored Nov 04, 2021
  
- Add Sphinx-gallery to doc (#1967) · a3363539
  moto authored Nov 04, 2021
  
02 Nov, 2021 1 commit
- Add citation information in the documentation (#1962) · 8a93717c
  yangarbiter authored Nov 02, 2021
  
15 Oct, 2021 1 commit

Move wav2vec2 pretrained models to pipelines module (#1876) · fad855cd

moto authored Oct 15, 2021

- Move wav2vec2 pretrained weights to `torchaudio.pipelines` namespace to align with #1872.
- Split `Wav2Vec2PretrainedModelBundle` into `Wav2Vec2Bundle` (for pre-trained models) and `Wav2Vec2ASRBundle` (for models fine-tuned for ASR).
- Update base URL
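
The shape of such a split can be sketched with plain dataclasses. The classes, field names, and file names below are hypothetical stand-ins and do not reproduce the torchaudio bundle classes. The sketch assumes the fine-tuned bundle extends the base one with the output labels needed to turn emissions into text.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class PretrainedBundleSketch:
    """Hypothetical base bundle: where the weights live and what input they expect."""

    weights_path: str  # placeholder file name, not a real download URL
    sample_rate: int


@dataclass(frozen=True)
class ASRBundleSketch(PretrainedBundleSketch):
    """Hypothetical fine-tuned bundle: adds the labels of the output layer."""

    labels: Tuple[str, ...] = ()


base = PretrainedBundleSketch("pretrained_weights.pth", 16000)
asr = ASRBundleSketch("asr_weights.pth", 16000, labels=("-", "|", "e", "t"))

# Code that only needs the input requirements accepts either kind of bundle.
for bundle in (base, asr):
    print(type(bundle).__name__, bundle.sample_rate)
```

Keeping the labels out of the base class means a model without a fine-tuned output layer cannot be asked for a vocabulary it does not have.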

06 Oct, 2021 1 commit

Introduce Emformer (#1801) · 48cfbf2b

hwangjeff authored Oct 06, 2021

Adds an implementation of Emformer, a memory-efficient transformer architecture
introduced in https://ieeexplore.ieee.org/document/9414560 that targets low-latency
streaming speech recognition applications.

19 Aug, 2021 1 commit
- Move RNNT Loss out of prototype (#1711) · 2c115821
  Caroline Chen authored Aug 19, 2021
  
18 Aug, 2021 1 commit
- Move Tacotron2 out of prototype (#1714) · 352d63c5
  yangarbiter authored Aug 17, 2021
  
12 Aug, 2021 1 commit
- Add prototype.tacotron2 page to docs (#1695) · 9c641849
  yangarbiter authored Aug 12, 2021
  
30 Apr, 2021 1 commit

Replace existing prototype RNNT Loss (#1479) · 0c263a93

Caroline Chen authored Apr 30, 2021

Replace the prototype RNNT loss implementation, which used warp-transducer, with one that has no external library dependencies.

11 Jan, 2021 1 commit
- add doc for rnnt loss (#1171) · b57f05c4
  Vincent QB authored Jan 11, 2021
  
19 Oct, 2020 1 commit

Update index.rst (#968) · ba1698ba

Brian Johnson authored Oct 19, 2020

Adds introductory context and links to the PyTorch libraries to the audio docs.

20 Jul, 2020 1 commit

Update documentation and fix docstrings (#788) · 2381dd89

moto authored Jul 20, 2020

- Addresses #549 #638 #786 
- Add `torchaudio` top level module doc
- Separate `torchaudio` top level module doc from `index.html`
- Add `backend` module doc.
- Remove `-> None` from function signatures as it adds noise to the documentation
- Change the function argument name of `torchaudio.backend.sox_io_backend.save` from `tensor` to `src`, so that it matches the rest of the backends.
- Tweak a bunch of docstrings

16 Jul, 2020 1 commit

Add Torchscript sox effects (#760) · 60a8e23d

moto authored Jul 15, 2020

* Add sox_utils module

* Make init/shutdown thread safe

* Add sox effects implementation

* Add test for sox effects

* Update docstrings and add examples
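
The "thread safe init/shutdown" item follows a common pattern: guard a one-time global setup with a lock so that concurrent callers trigger it exactly once. The sketch below illustrates that pattern in Python under assumed names; it is not the code from this commit, and the class and counters are hypothetical.

```python
import threading


class EffectsRuntimeSketch:
    """Hypothetical stand-in for a library that needs one-time global setup."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._initialized = False
        self.init_calls = 0      # number of real initializations (demo only)
        self.shutdown_calls = 0  # number of real shutdowns (demo only)

    def init(self) -> None:
        # The check and the state change happen under the same lock,
        # so two threads cannot both see "not initialized".
        with self._lock:
            if not self._initialized:
                self.init_calls += 1
                self._initialized = True

    def shutdown(self) -> None:
        with self._lock:
            if self._initialized:
                self.shutdown_calls += 1
                self._initialized = False


runtime = EffectsRuntimeSketch()
threads = [threading.Thread(target=runtime.init) for _ in range(8)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

# Repeated shutdown calls are harmless: only the first one does any work.
runtime.shutdown()
runtime.shutdown()
print(runtime.init_calls, runtime.shutdown_calls)
```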

01 Aug, 2019 1 commit
- Removal of torchaudio.legacy · d8a47f4a
  jamarshon authored Aug 01, 2019
  
16 Jul, 2019 1 commit
- torch.functional Docs (#140) · 0902494e
  jamarshon authored Jul 16, 2019
  
11 Jul, 2019 1 commit
- Add Kaldi docs (#136) · 48707255
  jamarshon authored Jul 11, 2019
  
22 May, 2019 1 commit
- Add Kaldi IO as a dependency + put a wrapper to convert to Tensor + add test... · a422f3fe
  jamarshon authored May 22, 2019
```
Add Kaldi IO as a dependency + put  a wrapper to convert to Tensor + add test to check correct type (#111)
```
25 Dec, 2018 1 commit
- sox effects and documentation · 301e2e98
  David Pollack authored Sep 11, 2018
  
18 Dec, 2017 1 commit
- improve README and add sphinx docs generator · 088d5674
  Soumith Chintala authored Dec 17, 2017
  