- 29 Dec, 2021 2 commits
-
-
hwangjeff authored
Summary: Regroup RNN-T components under `torchaudio.prototype.models` and `torchaudio.prototype.pipelines`.

Updated docs: https://492321-90321822-gh.circle-artifacts.com/0/docs/prototype.html

Pull Request resolved: https://github.com/pytorch/audio/pull/2110
Reviewed By: carolineechen, mthrok
Differential Revision: D33354116
Pulled By: hwangjeff
fbshipit-source-id: 9cf4afed548cb173d56211c16d31bcfa25a8e4cb
-
moto authored
Summary:

### Change list

* Split the documentation of prototypes
  * Add a new API reference section dedicated to prototypes.
  * Hide the signature of the KenLMLexiconDecoder constructor. (cc carolineechen)
    * https://489516-90321822-gh.circle-artifacts.com/0/docs/prototype.ctc_decoder.html#torchaudio.prototype.ctc_decoder.KenLMLexiconDecoder
  * Hide the signature of the RNNT constructor. (cc hwangjeff)
    * https://489516-90321822-gh.circle-artifacts.com/0/docs/prototype.rnnt.html#torchaudio.prototype.RNNT
* Tweak CTC tutorial
  * Replace hyperlinks to API reference with backlinks
  * Add `progress=False` to download

### Follow-up

The RNNT decoder and the CTC decoder each return their own `Hypothesis` class. When I tried to add the `Hypothesis` of the CTC decoder to the documentation, the build process complained that the name is ambiguous. I think each `Hypothesis` class could be placed inside its decoder (if TorchScript supports it), or the names could be made different, but in that case the interface of each `Hypothesis` has to be generic enough.

### Before

https://pytorch.org/audio/main/prototype.html
<img width="1390" alt="Screen Shot 2021-12-28 at 1 05 53 PM" src="https://user-images.githubusercontent.com/855818/147594425-6c7f8126-ab76-4edc-a616-a00901e7e9ef.png">

### After

https://489516-90321822-gh.circle-artifacts.com/0/docs/prototype.html
<img width="1202" alt="Screen Shot 2021-12-28 at 8 37 35 PM" src="https://user-images.githubusercontent.com/855818/147619281-8152b1ae-e127-40b2-a944-dc11b114b629.png">

https://489516-90321822-gh.circle-artifacts.com/0/docs/prototype.rnnt.html
<img width="1415" alt="Screen Shot 2021-12-28 at 8 38 27 PM" src="https://user-images.githubusercontent.com/855818/147619331-077b55b5-c5e9-47ab-bfe6-873e41c738c8.png">

https://489516-90321822-gh.circle-artifacts.com/0/docs/prototype.ctc_decoder.html
<img width="1417" alt="Screen Shot 2021-12-28 at 8 39 04 PM" src="https://user-images.githubusercontent.com/855818/147619364-63df3457-a4b2-4223-973f-f4301bd45280.png">

Pull Request resolved: https://github.com/pytorch/audio/pull/2108
Reviewed By: hwangjeff, carolineechen, nateanl
Differential Revision: D33340816
Pulled By: mthrok
fbshipit-source-id: 870edfadbe41d6f8abaf78fdb7017b3980dfe187
-
- 28 Dec, 2021 1 commit
-
-
Caroline Chen authored
Summary: Demonstrate usage of the CTC beam search decoder with lexicon constraint and KenLM support, on a LibriSpeech sample and using a pretrained wav2vec2 model.

rendered: https://485200-90321822-gh.circle-artifacts.com/0/docs/tutorials/asr_inference_with_ctc_decoder_tutorial.html

follow-ups:
  - incorporate `nbest`
  - demonstrate customizability of different beam search parameters

Pull Request resolved: https://github.com/pytorch/audio/pull/2106
Reviewed By: mthrok
Differential Revision: D33340946
Pulled By: carolineechen
fbshipit-source-id: 0ab838375d96a035d54ed5b5bd9ab4dc8d19adb7
-
- 05 Nov, 2021 4 commits
-
-
moto authored
  - Add link to index page on left
  - Package Reference -> API Reference
  - Update description.
-
moto authored
-
moto authored
-
moto authored
* Refactor tutorial organization
* Merge tutorial subdirectories under examples/gallery/tutorials
* Do not use the index.rst generated by Sphinx-gallery
* Instead use a flat structure so that all the tutorials are listed in the left menu
* Use `_assets` dir for artifacts of tutorials
-
- 04 Nov, 2021 2 commits
- 02 Nov, 2021 1 commit
-
-
yangarbiter authored
-
- 15 Oct, 2021 1 commit
-
-
moto authored
  - Move wav2vec2 pretrained weights to the `torchaudio.pipelines` namespace to align with #1872.
  - Split `Wav2Vec2PretrainedModelBundle` into `Wav2Vec2Bundle` (for pre-trained models without fine-tuning) and `Wav2Vec2ASRBundle` (for models fine-tuned for ASR).
  - Update base URL
-
- 06 Oct, 2021 1 commit
-
-
hwangjeff authored
Adds an implementation of Emformer, a memory-efficient transformer architecture introduced in https://ieeexplore.ieee.org/document/9414560 that targets low-latency streaming speech recognition applications.
-
- 19 Aug, 2021 1 commit
-
-
Caroline Chen authored
-
- 18 Aug, 2021 1 commit
-
-
yangarbiter authored
-
- 12 Aug, 2021 1 commit
-
-
yangarbiter authored
-
- 30 Apr, 2021 1 commit
-
-
Caroline Chen authored
Replace the prototype RNNT implementation (using warp-transducer) with one without external library dependencies
-
- 11 Jan, 2021 1 commit
-
-
Vincent QB authored
-
- 19 Oct, 2020 1 commit
-
-
Brian Johnson authored
Adds introductory context and links to the PyTorch Libraries to audio docs.
-
- 20 Jul, 2020 1 commit
-
-
moto authored
  - Addresses #549 #638 #786
  - Add `torchaudio` top level module doc
  - Separate `torchaudio` top level module doc from `index.html`
  - Add `backend` module doc.
  - Remove `-> None` from function signatures as it adds noise to the documentation
  - Change the function argument name of `torchaudio.backend.sox_io_backend.save` from `tensor` to `src`, so that it matches the rest of the backends.
  - Tweak a bunch of docstrings
-
- 16 Jul, 2020 1 commit
-
-
moto authored
* Add sox_utils module
* Make init/shutdown thread safe
* Add sox effects implementation
* Add test for sox effects
* Update docstrings and add examples
-
- 01 Aug, 2019 1 commit
-
-
jamarshon authored
-
- 16 Jul, 2019 1 commit
-
-
jamarshon authored
-
- 11 Jul, 2019 1 commit
-
-
jamarshon authored
-
- 22 May, 2019 1 commit
-
-
jamarshon authored
Add Kaldi IO as a dependency + put a wrapper to convert to Tensor + add test to check correct type (#111)
-
- 25 Dec, 2018 1 commit
-
-
David Pollack authored
-
- 18 Dec, 2017 1 commit
-
-
Soumith Chintala authored
-