- 08 Nov, 2022 1 commit
-
-
hwangjeff authored
Summary: Adds `torch.nn.Module`-based implementations for convolution and FFT convolution. Pull Request resolved: https://github.com/pytorch/audio/pull/2811 Reviewed By: carolineechen Differential Revision: D40881937 Pulled By: hwangjeff fbshipit-source-id: bfe8969e6178ad4f58981efd4b2720ac006be8de
-
- 02 Nov, 2022 1 commit
-
-
moto authored
Summary: <img width="756" alt="Screen Shot 2022-11-01 at 3 32 58 PM" src="https://user-images.githubusercontent.com/855818/199173348-f463ae71-438c-4dad-a481-b65522a8e52f.png"> Pull Request resolved: https://github.com/pytorch/audio/pull/2812 Reviewed By: carolineechen Differential Revision: D40919942 Pulled By: mthrok fbshipit-source-id: 18e5a709c262fb0b15ada0d303f1d0dee033beb1
-
- 28 Oct, 2022 1 commit
-
-
moto authored
Summary: This commit re-organizes the tutorials.

1. Put all the tutorials in the left bar and make the section **folded by default**.
2. Add pytorch/tutorials-like cards to the index.
3. Move feature classifications to a dedicated page.

https://output.circle-artifacts.com/output/job/1f1a04a5-137e-428d-9da4-c46f59eeffa4/artifacts/0/docs/index.html <img width="1073" alt="Screen Shot 2022-10-28 at 7 34 29 AM" src="https://user-images.githubusercontent.com/855818/198410686-3ef40ad2-c9c9-443c-800e-6e51e1b6a491.png"> Pull Request resolved: https://github.com/pytorch/audio/pull/2767 Reviewed By: carolineechen Differential Revision: D40627547 Pulled By: mthrok fbshipit-source-id: 098b825f242e91919126014abdab27852304ae64
-
- 23 Sep, 2022 1 commit
-
-
moto authored
Summary: Because new tutorials for StreamWriter are being added, there are now more tutorials for media IO than for the rest, so this commit introduces a sub-index for IO tutorials. Pull Request resolved: https://github.com/pytorch/audio/pull/2703 Reviewed By: carolineechen Differential Revision: D39769049 Pulled By: mthrok fbshipit-source-id: 19a3981bc624fdce1d5d703c67e28a751a15e812
-
- 15 Sep, 2022 1 commit
-
-
moto authored
Summary: Preparation for the adoption of `autosummary`. Replace `:footcite:` with `:cite:` and introduce a dedicated reference page, as `:footcite:` does not work well with `autosummary`. Example: https://output.circle-artifacts.com/output/job/4da47ba6-d9c7-418e-b5b0-e9f8a146a6c3/artifacts/0/docs/datasets.html#cmuarctic https://output.circle-artifacts.com/output/job/4da47ba6-d9c7-418e-b5b0-e9f8a146a6c3/artifacts/0/docs/references.html Pull Request resolved: https://github.com/pytorch/audio/pull/2676 Reviewed By: carolineechen Differential Revision: D39509431 Pulled By: mthrok fbshipit-source-id: e6003dd01ec3eff3d598054690f61de8ee31ac9a
-
- 15 Aug, 2022 1 commit
-
-
Zhaoheng Ni authored
Summary: `ctc_decoder` has become a beta feature, so remove it from the prototype documentation. Pull Request resolved: https://github.com/pytorch/audio/pull/2617 Reviewed By: hwangjeff Differential Revision: D38706869 Pulled By: nateanl fbshipit-source-id: 41679f4e65a584b6b882af4551a50123f1dcef02
-
- 05 Aug, 2022 1 commit
-
-
hwangjeff authored
Summary: Adds functions `convolve` and `fftconvolve`, which compute the convolution of two tensors along their trailing dimension. The former performs the convolution directly, whereas the latter performs it using FFT. Pull Request resolved: https://github.com/pytorch/audio/pull/2602 Reviewed By: nateanl, mthrok Differential Revision: D38450771 Pulled By: hwangjeff fbshipit-source-id: b2d1e063ba21eafeddf317d60749e7120b14292b
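The difference between the direct and FFT-based approaches described above can be sketched in plain Python. This is only an illustration of the two techniques on 1-D lists, not the torchaudio implementation: the function names `direct_convolve` and `fft_convolve` are hypothetical, the inputs are assumed to be non-empty, and the "full" output length of `len(x) + len(y) - 1` is an assumption made for the sketch.

```python
import cmath


def direct_convolve(x, y):
    """Full convolution from the definition: out[n] = sum_k x[k] * y[n - k]."""
    out = [0.0] * (len(x) + len(y) - 1)
    for i, xi in enumerate(x):
        for j, yj in enumerate(y):
            out[i + j] += xi * yj
    return out


def _fft(a, invert=False):
    """Recursive radix-2 FFT (unnormalized); len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even = _fft(a[0::2], invert)
    odd = _fft(a[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def fft_convolve(x, y):
    """Same result as direct_convolve, computed as a product in the frequency domain."""
    size = len(x) + len(y) - 1
    n = 1
    while n < size:  # zero-pad to a power of two so circular wrap-around cannot occur
        n *= 2
    fx = _fft([complex(v) for v in x] + [0j] * (n - len(x)))
    fy = _fft([complex(v) for v in y] + [0j] * (n - len(y)))
    product = [a * b for a, b in zip(fx, fy)]
    inverse = _fft(product, invert=True)
    # The inverse transform above is unnormalized, so divide by n here.
    return [v.real / n for v in inverse[:size]]


if __name__ == "__main__":
    signal = [1.0, 2.0, 3.0]
    kernel = [0.0, 1.0, 0.5]
    print(direct_convolve(signal, kernel))
    print(fft_convolve(signal, kernel))
```

The direct version costs O(N·M) multiplications, while the FFT version costs O(L log L) for padded length L, which is why an FFT-based variant tends to pay off for long inputs. The two agree only up to floating-point rounding.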
-
- 28 Jul, 2022 1 commit
-
-
Sean Kim authored
Summary: Add a tutorial Python file. This is a draft PR and will continue to be modified according to feedback. Future plan: modify the spectrogram and bottom audio design, and work on finding the best audio track and segments. Pull Request resolved: https://github.com/pytorch/audio/pull/2572 Reviewed By: carolineechen, nateanl, mthrok Differential Revision: D38234001 Pulled By: skim0514 fbshipit-source-id: fe9207864f354dec5cf5ff52bf7d9ddcf4a001d5
-
- 08 Jun, 2022 1 commit
-
-
moto authored
Summary: The Streaming API tutorial has gotten long, so this commit splits it into two. Pull Request resolved: https://github.com/pytorch/audio/pull/2446 Reviewed By: hwangjeff Differential Revision: D36987513 Pulled By: mthrok fbshipit-source-id: 13e3aad74c0d0e654c39c0eeceffca1a00b0dac4
-
- 01 Jun, 2022 1 commit
-
-
Caroline Chen authored
Summary: Move CTC beam search decoder out of prototype to new `torchaudio.models.decoder` module. hwangjeff mthrok any thoughts on the new module + naming, and if we should move rnnt beam search here as well?? Pull Request resolved: https://github.com/pytorch/audio/pull/2410 Reviewed By: mthrok Differential Revision: D36784521 Pulled By: carolineechen fbshipit-source-id: a2ec52f86bba66e03327a9af0c5df8bbefcd67ed
-
- 20 May, 2022 1 commit
-
-
moto authored
Summary: This commit adds a tutorial on enabling and using NVDEC with the Stream API. https://output.circle-artifacts.com/output/job/19e66a4b-1819-4804-8834-d38e6c80c4fd/artifacts/0/docs/hw_acceleration_tutorial.html Because using NVDEC requires building and installing FFmpeg from source, this tutorial was authored on Google Colab and tailored to its environment. The tutorial here is the result of the notebook execution, with a link to the publicly accessible Google Colab notebook. Pull Request resolved: https://github.com/pytorch/audio/pull/2393 Reviewed By: hwangjeff Differential Revision: D36404408 Pulled By: mthrok fbshipit-source-id: 9c820d3db4d06c5b343ecad0708489125ca06948
-
- 13 May, 2022 1 commit
-
-
moto authored
Summary: This commit moves the Streaming API out of the prototype module.

* The related classes are renamed as follows:
  - `Streamer` -> `StreamReader`
  - `SourceStream` -> `StreamReaderSourceStream`
  - `SourceAudioStream` -> `StreamReaderSourceAudioStream`
  - `SourceVideoStream` -> `StreamReaderSourceVideoStream`
  - `OutputStream` -> `StreamReaderOutputStream`

  This change is a preemptive measure in case a `StreamWriter` API is added.
* Replace the BUILD_FFMPEG build arg with USE_FFMPEG. We are not building FFmpeg, so USE_FFMPEG is more appropriate.

---

After https://github.com/pytorch/audio/issues/2377

Remaining TODOs (different PRs):
- [ ] Introduce `is_ffmpeg_binding_available` function.
- [ ] Refactor C++ code:
  - Rename `Streamer` to `StreamReader`.
  - Rename `streamer.[h|cpp]` to `stream_reader.[h|cpp]`.
  - Rename `prototype.cpp` to `stream_reader_binding.cpp`.
  - Introduce `stream_reader` directory.
- [x] Enable FFmpeg in smoke test (https://github.com/pytorch/audio/issues/2381)

Pull Request resolved: https://github.com/pytorch/audio/pull/2378 Reviewed By: carolineechen Differential Revision: D36359299 Pulled By: mthrok fbshipit-source-id: 6a57b702996af871e577fb7addbf3522081c1328
-
- 12 Apr, 2022 1 commit
-
-
hwangjeff authored
Summary: Adds the Conformer RNN-T model as a prototype feature, by way of factory functions `conformer_rnnt_model` and `conformer_rnnt_base`, the latter of which instantiates a baseline version of the model. Also includes the following: - Modifies `Conformer` to accept arguments `use_group_norm` and `convolution_first` to pass to each of its `ConformerLayer` instances. - Makes `_Predictor` an abstract class and introduces `_EmformerEncoder` and `_ConformerEncoder`. - Introduces tests for `conformer_rnnt_model`. - Adds docs. Pull Request resolved: https://github.com/pytorch/audio/pull/2322 Reviewed By: xiaohui-zhang Differential Revision: D35565987 Pulled By: hwangjeff fbshipit-source-id: cb37bb0477ae3d5fcf0b7124f334f4cbb89b5789
-
- 08 Apr, 2022 1 commit
-
-
moto authored
Summary: Add badges of supported properties and devices to functionals and transforms. This commit adds `.. devices::` and `.. properties::` directives to sphinx. APIs with these directives will have badges (based off of shields.io) which link to the page with description of these features. Continuation of https://github.com/pytorch/audio/issues/2316 Excluded dtypes for further improvement, and actually added badges to most of functional/transforms. Pull Request resolved: https://github.com/pytorch/audio/pull/2321 Reviewed By: hwangjeff Differential Revision: D35489063 Pulled By: mthrok fbshipit-source-id: f68a70ebb22df29d5e9bd171273bd19007a81762
-
- 26 Feb, 2022 1 commit
-
-
moto authored
Summary: This commit adds a tutorial for device ASR and updates the API for device streaming. The interface changes are: 1. Add `timeout` and `backoff` parameters to the `process_packet` and `stream` methods. 2. Make the `fill_buffer` method private. When dealing with a device stream, there are situations where the device buffer is not ready and the system returns `EAGAIN`. In such cases, the previous implementation of the `process_packet` method raised an exception in the Python layer, but for device ASR this is inefficient. A better approach is to retry within the C++ layer in a blocking manner. The new `timeout` parameter serves this purpose. Pull Request resolved: https://github.com/pytorch/audio/pull/2202 Reviewed By: nateanl Differential Revision: D34475829 Pulled By: mthrok fbshipit-source-id: bb6d0b125d800f87d189db40815af06fbd4cab59
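The retry-on-`EAGAIN` behaviour described above can be sketched as a blocking loop. This is a minimal Python illustration of the general technique, not the actual C++ code from the PR: the helper `read_with_retry`, its `read_once`, `clock` and `sleep` parameters, and the use of seconds as the unit for `timeout` and `backoff` are all assumptions made for this sketch.

```python
import errno
import time


def read_with_retry(read_once, timeout=None, backoff=0.01,
                    clock=time.monotonic, sleep=time.sleep):
    """Call read_once() until it succeeds, retrying while it reports EAGAIN.

    timeout: maximum time to keep retrying (seconds in this sketch);
             None means retry indefinitely.
    backoff: time to wait between attempts (seconds in this sketch).
    """
    start = clock()
    while True:
        try:
            return read_once()
        except OSError as err:
            # Only "resource temporarily unavailable" is worth retrying;
            # any other error is re-raised to the caller unchanged.
            if err.errno != errno.EAGAIN:
                raise
        if timeout is not None and clock() - start >= timeout:
            raise TimeoutError("device buffer was not ready before the timeout")
        sleep(backoff)


if __name__ == "__main__":
    attempts = {"count": 0}

    def fake_device_read():
        # Hypothetical device: not ready for the first two attempts.
        attempts["count"] += 1
        if attempts["count"] < 3:
            raise OSError(errno.EAGAIN, "resource temporarily unavailable")
        return b"packet"

    print(read_with_retry(fake_device_read, timeout=1.0, backoff=0.01))
```

Keeping the loop inside one blocking call means the caller sees either a result or a single timeout error, rather than having to catch and retry on every not-ready condition.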
-
- 04 Feb, 2022 1 commit
-
-
Zhaoheng Ni authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2177 Reviewed By: hwangjeff Differential Revision: D33893052 Pulled By: nateanl fbshipit-source-id: 00ff011eb96662b162c0327196a9564721e9c8f7
-
- 03 Feb, 2022 1 commit
-
-
moto authored
Summary: * tutorial for streaming API https://541810-90321822-gh.circle-artifacts.com/0/docs/tutorials/streaming_api_tutorial.html * tutorial for online speech recognition with Emformer RNN-T https://541810-90321822-gh.circle-artifacts.com/0/docs/tutorials/online_asr_tutorial.html Pull Request resolved: https://github.com/pytorch/audio/pull/2193 Reviewed By: hwangjeff Differential Revision: D33971312 Pulled By: mthrok fbshipit-source-id: f9b69114255f15eaf4463ca85b3efb0ba321a95f
-
- 02 Feb, 2022 1 commit
-
-
moto authored
Summary: This PR adds the prototype streaming API. The implementation is based on ffmpeg libraries. For the detailed usage, please refer to [the resulting tutorial](https://534376-90321822-gh.circle-artifacts.com/0/docs/tutorials/streaming_api_tutorial.html). Pull Request resolved: https://github.com/pytorch/audio/pull/2164 Reviewed By: hwangjeff Differential Revision: D33934457 Pulled By: mthrok fbshipit-source-id: 92ade4aff2d25baf02c0054682d4fbdc9ba8f3fe
-
- 01 Feb, 2022 1 commit
-
-
hwangjeff authored
Summary: Missed a couple of spots in https://github.com/pytorch/audio/issues/2187. Pull Request resolved: https://github.com/pytorch/audio/pull/2189 Reviewed By: carolineechen, nateanl, mthrok Differential Revision: D33926342 Pulled By: hwangjeff fbshipit-source-id: e1324c0fe8f9be90ad3143d19cd61c3d53f02b06
-
- 29 Dec, 2021 2 commits
-
-
hwangjeff authored
Summary: Regroup RNN-T components under `torchaudio.prototype.models` and `torchaudio.prototype.pipelines`. Updated docs: https://492321-90321822-gh.circle-artifacts.com/0/docs/prototype.html Pull Request resolved: https://github.com/pytorch/audio/pull/2110 Reviewed By: carolineechen, mthrok Differential Revision: D33354116 Pulled By: hwangjeff fbshipit-source-id: 9cf4afed548cb173d56211c16d31bcfa25a8e4cb
-
moto authored
Summary:

### Change list

* Split the documentation of prototypes
* Add a new API reference section dedicated to prototypes.
* Hide the signature of the KenLMLexiconDecoder constructor. (cc carolineechen)
* https://489516-90321822-gh.circle-artifacts.com/0/docs/prototype.ctc_decoder.html#torchaudio.prototype.ctc_decoder.KenLMLexiconDecoder
* Hide the signature of the RNNT constructor. (cc hwangjeff)
* https://489516-90321822-gh.circle-artifacts.com/0/docs/prototype.rnnt.html#torchaudio.prototype.RNNT
* Tweak CTC tutorial
* Replace hyperlinks to API reference with backlinks
* Add `progress=False` to download

### Follow-up

The RNNT decoder and the CTC decoder each return their own `Hypothesis` class. When I tried to add the Hypothesis of the CTC decoder to the documentation, the build process complained that it was ambiguous. I think the Hypothesis classes could be put inside each decoder (if TorchScript supports it), or be given different names, but in that case the interface of each Hypothesis has to be generic enough.

### Before

https://pytorch.org/audio/main/prototype.html <img width="1390" alt="Screen Shot 2021-12-28 at 1 05 53 PM" src="https://user-images.githubusercontent.com/855818/147594425-6c7f8126-ab76-4edc-a616-a00901e7e9ef.png">

### After

https://489516-90321822-gh.circle-artifacts.com/0/docs/prototype.html <img width="1202" alt="Screen Shot 2021-12-28 at 8 37 35 PM" src="https://user-images.githubusercontent.com/855818/147619281-8152b1ae-e127-40b2-a944-dc11b114b629.png">

https://489516-90321822-gh.circle-artifacts.com/0/docs/prototype.rnnt.html <img width="1415" alt="Screen Shot 2021-12-28 at 8 38 27 PM" src="https://user-images.githubusercontent.com/855818/147619331-077b55b5-c5e9-47ab-bfe6-873e41c738c8.png">

https://489516-90321822-gh.circle-artifacts.com/0/docs/prototype.ctc_decoder.html <img width="1417" alt="Screen Shot 2021-12-28 at 8 39 04 PM" src="https://user-images.githubusercontent.com/855818/147619364-63df3457-a4b2-4223-973f-f4301bd45280.png">

Pull Request resolved: https://github.com/pytorch/audio/pull/2108 Reviewed By: hwangjeff, carolineechen, nateanl Differential Revision: D33340816 Pulled By: mthrok fbshipit-source-id: 870edfadbe41d6f8abaf78fdb7017b3980dfe187
-
- 28 Dec, 2021 1 commit
-
-
Caroline Chen authored
Summary: demonstrate usage of the CTC beam search decoder w/ lexicon constraint and KenLM support, on a LibriSpeech sample and using a pretrained wav2vec2 model rendered: https://485200-90321822-gh.circle-artifacts.com/0/docs/tutorials/asr_inference_with_ctc_decoder_tutorial.html follow-ups: - incorporate `nbest` - demonstrate customizability of different beam search parameters Pull Request resolved: https://github.com/pytorch/audio/pull/2106 Reviewed By: mthrok Differential Revision: D33340946 Pulled By: carolineechen fbshipit-source-id: 0ab838375d96a035d54ed5b5bd9ab4dc8d19adb7
-
- 05 Nov, 2021 4 commits
-
-
moto authored
- Add link to index page on left - Package Reference -> API Reference - Update description.
-
moto authored
-
moto authored
-
moto authored
* Refactor tutorial organization * Merge tutorial subdirectories under examples/gallery/tutorials * Do not use index.rst generated by Sphinx-gallery * Instead use flat structure so that all the tutorials are listed in left menu * Use `_assets` dir for artifacts of tutorials
-
- 04 Nov, 2021 2 commits
- 02 Nov, 2021 1 commit
-
-
yangarbiter authored
-
- 15 Oct, 2021 1 commit
-
-
moto authored
- Move wav2vec2 pretrained weights to `torchaudio.pipelines` namespace to align with #1872. - Split `Wav2Vec2PretrainedModelBundle` into `Wav2Vec2Bundle` (for pre-training model) and `Wav2Vec2ASRBundle` (for models fine-tuned for ASR). - Update base URL
-
- 06 Oct, 2021 1 commit
-
-
hwangjeff authored
Adds an implementation of Emformer, a memory-efficient transformer architecture introduced in https://ieeexplore.ieee.org/document/9414560 that targets low-latency streaming speech recognition applications.
-
- 19 Aug, 2021 1 commit
-
-
Caroline Chen authored
-
- 18 Aug, 2021 1 commit
-
-
yangarbiter authored
-
- 12 Aug, 2021 1 commit
-
-
yangarbiter authored
-
- 30 Apr, 2021 1 commit
-
-
Caroline Chen authored
Replace the prototype RNNT implementation (using warp-transducer) with one without external library dependencies
-
- 11 Jan, 2021 1 commit
-
-
Vincent QB authored
-
- 19 Oct, 2020 1 commit
-
-
Brian Johnson authored
Adds introductory context and links to the PyTorch Libraries to audio docs.
-
- 20 Jul, 2020 1 commit
-
-
moto authored
- Addresses #549 #638 #786 - Add `torchaudio` top level module doc - Separate `torchaudio` top level module doc from `index.html` - Add `backend` module doc. - Remove `-> None` from function signature as it adds noise to documentation - Changed function argument name of `torchaudio.backend.sox_io_backend.save` from `tensor` to `src`, so that it matches the rest of the backends. - Tweak a bunch of docstrings
-
- 16 Jul, 2020 1 commit
-
-
moto authored
* Add sox_utils module * Make init/shutdown thread safe * Add sox effects implementation * Add test for sox effects * Update docstrings and add examples
-
- 01 Aug, 2019 1 commit
-
-
jamarshon authored
-