- 22 Jan, 2023 1 commit
-
-
moto authored
Summary: This commit makes `StreamReader` report the PTS (presentation time stamp) of the returned chunk as well.

Example

```python
from torchaudio.io import StreamReader

s = StreamReader(...)
s.add_video_stream(...)
for (video_chunk, ) in s.stream():
    # video_chunk is a Torch tensor but has an extra PTS attribute
    print(video_chunk.pts)  # reports the PTS of the first frame of the video chunk.
```

For backward compatibility, we introduce `_ChunkTensor`, a composition of a Tensor and metadata that works like a normal tensor in PyTorch operations.

The implementation of `_ChunkTensor` is based on [TrivialTensorViaComposition](https://github.com/albanD/subclass_zoo/blob/0eeb1d68fb59879029c610bc407f2997ae43ba0a/trivial_tensors.py#L83).

It was also suggested to attach the metadata directly to the Tensor object, but the possibility of a collision between torchaudio's metadata and new attributes introduced in PyTorch cannot be ignored, so we use a Tensor subclass implementation.

If any unexpected issue arises from a metadata attribute name collision, client code can fetch the bare Tensor and continue.

Pull Request resolved: https://github.com/pytorch/audio/pull/2975
Reviewed By: hwangjeff
Differential Revision: D42526945
Pulled By: mthrok
fbshipit-source-id: b4e9422e914ff328421b975120460f3001268f35
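
Because `pts` describes only the first frame of a chunk, timestamps for the remaining frames have to be derived by the caller. The following is a minimal sketch of one way to do that, assuming the PTS is expressed in seconds and the stream has a constant frame rate (neither is stated in this commit message). `frame_timestamps` is a hypothetical helper written for illustration, not part of torchaudio.

```python
from typing import List


def frame_timestamps(chunk_pts: float, num_frames: int, frame_rate: float) -> List[float]:
    """Estimate the timestamp of every frame in a chunk.

    Assumption: `chunk_pts` is the PTS of the first frame, in seconds, and
    frames are evenly spaced at `frame_rate` frames per second.
    """
    if frame_rate <= 0:
        raise ValueError("frame_rate must be positive")
    # Frame i is presented i / frame_rate seconds after the first frame.
    return [chunk_pts + i / frame_rate for i in range(num_frames)]


# Example: a 4-frame chunk whose first frame is presented at 1.0 s, at 4 fps.
print(frame_timestamps(1.0, 4, 4.0))  # [1.0, 1.25, 1.5, 1.75]
```

With a real stream, `chunk_pts` would come from `video_chunk.pts`, and `num_frames` would presumably be the size of the chunk's leading (time) dimension. Streams with a variable frame rate would need per-frame timestamps from the decoder instead of this estimate.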
-
- 13 Jan, 2023 1 commit
-
-
Zhaoheng Ni authored
Summary: XLSR (cross-lingual speech representation) is a set of cross-lingual self-supervised learning models for generating cross-lingual speech representations. It was first proposed in https://arxiv.org/pdf/2006.13979.pdf, with a model trained on 53 languages (the so-called XLSR-53). This PR adds support for more XLS-R models from https://arxiv.org/pdf/2111.09296.pdf, which have more parameters (300M, 1B, 2B) and are trained on 128 languages.

Pull Request resolved: https://github.com/pytorch/audio/pull/2959
Reviewed By: mthrok
Differential Revision: D42397643
Pulled By: nateanl
fbshipit-source-id: 23e8e51a7cde0a226db4f4028db7df8f02b986ce
-
- 10 Dec, 2022 1 commit
-
-
moto authored
Summary: Currently, the documentation page for `torchaudio.models` has separate sections for model definitions and factory functions, so the relationship between a model and its factory functions is not immediately clear.

This commit moves the factory functions into the list of models, next to the models they build.

After:

- https://output.circle-artifacts.com/output/job/242a9521-7460-4043-895b-9995bf5093b5/artifacts/0/docs/generated/torchaudio.models.Wav2Vec2Model.html

<img width="1171" alt="Screen Shot 2022-12-08 at 8 41 03 PM" src="https://user-images.githubusercontent.com/855818/206603743-74a6e368-c3cf-4b87-b854-518a95893f06.png">

Pull Request resolved: https://github.com/pytorch/audio/pull/2902
Reviewed By: carolineechen
Differential Revision: D41897800
Pulled By: mthrok
fbshipit-source-id: a3c01d28d80e755596a9bc37c951960eb84870b9
-
- 13 Oct, 2022 1 commit
-
-
moto authored
Summary:

* Document `__call__` instead of `__init__`
* List CTCHypothesis first as it is used in combination with CTCDecoder
* Fix indentation of score method docstring

Pull Request resolved: https://github.com/pytorch/audio/pull/2766
Reviewed By: carolineechen
Differential Revision: D40349388
Pulled By: mthrok
fbshipit-source-id: 5e512e6c2b29d3533eb62d09b289154ccd1abf4c
-
- 03 Oct, 2022 1 commit
-
-
moto authored
Summary: Adopt `:autosummary:` in the documentation of various modules:

* torchaudio.compliance.kaldi
* torchaudio.sox_effects
* torchaudio.utils

Pull Request resolved: https://github.com/pytorch/audio/pull/2664
Reviewed By: nateanl
Differential Revision: D39841873
Pulled By: mthrok
fbshipit-source-id: ff4fa6976324fca5f35b737b715f976e2a722bac
-
- 22 Sep, 2022 1 commit
-
-
moto authored
Summary:

* Introduce the mini-index at `torchaudio.datasets` page.
* Standardize the format of return type docstring.

https://output.circle-artifacts.com/output/job/989328b2-0270-4958-b577-19cf749af3fd/artifacts/0/docs/datasets.html

<img width="936" alt="Screen Shot 2022-09-21 at 6 56 52 PM" src="https://user-images.githubusercontent.com/855818/191475141-a97f2bea-705f-49bc-8c34-6ec869e76793.png">

https://output.circle-artifacts.com/output/job/989328b2-0270-4958-b577-19cf749af3fd/artifacts/0/docs/generated/torchaudio.datasets.CMUDict.html#torchaudio.datasets.CMUDict

<img width="1069" alt="Screen Shot 2022-09-21 at 6 57 32 PM" src="https://user-images.githubusercontent.com/855818/191475293-e3302528-27ea-4212-9c12-fd6d900fdf3e.png">

Pull Request resolved: https://github.com/pytorch/audio/pull/2692
Reviewed By: carolineechen
Differential Revision: D39687463
Pulled By: mthrok
fbshipit-source-id: 4175fc15388817d2fe76206188618dd1576281df
-
- 21 Sep, 2022 2 commits
-
-
moto authored
Summary:

* Introduce the mini-index at `torchaudio.pipelines` page.
* Add introductions
* Update pipeline tutorials

https://output.circle-artifacts.com/output/job/ccc57d95-1930-45c9-b967-c8d477d35f29/artifacts/0/docs/pipelines.html

<img width="1163" alt="Screen Shot 2022-09-20 at 1 23 29 PM" src="https://user-images.githubusercontent.com/855818/191167049-98324e93-2e16-41db-8538-3b5b54eb8224.png">

<img width="1115" alt="Screen Shot 2022-09-20 at 1 23 49 PM" src="https://user-images.githubusercontent.com/855818/191167071-4770f594-2540-43a4-a01c-e983bf59220f.png">

https://output.circle-artifacts.com/output/job/ccc57d95-1930-45c9-b967-c8d477d35f29/artifacts/0/docs/generated/torchaudio.pipelines.RNNTBundle.html#torchaudio.pipelines.RNNTBundle

<img width="1108" alt="Screen Shot 2022-09-20 at 1 24 18 PM" src="https://user-images.githubusercontent.com/855818/191167123-51b33a5f-c30c-46bc-b002-b05d2d0d27b7.png">

Pull Request resolved: https://github.com/pytorch/audio/pull/2689
Reviewed By: carolineechen
Differential Revision: D39691253
Pulled By: mthrok
fbshipit-source-id: ddf5fdadb0b64cf2867b6271ba53e8e8c0fa7e49
-
moto authored
Summary:

* Introduce the mini-index at `torchaudio.models` page.

https://output.circle-artifacts.com/output/job/25e59810-3866-4ece-b1b7-8a10c7a2286d/artifacts/0/docs/models.html

<img width="1042" alt="Screen Shot 2022-09-20 at 1 20 50 PM" src="https://user-images.githubusercontent.com/855818/191166816-83314ad1-8b67-475b-aa10-d4cc59126295.png">

<img width="1048" alt="Screen Shot 2022-09-20 at 1 20 58 PM" src="https://user-images.githubusercontent.com/855818/191166829-1ceb65e0-9506-4328-9a2f-8b75b4e54404.png">

Pull Request resolved: https://github.com/pytorch/audio/pull/2690
Reviewed By: carolineechen
Differential Revision: D39654948
Pulled By: mthrok
fbshipit-source-id: 703d1526617596f647c85a7148f41ca55fffdbc8
-
- 16 Sep, 2022 3 commits
-
-
moto authored
Summary:

* Introduce the mini-index at `torchaudio.transforms` page.
* Add "Augmentations" subsection.
* Also updated the overall introduction.

https://output.circle-artifacts.com/output/job/1b65246a-403c-4d2c-b97d-d1b582d8b4e5/artifacts/0/docs/transforms.html

<img width="721" alt="Screen Shot 2022-09-16 at 5 20 08 PM" src="https://user-images.githubusercontent.com/855818/190591795-97c169db-a95b-480a-8d3c-d80072efa045.png">

<img width="755" alt="Screen Shot 2022-09-16 at 5 20 28 PM" src="https://user-images.githubusercontent.com/855818/190591828-03026918-febd-4194-91aa-7d8f704e17cc.png">

Pull Request resolved: https://github.com/pytorch/audio/pull/2683
Reviewed By: carolineechen
Differential Revision: D39574255
Pulled By: mthrok
fbshipit-source-id: a4beed7cacbb5184bad96efa903a3a1123dab627
-
moto authored
Summary:

* Adopt `:autosummary:` in the decoder module doc.
* Hide the constructor signature of `CTCDecoder`, because the `ctc_decoder` function is what client code is supposed to use.
* Introduce a `children` property on `CTCDecoderLMState`; otherwise it does not show up in the doc.

https://output.circle-artifacts.com/output/job/7aac5eb9-7d2d-4f63-bcdf-83a6f40b4e5a/artifacts/0/docs/models.decoder.html

<img width="748" alt="Screen Shot 2022-09-16 at 5 23 22 PM" src="https://user-images.githubusercontent.com/855818/190592409-0c2ec8a4-d2cf-4d76-a965-8a570faaeb1a.png">

https://output.circle-artifacts.com/output/job/7aac5eb9-7d2d-4f63-bcdf-83a6f40b4e5a/artifacts/0/docs/generated/torchaudio.models.decoder.CTCDecoder.html#torchaudio.models.decoder.CTCDecoder

<img width="723" alt="Screen Shot 2022-09-16 at 5 23 53 PM" src="https://user-images.githubusercontent.com/855818/190592501-3fad1e07-ae3e-44f5-93be-f33181025390.png">

Pull Request resolved: https://github.com/pytorch/audio/pull/2684
Reviewed By: carolineechen
Differential Revision: D39574272
Pulled By: mthrok
fbshipit-source-id: d977660bd46f5cf98c535adbf2735be896b28773
-
moto authored
Summary: This commit adopts the `:autosummary:` directive in the `torchaudio.io` module documentation. It adds a table of contents at the `torchaudio.io` level.

https://output.circle-artifacts.com/output/job/282089d1-c120-4d22-809f-0e0ac0947c37/artifacts/0/docs/io.html

<img width="1094" alt="Screen Shot 2022-09-16 at 7 33 32 AM" src="https://user-images.githubusercontent.com/855818/190520248-27e469f8-7689-4dc2-b591-7b3f08bb4dff.png">

https://output.circle-artifacts.com/output/job/282089d1-c120-4d22-809f-0e0ac0947c37/artifacts/0/docs/generated/torchaudio.io.StreamReader.html#torchaudio.io.StreamReader

<img width="1108" alt="Screen Shot 2022-09-16 at 7 33 59 AM" src="https://user-images.githubusercontent.com/855818/190520292-d090fed0-2f18-4961-b9f3-9e4808fd437e.png">

Pull Request resolved: https://github.com/pytorch/audio/pull/2681
Reviewed By: carolineechen
Differential Revision: D39560459
Pulled By: mthrok
fbshipit-source-id: 3de5f22b8d8d0834dfd8bac8619fbfaa44c5f4dd
-