torchaudio
==========

This library is part of the `PyTorch
<http://pytorch.org/>`_ project. PyTorch is an open source
machine learning framework.

Features described in this documentation are classified by release status:

  *Stable:*  These features will be maintained long-term and there should generally
  be no major performance limitations or gaps in documentation.
  We also expect to maintain backwards compatibility (although
  breaking changes can happen and notice will be given one release ahead
  of time).

  *Beta:*  Features are tagged as Beta because the API may change based on
  user feedback, because the performance needs to improve, or because
  coverage across operators is not yet complete. For Beta features, we are
  committing to seeing the feature through to the Stable classification.
  We are not, however, committing to backwards compatibility.

  *Prototype:*  These features are typically not available as part of
  binary distributions like PyPI or Conda, except sometimes behind run-time
  flags, and are at an early stage for feedback and testing.

The :mod:`torchaudio` package consists of audio I/O functions, popular datasets, and common audio transformations.
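
For instance, a typical workflow loads an audio file with the I/O functions and applies a
transformation from :mod:`torchaudio.transforms`. The sketch below is a minimal illustration;
the file path ``speech.wav`` is a placeholder.

.. code-block:: python

    import torchaudio

    # Load an audio file into a (channel, time) tensor along with its sample rate.
    waveform, sample_rate = torchaudio.load("speech.wav")

    # Apply a common transformation, e.g. a mel-scale spectrogram.
    transform = torchaudio.transforms.MelSpectrogram(sample_rate=sample_rate)
    mel_spectrogram = transform(waveform)  # shape: (channel, n_mels, time)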

.. toctree::
   :maxdepth: 2
   :caption: Package Reference

   torchaudio
   backend
   functional
   transforms
   datasets
   models
   pipelines
   sox_effects
   compliance.kaldi
   kaldi_io
   utils
   prototype

.. toctree::
   :maxdepth: 2
   :caption: Tutorials

   tutorials/speech_recognition_pipeline_tutorial
   tutorials/forced_alignment_tutorial
   tutorials/tacotron2_pipeline_tutorial

.. toctree::
   :maxdepth: 1
   :caption: PyTorch Libraries

   PyTorch <https://pytorch.org/docs>
   torchaudio <https://pytorch.org/audio>
   torchtext <https://pytorch.org/text>
   torchvision <https://pytorch.org/vision>
   TorchElastic <https://pytorch.org/elastic/>
   TorchServe <https://pytorch.org/serve>
   PyTorch on XLA Devices <http://pytorch.org/xla/>


Citing torchaudio
~~~~~~~~~~~~~~~~~

If you find torchaudio useful, please cite the following paper:

- Yang, Y.-Y., Hira, M., Ni, Z., Chourdia, A., Astafurov, A., Chen, C., Yeh, C.-F., Puhrsch, C.,
  Pollack, D., Genzel, D., Greenberg, D., Yang, E. Z., Lian, J., Mahadeokar, J., Hwang, J.,
  Chen, J., Goldsborough, P., Roy, P., Narenthiran, S., Watanabe, S., Chintala, S.,
  Quenneville-Bélair, V., & Shi, Y. (2021).
  TorchAudio: Building Blocks for Audio and Speech Processing. arXiv preprint arXiv:2110.15018.


In BibTeX format:

.. code-block:: bibtex

    @article{yang2021torchaudio,
      title={TorchAudio: Building Blocks for Audio and Speech Processing},
      author={Yao-Yuan Yang and Moto Hira and Zhaoheng Ni and Anjali Chourdia and Artyom Astafurov
              and Caroline Chen and Ching-Feng Yeh and Christian Puhrsch and David Pollack and
              Dmitriy Genzel and Donny Greenberg and Edward Z. Yang and Jason Lian and Jay
              Mahadeokar and Jeff Hwang and Ji Chen and Peter Goldsborough and Prabhat Roy and
              Sean Narenthiran and Shinji Watanabe and Soumith Chintala and Vincent
              Quenneville-Bélair and Yangyang Shi},
      journal={arXiv preprint arXiv:2110.15018},
      year={2021}
    }