Torchaudio Documentation
========================

Torchaudio is a library for audio and signal processing with PyTorch.
It provides I/O, signal and data processing functions, datasets,
model implementations and application components.

Features described in this documentation are classified by release status:

  *Stable:*  These features will be maintained long-term and there should generally
  be no major performance limitations or gaps in documentation.
  We also expect to maintain backwards compatibility (although
  breaking changes can happen and notice will be given one release ahead
  of time).

  *Beta:*  Features are tagged as Beta because the API may change based on
  user feedback, because the performance needs to improve, or because
  coverage across operators is not yet complete. For Beta features, we are
  committing to seeing the feature through to the Stable classification.
  We are not, however, committing to backwards compatibility.

  *Prototype:*  These features are typically not available as part of
  binary distributions like PyPI or Conda, except sometimes behind run-time
  flags, and are at an early stage for feedback and testing.

.. toctree::
   :maxdepth: 1
   :hidden:
   :caption: Torchaudio Documentation

   Index <self>
   supported_features

API References
--------------

.. toctree::
   :maxdepth: 1
   :caption: API Reference

   torchaudio
   io
   backend
   functional
   transforms
   datasets
   models
   models.decoder
   pipelines
   sox_effects
   compliance.kaldi
   kaldi_io
   utils

Prototype API References
------------------------

.. toctree::
   :maxdepth: 1
   :caption: Prototype API Reference

   prototype
   prototype.ctc_decoder
   prototype.functional
   prototype.models
   prototype.pipelines

Getting Started
---------------
    
.. toctree::
   :maxdepth: 1
   :caption: Getting Started

   tutorials/audio_io_tutorial
   tutorials/streaming_api_tutorial
   tutorials/streaming_api2_tutorial
   tutorials/audio_resampling_tutorial
   tutorials/audio_data_augmentation_tutorial
   tutorials/audio_feature_extractions_tutorial
   tutorials/audio_feature_augmentation_tutorial
   tutorials/audio_datasets_tutorial

Advanced Usages
---------------

.. toctree::
   :maxdepth: 1
   :caption: Advanced Usages

   hw_acceleration_tutorial
   tutorials/speech_recognition_pipeline_tutorial
   tutorials/online_asr_tutorial
   tutorials/device_asr
   tutorials/forced_alignment_tutorial
   tutorials/tacotron2_pipeline_tutorial
   tutorials/mvdr_tutorial
   tutorials/asr_inference_with_ctc_decoder_tutorial
   tutorials/hybrid_demucs_tutorial

Citing torchaudio
-----------------

If you find torchaudio useful, please cite the following paper:

- Yang, Y.-Y., Hira, M., Ni, Z., Chourdia, A., Astafurov, A., Chen, C., Yeh, C.-F., Puhrsch, C.,
  Pollack, D., Genzel, D., Greenberg, D., Yang, E. Z., Lian, J., Mahadeokar, J., Hwang, J.,
  Chen, J., Goldsborough, P., Roy, P., Narenthiran, S., Watanabe, S., Chintala, S.,
  Quenneville-Bélair, V., & Shi, Y. (2021).
  TorchAudio: Building Blocks for Audio and Speech Processing. arXiv preprint arXiv:2110.15018.


In BibTeX format:

.. code-block:: bibtex

    @article{yang2021torchaudio,
      title={TorchAudio: Building Blocks for Audio and Speech Processing},
      author={Yao-Yuan Yang and Moto Hira and Zhaoheng Ni and
              Anjali Chourdia and Artyom Astafurov and Caroline Chen and
              Ching-Feng Yeh and Christian Puhrsch and David Pollack and
              Dmitriy Genzel and Donny Greenberg and Edward Z. Yang and
              Jason Lian and Jay Mahadeokar and Jeff Hwang and Ji Chen and
              Peter Goldsborough and Prabhat Roy and Sean Narenthiran and
              Shinji Watanabe and Soumith Chintala and
              Vincent Quenneville-Bélair and Yangyang Shi},
      journal={arXiv preprint arXiv:2110.15018},
      year={2021}
    }

.. toctree::
   :maxdepth: 1
   :caption: PyTorch Libraries
   :hidden:

   PyTorch <https://pytorch.org/docs>
   torchaudio <https://pytorch.org/audio>
   torchtext <https://pytorch.org/text>
   torchvision <https://pytorch.org/vision>
   TorchElastic <https://pytorch.org/elastic/>
   TorchServe <https://pytorch.org/serve>
   PyTorch on XLA Devices <https://pytorch.org/xla/>