torchaudio
==========

I/O
---

The top-level ``torchaudio`` module provides the following functions for
handling audio data.

- :py:func:`torchaudio.info`
- :py:func:`torchaudio.load`
- :py:func:`torchaudio.save`

Under the hood, these functions are implemented on top of several decoding/encoding
libraries. There are currently three backends.

- ``FFmpeg``
- ``libsox``
- ``SoundFile``

The ``libsox`` backend was the first backend implemented in TorchAudio. It
works on Linux and macOS.
The ``SoundFile`` backend was added to extend audio I/O support to Windows.
It also works on Linux and macOS.
The ``FFmpeg`` backend is the latest addition. It supports a wide range of audio and video
formats and protocols.
It works on Linux, macOS and Windows.

.. _dispatcher_migration:

Introduction of Dispatcher
~~~~~~~~~~~~~~~~~~~~~~~~~~

Historically, torchaudio has set its I/O backend globally at runtime, based on availability.
However, this approach does not allow an application to use different
backends for different calls, and it is not well suited to large codebases.

For these reasons, we are introducing a dispatcher, a new mechanism that allows users to
choose a backend for each function call, and we are migrating the I/O functions to it.
The migration involves several changes, some of which break backward compatibility and require
users to change their function calls.

The (planned) changes are as follows. For up-to-date information,
please refer to https://github.com/pytorch/audio/issues/2950

* In 2.0, the audio I/O backend dispatcher was introduced.
  Users can opt in to the dispatcher by setting the environment variable
  ``TORCHAUDIO_USE_BACKEND_DISPATCHER=1``.
* In 2.1, the dispatcher becomes the default mechanism for I/O.
  Those who need to keep using the previous mechanism (global backend) can do
  so by setting ``TORCHAUDIO_USE_BACKEND_DISPATCHER=0``.
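
As a minimal sketch of the opt-in described above, the flag can also be set from Python.
This assumes that torchaudio reads the variable when it is first imported, so the
assignment has to happen before ``import torchaudio``. The helper name
``set_dispatcher_flag`` is hypothetical.

.. code-block:: python

   import os


   def set_dispatcher_flag(enabled: bool) -> None:
       """Set the dispatcher flag for this process (hypothetical helper)."""
       # "1" opts in to the dispatcher, "0" keeps the global-backend mechanism.
       os.environ["TORCHAUDIO_USE_BACKEND_DISPATCHER"] = "1" if enabled else "0"


   set_dispatcher_flag(True)
   # Import torchaudio only after this point (assumption: the flag is read at
   # import time), for example:
   # import torchaudio

Setting the variable in the shell before starting Python has the same effect and
avoids depending on import order.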

Furthermore, we are removing file-like object support from the libsox backend, as this
is better supported by the FFmpeg backend and its removal simplifies the build process.
Therefore, beginning with 2.1, FFmpeg and SoundFile are the only backends that support file-like objects.
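
To illustrate file-like input, the sketch below builds a short WAV file in memory with
the standard library and passes it to ``torchaudio.load`` as an ``io.BytesIO`` object.
The helper names are hypothetical, the backend name ``"soundfile"`` is an assumption,
and the call assumes the dispatcher is enabled so that the ``backend`` argument is accepted.

.. code-block:: python

   import io
   import math
   import struct
   import wave


   def make_wav_bytes(sample_rate: int = 8000, num_frames: int = 800) -> bytes:
       """Return a mono 16-bit PCM WAV file holding a 440 Hz sine tone."""
       samples = [
           int(16383 * math.sin(2 * math.pi * 440.0 * n / sample_rate))
           for n in range(num_frames)
       ]
       buffer = io.BytesIO()
       with wave.open(buffer, "wb") as writer:
           writer.setnchannels(1)
           writer.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
           writer.setframerate(sample_rate)
           writer.writeframes(struct.pack(f"<{num_frames}h", *samples))
       return buffer.getvalue()


   def load_from_memory(data: bytes, backend: str = "soundfile"):
       """Decode in-memory audio via a file-like object (hypothetical helper)."""
       import torchaudio  # imported lazily; requires torchaudio to be installed

       # ``format`` is given explicitly because a file-like object has no
       # file extension to infer it from.
       return torchaudio.load(io.BytesIO(data), format="wav", backend=backend)

Passing ``backend="sox"`` here would be expected to fail from 2.1 onward, since the
libsox backend no longer accepts file-like objects.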

The changes in 2.1 will mark the :ref:`backend utilities <backend_utils>` as deprecated.

Current API
-----------

I/O functionalities
~~~~~~~~~~~~~~~~~~~

Audio I/O functions are implemented in the :ref:`torchaudio.backend<backend>` module, but for ease of use, the following functions are also available in the :mod:`torchaudio` module. Several backends are available, and you can switch between them with :func:`set_audio_backend`.


Please refer to :ref:`backend` for details, and to the :doc:`Audio I/O tutorial <../tutorials/audio_io_tutorial>` for usage.
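
A typical round trip through the three functions might look like the following sketch.
The file paths are placeholders supplied by the caller, and ``roundtrip`` is a
hypothetical helper rather than part of torchaudio.

.. code-block:: python

   def roundtrip(src_path: str, dst_path: str):
       """Inspect, load, and re-save an audio file (hypothetical helper)."""
       import torchaudio  # imported lazily; requires torchaudio to be installed

       # Fetch metadata such as the sample rate and the number of frames.
       meta = torchaudio.info(src_path)

       # By default the waveform is a Tensor shaped (channels, frames).
       waveform, sample_rate = torchaudio.load(src_path)

       # Write the samples back out at the same sample rate.
       torchaudio.save(dst_path, waveform, sample_rate)
       return meta.sample_rate, meta.num_frames, tuple(waveform.shape)

Returning the metadata next to the tensor shape makes it easy to check that the
decoded frame count matches what ``torchaudio.info`` reported.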


torchaudio.info
~~~~~~~~~~~~~~~

.. function:: torchaudio.info(filepath: str, ...)

   Fetch metadata of an audio file. Refer to :ref:`backend` for details.

torchaudio.load
~~~~~~~~~~~~~~~

.. function:: torchaudio.load(filepath: str, ...)

   Load an audio file into a ``torch.Tensor`` object. Refer to :ref:`backend` for details.

torchaudio.save
~~~~~~~~~~~~~~~

.. function:: torchaudio.save(filepath: str, src: torch.Tensor, sample_rate: int, ...)

   Save a ``torch.Tensor`` object to an audio file. Refer to :ref:`backend` for details.

.. currentmodule:: torchaudio

.. _backend_utils:

Backend Utilities
~~~~~~~~~~~~~~~~~

The following functions take effect only when the backend dispatcher is disabled.
They are effectively deprecated.

.. autofunction:: list_audio_backends

.. autofunction:: get_audio_backend

.. autofunction:: set_audio_backend
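
With the dispatcher disabled, the global backend can be inspected and changed as in
the sketch below. ``use_global_backend`` is a hypothetical helper, and the backend
name in the usage comment is an assumption about what is installed.

.. code-block:: python

   def use_global_backend(name: str):
       """Switch the process-wide I/O backend (hypothetical helper)."""
       import torchaudio  # imported lazily; requires torchaudio to be installed

       # Only backends whose libraries are installed are listed.
       available = torchaudio.list_audio_backends()
       if name not in available:
           raise ValueError(f"{name!r} is not available; choose from {available}")

       # This affects every later I/O call in the process.
       torchaudio.set_audio_backend(name)
       return torchaudio.get_audio_backend()


   # Example (assumes the SoundFile backend is installed):
   # use_global_backend("soundfile")

Because the setting is global, code that relies on it can be affected by any other
module that calls ``set_audio_backend``, which is the limitation the dispatcher addresses.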

.. _future_api:

Future API
----------

Dispatcher
~~~~~~~~~~

The dispatcher tries to use the I/O backends in the following order of precedence:

1. FFmpeg
2. libsox
3. SoundFile

You can pass the ``backend`` argument to the I/O functions to override this.
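
The selection rule can be pictured with the following pure-Python sketch. It is not
torchaudio's implementation. The lowercase names ``"ffmpeg"``, ``"sox"``, and
``"soundfile"`` and the ``select_backend`` helper are assumptions made for illustration.

.. code-block:: python

   # Order of precedence described above (names are illustrative assumptions).
   PRECEDENCE = ("ffmpeg", "sox", "soundfile")


   def select_backend(available, requested=None):
       """Pick a backend name from ``available`` (hypothetical helper)."""
       if requested is not None:
           # An explicit choice overrides the precedence order.
           if requested not in available:
               raise RuntimeError(f"Backend {requested!r} is not available")
           return requested
       # Otherwise take the first installed backend in precedence order.
       for name in PRECEDENCE:
           if name in available:
               return name
       raise RuntimeError("No audio I/O backend is available")


   # Without FFmpeg installed, libsox wins over SoundFile.
   default_choice = select_backend({"sox", "soundfile"})
   # An explicit request bypasses the ordering.
   explicit_choice = select_backend({"ffmpeg", "soundfile"}, requested="soundfile")

With the dispatcher enabled, the override corresponds to passing the keyword directly
to an I/O function, for example ``torchaudio.load(path, backend="soundfile")``.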

See :ref:`future_api` for details on the new API.

In the next release, each of ``torchaudio.info``, ``torchaudio.load``, and ``torchaudio.save`` will allow selecting a backend via the ``backend`` parameter.
The functions will support FFmpeg, SoX, and SoundFile, provided that the corresponding library is installed.
If a backend is not explicitly chosen, the functions will select one based on the order of precedence (FFmpeg, SoX, SoundFile) and library availability.

Note that only FFmpeg and SoundFile will support file-like objects.

These functions can be enabled in the current release by setting the environment variable ``TORCHAUDIO_USE_BACKEND_DISPATCHER=1``.

.. currentmodule:: torchaudio._backend

torchaudio.info
~~~~~~~~~~~~~~~

.. autofunction:: info
   :noindex:

torchaudio.load
~~~~~~~~~~~~~~~

.. autofunction:: load
   :noindex:

torchaudio.save
~~~~~~~~~~~~~~~

.. autofunction:: save
   :noindex: