Commit 0f8014f5
Authored Oct 29, 2021 by Caroline Chen; committed by GitHub, Oct 29, 2021
Improve backend and transforms docs (#1944)
parent f2dff4d4
Showing 5 changed files with 13 additions and 10 deletions
docs/source/backend.rst +1 -1
torchaudio/backend/common.py +1 -0
torchaudio/backend/soundfile_backend.py +3 -2
torchaudio/backend/sox_io_backend.py +3 -1
torchaudio/transforms.py +5 -6
docs/source/backend.rst
@@ -38,7 +38,7 @@ AudioMetaData
Sox IO Backend
Sox IO Backend
~~~~~~~~~~~~~~
~~~~~~~~~~~~~~
The ``"sox_io"`` backend is available and default on Linux/macOS and not available on Windows.
The ``sox_io`` backend is available and default on Linux/macOS and not available on Windows.
I/O functions of this backend support `TorchScript <https://pytorch.org/docs/stable/jit.html>`_.
I/O functions of this backend support `TorchScript <https://pytorch.org/docs/stable/jit.html>`_.
...
...
torchaudio/backend/common.py
@@ -23,6 +23,7 @@ class AudioMetaData:
* ``AMR_WB``: Adaptive Multi-Rate
* ``AMR_WB``: Adaptive Multi-Rate
* ``AMR_NB``: Adaptive Multi-Rate Wideband
* ``AMR_NB``: Adaptive Multi-Rate Wideband
* ``OPUS``: Opus
* ``OPUS``: Opus
* ``HTK``: Single channel 16-bit PCM
* ``UNKNOWN`` : None of above
* ``UNKNOWN`` : None of above
"""
"""
def __init__(
def __init__(
...
...
torchaudio/backend/soundfile_backend.py
@@ -376,8 +376,9 @@ def save(
- 8-bit mu-law
- 8-bit mu-law
- 8-bit a-law
- 8-bit a-law
Note: Default encoding/bit depth is determined by the dtype of
the input Tensor.
Note:
Default encoding/bit depth is determined by the dtype of
the input Tensor.
``"flac"``
``"flac"``
- 8-bit
- 8-bit
...
...
torchaudio/backend/sox_io_backend.py
@@ -215,8 +215,9 @@ def save(
``"wav"``, ``"amb"``
``"wav"``, ``"amb"``
- | If both ``encoding`` and ``bits_per_sample`` are not provided, the ``dtype`` of the
- | If both ``encoding`` and ``bits_per_sample`` are not provided, the ``dtype`` of the
| Tensor is used to determine the default value.
| Tensor is used to determine the default value.
- ``"PCM_U"`` if dtype is ``uint8``
- ``"PCM_U"`` if dtype is ``uint8``
- ``"PCM_S"`` if dtype is ``int16`` or ``int32`
- ``"PCM_S"`` if dtype is ``int16`` or ``int32``
- ``"PCM_F"`` if dtype is ``float32``
- ``"PCM_F"`` if dtype is ``float32``
- ``"PCM_U"`` if ``bits_per_sample=8``
- ``"PCM_U"`` if ``bits_per_sample=8``
@@ -235,6 +236,7 @@ def save(
``"wav"``, ``"amb"``;
``"wav"``, ``"amb"``;
- | If both ``encoding`` and ``bits_per_sample`` are not provided, the ``dtype`` of the
- | If both ``encoding`` and ``bits_per_sample`` are not provided, the ``dtype`` of the
| Tensor is used.
| Tensor is used.
- ``8`` if dtype is ``uint8``
- ``8`` if dtype is ``uint8``
- ``16`` if dtype is ``int16``
- ``16`` if dtype is ``int16``
- ``32`` if dtype is ``int32`` or ``float32``
- ``32`` if dtype is ``int32`` or ``float32``
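The dtype-based defaults listed in the two ``save`` docstring hunks above can be summarized as a lookup table. This is a minimal sketch in plain Python, not torchaudio code: ``default_wav_format`` and ``_DEFAULTS`` are hypothetical names introduced for illustration, dtypes are represented by their names as strings, and the table covers only the case where neither ``encoding`` nor ``bits_per_sample`` is provided.

```python
# Hypothetical helper (not part of torchaudio): mirrors the defaults the
# docstrings list for "wav"/"amb" when neither `encoding` nor
# `bits_per_sample` is given.
_DEFAULTS = {
    "uint8": ("PCM_U", 8),
    "int16": ("PCM_S", 16),
    "int32": ("PCM_S", 32),
    "float32": ("PCM_F", 32),
}


def default_wav_format(dtype_name):
    """Return (encoding, bits_per_sample) for a Tensor dtype name."""
    try:
        return _DEFAULTS[dtype_name]
    except KeyError:
        raise ValueError(f"unsupported dtype: {dtype_name}") from None


print(default_wav_format("int16"))
```

Keeping both defaults in one table makes it easy to see that ``int32`` and ``float32`` share a bit depth but differ in encoding.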
...
...
torchaudio/transforms.py
@@ -296,7 +296,7 @@ class AmplitudeToDB(torch.nn.Module):
a full clip.
a full clip.
Args:
Args:
stype (str, optional): scale of input tensor ('power' or 'magnitude'). The
stype (str, optional): scale of input tensor (``'power'`` or ``'magnitude'``). The
power being the elementwise square of the magnitude. (Default: ``'power'``)
power being the elementwise square of the magnitude. (Default: ``'power'``)
top_db (float or None, optional): minimum negative cut-off in decibels. A reasonable
top_db (float or None, optional): minimum negative cut-off in decibels. A reasonable
number is 80. (Default: ``None``)
number is 80. (Default: ``None``)
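To illustrate how ``stype`` and ``top_db`` interact, here is a rough pure-Python sketch of a decibel conversion. It is an assumption-laden illustration rather than the ``AmplitudeToDB`` implementation: the function name ``amplitude_to_db``, the ``amin`` floor of ``1e-10``, and the list-based interface are all chosen for this example. It assumes a multiplier of 10 for power input and 20 for magnitude input, and treats ``top_db`` as a cut-off measured down from the largest value.

```python
import math


def amplitude_to_db(values, stype="power", top_db=None, amin=1e-10):
    """Convert a list of non-negative values to decibels (illustrative only)."""
    # Power is the elementwise square of the magnitude, so magnitude input
    # needs twice the multiplier to land on the same dB scale.
    multiplier = 10.0 if stype == "power" else 20.0
    db = [multiplier * math.log10(max(v, amin)) for v in values]
    if top_db is not None:
        # Nothing may fall more than `top_db` below the peak.
        floor = max(db) - top_db
        db = [max(d, floor) for d in db]
    return db


print(amplitude_to_db([1.0, 100.0]))
print(amplitude_to_db([1.0, 1e-12], top_db=80.0))
```

With ``top_db=80.0``, a value far below the peak is raised to 80 dB under it instead of dominating the dynamic range.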
@@ -332,15 +332,13 @@ class MelScale(torch.nn.Module):
r"""Turn a normal STFT into a mel frequency STFT, using a conversion
r"""Turn a normal STFT into a mel frequency STFT, using a conversion
matrix. This uses triangular filter banks.
matrix. This uses triangular filter banks.
User can control which device the filter bank (`fb`) is (e.g. fb.to(spec_f.device)).
Args:
Args:
n_mels (int, optional): Number of mel filterbanks. (Default: ``128``)
n_mels (int, optional): Number of mel filterbanks. (Default: ``128``)
sample_rate (int, optional): Sample rate of audio signal. (Default: ``16000``)
sample_rate (int, optional): Sample rate of audio signal. (Default: ``16000``)
f_min (float, optional): Minimum frequency. (Default: ``0.``)
f_min (float, optional): Minimum frequency. (Default: ``0.``)
f_max (float or None, optional): Maximum frequency. (Default: ``sample_rate // 2``)
f_max (float or None, optional): Maximum frequency. (Default: ``sample_rate // 2``)
n_stft (int, optional): Number of bins in STFT. See ``n_fft`` in :class:`Spectrogram`. (Default: ``201``)
n_stft (int, optional): Number of bins in STFT. See ``n_fft`` in :class:`Spectrogram`. (Default: ``201``)
norm (str or None, optional): If 'slaney', divide the triangular mel weights by the width of the mel band
norm (str or None, optional): If ``'slaney'``, divide the triangular mel weights by the width of the mel band
(area normalization). (Default: ``None``)
(area normalization). (Default: ``None``)
mel_scale (str, optional): Scale to use: ``htk`` or ``slaney``. (Default: ``htk``)
mel_scale (str, optional): Scale to use: ``htk`` or ``slaney``. (Default: ``htk``)
@@ -795,7 +793,7 @@ class MuLawDecoding(torch.nn.Module):
r"""Decode mu-law encoded signal. For more info see the
r"""Decode mu-law encoded signal. For more info see the
`Wikipedia Entry <https://en.wikipedia.org/wiki/%CE%9C-law_algorithm>`_
`Wikipedia Entry <https://en.wikipedia.org/wiki/%CE%9C-law_algorithm>`_
This expects an input with values between 0 and quantization_channels - 1
This expects an input with values between 0 and ``quantization_channels - 1``
and returns a signal scaled between -1 and 1.
and returns a signal scaled between -1 and 1.
Args:
Args:
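The input and output ranges described in the ``MuLawDecoding`` docstring above can be checked with a small scalar sketch of the standard mu-law expansion formula. This is plain Python written for illustration, not the torchaudio implementation; the function name ``mu_law_decode`` and the default of 256 quantization channels are assumptions of this example.

```python
import math


def mu_law_decode(x_mu, quantization_channels=256):
    """Map a code in [0, quantization_channels - 1] to a sample in [-1, 1]."""
    mu = quantization_channels - 1
    # Rescale the integer code to [-1, 1].
    x = (x_mu / mu) * 2.0 - 1.0
    # Invert the logarithmic companding, keeping the sign of x.
    magnitude = (math.exp(abs(x) * math.log1p(mu)) - 1.0) / mu
    return math.copysign(magnitude, x)


for code in (0, 255):
    print(code, mu_law_decode(code))
```

The lowest code maps to approximately -1 and the highest to approximately 1, matching the range stated in the docstring.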
@@ -1003,7 +1001,8 @@ class Fade(torch.nn.Module):
fade_in_len (int, optional): Length of fade-in (time frames). (Default: ``0``)
fade_in_len (int, optional): Length of fade-in (time frames). (Default: ``0``)
fade_out_len (int, optional): Length of fade-out (time frames). (Default: ``0``)
fade_out_len (int, optional): Length of fade-out (time frames). (Default: ``0``)
fade_shape (str, optional): Shape of fade. Must be one of: "quarter_sine",
fade_shape (str, optional): Shape of fade. Must be one of: "quarter_sine",
"half_sine", "linear", "logarithmic", "exponential". (Default: ``"linear"``)
``"half_sine"``, ``"linear"``, ``"logarithmic"``, ``"exponential"``.
(Default: ``"linear"``)
Example
Example
>>> waveform, sample_rate = torchaudio.load('test.wav', normalize=True)
>>> waveform, sample_rate = torchaudio.load('test.wav', normalize=True)
...
...