OpenDAS / Torchaudio
Commit 0f8014f5, authored Oct 29, 2021 by Caroline Chen; committed by GitHub Oct 29, 2021
Improve backend and transforms docs (#1944)
parent f2dff4d4
Showing 5 changed files with 13 additions and 10 deletions
docs/source/backend.rst +1 -1
torchaudio/backend/common.py +1 -0
torchaudio/backend/soundfile_backend.py +3 -2
torchaudio/backend/sox_io_backend.py +3 -1
torchaudio/transforms.py +5 -6
docs/source/backend.rst
...
@@ -38,7 +38,7 @@ AudioMetaData
...
@@ -38,7 +38,7 @@ AudioMetaData
Sox IO Backend
Sox IO Backend
~~~~~~~~~~~~~~
~~~~~~~~~~~~~~
The ``"sox_io"`` backend is available and default on Linux/macOS and not available on Windows.
The ``sox_io`` backend is available and default on Linux/macOS and not available on Windows.
I/O functions of this backend support `TorchScript <https://pytorch.org/docs/stable/jit.html>`_.
I/O functions of this backend support `TorchScript <https://pytorch.org/docs/stable/jit.html>`_.
...
...
torchaudio/backend/common.py
...
@@ -23,6 +23,7 @@ class AudioMetaData:
...
@@ -23,6 +23,7 @@ class AudioMetaData:
* ``AMR_WB``: Adaptive Multi-Rate
* ``AMR_WB``: Adaptive Multi-Rate
* ``AMR_NB``: Adaptive Multi-Rate Wideband
* ``AMR_NB``: Adaptive Multi-Rate Wideband
* ``OPUS``: Opus
* ``OPUS``: Opus
* ``HTK``: Single channel 16-bit PCM
* ``UNKNOWN`` : None of above
* ``UNKNOWN`` : None of above
"""
"""
def __init__(
def __init__(
...
...
torchaudio/backend/soundfile_backend.py
...
@@ -376,8 +376,9 @@ def save(
...
@@ -376,8 +376,9 @@ def save(
- 8-bit mu-law
- 8-bit mu-law
- 8-bit a-law
- 8-bit a-law
Note: Default encoding/bit depth is determined by the dtype of
the input Tensor.
Note:
Default encoding/bit depth is determined by the dtype of
the input Tensor.
``"flac"``
``"flac"``
- 8-bit
- 8-bit
...
...
torchaudio/backend/sox_io_backend.py
...
@@ -215,8 +215,9 @@ def save(
...
@@ -215,8 +215,9 @@ def save(
``"wav"``, ``"amb"``
``"wav"``, ``"amb"``
- | If both ``encoding`` and ``bits_per_sample`` are not provided, the ``dtype`` of the
- | If both ``encoding`` and ``bits_per_sample`` are not provided, the ``dtype`` of the
| Tensor is used to determine the default value.
| Tensor is used to determine the default value.
- ``"PCM_U"`` if dtype is ``uint8``
- ``"PCM_U"`` if dtype is ``uint8``
- ``"PCM_S"`` if dtype is ``int16`` or ``int32`
- ``"PCM_S"`` if dtype is ``int16`` or ``int32``
- ``"PCM_F"`` if dtype is ``float32``
- ``"PCM_F"`` if dtype is ``float32``
- ``"PCM_U"`` if ``bits_per_sample=8``
- ``"PCM_U"`` if ``bits_per_sample=8``
...
@@ -235,6 +236,7 @@ def save(
...
@@ -235,6 +236,7 @@ def save(
``"wav"``, ``"amb"``;
``"wav"``, ``"amb"``;
- | If both ``encoding`` and ``bits_per_sample`` are not provided, the ``dtype`` of the
- | If both ``encoding`` and ``bits_per_sample`` are not provided, the ``dtype`` of the
| Tensor is used.
| Tensor is used.
- ``8`` if dtype is ``uint8``
- ``8`` if dtype is ``uint8``
- ``16`` if dtype is ``int16``
- ``16`` if dtype is ``int16``
- ``32`` if dtype is ``int32`` or ``float32``
- ``32`` if dtype is ``int32`` or ``float32``
...
...
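The dtype-based defaults listed in the ``save`` docstring above can be summarised as a lookup table. This is only a sketch: ``default_wav_format`` is a hypothetical helper, not part of torchaudio, and dtypes are written as plain strings so the snippet runs without torch installed.

```python
# Hypothetical helper mirroring the docstring's defaults for "wav"/"amb"
# when neither ``encoding`` nor ``bits_per_sample`` is provided.
DEFAULT_ENCODING = {
    "uint8": "PCM_U",
    "int16": "PCM_S",
    "int32": "PCM_S",
    "float32": "PCM_F",
}

DEFAULT_BITS_PER_SAMPLE = {
    "uint8": 8,
    "int16": 16,
    "int32": 32,
    "float32": 32,
}


def default_wav_format(dtype_name):
    """Return (encoding, bits_per_sample) implied by the tensor dtype name."""
    if dtype_name not in DEFAULT_ENCODING:
        raise ValueError(f"unsupported dtype: {dtype_name}")
    return DEFAULT_ENCODING[dtype_name], DEFAULT_BITS_PER_SAMPLE[dtype_name]


if __name__ == "__main__":
    for name in ("uint8", "int16", "int32", "float32"):
        print(name, default_wav_format(name))
```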
torchaudio/transforms.py
...
@@ -296,7 +296,7 @@ class AmplitudeToDB(torch.nn.Module):
...
@@ -296,7 +296,7 @@ class AmplitudeToDB(torch.nn.Module):
a full clip.
a full clip.
Args:
Args:
stype (str, optional): scale of input tensor ('power' or 'magnitude'). The
stype (str, optional): scale of input tensor (``'power'`` or ``'magnitude'``). The
power being the elementwise square of the magnitude. (Default: ``'power'``)
power being the elementwise square of the magnitude. (Default: ``'power'``)
top_db (float or None, optional): minimum negative cut-off in decibels. A reasonable
top_db (float or None, optional): minimum negative cut-off in decibels. A reasonable
number is 80. (Default: ``None``)
number is 80. (Default: ``None``)
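To illustrate how ``stype`` and ``top_db`` interact, here is a minimal pure-Python sketch of a decibel conversion. It assumes the usual convention of a factor of 10 for power and 20 for magnitude (which follows from power being the square of magnitude). The function name ``amplitude_to_db`` and the ``amin`` floor are illustrative assumptions, not the library's implementation.

```python
import math


def amplitude_to_db(values, stype="power", top_db=None, amin=1e-10):
    """Convert a non-empty list of non-negative values to decibels.

    ``amin`` (an assumption of this sketch) avoids taking log10 of zero.
    """
    if stype not in ("power", "magnitude"):
        raise ValueError("stype must be 'power' or 'magnitude'")
    # power = magnitude ** 2, so 10 * log10(power) == 20 * log10(magnitude)
    multiplier = 10.0 if stype == "power" else 20.0
    db = [multiplier * math.log10(max(v, amin)) for v in values]
    if top_db is not None:
        # Clamp everything more than ``top_db`` below the peak.
        floor = max(db) - top_db
        db = [max(d, floor) for d in db]
    return db


if __name__ == "__main__":
    print(amplitude_to_db([1.0, 100.0], stype="power"))
    print(amplitude_to_db([1.0, 10.0], stype="magnitude"))
    print(amplitude_to_db([1e-12, 1.0], stype="power", top_db=80.0))
```

With ``top_db=80``, any value more than 80 dB below the loudest element is raised to that floor.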
...
@@ -332,15 +332,13 @@ class MelScale(torch.nn.Module):
...
@@ -332,15 +332,13 @@ class MelScale(torch.nn.Module):
r"""Turn a normal STFT into a mel frequency STFT, using a conversion
r"""Turn a normal STFT into a mel frequency STFT, using a conversion
matrix. This uses triangular filter banks.
matrix. This uses triangular filter banks.
User can control which device the filter bank (`fb`) is (e.g. fb.to(spec_f.device)).
Args:
Args:
n_mels (int, optional): Number of mel filterbanks. (Default: ``128``)
n_mels (int, optional): Number of mel filterbanks. (Default: ``128``)
sample_rate (int, optional): Sample rate of audio signal. (Default: ``16000``)
sample_rate (int, optional): Sample rate of audio signal. (Default: ``16000``)
f_min (float, optional): Minimum frequency. (Default: ``0.``)
f_min (float, optional): Minimum frequency. (Default: ``0.``)
f_max (float or None, optional): Maximum frequency. (Default: ``sample_rate // 2``)
f_max (float or None, optional): Maximum frequency. (Default: ``sample_rate // 2``)
n_stft (int, optional): Number of bins in STFT. See ``n_fft`` in :class:`Spectrogram`. (Default: ``201``)
n_stft (int, optional): Number of bins in STFT. See ``n_fft`` in :class:`Spectrogram`. (Default: ``201``)
norm (str or None, optional): If 'slaney', divide the triangular mel weights by the width of the mel band
norm (str or None, optional): If ``'slaney'``, divide the triangular mel weights by the width of the mel band
(area normalization). (Default: ``None``)
(area normalization). (Default: ``None``)
mel_scale (str, optional): Scale to use: ``htk`` or ``slaney``. (Default: ``htk``)
mel_scale (str, optional): Scale to use: ``htk`` or ``slaney``. (Default: ``htk``)
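As a rough guide to the defaults in the ``MelScale`` arguments above, the sketch below shows the widely used HTK mel formula and how the default ``f_max`` and ``n_stft`` values relate to ``sample_rate`` and ``n_fft``. The helper names are hypothetical, and the sketch assumes a one-sided STFT with ``n_fft // 2 + 1`` bins; the diff itself does not show how torchaudio builds the filter bank.

```python
import math


def hz_to_mel_htk(freq_hz):
    """HTK-style mel scale: 2595 * log10(1 + f / 700)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def default_f_max(sample_rate):
    """Default upper frequency, as documented: ``sample_rate // 2``."""
    return sample_rate // 2


def n_stft_from_n_fft(n_fft):
    """Number of one-sided STFT bins for a given FFT size (assumed)."""
    return n_fft // 2 + 1


if __name__ == "__main__":
    print(hz_to_mel_htk(default_f_max(16000)))
    print(n_stft_from_n_fft(400))
```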
...
@@ -795,7 +793,7 @@ class MuLawDecoding(torch.nn.Module):
...
@@ -795,7 +793,7 @@ class MuLawDecoding(torch.nn.Module):
r"""Decode mu-law encoded signal. For more info see the
r"""Decode mu-law encoded signal. For more info see the
`Wikipedia Entry <https://en.wikipedia.org/wiki/%CE%9C-law_algorithm>`_
`Wikipedia Entry <https://en.wikipedia.org/wiki/%CE%9C-law_algorithm>`_
This expects an input with values between 0 and quantization_channels - 1
This expects an input with values between 0 and ``quantization_channels - 1``
and returns a signal scaled between -1 and 1.
and returns a signal scaled between -1 and 1.
Args:
Args:
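The input and output ranges described in the ``MuLawDecoding`` docstring can be checked with a small scalar sketch of standard mu-law expansion. ``mu_law_decode`` is a hypothetical stand-in written for illustration, and the default of 256 quantization channels is an assumption of this example; it is not stated in the diff.

```python
import math


def mu_law_decode(x_mu, quantization_channels=256):
    """Map a quantized value in [0, quantization_channels - 1] to [-1, 1]."""
    mu = quantization_channels - 1
    if not 0 <= x_mu <= mu:
        raise ValueError("x_mu must lie in [0, quantization_channels - 1]")
    # Rescale the code from [0, mu] to [-1, 1].
    x = (x_mu / mu) * 2.0 - 1.0
    # Inverse of the mu-law companding curve: ((1 + mu) ** |x| - 1) / mu.
    magnitude = math.expm1(abs(x) * math.log1p(mu)) / mu
    return math.copysign(magnitude, x)


if __name__ == "__main__":
    for code in (0, 64, 128, 192, 255):
        print(code, mu_law_decode(code))
```

The lowest code maps to roughly -1 and the highest to roughly 1, with finer resolution near zero.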
...
@@ -1003,7 +1001,8 @@ class Fade(torch.nn.Module):
...
@@ -1003,7 +1001,8 @@ class Fade(torch.nn.Module):
fade_in_len (int, optional): Length of fade-in (time frames). (Default: ``0``)
fade_in_len (int, optional): Length of fade-in (time frames). (Default: ``0``)
fade_out_len (int, optional): Length of fade-out (time frames). (Default: ``0``)
fade_out_len (int, optional): Length of fade-out (time frames). (Default: ``0``)
fade_shape (str, optional): Shape of fade. Must be one of: "quarter_sine",
fade_shape (str, optional): Shape of fade. Must be one of: "quarter_sine",
"half_sine", "linear", "logarithmic", "exponential". (Default: ``"linear"``)
``"half_sine"``, ``"linear"``, ``"logarithmic"``, ``"exponential"``.
(Default: ``"linear"``)
Example
Example
>>> waveform, sample_rate = torchaudio.load('test.wav', normalize=True)
>>> waveform, sample_rate = torchaudio.load('test.wav', normalize=True)
...
...
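To give a feel for the ``fade_shape`` options, the following sketch builds a fade-in ramp of ``fade_in_len`` frames for each named shape. The curve formulas are assumptions chosen for illustration (this diff does not show how torchaudio computes them), and ``fade_in_ramp`` is a hypothetical helper.

```python
import math

# Illustrative curve definitions on t in [0, 1]; each rises from 0 to 1.
FADE_CURVES = {
    "linear": lambda t: t,
    "quarter_sine": lambda t: math.sin(t * math.pi / 2),
    "half_sine": lambda t: math.sin(t * math.pi - math.pi / 2) / 2 + 0.5,
    "logarithmic": lambda t: math.log10(0.1 + t) + 1,
    "exponential": lambda t: 2 ** (t - 1) * t,
}


def fade_in_ramp(fade_in_len, fade_shape="linear"):
    """Return ``fade_in_len`` gain values, clamped to [0, 1]."""
    if fade_shape not in FADE_CURVES:
        raise ValueError(f"unknown fade_shape: {fade_shape}")
    curve = FADE_CURVES[fade_shape]
    ramp = []
    for i in range(fade_in_len):
        # Evenly spaced positions from 0 to 1 inclusive.
        t = i / (fade_in_len - 1) if fade_in_len > 1 else 0.0
        ramp.append(min(max(curve(t), 0.0), 1.0))
    return ramp


if __name__ == "__main__":
    for shape in FADE_CURVES:
        print(shape, [round(g, 3) for g in fade_in_ramp(5, shape)])
```

Multiplying the first ``fade_in_len`` samples of a waveform by such a ramp produces the fade-in; a fade-out would use the reversed ramp on the last ``fade_out_len`` samples.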