Commit 457148ea authored by Vincent QB, committed by GitHub

Fixes #1314 (#1316)

parent 49860425
@@ -208,3 +208,13 @@ vad
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 .. autofunction:: compute_kaldi_pitch
+:hidden:`spectral_centroid`
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+.. autofunction:: spectral_centroid
+:hidden:`apply_codec`
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+.. autofunction:: apply_codec
@@ -136,6 +136,13 @@ Transforms are common audio transforms. They can be chained together using :clas
   .. automethod:: forward
+:hidden:`SpectralCentroid`
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+.. autoclass:: SpectralCentroid
+  .. automethod:: forward
 :hidden:`Vad`
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
......
@@ -1011,24 +1011,25 @@ def apply_codec(
     bits_per_sample: Optional[int] = None,
 ) -> Tensor:
     r"""
-    Applies codecs as a form of augmentation
+    Apply codecs as a form of augmentation.
     Args:
-        waveform (Tensor): Audio data. Must be 2 dimensional. See also ```channels_first```
+        waveform (Tensor): Audio data. Must be 2 dimensional. See also ```channels_first```.
-        sample_rate (int): Sample rate of the audio waveform
+        sample_rate (int): Sample rate of the audio waveform.
-        format (str): file format
+        format (str): File format.
         channels_first (bool):
             When True, both the input and output Tensor have dimension ``[channel, time]``.
             Otherwise, they have dimension ``[time, channel]``.
         compression (float): Used for formats other than WAV.
-            For mor details see :py:func:`torchaudio.backend.sox_io_backend.save`
+            For mor details see :py:func:`torchaudio.backend.sox_io_backend.save`.
         encoding (str, optional): Changes the encoding for the supported formats.
-            For more details see :py:func:`torchaudio.backend.sox_io_backend.save`
+            For more details see :py:func:`torchaudio.backend.sox_io_backend.save`.
         bits_per_sample (int, optional): Changes the bit depth for the supported formats.
-            For more details see :py:func:`torchaudio.backend.sox_io_backend.save`
+            For more details see :py:func:`torchaudio.backend.sox_io_backend.save`.
     Returns:
         torch.Tensor: Resulting Tensor.
-        If ``channels_first=True``, it has ``[channel, time]`` else ``[time, channel]``
+        If ``channels_first=True``, it has ``[channel, time]`` else ``[time, channel]``.
     """
     bytes = io.BytesIO()
     torchaudio.backend.sox_io_backend.save(bytes,
......
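Note on the ``channels_first`` argument documented above: it only decides which axis of the 2-dimensional waveform is the channel axis and which is the time axis, and the output uses the same layout as the input. The following is a minimal pure-Python sketch of that layout convention, using nested lists in place of tensors. The helper names ``to_channel_time`` and ``from_channel_time`` are hypothetical and are not part of torchaudio; the sketch does not call ``apply_codec`` or perform any encoding.

```python
from typing import List

# 2-D nested list standing in for a 2-D Tensor (assumption for illustration).
Waveform = List[List[float]]


def to_channel_time(waveform: Waveform, channels_first: bool) -> Waveform:
    """Return a copy laid out as [channel, time] (hypothetical helper)."""
    if channels_first:
        # Already [channel, time]: copy rows unchanged.
        return [list(channel) for channel in waveform]
    # Input is [time, channel]: transpose so each row is one channel.
    return [list(channel) for channel in zip(*waveform)]


def from_channel_time(waveform: Waveform, channels_first: bool) -> Waveform:
    """Convert [channel, time] back to the caller's layout (hypothetical helper)."""
    if channels_first:
        return [list(channel) for channel in waveform]
    # Caller expects [time, channel]: transpose so each row is one frame.
    return [list(frame) for frame in zip(*waveform)]


# Three stereo frames stored as [time, channel].
stereo_time_first = [[0.1, -0.1], [0.2, -0.2], [0.3, -0.3]]
channel_time = to_channel_time(stereo_time_first, channels_first=False)
round_trip = from_channel_time(channel_time, channels_first=False)
```

Here ``channel_time`` holds one row per channel, and ``round_trip`` has the same ``[time, channel]`` layout as the input, mirroring the Returns section of the docstring.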