Commit baf354a7 authored by moto, committed by Facebook GitHub Bot

Adopt `:autosummary:` in `torchaudio.transforms` module doc (#2683)

Summary:
* Introduce the mini-index at `torchaudio.transforms` page.
* Add "Augmentations" subsection.
* Also updated the overall introduction.

https://output.circle-artifacts.com/output/job/1b65246a-403c-4d2c-b97d-d1b582d8b4e5/artifacts/0/docs/transforms.html

<img width="721" alt="Screen Shot 2022-09-16 at 5 20 08 PM" src="https://user-images.githubusercontent.com/855818/190591795-97c169db-a95b-480a-8d3c-d80072efa045.png">

<img width="755" alt="Screen Shot 2022-09-16 at 5 20 28 PM" src="https://user-images.githubusercontent.com/855818/190591828-03026918-febd-4194-91aa-7d8f704e17cc.png">

Pull Request resolved: https://github.com/pytorch/audio/pull/2683

Reviewed By: carolineechen

Differential Revision: D39574255

Pulled By: mthrok

fbshipit-source-id: a4beed7cacbb5184bad96efa903a3a1123dab627
parent c89ab0c6
+..
+   autogenerated from source/_templates/autosummary/class.rst
+
+{{ name | underline }}
+
+.. autoclass:: {{ fullname }}
+   :members:

-.. role:: hidden
-    :class: hidden-section
-
-torchaudio.transforms
-======================
+torchaudio.transforms
+=====================
+
+.. py:module:: torchaudio.transforms

 .. currentmodule:: torchaudio.transforms

-Transforms are common audio transforms. They can be chained together using :class:`torch.nn.Sequential`
-
-:hidden:`Utility`
-~~~~~~~~~~~~~~~~~~
-
-:hidden:`AmplitudeToDB`
------------------------
-
-.. autoclass:: AmplitudeToDB
-
-  .. automethod:: forward
-
-:hidden:`MelScale`
--------------------
-
-.. autoclass:: MelScale
-
-  .. automethod:: forward
-
-:hidden:`InverseMelScale`
---------------------------
-
-.. autoclass:: InverseMelScale
-
-  .. automethod:: forward
-
-:hidden:`MuLawEncoding`
------------------------
-
-.. autoclass:: MuLawEncoding
-
-  .. automethod:: forward
-
-:hidden:`MuLawDecoding`
------------------------
-
-.. autoclass:: MuLawDecoding
-
-  .. automethod:: forward
-
-:hidden:`Resample`
--------------------
-
-.. autoclass:: Resample
-
-  .. automethod:: forward
-
-:hidden:`FrequencyMasking`
----------------------------
-
-.. autoclass:: FrequencyMasking
-
-  .. automethod:: forward
-
-:hidden:`TimeMasking`
-----------------------
-
-.. autoclass:: TimeMasking
-
-  .. automethod:: forward
-
-:hidden:`TimeStretch`
-----------------------
-
-.. autoclass:: TimeStretch
-
-  .. automethod:: forward
-
-:hidden:`Fade`
----------------
-
-.. autoclass:: Fade
-
-  .. automethod:: forward
-
-:hidden:`Vol`
---------------
-
-.. autoclass:: Vol
-
-  .. automethod:: forward
-
-:hidden:`Loudness`
--------------------
-
-.. autoclass:: Loudness
-
-  .. automethod:: forward
-
-:hidden:`Feature Extractions`
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-:hidden:`Spectrogram`
-----------------------
-
-.. autoclass:: Spectrogram
-
-  .. automethod:: forward
-
-:hidden:`InverseSpectrogram`
------------------------------
-
-.. autoclass:: InverseSpectrogram
-
-  .. automethod:: forward
-
-:hidden:`MelSpectrogram`
-------------------------
-
-.. autoclass:: MelSpectrogram
-
-  .. automethod:: forward
-
-:hidden:`GriffinLim`
----------------------
-
-.. autoclass:: GriffinLim
-
-  .. automethod:: forward
-
-:hidden:`MFCC`
----------------
-
-.. autoclass:: MFCC
-
-  .. automethod:: forward
-
-:hidden:`LFCC`
----------------
-
-.. autoclass:: LFCC
-
-  .. automethod:: forward
-
-:hidden:`ComputeDeltas`
------------------------
-
-.. autoclass:: ComputeDeltas
-
-  .. automethod:: forward
-
-:hidden:`PitchShift`
----------------------
-
-.. autoclass:: PitchShift
-
-  .. automethod:: forward
-
-:hidden:`SlidingWindowCmn`
----------------------------
-
-.. autoclass:: SlidingWindowCmn
-
-  .. automethod:: forward
-
-:hidden:`SpectralCentroid`
----------------------------
-
-.. autoclass:: SpectralCentroid
-
-  .. automethod:: forward
-
-:hidden:`Vad`
---------------
-
-.. autoclass:: Vad
-
-  .. automethod:: forward
-
-:hidden:`Loss`
-~~~~~~~~~~~~~~
-
-:hidden:`RNNTLoss`
--------------------
-
-.. autoclass:: RNNTLoss
-
-  .. automethod:: forward
-
-:hidden:`Multi-channel`
-~~~~~~~~~~~~~~~~~~~~~~~
-
-:hidden:`PSD`
---------------
-
-.. autoclass:: PSD
-
-  .. automethod:: forward
-
-:hidden:`MVDR`
----------------
-
-.. autoclass:: MVDR
-
-  .. automethod:: forward
-
-:hidden:`RTFMVDR`
-------------------
-
-.. autoclass:: RTFMVDR
-
-  .. automethod:: forward
-
-:hidden:`SoudenMVDR`
----------------------
-
-.. autoclass:: SoudenMVDR
-
-  .. automethod:: forward
+:mod:`torchaudio.transforms` module contains common audio processings and feature extractions. The following diagram shows the relationship between some of the available transforms.
+
+.. image:: https://download.pytorch.org/torchaudio/tutorial-assets/torchaudio_feature_extractions.png
+
+Transforms are implemented using :class:`torch.nn.Module`. Common ways to build a processing pipeline are to define custom Module class or chain Modules together using :class:`torch.nn.Sequential`, then move it to a target device and data type.
+
+.. code::
+
+   # Define custom feature extraction pipeline.
+   #
+   # 1. Resample audio
+   # 2. Convert to power spectrogram
+   # 3. Apply augmentations
+   # 4. Convert to mel-scale
+   #
+   class MyPipeline(torch.nn.Module):
+       def __init__(
+           self,
+           input_freq=16000,
+           resample_freq=8000,
+           n_fft=1024,
+           n_mel=256,
+           stretch_factor=0.8,
+       ):
+           super().__init__()
+           self.resample = Resample(orig_freq=input_freq, new_freq=resample_freq)
+
+           self.spec = Spectrogram(n_fft=n_fft, power=2)
+
+           self.spec_aug = torch.nn.Sequential(
+               TimeStretch(stretch_factor, fixed_rate=True),
+               FrequencyMasking(freq_mask_param=80),
+               TimeMasking(time_mask_param=80),
+           )
+
+           self.mel_scale = MelScale(
+               n_mels=n_mel, sample_rate=resample_freq, n_stft=n_fft // 2 + 1)
+
+       def forward(self, waveform: torch.Tensor) -> torch.Tensor:
+           # Resample the input
+           resampled = self.resample(waveform)
+
+           # Convert to power spectrogram
+           spec = self.spec(resampled)
+
+           # Apply SpecAugment
+           spec = self.spec_aug(spec)
+
+           # Convert to mel-scale
+           mel = self.mel_scale(spec)
+
+           return mel
+
+.. code::
+
+   # Instantiate a pipeline
+   pipeline = MyPipeline()
+
+   # Move the computation graph to CUDA
+   pipeline.to(device=torch.device("cuda"), dtype=torch.float32)
+
+   # Perform the transform
+   features = pipeline(waveform)
+
+Please check out tutorials that cover in-depth usage of transforms.
+
+.. minigallery:: torchaudio.transforms
+
+Utility
+-------
+
+.. autosummary::
+   :toctree: generated
+   :nosignatures:
+
+   AmplitudeToDB
+   MelScale
+   InverseMelScale
+   MuLawEncoding
+   MuLawDecoding
+   Resample
+   Fade
+   Vol
+   Loudness
+
+Feature Extractions
+-------------------
+
+.. autosummary::
+   :toctree: generated
+   :nosignatures:
+
+   Spectrogram
+   InverseSpectrogram
+   MelSpectrogram
+   GriffinLim
+   MFCC
+   LFCC
+   ComputeDeltas
+   PitchShift
+   SlidingWindowCmn
+   SpectralCentroid
+   Vad
+
+Augmentations
+-------------
+
+The following transforms implement popular augmentation techniques known as *SpecAugment* :cite:`specaugment`.
+
+.. autosummary::
+   :toctree: generated
+   :nosignatures:
+
+   FrequencyMasking
+   TimeMasking
+   TimeStretch
+
+Loss
+----
+
+.. autosummary::
+   :toctree: generated
+   :nosignatures:
+
+   RNNTLoss
+
+Multi-channel
+-------------
+
+.. autosummary::
+   :toctree: generated
+   :nosignatures:
+
+   PSD
+   MVDR
+   RTFMVDR
+   SoudenMVDR
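
The `n_stft=n_fft // 2 + 1` argument passed to `MelScale` in `MyPipeline` is the number of frequency bins in a one-sided spectrogram. As a rough illustration, the shapes flowing through the pipeline can be estimated in plain Python. `pipeline_output_shape` is a hypothetical helper, not part of torchaudio. It assumes the spectrogram uses a hop length of `n_fft // 2` with centered frames, and it leaves out the augmentation stage, since `TimeStretch` changes the number of frames.

```python
import math


def pipeline_output_shape(
    num_samples: int,
    input_freq: int = 16000,
    resample_freq: int = 8000,
    n_fft: int = 1024,
    n_mel: int = 256,
) -> dict:
    """Estimate per-channel shapes after each stage (hypothetical helper)."""
    # Resampling scales the number of samples by resample_freq / input_freq.
    resampled = math.ceil(num_samples * resample_freq / input_freq)

    # A one-sided spectrogram keeps n_fft // 2 + 1 frequency bins.
    n_freq = n_fft // 2 + 1

    # Assumption: hop length of n_fft // 2 and centered (padded) frames,
    # which yields 1 + num_samples // hop_length frames.
    hop_length = n_fft // 2
    n_frames = 1 + resampled // hop_length

    return {
        "resampled": (resampled,),
        "spectrogram": (n_freq, n_frames),
        # The mel projection replaces the frequency axis with n_mel bins.
        "mel": (n_mel, n_frames),
    }


# One second of 16 kHz audio with the default arguments of MyPipeline.
shapes = pipeline_output_shape(num_samples=16000)
print(shapes)
```

With the default arguments, the frequency axis has 513 bins before the mel projection, which is why `MelScale` needs `n_stft` to match `n_fft // 2 + 1`.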