Commit baf354a7 authored by moto, committed by Facebook GitHub Bot

Adopt `:autosummary:` in `torchaudio.transforms` module doc (#2683)

Summary:
* Introduce a mini-index on the `torchaudio.transforms` page.
* Add an "Augmentations" subsection.
* Update the overall introduction.

https://output.circle-artifacts.com/output/job/1b65246a-403c-4d2c-b97d-d1b582d8b4e5/artifacts/0/docs/transforms.html

<img width="721" alt="Screen Shot 2022-09-16 at 5 20 08 PM" src="https://user-images.githubusercontent.com/855818/190591795-97c169db-a95b-480a-8d3c-d80072efa045.png">

<img width="755" alt="Screen Shot 2022-09-16 at 5 20 28 PM" src="https://user-images.githubusercontent.com/855818/190591828-03026918-febd-4194-91aa-7d8f704e17cc.png">

Pull Request resolved: https://github.com/pytorch/audio/pull/2683

Reviewed By: carolineechen

Differential Revision: D39574255

Pulled By: mthrok

fbshipit-source-id: a4beed7cacbb5184bad96efa903a3a1123dab627
parent c89ab0c6
..
autogenerated from source/_templates/autosummary/class.rst
{{ name | underline }}
.. autoclass:: {{ fullname }}
:members:
.. role:: hidden
:class: hidden-section
torchaudio.transforms
======================
.. py:module:: torchaudio.transforms
=====================
.. currentmodule:: torchaudio.transforms
Transforms are common audio transforms. They can be chained together using :class:`torch.nn.Sequential`
:hidden:`Utility`
~~~~~~~~~~~~~~~~~~
:hidden:`AmplitudeToDB`
-----------------------
.. autoclass:: AmplitudeToDB
.. automethod:: forward
:hidden:`MelScale`
------------------
.. autoclass:: MelScale
.. automethod:: forward
:hidden:`InverseMelScale`
-------------------------
.. autoclass:: InverseMelScale
.. automethod:: forward
:hidden:`MuLawEncoding`
-----------------------
.. autoclass:: MuLawEncoding
.. automethod:: forward
:hidden:`MuLawDecoding`
-----------------------
.. autoclass:: MuLawDecoding
.. automethod:: forward
:hidden:`Resample`
------------------
.. autoclass:: Resample
.. automethod:: forward
:hidden:`FrequencyMasking`
--------------------------
.. autoclass:: FrequencyMasking
.. automethod:: forward
:hidden:`TimeMasking`
---------------------
.. autoclass:: TimeMasking
.. automethod:: forward
:hidden:`TimeStretch`
---------------------
The :mod:`torchaudio.transforms` module contains common audio processing and feature extraction transforms. The following diagram shows the relationship between some of the available transforms.
.. autoclass:: TimeStretch
.. automethod:: forward
.. image:: https://download.pytorch.org/torchaudio/tutorial-assets/torchaudio_feature_extractions.png
:hidden:`Fade`
--------------
Transforms are implemented using :class:`torch.nn.Module`. Common ways to build a processing pipeline are to define a custom Module class or to chain Modules together using :class:`torch.nn.Sequential`, then move the pipeline to a target device and data type.
.. autoclass:: Fade
.. code::
.. automethod:: forward
   # Define custom feature extraction pipeline.
   #
   # 1. Resample audio
   # 2. Convert to power spectrogram
   # 3. Apply augmentations
   # 4. Convert to mel-scale
   #
   import torch
   from torchaudio.transforms import (
       FrequencyMasking,
       MelScale,
       Resample,
       Spectrogram,
       TimeMasking,
       TimeStretch,
   )

   class MyPipeline(torch.nn.Module):
       def __init__(
           self,
           input_freq=16000,
           resample_freq=8000,
           n_fft=1024,
           n_mel=256,
           stretch_factor=0.8,
       ):
           super().__init__()
           self.resample = Resample(orig_freq=input_freq, new_freq=resample_freq)
:hidden:`Vol`
-------------
.. autoclass:: Vol
.. automethod:: forward
:hidden:`Loudness`
------------------
.. autoclass:: Loudness
.. automethod:: forward
:hidden:`Feature Extractions`
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
:hidden:`Spectrogram`
---------------------
.. autoclass:: Spectrogram
.. automethod:: forward
:hidden:`InverseSpectrogram`
----------------------------
.. autoclass:: InverseSpectrogram
.. automethod:: forward
:hidden:`MelSpectrogram`
------------------------
           self.spec = Spectrogram(n_fft=n_fft, power=2)
.. autoclass:: MelSpectrogram
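           self.spec_aug = torch.nn.Sequential(
               # The stretch rate is passed via `fixed_rate`; `n_freq` must match
               # the number of frequency bins produced by the spectrogram.
               TimeStretch(n_freq=n_fft // 2 + 1, fixed_rate=stretch_factor),
               FrequencyMasking(freq_mask_param=80),
               TimeMasking(time_mask_param=80),
           )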
.. automethod:: forward
           self.mel_scale = MelScale(
               n_mels=n_mel, sample_rate=resample_freq, n_stft=n_fft // 2 + 1)
:hidden:`GriffinLim`
--------------------
       def forward(self, waveform: torch.Tensor) -> torch.Tensor:
           # Resample the input
           resampled = self.resample(waveform)
.. autoclass:: GriffinLim
           # Convert to power spectrogram
           spec = self.spec(resampled)
.. automethod:: forward
           # Apply SpecAugment
           spec = self.spec_aug(spec)
:hidden:`MFCC`
--------------
           # Convert to mel-scale
           mel = self.mel_scale(spec)
.. autoclass:: MFCC
           return mel
.. automethod:: forward
:hidden:`LFCC`
--------------
.. code::
.. autoclass:: LFCC
   # Instantiate a pipeline
   pipeline = MyPipeline()
.. automethod:: forward
   # Move the computation graph to CUDA
   pipeline.to(device=torch.device("cuda"), dtype=torch.float32)
:hidden:`ComputeDeltas`
-----------------------
   # Perform the transform
   features = pipeline(waveform)
.. autoclass:: ComputeDeltas
Please check out the tutorials that cover in-depth usage of transforms.
.. automethod:: forward
.. minigallery:: torchaudio.transforms
:hidden:`PitchShift`
--------------------
Utility
-------
.. autoclass:: PitchShift
.. autosummary::
:toctree: generated
:nosignatures:
.. automethod:: forward
AmplitudeToDB
MelScale
InverseMelScale
MuLawEncoding
MuLawDecoding
Resample
Fade
Vol
Loudness
:hidden:`SlidingWindowCmn`
--------------------------
Feature Extractions
-------------------
.. autoclass:: SlidingWindowCmn
.. autosummary::
:toctree: generated
:nosignatures:
.. automethod:: forward
:hidden:`SpectralCentroid`
--------------------------
.. autoclass:: SpectralCentroid
.. automethod:: forward
:hidden:`Vad`
Spectrogram
InverseSpectrogram
MelSpectrogram
GriffinLim
MFCC
LFCC
ComputeDeltas
PitchShift
SlidingWindowCmn
SpectralCentroid
Vad
Augmentations
-------------
.. autoclass:: Vad
The following transforms implement popular augmentation techniques known as *SpecAugment* :cite:`specaugment`.
.. automethod:: forward
.. autosummary::
:toctree: generated
:nosignatures:
:hidden:`Loss`
~~~~~~~~~~~~~~
FrequencyMasking
TimeMasking
TimeStretch
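To illustrate the idea behind frequency masking in SpecAugment, the following plain-Python sketch blanks out a randomly chosen band of consecutive frequency bins. It is not torchaudio's implementation: the function name ``frequency_mask``, the list-of-lists spectrogram layout, and the exact sampling of the band width are assumptions made for illustration only.

```python
import random


def frequency_mask(spec, freq_mask_param, rng=None, mask_value=0.0):
    """Return a copy of ``spec`` with a random band of frequency bins masked.

    ``spec`` is a list of rows, one row per frequency bin, each row a list
    of values over time. The band width is drawn uniformly from
    0..freq_mask_param (capped at the number of bins). Hypothetical helper,
    not a torchaudio API.
    """
    rng = rng or random.Random()
    n_freq = len(spec)
    width = rng.randint(0, min(freq_mask_param, n_freq))
    start = rng.randint(0, n_freq - width)

    # Copy every row so the input spectrogram is left untouched.
    masked = [row[:] for row in spec]
    for f in range(start, start + width):
        masked[f] = [mask_value] * len(masked[f])
    return masked, (start, width)


# Toy "spectrogram": 10 frequency bins by 5 time frames, all ones.
spec = [[1.0] * 5 for _ in range(10)]
masked, (start, width) = frequency_mask(spec, freq_mask_param=4, rng=random.Random(0))
```

Time masking follows the same pattern along the other axis, blanking a band of consecutive time frames instead of frequency bins.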
:hidden:`RNNTLoss`
------------------
Loss
----
.. autoclass:: RNNTLoss
.. autosummary::
:toctree: generated
:nosignatures:
.. automethod:: forward
RNNTLoss
:hidden:`Multi-channel`
~~~~~~~~~~~~~~~~~~~~~~~
:hidden:`PSD`
Multi-channel
-------------
.. autoclass:: PSD
.. automethod:: forward
:hidden:`MVDR`
--------------
.. autoclass:: MVDR
.. automethod:: forward
:hidden:`RTFMVDR`
-----------------
.. autoclass:: RTFMVDR
.. automethod:: forward
:hidden:`SoudenMVDR`
--------------------
.. autoclass:: SoudenMVDR
.. autosummary::
:toctree: generated
:nosignatures:
.. automethod:: forward
PSD
MVDR
RTFMVDR
SoudenMVDR