Adopt `:autosummary:` in `torchaudio.transforms` module doc (#2683)

Summary:

* Introduce the mini-index at `torchaudio.transforms` page.
* Add "Augmentations" subsection.
* Also updated the overall introduction.

Rendered preview: https://output.circle-artifacts.com/output/job/1b65246a-403c-4d2c-b97d-d1b582d8b4e5/artifacts/0/docs/transforms.html

Screenshots:

* https://user-images.githubusercontent.com/855818/190591795-97c169db-a95b-480a-8d3c-d80072efa045.png
* https://user-images.githubusercontent.com/855818/190591828-03026918-febd-4194-91aa-7d8f704e17cc.png

Pull Request resolved: https://github.com/pytorch/audio/pull/2683

Reviewed By: carolineechen

Differential Revision: D39574255

Pulled By: mthrok

fbshipit-source-id: a4beed7cacbb5184bad96efa903a3a1123dab627

Commit baf354a7 authored Sep 16, 2022 by moto Committed by Facebook GitHub Bot Sep 16, 2022

Showing with 110 additions and 178 deletions

docs/source/_templates/autosummary/class.rst  +7 -0
docs/source/transforms.rst  +103 -178

--- a/docs/source/_templates/autosummary/class.rst
+++ b/docs/source/_templates/autosummary/class.rst
+..
+  autogenerated from source/_templates/autosummary/class.rst
+
+{{ name | underline }}
+
+.. autoclass:: {{ fullname }}
+   :members:
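
To make the effect of this template concrete, the following is a minimal sketch of what it is expected to expand to for one entry. It is plain Python that imitates the rendering and is not Sphinx's own code; the helper names `underline` and `render_class_stub` are hypothetical, and the sketch assumes that `name` is the last component of `fullname` and that the `underline` filter draws a row of `=` as long as the title.

```python
def underline(title: str, line: str = "=") -> str:
    """Imitate the `underline` filter: the title, then a rule of equal length."""
    return title + "\n" + line * len(title)


def render_class_stub(fullname: str) -> str:
    """Approximate the stub page that class.rst would produce for `fullname`."""
    # Assumption: `name` is the object name without its module path.
    name = fullname.rsplit(".", 1)[-1]
    return (
        "..\n"
        "  autogenerated from source/_templates/autosummary/class.rst\n"
        "\n"
        f"{underline(name)}\n"
        "\n"
        f".. autoclass:: {fullname}\n"
        "   :members:\n"
    )


stub = render_class_stub("torchaudio.transforms.Resample")
print(stub)
```

Each class listed under an `autosummary` directive with `:toctree: generated` would get one such stub page, which replaces the hand-written `autoclass` / `automethod` entries removed from `transforms.rst` below.
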
--- a/docs/source/transforms.rst
+++ b/docs/source/transforms.rst
-.. role:: hidden
-    :class: hidden-section
-
 torchaudio.transforms
-======================
-
-.. py:module:: torchaudio.transforms
+=====================

 .. currentmodule:: torchaudio.transforms

-Transforms are common audio transforms. They can be chained together using :class:`torch.nn.Sequential`
-
-:hidden:`Utility`
-~~~~~~~~~~~~~~~~~~
-
-:hidden:`AmplitudeToDB`
-----------------------
-
-.. autoclass:: AmplitudeToDB
-
-  .. automethod:: forward
-
-:hidden:`MelScale`
------------------
-
-.. autoclass:: MelScale
-
-  .. automethod:: forward
-
-:hidden:`InverseMelScale`
-------------------------
-
-.. autoclass:: InverseMelScale
-
-  .. automethod:: forward
-
-:hidden:`MuLawEncoding`
-----------------------
-
-.. autoclass:: MuLawEncoding
-
-  .. automethod:: forward
-
-:hidden:`MuLawDecoding`
-----------------------
-
-.. autoclass:: MuLawDecoding
-
-  .. automethod:: forward
-
-:hidden:`Resample`
------------------
-
-.. autoclass:: Resample
-
-  .. automethod:: forward
-
-:hidden:`FrequencyMasking`
--------------------------
-
-.. autoclass:: FrequencyMasking
-
-  .. automethod:: forward
-
-:hidden:`TimeMasking`
---------------------
-
-.. autoclass:: TimeMasking
-
-  .. automethod:: forward
-
-:hidden:`TimeStretch`
---------------------
+The :mod:`torchaudio.transforms` module contains common audio processing and feature extraction transforms. The following diagram shows the relationship between some of the available transforms.

-.. autoclass:: TimeStretch

-  .. automethod:: forward
+.. image:: https://download.pytorch.org/torchaudio/tutorial-assets/torchaudio_feature_extractions.png

-:hidden:`Fade`
--------------
+Transforms are implemented using :class:`torch.nn.Module`. Common ways to build a processing pipeline are to define a custom Module class or to chain Modules together using :class:`torch.nn.Sequential`, then move it to a target device and data type.

-.. autoclass:: Fade
+.. code::

-  .. automethod:: forward
+   # Define custom feature extraction pipeline.
+   #
+   # 1. Resample audio
+   # 2. Convert to power spectrogram
+   # 3. Apply augmentations
+   # 4. Convert to mel-scale
+   #
+   class MyPipeline(torch.nn.Module):
+       def __init__(
+           self,
+           input_freq=16000,
+           resample_freq=8000,
+           n_fft=1024,
+           n_mel=256,
+           stretch_factor=0.8,
+       ):
+           super().__init__()
+           self.resample = Resample(orig_freq=input_freq, new_freq=resample_freq)

-:hidden:`Vol`
-------------
-
-.. autoclass:: Vol
-
-  .. automethod:: forward
-
-:hidden:`Loudness`
------------------
-
-.. autoclass:: Loudness
-
-  .. automethod:: forward
-
-:hidden:`Feature Extractions`
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-:hidden:`Spectrogram`
---------------------
-
-.. autoclass:: Spectrogram
-
-  .. automethod:: forward
-
-:hidden:`InverseSpectrogram`
----------------------------
-
-.. autoclass:: InverseSpectrogram
-
-  .. automethod:: forward
-
-:hidden:`MelSpectrogram`
------------------------
+           self.spec = Spectrogram(n_fft=n_fft, power=2)

-.. autoclass:: MelSpectrogram
+           self.spec_aug = torch.nn.Sequential(
+               TimeStretch(stretch_factor, fixed_rate=True),
+               FrequencyMasking(freq_mask_param=80),
+               TimeMasking(time_mask_param=80),
+           )

-  .. automethod:: forward
+           self.mel_scale = MelScale(
+               n_mels=n_mel, sample_rate=resample_freq, n_stft=n_fft // 2 + 1)

-:hidden:`GriffinLim`
--------------------
+       def forward(self, waveform: torch.Tensor) -> torch.Tensor:
+           # Resample the input
+           resampled = self.resample(waveform)

-.. autoclass:: GriffinLim
+           # Convert to power spectrogram
+           spec = self.spec(resampled)

-  .. automethod:: forward
+           # Apply SpecAugment
+           spec = self.spec_aug(spec)

-:hidden:`MFCC`
--------------
+           # Convert to mel-scale
+           mel = self.mel_scale(spec)

-.. autoclass:: MFCC
+           return mel

-  .. automethod:: forward

-:hidden:`LFCC`
--------------
+.. code::

-.. autoclass:: LFCC
+   # Instantiate a pipeline
+   pipeline = MyPipeline()

-  .. automethod:: forward
+   # Move the computation graph to CUDA
+   pipeline.to(device=torch.device("cuda"), dtype=torch.float32)

-:hidden:`ComputeDeltas`
-----------------------
+   # Perform the transform
+   features = pipeline(waveform)

-.. autoclass:: ComputeDeltas
+Please check out tutorials that cover in-depth usage of transforms.

-  .. automethod:: forward
+.. minigallery:: torchaudio.transforms

-:hidden:`PitchShift`
--------------------
+Utility
+-------

-.. autoclass:: PitchShift
+.. autosummary::
+    :toctree: generated
+    :nosignatures:

-  .. automethod:: forward
+    AmplitudeToDB
+    MelScale
+    InverseMelScale
+    MuLawEncoding
+    MuLawDecoding
+    Resample
+    Fade
+    Vol
+    Loudness

-:hidden:`SlidingWindowCmn`
--------------------------
+Feature Extractions
+-------------------

-.. autoclass:: SlidingWindowCmn
+.. autosummary::
+    :toctree: generated
+    :nosignatures:

-  .. automethod:: forward
-
-:hidden:`SpectralCentroid`
--------------------------
-
-.. autoclass:: SpectralCentroid
-
-  .. automethod:: forward
-
-:hidden:`Vad`
+    Spectrogram
+    InverseSpectrogram
+    MelSpectrogram
+    GriffinLim
+    MFCC
+    LFCC
+    ComputeDeltas
+    PitchShift
+    SlidingWindowCmn
+    SpectralCentroid
+    Vad
+
+Augmentations
 -------------

-.. autoclass:: Vad
+The following transforms implement popular augmentation techniques known as *SpecAugment* :cite:`specaugment`.

-  .. automethod:: forward
+.. autosummary::
+    :toctree: generated
+    :nosignatures:

-:hidden:`Loss`
-~~~~~~~~~~~~~~
+    FrequencyMasking
+    TimeMasking
+    TimeStretch

-:hidden:`RNNTLoss`
------------------
+Loss
+----

-.. autoclass:: RNNTLoss
+.. autosummary::
+    :toctree: generated
+    :nosignatures:

-  .. automethod:: forward
+    RNNTLoss

-:hidden:`Multi-channel`
-~~~~~~~~~~~~~~~~~~~~~~~
-
-:hidden:`PSD`
+Multi-channel
 -------------

-.. autoclass:: PSD
-
-  .. automethod:: forward
-
-:hidden:`MVDR`
--------------
-
-.. autoclass:: MVDR
-
-  .. automethod:: forward
-
-:hidden:`RTFMVDR`
-----------------
-
-.. autoclass:: RTFMVDR
-
-  .. automethod:: forward
-
-:hidden:`SoudenMVDR`
--------------------
-
-.. autoclass:: SoudenMVDR
+.. autosummary::
+    :toctree: generated
+    :nosignatures:

-  .. automethod:: forward
+    PSD
+    MVDR
+    RTFMVDR
+    SoudenMVDR
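
The `autosummary` directives and the custom `class.rst` template only take effect if the Sphinx configuration enables them. The commit does not touch `conf.py`, so the excerpt below is a hypothetical sketch of the settings such a setup typically relies on, not the repository's actual configuration. The option names are standard Sphinx settings; the values are assumptions.

```python
# Hypothetical excerpt of docs/source/conf.py (not part of this commit).

# Extensions needed for `.. autoclass::` and `.. autosummary::`.
extensions = [
    "sphinx.ext.autodoc",
    "sphinx.ext.autosummary",
]

# Generate the stub pages for entries listed with `:toctree: generated`.
autosummary_generate = True

# Directory, relative to conf.py, where Sphinx looks for template overrides
# such as _templates/autosummary/class.rst.
templates_path = ["_templates"]
```

The `.. minigallery::` directive and the `:cite:` role used in `transforms.rst` come from additional extensions, which this sketch leaves out.
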