Adopt `:autosummary:` in `torchaudio.transforms` module doc (#2683)

Summary:
* Introduce the mini-index at `torchaudio.transforms` page.
* Add "Augmentations" subsection.
* Also updated the overall introduction.

Rendered preview: https://output.circle-artifacts.com/output/job/1b65246a-403c-4d2c-b97d-d1b582d8b4e5/artifacts/0/docs/transforms.html

Screenshots:
* https://user-images.githubusercontent.com/855818/190591795-97c169db-a95b-480a-8d3c-d80072efa045.png
* https://user-images.githubusercontent.com/855818/190591828-03026918-febd-4194-91aa-7d8f704e17cc.png

Pull Request resolved: https://github.com/pytorch/audio/pull/2683

Reviewed By: carolineechen

Differential Revision: D39574255

Pulled By: mthrok

fbshipit-source-id: a4beed7cacbb5184bad96efa903a3a1123dab627

Commit baf354a7, authored Sep 16, 2022 by moto, committed by Facebook GitHub Bot Sep 16, 2022

Showing with 110 additions and 178 deletions

docs/source/_templates/autosummary/class.rst +7 -0

docs/source/transforms.rst +103 -178

--- a/docs/source/_templates/autosummary/class.rst
+++ b/docs/source/_templates/autosummary/class.rst
+..
+  autogenerated from source/_templates/autosummary/class.rst
+{{ name | underline }}
+.. autoclass:: {{ fullname }}
+   :members:
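
To make the effect of the new template concrete, here is a minimal sketch of the page it would produce for one class. It does not call Sphinx or Jinja; `underline` and `render_class_page` are hypothetical helpers written for illustration, and the `=` underline character is an assumption about how the `underline` filter behaves by default.

```python
def underline(title: str, char: str = "=") -> str:
    """Return the title followed by a rule of the same length (assumed filter behavior)."""
    return f"{title}\n{char * len(title)}"


def render_class_page(fullname: str) -> str:
    """Approximate the rst that class.rst expands to for a fully qualified class name."""
    # The template uses the short class name for the heading
    # and the full dotted path for the autoclass directive.
    name = fullname.rsplit(".", 1)[-1]
    lines = [
        "..",
        "  autogenerated from source/_templates/autosummary/class.rst",
        "",
        underline(name),
        "",
        f".. autoclass:: {fullname}",
        "   :members:",
    ]
    return "\n".join(lines) + "\n"


page = render_class_page("torchaudio.transforms.Resample")
print(page)
```

Each entry listed under an `autosummary` directive in `transforms.rst` would get one such generated page, which replaces the hand-written `autoclass` / `automethod` pairs removed in this commit.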
--- a/docs/source/transforms.rst
+++ b/docs/source/transforms.rst
-.. role:: hidden
-    :class: hidden-section
 torchaudio.transforms
-======================
+=====================
-.. py:module:: torchaudio.transforms
 .. currentmodule:: torchaudio.transforms
-Transforms are common audio transforms. They can be chained together using :class:`torch.nn.Sequential`
+The :mod:`torchaudio.transforms` module contains common audio processing and feature extraction transforms. The following diagram shows the relationship between some of the available transforms.
-:hidden:`Utility`
-~~~~~~~~~~~~~~~~~~
-:hidden:`AmplitudeToDB`
-----------------------
-.. autoclass:: AmplitudeToDB
-  .. automethod:: forward
-:hidden:`MelScale`
------------------
-.. autoclass:: MelScale
-  .. automethod:: forward
-:hidden:`InverseMelScale`
-------------------------
-.. autoclass:: InverseMelScale
-  .. automethod:: forward
-:hidden:`MuLawEncoding`
-----------------------
-.. autoclass:: MuLawEncoding
-  .. automethod:: forward
-:hidden:`MuLawDecoding`
-----------------------
-.. autoclass:: MuLawDecoding
-  .. automethod:: forward
-:hidden:`Resample`
------------------
-.. autoclass:: Resample
-  .. automethod:: forward
-:hidden:`FrequencyMasking`
--------------------------
-.. autoclass:: FrequencyMasking
-  .. automethod:: forward
-:hidden:`TimeMasking`
---------------------
-.. autoclass:: TimeMasking
-  .. automethod:: forward
-:hidden:`TimeStretch`
---------------------
-.. autoclass:: TimeStretch
-  .. automethod:: forward
+.. image:: https://download.pytorch.org/torchaudio/tutorial-assets/torchaudio_feature_extractions.png
-:hidden:`Fade`
+Transforms are implemented using :class:`torch.nn.Module`. Common ways to build a processing pipeline are to define a custom Module class or to chain Modules together using :class:`torch.nn.Sequential`, then move the pipeline to a target device and data type.
--------------
-.. autoclass:: Fade
+.. code::
-  .. automethod:: forward
+   # Define custom feature extraction pipeline.
+   #
+   # 1. Resample audio
+   # 2. Convert to power spectrogram
+   # 3. Apply augmentations
+   # 4. Convert to mel-scale
+   #
+   class MyPipeline(torch.nn.Module):
+       def __init__(
+           self,
+           input_freq=16000,
+           resample_freq=8000,
+           n_fft=1024,
+           n_mel=256,
+           stretch_factor=0.8,
+       ):
+           super().__init__()
+           self.resample = Resample(orig_freq=input_freq, new_freq=resample_freq)
-:hidden:`Vol`
+           self.spec = Spectrogram(n_fft=n_fft, power=2)
-------------
-.. autoclass:: Vol
-  .. automethod:: forward
-:hidden:`Loudness`
------------------
-.. autoclass:: Loudness
-  .. automethod:: forward
-:hidden:`Feature Extractions`
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-:hidden:`Spectrogram`
---------------------
-.. autoclass:: Spectrogram
-  .. automethod:: forward
-:hidden:`InverseSpectrogram`
----------------------------
-.. autoclass:: InverseSpectrogram
-  .. automethod:: forward
-:hidden:`MelSpectrogram`
------------------------
-.. autoclass:: MelSpectrogram
+           self.spec_aug = torch.nn.Sequential(
+               TimeStretch(stretch_factor, fixed_rate=True),
+               FrequencyMasking(freq_mask_param=80),
+               TimeMasking(time_mask_param=80),
+           )
-  .. automethod:: forward
+           self.mel_scale = MelScale(
+               n_mels=n_mel, sample_rate=resample_freq, n_stft=n_fft // 2 + 1)
-:hidden:`GriffinLim`
+       def forward(self, waveform: torch.Tensor) -> torch.Tensor:
--------------------
+           # Resample the input
+           resampled = self.resample(waveform)
-.. autoclass:: GriffinLim
+           # Convert to power spectrogram
+           spec = self.spec(resampled)
-  .. automethod:: forward
+           # Apply SpecAugment
+           spec = self.spec_aug(spec)
-:hidden:`MFCC`
+           # Convert to mel-scale
--------------
+           mel = self.mel_scale(spec)
-.. autoclass:: MFCC
+           return mel
-  .. automethod:: forward
-:hidden:`LFCC`
+.. code::
--------------
-.. autoclass:: LFCC
+   # Instantiate a pipeline
+   pipeline = MyPipeline()
-  .. automethod:: forward
+   # Move the computation graph to CUDA
+   pipeline.to(device=torch.device("cuda"), dtype=torch.float32)
-:hidden:`ComputeDeltas`
+   # Perform the transform
-----------------------
+   features = pipeline(waveform)
-.. autoclass:: ComputeDeltas
+Please check out the tutorials that cover in-depth usage of transforms.
-  .. automethod:: forward
+.. minigallery:: torchaudio.transforms
-:hidden:`PitchShift`
+Utility
--------------------
+-------
-.. autoclass:: PitchShift
+.. autosummary::
+    :toctree: generated
+    :nosignatures:
-  .. automethod:: forward
+    AmplitudeToDB
+    MelScale
+    InverseMelScale
+    MuLawEncoding
+    MuLawDecoding
+    Resample
+    Fade
+    Vol
+    Loudness
-:hidden:`SlidingWindowCmn`
+Feature Extractions
--------------------------
+-------------------
-.. autoclass:: SlidingWindowCmn
+.. autosummary::
+    :toctree: generated
+    :nosignatures:
-  .. automethod:: forward
+    Spectrogram
+    InverseSpectrogram
-:hidden:`SpectralCentroid`
+    MelSpectrogram
--------------------------
+    GriffinLim
+    MFCC
-.. autoclass:: SpectralCentroid
+    LFCC
+    ComputeDeltas
-  .. automethod:: forward
+    PitchShift
+    SlidingWindowCmn
-:hidden:`Vad`
+    SpectralCentroid
+    Vad
+Augmentations
 -------------
-.. autoclass:: Vad
+The following transforms implement popular augmentation techniques known as *SpecAugment* :cite:`specaugment`.
-  .. automethod:: forward
+.. autosummary::
+    :toctree: generated
+    :nosignatures:
-:hidden:`Loss`
+    FrequencyMasking
-~~~~~~~~~~~~~~
+    TimeMasking
+    TimeStretch
-:hidden:`RNNTLoss`
+Loss
------------------
+----
-.. autoclass:: RNNTLoss
+.. autosummary::
+    :toctree: generated
+    :nosignatures:
-  .. automethod:: forward
+    RNNTLoss
-:hidden:`Multi-channel`
+Multi-channel
-~~~~~~~~~~~~~~~~~~~~~~~
-:hidden:`PSD`
 -------------
-.. autoclass:: PSD
+.. autosummary::
+    :toctree: generated
-  .. automethod:: forward
+    :nosignatures:
-:hidden:`MVDR`
--------------
-.. autoclass:: MVDR
-  .. automethod:: forward
-:hidden:`RTFMVDR`
-----------------
-.. autoclass:: RTFMVDR
-  .. automethod:: forward
-:hidden:`SoudenMVDR`
--------------------
-.. autoclass:: SoudenMVDR
-  .. automethod:: forward
+    PSD
+    MVDR
+    RTFMVDR
+    SoudenMVDR
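
The `autosummary` directives above use `:toctree: generated`, which only yields per-class pages when the autosummary extension is enabled and stub generation is turned on. This commit does not touch `conf.py`, so the excerpt below is an assumed minimal configuration for illustration, not the project's actual settings.

```python
# Hypothetical excerpt of docs/source/conf.py (not part of this commit).

# autodoc provides the ``autoclass`` directive used by the class template;
# autosummary provides the ``autosummary`` directive and stub generation.
extensions = [
    "sphinx.ext.autodoc",
    "sphinx.ext.autosummary",
]

# Generate one stub page per entry listed under ``autosummary``.
autosummary_generate = True

# Look up custom templates such as autosummary/class.rst here,
# relative to the directory that contains conf.py.
templates_path = ["_templates"]
```

The `minigallery` directive and the `:cite:` role used in `transforms.rst` come from other extensions, which would need to be listed in `extensions` as well.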