Commit 6a8ed4a2 authored by hwangjeff, committed by Facebook GitHub Bot

Add documentation introducing I/O backend revision (#3147)

Summary:
Adds documentation that introduces the forthcoming I/O backend revision and explains how to enable it in the current release.

Doc pages:
https://output.circle-artifacts.com/output/job/9c0e5a49-eaf4-404c-b910-ca1b18bb289b/artifacts/0/docs/torchaudio.html

Pull Request resolved: https://github.com/pytorch/audio/pull/3147

Reviewed By: mthrok

Differential Revision: D43824019

Pulled By: hwangjeff

fbshipit-source-id: ad21d60c7e8f69f64859c56a8ca75735ddc22e40
parent 10aec5bd
......@@ -26,6 +26,7 @@ docset: html
%: Makefile
	doxygen source/Doxyfile
	@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
	@python post_process_dispatcher.py $(BUILDDIR)

clean:
	rm -rf $(BUILDDIR)/*
......
"""Replaces every instance of 'torchaudio._backend' with 'torchaudio' in torchaudio.html.
Temporary hack while we maintain both the existing set of info/load/save functions and the
new ones backed by the backend dispatcher in torchaudio._backend.
"""
import sys

if __name__ == "__main__":
    build_dir = sys.argv[1]
    filepath = f"{build_dir}/html/torchaudio.html"

    with open(filepath, "r") as f:
        text = f.read()
    text = text.replace("torchaudio._backend", "torchaudio")
    with open(filepath, "w") as f:
        f.write(text)
......@@ -10,6 +10,11 @@ Overview
The :mod:`torchaudio.backend` module provides implementations of the audio file I/O functions ``torchaudio.info``, ``torchaudio.load``, and ``torchaudio.save``.
.. note::

   Release 2.1 will revise ``torchaudio.info``, ``torchaudio.load``, and ``torchaudio.save`` to allow backend selection via a function parameter rather than ``torchaudio.set_audio_backend``, with FFmpeg as the default backend.
   The new logic can be enabled in the current release by setting the environment variable ``TORCHAUDIO_USE_BACKEND_DISPATCHER=1``.
   See :ref:`future_api` for details on the new API.
There are currently two implementations available.
* :py:mod:`"sox_io" <torchaudio.backends.sox_io_backend>` (default on Linux/macOS)
......
torchaudio
==========
.. note::

   Release 2.1 will revise ``torchaudio.info``, ``torchaudio.load``, and ``torchaudio.save`` to allow backend selection via a function parameter rather than ``torchaudio.set_audio_backend``, with FFmpeg as the default backend.
   The new API can be enabled in the current release by setting the environment variable ``TORCHAUDIO_USE_BACKEND_DISPATCHER=1``.
   See :ref:`future_api` for details on the new API.
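As a minimal sketch of the enablement step described in the note, the variable can be set from Python before the library is loaded. This assumes the flag is read when ``torchaudio`` is first imported, which the documentation above does not state explicitly; the import is guarded so the snippet also runs where torchaudio is not installed.

```python
import os

# Assumption: the flag is consulted at import time, so it is set
# before torchaudio is imported for the first time.
os.environ["TORCHAUDIO_USE_BACKEND_DISPATCHER"] = "1"

try:
    import torchaudio
except ImportError:
    # torchaudio is not installed in this environment.
    torchaudio = None

if torchaudio is not None:
    # With the flag set, info/load/save are expected to be the
    # dispatcher-backed functions described under "Future API".
    print(torchaudio.info, torchaudio.load, torchaudio.save)
```

Setting the variable in the shell (for example, exporting it before launching Python) achieves the same thing without touching the code.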
Current API
-----------
I/O functionalities
~~~~~~~~~~~~~~~~~~~
......@@ -31,3 +40,28 @@ Backend Utilities
.. autofunction:: get_audio_backend
.. autofunction:: set_audio_backend
.. _future_api:
Future API
----------
In the next release, each of ``torchaudio.info``, ``torchaudio.load``, and ``torchaudio.save`` will accept a ``backend`` parameter for selecting the backend to use.
The functions will support FFmpeg, SoX, and SoundFile, provided that the corresponding library is installed.
If no backend is explicitly chosen, the functions will select one according to the order of precedence (FFmpeg, SoX, SoundFile) and library availability.
Note that only FFmpeg and SoundFile will support file-like objects.
These functions can be enabled in the current release by setting the environment variable ``TORCHAUDIO_USE_BACKEND_DISPATCHER=1``.
.. currentmodule:: torchaudio._backend

.. autofunction:: info
   :noindex:

.. autofunction:: load
   :noindex:

.. autofunction:: save
   :noindex:
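The selection rule described in this section (explicit ``backend`` parameter first, otherwise precedence order filtered by availability) can be sketched in plain Python. This is a hypothetical illustration, not torchaudio's implementation: ``select_backend``, the backend name strings, and the choice to skip SoX for file-like input rather than raise are all assumptions made for the example.

```python
# Hypothetical sketch of the documented selection rule; not torchaudio code.
PRECEDENCE = ("ffmpeg", "sox", "soundfile")   # order stated in the docs
FILE_LIKE_CAPABLE = {"ffmpeg", "soundfile"}   # SoX lacks file-like support


def select_backend(available, backend=None, file_like=False):
    """Return the name of the backend that would handle a request.

    available: set of backend names whose libraries are installed.
    backend:   explicitly requested backend, or None to auto-select.
    file_like: True when the input is a file-like object, not a path.
    """
    if backend is not None:
        if backend not in available:
            raise RuntimeError(f"Requested backend {backend!r} is not available")
        candidates = [backend]
    else:
        # Keep precedence order, dropping backends that are not installed.
        candidates = [name for name in PRECEDENCE if name in available]

    for name in candidates:
        if file_like and name not in FILE_LIKE_CAPABLE:
            continue  # assumption: fall through to the next candidate
        return name
    raise RuntimeError("No suitable backend found")


print(select_backend({"ffmpeg", "sox"}))                     # auto-select
print(select_backend({"sox", "soundfile"}, file_like=True))  # SoX skipped
print(select_backend({"ffmpeg", "sox"}, backend="sox"))      # explicit choice
```

Modelling availability as a plain set keeps the sketch independent of any installed library; in practice the equivalent check would depend on which of FFmpeg, SoX, and SoundFile can actually be loaded.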
from .utils import get_info_func, get_load_func, get_save_func
info = get_info_func()
load = get_load_func()
save = get_save_func()
......@@ -309,14 +309,13 @@ def get_info_func():
* ``path-like``: file path
* ``file-like``: Object with ``read(size: int) -> bytes`` method,
which returns byte string of at most ``size`` length.
Note:
When the input type is file-like object, this function cannot
get the correct length (``num_samples``) for certain formats,
such as ``vorbis``.
In this case, the value of ``num_samples`` is ``0``.
format (str or None, optional):
If not ``None``, interpreted as hint that may allow backend to override the detected format.
......@@ -386,21 +385,21 @@ def get_load_func():
.. warning::
``normalize`` argument does not perform volume normalization.
It only converts the sample type to `torch.float32` from the native sample
type.
When the input format is WAV with integer type, such as 32-bit signed integer, 16-bit
signed integer, 24-bit signed integer, and 8-bit unsigned integer, by providing ``normalize=False``,
this function can return integer Tensor, where the samples are expressed within the whole range
of the corresponding dtype, that is, ``int32`` tensor for 32-bit signed PCM,
``int16`` for 16-bit signed PCM and ``uint8`` for 8-bit unsigned PCM. Since torch does not
support ``int24`` dtype, 24-bit signed PCM are converted to ``int32`` tensors.
``normalize`` argument has no effect on 32-bit floating-point WAV and other formats, such as
``flac`` and ``mp3``.
For these formats, this function always returns ``float32`` Tensor with values.
Args:
......
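The warning in the ``load`` docstring above separates two operations that are easy to confuse: converting the sample type and normalizing volume. The plain-Python sketch below illustrates the difference. It is not torchaudio code; the helper names are hypothetical, and the 1/32768 scaling for 16-bit PCM is an assumption made for illustration.

```python
# Illustration only: plain Python, no torchaudio involved.

def to_float(samples_int16):
    """Sample-type conversion: map int16 values to floats in [-1.0, 1.0).

    Assumption: 16-bit PCM is scaled by 1 / 2**15.
    """
    return [s / 32768 for s in samples_int16]


def peak_normalize(samples_float):
    """Volume normalization: rescale so the loudest sample has magnitude 1.0."""
    peak = max(abs(s) for s in samples_float)
    if peak == 0:
        return list(samples_float)
    return [s / peak for s in samples_float]


quiet = [0, 8192, -16384]            # a quiet 16-bit signal
converted = to_float(quiet)          # type changes, loudness does not
louder = peak_normalize(converted)   # loudness changes

print(converted)
print(louder)
```

A quiet recording stays quiet after the type conversion; only the separate peak-normalization step changes its level, which is the behaviour the warning says ``normalize`` does not provide.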