test/sox_io_backend/test_torchscript.py · 793eeab8745e17faac6ba0df85ceca39c9c20782 · OpenDAS / Torchaudio

moto authored Jun 25, 2020

This is part of the series of PRs adding the new "sox_io" backend (#726). It depends on #718 and #728.

This PR adds the `load` function to the "sox_io" backend. It is tested on the following audio formats:
 - `wav`
 - `mp3`
 - `flac`
 - `ogg/vorbis` *

By default, the "sox_io" backend returns a Tensor with `float32` dtype and shape `[channel, time]`. The samples are normalized to fit in the range `[-1.0, 1.0]`.

Unlike the existing "sox" backend, the new `load` function handles WAV files natively. When the input is a WAV file with an integer sample type (32-bit signed integer, 16-bit signed integer, or 8-bit unsigned integer), passing `normalize=False` makes the function return an integer Tensor whose samples span the full range of the corresponding dtype: `int32` for `32-bit PCM`, `int16` for `16-bit PCM`, and `uint8` for `8-bit PCM`. This behavior follows [scipy.io.wavfile.read](https://docs.scipy.org/doc/scipy/reference/generated/scipy.io.wavfile.read.html). The `normalize` parameter has no effect on other formats, for which `load` always returns normalized values in a `float32` Tensor.
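To make the two return modes concrete, here is a minimal sketch using only the Python standard library. It does not call torchaudio. The helper names (`make_wav`, `read_pcm16`, `normalize_int16`) are hypothetical, and the scaling by `2**15` is an assumed convention for mapping 16-bit PCM into `[-1.0, 1.0]`, not necessarily the exact arithmetic the backend uses. The sketch shows raw integer samples laid out as `[channel][time]` (analogous to `normalize=False`) next to their scaled float counterparts (analogous to the default).

```python
import io
import struct
import wave


def make_wav(interleaved, n_channels, rate=8000):
    """Build an in-memory 16-bit PCM WAV from interleaved integer samples."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as f:
        f.setnchannels(n_channels)
        f.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        f.setframerate(rate)
        f.writeframes(struct.pack("<%dh" % len(interleaved), *interleaved))
    return buf.getvalue()


def read_pcm16(wav_bytes):
    """Return (samples, rate); samples are raw ints laid out [channel][time]."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as f:
        n_channels = f.getnchannels()
        n_frames = f.getnframes()
        rate = f.getframerate()
        raw = f.readframes(n_frames)
    flat = struct.unpack("<%dh" % (n_frames * n_channels), raw)
    # WAV stores frames interleaved; slice with a stride to split channels.
    return [list(flat[c::n_channels]) for c in range(n_channels)], rate


def normalize_int16(channels):
    """Assumed scaling: divide by 2**15 so values fall within [-1.0, 1.0]."""
    return [[s / 32768.0 for s in ch] for ch in channels]


# Two channels, two frames: (0, 16384) then (-32768, 32767).
wav_bytes = make_wav([0, 16384, -32768, 32767], n_channels=2)
ints, rate = read_pcm16(wav_bytes)   # integer view, full int16 range
floats = normalize_int16(ints)       # float view, scaled into [-1.0, 1.0]
```

In the integer view the samples keep the full `int16` range, so the most negative sample stays `-32768`. In the float view the same sample becomes `-1.0`.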

__* Note__ The current binary distribution of torchaudio does not include the `ogg/vorbis` and `opus` codecs. To handle these files, torchaudio must be built from source with the required codecs available on the system.

__Note 2__ As of this PR, `scipy` is a required module for running the tests.
