    Add load function (#731) · 793eeab8
    moto authored
    This is part of a series of PRs that add the new "sox_io" backend (#726). It depends on #718 and #728.
    
    This PR adds a `load` function to the "sox_io" backend. It is tested on the following audio formats:
     - `wav`
     - `mp3`
     - `flac`
     - `ogg/vorbis` *
    
    By default, the "sox_io" backend returns a Tensor with `float32` dtype and shape `[channel, time]`. The samples are normalized to fit in the range `[-1.0, 1.0]`.
    
    Unlike the existing "sox" backend, the new `load` function can handle WAV files natively. When the input is a WAV file with an integer sample type (32-bit signed integer, 16-bit signed integer, or 8-bit unsigned integer), passing `normalize=False` makes the function return an integer Tensor whose samples span the whole range of the corresponding dtype: `int32` for `32-bit PCM`, `int16` for `16-bit PCM`, and `uint8` for `8-bit PCM`. This behavior follows [scipy.io.wavfile.read](https://docs.scipy.org/doc/scipy/reference/generated/scipy.io.wavfile.read.html). The `normalize` parameter has no effect for other formats; for those, `load` always returns normalized values as a `float32` Tensor.
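
To make the `normalize` distinction concrete, here is a minimal standard-library sketch of the same idea for 16-bit PCM WAV data. It is not the "sox_io" implementation: `write_pcm16_wav` and `load_pcm16_wav` are hypothetical helpers, nested lists stand in for a `[channel, time]` Tensor, and dividing by `2**15` is an assumed full-scale convention for mapping `int16` samples into `[-1.0, 1.0]`.

```python
import io
import struct
import wave


def write_pcm16_wav(samples, sample_rate=8000):
    """Hypothetical helper: encode int16 samples as a mono 16-bit PCM WAV (bytes)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as writer:
        writer.setnchannels(1)
        writer.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        writer.setframerate(sample_rate)
        writer.writeframes(struct.pack("<%dh" % len(samples), *samples))
    return buf.getvalue()


def load_pcm16_wav(data, normalize=True):
    """Hypothetical helper: decode 16-bit PCM WAV bytes into [channel][time] lists.

    normalize=False keeps the raw int16 values; normalize=True divides by 2**15
    (an assumed convention) so that samples fall within [-1.0, 1.0].
    """
    with wave.open(io.BytesIO(data), "rb") as reader:
        if reader.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        num_channels = reader.getnchannels()
        num_frames = reader.getnframes()
        raw = reader.readframes(num_frames)
    ints = struct.unpack("<%dh" % (num_frames * num_channels), raw)
    # WAV frames are interleaved by channel; split them into one list per channel.
    channels = [list(ints[c::num_channels]) for c in range(num_channels)]
    if not normalize:
        return channels
    return [[sample / 32768.0 for sample in channel] for channel in channels]


wav_bytes = write_pcm16_wav([0, 16384, -32768, 32767])
raw_samples = load_pcm16_wav(wav_bytes, normalize=False)
float_samples = load_pcm16_wav(wav_bytes, normalize=True)
```

With `normalize=False` the integer samples come back unchanged, including the extremes of the `int16` range; with the default, the same samples are scaled to floats within `[-1.0, 1.0]`.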
    
    __* Note__ The current binary distribution of torchaudio does not contain the `ogg/vorbis` and `opus` codecs. To handle these files, one needs to build torchaudio from source with the proper codecs installed on the system.
    
    __Note 2__ As of this PR, `scipy` is a required module for running the tests.
test_torchscript.py 2.8 KB