    Add file-like object support to Streaming API (#2400) · a984872d
    moto authored
    Summary:
    This commit adds file-like object support to the Streaming API.
    
    ## Features
    - File-like objects are expected to implement `read(self, n)`.
    - Additionally, `seek(self, offset, whence)` is used if available.
    - Without a `seek` method, some formats cannot be decoded properly.
      - To work around this, use the existing `decoder` option to specify which decoder to use.
      - The `decoder` and `decoder_option` arguments were added to the `add_basic_[audio|video]_stream` methods, matching `add_[audio|video]_stream`.
      - The argument order was changed so that the arguments common to both audio and video come before the rest.
      - The `dtype` and `format` arguments were also changed to make them consistent across the audio and video methods.
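
As an illustration of the interface described above, here is a minimal sketch of two file-like objects: one that implements only `read(n)` and one that also offers the optional `seek(offset, whence)`. The class names `ReadOnlySource` and `SeekableSource` and the helper `supports_seek` are hypothetical and are not part of this commit; they only show the shape of object the feature list refers to.

```python
import io


class ReadOnlySource:
    """Hypothetical wrapper exposing only `read(n)`, like a network stream."""

    def __init__(self, data: bytes):
        self._buffer = io.BytesIO(data)

    def read(self, n: int) -> bytes:
        # Return up to `n` bytes; an empty bytes object signals end of stream.
        return self._buffer.read(n)


class SeekableSource(ReadOnlySource):
    """Same wrapper with the optional `seek(offset, whence)` method added."""

    def seek(self, offset: int, whence: int = io.SEEK_SET) -> int:
        # Reposition the stream and return the new absolute position.
        return self._buffer.seek(offset, whence)


def supports_seek(src) -> bool:
    """Report whether `src` offers the optional `seek` method (assumed check)."""
    return callable(getattr(src, "seek", None))
```

A source like `ReadOnlySource` is the case in which, according to the notes above, some formats may not decode properly and the `decoder` option can be given explicitly.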
    
    ## Code structure
    
    The approach is very similar to how file-like objects are supported in sox-based I/O.
    In the Streaming API, if the input `src` is a string, it is passed to the implementation bound with TorchBind;
    if `src` has a `read` attribute, it is passed to the same implementation bound via PyBind11.
    
    ![Untitled drawing](https://user-images.githubusercontent.com/855818/169098391-6116afee-7b29-460d-b50d-1037bb8a359d.png)
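
The dispatch rule described in this section can be sketched in plain Python. The function name `select_binding` and the returned labels are hypothetical stand-ins for illustration; the actual code passes `src` to the bound implementation rather than returning a label.

```python
import io


def select_binding(src) -> str:
    """Return which binding a source would be routed to (labels are illustrative)."""
    if isinstance(src, str):
        # Paths and URLs given as strings go to the TorchBind-bound implementation.
        return "torchbind"
    if hasattr(src, "read"):
        # Anything exposing `read` is treated as a file-like object (PyBind11 path).
        return "pybind11"
    raise TypeError(f"Unsupported source type: {type(src).__name__}")
```

For example, `select_binding("video.mp4")` takes the string branch, while `select_binding(io.BytesIO(b""))` takes the file-like branch.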
    
    ## Refactoring involved
    - Extracted to https://github.com/pytorch/audio/issues/2402
      - Some of the implementation in the original TorchBind surface layer was moved into a wrapper class so that it can be reused from the PyBind11 bindings. The wrapper class also serves to simplify the binding.
      - The `add_basic_[audio|video]_stream` methods were removed from the C++ layer because they only constructed a string and passed it to the `add_[audio|video]_stream` method, which is simpler to do in Python.
      - The original core Streamer implementation kept the use of types in the `c10` namespace to a minimum: all `c10::optional` and `c10::Dict` values were converted to their `std` equivalents at the binding layer. Since these types work fine with PyBind11, the Streamer core methods now deal with them directly.
    
    ## TODO:
    - [x] Check if it is possible to stream MP4 (yuv420p) from S3 and directly decode (with/without HW decoding).
    
    Pull Request resolved: https://github.com/pytorch/audio/pull/2400
    
    Reviewed By: carolineechen
    
    Differential Revision: D36520073
    
    Pulled By: mthrok
    
    fbshipit-source-id: a11d981bbe99b1ff0cc356e46264ac8e76614bc6
streaming_api_tutorial.py 31.1 KB