Refactor the internal of StreamReader (#3188)
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3188 Refactor the process after decoding in StreamRader. The post-decode process consists of three parts, 1. preprocessing using FilterGraph 2. conversion to Tensor 3. store in Buffer The FilterGraph class is a thin wrapper around AVFilterGraph structure from FFmpeg and it is agnostic to media type. However Tensor conversion and buffering consists of bunch of different logics. Currently, conversion process is abstracted away with template, i.e. `template<typename Conversion> Buffer`, and the whole process is implemeted in Sink class which consists of `FilterGraph` and `Buffer` which internally contains Conversion logic, even though conversion logic and buffer have nothing in common and beter logically separated. The new implementation replaces `Sink` class with `IPostDecodeProcess` interface, which contains the three components. The different post process is implemented as a template argument of the actual implementation, i.e. ```c++ template<typename Converter, typename Buffer> ProcessImpl : IPostDecodeProcess ``` and stored as `unique_ptr<IPostDecodeProcess>` on `StreamProcessor`. ([functionoid pattern](https://isocpp.org/wiki/faq/pointers-to-members#functionoids), which allows to eliminate all the branching based on the media format.) Note: This implementation was not possible at the initial version of StreamReader, as there was no way of knowing the media attributes coming out of `AVFilterGraph`. https://github.com/pytorch/audio/pull/3155 and https://github.com/pytorch/audio/pull/3183 added features to parse it properly, so we can finally make the post processing strongly-typed. Reviewed By: hwangjeff Differential Revision: D44242647 fbshipit-source-id: 96b8c6c72a2b8af4fa86a9b02292c65078ee265b
Showing
Please register or sign in to comment