# FFmpeg binding dev note

The FFmpeg binding is based on version 4.1.

## Learning material

To understand the concepts behind stream processing, the following tutorial is useful.

https://github.com/leandromoreira/ffmpeg-libav-tutorial

The best way to learn how to use FFmpeg is to look at the official examples.
Practically all of the binding code is a re-organization of these examples:

https://ffmpeg.org/doxygen/4.1/examples.html

## StreamingMediaDecoder Architecture

The top-level class is `StreamingMediaDecoder`. It handles the input (via `AVFormatContext*`) and manages one `StreamProcessor` per stream in the input.

The `StreamingMediaDecoder` object slices the input data into a series of `AVPacket` objects and feeds each packet to the corresponding `StreamProcessor`.

```
 StreamingMediaDecoder
┌─────────────────────────────────────────────────┐
│                                                 │
│ AVFormatContext*       ┌──► StreamProcessor[0]  │
│          │             │                        │
│          └─────────────┼──► StreamProcessor[1]  │
│      AVPacket*         │                        │
│                        └──► ...                 │
│                                                 │
└─────────────────────────────────────────────────┘
```
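
This dispatch can be sketched in plain C++. The types below (`Packet`, the processor map, and their methods) are simplified stand-ins for illustration, not the actual binding API:

```c++
#include <map>
#include <memory>

// Stand-in for AVPacket: carries the index of the stream it belongs to.
struct Packet {
  int stream_index;
};

// Stand-in for the per-stream processor; counts the packets it receives.
struct StreamProcessor {
  int num_packets = 0;
  void process_packet(const Packet&) { ++num_packets; }
};

// Stand-in for StreamingMediaDecoder: routes each packet to the processor
// registered for its stream; packets of unwatched streams are dropped.
struct StreamingMediaDecoder {
  std::map<int, std::unique_ptr<StreamProcessor>> processors;

  void add_stream(int index) {
    processors[index] = std::make_unique<StreamProcessor>();
  }

  void process_packet(const Packet& packet) {
    auto it = processors.find(packet.stream_index);
    if (it != processors.end()) {
      it->second->process_packet(packet);
    }
  }
};
```

In the real binding, the packets come from reading the `AVFormatContext*`, and `AVPacket::stream_index` selects the `StreamProcessor`.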

The `StreamProcessor` class is composed of one `Decoder` and multiple `Sink` objects.

`Sink` objects correspond to the output streams that users configure.
The `Sink` class is a wrapper around the `FilterGraph` and `Buffer` classes.

The `AVPacket*` passed to a `StreamProcessor` is first passed to the `Decoder`.
The `Decoder` generates audio / video frames (`AVFrame`) and passes them to the `Sink`s.

First, the `Sink` class passes the incoming frame to `FilterGraph`.

`FilterGraph` is a class based on the [`AVFilterGraph` structure](https://ffmpeg.org/doxygen/4.1/structAVFilterGraph.html),
and it can apply various filters.
At minimum, it performs format conversions, such as YUV to RGB,
so that the resulting data is suitable for Tensor representation.

The output `AVFrame` from `FilterGraph` is passed to the `Buffer` class, which converts it to a Tensor.

```
 StreamProcessor
┌─────────────────────────────────────────────────────────┐
│ AVPacket*                                               │
│  │                                                      │
│  │         AVFrame*          AVFrame*                   │
│  └► Decoder ──┬─► FilterGraph ─────► Buffer ───► Tensor │
│               │                                         │
│               ├─► FilterGraph ─────► Buffer ───► Tensor │
│               │                                         │
│               └─► ...                                   │
│                                                         │
└─────────────────────────────────────────────────────────┘
```
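
The fan-out from `Decoder` to the `Sink`s can be sketched with simplified stand-in types (the real classes operate on `AVFrame*` and produce Tensors; everything below is illustrative only):

```c++
#include <functional>
#include <vector>

// Stand-ins: a Frame for AVFrame, a filter function for FilterGraph,
// and a vector of Frames for Buffer.
using Frame = std::vector<float>;

// A Sink couples one filter with one output buffer.
struct Sink {
  std::function<Frame(const Frame&)> filter;
  std::vector<Frame> buffer;

  void push(const Frame& frame) { buffer.push_back(filter(frame)); }
};

// The processor fans every decoded frame out to all of its sinks.
struct StreamProcessor {
  std::vector<Sink> sinks;

  void on_decoded_frame(const Frame& frame) {
    for (auto& sink : sinks) {
      sink.push(frame);
    }
  }
};
```

Each `Sink` applies its own filter chain, so one decoded stream can feed several differently configured outputs.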

## Implementation guideline

### Memory management and object lifecycle

FFmpeg uses raw pointers, which need to be allocated and freed with dedicated functions.
In the binding code, these pointers are encapsulated in classes with RAII semantics and
`std::unique_ptr<>` to guarantee sole ownership.
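
The pattern can be sketched with `std::unique_ptr<>` and a custom deleter. Here a stand-in `Resource` type plays the role of an FFmpeg struct such as `AVFrame`, whose real deleter would call `av_frame_free`:

```c++
#include <memory>

// Stand-in for an FFmpeg struct that must be released with a dedicated
// C function (as AVFrame must be released with av_frame_free).
struct Resource {
  int id;
};

Resource* resource_alloc(int id) { return new Resource{id}; }
void resource_free(Resource** p) {
  delete *p;
  *p = nullptr;
}

// Deleter adapting the C-style free function to std::unique_ptr.
struct ResourceDeleter {
  void operator()(Resource* p) const { resource_free(&p); }
};
using ResourcePtr = std::unique_ptr<Resource, ResourceDeleter>;
```

`ResourcePtr` guarantees sole ownership and releases the resource when it goes out of scope, which is what the binding's wrapper classes do for the FFmpeg pointers they hold.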

**Decoder lifecycle**

```c++
// Default construction (no memory allocation)
decoder = Decoder(...);
// Decode
decoder.process_packet(pPacket);
// Retrieve result
decoder.get_frame(pFrame);
// Resources are released by ~Decoder() when the object goes out of scope
```

**FilterGraph lifecycle**

```c++
// Default construction (no memory allocation)
filter_graph = FilterGraph(AVMEDIA_TYPE_AUDIO);
// Filter configuration
filter_graph.add_audio_src(...);
filter_graph.add_sink(...);
filter_graph.add_process("<filter expression>");
filter_graph.create_filter();
// Apply filter
filter_graph.add_frame(pFrame);
// Retrieve result
filter_graph.get_frame(pFrame);
// Resources are released by ~FilterGraph() when the object goes out of scope
```

**StreamProcessor lifecycle**

```c++
// Default construction (no memory allocation)
processor = StreamProcessor(...);
// Define the output streams
processor.add_audio_stream(...);
processor.add_audio_stream(...);
// Process the packet
processor.process_packet(pPacket);
// Retrieve result
tensor = processor.get_chunk(...);
// Resources are released by ~StreamProcessor() when the object goes out of scope
```

### ON/OFF semantics and `std::unique_ptr<>`

Since we want some components (such as stream processors and filters) to be
separately configurable, each needs an ON/OFF state.
To keep the code simple, we use `std::unique_ptr<>`:
`nullptr` means the component is turned off.
This pattern applies to `StreamProcessor` (output streams).
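
A minimal sketch of this ON/OFF pattern (the `OutputStream` and `Pipeline` names are hypothetical, not the binding's):

```c++
#include <memory>

// Hypothetical component that can be turned on and off individually.
struct OutputStream {};

struct Pipeline {
  // nullptr means the output stream is turned off.
  std::unique_ptr<OutputStream> stream;

  void enable() { stream = std::make_unique<OutputStream>(); }
  void disable() { stream.reset(); }
  bool is_enabled() const { return static_cast<bool>(stream); }
};
```

There is no separate boolean flag to keep in sync: ownership and the ON/OFF state live in the same pointer.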

### Exception and return value

To report errors during the configuration and initialization of objects,
we use exceptions. However, throwing is expensive during streaming,
so there we use return values instead.
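
A minimal sketch of this convention (the function names are illustrative; the real streaming code reports FFmpeg-style error codes):

```c++
#include <stdexcept>

// Configuration path: runs once, so throwing on error is acceptable.
void configure(int sample_rate) {
  if (sample_rate <= 0) {
    throw std::runtime_error("sample_rate must be positive");
  }
}

// Streaming path: runs per packet, so errors are reported through the
// return value (0 for success, negative for failure, as FFmpeg does).
int process_packet(bool packet_ok) {
  return packet_ok ? 0 : -1;
}
```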