Decoding / Encoding images and videos
=====================================

.. currentmodule:: torchvision.io

The :mod:`torchvision.io` package provides functions for performing IO
operations. They are currently specific to reading and writing images and
videos.

Images
------

.. autosummary::
    :toctree: generated/
    :template: function.rst

    read_image
    decode_image
    encode_jpeg
    decode_jpeg
    write_jpeg
    decode_gif
    encode_png
    decode_png
    write_png
    read_file
    write_file

.. autosummary::
    :toctree: generated/
    :template: class.rst

    ImageReadMode

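As a minimal sketch of how the image functions listed above fit together, the
following encodes a synthetic tensor to PNG, decodes it again, and round-trips
it through a temporary file. The image size and the file name ``example.png``
are arbitrary choices made for illustration, not part of the API.

.. code:: python

    import os
    import tempfile

    import torch
    from torchvision.io import (
        ImageReadMode,
        decode_png,
        encode_png,
        read_image,
        write_png,
    )

    # Synthetic uint8 image in (C, H, W) layout; the size is arbitrary.
    img = torch.randint(0, 256, (3, 32, 48), dtype=torch.uint8)

    # encode_png returns a 1-D uint8 tensor holding the raw PNG bytes.
    data = encode_png(img)

    # PNG is lossless, so decoding should give back the same pixel values.
    decoded = decode_png(data)
    roundtrip_ok = torch.equal(decoded, img)

    # write_png / read_image do the same through a file on disk.
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "example.png")  # hypothetical file name
        write_png(img, path)
        # ImageReadMode controls the channel layout of the decoded tensor.
        gray = read_image(path, mode=ImageReadMode.GRAY)

    gray_shape = tuple(gray.shape)
    print(roundtrip_ok, gray_shape)

JPEG is lossy, so the same exact-equality check would not be expected to hold
for :func:`encode_jpeg` and :func:`decode_jpeg`.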


Video
-----

.. autosummary::
    :toctree: generated/
    :template: function.rst

    read_video
    read_video_timestamps
    write_video

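The functions above can be exercised without an existing video file. The
sketch below writes a short synthetic clip and reads it back. It assumes that
PyAV is installed with an H.264 encoder available and that the installed
torchvision version still ships these functions. The clip size, frame rate
and the file name ``clip.mp4`` are illustrative choices only.

.. code:: python

    import os
    import tempfile

    import torch
    from torchvision.io import read_video, write_video

    # Ten black frames in (T, H, W, C) layout, the layout write_video expects.
    frames = torch.zeros((10, 64, 64, 3), dtype=torch.uint8)

    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "clip.mp4")  # hypothetical file name
        write_video(path, frames, fps=5)

        # read_video returns the video frames, the audio frames and a dict
        # with stream information such as "video_fps".
        video, audio, info = read_video(path, pts_unit="sec")

    print(video.shape, info.get("video_fps"))

By default the returned video tensor uses the same (T, H, W, C) layout as the
input to :func:`write_video`.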

Fine-grained video API
^^^^^^^^^^^^^^^^^^^^^^

In addition to the :func:`read_video` function, we provide a high-performance,
lower-level API that offers more fine-grained control.
It does all this while fully supporting TorchScript.

.. betastatus:: fine-grained video API

.. autosummary::
    :toctree: generated/
    :template: class.rst

    VideoReader


Example of inspecting a video:

.. code:: python

    import torchvision
    video_path = "path to a test video"
    # The constructor allocates memory and a threaded decoder
    # instance per video. At the moment it takes two arguments:
    # the path to the video file and the desired stream.
    reader = torchvision.io.VideoReader(video_path, "video")

    # Information about the video can be retrieved using the
    # `get_metadata()` method. It returns a dictionary for every stream, with
    # the duration and other relevant metadata (often the frame rate).
    reader_md = reader.get_metadata()

    # The metadata is a dict of dicts with the following structure:
    # {"stream_type": {"attribute": [attribute per stream]}}
    #
    # The following prints the list of frame rates for every video stream present.
    print(reader_md["video"]["fps"])

    # We explicitly select the stream we would like to operate on. The
    # constructor selects a default video stream, but in practice
    # we can set whichever stream we would like.
    reader.set_current_stream("video:0")
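
To make the metadata layout described in the comments above concrete, the
sketch below builds a plain dictionary of the same shape and derives
``"type:index"`` stream identifiers from it. The values in ``example_md`` and
the helper ``stream_ids`` are hypothetical and only illustrate the structure.
They are not produced by, or part of, ``torchvision``.

.. code:: python

    # Hypothetical metadata with the documented dict-of-dicts shape:
    # {"stream_type": {"attribute": [attribute per stream]}}
    example_md = {
        "video": {"duration": [10.0], "fps": [29.97]},
        "audio": {"duration": [10.0], "framerate": [44100.0]},
    }


    def stream_ids(metadata):
        """Return one "type:index" identifier per stream (illustrative helper)."""
        ids = []
        for stream_type, attrs in metadata.items():
            # Each attribute list holds one entry per stream of this type.
            count = len(next(iter(attrs.values()), []))
            ids.extend(f"{stream_type}:{i}" for i in range(count))
        return ids


    print(stream_ids(example_md))  # ['video:0', 'audio:0']

Identifiers of this form are what ``set_current_stream`` receives in the
example above.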