Commit 4ec38d49, authored by Francisco Massa, committed by GitHub

Expose docs for io and ops package (#1189)

* Expose docs for io and ops package

Had to modify the docstrings to use Napoleon NumPy style, because Napoleon Google style doesn't support multiple return arguments

* Add video section
parent 9168476a
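
To illustrate the docstring layout this commit switches to, here is a minimal sketch of a NumPy-style docstring documenting several named return values. The function `split_stream` and its parameters are hypothetical and exist only for this example; they are not part of torchvision.

```python
def split_stream(samples, rate):
    """
    Hypothetical helper, used only to illustrate the docstring layout.

    Parameters
    ----------
    samples : list of float
        input samples
    rate : int
        sampling rate in Hz

    Returns
    -------
    first_half : list of float
        the first half of ``samples``
    second_half : list of float
        the remaining samples
    duration : float
        total duration in seconds
    """
    mid = len(samples) // 2
    return samples[:mid], samples[mid:], len(samples) / rate


first_half, second_half, duration = split_stream([0.0, 1.0, 2.0, 3.0], 2)
```

Each entry under `Returns` gets its own name, type, and description, which is the layout the rewritten docstrings in the hunks below use.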
@@ -9,7 +9,9 @@ architectures, and common image transformations for computer vision.
 :caption: Package Reference
 datasets
+io
 models
+ops
 transforms
 utils
......
torchvision.io
==============

.. currentmodule:: torchvision.io

The :mod:`torchvision.io` package provides functions for performing IO
operations. They are currently specific to reading and writing video.

Video
-----

.. autofunction:: read_video

.. autofunction:: read_video_timestamps

.. autofunction:: write_video

torchvision.ops
===============

.. currentmodule:: torchvision.ops

:mod:`torchvision.ops` implements operators that are specific for Computer Vision.

.. note::

  Those operators currently do not support TorchScript.

.. autofunction:: nms

.. autofunction:: roi_align

.. autofunction:: roi_pool

.. autoclass:: RoIAlign

.. autoclass:: RoIPool
@@ -28,11 +28,14 @@ def write_video(filename, video_array, fps, video_codec='libx264', options=None)
     """
     Writes a 4d tensor in [T, H, W, C] format in a video file

-    Arguments:
-        filename (str): path where the video will be saved
-        video_array (Tensor[T, H, W, C]): tensor containing the individual frames,
-            as a uint8 tensor in [T, H, W, C] format
-        fps (Number): frames per second
+    Parameters
+    ----------
+    filename : str
+        path where the video will be saved
+    video_array : Tensor[T, H, W, C]
+        tensor containing the individual frames, as a uint8 tensor in [T, H, W, C] format
+    fps : Number
+        frames per second
     """
     _check_av_available()
     video_array = torch.as_tensor(video_array, dtype=torch.uint8).numpy()
@@ -135,18 +138,25 @@ def read_video(filename, start_pts=0, end_pts=None):
     Reads a video from a file, returning both the video frames as well as
     the audio frames

-    Arguments:
-        filename (str): path to the video file
-        start_pts (int, optional): the start presentation time of the video
-        end_pts (int, optional): the end presentation time
+    Parameters
+    ----------
+    filename : str
+        path to the video file
+    start_pts : int, optional
+        the start presentation time of the video
+    end_pts : int, optional
+        the end presentation time

-    Returns:
-        vframes (Tensor[T, H, W, C]): the `T` video frames
-        aframes (Tensor[K, L]): the audio frames, where `K` is the number of channels
-            and `L` is the number of points
-        info (Dict): metadata for the video and audio. Can contain the fields
-            - video_fps (float)
-            - audio_fps (int)
+    Returns
+    -------
+    vframes : Tensor[T, H, W, C]
+        the `T` video frames
+    aframes : Tensor[K, L]
+        the audio frames, where `K` is the number of channels and `L` is the
+        number of points
+    info : Dict
+        metadata for the video and audio. Can contain the fields video_fps (float)
+        and audio_fps (int)
     """
     _check_av_available()
@@ -201,13 +211,18 @@ def read_video_timestamps(filename):
     Note that the function decodes the whole video frame-by-frame.

-    Arguments:
-        filename (str): path to the video file
+    Parameters
+    ----------
+    filename : str
+        path to the video file
+
+    Returns
+    -------
+    pts : List[int]
+        presentation timestamps for each one of the frames in the video.
+    video_fps : int
+        the frame rate for the video

-    Returns:
-        pts (List[int]): presentation timestamps for each one of the frames
-            in the video.
-        video_fps (int): the frame rate for the video
     """
     _check_av_available()
     container = av.open(filename, metadata_errors='ignore')
......
@@ -11,17 +11,23 @@ def nms(boxes, scores, iou_threshold):
     IoU greater than iou_threshold with another (higher scoring)
     box.

-    Arguments:
-        boxes (Tensor[N, 4]): boxes to perform NMS on. They
-            are expected to be in (x1, y1, x2, y2) format
-        scores (Tensor[N]): scores for each one of the boxes
-        iou_threshold (float): discards all overlapping
-            boxes with IoU < iou_threshold
+    Parameters
+    ----------
+    boxes : Tensor[N, 4])
+        boxes to perform NMS on. They
+        are expected to be in (x1, y1, x2, y2) format
+    scores : Tensor[N]
+        scores for each one of the boxes
+    iou_threshold : float
+        discards all overlapping
+        boxes with IoU < iou_threshold

-    Returns:
-        keep (Tensor): int64 tensor with the indices
-            of the elements that have been kept
-            by NMS, sorted in decreasing order of scores
+    Returns
+    -------
+    keep : Tensor
+        int64 tensor with the indices
+        of the elements that have been kept
+        by NMS, sorted in decreasing order of scores
     """
     _C = _lazy_import()
     return _C.nms(boxes, scores, iou_threshold)
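
For readers unfamiliar with the operation documented in this hunk, the following is a minimal pure-Python sketch of greedy non-maximum suppression. It is not the compiled `_C.nms` kernel; the names `iou` and `nms_reference` are hypothetical, and details such as tie-breaking between equal scores are assumptions. The sketch follows the docstring's summary line, so a box is dropped when its IoU with a higher-scoring kept box is greater than `iou_threshold`. The `iou_threshold` parameter description in the hunk reads `IoU < iou_threshold`, which appears to state the inverse.

```python
def iou(a, b):
    """Intersection over union of two boxes in (x1, y1, x2, y2) format."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms_reference(boxes, scores, iou_threshold):
    """Return indices of kept boxes, sorted by decreasing score."""
    # Visit boxes from highest to lowest score.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep the box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = nms_reference(boxes, scores, iou_threshold=0.5)  # [0, 2]
```

The first two boxes overlap with an IoU of 81/119 (about 0.68), so the lower-scoring one is suppressed, while the third box does not overlap anything and is kept.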
@@ -34,19 +40,25 @@ def batched_nms(boxes, scores, idxs, iou_threshold):
     Each index value correspond to a category, and NMS
     will not be applied between elements of different categories.

-    Arguments:
-        boxes (Tensor[N, 4]): boxes where NMS will be performed. They
-            are expected to be in (x1, y1, x2, y2) format
-        scores (Tensor[N]): scores for each one of the boxes
-        idxs (Tensor[N]): indices of the categories for each
-            one of the boxes.
-        iou_threshold (float): discards all overlapping boxes
-            with IoU < iou_threshold
+    Parameters
+    ----------
+    boxes : Tensor[N, 4]
+        boxes where NMS will be performed. They
+        are expected to be in (x1, y1, x2, y2) format
+    scores : Tensor[N]
+        scores for each one of the boxes
+    idxs : Tensor[N]
+        indices of the categories for each one of the boxes.
+    iou_threshold : float
+        discards all overlapping boxes
+        with IoU < iou_threshold

-    Returns:
-        keep (Tensor): int64 tensor with the indices of
-            the elements that have been kept by NMS, sorted
-            in decreasing order of scores
+    Returns
+    -------
+    keep : Tensor
+        int64 tensor with the indices of
+        the elements that have been kept by NMS, sorted
+        in decreasing order of scores
     """
     if boxes.numel() == 0:
         return torch.empty((0,), dtype=torch.int64, device=boxes.device)
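
The per-category behaviour described in this docstring can be sketched in pure Python as below. This is an illustration under assumptions, not the torchvision implementation (which is not shown in the hunk): the names `_iou`, `_nms_subset`, and `batched_nms_reference` are hypothetical, and the sketch simply runs greedy NMS separately within each category before merging the results.

```python
def _iou(a, b):
    """Intersection over union of two boxes in (x1, y1, x2, y2) format."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def _nms_subset(boxes, scores, candidates, iou_threshold):
    """Greedy NMS restricted to the box indices listed in `candidates`."""
    keep = []
    for i in sorted(candidates, key=lambda k: scores[k], reverse=True):
        if all(_iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


def batched_nms_reference(boxes, scores, idxs, iou_threshold):
    """Apply NMS within each category; never suppress across categories."""
    keep = []
    for category in set(idxs):
        members = [i for i in range(len(boxes)) if idxs[i] == category]
        keep.extend(_nms_subset(boxes, scores, members, iou_threshold))
    # Merge the per-category results, sorted by decreasing score.
    return sorted(keep, key=lambda i: scores[i], reverse=True)


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (0, 0, 10, 10)]
scores = [0.9, 0.8, 0.7]
idxs = [0, 0, 1]
kept = batched_nms_reference(boxes, scores, idxs, iou_threshold=0.5)  # [0, 2]
```

Box 2 is identical to box 0 but belongs to a different category, so it survives; box 1 shares a category with box 0 and is suppressed.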
......