OpenDAS / vision · commit 89bc3079 (Unverified)

Unify parameters formatting in docstrings (#3268)

Authored Jan 22, 2021 by Nicolas Hug; committed by GitHub on Jan 22, 2021
Parent: e04de77c

Changes: 6 files changed, with 199 additions and 284 deletions
references/detection/coco_eval.py   +4   −2
torchvision/io/_video_opt.py        +82  −106
torchvision/io/image.py             +56  −78
torchvision/io/video.py             +36  −58
torchvision/models/mobilenetv2.py   +0   −4
torchvision/ops/boxes.py            +21  −36
references/detection/coco_eval.py

@@ -238,8 +238,10 @@ maskUtils = mask_util
 def loadRes(self, resFile):
     """
     Load result file and return a result api object.
-    :param resFile (str) : file name of result file
-    :return: res (obj) : result api object
+    Args:
+        resFile (str): file name of result file
+    Returns:
+        res (obj): result api object
     """
     res = COCO()
     res.dataset['images'] = [img for img in self.dataset['images']]
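The hunk above moves loadRes from reST-style `:param:` fields to the Google-style `Args:`/`Returns:` sections used throughout this commit. As an illustrative sketch of the unified convention (the function name and body here are toy stand-ins, not part of the commit):

```python
def load_results(res_file):
    """Load a result file and return a result object.

    Args:
        res_file (str): file name of the result file

    Returns:
        dict: the parsed results (toy stand-in for the COCO result api object)
    """
    # Illustrative only: a real loader would parse the annotation file here.
    return {"source": res_file}
```

Tools such as Sphinx's napoleon extension parse exactly these `Args:`/`Returns:` section headers, which is why the commit normalizes onto them.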
torchvision/io/_video_opt.py

@@ -181,17 +181,14 @@ def _read_video_from_file(
     Reads a video from a file, returning both the video frames as well as
     the audio frames
-    Args
-    ----------
-    filename : str
-        path to the video file
-    seek_frame_margin: double, optional
-        seeking frame in the stream is imprecise. Thus, when video_start_pts
-        is specified, we seek the pts earlier by seek_frame_margin seconds
-    read_video_stream: int, optional
-        whether read video stream. If yes, set to 1. Otherwise, 0
-    video_width/video_height/video_min_dimension/video_max_dimension: int
-        together decide the size of decoded frames
+    Args:
+        filename (str): path to the video file
+        seek_frame_margin (double, optional): seeking frame in the stream is imprecise. Thus,
+            when video_start_pts is specified, we seek the pts earlier by seek_frame_margin seconds
+        read_video_stream (int, optional): whether read video stream. If yes, set to 1. Otherwise, 0
+        video_width/video_height/video_min_dimension/video_max_dimension (int): together decide
+            the size of decoded frames:
         - When video_width = 0, video_height = 0, video_min_dimension = 0,
             and video_max_dimension = 0, keep the original frame resolution
         - When video_width = 0, video_height = 0, video_min_dimension != 0,
@@ -214,30 +211,19 @@ def _read_video_from_file(
             and video_max_dimension = 0, resize the frame so that frame
             video_width and video_height are set to $video_width and
             $video_height, respectively
-    video_pts_range : list(int), optional
-        the start and end presentation timestamp of video stream
-    video_timebase: Fraction, optional
-        a Fraction rational number which denotes timebase in video stream
-    read_audio_stream: int, optional
-        whether read audio stream. If yes, set to 1. Otherwise, 0
-    audio_samples: int, optional
-        audio sampling rate
-    audio_channels: int optional
-        audio channels
-    audio_pts_range : list(int), optional
-        the start and end presentation timestamp of audio stream
-    audio_timebase: Fraction, optional
-        a Fraction rational number which denotes time base in audio stream
-    Returns
-    -------
-    vframes : Tensor[T, H, W, C]
-        the `T` video frames
-    aframes : Tensor[L, K]
-        the audio frames, where `L` is the number of points and
+        video_pts_range (list(int), optional): the start and end presentation timestamp of video stream
+        video_timebase (Fraction, optional): a Fraction rational number which denotes timebase in video stream
+        read_audio_stream (int, optional): whether read audio stream. If yes, set to 1. Otherwise, 0
+        audio_samples (int, optional): audio sampling rate
+        audio_channels (int optional): audio channels
+        audio_pts_range (list(int), optional): the start and end presentation timestamp of audio stream
+        audio_timebase (Fraction, optional): a Fraction rational number which denotes time base in audio stream
+    Returns:
+        vframes (Tensor[T, H, W, C]): the `T` video frames
+        aframes (Tensor[L, K]): the audio frames, where `L` is the number of points and
         `K` is the number of audio_channels
-    info : Dict
-        metadata for the video and audio. Can contain the fields video_fps (float)
+        info (Dict): metadata for the video and audio. Can contain the fields video_fps (float)
         and audio_fps (int)
     """
     _validate_pts(video_pts_range)
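The docstring above describes video_timebase as a Fraction and the pts ranges as integer timestamps in that timebase. A minimal sketch of how a timestamp in seconds maps to pts units given a stream timebase (`sec_to_pts` and `pts_to_sec` are hypothetical helpers, not part of torchvision):

```python
from fractions import Fraction


def sec_to_pts(t_sec, time_base):
    # pts = seconds / time_base; e.g. with a 1/90000 timebase,
    # 1 second corresponds to 90000 pts units
    return int(t_sec / time_base)


def pts_to_sec(pts, time_base):
    # inverse mapping: seconds = pts * time_base
    return float(pts * time_base)
```

This is the arithmetic that makes "seek the pts earlier by seek_frame_margin seconds" meaningful: the margin in seconds is converted to pts units via the stream's timebase before seeking.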
@@ -345,17 +331,15 @@ def _read_video_from_memory(
     the audio frames
     This function is torchscriptable.
-    Args
-    ----------
-    video_data : data type could be 1) torch.Tensor, dtype=torch.int8 or 2) python bytes
-        compressed video content stored in either 1) torch.Tensor 2) python bytes
-    seek_frame_margin: double, optional
-        seeking frame in the stream is imprecise. Thus, when video_start_pts is specified,
-        we seek the pts earlier by seek_frame_margin seconds
-    read_video_stream: int, optional
-        whether read video stream. If yes, set to 1. Otherwise, 0
-    video_width/video_height/video_min_dimension/video_max_dimension: int
-        together decide the size of decoded frames
+    Args:
+        video_data (data type could be 1) torch.Tensor, dtype=torch.int8 or 2) python bytes):
+            compressed video content stored in either 1) torch.Tensor 2) python bytes
+        seek_frame_margin (double, optional): seeking frame in the stream is imprecise.
+            Thus, when video_start_pts is specified, we seek the pts earlier by seek_frame_margin seconds
+        read_video_stream (int, optional): whether read video stream. If yes, set to 1. Otherwise, 0
+        video_width/video_height/video_min_dimension/video_max_dimension (int): together decide
+            the size of decoded frames:
         - When video_width = 0, video_height = 0, video_min_dimension = 0,
             and video_max_dimension = 0, keep the original frame resolution
         - When video_width = 0, video_height = 0, video_min_dimension != 0,
@@ -378,27 +362,19 @@ def _read_video_from_memory(
             and video_max_dimension = 0, resize the frame so that frame
             video_width and video_height are set to $video_width and
             $video_height, respectively
-    video_pts_range : list(int), optional
-        the start and end presentation timestamp of video stream
-    video_timebase_numerator / video_timebase_denominator: optional
-        a rational number which denotes timebase in video stream
-    read_audio_stream: int, optional
-        whether read audio stream. If yes, set to 1. Otherwise, 0
-    audio_samples: int, optional
-        audio sampling rate
-    audio_channels: int optional
-        audio audio_channels
-    audio_pts_range : list(int), optional
-        the start and end presentation timestamp of audio stream
-    audio_timebase_numerator / audio_timebase_denominator: optional
-        a rational number which denotes time base in audio stream
-    Returns
-    -------
-    vframes : Tensor[T, H, W, C]
-        the `T` video frames
-    aframes : Tensor[L, K]
-        the audio frames, where `L` is the number of points and
+        video_pts_range (list(int), optional): the start and end presentation timestamp of video stream
+        video_timebase_numerator / video_timebase_denominator (float, optional): a rational
+            number which denotes timebase in video stream
+        read_audio_stream (int, optional): whether read audio stream. If yes, set to 1. Otherwise, 0
+        audio_samples (int, optional): audio sampling rate
+        audio_channels (int optional): audio audio_channels
+        audio_pts_range (list(int), optional): the start and end presentation timestamp of audio stream
+        audio_timebase_numerator / audio_timebase_denominator (float, optional):
+            a rational number which denotes time base in audio stream
+    Returns:
+        vframes (Tensor[T, H, W, C]): the `T` video frames
+        aframes (Tensor[L, K]): the audio frames, where `L` is the number of points and
         `K` is the number of channels
     """
torchvision/io/image.py

@@ -119,18 +119,14 @@ def encode_png(input: torch.Tensor, compression_level: int = 6) -> torch.Tensor:
     Takes an input tensor in CHW layout and returns a buffer with the contents
     of its corresponding PNG file.
-    Parameters
-    ----------
-    input: Tensor[channels, image_height, image_width]
-        int8 image tensor of `c` channels, where `c` must 3 or 1.
-    compression_level: int
-        Compression factor for the resulting file, it must be a number
+    Args:
+        input (Tensor[channels, image_height, image_width]): int8 image tensor of
+            `c` channels, where `c` must 3 or 1.
+        compression_level (int): Compression factor for the resulting file, it must be a number
         between 0 and 9. Default: 6
-    Returns
-    -------
-    output: Tensor[1]
-        A one dimensional int8 tensor that contains the raw bytes of the
+    Returns:
+        output (Tensor[1]): A one dimensional int8 tensor that contains the raw bytes of the
         PNG file.
     """
     output = torch.ops.image.encode_png(input, compression_level)
@@ -142,14 +138,11 @@ def write_png(input: torch.Tensor, filename: str, compression_level: int = 6):
     Takes an input tensor in CHW layout (or HW in the case of grayscale images)
     and saves it in a PNG file.
-    Parameters
-    ----------
-    input: Tensor[channels, image_height, image_width]
-        int8 image tensor of `c` channels, where `c` must be 1 or 3.
-    filename: str
-        Path to save the image.
-    compression_level: int
-        Compression factor for the resulting file, it must be a number
+    Args:
+        input (Tensor[channels, image_height, image_width]): int8 image tensor of
+            `c` channels, where `c` must be 1 or 3.
+        filename (str): Path to save the image.
+        compression_level (int): Compression factor for the resulting file, it must be a number
         between 0 and 9. Default: 6
     """
     output = encode_png(input, compression_level)
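encode_png above is documented as returning the raw bytes of a PNG file, and write_png simply saves that buffer. To make "raw bytes of the PNG file" concrete, here is a stdlib-only toy that builds a minimal valid grayscale PNG buffer; it is a sketch under assumptions (fixed filter type 0, 8-bit grayscale only), not torchvision's libpng-backed implementation:

```python
import struct
import zlib


def encode_png_gray(pixels):
    """Build the raw bytes of a PNG file from a 2-D list of 8-bit gray pixels."""
    h, w = len(pixels), len(pixels[0])

    def chunk(tag, data):
        # each PNG chunk: length, tag, payload, CRC-32 over tag + payload
        return (struct.pack(">I", len(data)) + tag + data
                + struct.pack(">I", zlib.crc32(tag + data)))

    # IHDR: width, height, bit depth 8, color type 0 (grayscale),
    # compression 0, filter 0, interlace 0
    ihdr = struct.pack(">IIBBBBB", w, h, 8, 0, 0, 0, 0)
    # each scanline is prefixed with filter byte 0 ("None"), then deflated
    raw = b"".join(b"\x00" + bytes(row) for row in pixels)
    return (b"\x89PNG\r\n\x1a\n"
            + chunk(b"IHDR", ihdr)
            + chunk(b"IDAT", zlib.compress(raw, 6))
            + chunk(b"IEND", b""))
```

The `6` passed to `zlib.compress` plays the same role as the `compression_level` parameter documented above: a 0-9 deflate compression factor.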
@@ -182,18 +175,14 @@ def encode_jpeg(input: torch.Tensor, quality: int = 75) -> torch.Tensor:
     Takes an input tensor in CHW layout and returns a buffer with the contents
     of its corresponding JPEG file.
-    Parameters
-    ----------
-    input: Tensor[channels, image_height, image_width])
-        int8 image tensor of `c` channels, where `c` must be 1 or 3.
-    quality: int
-        Quality of the resulting JPEG file, it must be a number between
+    Args:
+        input (Tensor[channels, image_height, image_width])): int8 image tensor of
+            `c` channels, where `c` must be 1 or 3.
+        quality (int): Quality of the resulting JPEG file, it must be a number between
         1 and 100. Default: 75
-    Returns
-    -------
-    output: Tensor[1]
-        A one dimensional int8 tensor that contains the raw bytes of the
+    Returns:
+        output (Tensor[1]): A one dimensional int8 tensor that contains the raw bytes of the
         JPEG file.
     """
     if quality < 1 or quality > 100:
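The hunk above ends at encode_jpeg's bounds check, `if quality < 1 or quality > 100:`. A standalone sketch of that validation (the helper name and error message here are illustrative, not torchvision's exact code):

```python
def check_jpeg_quality(quality):
    # mirrors the documented range: quality must be between 1 and 100
    if quality < 1 or quality > 100:
        raise ValueError(
            "Image quality should be a positive number between 1 and 100"
        )
    return quality
```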
@@ -208,14 +197,11 @@ def write_jpeg(input: torch.Tensor, filename: str, quality: int = 75):
     """
     Takes an input tensor in CHW layout and saves it in a JPEG file.
-    Parameters
-    ----------
-    input: Tensor[channels, image_height, image_width]
-        int8 image tensor of `c` channels, where `c` must be 1 or 3.
-    filename: str
-        Path to save the image.
-    quality: int
-        Quality of the resulting JPEG file, it must be a number
+    Args:
+        input (Tensor[channels, image_height, image_width]): int8 image tensor of `c`
+            channels, where `c` must be 1 or 3.
+        filename (str): Path to save the image.
+        quality (int): Quality of the resulting JPEG file, it must be a number
         between 1 and 100. Default: 75
     """
     output = encode_jpeg(input, quality)
@@ -230,20 +216,16 @@ def decode_image(input: torch.Tensor, mode: ImageReadMode = ImageReadMode.UNCHAN
     Optionally converts the image to the desired format.
     The values of the output tensor are uint8 between 0 and 255.
-    Parameters
-    ----------
-    input: Tensor
-        a one dimensional uint8 tensor containing the raw bytes of the
+    Args:
+        input (Tensor): a one dimensional uint8 tensor containing the raw bytes of the
         PNG or JPEG image.
-    mode: ImageReadMode
-        the read mode used for optionally converting the image.
+        mode (ImageReadMode): the read mode used for optionally converting the image.
         Default: `ImageReadMode.UNCHANGED`.
         See `ImageReadMode` class for more information on various
         available modes.
-    Returns
-    -------
-    output: Tensor[image_channels, image_height, image_width]
+    Returns:
+        output (Tensor[image_channels, image_height, image_width])
     """
     output = torch.ops.image.decode_image(input, mode.value)
     return output
@@ -255,19 +237,15 @@ def read_image(path: str, mode: ImageReadMode = ImageReadMode.UNCHANGED) -> torc
     Optionally converts the image to the desired format.
     The values of the output tensor are uint8 between 0 and 255.
-    Parameters
-    ----------
-    path: str
-        path of the JPEG or PNG image.
-    mode: ImageReadMode
-        the read mode used for optionally converting the image.
+    Args:
+        path (str): path of the JPEG or PNG image.
+        mode (ImageReadMode): the read mode used for optionally converting the image.
         Default: `ImageReadMode.UNCHANGED`.
         See `ImageReadMode` class for more information on various
         available modes.
-    Returns
-    -------
-    output: Tensor[image_channels, image_height, image_width]
+    Returns:
+        output (Tensor[image_channels, image_height, image_width])
     """
     data = read_file(path)
     return decode_image(data, mode)
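The body of read_image shown above is just read_file followed by decode_image. A hedged stdlib sketch of that composition, with `open()` standing in for read_file and a caller-supplied decoder standing in for decode_image (names here are illustrative):

```python
def read_image_like(path, decode):
    # read the raw file bytes, then hand them to the decoder, mirroring:
    #     data = read_file(path)
    #     return decode_image(data, mode)
    with open(path, "rb") as f:
        data = f.read()
    return decode(data)
```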
torchvision/io/video.py

@@ -63,27 +63,18 @@ def write_video(
     """
     Writes a 4d tensor in [T, H, W, C] format in a video file
-    Parameters
-    ----------
-    filename : str
-        path where the video will be saved
-    video_array : Tensor[T, H, W, C]
-        tensor containing the individual frames, as a uint8 tensor in [T, H, W, C] format
-    fps : Number
-        video frames per second
-    video_codec : str
-        the name of the video codec, i.e. "libx264", "h264", etc.
-    options : Dict
-        dictionary containing options to be passed into the PyAV video stream
-    audio_array : Tensor[C, N]
-        tensor containing the audio, where C is the number of channels and N is the
-        number of samples
-    audio_fps : Number
-        audio sample rate, typically 44100 or 48000
-    audio_codec : str
-        the name of the audio codec, i.e. "mp3", "aac", etc.
-    audio_options : Dict
-        dictionary containing options to be passed into the PyAV audio stream
+    Args:
+        filename (str): path where the video will be saved
+        video_array (Tensor[T, H, W, C]): tensor containing the individual frames,
+            as a uint8 tensor in [T, H, W, C] format
+        fps (Number): video frames per second
+        video_codec (str): the name of the video codec, i.e. "libx264", "h264", etc.
+        options (Dict): dictionary containing options to be passed into the PyAV video stream
+        audio_array (Tensor[C, N]): tensor containing the audio, where C is the number of channels
+            and N is the number of samples
+        audio_fps (Number): audio sample rate, typically 44100 or 48000
+        audio_codec (str): the name of the audio codec, i.e. "mp3", "aac", etc.
+        audio_options (Dict): dictionary containing options to be passed into the PyAV audio stream
     """
     _check_av_available()
     video_array = torch.as_tensor(video_array, dtype=torch.uint8).numpy()
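write_video's docstring above fixes the audio layout as Tensor[C, N]: channels by samples, with audio_fps as the sample rate. A small sketch of the arithmetic that layout implies (`audio_duration_seconds` is a hypothetical helper, not part of torchvision):

```python
def audio_duration_seconds(shape, audio_fps):
    # shape is (C, N): C channels, N samples per channel;
    # duration is samples divided by the sample rate, independent of C
    channels, n_samples = shape
    return n_samples / audio_fps
```

For example, 2-channel audio with 96000 samples per channel at the typical 48000 Hz rate lasts 2 seconds.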
@@ -251,28 +242,20 @@ def read_video(
     Reads a video from a file, returning both the video frames as well as
     the audio frames
-    Parameters
-    ----------
-    filename : str
-        path to the video file
-    start_pts : int if pts_unit = 'pts', optional
-        float / Fraction if pts_unit = 'sec', optional
-        the start presentation time of the video
-    end_pts : int if pts_unit = 'pts', optional
-        float / Fraction if pts_unit = 'sec', optional
-        the end presentation time
-    pts_unit : str, optional
-        unit in which start_pts and end_pts values will be interpreted, either 'pts' or 'sec'. Defaults to 'pts'.
-    Returns
-    -------
-    vframes : Tensor[T, H, W, C]
-        the `T` video frames
-    aframes : Tensor[K, L]
-        the audio frames, where `K` is the number of channels and `L` is the
+    Args:
+        filename (str): path to the video file
+        start_pts (int if pts_unit = 'pts', float / Fraction if pts_unit = 'sec', optional):
+            The start presentation time of the video
+        end_pts (int if pts_unit = 'pts', float / Fraction if pts_unit = 'sec', optional):
+            The end presentation time
+        pts_unit (str, optional): unit in which start_pts and end_pts values will be interpreted,
+            either 'pts' or 'sec'. Defaults to 'pts'.
+    Returns:
+        vframes (Tensor[T, H, W, C]): the `T` video frames
+        aframes (Tensor[K, L]): the audio frames, where `K` is the number of channels and `L` is the
         number of points
-    info : Dict
-        metadata for the video and audio. Can contain the fields video_fps (float)
+        info (Dict): metadata for the video and audio. Can contain the fields video_fps (float)
         and audio_fps (int)
     """
@@ -368,20 +351,15 @@ def read_video_timestamps(filename: str, pts_unit: str = "pts") -> Tuple[List[in
     Note that the function decodes the whole video frame-by-frame.
-    Parameters
-    ----------
-    filename : str
-        path to the video file
-    pts_unit : str, optional
-        unit in which timestamp values will be returned either 'pts' or 'sec'. Defaults to 'pts'.
-    Returns
-    -------
-    pts : List[int] if pts_unit = 'pts'
-        List[Fraction] if pts_unit = 'sec'
+    Args:
+        filename (str): path to the video file
+        pts_unit (str, optional): unit in which timestamp values will be returned
+            either 'pts' or 'sec'. Defaults to 'pts'.
+    Returns:
+        pts (List[int] if pts_unit = 'pts', List[Fraction] if pts_unit = 'sec'):
         presentation timestamps for each one of the frames in the video.
-    video_fps : float, optional
-        the frame rate for the video
+        video_fps (float, optional): the frame rate for the video
     """
     from torchvision import get_video_backend
torchvision/models/mobilenetv2.py

@@ -18,10 +18,6 @@ def _make_divisible(v: float, divisor: int, min_value: Optional[int] = None) ->
     It ensures that all layers have a channel number that is divisible by 8
     It can be seen here:
     https://github.com/tensorflow/models/blob/master/research/slim/nets/mobilenet/mobilenet.py
-    :param v:
-    :param divisor:
-    :param min_value:
-    :return:
     """
     if min_value is None:
         min_value = divisor
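The _make_divisible hunk above only deletes the empty `:param` placeholders, but the surrounding context shows the start of the rounding routine. A hedged pure-Python reconstruction of the well-known channel-rounding logic from the TensorFlow reference linked in the docstring (treat it as a sketch, not torchvision's exact body):

```python
def make_divisible(v, divisor, min_value=None):
    # round v to the nearest multiple of divisor, but never below min_value
    if min_value is None:
        min_value = divisor
    new_v = max(min_value, int(v + divisor / 2) // divisor * divisor)
    # ensure that rounding down does not reduce v by more than 10%
    if new_v < 0.9 * v:
        new_v += divisor
    return new_v
```

This is what "all layers have a channel number that is divisible by 8" means in practice: requested channel counts are snapped to the nearest multiple of the divisor.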
torchvision/ops/boxes.py

@@ -20,21 +20,14 @@ def nms(boxes: Tensor, scores: Tensor, iou_threshold: float) -> Tensor:
     not guaranteed to be the same between CPU and GPU. This is similar
     to the behavior of argsort in PyTorch when repeated values are present.
-    Parameters
-    ----------
-    boxes : Tensor[N, 4])
-        boxes to perform NMS on. They
+    Args:
+        boxes (Tensor[N, 4])): boxes to perform NMS on. They
         are expected to be in (x1, y1, x2, y2) format
-    scores : Tensor[N]
-        scores for each one of the boxes
-    iou_threshold : float
-        discards all overlapping
-        boxes with IoU > iou_threshold
-    Returns
-    -------
-    keep : Tensor
-        int64 tensor with the indices
+        scores (Tensor[N]): scores for each one of the boxes
+        iou_threshold (float): discards all overlapping boxes with IoU > iou_threshold
+    Returns:
+        keep (Tensor): int64 tensor with the indices
         of the elements that have been kept
         by NMS, sorted in decreasing order of scores
     """
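The nms docstring above specifies greedy suppression by IoU, visiting boxes in decreasing score order. A hedged pure-Python sketch of that algorithm on plain lists, with boxes in the documented (x1, y1, x2, y2) format; torchvision's real kernel is compiled C++/CUDA:

```python
def iou(a, b):
    # intersection-over-union of two (x1, y1, x2, y2) boxes
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def nms_py(boxes, scores, iou_threshold):
    # visit boxes in decreasing score order; keep a box only if it does not
    # overlap an already-kept box by more than iou_threshold
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep
```

Because the kept indices are appended in score order, the result is sorted in decreasing order of scores, matching the docstring's description of `keep`.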
@@ -55,23 +48,15 @@ def batched_nms(
     Each index value correspond to a category, and NMS
     will not be applied between elements of different categories.
-    Parameters
-    ----------
-    boxes : Tensor[N, 4]
-        boxes where NMS will be performed. They
+    Args:
+        boxes (Tensor[N, 4]): boxes where NMS will be performed. They
         are expected to be in (x1, y1, x2, y2) format
-    scores : Tensor[N]
-        scores for each one of the boxes
-    idxs : Tensor[N]
-        indices of the categories for each one of the boxes.
-    iou_threshold : float
-        discards all overlapping boxes
-        with IoU > iou_threshold
-    Returns
-    -------
-    keep : Tensor
-        int64 tensor with the indices of
+        scores (Tensor[N]): scores for each one of the boxes
+        idxs (Tensor[N]): indices of the categories for each one of the boxes.
+        iou_threshold (float): discards all overlapping boxes with IoU > iou_threshold
+    Returns:
+        keep (Tensor): int64 tensor with the indices of
         the elements that have been kept by NMS, sorted
        in decreasing order of scores
     """
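batched_nms above promises that "NMS will not be applied between elements of different categories". A common way to get that behavior from a single class-agnostic NMS pass (and, to the best of my knowledge, the trick torchvision used at the time, though treat this sketch as an assumption) is to shift each category's boxes into a disjoint coordinate range:

```python
def offset_boxes(boxes, idxs):
    # add idx * (max_coordinate + 1) to every coordinate, so boxes from
    # different categories can never overlap; a plain NMS pass on the
    # offset boxes then equals per-category NMS
    max_coord = max(c for box in boxes for c in box)
    return [[c + idx * (max_coord + 1) for c in box]
            for box, idx in zip(boxes, idxs)]
```

After offsetting, two identical boxes with different category indices have IoU 0, so neither suppresses the other.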