OpenDAS / vision, commit 89bc3079 (unverified)
Authored Jan 22, 2021 by Nicolas Hug; committed by GitHub on Jan 22, 2021
Unify parameters formatting in docstrings (#3268)
parent e04de77c

Showing 6 changed files with 199 additions and 284 deletions
references/detection/coco_eval.py    +4   -2
torchvision/io/_video_opt.py         +82  -106
torchvision/io/image.py              +56  -78
torchvision/io/video.py              +36  -58
torchvision/models/mobilenetv2.py    +0   -4
torchvision/ops/boxes.py             +21  -36
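The commit replaces NumPy-style (`Parameters` headers with dashed underlines) and reST-style (`:param:` fields) docstrings with the Google style used elsewhere in torchvision: an `Args:` block with `name (type): description` entries and a `Returns:` block. A minimal sketch of the target convention, using a made-up helper (`scale_box` is illustrative only, not part of torchvision):

```python
def scale_box(box, factor):
    """Scales a bounding box by a constant factor.

    Args:
        box (tuple[int, int, int, int]): box in (x1, y1, x2, y2) format.
        factor (float): multiplicative scale applied to every coordinate.

    Returns:
        scaled (tuple[int, int, int, int]): the scaled box.
    """
    return tuple(int(c * factor) for c in box)
```

This layout is what Sphinx's napoleon extension parses when building the torchvision docs, which is one reason a single convention matters.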
references/detection/coco_eval.py

@@ -238,8 +238,10 @@ maskUtils = mask_util
     def loadRes(self, resFile):
         """
         Load result file and return a result api object.
-        :param resFile (str) : file name of result file
-        :return: res (obj) : result api object
+        Args:
+            resFile (str): file name of result file
+        Returns:
+            res (obj): result api object
         """
         res = COCO()
         res.dataset['images'] = [img for img in self.dataset['images']]
torchvision/io/_video_opt.py

@@ -181,17 +181,14 @@ def _read_video_from_file(
     Reads a video from a file, returning both the video frames as well as
     the audio frames
-    Args
-    ----------
-    filename : str
-        path to the video file
-    seek_frame_margin: double, optional
-        seeking frame in the stream is imprecise. Thus, when video_start_pts
-        is specified, we seek the pts earlier by seek_frame_margin seconds
-    read_video_stream: int, optional
-        whether read video stream. If yes, set to 1. Otherwise, 0
-    video_width/video_height/video_min_dimension/video_max_dimension: int
-        together decide the size of decoded frames
+    Args:
+        filename (str): path to the video file
+        seek_frame_margin (double, optional): seeking frame in the stream is imprecise. Thus,
+            when video_start_pts is specified, we seek the pts earlier by seek_frame_margin seconds
+        read_video_stream (int, optional): whether read video stream. If yes, set to 1. Otherwise, 0
+        video_width/video_height/video_min_dimension/video_max_dimension (int): together decide
+            the size of decoded frames:
         - When video_width = 0, video_height = 0, video_min_dimension = 0,
             and video_max_dimension = 0, keep the original frame resolution
         - When video_width = 0, video_height = 0, video_min_dimension != 0,

@@ -214,30 +211,19 @@ def _read_video_from_file(
         and video_max_dimension = 0, resize the frame so that frame
         video_width and video_height are set to $video_width and
         $video_height, respectively
-    video_pts_range : list(int), optional
-        the start and end presentation timestamp of video stream
-    video_timebase: Fraction, optional
-        a Fraction rational number which denotes timebase in video stream
-    read_audio_stream: int, optional
-        whether read audio stream. If yes, set to 1. Otherwise, 0
-    audio_samples: int, optional
-        audio sampling rate
-    audio_channels: int optional
-        audio channels
-    audio_pts_range : list(int), optional
-        the start and end presentation timestamp of audio stream
-    audio_timebase: Fraction, optional
-        a Fraction rational number which denotes time base in audio stream
-    Returns
-    -------
-    vframes : Tensor[T, H, W, C]
-        the `T` video frames
-    aframes : Tensor[L, K]
-        the audio frames, where `L` is the number of points and
+        video_pts_range (list(int), optional): the start and end presentation timestamp of video stream
+        video_timebase (Fraction, optional): a Fraction rational number which denotes timebase in video stream
+        read_audio_stream (int, optional): whether read audio stream. If yes, set to 1. Otherwise, 0
+        audio_samples (int, optional): audio sampling rate
+        audio_channels (int optional): audio channels
+        audio_pts_range (list(int), optional): the start and end presentation timestamp of audio stream
+        audio_timebase (Fraction, optional): a Fraction rational number which denotes time base in audio stream
+    Returns:
+        vframes (Tensor[T, H, W, C]): the `T` video frames
+        aframes (Tensor[L, K]): the audio frames, where `L` is the number of points and
         `K` is the number of audio_channels
-    info : Dict
-        metadata for the video and audio. Can contain the fields video_fps (float)
+        info (Dict): metadata for the video and audio. Can contain the fields video_fps (float)
         and audio_fps (int)
     """
     _validate_pts(video_pts_range)

@@ -345,17 +331,15 @@ def _read_video_from_memory(
     the audio frames
     This function is torchscriptable.
-    Args
-    ----------
-    video_data : data type could be 1) torch.Tensor, dtype=torch.int8 or 2) python bytes
+    Args:
+        video_data (data type could be 1) torch.Tensor, dtype=torch.int8 or 2) python bytes):
         compressed video content stored in either 1) torch.Tensor 2) python bytes
-    seek_frame_margin: double, optional
-        seeking frame in the stream is imprecise. Thus, when video_start_pts is specified,
-        we seek the pts earlier by seek_frame_margin seconds
-    read_video_stream: int, optional
-        whether read video stream. If yes, set to 1. Otherwise, 0
-    video_width/video_height/video_min_dimension/video_max_dimension: int
-        together decide the size of decoded frames
+        seek_frame_margin (double, optional): seeking frame in the stream is imprecise.
+            Thus, when video_start_pts is specified, we seek the pts earlier by seek_frame_margin seconds
+        read_video_stream (int, optional): whether read video stream. If yes, set to 1. Otherwise, 0
+        video_width/video_height/video_min_dimension/video_max_dimension (int): together decide
+            the size of decoded frames:
         - When video_width = 0, video_height = 0, video_min_dimension = 0,
             and video_max_dimension = 0, keep the original frame resolution
         - When video_width = 0, video_height = 0, video_min_dimension != 0,

@@ -378,27 +362,19 @@ def _read_video_from_memory(
         and video_max_dimension = 0, resize the frame so that frame
         video_width and video_height are set to $video_width and
         $video_height, respectively
-    video_pts_range : list(int), optional
-        the start and end presentation timestamp of video stream
-    video_timebase_numerator / video_timebase_denominator: optional
-        a rational number which denotes timebase in video stream
-    read_audio_stream: int, optional
-        whether read audio stream. If yes, set to 1. Otherwise, 0
-    audio_samples: int, optional
-        audio sampling rate
-    audio_channels: int optional
-        audio audio_channels
-    audio_pts_range : list(int), optional
-        the start and end presentation timestamp of audio stream
-    audio_timebase_numerator / audio_timebase_denominator: optional
+        video_pts_range (list(int), optional): the start and end presentation timestamp of video stream
+        video_timebase_numerator / video_timebase_denominator (float, optional): a rational
+            number which denotes timebase in video stream
+        read_audio_stream (int, optional): whether read audio stream. If yes, set to 1. Otherwise, 0
+        audio_samples (int, optional): audio sampling rate
+        audio_channels (int optional): audio audio_channels
+        audio_pts_range (list(int), optional): the start and end presentation timestamp of audio stream
+        audio_timebase_numerator / audio_timebase_denominator (float, optional):
         a rational number which denotes time base in audio stream
-    Returns
-    -------
-    vframes : Tensor[T, H, W, C]
-        the `T` video frames
-    aframes : Tensor[L, K]
-        the audio frames, where `L` is the number of points and
+    Returns:
+        vframes (Tensor[T, H, W, C]): the `T` video frames
+        aframes (Tensor[L, K]): the audio frames, where `L` is the number of points and
         `K` is the number of channels
     """
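The `video_timebase` / `audio_timebase` parameters in the docstrings above are rational timebases: a presentation timestamp (pts) is an integer count of timebase ticks, so multiplying by the timebase yields seconds. A small stdlib sketch of that relationship (the 1/90000 timebase is a common MPEG transport-stream value, used here only for illustration):

```python
from fractions import Fraction

def pts_to_seconds(pts, timebase):
    """Convert a presentation timestamp, counted in timebase ticks, to seconds."""
    return pts * timebase

def seconds_to_pts(seconds, timebase):
    """Convert a time in seconds to an integer pts in the given timebase."""
    return int(seconds / timebase)

tb = Fraction(1, 90000)  # one tick lasts 1/90000 of a second
```

This is why the `_read_video_from_memory` variant takes the timebase as separate numerator/denominator integers: torchscript-compatible code cannot pass a `Fraction` directly.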
torchvision/io/image.py

@@ -119,18 +119,14 @@ def encode_png(input: torch.Tensor, compression_level: int = 6) -> torch.Tensor:
     Takes an input tensor in CHW layout and returns a buffer with the contents
     of its corresponding PNG file.
-    Parameters
-    ----------
-    input: Tensor[channels, image_height, image_width]
-        int8 image tensor of `c` channels, where `c` must 3 or 1.
-    compression_level: int
-        Compression factor for the resulting file, it must be a number
+    Args:
+        input (Tensor[channels, image_height, image_width]): int8 image tensor of
+            `c` channels, where `c` must 3 or 1.
+        compression_level (int): Compression factor for the resulting file, it must be a number
         between 0 and 9. Default: 6
-    Returns
-    -------
-    output: Tensor[1]
-        A one dimensional int8 tensor that contains the raw bytes of the
+    Returns:
+        output (Tensor[1]): A one dimensional int8 tensor that contains the raw bytes of the
         PNG file.
     """
     output = torch.ops.image.encode_png(input, compression_level)

@@ -142,14 +138,11 @@ def write_png(input: torch.Tensor, filename: str, compression_level: int = 6):
     Takes an input tensor in CHW layout (or HW in the case of grayscale images)
     and saves it in a PNG file.
-    Parameters
-    ----------
-    input: Tensor[channels, image_height, image_width]
-        int8 image tensor of `c` channels, where `c` must be 1 or 3.
-    filename: str
-        Path to save the image.
-    compression_level: int
-        Compression factor for the resulting file, it must be a number
+    Args:
+        input (Tensor[channels, image_height, image_width]): int8 image tensor of
+            `c` channels, where `c` must be 1 or 3.
+        filename (str): Path to save the image.
+        compression_level (int): Compression factor for the resulting file, it must be a number
         between 0 and 9. Default: 6
     """
     output = encode_png(input, compression_level)

@@ -182,18 +175,14 @@ def encode_jpeg(input: torch.Tensor, quality: int = 75) -> torch.Tensor:
     Takes an input tensor in CHW layout and returns a buffer with the contents
     of its corresponding JPEG file.
-    Parameters
-    ----------
-    input: Tensor[channels, image_height, image_width])
-        int8 image tensor of `c` channels, where `c` must be 1 or 3.
-    quality: int
-        Quality of the resulting JPEG file, it must be a number between
+    Args:
+        input (Tensor[channels, image_height, image_width])): int8 image tensor of
+            `c` channels, where `c` must be 1 or 3.
+        quality (int): Quality of the resulting JPEG file, it must be a number between
         1 and 100. Default: 75
-    Returns
-    -------
-    output: Tensor[1]
-        A one dimensional int8 tensor that contains the raw bytes of the
+    Returns:
+        output (Tensor[1]): A one dimensional int8 tensor that contains the raw bytes of the
         JPEG file.
     """
     if quality < 1 or quality > 100:

@@ -208,14 +197,11 @@ def write_jpeg(input: torch.Tensor, filename: str, quality: int = 75):
     """
     Takes an input tensor in CHW layout and saves it in a JPEG file.
-    Parameters
-    ----------
-    input: Tensor[channels, image_height, image_width]
-        int8 image tensor of `c` channels, where `c` must be 1 or 3.
-    filename: str
-        Path to save the image.
-    quality: int
-        Quality of the resulting JPEG file, it must be a number
+    Args:
+        input (Tensor[channels, image_height, image_width]): int8 image tensor of `c`
+            channels, where `c` must be 1 or 3.
+        filename (str): Path to save the image.
+        quality (int): Quality of the resulting JPEG file, it must be a number
         between 1 and 100. Default: 75
     """
     output = encode_jpeg(input, quality)

@@ -230,20 +216,16 @@ def decode_image(input: torch.Tensor, mode: ImageReadMode = ImageReadMode.UNCHAN
     Optionally converts the image to the desired format.
     The values of the output tensor are uint8 between 0 and 255.
-    Parameters
-    ----------
-    input: Tensor
-        a one dimensional uint8 tensor containing the raw bytes of the
+    Args:
+        input (Tensor): a one dimensional uint8 tensor containing the raw bytes of the
         PNG or JPEG image.
-    mode: ImageReadMode
-        the read mode used for optionally converting the image.
+        mode (ImageReadMode): the read mode used for optionally converting the image.
         Default: `ImageReadMode.UNCHANGED`.
         See `ImageReadMode` class for more information on various
         available modes.
-    Returns
-    -------
-    output: Tensor[image_channels, image_height, image_width]
+    Returns:
+        output (Tensor[image_channels, image_height, image_width])
     """
     output = torch.ops.image.decode_image(input, mode.value)
     return output

@@ -255,19 +237,15 @@ def read_image(path: str, mode: ImageReadMode = ImageReadMode.UNCHANGED) -> torc
     Optionally converts the image to the desired format.
     The values of the output tensor are uint8 between 0 and 255.
-    Parameters
-    ----------
-    path: str
-        path of the JPEG or PNG image.
-    mode: ImageReadMode
-        the read mode used for optionally converting the image.
+    Args:
+        path (str): path of the JPEG or PNG image.
+        mode (ImageReadMode): the read mode used for optionally converting the image.
         Default: `ImageReadMode.UNCHANGED`.
         See `ImageReadMode` class for more information on various
         available modes.
-    Returns
-    -------
-    output: Tensor[image_channels, image_height, image_width]
+    Returns:
+        output (Tensor[image_channels, image_height, image_width])
     """
     data = read_file(path)
     return decode_image(data, mode)
torchvision/io/video.py

@@ -63,27 +63,18 @@ def write_video(
     """
     Writes a 4d tensor in [T, H, W, C] format in a video file
-    Parameters
-    ----------
-    filename : str
-        path where the video will be saved
-    video_array : Tensor[T, H, W, C]
-        tensor containing the individual frames, as a uint8 tensor in [T, H, W, C] format
-    fps : Number
-        video frames per second
-    video_codec : str
-        the name of the video codec, i.e. "libx264", "h264", etc.
-    options : Dict
-        dictionary containing options to be passed into the PyAV video stream
-    audio_array : Tensor[C, N]
-        tensor containing the audio, where C is the number of channels and N is the
-        number of samples
-    audio_fps : Number
-        audio sample rate, typically 44100 or 48000
-    audio_codec : str
-        the name of the audio codec, i.e. "mp3", "aac", etc.
-    audio_options : Dict
-        dictionary containing options to be passed into the PyAV audio stream
+    Args:
+        filename (str): path where the video will be saved
+        video_array (Tensor[T, H, W, C]): tensor containing the individual frames,
+            as a uint8 tensor in [T, H, W, C] format
+        fps (Number): video frames per second
+        video_codec (str): the name of the video codec, i.e. "libx264", "h264", etc.
+        options (Dict): dictionary containing options to be passed into the PyAV video stream
+        audio_array (Tensor[C, N]): tensor containing the audio, where C is the number of channels
+            and N is the number of samples
+        audio_fps (Number): audio sample rate, typically 44100 or 48000
+        audio_codec (str): the name of the audio codec, i.e. "mp3", "aac", etc.
+        audio_options (Dict): dictionary containing options to be passed into the PyAV audio stream
     """
     _check_av_available()
     video_array = torch.as_tensor(video_array, dtype=torch.uint8).numpy()

@@ -251,28 +242,20 @@ def read_video(
     Reads a video from a file, returning both the video frames as well as
     the audio frames
-    Parameters
-    ----------
-    filename : str
-        path to the video file
-    start_pts : int if pts_unit = 'pts', optional
-        float / Fraction if pts_unit = 'sec', optional
-        the start presentation time of the video
-    end_pts : int if pts_unit = 'pts', optional
-        float / Fraction if pts_unit = 'sec', optional
-        the end presentation time
-    pts_unit : str, optional
-        unit in which start_pts and end_pts values will be interpreted, either 'pts' or 'sec'. Defaults to 'pts'.
-    Returns
-    -------
-    vframes : Tensor[T, H, W, C]
-        the `T` video frames
-    aframes : Tensor[K, L]
-        the audio frames, where `K` is the number of channels and `L` is the
+    Args:
+        filename (str): path to the video file
+        start_pts (int if pts_unit = 'pts', float / Fraction if pts_unit = 'sec', optional):
+            The start presentation time of the video
+        end_pts (int if pts_unit = 'pts', float / Fraction if pts_unit = 'sec', optional):
+            The end presentation time
+        pts_unit (str, optional): unit in which start_pts and end_pts values will be interpreted,
+            either 'pts' or 'sec'. Defaults to 'pts'.
+    Returns:
+        vframes (Tensor[T, H, W, C]): the `T` video frames
+        aframes (Tensor[K, L]): the audio frames, where `K` is the number of channels and `L` is the
         number of points
-    info : Dict
-        metadata for the video and audio. Can contain the fields video_fps (float)
+        info (Dict): metadata for the video and audio. Can contain the fields video_fps (float)
         and audio_fps (int)
     """

@@ -368,20 +351,15 @@ def read_video_timestamps(filename: str, pts_unit: str = "pts") -> Tuple[List[in
     Note that the function decodes the whole video frame-by-frame.
-    Parameters
-    ----------
-    filename : str
-        path to the video file
-    pts_unit : str, optional
-        unit in which timestamp values will be returned either 'pts' or 'sec'. Defaults to 'pts'.
-    Returns
-    -------
-    pts : List[int] if pts_unit = 'pts'
-        List[Fraction] if pts_unit = 'sec'
+    Args:
+        filename (str): path to the video file
+        pts_unit (str, optional): unit in which timestamp values will be returned
+            either 'pts' or 'sec'. Defaults to 'pts'.
+    Returns:
+        pts (List[int] if pts_unit = 'pts', List[Fraction] if pts_unit = 'sec'):
         presentation timestamps for each one of the frames in the video.
-    video_fps : float, optional
-        the frame rate for the video
+        video_fps (float, optional): the frame rate for the video
     """
     from torchvision import get_video_backend
torchvision/models/mobilenetv2.py

@@ -18,10 +18,6 @@ def _make_divisible(v: float, divisor: int, min_value: Optional[int] = None) ->
     It ensures that all layers have a channel number that is divisible by 8
     It can be seen here:
     https://github.com/tensorflow/models/blob/master/research/slim/nets/mobilenet/mobilenet.py
-    :param v:
-    :param divisor:
-    :param min_value:
-    :return:
     """
     if min_value is None:
         min_value = divisor
torchvision/ops/boxes.py

@@ -20,21 +20,14 @@ def nms(boxes: Tensor, scores: Tensor, iou_threshold: float) -> Tensor:
     not guaranteed to be the same between CPU and GPU. This is similar
     to the behavior of argsort in PyTorch when repeated values are present.
-    Parameters
-    ----------
-    boxes : Tensor[N, 4])
-        boxes to perform NMS on. They
+    Args:
+        boxes (Tensor[N, 4])): boxes to perform NMS on. They
         are expected to be in (x1, y1, x2, y2) format
-    scores : Tensor[N]
-        scores for each one of the boxes
-    iou_threshold : float
-        discards all overlapping
-        boxes with IoU > iou_threshold
-    Returns
-    -------
-    keep : Tensor
-        int64 tensor with the indices
+        scores (Tensor[N]): scores for each one of the boxes
+        iou_threshold (float): discards all overlapping boxes with IoU > iou_threshold
+    Returns:
+        keep (Tensor): int64 tensor with the indices
         of the elements that have been kept
         by NMS, sorted in decreasing order of scores
     """
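The greedy procedure these docstrings describe can be sketched in pure Python; this is an illustration of the algorithm's semantics, not the C++/CUDA kernel that `torchvision.ops.nms` actually dispatches to:

```python
def iou(a, b):
    """Intersection-over-union of two boxes in (x1, y1, x2, y2) format."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

def nms(boxes, scores, iou_threshold):
    """Greedy NMS: repeatedly keep the highest-scoring box and discard all
    remaining boxes whose IoU with it exceeds the threshold.

    Returns kept indices, sorted in decreasing order of score."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        i = order.pop(0)
        keep.append(i)
        order = [j for j in order if iou(boxes[i], boxes[j]) <= iou_threshold]
    return keep
```

For example, of two heavily overlapping boxes only the higher-scoring one survives, while a distant box is untouched.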
@@ -55,23 +48,15 @@ def batched_nms(
     Each index value correspond to a category, and NMS
     will not be applied between elements of different categories.
-    Parameters
-    ----------
-    boxes : Tensor[N, 4]
-        boxes where NMS will be performed. They
+    Args:
+        boxes (Tensor[N, 4]): boxes where NMS will be performed. They
         are expected to be in (x1, y1, x2, y2) format
-    scores : Tensor[N]
-        scores for each one of the boxes
-    idxs : Tensor[N]
-        indices of the categories for each one of the boxes.
-    iou_threshold : float
-        discards all overlapping boxes
-        with IoU > iou_threshold
-    Returns
-    -------
-    keep : Tensor
-        int64 tensor with the indices of
+        scores (Tensor[N]): scores for each one of the boxes
+        idxs (Tensor[N]): indices of the categories for each one of the boxes.
+        iou_threshold (float): discards all overlapping boxes with IoU > iou_threshold
+    Returns:
+        keep (Tensor): int64 tensor with the indices of
         the elements that have been kept by NMS, sorted
         in decreasing order of scores
     """