Unverified commit 89bc3079 authored by Nicolas Hug, committed by GitHub

Unify parameters formatting in docstrings (#3268)

parent e04de77c
@@ -238,8 +238,10 @@ maskUtils = mask_util
    def loadRes(self, resFile):
        """
        Load result file and return a result api object.
        Args:
            resFile (str): file name of result file
        Returns:
            res (obj): result api object
        """
        res = COCO()
        res.dataset['images'] = [img for img in self.dataset['images']]
...
@@ -181,64 +181,50 @@ def _read_video_from_file(
    Reads a video from a file, returning both the video frames as well as
    the audio frames
    Args:
        filename (str): path to the video file
        seek_frame_margin (double, optional): seeking frame in the stream is imprecise. Thus,
            when video_start_pts is specified, we seek the pts earlier by seek_frame_margin seconds
        read_video_stream (int, optional): whether to read the video stream. If yes, set to 1. Otherwise, 0
        video_width/video_height/video_min_dimension/video_max_dimension (int): together decide
            the size of decoded frames:

            - When video_width = 0, video_height = 0, video_min_dimension = 0,
              and video_max_dimension = 0, keep the original frame resolution
            - When video_width = 0, video_height = 0, video_min_dimension != 0,
              and video_max_dimension = 0, keep the aspect ratio and resize the
              frame so that shorter edge size is video_min_dimension
            - When video_width = 0, video_height = 0, video_min_dimension = 0,
              and video_max_dimension != 0, keep the aspect ratio and resize
              the frame so that longer edge size is video_max_dimension
            - When video_width = 0, video_height = 0, video_min_dimension != 0,
              and video_max_dimension != 0, resize the frame so that shorter
              edge size is video_min_dimension, and longer edge size is
              video_max_dimension. The aspect ratio may not be preserved
            - When video_width = 0, video_height != 0, video_min_dimension = 0,
              and video_max_dimension = 0, keep the aspect ratio and resize
              the frame so that frame video_height is $video_height
            - When video_width != 0, video_height == 0, video_min_dimension = 0,
              and video_max_dimension = 0, keep the aspect ratio and resize
              the frame so that frame video_width is $video_width
            - When video_width != 0, video_height != 0, video_min_dimension = 0,
              and video_max_dimension = 0, resize the frame so that frame
              video_width and video_height are set to $video_width and
              $video_height, respectively
        video_pts_range (list(int), optional): the start and end presentation timestamp of the video stream
        video_timebase (Fraction, optional): a Fraction rational number which denotes the timebase of the video stream
        read_audio_stream (int, optional): whether to read the audio stream. If yes, set to 1. Otherwise, 0
        audio_samples (int, optional): audio sampling rate
        audio_channels (int, optional): audio channels
        audio_pts_range (list(int), optional): the start and end presentation timestamp of the audio stream
        audio_timebase (Fraction, optional): a Fraction rational number which denotes the time base of the audio stream
    Returns:
        vframes (Tensor[T, H, W, C]): the `T` video frames
        aframes (Tensor[L, K]): the audio frames, where `L` is the number of points and
            `K` is the number of audio_channels
        info (Dict): metadata for the video and audio. Can contain the fields video_fps (float)
            and audio_fps (int)
    """
    _validate_pts(video_pts_range)
    _validate_pts(audio_pts_range)
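As a reading aid, the eight resize cases listed in the docstring above can be condensed into a small pure-Python sketch. The helper name `decoded_frame_size` and its treatment of rounding are hypothetical, not part of the decoder's API:

```python
def decoded_frame_size(w, h, video_width=0, video_height=0,
                       video_min_dimension=0, video_max_dimension=0):
    """Return the (width, height) a decoded w-by-h frame would be resized to."""
    if video_width == 0 and video_height == 0:
        if video_min_dimension == 0 and video_max_dimension == 0:
            return w, h  # keep the original resolution
        if video_min_dimension != 0 and video_max_dimension == 0:
            # keep aspect ratio, shorter edge becomes video_min_dimension
            scale = video_min_dimension / min(w, h)
            return round(w * scale), round(h * scale)
        if video_min_dimension == 0 and video_max_dimension != 0:
            # keep aspect ratio, longer edge becomes video_max_dimension
            scale = video_max_dimension / max(w, h)
            return round(w * scale), round(h * scale)
        # both set: shorter edge -> min_dimension, longer edge -> max_dimension;
        # the aspect ratio may not be preserved
        return ((video_min_dimension, video_max_dimension) if w <= h
                else (video_max_dimension, video_min_dimension))
    if video_width == 0:
        # only the height is given: keep aspect ratio
        scale = video_height / h
        return round(w * scale), video_height
    if video_height == 0:
        # only the width is given: keep aspect ratio
        scale = video_width / w
        return video_width, round(h * scale)
    return video_width, video_height  # both given: exact size
```

For example, `decoded_frame_size(1920, 1080, video_min_dimension=540)` scales the shorter edge to 540 and yields `(960, 540)`.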
@@ -345,60 +331,50 @@ def _read_video_from_memory(
    the audio frames
    This function is torchscriptable.
    Args:
        video_data (data type could be 1) torch.Tensor, dtype=torch.int8 or 2) python bytes):
            compressed video content stored in either 1) torch.Tensor 2) python bytes
        seek_frame_margin (double, optional): seeking frame in the stream is imprecise.
            Thus, when video_start_pts is specified, we seek the pts earlier by seek_frame_margin seconds
        read_video_stream (int, optional): whether to read the video stream. If yes, set to 1. Otherwise, 0
        video_width/video_height/video_min_dimension/video_max_dimension (int): together decide
            the size of decoded frames:

            - When video_width = 0, video_height = 0, video_min_dimension = 0,
              and video_max_dimension = 0, keep the original frame resolution
            - When video_width = 0, video_height = 0, video_min_dimension != 0,
              and video_max_dimension = 0, keep the aspect ratio and resize the
              frame so that shorter edge size is video_min_dimension
            - When video_width = 0, video_height = 0, video_min_dimension = 0,
              and video_max_dimension != 0, keep the aspect ratio and resize
              the frame so that longer edge size is video_max_dimension
            - When video_width = 0, video_height = 0, video_min_dimension != 0,
              and video_max_dimension != 0, resize the frame so that shorter
              edge size is video_min_dimension, and longer edge size is
              video_max_dimension. The aspect ratio may not be preserved
            - When video_width = 0, video_height != 0, video_min_dimension = 0,
              and video_max_dimension = 0, keep the aspect ratio and resize
              the frame so that frame video_height is $video_height
            - When video_width != 0, video_height == 0, video_min_dimension = 0,
              and video_max_dimension = 0, keep the aspect ratio and resize
              the frame so that frame video_width is $video_width
            - When video_width != 0, video_height != 0, video_min_dimension = 0,
              and video_max_dimension = 0, resize the frame so that frame
              video_width and video_height are set to $video_width and
              $video_height, respectively
        video_pts_range (list(int), optional): the start and end presentation timestamp of the video stream
        video_timebase_numerator / video_timebase_denominator (float, optional): a rational
            number which denotes the timebase of the video stream
        read_audio_stream (int, optional): whether to read the audio stream. If yes, set to 1. Otherwise, 0
        audio_samples (int, optional): audio sampling rate
        audio_channels (int, optional): audio channels
        audio_pts_range (list(int), optional): the start and end presentation timestamp of the audio stream
        audio_timebase_numerator / audio_timebase_denominator (float, optional):
            a rational number which denotes the time base of the audio stream
    Returns:
        vframes (Tensor[T, H, W, C]): the `T` video frames
        aframes (Tensor[L, K]): the audio frames, where `L` is the number of points and
            `K` is the number of channels
    """
...
@@ -101,11 +101,11 @@ def decode_png(input: torch.Tensor, mode: ImageReadMode = ImageReadMode.UNCHANGE
    Args:
        input (Tensor[1]): a one dimensional uint8 tensor containing
            the raw bytes of the PNG image.
        mode (ImageReadMode): the read mode used for optionally
            converting the image. Default: `ImageReadMode.UNCHANGED`.
            See `ImageReadMode` class for more information on various
            available modes.
    Returns:
        output (Tensor[image_channels, image_height, image_width])
@@ -119,18 +119,14 @@ def encode_png(input: torch.Tensor, compression_level: int = 6) -> torch.Tensor:
    Takes an input tensor in CHW layout and returns a buffer with the contents
    of its corresponding PNG file.
    Args:
        input (Tensor[channels, image_height, image_width]): int8 image tensor of
            `c` channels, where `c` must be 3 or 1.
        compression_level (int): Compression factor for the resulting file, it must be a number
            between 0 and 9. Default: 6

    Returns:
        output (Tensor[1]): A one dimensional int8 tensor that contains the raw bytes of the
            PNG file.
    """
    output = torch.ops.image.encode_png(input, compression_level)
@@ -142,15 +138,12 @@ def write_png(input: torch.Tensor, filename: str, compression_level: int = 6):
    Takes an input tensor in CHW layout (or HW in the case of grayscale images)
    and saves it in a PNG file.
    Args:
        input (Tensor[channels, image_height, image_width]): int8 image tensor of
            `c` channels, where `c` must be 1 or 3.
        filename (str): Path to save the image.
        compression_level (int): Compression factor for the resulting file, it must be a number
            between 0 and 9. Default: 6
    """
    output = encode_png(input, compression_level)
    write_file(filename, output)
@@ -164,11 +157,11 @@ def decode_jpeg(input: torch.Tensor, mode: ImageReadMode = ImageReadMode.UNCHANG
    Args:
        input (Tensor[1]): a one dimensional uint8 tensor containing
            the raw bytes of the JPEG image.
        mode (ImageReadMode): the read mode used for optionally
            converting the image. Default: `ImageReadMode.UNCHANGED`.
            See `ImageReadMode` class for more information on various
            available modes.
    Returns:
        output (Tensor[image_channels, image_height, image_width])
@@ -182,19 +175,15 @@ def encode_jpeg(input: torch.Tensor, quality: int = 75) -> torch.Tensor:
    Takes an input tensor in CHW layout and returns a buffer with the contents
    of its corresponding JPEG file.
    Args:
        input (Tensor[channels, image_height, image_width]): int8 image tensor of
            `c` channels, where `c` must be 1 or 3.
        quality (int): Quality of the resulting JPEG file, it must be a number between
            1 and 100. Default: 75

    Returns:
        output (Tensor[1]): A one dimensional int8 tensor that contains the raw bytes of the
            JPEG file.
    """
    if quality < 1 or quality > 100:
        raise ValueError('Image quality should be a positive number '
@@ -208,15 +197,12 @@ def write_jpeg(input: torch.Tensor, filename: str, quality: int = 75):
    """
    Takes an input tensor in CHW layout and saves it in a JPEG file.
    Args:
        input (Tensor[channels, image_height, image_width]): int8 image tensor of `c`
            channels, where `c` must be 1 or 3.
        filename (str): Path to save the image.
        quality (int): Quality of the resulting JPEG file, it must be a number
            between 1 and 100. Default: 75
    """
    output = encode_jpeg(input, quality)
    write_file(filename, output)
@@ -230,20 +216,16 @@ def decode_image(input: torch.Tensor, mode: ImageReadMode = ImageReadMode.UNCHAN
    Optionally converts the image to the desired format.
    The values of the output tensor are uint8 between 0 and 255.
    Args:
        input (Tensor): a one dimensional uint8 tensor containing the raw bytes of the
            PNG or JPEG image.
        mode (ImageReadMode): the read mode used for optionally converting the image.
            Default: `ImageReadMode.UNCHANGED`.
            See `ImageReadMode` class for more information on various
            available modes.

    Returns:
        output (Tensor[image_channels, image_height, image_width])
    """
    output = torch.ops.image.decode_image(input, mode.value)
    return output
@@ -255,19 +237,15 @@ def read_image(path: str, mode: ImageReadMode = ImageReadMode.UNCHANGED) -> torc
    Optionally converts the image to the desired format.
    The values of the output tensor are uint8 between 0 and 255.
    Args:
        path (str): path of the JPEG or PNG image.
        mode (ImageReadMode): the read mode used for optionally converting the image.
            Default: `ImageReadMode.UNCHANGED`.
            See `ImageReadMode` class for more information on various
            available modes.

    Returns:
        output (Tensor[image_channels, image_height, image_width])
    """
    data = read_file(path)
    return decode_image(data, mode)
@@ -63,27 +63,18 @@ def write_video(
    """
    Writes a 4d tensor in [T, H, W, C] format in a video file
    Args:
        filename (str): path where the video will be saved
        video_array (Tensor[T, H, W, C]): tensor containing the individual frames,
            as a uint8 tensor in [T, H, W, C] format
        fps (Number): video frames per second
        video_codec (str): the name of the video codec, i.e. "libx264", "h264", etc.
        options (Dict): dictionary containing options to be passed into the PyAV video stream
        audio_array (Tensor[C, N]): tensor containing the audio, where C is the number of channels
            and N is the number of samples
        audio_fps (Number): audio sample rate, typically 44100 or 48000
        audio_codec (str): the name of the audio codec, i.e. "mp3", "aac", etc.
        audio_options (Dict): dictionary containing options to be passed into the PyAV audio stream
    """
    _check_av_available()
    video_array = torch.as_tensor(video_array, dtype=torch.uint8).numpy()
@@ -251,29 +242,21 @@ def read_video(
    Reads a video from a file, returning both the video frames as well as
    the audio frames
    Args:
        filename (str): path to the video file
        start_pts (int if pts_unit = 'pts', float / Fraction if pts_unit = 'sec', optional):
            The start presentation time of the video
        end_pts (int if pts_unit = 'pts', float / Fraction if pts_unit = 'sec', optional):
            The end presentation time
        pts_unit (str, optional): unit in which start_pts and end_pts values will be interpreted,
            either 'pts' or 'sec'. Defaults to 'pts'.

    Returns:
        vframes (Tensor[T, H, W, C]): the `T` video frames
        aframes (Tensor[K, L]): the audio frames, where `K` is the number of channels and `L` is the
            number of points
        info (Dict): metadata for the video and audio. Can contain the fields video_fps (float)
            and audio_fps (int)
    """
    from torchvision import get_video_backend
@@ -368,20 +351,15 @@ def read_video_timestamps(filename: str, pts_unit: str = "pts") -> Tuple[List[in
    Note that the function decodes the whole video frame-by-frame.
    Args:
        filename (str): path to the video file
        pts_unit (str, optional): unit in which timestamp values will be returned,
            either 'pts' or 'sec'. Defaults to 'pts'.

    Returns:
        pts (List[int] if pts_unit = 'pts', List[Fraction] if pts_unit = 'sec'):
            presentation timestamps for each one of the frames in the video.
        video_fps (float, optional): the frame rate for the video
    """
    from torchvision import get_video_backend
...
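The 'pts' vs. 'sec' distinction above is a conversion through the stream's time base, a rational number such as 1/90000. A minimal sketch of that relationship, using hypothetical helper names rather than any torchvision API:

```python
from fractions import Fraction

def pts_to_sec(pts, time_base):
    """Convert an integer presentation timestamp to seconds (a Fraction)."""
    return pts * time_base

def sec_to_pts(sec, time_base):
    """Convert seconds back to an integer pts, truncating toward zero."""
    return int(Fraction(sec) / time_base)

# a common video time base, e.g. for MPEG-TS streams
tb = Fraction(1, 90000)
```

For instance, with `tb = Fraction(1, 90000)`, a pts of 90000 corresponds to exactly one second.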
@@ -18,10 +18,6 @@ def _make_divisible(v: float, divisor: int, min_value: Optional[int] = None) ->
    It ensures that all layers have a channel number that is divisible by 8
    It can be seen here:
    https://github.com/tensorflow/models/blob/master/research/slim/nets/mobilenet/mobilenet.py
    """
    if min_value is None:
        min_value = divisor
...
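The hunk truncates before the rounding itself; a plausible pure-Python sketch of the full rule, based on the TensorFlow implementation the docstring links to (a sketch, not necessarily the exact torchvision code):

```python
def make_divisible(v, divisor, min_value=None):
    """Round v to the nearest multiple of divisor, never going below
    min_value and never dropping more than 10% below the original v."""
    if min_value is None:
        min_value = divisor
    # round-half-up to the nearest multiple of divisor, floored at min_value
    new_v = max(min_value, int(v + divisor / 2) // divisor * divisor)
    # if rounding shaved off more than 10%, bump up one multiple
    if new_v < 0.9 * v:
        new_v += divisor
    return new_v
```

E.g. `make_divisible(37.5, 8)` rounds up to 40, while `make_divisible(33, 8)` rounds down to 32 because 32 is still within 10% of 33.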
@@ -20,23 +20,16 @@ def nms(boxes: Tensor, scores: Tensor, iou_threshold: float) -> Tensor:
    not guaranteed to be the same between CPU and GPU. This is similar
    to the behavior of argsort in PyTorch when repeated values are present.
    Args:
        boxes (Tensor[N, 4]): boxes to perform NMS on. They
            are expected to be in (x1, y1, x2, y2) format
        scores (Tensor[N]): scores for each one of the boxes
        iou_threshold (float): discards all overlapping boxes with IoU > iou_threshold

    Returns:
        keep (Tensor): int64 tensor with the indices
            of the elements that have been kept
            by NMS, sorted in decreasing order of scores
    """
    _assert_has_ops()
    return torch.ops.torchvision.nms(boxes, scores, iou_threshold)
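The C++ op the function dispatches to is a greedy algorithm; its behavior can be illustrated with a pure-Python sketch over plain tuples (an illustration of the technique, not the torchvision kernel):

```python
def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def nms(boxes, scores, iou_threshold):
    """Greedy NMS: visit boxes in decreasing score order and keep a box
    only if its IoU with every already-kept box is <= iou_threshold."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep
```

With two heavily overlapping boxes and one distant box, only the higher-scoring of the overlapping pair survives: `nms([(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)], [0.9, 0.8, 0.7], 0.5)` returns `[0, 2]`.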
@@ -55,25 +48,17 @@ def batched_nms(
    Each index value corresponds to a category, and NMS
    will not be applied between elements of different categories.
    Args:
        boxes (Tensor[N, 4]): boxes where NMS will be performed. They
            are expected to be in (x1, y1, x2, y2) format
        scores (Tensor[N]): scores for each one of the boxes
        idxs (Tensor[N]): indices of the categories for each one of the boxes.
        iou_threshold (float): discards all overlapping boxes with IoU > iou_threshold

    Returns:
        keep (Tensor): int64 tensor with the indices of
            the elements that have been kept by NMS, sorted
            in decreasing order of scores
    """
    if boxes.numel() == 0:
        return torch.empty((0,), dtype=torch.int64, device=boxes.device)
...
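One standard way to get per-category NMS out of a single plain NMS call (to my understanding this is the trick torchvision's own implementation uses, but treat that attribution as an assumption) is to translate each box by an offset proportional to its category index, so boxes from different categories can never overlap. A minimal sketch with a hypothetical helper name:

```python
def offset_boxes_by_category(boxes, idxs):
    """Shift each (x1, y1, x2, y2) box by idx * (max_coordinate + 1) so
    boxes of different categories are guaranteed to be disjoint; running
    plain NMS on the result is then equivalent to per-category NMS."""
    max_coord = max(c for b in boxes for c in b)
    return [tuple(c + i * (max_coord + 1) for c in b)
            for b, i in zip(boxes, idxs)]
```

Two identical boxes in different categories stop overlapping after the shift, so neither suppresses the other.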