Unverified commit 66cefaff, authored by Zaida Zhou, committed by GitHub

[Docs] Refactor docs (#1102)

* [Docs] Refactor documentation

* [Docs] Refactor documentation

* refactor docs

* refactor docs

* set sphinx==3.1.2

* fix typo

* modify according to comment

* modify according to comment

* modify according to comment

* [Docs] delete unnecessary file

* fix title

* rename

* rename
parent cdcbc03c
## Data Process
### Image
This module provides some image processing methods, which require `opencv` to be installed.
#### Read/Write/Show
To read or write image files, use `imread` or `imwrite`.
@@ -34,7 +36,7 @@ for i in range(10):
    mmcv.imshow(img, win_name='test image', wait_time=200)
```
#### Color space conversion
Supported conversion methods:
@@ -52,7 +54,7 @@ img2 = mmcv.rgb2gray(img1)
img3 = mmcv.bgr2hsv(img)
```
#### Resize
There are three resize methods. All `imresize_*` methods have an argument `return_scale`.
If this argument is `False`, the return value is merely the resized image; otherwise
@@ -73,7 +75,7 @@ mmcv.imrescale(img, 0.5)
mmcv.imrescale(img, (1000, 800))
```
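The two `imrescale` calls above keep the aspect ratio. As a rough illustration of the size arithmetic only, here is a minimal sketch in plain Python. The helper name `rescale_size_sketch` is hypothetical, and the way a tuple scale is interpreted (the longer edge is bounded by the larger value, the shorter edge by the smaller one) is an assumption, not something this page states.

```python
def rescale_size_sketch(old_size, scale):
    """Return ((new_w, new_h), factor) for an aspect-preserving rescale.

    Hypothetical helper, not part of mmcv. old_size is (w, h).
    scale is either a positive number (scaling factor) or a tuple,
    assumed here to mean (max long edge, max short edge) in any order.
    """
    w, h = old_size
    if isinstance(scale, (int, float)):
        if scale <= 0:
            raise ValueError("scale must be positive")
        factor = float(scale)
    else:
        max_long, max_short = max(scale), min(scale)
        # Pick the largest factor that satisfies both edge limits.
        factor = min(max_long / max(w, h), max_short / min(w, h))
    new_size = (int(w * factor + 0.5), int(h * factor + 0.5))
    return new_size, factor


size, factor = rescale_size_sketch((1600, 1200), (1000, 800))
print(size, factor)
```

Returning the factor alongside the size mirrors the idea behind `return_scale`: callers that also need to rescale boxes or keypoints can reuse the same factor.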
#### Rotate
To rotate an image by some angle, use `imrotate`. The rotation center can be specified
and defaults to the center of the original image. There are two modes of rotating,
@@ -100,7 +102,7 @@ img_ = mmcv.imrotate(img, 30, center=(100, 100))
img_ = mmcv.imrotate(img, 30, auto_bound=True)
```
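To give a feel for what an automatically adjusted bound means geometrically, the sketch below computes the size of the smallest axis-aligned canvas that contains a rotated rectangle. It assumes that `auto_bound=True` enlarges the output so that nothing is cropped; the helper name `rotated_bound` is hypothetical and the rounding is illustrative.

```python
import math


def rotated_bound(width, height, angle_deg):
    """Size (w, h) of the axis-aligned box enclosing a rotated rectangle.

    Hypothetical helper, not part of mmcv.
    """
    rad = math.radians(angle_deg)
    cos, sin = abs(math.cos(rad)), abs(math.sin(rad))
    new_w = width * cos + height * sin
    new_h = width * sin + height * cos
    return int(round(new_w)), int(round(new_h))


print(rotated_bound(400, 300, 30))
```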
#### Flip
To flip an image, use `imflip`.
@@ -114,7 +116,7 @@ mmcv.imflip(img)
mmcv.imflip(img, direction='vertical')
```
#### Crop
`imcrop` can crop one or more regions from an image, each represented as (x1, y1, x2, y2).
@@ -136,7 +138,7 @@ patches = mmcv.imcrop(img, bboxes)
patches = mmcv.imcrop(img, bboxes, scale_ratio=1.2)
```
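The `scale_ratio` argument enlarges each region before cropping. A minimal sketch of that idea, assuming the box is scaled about its center and then clipped to the image, might look as follows. The helper names are hypothetical, and the exact pixel convention mmcv uses (for example, whether `x2` is inclusive) is not covered here.

```python
def scale_bbox(bbox, scale_ratio):
    """Scale an (x1, y1, x2, y2) box about its center (hypothetical helper)."""
    x1, y1, x2, y2 = bbox
    cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
    half_w = (x2 - x1) * scale_ratio / 2
    half_h = (y2 - y1) * scale_ratio / 2
    return (cx - half_w, cy - half_h, cx + half_w, cy + half_h)


def clip_bbox(bbox, img_w, img_h):
    """Clip a box so that it stays inside a img_w x img_h image."""
    x1, y1, x2, y2 = bbox
    return (max(x1, 0), max(y1, 0), min(x2, img_w), min(y2, img_h))


enlarged = scale_bbox((10, 10, 110, 60), 2.0)
print(clip_bbox(enlarged, img_w=150, img_h=80))
```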
#### Padding
There are two methods, `impad` and `impad_to_multiple`, to pad an image to a
specific size with given values.
@@ -160,3 +162,125 @@ img_ = mmcv.impad(img, padding=(10, 20, 30, 40), pad_val=[100, 50, 200])
# pad an image so that each edge is a multiple of some value.
img_ = mmcv.impad_to_multiple(img, 32)
```
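Padding each edge to a multiple of some value comes down to rounding the height and width up to the next multiple. The sketch below shows only that arithmetic; `padded_size` is a hypothetical helper, not an mmcv function.

```python
def padded_size(height, width, divisor):
    """Round height and width up to the nearest multiple of divisor."""
    if divisor <= 0:
        raise ValueError("divisor must be positive")
    # -(-n // d) is integer ceiling division.
    pad_h = -(-height // divisor) * divisor
    pad_w = -(-width // divisor) * divisor
    return pad_h, pad_w


print(padded_size(600, 800, 32))
```

For a 600x800 image and a divisor of 32, only the height changes, because 800 is already a multiple of 32.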
### Video
This module provides the following functionalities.
- A `VideoReader` class with friendly APIs to read and convert videos.
- Some methods for editing (cut, concat, resize) videos.
- Optical flow read/write/warp.
#### VideoReader
The `VideoReader` class provides sequence-like APIs to access video frames.
It internally caches the frames that have been visited.
```python
import mmcv
video = mmcv.VideoReader('test.mp4')
# obtain basic information
print(len(video))
print(video.width, video.height, video.resolution, video.fps)
# iterate over all frames
for frame in video:
    print(frame.shape)
# read the next frame
img = video.read()
# read a frame by index
img = video[100]
# read some frames
img = video[5:10]
```
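To illustrate what a sequence-like API with an internal cache can look like, here is a minimal sketch in plain Python. The class `CachedFrames`, its `load_frame` callback, and the least-recently-used eviction policy are all assumptions made for illustration. They do not describe how `VideoReader` is actually implemented.

```python
from collections import OrderedDict


class CachedFrames:
    """Sequence-like frame access with a small LRU cache (hypothetical)."""

    def __init__(self, load_frame, num_frames, capacity=10):
        self._load_frame = load_frame  # callable: index -> frame
        self._num_frames = num_frames
        self._capacity = capacity
        self._cache = OrderedDict()

    def __len__(self):
        return self._num_frames

    def __getitem__(self, index):
        if isinstance(index, slice):
            return [self[i] for i in range(*index.indices(len(self)))]
        if index < 0:
            index += len(self)
        if not 0 <= index < len(self):
            raise IndexError("frame index out of range")
        if index in self._cache:
            self._cache.move_to_end(index)  # mark as recently used
            return self._cache[index]
        frame = self._load_frame(index)
        self._cache[index] = frame
        if len(self._cache) > self._capacity:
            self._cache.popitem(last=False)  # evict the oldest entry
        return frame


frames = CachedFrames(lambda i: f"frame{i}", num_frames=20, capacity=3)
print(len(frames), frames[5], frames[5:8], frames[-1])
```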
You can also convert a video to images or generate a video from an image directory.
```python
# split a video into frames and save to a folder
video = mmcv.VideoReader('test.mp4')
video.cvt2frames('out_dir')
# generate video from frames
mmcv.frames2video('out_dir', 'test.avi')
```
#### Editing utils
There are also some methods for editing videos, which wrap ffmpeg commands.
```python
# cut a video clip
mmcv.cut_video('test.mp4', 'clip1.mp4', start=3, end=10, vcodec='h264')
# join a list of video clips
mmcv.concat_video(['clip1.mp4', 'clip2.mp4'], 'joined.mp4', log_level='quiet')
# resize a video with the specified size
mmcv.resize_video('test.mp4', 'resized1.mp4', (360, 240))
# resize a video with a scaling ratio of 2
mmcv.resize_video('test.mp4', 'resized2.mp4', ratio=2)
```
#### Optical flow
`mmcv` provides the following methods to operate on optical flows.
- IO
- Visualization
- Flow warping
We provide two options to dump optical flow files: uncompressed and compressed.
The uncompressed way just dumps the floating-point numbers to a binary file. It is
lossless, but the dumped file is larger.
The compressed way quantizes the optical flow to 0-255 and dumps it as a
jpeg image. The x and y components of the flow are concatenated into a single image.
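The trade-off in the compressed option comes from quantization. The sketch below shows one simple way to map a flow value to an integer in 0-255 and back, and why the round trip is lossy. The functions, the `max_val` clipping range, and the bin layout are assumptions for illustration; the scheme mmcv actually uses may differ.

```python
def quantize(value, max_val=1.0, levels=256):
    """Map a float in [-max_val, max_val] to an integer bin in [0, levels - 1]."""
    clipped = min(max(value, -max_val), max_val)
    step = 2 * max_val / levels
    return min(int((clipped + max_val) / step), levels - 1)


def dequantize(q, max_val=1.0, levels=256):
    """Map a bin index back to the center of its bin."""
    step = 2 * max_val / levels
    return (q + 0.5) * step - max_val


value = 0.3
restored = dequantize(quantize(value))
# The error is bounded by half a bin width for values inside the range.
print(quantize(value), restored, abs(restored - value))
```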
1. IO
```python
import mmcv
import numpy as np
flow = np.random.rand(800, 600, 2).astype(np.float32)
# dump the flow to a flo file (~3.7M)
mmcv.flowwrite(flow, 'uncompressed.flo')
# dump the flow to a jpeg file (~230K)
# the shape of the dumped image is (800, 1200)
mmcv.flowwrite(flow, 'compressed.jpg', quantize=True, concat_axis=1)
# read the flow file, the shape of loaded flow is (800, 600, 2) for both ways
flow = mmcv.flowread('uncompressed.flo')
flow = mmcv.flowread('compressed.jpg', quantize=True, concat_axis=1)
```
2. Visualization
It is possible to visualize optical flows with `mmcv.flowshow()`.
```python
mmcv.flowshow(flow)
```
![progress](../_static/flow_visualization.png)
3. Flow warping
```python
img1 = mmcv.imread('img1.jpg')
flow = mmcv.flowread('flow.flo')
warped_img2 = mmcv.flow_warp(img1, flow)
```
img1 (left) and img2 (right)
![raw images](../_static/flow_raw_images.png)
optical flow (img2 -> img1)
![optical flow](../_static/flow_img2toimg1.png)
warped image and difference with ground truth
![warpped image](../_static/flow_warp_diff.png)
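As a rough illustration of what warping with a flow field does, the sketch below applies a nearest-neighbor backward warp to a tiny image stored as nested lists. It assumes each flow vector `(dx, dy)` at an output pixel points to the source pixel in `img1`. The function `flow_warp_nearest` and the zero fill value are hypothetical, and a real implementation would typically interpolate.

```python
def flow_warp_nearest(img, flow, fill=0):
    """Backward-warp img with per-pixel (dx, dy) offsets (hypothetical helper).

    img is a list of rows; flow has the same height and width and stores
    a (dx, dy) pair per pixel. Out-of-range samples are set to fill.
    """
    h, w = len(img), len(img[0])
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dx, dy = flow[y][x]
            sx, sy = int(round(x + dx)), int(round(y + dy))
            if 0 <= sx < w and 0 <= sy < h:
                out[y][x] = img[sy][sx]
    return out


img = [[1, 2, 3],
       [4, 5, 6]]
# Every output pixel samples from one pixel to its right.
flow = [[(1, 0)] * 3 for _ in range(2)]
print(flow_warp_nearest(img, flow))
```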
@@ -62,7 +62,7 @@ converter_cfg = dict(type='Converter1', a=a_value, b=b_value)
converter = CONVERTERS.build(converter_cfg)
```
### Customize Build Function
Suppose we would like to customize how `converters` are built. We could implement a customized `build_func` and pass it into the registry.
@@ -89,7 +89,7 @@ Note: in this example, we demonstrate how to use the `build_func` argument to cu
The functionality is similar to the default `build_from_cfg`. In most cases, the default one would be sufficient.
`build_model_from_cfg` is also implemented to build PyTorch modules in `nn.Sequential`; you may use it directly instead of implementing it yourself.
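The mechanism behind a pluggable build function can be sketched without mmcv at all. The following self-contained example uses a hypothetical `MiniRegistry` class and hypothetical helper functions. It only mirrors the idea of passing `build_func` into a registry and is not mmcv's implementation.

```python
def default_build(cfg, registry):
    """Instantiate the class named by cfg['type'] with the remaining keys."""
    args = dict(cfg)  # copy so the caller's config is left untouched
    obj_type = args.pop('type')
    cls = registry.get(obj_type)
    if cls is None:
        raise KeyError(f"{obj_type} is not registered in {registry.name}")
    return cls(**args)


class MiniRegistry:
    """Hypothetical, simplified registry that accepts a custom build_func."""

    def __init__(self, name, build_func=None):
        self.name = name
        self._modules = {}
        self.build_func = build_func or default_build

    def register_module(self, cls):
        self._modules[cls.__name__] = cls
        return cls

    def get(self, key):
        return self._modules.get(key)

    def build(self, cfg):
        return self.build_func(cfg, registry=self)


def logging_build(cfg, registry):
    """Custom build function: log the request, then defer to the default."""
    print(f"building {cfg['type']} from {registry.name}")
    return default_build(cfg, registry)


CONVERTERS = MiniRegistry('converter', build_func=logging_build)


@CONVERTERS.register_module
class Converter1:
    def __init__(self, a, b):
        self.a = a
        self.b = b


converter = CONVERTERS.build(dict(type='Converter1', a=1, b=2))
print(converter.a, converter.b)
```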
### Hierarchy Registry
You could also build modules from more than one OpenMMLab framework, e.g. you could use all backbones in [MMClassification](https://github.com/open-mmlab/mmclassification) for object detectors in [MMDetection](https://github.com/open-mmlab/mmdetection). You may also combine an object detection model in [MMDetection](https://github.com/open-mmlab/mmdetection) and a semantic segmentation model in [MMSegmentation](https://github.com/open-mmlab/mmsegmentation).
......
## Utils
### ProgressBar
If you want to apply a method to a list of items and track the progress, `track_progress`
is a good choice. It will display a progress bar to tell the progress and ETA.
```python
import mmcv
def func(item):
    # do something
    pass
tasks = [item_1, item_2, ..., item_n]
mmcv.track_progress(func, tasks)
```
The output is like the following.
![progress](../_static/progress.gif)
There is another method `track_parallel_progress`, which wraps multiprocessing and
progress visualization.
```python
mmcv.track_parallel_progress(func, tasks, 8) # 8 workers
```
![progress](../_static/parallel_progress.gif)
If you want to iterate or enumerate a list of items and track the progress, `track_iter_progress`
is a good choice. It will display a progress bar to tell the progress and ETA.
```python
import mmcv
tasks = [item_1, item_2, ..., item_n]
for task in mmcv.track_iter_progress(tasks):
    # do something like print
    print(task)
for i, task in enumerate(mmcv.track_iter_progress(tasks)):
    # do something like print
    print(i)
    print(task)
```
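For readers curious how an iterator can report progress and an ETA, here is a minimal sketch built on a generator. The function `iter_with_progress` and its output format are hypothetical and much simpler than a real progress bar.

```python
import sys
import time


def iter_with_progress(items, bar_width=20, file=sys.stdout):
    """Yield items one by one and redraw a text progress bar after each."""
    total = len(items)
    start = time.perf_counter()
    for done, item in enumerate(items, start=1):
        yield item
        elapsed = time.perf_counter() - start
        # Estimate remaining time from the average time per finished item.
        eta = elapsed / done * (total - done)
        filled = int(bar_width * done / total)
        bar = '>' * filled + ' ' * (bar_width - filled)
        file.write(f'\r[{bar}] {done}/{total}, '
                   f'elapsed: {elapsed:.1f}s, ETA: {eta:.1f}s')
        file.flush()
    file.write('\n')


for task in iter_with_progress(['a', 'b', 'c']):
    time.sleep(0.01)  # simulate some work
```

The bar is updated after the loop body has finished with each item, so the count reflects completed work.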
### Timer
It is convenient to compute the runtime of a code block with `Timer`.
```python
import time
import mmcv
with mmcv.Timer():
    # simulate some code block
    time.sleep(1)
```
Alternatively, use `since_start()` and `since_last_check()`. The former
returns the runtime since the timer started, and the latter returns the time
elapsed since the last check.
```python
timer = mmcv.Timer()
# code block 1 here
print(timer.since_start())
# code block 2 here
print(timer.since_last_check())
print(timer.since_start())
```
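The difference between the two methods is easier to see in a stripped-down timer. The class below is a hypothetical sketch based on `time.perf_counter`, not mmcv's `Timer`. In particular, treating a call to `since_start()` as a check is an assumption made here.

```python
import time


class SimpleTimer:
    """Hypothetical minimal timer with start and last-check bookkeeping."""

    def __init__(self):
        self._start = time.perf_counter()
        self._last = self._start

    def since_start(self):
        """Seconds since creation; also counts as a check (assumption)."""
        self._last = time.perf_counter()
        return self._last - self._start

    def since_last_check(self):
        """Seconds since the previous call to either method."""
        now = time.perf_counter()
        duration = now - self._last
        self._last = now
        return duration

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        print(f'{self.since_start():.3f}s')


timer = SimpleTimer()
time.sleep(0.02)           # code block 1
first = timer.since_start()
time.sleep(0.01)           # code block 2
second = timer.since_last_check()
total = timer.since_start()
print(first, second, total)
```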
## Video
This module provides the following functionalities.
- A `VideoReader` class with friendly APIs to read and convert videos.
- Some methods for editing (cut, concat, resize) videos.
- Optical flow read/write/warp.
### VideoReader
The `VideoReader` class provides sequence-like APIs to access video frames.
It internally caches the frames that have been visited.
```python
video = mmcv.VideoReader('test.mp4')
# obtain basic information
print(len(video))
print(video.width, video.height, video.resolution, video.fps)
# iterate over all frames
for frame in video:
    print(frame.shape)
# read the next frame
img = video.read()
# read a frame by index
img = video[100]
# read some frames
img = video[5:10]
```
You can also convert a video to images or generate a video from an image directory.
```python
# split a video into frames and save to a folder
video = mmcv.VideoReader('test.mp4')
video.cvt2frames('out_dir')
# generate video from frames
mmcv.frames2video('out_dir', 'test.avi')
```
### Editing utils
There are also some methods for editing videos, which wrap ffmpeg commands.
```python
# cut a video clip
mmcv.cut_video('test.mp4', 'clip1.mp4', start=3, end=10, vcodec='h264')
# join a list of video clips
mmcv.concat_video(['clip1.mp4', 'clip2.mp4'], 'joined.mp4', log_level='quiet')
# resize a video with the specified size
mmcv.resize_video('test.mp4', 'resized1.mp4', (360, 240))
# resize a video with a scaling ratio of 2
mmcv.resize_video('test.mp4', 'resized2.mp4', ratio=2)
```
### Optical flow
`mmcv` provides the following methods to operate on optical flows.
- IO
- Visualization
- Flow warping
We provide two options to dump optical flow files: uncompressed and compressed.
The uncompressed way just dumps the floating numbers to a binary file. It is
lossless but the dumped file has a larger size.
The compressed way quantizes the optical flow to 0-255 and dumps it as a
jpeg image. The flow of x-dim and y-dim will be concatenated into a single image.
```python
flow = np.random.rand(800, 600, 2).astype(np.float32)
# dump the flow to a flo file (~3.7M)
mmcv.flowwrite(flow, 'uncompressed.flo')
# dump the flow to a jpeg file (~230K)
# the shape of the dumped image is (800, 1200)
mmcv.flowwrite(flow, 'compressed.jpg', quantize=True, concat_axis=1)
# read the flow file, the shape of loaded flow is (800, 600, 2) for both ways
flow = mmcv.flowread('uncompressed.flo')
flow = mmcv.flowread('compressed.jpg', quantize=True, concat_axis=1)
```
It is possible to visualize optical flows with `mmcv.flowshow()`.
```python
mmcv.flowshow(flow)
```
![progress](_static/flow_visualization.png)
3. Flow warping
```python
img1 = mmcv.imread('img1.jpg')
flow = mmcv.flowread('flow.flo')
warped_img2 = mmcv.flow_warp(img1, flow)
```
img1 (left) and img2 (right)
![raw images](_static/flow_raw_images.png)
optical flow (img2 -> img1)
![optical flow](_static/flow_img2toimg1.png)
warped image and difference with ground truth
![warpped image](_static/flow_warp_diff.png)
m2r
opencv-python
sphinx==3.1.2
sphinx_markdown_tables
torch