Commit 66cefaff authored by Zaida Zhou, committed by GitHub

[Docs] Refactor docs (#1102)

* [Docs] Refactor documentation

* [Docs] Refactor documentation

* refactor docs

* refactor docs

* set sphinx==3.1.2

* fix typo

* modify according to comment

* modify according to comment

* modify according to comment

* [Docs] delete unnecessary file

* fix title

* rename

* rename
parent cdcbc03c
## Image
## Data Process
### Image
This module provides some image processing methods, which require `opencv` to be installed.
### Read/Write/Show
#### Read/Write/Show
To read or write image files, use `imread` or `imwrite`.
......@@ -34,7 +36,7 @@ for i in range(10):
mmcv.imshow(img, win_name='test image', wait_time=200)
```
### Color space conversion
#### Color space conversion
Supported conversion methods:
......@@ -52,7 +54,7 @@ img2 = mmcv.rgb2gray(img1)
img3 = mmcv.bgr2hsv(img)
```
### Resize
#### Resize
There are three resize methods. All `imresize_*` methods have an argument `return_scale`.
If this argument is `False`, the return value is merely the resized image; otherwise
......@@ -73,7 +75,7 @@ mmcv.imrescale(img, 0.5)
mmcv.imrescale(img, (1000, 800))
```
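The `imrescale` calls above take either a ratio or a size tuple. As a rough illustration, here is a minimal sketch of how such a target could be turned into a new size while keeping the aspect ratio. `rescale_size_sketch` is a hypothetical helper, not part of `mmcv`, and reading the tuple as (long edge limit, short edge limit) is an assumption made for this example.

```python
def rescale_size_sketch(old_size, scale):
    """Return ((new_w, new_h), factor) for an image of size (w, h).

    scale is either a positive number (a plain scaling ratio) or a tuple
    whose larger value bounds the long edge and whose smaller value bounds
    the short edge (an assumption made for this sketch).
    """
    w, h = old_size
    if isinstance(scale, (int, float)):
        if scale <= 0:
            raise ValueError(f'Invalid scale {scale}, must be positive.')
        factor = float(scale)
    else:
        max_long_edge = max(scale)
        max_short_edge = min(scale)
        # pick the largest factor that respects both bounds
        factor = min(max_long_edge / max(h, w), max_short_edge / min(h, w))
    # round to the nearest integer pixel size
    new_size = (int(w * factor + 0.5), int(h * factor + 0.5))
    return new_size, factor


print(rescale_size_sketch((400, 300), 0.5))
print(rescale_size_sketch((400, 300), (1000, 800)))
```

Returning the factor alongside the size mirrors the idea behind `return_scale`: callers that need to map coordinates (for example bounding boxes) to the resized image can reuse the same factor.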
### Rotate
#### Rotate
To rotate an image by some angle, use `imrotate`. The center can be specified;
by default it is the center of the original image. There are two modes of rotating,
......@@ -100,7 +102,7 @@ img_ = mmcv.imrotate(img, 30, center=(100, 100))
img_ = mmcv.imrotate(img, 30, auto_bound=True)
```
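With `auto_bound=True`, the output has to be large enough to hold the whole rotated image. The geometry can be sketched as follows: the bounding box of a `width` x `height` rectangle rotated by an angle has size `(h*|sin| + w*|cos|, h*|cos| + w*|sin|)`. `rotated_bounds_sketch` is a hypothetical helper used only to illustrate that formula, and the rounding to whole pixels is an assumption.

```python
import math


def rotated_bounds_sketch(width, height, angle_deg):
    """Size (w, h) of the axis-aligned box enclosing a rotated rectangle."""
    angle = math.radians(angle_deg)
    cos, sin = abs(math.cos(angle)), abs(math.sin(angle))
    new_w = height * sin + width * cos
    new_h = height * cos + width * sin
    # round to whole pixels (an assumption of this sketch)
    return int(round(new_w)), int(round(new_h))


print(rotated_bounds_sketch(200, 100, 0))
print(rotated_bounds_sketch(200, 100, 90))
```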
### Flip
#### Flip
To flip an image, use `imflip`.
......@@ -114,7 +116,7 @@ mmcv.imflip(img)
mmcv.imflip(img, direction='vertical')
```
### Crop
#### Crop
`imcrop` can crop one or more regions from an image, each represented as (x1, y1, x2, y2).
......@@ -136,7 +138,7 @@ patches = mmcv.imcrop(img, bboxes)
patches = mmcv.imcrop(img, bboxes, scale_ratio=1.2)
```
### Padding
#### Padding
There are two methods, `impad` and `impad_to_multiple`, to pad an image to a
specific size with given values.
......@@ -160,3 +162,125 @@ img_ = mmcv.impad(img, padding=(10, 20, 30, 40), pad_val=[100, 50, 200])
# pad an image so that each edge is a multiple of some value.
img_ = mmcv.impad_to_multiple(img, 32)
```
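To make "each edge is a multiple of some value" concrete, the target shape can be computed by rounding each edge up to the next multiple of the divisor. This is a minimal sketch of that arithmetic only. `padded_shape_sketch` is a hypothetical helper, not an `mmcv` function.

```python
import math


def padded_shape_sketch(height, width, divisor):
    """Round height and width up to the nearest multiple of divisor."""
    pad_h = int(math.ceil(height / divisor)) * divisor
    pad_w = int(math.ceil(width / divisor)) * divisor
    return pad_h, pad_w


# a 100 x 150 image padded so both edges are multiples of 32
print(padded_shape_sketch(100, 150, 32))
```

Edges that are already a multiple of the divisor are left unchanged, so padding an image twice with the same divisor gives the same shape as padding it once.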
### Video
This module provides the following functionalities.
- A `VideoReader` class with friendly APIs to read and convert videos.
- Some methods for editing (cut, concat, resize) videos.
- Optical flow read/write/warp.
#### VideoReader
The `VideoReader` class provides sequence-like APIs to access video frames.
It will internally cache the frames which have been visited.
```python
video = mmcv.VideoReader('test.mp4')
# obtain basic information
print(len(video))
print(video.width, video.height, video.resolution, video.fps)
# iterate over all frames
for frame in video:
    print(frame.shape)
# read the next frame
img = video.read()
# read a frame by index
img = video[100]
# read some frames
img = video[5:10]
```
You can also convert a video to images or generate a video from an image directory.
```python
# split a video into frames and save to a folder
video = mmcv.VideoReader('test.mp4')
video.cvt2frames('out_dir')
# generate video from frames
mmcv.frames2video('out_dir', 'test.avi')
```
#### Editing utils
There are also some methods for editing videos, which wrap ffmpeg commands.
```python
# cut a video clip
mmcv.cut_video('test.mp4', 'clip1.mp4', start=3, end=10, vcodec='h264')
# join a list of video clips
mmcv.concat_video(['clip1.mp4', 'clip2.mp4'], 'joined.mp4', log_level='quiet')
# resize a video with the specified size
mmcv.resize_video('test.mp4', 'resized1.mp4', (360, 240))
# resize a video with a scaling ratio of 2
mmcv.resize_video('test.mp4', 'resized2.mp4', ratio=2)
```
#### Optical flow
`mmcv` provides the following methods to operate on optical flows.
- IO
- Visualization
- Flow warping
We provide two options to dump optical flow files: uncompressed and compressed.
The uncompressed way just dumps the floating numbers to a binary file. It is
lossless but the dumped file has a larger size.
The compressed way quantizes the optical flow to 0-255 and dumps it as a
jpeg image. The flow of x-dim and y-dim will be concatenated into a single image.
1. IO
```python
import numpy as np
flow = np.random.rand(800, 600, 2).astype(np.float32)
# dump the flow to a flo file (~3.7M)
mmcv.flowwrite(flow, 'uncompressed.flo')
# dump the flow to a jpeg file (~230K)
# the shape of the dumped image is (800, 1200)
mmcv.flowwrite(flow, 'compressed.jpg', quantize=True, concat_axis=1)
# read the flow file, the shape of loaded flow is (800, 600, 2) for both ways
flow = mmcv.flowread('uncompressed.flo')
flow = mmcv.flowread('compressed.jpg', quantize=True, concat_axis=1)
```
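The compressed option described above can be illustrated with a small NumPy sketch: each flow component is mapped to 256 integer levels and the x and y components are concatenated into one image. This is not `mmcv`'s implementation. The function names, the `max_val` value range, and the rounding scheme are assumptions chosen for the example.

```python
import numpy as np


def quantize_flow_sketch(flow, max_val=1.0, levels=256):
    """Map flow values in [-max_val, max_val] to uint8 and join dx, dy."""
    clipped = np.clip(flow, -max_val, max_val)
    # scale [-max_val, max_val] to [0, levels - 1]
    scaled = (clipped + max_val) / (2 * max_val) * (levels - 1)
    q = np.round(scaled).astype(np.uint8)
    # place dx and dy side by side, like concat_axis=1
    return np.concatenate([q[..., 0], q[..., 1]], axis=1)


def dequantize_flow_sketch(image, max_val=1.0, levels=256):
    """Invert quantize_flow_sketch up to the quantization error."""
    dx_q, dy_q = np.split(image, 2, axis=1)
    q = np.stack([dx_q, dy_q], axis=-1).astype(np.float32)
    return q / (levels - 1) * (2 * max_val) - max_val


flow = np.random.rand(800, 600, 2).astype(np.float32)
image = quantize_flow_sketch(flow)
restored = dequantize_flow_sketch(image)
print(image.shape, restored.shape)
print(float(np.abs(restored - flow).max()))
```

The round trip is lossy: in this sketch the error per value is bounded by half a quantization step, which is the trade-off for the much smaller file.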
2. Visualization
It is possible to visualize optical flows with `mmcv.flowshow()`.
```python
mmcv.flowshow(flow)
```
![progress](../_static/flow_visualization.png)
3. Flow warping
```python
img1 = mmcv.imread('img1.jpg')
flow = mmcv.flowread('flow.flo')
warpped_img2 = mmcv.flow_warp(img1, flow)
```
img1 (left) and img2 (right)
![raw images](../_static/flow_raw_images.png)
optical flow (img2 -> img1)
![optical flow](../_static/flow_img2toimg1.png)
warped image and difference with ground truth
![warped image](../_static/flow_warp_diff.png)
......@@ -62,7 +62,7 @@ converter_cfg = dict(type='Converter1', a=a_value, b=b_value)
converter = CONVERTERS.build(converter_cfg)
```
## Customize Build Function
### Customize Build Function
Suppose we would like to customize how `converters` are built. We could implement a customized `build_func` and pass it into the registry.
......@@ -89,7 +89,7 @@ Note: in this example, we demonstrate how to use the `build_func` argument to cu
The functionality is similar to the default `build_from_cfg`. In most cases, the default one would be sufficient.
`build_model_from_cfg` is also implemented to build PyTorch modules in `nn.Sequential`; you may use it directly instead of implementing your own.
## Hierarchy Registry
### Hierarchy Registry
You could also build modules from more than one OpenMMLab framework. For example, you could use all backbones in [MMClassification](https://github.com/open-mmlab/mmclassification) for object detectors in [MMDetection](https://github.com/open-mmlab/mmdetection), or combine an object detection model in [MMDetection](https://github.com/open-mmlab/mmdetection) with a semantic segmentation model in [MMSegmentation](https://github.com/open-mmlab/mmsegmentation).
......
## Utils
### ProgressBar
If you want to apply a method to a list of items and track the progress, `track_progress`
is a good choice. It will display a progress bar to tell the progress and ETA.
```python
import mmcv
def func(item):
    # do something
    pass
tasks = [item_1, item_2, ..., item_n]
mmcv.track_progress(func, tasks)
```
The output is like the following.
![progress](../_static/progress.gif)
There is another method `track_parallel_progress`, which wraps multiprocessing and
progress visualization.
```python
mmcv.track_parallel_progress(func, tasks, 8) # 8 workers
```
![progress](../_static/parallel_progress.gif)
If you want to iterate or enumerate a list of items and track the progress, `track_iter_progress`
is a good choice. It will display a progress bar to tell the progress and ETA.
```python
import mmcv
tasks = [item_1, item_2, ..., item_n]
for task in mmcv.track_iter_progress(tasks):
    # do something like print
    print(task)
for i, task in enumerate(mmcv.track_iter_progress(tasks)):
    # do something like print
    print(i)
    print(task)
```
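The idea behind an iterator-style progress tracker can be sketched as a generator that yields each item and redraws a text bar once the item has been consumed. This is a simplified illustration, not `mmcv`'s implementation: `track_iter_sketch` and its `bar_width` and `file` parameters are hypothetical, and ETA reporting is left out.

```python
import sys


def track_iter_sketch(tasks, bar_width=30, file=sys.stdout):
    """Yield items from tasks, redrawing a text progress bar as we go."""
    tasks = list(tasks)
    total = len(tasks)
    for done, task in enumerate(tasks, start=1):
        yield task
        # the caller has finished with this item; update the bar
        filled = int(bar_width * done / total)
        bar = '>' * filled + ' ' * (bar_width - filled)
        file.write(f'\r[{bar}] {done}/{total}')
        file.flush()
    file.write('\n')


squares = [n * n for n in track_iter_sketch(range(5))]
print(squares)
```

Because the wrapper is a plain generator, it composes with `enumerate` and other iterator tools in the same way as the calls shown above.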
### Timer
It is convenient to compute the runtime of a code block with `Timer`.
```python
import time
import mmcv
with mmcv.Timer():
    # simulate some code block
    time.sleep(1)
```
Alternatively, use `since_start()` and `since_last_check()`. The former
returns the runtime since the timer started, and the latter returns the time
since the last check.
```python
timer = mmcv.Timer()
# code block 1 here
print(timer.since_start())
# code block 2 here
print(timer.since_last_check())
print(timer.since_start())
```
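The difference between the two methods can be sketched with a tiny stand-alone class built on `time.perf_counter`. `SimpleTimer` is a hypothetical illustration, not `mmcv.Timer`. In particular, treating a call to `since_start()` as a "check" is an assumption of this sketch.

```python
import time


class SimpleTimer:
    """Minimal timer tracking a start time and the time of the last check."""

    def __init__(self):
        self._start = time.perf_counter()
        self._last = self._start

    def since_start(self):
        # total elapsed time; also counts as a check in this sketch
        self._last = time.perf_counter()
        return self._last - self._start

    def since_last_check(self):
        # elapsed time since the previous call to either method
        now = time.perf_counter()
        elapsed = now - self._last
        self._last = now
        return elapsed


timer = SimpleTimer()
time.sleep(0.01)  # code block 1
first = timer.since_start()
time.sleep(0.01)  # code block 2
second = timer.since_last_check()
total = timer.since_start()
print(first, second, total)
```

Here `second` covers only code block 2, while `total` covers both blocks.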
## Video
This module provides the following functionalities.
- A `VideoReader` class with friendly APIs to read and convert videos.
- Some methods for editing (cut, concat, resize) videos.
- Optical flow read/write/warp.
### VideoReader
The `VideoReader` class provides sequence-like APIs to access video frames.
It will internally cache the frames which have been visited.
```python
video = mmcv.VideoReader('test.mp4')
# obtain basic information
print(len(video))
print(video.width, video.height, video.resolution, video.fps)
# iterate over all frames
for frame in video:
    print(frame.shape)
# read the next frame
img = video.read()
# read a frame by index
img = video[100]
# read some frames
img = video[5:10]
```
You can also convert a video to images or generate a video from an image directory.
```python
# split a video into frames and save to a folder
video = mmcv.VideoReader('test.mp4')
video.cvt2frames('out_dir')
# generate video from frames
mmcv.frames2video('out_dir', 'test.avi')
```
### Editing utils
There are also some methods for editing videos, which wrap ffmpeg commands.
```python
# cut a video clip
mmcv.cut_video('test.mp4', 'clip1.mp4', start=3, end=10, vcodec='h264')
# join a list of video clips
mmcv.concat_video(['clip1.mp4', 'clip2.mp4'], 'joined.mp4', log_level='quiet')
# resize a video with the specified size
mmcv.resize_video('test.mp4', 'resized1.mp4', (360, 240))
# resize a video with a scaling ratio of 2
mmcv.resize_video('test.mp4', 'resized2.mp4', ratio=2)
```
### Optical flow
`mmcv` provides the following methods to operate on optical flows.
- IO
- Visualization
- Flow warping
We provide two options to dump optical flow files: uncompressed and compressed.
The uncompressed way just dumps the floating numbers to a binary file. It is
lossless but the dumped file has a larger size.
The compressed way quantizes the optical flow to 0-255 and dumps it as a
jpeg image. The flow of x-dim and y-dim will be concatenated into a single image.
```python
import numpy as np
flow = np.random.rand(800, 600, 2).astype(np.float32)
# dump the flow to a flo file (~3.7M)
mmcv.flowwrite(flow, 'uncompressed.flo')
# dump the flow to a jpeg file (~230K)
# the shape of the dumped image is (800, 1200)
mmcv.flowwrite(flow, 'compressed.jpg', quantize=True, concat_axis=1)
# read the flow file, the shape of loaded flow is (800, 600, 2) for both ways
flow = mmcv.flowread('uncompressed.flo')
flow = mmcv.flowread('compressed.jpg', quantize=True, concat_axis=1)
```
It is possible to visualize optical flows with `mmcv.flowshow()`.
```python
mmcv.flowshow(flow)
```
![progress](_static/flow_visualization.png)
3. Flow warping
```python
img1 = mmcv.imread('img1.jpg')
flow = mmcv.flowread('flow.flo')
warpped_img2 = mmcv.flow_warp(img1, flow)
```
img1 (left) and img2 (right)
![raw images](_static/flow_raw_images.png)
optical flow (img2 -> img1)
![optical flow](_static/flow_img2toimg1.png)
warped image and difference with ground truth
![warped image](_static/flow_warp_diff.png)
m2r
opencv-python
sphinx==3.1.2
sphinx_markdown_tables
torch