## Symlink/Put all the datasets here
It is recommended to symlink your dataset root to this folder (`datasets`) with the command `ln -s xxx yyy`.
## Prepare Datasets
We regroup the dataset into [HDF5](https://www.h5py.org/) format because it offers better read IO performance.
### Simulating events and getting voxel grids
* **Step 1: Event data generation.**
We follow [vid2e](https://github.com/uzh-rpg/rpg_vid2e) to simulate [REDS](https://seungjunnah.github.io/Datasets/reds.html), [Vimeo-90K](https://github.com/anchen1011/toflow) and [Vid4](https://mmagic.readthedocs.io/en/stable/dataset_zoo/vid4.html) events at high resolution. Note that the vid2e repo uses the pretrained [FILM](https://github.com/google-research/frame-interpolation) video frame interpolation model to interpolate frames first, whereas we use the pretrained [RIFE](https://github.com/megvii-research/ECCV2022-RIFE) model for interpolation. We then use the `esim_py` PyPI package from [vid2e](https://github.com/uzh-rpg/rpg_vid2e) to simulate events from the interpolated sequences. Our simulator parameter configuration is as follows:
```python
import random
import esim_py
config = {
    'refractory_period': 1e-4,
    'CT_range': [0.05, 0.5],
    'max_CT': 0.5,
    'min_CT': 0.02,
    'mu': 1,
    'sigma': 0.1,
    'H': clip.height,
    'W': clip.width,
    'log_eps': 1e-3,
    'use_log': True,
}
Cp = random.uniform(config['CT_range'][0], config['CT_range'][1])
Cn = random.gauss(config['mu'], config['sigma']) * Cp
Cp = min(max(Cp, config['min_CT']), config['max_CT'])
Cn = min(max(Cn, config['min_CT']), config['max_CT'])
esim = esim_py.EventSimulator(Cp,
                              Cn,
                              config['refractory_period'],
                              config['log_eps'],
                              config['use_log'])
events = esim.generateFromFolder(image_folder, timestamps_file)  # Generate events with shape [N, 4]
```
Here, `timestamps_file` is user-defined. For videos with known frame rates, this file contains [0, 1.0/fps, 2.0/fps, ...]. For unknown frame rates, we assume fps = 25.0. Similar event camera simulators include [ESIM](https://github.com/uzh-rpg/rpg_esim), [DVS-Voltmeter](https://github.com/Lynn0306/DVS-Voltmeter), and [V2E](https://github.com/SensorsINI/v2e), which you can also try.
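For reference, a constant-frame-rate timestamps file can be generated with a few lines of Python (a minimal sketch; `num_frames` and the output path are placeholders for your own sequence):
```python
# Write timestamps.txt for a sequence with a known (or assumed) frame rate.
fps = 25.0
num_frames = 100  # placeholder: number of frames in your sequence
with open("timestamps.txt", "w") as f:
    f.write("\n".join(str(i / fps) for i in range(num_frames)))
```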
* **Step 2: Convert events to voxel grids.**
- Refer to [events_contrast_maximization](https://github.com/TimoStoff/events_contrast_maximization/blob/master/tools/event_packagers.py) for creating the [hdf5](https://docs.h5py.org/en/stable/) data structure to accelerate IO processing.
- Then convert the events to voxel grids following [events_to_voxel_torch](https://github.com/TimoStoff/event_utils/blob/master/lib/representations/voxel_grid.py#L114-L153) (we set B=5); a rough sketch of the idea is given below.
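For intuition, the voxel grid distributes each event's polarity over the temporal bins by bilinear voting along the time axis. A rough numpy sketch of this idea (assuming integer pixel coordinates and sorted timestamps; not the referenced implementation) is:
```python
import numpy as np

def events_to_voxel_sketch(xs, ys, ts, ps, bins, sensor_size):
    """Rough sketch: bilinearly vote event polarities into `bins` temporal slices."""
    H, W = sensor_size
    voxel = np.zeros((bins, H, W), dtype=np.float64)
    # Normalize timestamps to [0, bins - 1]
    t_norm = (ts - ts[0]) / max(ts[-1] - ts[0], 1e-9) * (bins - 1)
    b0 = np.floor(t_norm).astype(int)
    b1 = np.clip(b0 + 1, 0, bins - 1)
    w1 = t_norm - b0
    np.add.at(voxel, (b0, ys, xs), ps * (1 - w1))
    np.add.at(voxel, (b1, ys, xs), ps * w1)
    return voxel
```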
* **Step 3: Generate backward voxel grids to suit our bidirectional network.**
```python
if backward:
    xs = torch.flip(xs, dims=[0])
    ys = torch.flip(ys, dims=[0])
    ts = torch.flip(t_end - ts + t_start, dims=[0])  # t_end and t_start represent the timestamp range of the events to be flipped, typically the timestamps of two consecutive frames.
    ps = torch.flip(-ps, dims=[0])
voxel = events_to_voxel_torch(xs, ys, ts, ps, bins, device=None, sensor_size=sensor_size)
```
* **Step 4: Voxel normalization.**
```python
def voxel_normalization(voxel):
    """
    Normalize the voxel as in https://arxiv.org/abs/1912.01584, Section 3.1.
    Params:
        voxel: torch.Tensor, shape is [num_bins, H, W]
    return:
        normalized voxel
    """
    # Check whether all elements of the voxel are 0
    a, b, c = voxel.shape
    tmp = torch.zeros(a, b, c)
    if torch.equal(voxel, tmp):
        return voxel
    abs_voxel, _ = torch.sort(torch.abs(voxel).view(-1, 1).squeeze(1))
    first_non_zero_idx = torch.nonzero(abs_voxel)[0].item()
    non_zero_voxel = abs_voxel[first_non_zero_idx:]
    norm_idx = math.floor(non_zero_voxel.shape[0] * 0.98)
    ones = torch.ones_like(voxel)
    normed_voxel = torch.where(torch.abs(voxel) < non_zero_voxel[norm_idx], voxel / non_zero_voxel[norm_idx], voxel)
    normed_voxel = torch.where(normed_voxel >= non_zero_voxel[norm_idx], ones, normed_voxel)
    normed_voxel = torch.where(normed_voxel <= -non_zero_voxel[norm_idx], -ones, normed_voxel)
    return normed_voxel
```
* **Step 5: Downsample voxels.**
Apply bicubic downsampling with [torch.nn.functional.interpolate](https://pytorch.org/docs/stable/generated/torch.nn.functional.interpolate.html) to the converted event voxels to generate low-resolution event voxels (a short sketch is given below).
**[Note]**: If you are only running inference on your own low-resolution video, there is no need to downsample the event voxels.
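A minimal sketch of the x4 case, assuming `voxel` is a `[Bins, H, W]` tensor from the previous steps (the shapes below are placeholders):
```python
import torch
import torch.nn.functional as F

voxel = torch.randn(5, 720, 1280)  # placeholder HR voxel, [Bins, H, W]
lr_voxel = F.interpolate(voxel.unsqueeze(0), scale_factor=0.25,
                         mode='bicubic', align_corners=False).squeeze(0)  # [Bins, H/4, W/4]
```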
### Dataset Structure
* Training set
* [REDS](https://seungjunnah.github.io/Datasets/reds.html) dataset. The meta info files are [meta_info_REDS_h5_train.txt](https://github.com/DachunKai/EvTexture/blob/main/basicsr/data/meta_info/meta_info_REDS_h5_train.txt) and [meta_info_REDS_h5_test.txt](https://github.com/DachunKai/EvTexture/blob/main/basicsr/data/meta_info/meta_info_REDS_h5_test.txt). Prepare the REDS_h5 structure as follows:
```
REDS_h5
├── HR
│ ├── train
│ │ ├── 001.h5
│ │ ├── ...
│ ├── test
│ ├── 000.h5
│ ├── ...
├── LRx4
│ ├── train
│ │ ├── 001.h5
│ │ ├── ...
│ ├── test
│ ├── 000.h5
│ ├── ...
```
* [Vimeo-90K](https://github.com/anchen1011/toflow) dataset. The meta info files are [meta_info_Vimeo_h5_train.txt](https://github.com/DachunKai/EvTexture/blob/main/basicsr/data/meta_info/meta_info_Vimeo_h5_train.txt) and [meta_info_Vimeo_h5_test.txt](https://github.com/DachunKai/EvTexture/blob/main/basicsr/data/meta_info/meta_info_Vimeo_h5_test.txt). Prepare the Vimeo_h5 structure as follows:
```
Vimeo_h5
├── HR
│ ├── train
│ │ ├── 00001_0001.h5
│ │ ├── ...
│ ├── test
│ ├── 00001_0266.h5
│ ├── ...
├── LRx4
│ ├── train
│ │ ├── 00001_0001.h5
│ │ ├── ...
│ ├── test
│ ├── 00001_0266.h5
│ ├── ...
```
* [CED](https://rpg.ifi.uzh.ch/CED.html) dataset. The meta info files are [meta_info_CED_h5_train.txt](https://github.com/DachunKai/EvTexture/blob/main/basicsr/data/meta_info/meta_info_CED_h5_train.txt) and [meta_info_CED_h5_test.txt](https://github.com/DachunKai/EvTexture/blob/main/basicsr/data/meta_info/meta_info_CED_h5_test.txt). Prepare the CED_h5 structure as follows:
```
├────CED_h5
│ ├────HR
│ │ ├────train
│ │ │ ├────calib_fluorescent.h5
│ │ │ ├────...
│ │ ├────test
│ │ ├────indoors_foosball_2.h5
│ │ ├────...
│ ├────LRx2
│ │ ├────train
│ │ │ ├────calib_fluorescent.h5
│ │ │ ├────...
│ │ ├────test
│ │ ├────indoors_foosball_2.h5
│ │ ├────...
│ ├────LRx4
│ ├────train
│ │ ├────calib_fluorescent.h5
│ │ ├────...
│ ├────test
│ ├────indoors_foosball_2.h5
│ ├────...
```
* Testing set
* [REDS4](https://seungjunnah.github.io/Datasets/reds.html) dataset.
* [Vimeo-90K-T](https://github.com/anchen1011/toflow) dataset.
* [Vid4](https://mmagic.readthedocs.io/en/stable/dataset_zoo/vid4.html) dataset. The meta info file is [meta_info_Vid4_h5_test.txt](https://github.com/DachunKai/EvTexture/blob/main/basicsr/data/meta_info/meta_info_Vid4_h5_test.txt). Prepare the Vid4_h5 structure as follows:
```
Vid4_h5
├── HR
│ ├── test
│ │ ├── calendar.h5
│ │ ├── ...
├── LRx4
│ ├── test
│ ├── calendar.h5
│ ├── ...
```
* [CED](https://rpg.ifi.uzh.ch/CED.html) dataset.
* HDF5 file example
We show our HDF5 file structure using the `calendar.h5` file from the Vid4 dataset.
```
calendar.h5
├── images
│ ├── 000000 # frame, ndarray, [H, W, C]
│ ├── ...
├── voxels_f
│ ├── 000000 # forward event voxel, ndarray, [Bins, H, W]
│ ├── ...
├── voxels_b
│ ├── 000000 # backward event voxel, ndarray, [Bins, H, W]
│ ├── ...
```
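To sanity-check a packaged file, the groups can be read back with `h5py`, for example (a minimal sketch; the file path is a placeholder):
```python
import h5py

with h5py.File('datasets/Vid4_h5/HR/test/calendar.h5', 'r') as f:
    keys = sorted(f['images'].keys())
    frame0 = f['images'][keys[0]][()]      # ndarray, [H, W, C]
    voxel_f0 = f['voxels_f'][keys[0]][()]  # ndarray, [Bins, H, W]
    print(frame0.shape, voxel_f0.shape)
```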
import random
import esim_py
import torch
import h5py
import numpy as np
import math
import bisect
from pathlib import Path
project_dir = Path(__file__).resolve().parent.parent
import sys
sys.path.append(str(project_dir))
from others.event_utils.lib.representations.voxel_grid import events_to_voxel_torch
from glob import glob
# from PIL import Image
import cv2
import os
def package_images(image_root, h5_path):
    for ip in glob(os.path.join(image_root, "*.png")):
        image = cv2.imread(ip)
        image = np.array(image)
        image_name = ip.split(os.sep)[-1].split('.')[0].split("_")[-1]
        with h5py.File(h5_path, 'a') as h5f:
            h5f.create_dataset(f"images/{image_name}", data=image, compression="gzip")
def vid2events(image_root,
               sensor_size_height,
               sensor_size_width):
    config = {
        'refractory_period': 1e-4,
        'CT_range': [0.05, 0.5],
        'max_CT': 0.5,
        'min_CT': 0.02,
        'mu': 1,
        'sigma': 0.1,
        'H': sensor_size_height,
        'W': sensor_size_width,
        'log_eps': 1e-3,
        'use_log': True,
    }
    Cp = random.uniform(config['CT_range'][0], config['CT_range'][1])
    Cn = random.gauss(config['mu'], config['sigma']) * Cp
    Cp = min(max(Cp, config['min_CT']), config['max_CT'])
    Cn = min(max(Cn, config['min_CT']), config['max_CT'])
    esim = esim_py.EventSimulator(Cp,
                                  Cn,
                                  config['refractory_period'],
                                  config['log_eps'],
                                  config['use_log'])
    events = esim.generateFromFolder(f"{image_root}/images", f"{image_root}/timestamps.txt")  # Generate events with shape [N, 4]
    return events
def voxel_normalization(voxel):
    """
    Normalize the voxel as in https://arxiv.org/abs/1912.01584, Section 3.1.
    Params:
        voxel: torch.Tensor, shape is [num_bins, H, W]
    return:
        normalized voxel
    """
    # Check whether all elements of the voxel are 0
    a, b, c = voxel.shape
    tmp = torch.zeros(a, b, c)
    if torch.equal(voxel, tmp):
        return voxel
    abs_voxel, _ = torch.sort(torch.abs(voxel).view(-1, 1).squeeze(1))
    first_non_zero_idx = torch.nonzero(abs_voxel)[0].item()
    non_zero_voxel = abs_voxel[first_non_zero_idx:]
    norm_idx = math.floor(non_zero_voxel.shape[0] * 0.98)
    ones = torch.ones_like(voxel)
    normed_voxel = torch.where(torch.abs(voxel) < non_zero_voxel[norm_idx], voxel / non_zero_voxel[norm_idx], voxel)
    normed_voxel = torch.where(normed_voxel >= non_zero_voxel[norm_idx], ones, normed_voxel)
    normed_voxel = torch.where(normed_voxel <= -non_zero_voxel[norm_idx], -ones, normed_voxel)
    return normed_voxel
def package_bidirectional_event_voxels(x, y, t, p, timestamp_list, backward, bins, sensor_size, h5_name, error_txt):
    """
    params:
        x: ndarray, x-position of events
        y: ndarray, y-position of events
        t: ndarray, timestamp of events
        p: ndarray, polarity of events
        timestamp_list: list, used to split events by timestamp
        backward: bool, whether to build forward or backward voxels
        bins: voxel num_bins
    returns:
        no return.
    """
    # Step 1: convert data types
    assert x.shape == y.shape == t.shape == p.shape
    x = torch.from_numpy(x.astype(np.int16))
    y = torch.from_numpy(y.astype(np.int16))
    t = torch.from_numpy(t.astype(np.float32))
    p = torch.from_numpy(p.astype(np.int16))
    assert x.shape == y.shape == t.shape == p.shape

    # Step 2: select events between two frames according to timestamp
    temp = t.numpy().tolist()
    output = [
        temp[
            bisect.bisect_left(temp, timestamp_list[i]):bisect.bisect_left(temp, timestamp_list[i + 1])
        ]
        for i in range(len(timestamp_list) - 1)
    ]

    # Debug: check for data errors
    assert len(output) == len(timestamp_list) - 1, f"len(output) is {len(output)}, but len(timestamp_list) is {len(timestamp_list)}"
    sum_output = []
    total = 0
    for i in range(len(output)):
        if len(output[i]) == 0:
            raise ValueError(f"{h5_name} len(output[{i}] == 0)")
        elif len(output[i]) == 1:
            raise ValueError(f"{h5_name} len(output[{i}] == 1)")
        total += len(output[i])
        sum_output.append(total)
    assert len(sum_output) == len(output)

    # Step 3: after checking the data, continue.
    start_idx = 0
    for voxel_idx in range(len(timestamp_list) - 1):
        if len(output[voxel_idx]) == 0 or len(output[voxel_idx]) == 1:
            print(f'{h5_name} len(output[{voxel_idx}])): ', len(output[voxel_idx]))
            with open(error_txt, 'a+') as f:
                f.write(h5_name + '\n')
            return
        end_idx = start_idx + len(output[voxel_idx])
        if end_idx > len(t):
            with open(error_txt, 'a+') as f:
                f.write(f"{h5_name} voxel_idx: {voxel_idx}, start_idx {start_idx} end_idx {end_idx} exceed bound." + '\n')
            print(f"{h5_name} voxel_idx: {voxel_idx}, start_idx {start_idx} end_idx {end_idx} exceeds bound len(t) {len(t)}.")
            return
        xs = x[start_idx:end_idx]
        ys = y[start_idx:end_idx]
        ts = t[start_idx:end_idx]
        ps = p[start_idx:end_idx]
        if ts.shape == torch.Size([]) or ts.shape == torch.Size([1]) or ts.shape == torch.Size([0]):
            with open(error_txt, 'a+') as f:
                f.write(f"{h5_name} len(output[{voxel_idx}]) backward {backward} start_idx {start_idx} end_idx {end_idx} is error! Please check the data." + '\n')
            print(f"{h5_name} len(output[{voxel_idx}]) backward {backward} start_idx {start_idx} end_idx {end_idx} is error! Please check the data.")
            return
        if backward:
            t_start = timestamp_list[voxel_idx]
            t_end = timestamp_list[voxel_idx + 1]
            xs = torch.flip(xs, dims=[0])
            ys = torch.flip(ys, dims=[0])
            ts = torch.flip(t_end - ts + t_start, dims=[0])
            ps = torch.flip(-ps, dims=[0])
        voxel = events_to_voxel_torch(xs, ys, ts, ps, bins, device=None, sensor_size=sensor_size)
        normed_voxel = voxel_normalization(voxel)
        np_voxel = normed_voxel.numpy()
        with h5py.File(h5_name, 'a') as events_file:
            if backward:
                events_file.create_dataset("voxels_b/{:06d}".format(voxel_idx), data=np_voxel, dtype=np.dtype(np.float32), compression="gzip")
            else:
                events_file.create_dataset("voxels_f/{:06d}".format(voxel_idx), data=np_voxel, dtype=np.dtype(np.float32), compression="gzip")
        start_idx = end_idx
def events(args):
    # 1. Simulate events from the image sequence
    events = vid2events(args.image_root,
                        args.sensor_size_height,
                        args.sensor_size_width)
    # 2. Split events by frame timestamps and package voxel grids
    timestamp_list = []
    with open(f"{args.image_root}/timestamps.txt", "r") as f:
        for line in f.readlines():
            timestamp_list.append(float(line.strip()))
    package_bidirectional_event_voxels(
        events[:, 0],
        events[:, 1],
        events[:, 2],
        events[:, 3],
        timestamp_list,
        args.backward,
        args.bins,
        (args.sensor_size_height, args.sensor_size_width),
        args.h5_path,
        args.error
    )
if __name__ == "__main__":
    import argparse
    parser = argparse.ArgumentParser()
    parser.add_argument("--image_root", help="root directory of the input images")
    parser.add_argument("--backward", action="store_true")
    parser.add_argument("--sensor_size_height", type=int)
    parser.add_argument("--sensor_size_width", type=int)
    parser.add_argument("--bins", type=int, default=5)
    parser.add_argument("--h5_path", type=str)
    parser.add_argument("--error", type=str, help="path of the error log file")
    args = parser.parse_args()
    if not os.path.exists(args.h5_path):
        print("Packaging images")
        package_images(f"{args.image_root}/images", args.h5_path)
    else:
        print("Images already packaged")
    print("backward?", args.backward)
    events(args)
#!/bin/bash
IMAGE_ROOT=images
SENSOR_SIZE_HEIGHT=352
SENSOR_SIZE_WIDTH=480
BINS=5
H5_PATH=y10/test.h5
ERROR=y10/error.txt
python datapreparation.py --image_root ${IMAGE_ROOT} \
--sensor_size_height ${SENSOR_SIZE_HEIGHT} \
--sensor_size_width ${SENSOR_SIZE_WIDTH} \
--bins ${BINS} \
--h5_path ${H5_PATH} \
--error ${ERROR} \
--backward
import h5py
def print_hdf5_structure(file_name):
    def print_attrs(name, obj):
        print(f"{name}")
        for key, val in obj.attrs.items():
            print(f"  Attribute: {key}: {val}")
    with h5py.File(file_name, 'r') as f:
        f.visititems(print_attrs)

# Replace with the path of your HDF5 file
file_name = 'zbl2/test.h5'
print_hdf5_structure(file_name)
# with h5py.File(file_name, 'a') as h5f:
#     del h5f['voxels_b']
#     del h5f['voxels_f']

# import h5py
# # Open the H5 file
# with h5py.File('zbl2', 'r') as f:
#     # List all groups
#     groups = list(f.keys())
#     print("Groups:", groups)
#     # Select a group (e.g., the first one)
#     group_name = groups[0]
#     group = f[group_name]
#     print("Selected group:", group_name)
#     # List all datasets in the group
#     datasets = list(group.keys())
#     print("Datasets:", datasets)
#     # Select a dataset (e.g., the first one)
#     dataset_name = datasets[0]
#     dataset = group[dataset_name]
#     print("Selected dataset:", dataset_name)
#     # Print the dataset content
#     print("Dataset shape:")
#     print(dataset.shape)  # shape of the whole dataset
#     # If the dataset is large, you may only want to print part of it, e.g. dataset[0:5] for the first 5 entries
import cv2
import os
from glob import glob
def toavi(image_root,
          output_path,
          frame_rate: int = 16):
    image_path_list = list(glob(os.path.join(image_root, "*.png")))
    image_path_list.sort()
    frame = cv2.imread(image_path_list[0])
    height, width, layers = frame.shape
    fourcc = cv2.VideoWriter_fourcc(*'MJPG')
    video = cv2.VideoWriter(output_path, fourcc, frame_rate, (width, height))
    for image_path in image_path_list:
        frame = cv2.imread(image_path)
        video.write(frame)
    video.release()

if __name__ == "__main__":
    import argparse
    parser = argparse.ArgumentParser()
    parser.add_argument("--image_root", type=str)
    parser.add_argument("--output_path", type=str)
    args = parser.parse_args()
    # image_root = "/home/modelzoo/EvTexture/results/EvTexture_REDS4_BIx4/visualization/REDS4/test"
    # image_root = "/home/modelzoo/EvTexture/datasets/images/images"
    # output_path = "./low.avi"
    toavi(args.image_root, args.output_path)
import cv2
import os
def v2i(video_path,
        output_dir,
        frame_rate: int,
        frame_nums: int):
    video_capture = cv2.VideoCapture(video_path)
    if not video_capture.isOpened():
        print("Error: Cannot open video file.")
        exit()
    # Create the output folder and timestamps.txt
    os.makedirs(os.path.join(output_dir, "images"), exist_ok=True)
    timestamps_path = os.path.join(output_dir, "timestamps.txt")
    with open(timestamps_path, "w") as f:
        pass
    video_fps = video_capture.get(cv2.CAP_PROP_FPS)
    print(f"Video FPS: {video_fps}")
    frame_interval = max(int(video_fps // frame_rate), 1)  # guard against a zero interval when video_fps < frame_rate
    frame_count = 0
    saved_frame_count = 0
    timestamps = []
    while True:
        ret, frame = video_capture.read()
        if frame_count > frame_nums:
            break
        if ret:
            if frame_count % frame_interval == 0:
                timestamp = video_capture.get(cv2.CAP_PROP_POS_MSEC) / 1000.0
                if frame_count == 0:
                    timestamps.append(0 / 1000.0)
                else:
                    timestamps.append(timestamp)
                frame_filename = os.path.join(output_dir, "images", f"image_{saved_frame_count:06d}.png")
                cv2.imwrite(frame_filename, frame)
                print(f"Saved {frame_filename}, timestamp: {timestamp}")
                saved_frame_count += 1
            frame_count += 1
        else:
            break
    video_capture.release()
    with open(timestamps_path, "w") as f:
        for idx, t in enumerate(timestamps):
            if idx + 1 < len(timestamps):
                f.write(str(t) + '\n')
            else:
                f.write(str(t))
    print("Finished extracting frames.")

if __name__ == "__main__":
    import argparse
    parser = argparse.ArgumentParser()
    parser.add_argument("--video_path", type=str)
    parser.add_argument("--output_dir", type=str)
    parser.add_argument("--frame_rate", type=int, default=25)
    parser.add_argument("--frame_nums", type=int, default=90)
    args = parser.parse_args()
    v2i(args.video_path, args.output_dir, args.frame_rate, args.frame_nums)
# ========================================================================
# Module list for ``EvTexture: Event-driven Texture Enhancement for Video Super-resolution'' paper (ICML 2024).
# ------------------------------------------------------------------------
# python 3.7 (conda)
# pytorch 1.10.2+cu111 (pip)
# torchvision 0.11.3+cu111 (pip)
# BasicSR 1.4.2 (pip)
# ========================================================================
FROM nvidia/cuda:11.1.1-cudnn8-devel-ubuntu20.04
ENV PATH /opt/conda/bin:$PATH
ENV LD_LIBRARY_PATH /usr/local/cuda/extras/CUPTI/lib64:/usr/local/cuda/compat/lib:/usr/local/nvidia/lib:/usr/local/nvidia/lib64:/usr/lib/x86_64-linux-gnu:/usr/local/cuda-11.1/lib64:$LD_LIBRARY_PATH
ENV PIP_ROOT_USER_ACTION=ignore
COPY ./EvTexture /EvTexture
RUN APT_INSTALL="apt-get install -y --no-install-recommends" && \
rm -rf /var/lib/apt/lists/* \
/etc/apt/sources.list.d/cuda.list \
/etc/apt/sources.list.d/nvidia-ml.list && \
apt-get update && \
apt-get upgrade -y && \
# ==================================================================
# environments
# ------------------------------------------------------------------
DEBIAN_FRONTEND=noninteractive $APT_INSTALL \
apt-utils \
build-essential \
ca-certificates \
wget \
cmake \
unzip \
vim-gtk3 \
git \
g++ \
gcc \
libboost-dev \
libboost-thread-dev \
libboost-filesystem-dev \
libglib2.0-0 \
libsm6 \
libxext6 \
libxrender-dev \
libgl1-mesa-glx \
&& \
# =================================================================
# Miniconda3
# ----------------------------------------------------------------
wget --quiet https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O /tmp/miniconda.sh && \
/bin/bash /tmp/miniconda.sh -b -p /opt/conda && \
# ================================================================
# dependencies for the evtexture environment
conda update -y conda && \
conda create -y -n evtexture python=3.7 && \
/bin/bash -c "source activate evtexture && pip install --upgrade pip && pip --no-cache-dir install torch==1.10.2+cu111 torchvision==0.11.3+cu111 -f https://download.pytorch.org/whl/torch_stable.html && cd /EvTexture && pip --no-cache-dir install -r requirements.txt && python setup.py develop" && \
# =================================================================
# cleanup
#-----------------------------------------------------------------
ldconfig && \
apt-get clean && \
apt-get autoremove && \
rm -rf /var/lib/apt/lists/* /tmp/*
# Unique model identifier
modelCode = 755
# Model name
modelName=evtexture_pytorch
# Model description
modelDescription=evtexture_pytorch can be used for video super-resolution
# Application scenarios
appScenario=inference,video super-resolution,security,media,environment
# Framework type
frameType=pytorch
name: EvTexture_REDS4_BIx4
model_type: E2VSRModel
scale: 4
num_gpu: 1
manual_seed: 0

datasets:
  test:
    name: REDS4
    type: VideoWithEventsTestDataset
    dataroot_gt: datasets/REDS4_h5/HR/test
    dataroot_lq: datasets/REDS4_h5/LRx4/test
    meta_info_file: basicsr/data/meta_info/meta_info_REDS_h5_test.txt
    io_backend:
      type: hdf5

# network structures
network_g:
  type: EvTexture
  num_feat: 64
  num_block: 30

# path
path:
  pretrain_network_g: experiments/pretrained_models/EvTexture/EvTexture_REDS_BIx4.pth
  strict_load_g: true

# validation settings
val:
  save_img: true
  flip_seq: false
  suffix: ~  # add suffix to saved images, if None, use exp name

  metrics:
    psnr:  # metric name, can be arbitrary
      type: calculate_psnr
      crop_border: 0
      test_y_channel: false
    ssim:
      type: calculate_ssim
      crop_border: 0
      test_y_channel: false
    lpips:
      type: calculate_lpips
      crop_border: 0
      test_y_channel: false
name: EvTexture_Vid4_BIx4
model_type: E2VSRModel
scale: 4
num_gpu: auto
manual_seed: 0

datasets:
  test:
    name: Vid4
    type: VideoWithEventsTestDataset
    dataroot_gt: datasets/Vid4_h5/HR/test
    dataroot_lq: datasets/Vid4_h5/LRx4/test
    meta_info_file: basicsr/data/meta_info/meta_info_Vid4_h5_test.txt
    io_backend:
      type: hdf5

# network structures
network_g:
  type: EvTexture
  num_feat: 64
  num_block: 30

# path
path:
  pretrain_network_g: experiments/pretrained_models/EvTexture/EvTexture_Vimeo90K_BIx4.pth
  strict_load_g: true

# validation settings
val:
  save_img: true
  flip_seq: false
  suffix: ~  # add suffix to saved images, if None, use exp name

  metrics:
    psnr:  # metric name, can be arbitrary
      type: calculate_psnr
      crop_border: 0
      test_y_channel: true
    ssim:
      type: calculate_ssim
      crop_border: 0
      test_y_channel: true
    lpips:
      type: calculate_lpips
      crop_border: 0
      test_y_channel: false
# Local config folders
config/tt
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class
/tmp
*/latex
*/html
# C extensions
*.so
data_generator/voxel_generation/build
# Distribution / packaging
.Python
env/
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib64/
parts/
sdist/
var/
wheels/
*.egg-info/
.installed.cfg
*.egg
# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec
# Installer logs
pip-log.txt
pip-delete-this-directory.txt
# Unit test / coverage reports
htmlcov/
.tox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
.hypothesis/
# Translations
*.mo
*.pot
# Django stuff:
*.log
local_settings.py
# Flask stuff:
instance/
.webassets-cache
# Scrapy stuff:
.scrapy
# Sphinx documentation
docs/_build/
# PyBuilder
target/
# Jupyter Notebook
.ipynb_checkpoints
# pyenv
.python-version
# celery beat schedule file
celerybeat-schedule
# SageMath parsed files
*.sage.py
# dotenv
.env
# virtualenv
.venv
venv/
ENV/
# Spyder project settings
.spyderproject
.spyproject
# Rope project settings
.ropeproject
# mkdocs documentation
/site
# mypy
.mypy_cache/
# input data, saved log, checkpoints
data/
input/
saved/
datasets/
# editor, os cache directory
.vscode/
.idea/
__MACOSX/
# outputs
*.jpg
*.jpeg
*.h5
*.swp
# dirs
/configs/r2
MIT License
Copyright (c) 2020 Timo Stoffregen
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
# event_utils
Event based vision utility library. For additional detail, see the thesis document [Motion Estimation by Focus Optimisation: Optic Flow and Motion Segmentation with Event Cameras](https://timostoff.github.io/thesis). If you use this code in an academic context, please cite:
```
@PhDThesis{Stoffregen20Thesis,
author = {Timo Stoffregen},
title = {Motion Estimation by Focus Optimisation: Optic Flow and Motion Segmentation with Event Cameras},
school = {Department of Electrical and Computer Systems Engineering, Monash University},
year = 2020
}
```
This is an event based vision utility library with functionality for focus optimisation, deep learning, event-stream noise augmentation, data format conversion and efficient generation of various event representations (event images, voxel grids etc).
The library is implemented in Python. Nevertheless, the library is efficient and fast, since almost all of the hard work is done using vectorisation or numpy/pytorch functions. All functionality is implemented in numpy _and_ pytorch, so that on-GPU processing for hardware accelerated performance is very easy.
The library is divided into eight sub-libraries:
```
└── lib
├── augmentation
├── contrast_max
├── data_formats
├── data_loaders
├── representations
├── transforms
├── util
└── visualization
```
## augmentation
While the `data_loaders` learning library contains some code for tensor augmentation (such as adding Gaussian noise, rotations, flips, random crops etc), the augmentation library allows for these operations to occur on the raw events.
This functionality is contained within `event_augmentation.py`.
### `event_augmentation.py`
The following augmentations are available:
* `add_random_events`: Generates N new random events, drawn from a uniform distribution over the size of the spatiotemporal volume.
* `remove_events`: Makes the event stream more sparse, by removing a random selection of N events from the original event stream.
* `add_correlated_events`: Makes the event stream more dense by adding N new events around the existing events.
Each original event is fitted with a Gaussian bubble with standard deviation `sigma_xy` in the `x,y` dimension and `sigma_t` in the `t` dimension.
New events are drawn from these distributions.
Note that this also 'blurs' the event stream.
* `flip_events_x`: Flip events over x axis.
* `flip_events_y`: Flip events over y axis.
* `crop_events`: Spatially crop events either randomly, to a desired amount and either from the origin or as a center crop.
* `rotate_events`: Rotate events by angle `theta` around a center of rotation `a,b`.
Events can then optionally be cropped in the case that they overflow the sensor resolution.
Some possible augmentations are shown below:
Since the augmentations are implemented using vectorisation, the heavy lifting is done in optimised C/C++ backends and is thus very fast.
![Augmentation examples](https://github.com/TimoStoff/event_utils/blob/master/.images/augmentation.png)
Some examples of augmentations on the `slider_depth` sequence from the [event camera dataset](http://rpg.ifi.uzh.ch/davis_data.html) can be seen above (events in red and blue with the first events in black to show scene structure). (a) the original event stream, (b) doubling the events by adding random _correlated_ events, (c) doubling the events by adding fully random (normal distribution) events, (d) halving the events by removing random, (e) flipping the events horizontally, (f) rotating the events 45 degrees. Demo code to reproduce these plots can be found by executing the following (note that the events need to be in HDF5 format):
```python lib/augmentation/event_augmentation.py /path/to/slider_depth.h5 --output_path /tmp```
## contrast_max
The focus optimisation library contains code that allows the user to perform focus optimisation on events.
The important files of this library are:
### `events_cmax.py`
This file contains code to perform focus optimisation.
The most important functionality is provided by:
* `grid_search_optimisation`: Performs the grid search optimisation from [SOFAS algorithm](https://arxiv.org/abs/1805.12326).
* `optimize`: Performs gradient based focus optimisation on the input events, given an objective function and motion model.
* `grid_cmax`: Given a set of events, splits the image plane into ROI of size `roi_size`.
Performs focus optimisation on each ROI separately.
* `segmentation_mask_from_d_iwe`: Retrieve a segmentation mask for the events based on dIWE/dWarpParams.
* `draw_objective_function`: Draw the objective function for a given set of events, motion model and objective function.
Produces plots as in below image.
* `main`: Demo showing various capabilities and code examples.
![Focus Optimisation](https://github.com/TimoStoff/event_utils/blob/master/.images/cmax.png)
Examples can be seen in the images above: each set of events is drawn with the variance objective function (w.r.t. optic flow motion model) underneath. This set of tools allows optimising the objective function to recover the motion parameters (images generated with the library).
### `objectives.py`
This file implements various objective functions described in this thesis as well as some other commonly cited works.
Objective functions inherit from the parent class `objective_function`.
The idea is to make it as easy as possible to add new, custom objective functions by providing a common API for the optimisation code.
This class has several members that require initialisation:
* `name`: The name of the objective function (eg `variance`).
* `use_polarity`: Whether to use the polarity of the events in generating IWEs.
* `has_derivative`: Whether this objective has an analytical derivative w.r.t. warp parameters.
* `default_blur`: What `sigma` should be default for blurring.
* `adaptive_lifespan`: An innovative feature to deal with linearisation errors.
Many implementations of contrast maximisation use assumptions of linear motion w.r.t. the chosen motion model.
A given estimate of the motion parameters implies a lifespan of the events.
If `adaptive_lifespan` is True, the number of events used during warping is cut to that lifespan for each optimisation step, computed using `pixel_crossings`.
For example, if the motion model is optic flow velocity, the estimate is 12 pixels/second and `pixel_crossings`=3, then the lifespan will be 3/12=0.25s.
* `pixel_crossings`: Number of pixel crossings used to calculate lifespan.
* `minimum_events`: The minimal number of events that the lifespan can cut to.
The required function that inheriting classes need to implement are:
* `evaluate_function`: Evaluate the objective function for given parameters, events etc.
* `evaluate_gradient`: Evaluate the objective function and the gradient of the objective function w.r.t. motion parameters for given parameters, events etc.
The objective functions implemented in this file are:
* `variance_objective`: Variance objective (see [Accurate Angular Velocity Estimation with an Event Camera](https://www.zora.uzh.ch/id/eprint/138896/1/RAL16_Gallego.pdf)).
* `rms_objective`: Root Mean Squared objective.
* `sos_objective`: See [Event Cameras, Contrast Maximization and Reward Functions: An Analysis](https://openaccess.thecvf.com/content_CVPR_2019/html/Stoffregen_Event_Cameras_Contrast_Maximization_and_Reward_Functions_An_Analysis_CVPR_2019_paper.html)
* `soe_objective`: See [Event Cameras, Contrast Maximization and Reward Functions: An Analysis](https://openaccess.thecvf.com/content_CVPR_2019/html/Stoffregen_Event_Cameras_Contrast_Maximization_and_Reward_Functions_An_Analysis_CVPR_2019_paper.html)
* `moa_objective`: See [Event Cameras, Contrast Maximization and Reward Functions: An Analysis](https://openaccess.thecvf.com/content_CVPR_2019/html/Stoffregen_Event_Cameras_Contrast_Maximization_and_Reward_Functions_An_Analysis_CVPR_2019_paper.html)
* `soa_objective`: See [Event Cameras, Contrast Maximization and Reward Functions: An Analysis](https://openaccess.thecvf.com/content_CVPR_2019/html/Stoffregen_Event_Cameras_Contrast_Maximization_and_Reward_Functions_An_Analysis_CVPR_2019_paper.html)
* `sosa_objective`: See [Event Cameras, Contrast Maximization and Reward Functions: An Analysis](https://openaccess.thecvf.com/content_CVPR_2019/html/Stoffregen_Event_Cameras_Contrast_Maximization_and_Reward_Functions_An_Analysis_CVPR_2019_paper.html)
* `zhu_timestamp_objective`: Objective function defined in [Unsupervised event-based learning of optical flow, depth, and egomotion](https://openaccess.thecvf.com/content_CVPR_2019/papers/Zhu_Unsupervised_Event-Based_Learning_of_Optical_Flow_Depth_and_Egomotion_CVPR_2019_paper.pdf).
* `r1_objective`: Combined objective function R1 [Event Cameras, Contrast Maximization and Reward Functions: An Analysis](https://openaccess.thecvf.com/content_CVPR_2019/html/Stoffregen_Event_Cameras_Contrast_Maximization_and_Reward_Functions_An_Analysis_CVPR_2019_paper.html)
* `r2_objective`: Combined objective function R2 [Event Cameras, Contrast Maximization and Reward Functions: An Analysis](https://openaccess.thecvf.com/content_CVPR_2019/html/Stoffregen_Event_Cameras_Contrast_Maximization_and_Reward_Functions_An_Analysis_CVPR_2019_paper.html)
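For reference, the variance objective is simply the variance of the image of warped events (IWE); a numpy sketch of the idea (not the library's `evaluate_function` API) is:
```python
import numpy as np

def variance_objective_sketch(iwe):
    # Contrast of the image of warped events: a sharper IWE has higher variance.
    return np.var(iwe)
```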
### `warps.py`
This file implements warping functions described in this thesis as well as some other commonly cited works.
Objective functions inherit from the parent class `warp_function`.
The idea is to make it as easy as possible to add new, custom warping functions by providing a common API for the optimisation code.
Initialisation requires setting member variables:
* `name`: Name of the warping function, eg `optic_flow`.
* `dims`: DoF of the warping function.
The only function that needs to be implemented by inheriting classes is `warp`, which takes events, a reference time and motion parameters as input.
The function then returns a list of the warped event coordinates as well as the Jacobian of each event w.r.t. the motion parameters.
Warp functions currently implemented are:
* `linvel_warp`: 2-DoF optic flow warp.
* `xyztheta_warp`: 4-DoF warping function from [Event-based moving object detection and tracking](https://arxiv.org/abs/1803.04523) (`x,y,z`) velocity and angular velocity `theta` around the origin).
* `pure_rotation_warp`: 3-DoF pure rotation warp (`x,y,theta` where `x,y` are the center of rotation and `theta` is the angular velocity).
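As a concrete illustration of the 2-DoF case, the standard linear-velocity warp can be written in a few lines of numpy (a sketch of the usual formulation, not the library's exact `linvel_warp` signature):
```python
import numpy as np

def linvel_warp_sketch(xs, ys, ts, t_ref, params):
    """Warp events to t_ref under a constant optic-flow velocity (vx, vy)."""
    vx, vy = params
    dt = ts - t_ref
    xs_warped = xs - vx * dt
    ys_warped = ys - vy * dt
    # d(x')/d(vx) = -dt and d(y')/d(vy) = -dt give the per-event Jacobian entries.
    jacobian = np.stack([-dt, -dt], axis=1)
    return xs_warped, ys_warped, jacobian
```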
## `data_formats`
The `data_formats` provides code for converting events in one file format to another.
Even though many candidates have appeared over the years (rosbag, AEDAT, .txt, `hdf5`, pickle, cuneiform clay tablets, just to name a few), a universal storage option for event based data has not yet crystallised.
Some of these data formats are particularly useful within particular operating systems or programming languages.
For example, rosbags are the natural choice for C++ programming with the `ros` environment.
Since they also store data in an efficient binary format, they have become a very common storage option.
However, they are notoriously slow and impractical to process in Python, which has become the de-facto deep-learning language and is commonly used in research due to the rapid development cycle.
More practical (and importantly, fast) options are the `hdf5` and numpy memmap formats.
`hdf5` is a more compact and easily accessible format, since it allows for easy grouping and metadata allocation; however, its difficulty in setting up multi-threaded access and subsequent buggy behaviour (even in read-only applications) means that memmap is more common for deep learning, where multi-threaded data-loaders can significantly speed up training.
### `event_packagers.py`
The `data_formats` library provides a `packager` abstract base class, which defines what a `packager` needs to do.
`packager` objects receive data (events, frames etc) and write them to the desired file format (e.g. `hdf5`).
Converting file formats is now much easier, since input files need only be parsed and the data sent to the `packager` with the appropriate function calls.
The functions that need to be implemented are:
* `package_events` A function which given events, writes them to the file/buffer.
* `package_image` A function which given images, writes them to the file/buffer.
* `package_flow` A function which given optic flow frames, writes them to the file/buffer.
* `add_metadata` Writes metadata to the file (number of events, number of negative/positive events, duration of sequence, start time, end time, number of images, number of optic flow frames).
* `set_data_available` What data is available and needs to be written (ie events, frames, optic flow).
A `packager` for `hdf5` and memmap is implemented.
### `h5_to_memmap.py` and `rosbag_to_h5.py`
The library implements two converters, one for `hdf5` to memmap and one for rosbag to `hdf5`.
These can be easily called from the command line with various options that can be found in the documentation.
### `add_hdf5_attribute.py`
`add_hdf5_attribute.py` allows the user to add or modify attributes to existing `hdf5` files.
Attributes are the manner in which metadata is saved in `hdf5` files.
### `read_events.py`
`read_events.py` contains functions for reading events from `hdf5` and memmap.
The functions are:
* `read_memmap_events`.
* `read_h5_events`.
## `data_loaders`
The deep learning code can be found in the `data_loaders` library.
It contains code for loading events and transforming them into voxel grids in an efficient manner, as well as code for data augmentation.
The actual networks and cost functions described in this thesis are not implemented in this library but at the project page for the respective paper.
`data_loaders` provides a highly versatile `pytorch` dataloader, which can be used across various storage formats for events (.txt, `hdf5`, memmap etc).
As a result, it is very easy to implement a new dataloader for a different storage format.
The output of the dataloader was originally voxel grids of the events, but it can be used just as well to output batched events, thanks to a custom `pytorch` collation function.
As a result, the dataloader is useful for any situation in which it is desirable to iterate over the events in a storage medium, not only for deep learning.
For instance, if one wants to iterate over the events that lie between all the frames of a `davis` sequence, the following code is sufficient:
```
dloader = DynamicH5Dataset(path_to_events_file)
for item in dloader:
    print(item['events'].shape)
```
### `base_dataset.py`
This file defines the base dataset class (`BaseVoxelDataset`), which defines all batching, augmentation, collation and housekeeping code.
Inheriting classes (one per data format) need only to implement the abstract functions for providing events, frames and other data from storage.
These abstract functions are:
* `get_frame(self, index)` Given an index `n`, return the `n`th frame.
* `get_flow(self, index)` Given an index `n`, return the `n`th optic flow frame.
* `get_events(self, idx0, idx1)` Given a start and end index `idx0` and `idx1`, return all events between those indices.
* `load_data(self, data_path)` Function which is called once during initialisation, which creates handles to files and sets several class attributes (number of frames, events etc).
* `find_ts_index(self, timestamp)` Given a timestamp, get the index of the nearest event.
* `ts(self, index)` Given an event index, return the timestamp of that event.
The function `load_data` must set the following member variables:
* `self.sensor_resolution` Event sensor resolution.
* `self.has_flow` Whether or not the data has optic flow frames.
* `self.t0` The start timestamp of the events.
* `self.tk` The end timestamp of the events.
* `self.num_events` The number of events in the dataset.
* `self.frame_ts` The timestamps of the time-synchronised frames.
* `self.num_frames` The number of frames in the dataset.
The constructor of the class takes the following arguments:
* `data_path` Path to the file containing the event/image data.
* `transforms` Python dict containing the desired augmentations.
* `sensor_resolution` The size of the image sensor.
* `num_bins` The number of bins desired in the voxel grid.
* `voxel_method` Which method should be used to form the voxels.
* `max_length` If desired, the length of the dataset can be capped to `max_length` batches.
* `combined_voxel_channels` If True, produces one voxel grid for all events, if False, produces separate voxel grids for positive and negative channels.
* `return_events` If true, returns events in output dict.
* `return_voxelgrid` If true, returns voxel grid in output dict.
* `return_frame` If true, returns frames in output dict.
* `return_prev_frame` If true, returns previous batch's frame to current frame in output dict.
* `return_flow` If true, returns optic flow in output dict.
* `return_prev_flow` If true, returns previous batch's optic flow to current optic flow in output dict.
* `return_format` Which output format to use (options=`'numpy'` and `'torch'`).
The parameter `voxel_method` defines how the data is to be batched.
For instance, one might wish to have data returned in windows `t` seconds wide, or to always get all data between successive `aps` frames.
The method is given as a dict, as some methods have additional parametrisations.
The current options are:
* `k_events` Data is returned every `k` events.
The dict is given in the format `method = {'method': 'k_events', 'k': value_for_k, 'sliding_window_w': value_for_sliding_window}`.
The parameter `sliding_window_w` defines by how many events each batch overlaps.
* `t_seconds` Data is returned every `t` seconds.
The dict is given in the format `method = {'method': 't_seconds', 't': value_for_t, 'sliding_window_t': value_for_sliding_window}`.
The parameter `sliding_window_t` defines by how many seconds each batch overlaps.
* `between_frames` All data between successive frames is returned.
Requires time-synchronised frames to exist.
The dict is given in the format `method={'method':'between_frames'}`.
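For example, to iterate over 50 ms windows that overlap by 25 ms, a method dict of the `t_seconds` form can be passed to the dataset constructor (a sketch based on the parameter list above; the exact keyword names are assumptions):
```python
voxel_method = {'method': 't_seconds', 't': 0.05, 'sliding_window_t': 0.025}
dloader = DynamicH5Dataset(path_to_events_file, voxel_method=voxel_method, num_bins=5)
for item in dloader:
    print(item['events'].shape)
```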
Generating the voxel grids can be done very efficiently and on the `gpu` (if the events have been loaded there) using the `pytorch` function `target.index_put_(index, value, accumulate=True)`.
This function puts values from `value` into `target` using the indices specified in `indices` using highly optimised C++ code in the background.
`accumulate` specifies if values in `value` which get put in the same location on `target` should sum (accumulate) or overwrite one another.
In summary, `BaseVoxelDataset` allows for very fast, on-device data-loading and on-the-fly voxel grid generation.
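A toy illustration of the accumulation trick, independent of the dataset class (nearest-pixel assignment only; the library additionally interpolates):
```python
import torch

H, W = 4, 6
xs = torch.tensor([1, 2, 2, 5])       # event x coordinates
ys = torch.tensor([0, 3, 3, 1])       # event y coordinates
ps = torch.tensor([1., -1., 1., 1.])  # event polarities
img = torch.zeros(H * W)
img.index_put_((xs + ys * W,), ps, accumulate=True)  # events at the same pixel sum up
img = img.view(H, W)
```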
## `representations`
This library contains code for generating representations from the events in a highly efficient, `gpu` ready manner.
![Representations](https://github.com/TimoStoff/event_utils/blob/master/.images/representations.png)
Various representations can be seen above with (a) the raw events, (b) the voxel grid, (c) the event image, (d) the timestamp image.
### `voxel_grid.py`
This file contains several means for forming and viewing voxel grids from events.
There are two versions of each function, representing a pure `numpy` and a `pytorch` implementation.
The `pytorch` implementation is necessary for `gpu` processing, however it is not as commonly used as `numpy`, which is so frequently used as to barely be a dependency any more.
Functions for `pytorch` are:
* `voxel_grids_fixed_n_torch` Given a set of `n` events, return a voxel grid with `B` bins and with a fixed number of events.
* `voxel_grids_fixed_t_torch` Given a set of events and a duration `t`, return a voxel grid with `B` bins and with a fixed temporal width `t`.
* `events_to_voxel_timesync_torch` Given a set of events and two times `t_0` and `t_1`, return a voxel grid with `B` bins from the events between `t_0` and `t_1`.
* `events_to_voxel_torch` Given a set of events, return a voxel grid with `B` bins from those events.
* `events_to_neg_pos_voxel_torch` Given a set of events, return a voxel grid with `B` bins from those events.
Positive and negative events are formed into two separate voxel grids.
Functions for `numpy` are:
* `events_to_voxel` Given a set of events, return a voxel grid with `B` bins from those events.
* `events_to_neg_pos_voxel` Given a set of events, return a voxel grid with `B` bins from those events.
Positive and negative events are formed into two separate voxel grids.
Additionally:
* `get_voxel_grid_as_image` Returns a voxel grid as a series of images, one for each bin, for display.
* `plot_voxel_grid` Given a voxel grid, display it as an image.
Voxel grids can be formed both using spatial and temporal interpolation between the bins.
### `image.py`
`image.py` contains code for forming images from events in an efficient manner.
The functions allow for forming images with both discrete and floating point events using bilinear interpolation.
Images currently supported are event images and timestamp images using either `numpy` or `pytorch`.
Functions are:
* `events_to_image` Form an image from events using `numpy`.
Allows for bilinear interpolation while assigning events to pixels and padding of the image or clipping of events for events which fall outside of the range.
* `events_to_image_torch` Form an image from events using `pytorch`.
Allows for bilinear interpolation while assigning events to pixels and padding of the image or clipping of events for events which fall outside of the range.
* `image_to_event_weights` Given an image and a set of event coordinates, get the pixel value of the image for each event using reverse bilinear interpolation.
* `events_to_image_drv` Form an image from events and the derivative images from the event Jacobians (with options for padding the image or clipping out-of-range events).
Of particular use for `cmax`, where motion models with analytic gradients are known.
* `events_to_timestamp_image` Method to generate the average timestamp images from [Unsupervised event-based learning of optical flow, depth, and egomotion](https://openaccess.thecvf.com/content_CVPR_2019/papers/Zhu_Unsupervised_Event-Based_Learning_of_Optical_Flow_Depth_and_Egomotion_CVPR_2019_paper.pdf) using `numpy`.
Returns two images, one for negative and one for positive events.
* `events_to_timestamp_image_torch` Method to generate the average timestamp images from [Unsupervised event-based learning of optical flow, depth, and egomotion](https://openaccess.thecvf.com/content_CVPR_2019/papers/Zhu_Unsupervised_Event-Based_Learning_of_Optical_Flow_Depth_and_Egomotion_CVPR_2019_paper.pdf) using `pytorch`.
Returns two images, one for negative and one for positive events.
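A minimal numpy sketch of bilinear voting into an event image, in the spirit of `events_to_image` (not the library's exact implementation; out-of-range votes are discarded):
```python
import numpy as np

def event_image_bilinear(xs, ys, ps, sensor_size):
    H, W = sensor_size
    img = np.zeros((H, W), dtype=np.float64)
    x0, y0 = np.floor(xs).astype(int), np.floor(ys).astype(int)
    dx, dy = xs - x0, ys - y0
    # Distribute each event's polarity over its four neighbouring pixels.
    for ox, oy, w in [(0, 0, (1 - dx) * (1 - dy)), (1, 0, dx * (1 - dy)),
                      (0, 1, (1 - dx) * dy), (1, 1, dx * dy)]:
        xi, yi = x0 + ox, y0 + oy
        mask = (xi >= 0) & (xi < W) & (yi >= 0) & (yi < H)
        np.add.at(img, (yi[mask], xi[mask]), ps[mask] * w[mask])
    return img
```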
## `util`
This library contains some utility functions used in the rest of the library.
Functions include:
* `infer_resolution` Given events, guess the resolution by looking at the max and min values.
* `events_bounds_mask` Get a mask of the events that are within given bounds.
* `clip_events_to_bounds` Clip events to the given bounds.
* `cut_events_to_lifespan` Given motion model parameters, compute the speed and thus the lifespan, given a desired number of pixel crossings.
* `get_events_from_mask` Given an image mask, return the indices of all events at each location in the mask.
* `binary_search_h5_dset` Binary search for a timestamp in an `hdf5` event file, without loading the entire file into RAM.
* `binary_search_torch_tensor` Binary search implemented for `pytorch` tensors (no native implementation exists).
* `remove_hot_pixels` Given a set of events, removes the 'hot' pixel events. Accumulates all of the events into an event image and removes the `num_hot` highest value pixels.
* `optimal_crop_size` Find the optimal crop size for a given `max_size` and `subsample_factor`. The optimal crop size is the smallest integer which is greater or equal than `max_size`, while being divisible by 2^`max_subsample_factor`.
* `plot_image_grid` Given a list of images, stitch them into a grid and display/save the grid.
* `flow2bgr_np` Turn optic flow into an RGB image.
## `visualisation`
The `visualization` library contains methods for generating figures and movies from events.
The majority of figures shown in the thesis were generated using this library.
Two rendering backends are available, the commonly used `matplotlib` plotting library and `mayavi`, which is a VTK based graphics library.
The API for both of these is essentially the same, the main difference being the dependency on `matplotlib` or `mayavi`.
`matplotlib` is very easy to set up but quite slow; `mayavi` is very fast but more difficult to set up and debug.
I will describe the `matplotlib` version here, although all functionality exists in the `mayavi` version too (see the code documentation for details).
### `draw_event_stream.py`
The core work is done in this file, which contains code for visualising events and voxel grids.
The function for plotting events is `plot_events`.
Input parameters for this function are:
* `xs` x coords of events.
* `ys` y coords of events.
* `ts` t coords of events.
* `ps` p coords of events.
* `save_path` If set, the plot will be saved to this path.
* `num_compress` Takes `num_compress` events from the beginning of the sequence and draws them in the plot at time `t=0` in black.
This aids visibility (see the augmentation examples).
* `compress_front` If True, display the compressed events in black at the front of the spatiotemporal volume rather than the back.
* `num_show` Sets the number of events to plot.
If set to -1, all of the events are plotted (potentially expensive).
Otherwise, events are skipped in order to achieve the desired number.
* `event_size` Sets the size of the plotted events.
* `elev` Sets the elevation of the plot.
* `azim` Sets the azimuth of the plot.
* `imgs` A list of images to draw into the spatiotemporal volume.
* `img_ts` A list of the position on the temporal axis where each image from `imgs` is to be placed.
* `show_events` If False, will not plot the events (only images).
* `show_plot` If True, display the plot in a `matplotlib` window as well as saving to disk.
* `crop` A crop, if desired, of the events and images to be plotted.
* `marker` Which marker should be used to display the events (default is '.', which results in points, but circles 'o' or crosses 'x' are among many other possible options).
* `stride` Determines the pixel stride of the image rendering (1=full resolution, but can be quite resource intensive).
* `invert` Inverts the colour scheme for black backgrounds.
* `img_size` The size of the sensor resolution. Inferred if empty.
* `show_axes` If True, draw axes onto the plot.
The analogous function for plotting voxel grids takes the following parameters:
* `xs` x coords of events.
* `ys` y coords of events.
* `ts` t coords of events.
* `ps` p coords of events.
* `bins` The number of bins to have in the voxel grid.
* `frames` A list of images to draw into the plot with the voxel grid.
* `frame_ts` A list of the position on the temporal axis where each image from `frames` is to be placed.
* `sensor_size` Event sensor resolution.
* `crop` A crop, if desired, of the events and images to be plotted.
* `elev` Sets the elevation of the plot.
* `azim` Sets the azimuth of the plot.
To plot successive frames in order to generate video, the function `plot_events_sliding` can be used.
Essentially, this function renders a sliding window of the events, for either the event or voxel visualisation modes.
Similarly, `plot_between_frames` can be used to render all events between frames, with the option to skip every `n`th event.
To generate such plots from the command line, the library provides the scripts:
* `visualize_events.py`
* `visualize_voxel.py`
* `visualize_flow.py`
These provide a range of documented command-line arguments with sensible defaults, from which plots of the events, voxel grids and events with optic flow overlaid can be generated.
For example,
```python visualize_events.py /path/to/slider_depth.h5```
produces plots of the `slider_depth` sequence.
Invoking:
```python visualize_voxel.py /path/to/slider_depth.h5```
produces voxels of the `slider_depth` sequence.
![Visualisation](https://github.com/TimoStoff/event_utils/blob/master/.images/visualisations.png)
Typical visualisations are shown above: the `slider_depth` sequence is drawn as successive frames of events (top) and voxels (bottom).