Commit b952e97b authored by chenych

First Commit.
*.mdb
*.pth
*.tar
*.sh
*.txt
*.ipynb
*.zip
*.eps
*.pdf
### Linux ###
*~
# temporary files which can be created if a process still has a handle open of a deleted file
.fuse_hidden*
# KDE directory preferences
.directory
# Linux trash folder which might appear on any partition or disk
.Trash-*
# .nfs files are created when an open file is removed but is still being accessed
.nfs*
### OSX ###
# General
.DS_Store
.AppleDouble
.LSOverride
# Icon must end with two \r
Icon
# Thumbnails
._*
# Files that might appear in the root of a volume
.DocumentRevisions-V100
.fseventsd
.Spotlight-V100
.TemporaryItems
.Trashes
.VolumeIcon.icns
.com.apple.timemachine.donotpresent
# Directories potentially created on remote AFP share
.AppleDB
.AppleDesktop
Network Trash Folder
Temporary Items
.apdisk
### Python ###
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class
# C extensions
*.so
# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib64/
parts/
sdist/
var/
wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST
# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec
# Installer logs
pip-log.txt
pip-delete-this-directory.txt
# Unit test / coverage reports
htmlcov/
.tox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
.hypothesis/
.pytest_cache/
# Translations
*.mo
*.pot
# Django stuff:
*.log
local_settings.py
db.sqlite3
# Flask stuff:
instance/
.webassets-cache
# Scrapy stuff:
.scrapy
# Sphinx documentation
docs/_build/
# PyBuilder
target/
# Jupyter Notebook
.ipynb_checkpoints
# IPython
profile_default/
ipython_config.py
# pyenv
.python-version
# celery beat schedule file
celerybeat-schedule
# SageMath parsed files
*.sage.py
# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/
# Spyder project settings
.spyderproject
.spyproject
# Rope project settings
.ropeproject
# mkdocs documentation
/site
# mypy
.mypy_cache/
.dmypy.json
dmypy.json
### Python Patch ###
.venv/
### Python.VirtualEnv Stack ###
# Virtualenv
# http://iamzed.com/2009/05/07/a-primer-on-virtualenv/
[Bb]in
[Ii]nclude
[Ll]ib64
[Ll]ocal
[Ss]cripts
pyvenv.cfg
pip-selfcheck.json
### Windows ###
# Windows thumbnail cache files
Thumbs.db
ehthumbs.db
ehthumbs_vista.db
# Dump file
*.stackdump
# Folder config file
[Dd]esktop.ini
# Recycle Bin used on file shares
$RECYCLE.BIN/
# Windows Installer files
*.cab
*.msi
*.msix
*.msm
*.msp
# Windows shortcuts
*.lnk
.idea/
.vscode/
output/
exp/
data/
*.pyc
*.mp4
*.zip
# CenterFace
## Paper
[CenterFace: Joint Face Detection and Alignment Using Face as Point](https://arxiv.org/abs/1911.03599)
## Model Architecture
CenterFace is a face detection algorithm that uses the lightweight MobileNetV2 network as its backbone, combined with a Feature Pyramid Network (FPN), to perform anchor-free face detection.
![Architecture of the CenterFace](<Architecture of the CenterFace.png>)
## Algorithm
CenterFace is a single-stage face detector. Following the idea of CenterNet, it casts face detection as a center-point estimation problem: the face center is located on a heatmap, and the box size and five facial landmarks are regressed from that center point.
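As a rough illustration of this idea, the sketch below decodes one face from hypothetical network outputs. It is illustrative only, not the repository's actual decode code (that lives under `src/lib` and in the TensorRT engine code further below); the tensor layout and the exact parameterization of the size/landmark heads are assumptions.
```
def decode_one_face(heatmap, wh, offset, lm, y, x, down_ratio=4):
    """Illustrative only: recover a box and 5 landmarks from a peak (y, x)
    on the center heatmap. heatmap: (H, W); wh, offset: (2, H, W); lm: (10, H, W)."""
    cx = (x + offset[0, y, x]) * down_ratio      # refined center, in input-image scale
    cy = (y + offset[1, y, x]) * down_ratio
    w = wh[0, y, x] * down_ratio                 # box size regressed at the center
    h = wh[1, y, x] * down_ratio
    box = [cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2]
    # the five landmarks are regressed as offsets from the face center
    points = [(cx + lm[2 * k, y, x] * down_ratio,
               cy + lm[2 * k + 1, y, x] * down_ratio) for k in range(5)]
    return heatmap[y, x], box, points
```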
## Environment Setup
### Docker (Option 1)
```
docker pull image.sourcefind.cn:5000/dcu/admin/base/pytorch:1.13.1-centos7.6-dtk-23.04.1-py38-latest
docker run -it -v /path/your_code_data/:/path/your_code_data/ --shm-size=32G --privileged=true --device=/dev/kfd --device=/dev/dri/ --group-add video --name docker_name imageID bash
cd /path/workspace/
pip3 install -r requirements.txt
```
### Dockerfile (Option 2)
```
cd ./docker
docker build --no-cache -t centerface:latest .
docker run -it -v /path/your_code_data/:/path/your_code_data/ --shm-size=32G --privileged=true --device=/dev/kfd --device=/dev/dri/ --group-add video --name docker_name centerface:latest bash
```
### Anaconda (Option 3)
1. The DCU-specific deep learning libraries required by this project can be downloaded from the Guanghe developer community: https://developer.hpccube.com/tool/
```
DTK software stack: dtk23.04.1
python:python3.8
torch:1.13.1
torchvision:0.14.1
```
`Tip: the versions of the DTK stack, Python, PyTorch and other DCU-related tools listed above must match each other exactly.`
2. Other, non-DCU-specific libraries can be installed directly from requirements.txt:
```
pip3 install -r requirements.txt
```
## Dataset
[WIDER_FACE](http://shuoyang1213.me/WIDERFACE/index.html)
![datasets](datasets.png)
Download the three packages marked by the red boxes in the image above and extract them, or download them directly via the links below:
[WIDER Face Training Images(Tencent Drive)](https://share.weiyun.com/5WjCBWV)
[WIDER Face Validation Images(Tencent Drive)](https://share.weiyun.com/5ot9Qv1)
[WIDER Face Testing Images(Tencent Drive)](https://share.weiyun.com/5vSUomP)
The annotation files use the COCO format and can be downloaded from Baidu Netdisk:
[Baidu](https://pan.baidu.com/s/1j_2wggZ3bvCuOAfZvjWqTg) (extraction code: f9hh)
After everything is extracted, the dataset directory structure is as follows:
```
├── WIDER_train
│ ├── images
├── WIDER_test
│ ├── images
├── WIDER_val
│ ├── images
├── annotations
│ ├── train_wider_face.json
│ ├── val_wider_face.json
```
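The COCO-style annotation files can be sanity-checked with a few lines of Python. This is a minimal sketch; it only assumes the standard COCO top-level keys `images` and `annotations`:
```
import json

# Quick sanity check of the COCO-format WIDER FACE annotations.
with open('annotations/train_wider_face.json') as f:
    coco = json.load(f)

print('images:', len(coco['images']))
print('annotations (faces):', len(coco['annotations']))
print('first image record:', coco['images'][0])
```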
## Training
### Single node, single GPU
```
cd ./src
bash train.sh
```
### Single node, multiple GPUs
```
cd ./src
bash train_multi.sh
```
## Inference
#### Single-GPU inference
```
cd ./src
python test_wider_face.py
```
## Results
![Result](draw_img.jpg)
### Accuracy
Results on the WIDER FACE validation set:
| Method | Easy | Medium | Hard|
|:--------:| :--------:| :---------:| :------:|
| ours(one scale) | 0.9264 | 0.9133 | 0.7479 |
| original | 0.922 | 0.911 | 0.782|
## Application Scenarios
### Algorithm Category
Face recognition
### Key Application Industries
Education, transportation, public security, healthcare
## Pretrained Weights
./models/model_best.pth
## Source Repository and Issue Reporting
https://developer.hpccube.com/codes/modelzoo/centerface-pytorch
## References
https://github.com/chenjun2hao/CenterFace.pytorch
import os.path as osp
import sys
def add_path(path):
if path not in sys.path:
sys.path.insert(0, path)
father_dir = osp.dirname(osp.dirname(__file__))
# Add lib to PYTHONPATH
lib_path = osp.join(father_dir, 'src', 'lib')
add_path(lib_path)
import os
import sys
import cv2
import numpy as np
import torch
import torch.nn as nn
# from config import cfg, update_config
from models.utils import _gather_feat, _tranpose_and_gather_feat
from tensorrt_model import TRTModel
from utils.image import get_affine_transform, transform_preds
class CenterNetTensorRTEngine(object):
def __init__(self, config, weight_file):
# update_config(cfg, config_file)
self.cfg = config
self.trtmodel = TRTModel(weight_file)
def preprocess(self, image, scale=1, meta=None):
height, width = image.shape[0:2]
new_height = int(height * scale)
new_width = int(width * scale)
self.mean = np.array(self.cfg.mean, dtype=np.float32).reshape(1, 1, 3)
self.std = np.array(self.cfg.std, dtype=np.float32).reshape(1, 1, 3)
if self.cfg.fix_res:
inp_height, inp_width = self.cfg.input_h, self.cfg.input_w
c = np.array([new_width / 2., new_height / 2.], dtype=np.float32)
s = max(height, width) * 1.0
else:
inp_height = (new_height | self.cfg.pad) + 1
inp_width = (new_width | self.cfg.pad) + 1
c = np.array([new_width // 2, new_height // 2], dtype=np.float32)
s = np.array([inp_width, inp_height], dtype=np.float32)
trans_input = get_affine_transform(c, s, 0, [inp_width, inp_height])
resized_image = cv2.resize(image, (new_width, new_height))
inp_image = cv2.warpAffine(
resized_image, trans_input, (inp_width, inp_height), flags=cv2.INTER_LINEAR)
inp_image = ((inp_image / 255. - self.mean) / self.std).astype(np.float32)
images = inp_image.transpose(2, 0, 1).reshape(1, 3, inp_height, inp_width)
if self.cfg.flip_test:
images = np.concatenate((images, images[:, :, :, ::-1]), axis=0)
meta = {'c': c, 's': s,
'out_height': inp_height // self.cfg.down_ratio,
'out_width': inp_width // self.cfg.down_ratio}
return np.ascontiguousarray(images), meta
def run(self, imgs):
        images, meta = self.preprocess(imgs)  # preprocess the image
trt_output = self.trtmodel(images) # tensorrt inference
predictions = self.postprocess(trt_output, meta)
return predictions
def _nms(self, heat, kernel=3):
pad = (kernel - 1) // 2
hmax = nn.functional.max_pool2d(
heat, (kernel, kernel), stride=1, padding=pad)
keep = (hmax == heat).float()
return heat * keep
def _topk(self, scores, K=40):
batch, cat, height, width = scores.size()
topk_scores, topk_inds = torch.topk(scores.view(batch, cat, -1), K)
topk_inds = topk_inds % (height * width)
topk_ys = (topk_inds / width).int().float()
topk_xs = (topk_inds % width).int().float()
topk_score, topk_ind = torch.topk(topk_scores.view(batch, -1), K)
topk_clses = (topk_ind / K).int()
topk_inds = _gather_feat(
topk_inds.view(batch, -1, 1), topk_ind).view(batch, K)
topk_ys = _gather_feat(topk_ys.view(batch, -1, 1), topk_ind).view(batch, K)
topk_xs = _gather_feat(topk_xs.view(batch, -1, 1), topk_ind).view(batch, K)
return topk_score, topk_inds, topk_clses, topk_ys, topk_xs
def _topk_channel(self, scores, K=40):
batch, cat, height, width = scores.size()
topk_scores, topk_inds = torch.topk(scores.view(batch, cat, -1), K)
topk_inds = topk_inds % (height * width)
topk_ys = (topk_inds / width).int().float()
topk_xs = (topk_inds % width).int().float()
return topk_scores, topk_inds, topk_ys, topk_xs
def multi_pose_decode(self,
heat, wh, kps, reg=None, hm_hp=None, hp_offset=None, K=100):
batch, cat, height, width = heat.size()
num_joints = kps.shape[1] // 2
# perform nms on heatmaps
heat = self._nms(heat)
scores, inds, clses, ys, xs = self._topk(heat, K=K)
kps = _tranpose_and_gather_feat(kps, inds)
kps = kps.view(batch, K, num_joints * 2)
kps[..., ::2] += xs.view(batch, K, 1).expand(batch, K, num_joints)
kps[..., 1::2] += ys.view(batch, K, 1).expand(batch, K, num_joints)
if reg is not None:
reg = _tranpose_and_gather_feat(reg, inds)
reg = reg.view(batch, K, 2)
xs = xs.view(batch, K, 1) + reg[:, :, 0:1]
ys = ys.view(batch, K, 1) + reg[:, :, 1:2]
else:
xs = xs.view(batch, K, 1) + 0.5
ys = ys.view(batch, K, 1) + 0.5
wh = _tranpose_and_gather_feat(wh, inds)
wh = wh.view(batch, K, 2)
clses = clses.view(batch, K, 1).float()
scores = scores.view(batch, K, 1)
bboxes = torch.cat([xs - wh[..., 0:1] / 2,
ys - wh[..., 1:2] / 2,
xs + wh[..., 0:1] / 2,
ys + wh[..., 1:2] / 2], dim=2)
if hm_hp is not None:
hm_hp = self._nms(hm_hp)
thresh = 0.1
kps = kps.view(batch, K, num_joints, 2).permute(
0, 2, 1, 3).contiguous() # b x J x K x 2
reg_kps = kps.unsqueeze(3).expand(batch, num_joints, K, K, 2)
hm_score, hm_inds, hm_ys, hm_xs = self._topk_channel(hm_hp, K=K) # b x J x K
if hp_offset is not None:
hp_offset = _tranpose_and_gather_feat(
hp_offset, hm_inds.view(batch, -1))
hp_offset = hp_offset.view(batch, num_joints, K, 2)
hm_xs = hm_xs + hp_offset[:, :, :, 0]
hm_ys = hm_ys + hp_offset[:, :, :, 1]
else:
hm_xs = hm_xs + 0.5
hm_ys = hm_ys + 0.5
mask = (hm_score > thresh).float()
hm_score = (1 - mask) * -1 + mask * hm_score
hm_ys = (1 - mask) * (-10000) + mask * hm_ys
hm_xs = (1 - mask) * (-10000) + mask * hm_xs
hm_kps = torch.stack([hm_xs, hm_ys], dim=-1).unsqueeze(
2).expand(batch, num_joints, K, K, 2)
dist = (((reg_kps - hm_kps) ** 2).sum(dim=4) ** 0.5)
min_dist, min_ind = dist.min(dim=3) # b x J x K
hm_score = hm_score.gather(2, min_ind).unsqueeze(-1) # b x J x K x 1
min_dist = min_dist.unsqueeze(-1)
min_ind = min_ind.view(batch, num_joints, K, 1, 1).expand(
batch, num_joints, K, 1, 2)
hm_kps = hm_kps.gather(3, min_ind)
hm_kps = hm_kps.view(batch, num_joints, K, 2)
l = bboxes[:, :, 0].view(batch, 1, K, 1).expand(batch, num_joints, K, 1)
t = bboxes[:, :, 1].view(batch, 1, K, 1).expand(batch, num_joints, K, 1)
r = bboxes[:, :, 2].view(batch, 1, K, 1).expand(batch, num_joints, K, 1)
b = bboxes[:, :, 3].view(batch, 1, K, 1).expand(batch, num_joints, K, 1)
mask = (hm_kps[..., 0:1] < l) + (hm_kps[..., 0:1] > r) + \
(hm_kps[..., 1:2] < t) + (hm_kps[..., 1:2] > b) + \
(hm_score < thresh) + (min_dist > (torch.max(b - t, r - l) * 0.3))
mask = (mask > 0).float().expand(batch, num_joints, K, 2)
kps = (1 - mask) * hm_kps + mask * kps
kps = kps.permute(0, 2, 1, 3).contiguous().view(
batch, K, num_joints * 2)
detections = torch.cat([bboxes, scores, kps, torch.transpose(hm_score.squeeze(dim=3), 1, 2)], dim=2)
return detections
def multi_pose_post_process(self, dets, c, s, h, w):
# dets: batch x max_dets x 40
# return list of 39 in image coord
ret = []
for i in range(dets.shape[0]):
bbox = transform_preds(dets[i, :, :4].reshape(-1, 2), c[i], s[i], (w, h))
pts = transform_preds(dets[i, :, 5:15].reshape(-1, 2), c[i], s[i], (w, h))
top_preds = np.concatenate(
[bbox.reshape(-1, 4), dets[i, :, 4:5],
pts.reshape(-1, 10), dets[i, :, 15:20]], axis=1).astype(np.float32).tolist()
ret.append({np.ones(1, dtype=np.int32)[0]: top_preds})
return ret
def post_process(self, dets, meta, scale=1):
dets = dets.detach().cpu().numpy().reshape(1, -1, dets.shape[2])
dets = self.multi_pose_post_process(
dets.copy(), [meta['c']], [meta['s']],
meta['out_height'], meta['out_width'])
for j in range(1, self.cfg.num_classes + 1):
dets[0][j] = np.array(dets[0][j], dtype=np.float32).reshape(-1, 20)
dets[0][j][:, :4] /= scale
dets[0][j][:, 5:] /= scale
return dets[0]
def postprocess(self, *args):
        hm, wh, hps, reg, hm_hp, hp_offset = args[0]
        meta = args[1]
hm = hm.sigmoid_()
hm_hp = hm_hp.sigmoid_()
detections = self.multi_pose_decode(hm, wh, hps, reg=reg, hm_hp=hm_hp, hp_offset=hp_offset, K=self.cfg.K)
dets = self.post_process(detections, meta, 1)
return dets
import logging
import os
import _init_paths
import cv2
import numpy as np
import onnxruntime as nxrun
import torch
from opts_pose import opts
from datasets.dataset_factory import get_dataset
from models.model import create_model, load_model
from utils.image import get_affine_transform
from detectors.detector_factory import detector_factory
logger = logging.getLogger(__name__)
class class_centernet(object):
def __init__(self, opt):
if opt.gpus[0] >= 0:
opt.device = torch.device('cuda')
else:
opt.device = torch.device('cpu')
print('Creating model...')
self.model = create_model(opt.arch, opt.heads, opt.head_conv)
self.model = load_model(self.model, opt.load_model)
self.model = self.model.to(opt.device)
self.model.eval()
self.mean = np.array(opt.mean, dtype=np.float32).reshape(1, 1, 3)
self.std = np.array(opt.std, dtype=np.float32).reshape(1, 1, 3)
self.max_per_image = 100
self.num_classes = opt.num_classes
self.scales = opt.test_scales
self.opt = opt
self.pause = True
def pre_process(self, image, scale, meta=None):
height, width = image.shape[0:2]
new_height = int(height * scale)
new_width = int(width * scale)
if self.opt.fix_res:
inp_height, inp_width = self.opt.input_h, self.opt.input_w
c = np.array([new_width / 2., new_height / 2.], dtype=np.float32)
s = max(height, width) * 1.0
else:
inp_height = (new_height | self.opt.pad) + 1
inp_width = (new_width | self.opt.pad) + 1
c = np.array([new_width // 2, new_height // 2], dtype=np.float32)
s = np.array([inp_width, inp_height], dtype=np.float32)
trans_input = get_affine_transform(c, s, 0, [inp_width, inp_height])
resized_image = cv2.resize(image, (new_width, new_height))
inp_image = cv2.warpAffine(
resized_image, trans_input, (inp_width, inp_height),
flags=cv2.INTER_LINEAR)
inp_image = ((inp_image / 255. - self.mean) / self.std).astype(np.float32)
images = inp_image.transpose(2, 0, 1).reshape(1, 3, inp_height, inp_width)
if self.opt.flip_test:
images = np.concatenate((images, images[:, :, :, ::-1]), axis=0)
images = torch.from_numpy(images)
meta = {'c': c, 's': s,
'out_height': inp_height // self.opt.down_ratio,
'out_width': inp_width // self.opt.down_ratio}
return images, meta
def main(opt):
# init model
os.environ['CUDA_VISIBLE_DEVICES'] = opt.gpus_str
Detector = detector_factory[opt.task]
detector = Detector(opt)
    debug = 0  # return the detection result without displaying it
    threshold = 0.05
    TASK = 'multi_pose'  # face detection with a box plus five landmarks
    input_h, input_w = 800, 800
MODEL_PATH = '/your/centerface/exp/multi_pose/mobilev2_10/model_best.pth'
    opt = opts().init('--task {} --load_model {} --debug {} --vis_thresh {} --input_h {} --input_w {}'.format(
        TASK, MODEL_PATH, debug, threshold, input_h, input_w).split(' '))
detector = detector_factory[opt.task](opt)
out_onnx_path = "../output/onnx_model/mobilev2_aspaper.onnx"
image = cv2.imread('../test_img/test.png')
torch_input, meta = detector.pre_process(image, scale=1)
torch_input = torch_input.cuda()
# pytorch output
torch_output = detector.model(torch_input)
torch.onnx.export(detector.model, torch_input, out_onnx_path, verbose=False)
sess = nxrun.InferenceSession(out_onnx_path)
print('save done')
input_name = sess.get_inputs()[0].name
output_onnx = sess.run(None, {input_name: torch_input.cpu().data.numpy()})
temp = 1
if __name__ == '__main__':
opt = opts().init()
main(opt)
import logging
import math
import os
import pickle
import time
import cv2
import numpy as np
import tensorrt as trt
import torch
import _init_paths
from torchvision import transforms
from opts_pose import opts
from centernet_tensorrt_engine import CenterNetTensorRTEngine
logger = logging.getLogger(__name__)
TRT_LOGGER = trt.Logger() # required by TensorRT
def build_engine(onnx_file_path, engine_file_path, precision, max_batch_size, cache_file=None):
"""Builds a new TensorRT engine and saves it, if no engine presents"""
if os.path.exists(engine_file_path):
logger.info('{} TensorRT engine already exists. Skip building engine...'.format(precision))
return
logger.info('Building {} TensorRT engine from onnx file...'.format(precision))
with trt.Builder(TRT_LOGGER) as b, b.create_network() as n, trt.OnnxParser(n, TRT_LOGGER) as p:
b.max_workspace_size = 1 << 30 # 1GB
b.max_batch_size = max_batch_size
if precision == 'fp16':
b.fp16_mode = True
elif precision == 'int8':
from ..calibrator import Calibrator
b.int8_mode = True
b.int8_calibrator = Calibrator(cache_file=cache_file)
elif precision == 'fp32':
pass
else:
logger.error('Engine precision not supported: {}'.format(precision))
raise NotImplementedError
# Parse model file
with open(onnx_file_path, 'rb') as model:
p.parse(model.read())
if p.num_errors:
logger.error('Parsing onnx file found {} errors.'.format(p.num_errors))
engine = b.build_cuda_engine(n)
print(engine_file_path)
with open(engine_file_path, "wb") as f:
f.write(engine.serialize())
def add_coco_bbox(image, bbox, conf=1):
txt = '{}{:.1f}'.format('person', conf)
font = cv2.FONT_HERSHEY_SIMPLEX
cv2.rectangle(image, (bbox[0], bbox[1]), (bbox[2], bbox[3]), (0, 255, 255), 2)
cv2.putText(image, txt, (bbox[0], bbox[1] - 2),
font, 0.5, (0, 255, 0), thickness=1, lineType=cv2.LINE_AA)
def add_coco_hp(image, points, keypoints_prob):
for j in range(5):
if keypoints_prob[j] > 0.5:
cv2.circle(image, (points[j, 0], points[j, 1]), 2, (255, 255, 0), -1)
return image
if __name__ == '__main__':
    # 0. build the TensorRT engine (convert the ONNX model to TensorRT)
# onnx_path = '../output/onnx_model/mobilev2_large.onnx'
trt_path = '../output/onnx_model/mobilev2.trt'
# build_engine(onnx_path, trt_path, 'fp32', 1)
    # print('build tensorrt engine done')
config = opts().init()
    # 1. load the TensorRT engine
body_engine = CenterNetTensorRTEngine(weight_file=trt_path, config=config)
    print('load tensorrt engine done')
# 2. video for the tracking
cap = cv2.VideoCapture('/your/path/xxx.mp4')
# 3. write the result image into video
if config.output_video:
        fourcc = cv2.VideoWriter_fourcc(*'mp4v')  # for .mp4 output the codec needs to be mp4v
im_width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
im_height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
write_cap = cv2.VideoWriter(config.output_video, fourcc, 50, (im_width, im_height))
k = 1; start_time = time.time()
while cap.grab():
k += 1
ret, image = cap.retrieve() # Capture frame-by-frame
rgb_img = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
detections = body_engine.run(rgb_img)[1]
print('fps is:{:.3f}'.format(k/(time.time() - start_time)))
for i, bbox in enumerate(detections):
if bbox[4] > 0.4:
body_bbox = np.array(bbox[:4], dtype=np.int32)
body_prob = bbox[4]
add_coco_bbox(image, body_bbox, body_prob)
body_pose = np.array(bbox[5:15], dtype=np.int32)
keypoints = np.array(body_pose, dtype=np.int32).reshape(5, 2)
keypoints_prob = bbox[15:]
image = add_coco_hp(image, keypoints, keypoints_prob)
# debug show
# cv2.imshow('image result', image)
# if cv2.waitKey(1) & 0xFF == ord('q'):
# break
# write into video
if config.output_video:
write_cap.write(image)
import atexit
import tensorrt as trt
import torch
def torch_dtype_to_trt(dtype):
if dtype == torch.int8:
return trt.int8
elif dtype == torch.int32:
return trt.int32
elif dtype == torch.float16:
return trt.float16
elif dtype == torch.float32:
return trt.float32
else:
raise TypeError('%s is not supported by tensorrt' % dtype)
def torch_dtype_from_trt(dtype):
if dtype == trt.int8:
return torch.int8
elif dtype == trt.int32:
return torch.int32
elif dtype == trt.float16:
return torch.float16
elif dtype == trt.float32:
return torch.float32
else:
raise TypeError('%s is not supported by torch' % dtype)
def torch_device_to_trt(device):
if device.type == torch.device('cuda').type:
return trt.TensorLocation.DEVICE
elif device.type == torch.device('cpu').type:
return trt.TensorLocation.HOST
else:
return TypeError('%s is not supported by tensorrt' % device)
def torch_device_from_trt(device):
if device == trt.TensorLocation.DEVICE:
return torch.device('cuda')
elif device == trt.TensorLocation.HOST:
return torch.device('cpu')
else:
return TypeError('%s is not supported by torch' % device)
class TRTModel(object):
def __init__(self, engine_path, input_names=None, output_names=None, final_shapes=None):
# load engine
self.logger = trt.Logger()
self.runtime = trt.Runtime(self.logger)
with open(engine_path, 'rb') as f:
self.engine = self.runtime.deserialize_cuda_engine(f.read())
self.context = self.engine.create_execution_context()
if input_names is None:
self.input_names = self._trt_input_names()
else:
self.input_names = input_names
if output_names is None:
self.output_names = self._trt_output_names()
else:
self.output_names = output_names
self.final_shapes = final_shapes
def _input_binding_indices(self):
return [i for i in range(self.engine.num_bindings) if self.engine.binding_is_input(i)]
def _output_binding_indices(self):
return [i for i in range(self.engine.num_bindings) if not self.engine.binding_is_input(i)]
def _trt_input_names(self):
return [self.engine.get_binding_name(i) for i in self._input_binding_indices()]
def _trt_output_names(self):
return [self.engine.get_binding_name(i) for i in self._output_binding_indices()]
def create_output_buffers(self, batch_size):
outputs = [None] * len(self.output_names)
for i, output_name in enumerate(self.output_names):
idx = self.engine.get_binding_index(output_name)
dtype = torch_dtype_from_trt(self.engine.get_binding_dtype(idx))
if self.final_shapes is not None:
shape = (batch_size, ) + self.final_shapes[i]
else:
shape = (batch_size, ) + tuple(self.engine.get_binding_shape(idx))
device = torch_device_from_trt(self.engine.get_location(idx))
output = torch.empty(size=shape, dtype=dtype, device=device)
outputs[i] = output
return outputs
def execute(self, *inputs):
batch_size = inputs[0].shape[0]
bindings = [None] * (len(self.input_names) + len(self.output_names))
# map input bindings
inputs_torch = [None] * len(self.input_names)
for i, name in enumerate(self.input_names):
idx = self.engine.get_binding_index(name)
# convert to appropriate format
inputs_torch[i] = torch.from_numpy(inputs[i])
inputs_torch[i] = inputs_torch[i].to(torch_device_from_trt(self.engine.get_location(idx)))
inputs_torch[i] = inputs_torch[i].type(torch_dtype_from_trt(self.engine.get_binding_dtype(idx)))
bindings[idx] = int(inputs_torch[i].data_ptr())
output_buffers = self.create_output_buffers(batch_size)
# map output bindings
for i, name in enumerate(self.output_names):
idx = self.engine.get_binding_index(name)
bindings[idx] = int(output_buffers[i].data_ptr())
self.context.execute(batch_size, bindings)
outputs = [buffer for buffer in output_buffers]
return outputs
def __call__(self, *inputs):
return self.execute(*inputs)
FROM image.sourcefind.cn:5000/dcu/admin/base/pytorch:1.13.1-centos7.6-dtk-23.04-py38-latest
RUN source /opt/dtk/env.sh
COPY requirements.txt requirements.txt
RUN pip3 install -r requirements.txt
ADD https://ultralytics.com/assets/Arial.ttf /root/.config/Ultralytics/
# WiderFace-Eval-Python
Evaluation code for the WIDER FACE validation set.
1. Generate prediction results
Generate prediction results laid out as follows:
```
0--Parade
0_Parade_marchingband_1_20.txt
0_Parade_marchingband_1_74.txt
...
1--Handshaking
```
Here, 0--Parade is one of the scene folders (WIDER FACE has 61 scenes in total), and 0_Parade_marchingband_1_20.txt holds the predictions for the corresponding image, in the following format:
```
image_name
the number of faces   # how many faces were detected
x, y, w, h, confidence   # x and y are the coordinates of the top-left corner of the box
```
For example:
```
0_Parade_marchingband_1_309.jpg
536
499.62817 73.10439 34.215393 38.730423 0.93176836
47.55735 86.14974 21.215218 25.779213 0.7041396
```
2. Download the ground truth data
Download from the official site to obtain these four files: `wider_easy_val.mat, wider_face_val.mat, wider_hard_val.mat, wider_medium_val.mat`
The official download can be slow; the files can also be downloaded from here: https://pan.baidu.com/s/1AErRlTlYaok6p7OGV7VShQ
3. Build the evaluation tools
Run the command
bash make.sh
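The contents of make.sh are not shown here; given the `setup.py` included with this evaluation code, it presumably just compiles the Cython `bbox_overlaps` extension, which can also be done manually with:
```
python3 setup.py build_ext --inplace
```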
4. Compute AP
```
python3 evaluation.py -p <your prediction dir> -g <ground truth dir>  # evaluate the easy, medium, and hard subsets separately
```
```
python3 evaluation.py -p <your prediction dir> -g <ground truth dir> --all  # evaluate easy, medium, and hard together
```
# --------------------------------------------------------
# Fast R-CNN
# Copyright (c) 2015 Microsoft
# Licensed under The MIT License [see LICENSE for details]
# Written by Sergey Karayev
# --------------------------------------------------------
cimport cython
import numpy as np
cimport numpy as np
DTYPE = np.float64
ctypedef np.float_t DTYPE_t
def bbox_overlaps(
np.ndarray[DTYPE_t, ndim=2] boxes,
np.ndarray[DTYPE_t, ndim=2] query_boxes):
"""
Parameters
----------
boxes: (N, 4) ndarray of float
query_boxes: (K, 4) ndarray of float
Returns
-------
overlaps: (N, K) ndarray of overlap between boxes and query_boxes
"""
cdef unsigned int N = boxes.shape[0]
cdef unsigned int K = query_boxes.shape[0]
cdef np.ndarray[DTYPE_t, ndim=2] overlaps = np.zeros((N, K), dtype=DTYPE)
cdef DTYPE_t iw, ih, box_area
cdef DTYPE_t ua
cdef unsigned int k, n
for k in range(K):
box_area = (
(query_boxes[k, 2] - query_boxes[k, 0] + 1) *
(query_boxes[k, 3] - query_boxes[k, 1] + 1)
)
for n in range(N):
iw = (
min(boxes[n, 2], query_boxes[k, 2]) -
max(boxes[n, 0], query_boxes[k, 0]) + 1
)
if iw > 0:
ih = (
min(boxes[n, 3], query_boxes[k, 3]) -
max(boxes[n, 1], query_boxes[k, 1]) + 1
)
if ih > 0:
ua = float(
(boxes[n, 2] - boxes[n, 0] + 1) *
(boxes[n, 3] - boxes[n, 1] + 1) +
box_area - iw * ih
)
overlaps[n, k] = iw * ih / ua
return overlaps
# -*-coding:utf-8-*-
from __future__ import division
"""
WiderFace evaluation code
author: wondervictor
mail: tianhengcheng@gmail.com
copyright@wondervictor
"""
import os
import tqdm
import pickle
import argparse
import numpy as np
from scipy.io import loadmat
from bbox import bbox_overlaps
from IPython import embed
def get_gt_boxes(gt_dir):
""" gt dir: (wider_face_val.mat, wider_easy_val.mat, wider_medium_val.mat, wider_hard_val.mat)"""
gt_mat = loadmat(os.path.join(gt_dir, 'wider_face_val.mat'))
hard_mat = loadmat(os.path.join(gt_dir, 'wider_hard_val.mat'))
medium_mat = loadmat(os.path.join(gt_dir, 'wider_medium_val.mat'))
easy_mat = loadmat(os.path.join(gt_dir, 'wider_easy_val.mat'))
facebox_list = gt_mat['face_bbx_list']
event_list = gt_mat['event_list']
file_list = gt_mat['file_list']
hard_gt_list = hard_mat['gt_list']
medium_gt_list = medium_mat['gt_list']
easy_gt_list = easy_mat['gt_list']
return facebox_list, event_list, file_list, hard_gt_list, medium_gt_list, easy_gt_list
def get_gt_boxes_from_txt(gt_path, cache_dir):
cache_file = os.path.join(cache_dir, 'gt_cache.pkl')
if os.path.exists(cache_file):
f = open(cache_file, 'rb')
boxes = pickle.load(f)
f.close()
return boxes
f = open(gt_path, 'r')
state = 0
lines = f.readlines()
lines = list(map(lambda x: x.rstrip('\r\n'), lines))
boxes = {}
f.close()
current_boxes = []
current_name = None
for line in lines:
if state == 0 and '--' in line:
state = 1
current_name = line
continue
if state == 1:
state = 2
continue
if state == 2 and '--' in line:
state = 1
boxes[current_name] = np.array(current_boxes).astype('float32')
current_name = line
current_boxes = []
continue
if state == 2:
box = [float(x) for x in line.split(' ')[:4]]
current_boxes.append(box)
continue
f = open(cache_file, 'wb')
pickle.dump(boxes, f)
f.close()
return boxes
def read_pred_file(filepath):
with open(filepath, 'r') as f:
lines = f.readlines()
img_file = lines[0].rstrip('\n\r')
lines = lines[2:]
boxes = np.array(list(map(lambda x: [float(a) for a in x.rstrip(
'\r\n').split(' ')], lines))).astype('float')
return img_file.split('/')[-1], boxes
def get_preds(pred_dir):
events = os.listdir(pred_dir)
boxes = dict()
pbar = tqdm.tqdm(events)
for event in pbar:
pbar.set_description('Reading Predictions ')
event_dir = os.path.join(pred_dir, event)
event_images = os.listdir(event_dir)
current_event = dict()
for imgtxt in event_images:
imgname, _boxes = read_pred_file(os.path.join(event_dir, imgtxt))
current_event[imgname.rstrip('.jpg')] = _boxes
boxes[event] = current_event
return boxes
def norm_score(pred):
""" norm score
pred {key: [[x1,y1,x2,y2,s]]}
"""
max_score = 0
min_score = 1
for _, k in pred.items():
for _, v in k.items():
if len(v) == 0:
continue
_min = np.min(v[:, -1])
_max = np.max(v[:, -1])
max_score = max(_max, max_score)
min_score = min(_min, min_score)
diff = max_score - min_score
for _, k in pred.items():
for _, v in k.items():
if len(v) == 0:
continue
v[:, -1] = (v[:, -1] - min_score)/diff
def image_eval(pred, gt, ignore, iou_thresh):
""" single image evaluation
pred: Nx5
gt: Nx4
ignore:
"""
_pred = pred.copy()
_gt = gt.copy()
pred_recall = np.zeros(_pred.shape[0])
recall_list = np.zeros(_gt.shape[0])
proposal_list = np.ones(_pred.shape[0])
_pred[:, 2] = _pred[:, 2] + _pred[:, 0]
_pred[:, 3] = _pred[:, 3] + _pred[:, 1]
_gt[:, 2] = _gt[:, 2] + _gt[:, 0]
_gt[:, 3] = _gt[:, 3] + _gt[:, 1]
overlaps = bbox_overlaps(_pred[:, :4], _gt)
for h in range(_pred.shape[0]):
gt_overlap = overlaps[h]
max_overlap, max_idx = gt_overlap.max(), gt_overlap.argmax()
if max_overlap >= iou_thresh:
if ignore[max_idx] == 0:
recall_list[max_idx] = -1
proposal_list[h] = -1
elif recall_list[max_idx] == 0:
recall_list[max_idx] = 1
r_keep_index = np.where(recall_list == 1)[0]
pred_recall[h] = len(r_keep_index)
return pred_recall, proposal_list
def img_pr_info(thresh_num, pred_info, proposal_list, pred_recall):
pr_info = np.zeros((thresh_num, 2)).astype('float')
for t in range(thresh_num):
thresh = 1 - (t+1)/thresh_num
r_index = np.where(pred_info[:, 4] >= thresh)[0]
if len(r_index) == 0:
pr_info[t, 0] = 0
pr_info[t, 1] = 0
else:
r_index = r_index[-1]
p_index = np.where(proposal_list[:r_index+1] == 1)[0]
pr_info[t, 0] = len(p_index)
pr_info[t, 1] = pred_recall[r_index]
return pr_info
def dataset_pr_info(thresh_num, pr_curve, count_face):
_pr_curve = np.zeros((thresh_num, 2))
for i in range(thresh_num):
_pr_curve[i, 0] = pr_curve[i, 1] / pr_curve[i, 0]
_pr_curve[i, 1] = pr_curve[i, 1] / count_face
return _pr_curve
def voc_ap(rec, prec):
# correct AP calculation
# first append sentinel values at the end
mrec = np.concatenate(([0.], rec, [1.]))
mpre = np.concatenate(([0.], prec, [0.]))
# compute the precision envelope
for i in range(mpre.size - 1, 0, -1):
mpre[i - 1] = np.maximum(mpre[i - 1], mpre[i])
# to calculate area under PR curve, look for points
# where X axis (recall) changes value
i = np.where(mrec[1:] != mrec[:-1])[0]
# and sum (\Delta recall) * prec
ap = np.sum((mrec[i + 1] - mrec[i]) * mpre[i + 1])
return ap
def evaluation(pred, gt_path, all, iou_thresh=0.4):
pred = get_preds(pred)
norm_score(pred)
facebox_list, event_list, file_list, hard_gt_list, medium_gt_list, easy_gt_list = get_gt_boxes(
gt_path)
event_num = len(event_list)
thresh_num = 1000
settings = ['easy', 'medium', 'hard']
setting_gts = [easy_gt_list, medium_gt_list, hard_gt_list]
if not all:
aps = []
for setting_id in range(3):
# different setting
gt_list = setting_gts[setting_id]
count_face = 0
pr_curve = np.zeros((thresh_num, 2)).astype('float')
# [hard, medium, easy]
pbar = tqdm.tqdm(range(event_num)) # 61
error_count = 0
for i in pbar:
pbar.set_description(
'Processing {}'.format(settings[setting_id]))
event_name = str(event_list[i][0][0])
img_list = file_list[i][0]
pred_list = pred[event_name]
sub_gt_list = gt_list[i][0]
# print("shape of sub_gt_list is: ",sub_gt_list.shape)
gt_bbx_list = facebox_list[i][0]
for j in range(len(img_list)):
try:
pred_info = pred_list[str(img_list[j][0][0])]
except:
error_count += 1
continue
gt_boxes = gt_bbx_list[j][0].astype('float')
keep_index = sub_gt_list[j][0]
count_face += len(keep_index)
if len(gt_boxes) == 0 or len(pred_info) == 0:
continue
ignore = np.zeros(gt_boxes.shape[0])
if len(keep_index) != 0:
ignore[keep_index-1] = 1
pred_recall, proposal_list = image_eval(
pred_info, gt_boxes, ignore, iou_thresh)
_img_pr_info = img_pr_info(
thresh_num, pred_info, proposal_list, pred_recall)
pr_curve += _img_pr_info
print("error_count is: ", error_count)
pr_curve = dataset_pr_info(thresh_num, pr_curve, count_face)
propose = pr_curve[:, 0]
recall = pr_curve[:, 1]
ap = voc_ap(recall, propose)
aps.append(ap)
print("==================== Results ====================")
print("Easy Val AP: {}".format(aps[0]))
print("Medium Val AP: {}".format(aps[1]))
print("Hard Val AP: {}".format(aps[2]))
print("=================================================")
else:
aps = []
# different setting
count_face = 0
pr_curve = np.zeros((thresh_num, 2)).astype(
            'float')  # accumulates counts over all samples
# [hard, medium, easy]
pbar = tqdm.tqdm(range(event_num))
error_count = 0
for i in pbar:
pbar.set_description('Processing {}'.format("all"))
# print("event_list is: ",event_list)
# '0--Parade', '1--Handshaking'
event_name = str(event_list[i][0][0])
img_list = file_list[i][0]
            pred_list = pred[event_name]  # all detection results for this event folder
sub_gt_list = [setting_gts[0][i][0],
setting_gts[1][i][0], setting_gts[2][i][0]]
gt_bbx_list = facebox_list[i][0]
for j in range(len(img_list)):
try:
                    # str(img_list[j][0][0]) is the image name under each event folder
pred_info = pred_list[str(img_list[j][0][0])]
except:
error_count += 1
continue
gt_boxes = gt_bbx_list[j][0].astype('float')
temp_i = []
for ii in range(3):
if len(sub_gt_list[ii][j][0]) != 0:
temp_i.append(ii)
if len(temp_i) != 0:
keep_index = np.concatenate(
tuple([sub_gt_list[xx][j][0] for xx in temp_i]))
else:
keep_index = []
count_face += len(keep_index)
if len(gt_boxes) == 0 or len(pred_info) == 0:
continue
ignore = np.zeros(gt_boxes.shape[0]) # no ignore
if len(keep_index) != 0:
ignore[keep_index-1] = 1
pred_recall, proposal_list = image_eval(
pred_info, gt_boxes, ignore, iou_thresh)
_img_pr_info = img_pr_info(
thresh_num, pred_info, proposal_list, pred_recall)
pr_curve += _img_pr_info
pr_curve = dataset_pr_info(thresh_num, pr_curve, count_face)
propose = pr_curve[:, 0]
recall = pr_curve[:, 1]
ap = voc_ap(recall, propose)
aps.append(ap)
print("==================== Results ====================")
print("All Val AP: {}".format(aps[0]))
print("=================================================")
if __name__ == '__main__':
parser = argparse.ArgumentParser()
parser.add_argument('-p', '--pred', default='../output/widerface')
parser.add_argument('-g', '--gt', default='./ground_truth')
parser.add_argument(
'--all', help='if test all together', action='store_true')
args = parser.parse_args()
evaluation(args.pred, args.gt, args.all)
"""
WiderFace evaluation code
author: wondervictor
mail: tianhengcheng@gmail.com
copyright@wondervictor
"""
from distutils.core import setup, Extension
from Cython.Build import cythonize
import numpy
package = Extension('bbox', ['box_overlaps.pyx'], include_dirs=[numpy.get_include()])
setup(ext_modules=cythonize([package]))