## 1. Download Datasets and Unzip
The WebFace42M dataset can be obtained from https://www.face-benchmark.org/download.html.
Upon extraction, the raw data of WebFace42M consists of 10 directories, numbered 0 to 9. The smaller subsets are nested prefixes of these directories: WebFace4M is directory 0, and WebFace12M is directories 0, 1, and 2.
## 2. Create Shuffled Rec File for DALI
Shuffled .rec files are essential for DALI: training on unshuffled .rec files can noticeably degrade accuracy. Original .rec files generated in the InsightFace style are not compatible with NVIDIA DALI, so use the [mxnet.tools.im2rec](https://github.com/apache/incubator-mxnet/blob/master/tools/im2rec.py) tool to generate a shuffled .rec file.
```shell
# expected directory layout for your dataset
/WebFace42M_Root
├── 0_0_0000000
│   ├── 0_0.jpg
│   ├── 0_1.jpg
│   ├── 0_2.jpg
│   ├── 0_3.jpg
│   └── 0_4.jpg
├── 0_0_0000001
│   ├── 0_5.jpg
│   ├── 0_6.jpg
│   ├── 0_7.jpg
│   ├── 0_8.jpg
│   └── 0_9.jpg
├── 0_0_0000002
│   ├── 0_10.jpg
│   ├── 0_11.jpg
│   ├── 0_12.jpg
│   ├── 0_13.jpg
│   ├── 0_14.jpg
│   ├── 0_15.jpg
│   ├── 0_16.jpg
│   └── 0_17.jpg
├── 0_0_0000003
│   ├── 0_18.jpg
│   ├── 0_19.jpg
│   └── 0_20.jpg
├── 0_0_0000004
# 0) Dependencies installation
pip install opencv-python
apt-get update
apt-get install ffmpeg libsm6 libxext6 -y
# 1) create train.lst using the following command
python -m mxnet.tools.im2rec --list --recursive train WebFace42M_Root
# 2) create train.rec and train.idx from train.lst using the following command
python -m mxnet.tools.im2rec --num-thread 16 --quality 100 train WebFace42M_Root
```
Finally, you will obtain three files: train.lst, train.rec, and train.idx, where train.idx and train.rec are utilized for training.
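Before training, it is worth sanity-checking that the new .rec file is actually shuffled. The sketch below (an illustrative snippet, not part of the original tooling; it assumes train.idx and train.rec are in the working directory) reads the first few records back and prints their labels, which should not be monotonically increasing identity IDs:
```python
# Read a few records back and print their labels to confirm shuffling.
# Note: im2rec-generated files index records directly from 0, while
# InsightFace-style files reserve index 0 for a header record.
import mxnet as mx

imgrec = mx.recordio.MXIndexedRecordIO("train.idx", "train.rec", "r")
for i in range(5):
    header, _ = mx.recordio.unpack(imgrec.read_idx(i))
    print(i, header.label)
imgrec.close()
```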
## Test Training Speed
- Test Commands

Use the following two commands to test Partial FC training performance.
The number of identities is **3 million** (synthetic data), mixed-precision training is enabled, the backbone is ResNet50,
and the batch size is 1024.
```shell
# Model Parallel
python -m torch.distributed.launch --nproc_per_node=8 --nnodes=1 --node_rank=0 --master_addr="127.0.0.1" --master_port=1234 train.py configs/3millions
# Partial FC 0.1
python -m torch.distributed.launch --nproc_per_node=8 --nnodes=1 --node_rank=0 --master_addr="127.0.0.1" --master_port=1234 train.py configs/3millions_pfc
```
- GPU Memory
```
# (Model Parallel) gpustat -i
[0] Tesla V100-SXM2-32GB | 64'C, 94 % | 30338 / 32510 MB
[1] Tesla V100-SXM2-32GB | 60'C, 99 % | 28876 / 32510 MB
[2] Tesla V100-SXM2-32GB | 60'C, 99 % | 28872 / 32510 MB
[3] Tesla V100-SXM2-32GB | 69'C, 99 % | 28872 / 32510 MB
[4] Tesla V100-SXM2-32GB | 66'C, 99 % | 28888 / 32510 MB
[5] Tesla V100-SXM2-32GB | 60'C, 99 % | 28932 / 32510 MB
[6] Tesla V100-SXM2-32GB | 68'C, 100 % | 28916 / 32510 MB
[7] Tesla V100-SXM2-32GB | 65'C, 99 % | 28860 / 32510 MB
# (Partial FC 0.1) gpustat -i
[0] Tesla V100-SXM2-32GB | 60'C, 95 % | 10488 / 32510 MB
[1] Tesla V100-SXM2-32GB | 60'C, 97 % | 10344 / 32510 MB
[2] Tesla V100-SXM2-32GB | 61'C, 95 % | 10340 / 32510 MB
[3] Tesla V100-SXM2-32GB | 66'C, 95 % | 10340 / 32510 MB
[4] Tesla V100-SXM2-32GB | 65'C, 94 % | 10356 / 32510 MB
[5] Tesla V100-SXM2-32GB | 61'C, 95 % | 10400 / 32510 MB
[6] Tesla V100-SXM2-32GB | 68'C, 96 % | 10384 / 32510 MB
[7] Tesla V100-SXM2-32GB | 64'C, 95 % | 10328 / 32510 MB
```
- Training Speed
```python
# (Model Parallel) training.log
Training: Speed 2271.33 samples/sec Loss 1.1624 LearningRate 0.2000 Epoch: 0 Global Step: 100
Training: Speed 2269.94 samples/sec Loss 0.0000 LearningRate 0.2000 Epoch: 0 Global Step: 150
Training: Speed 2272.67 samples/sec Loss 0.0000 LearningRate 0.2000 Epoch: 0 Global Step: 200
Training: Speed 2266.55 samples/sec Loss 0.0000 LearningRate 0.2000 Epoch: 0 Global Step: 250
Training: Speed 2272.54 samples/sec Loss 0.0000 LearningRate 0.2000 Epoch: 0 Global Step: 300
# (Partial FC 0.1) training.log
Training: Speed 5299.56 samples/sec Loss 1.0965 LearningRate 0.2000 Epoch: 0 Global Step: 100
Training: Speed 5296.37 samples/sec Loss 0.0000 LearningRate 0.2000 Epoch: 0 Global Step: 150
Training: Speed 5304.37 samples/sec Loss 0.0000 LearningRate 0.2000 Epoch: 0 Global Step: 200
Training: Speed 5274.43 samples/sec Loss 0.0000 LearningRate 0.2000 Epoch: 0 Global Step: 250
Training: Speed 5300.10 samples/sec Loss 0.0000 LearningRate 0.2000 Epoch: 0 Global Step: 300
```
In this test case, Partial FC 0.1 uses only about 1/3 of the GPU memory of model parallel,
and trains roughly 2.3 times faster (about 5300 vs. 2270 samples/sec).
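The saving is easy to see with a back-of-envelope calculation (an illustration under stated assumptions, not measured data: fp32 logits, a global batch of 1024, and 8 GPUs). With model parallel, every GPU materializes logits for its full shard of the 3 million class centers each step; Partial FC 0.1 samples only 10% of the centers:
```python
# Rough per-GPU memory for the classifier logits alone (fp32).
num_ids, global_batch, num_gpus = 3_000_000, 1024, 8
bytes_fp32 = 4
full_shard = global_batch * (num_ids // num_gpus) * bytes_fp32  # model parallel
sampled = int(full_shard * 0.1)                                 # Partial FC 0.1
print(f"model parallel: {full_shard / 2**30:.2f} GiB/GPU, "
      f"partial fc 0.1: {sampled / 2**30:.2f} GiB/GPU")
```
Gradients and optimizer state for the sampled class centers shrink roughly proportionally, which accounts for most of the remaining gap in the measurements above.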
## Speed Benchmark
1. Training speed of different parallel methods (samples/second), Tesla V100 32GB * 8. (Larger is better)
| Number of Identities in Dataset | Data Parallel | Model Parallel | Partial FC 0.1 |
| :--- | :--- | :--- | :--- |
|125000 | 4681 | 4824 | 5004 |
|250000 | 4047 | 4521 | 4976 |
|500000 | 3087 | 4013 | 4900 |
|1000000 | 2090 | 3449 | 4803 |
|1400000 | 1672 | 3043 | 4738 |
|2000000 | - | 2593 | 4626 |
|4000000 | - | 1748 | 4208 |
|5500000 | - | 1389 | 3975 |
|8000000 | - | - | 3565 |
|16000000 | - | - | 2679 |
|29000000 | - | - | 1855 |
2. GPU memory cost of different parallel methods (MB per GPU), Tesla V100 32GB * 8. (Smaller is better)
| Number of Identities in Dataset | Data Parallel | Model Parallel | Partial FC 0.1 |
| :--- | :--- | :--- | :--- |
|125000 | 7358 | 5306 | 4868 |
|250000 | 9940 | 5826 | 5004 |
|500000 | 14220 | 7114 | 5202 |
|1000000 | 23708 | 9966 | 5620 |
|1400000 | 32252 | 11178 | 6056 |
|2000000 | - | 13978 | 6472 |
|4000000 | - | 23238 | 8284 |
|5500000 | - | 32188 | 9854 |
|8000000 | - | - | 12310 |
|16000000 | - | - | 19950 |
|29000000 | - | - | 32324 |
"""Helper for evaluation on the Labeled Faces in the Wild dataset
"""
# MIT License
#
# Copyright (c) 2016 David Sandberg
#
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
import datetime
import os
import pickle
import mxnet as mx
import numpy as np
import sklearn
import torch
from mxnet import ndarray as nd
from scipy import interpolate
from sklearn.decomposition import PCA
from sklearn.model_selection import KFold
class LFold:
def __init__(self, n_splits=2, shuffle=False):
self.n_splits = n_splits
if self.n_splits > 1:
self.k_fold = KFold(n_splits=n_splits, shuffle=shuffle)
def split(self, indices):
if self.n_splits > 1:
return self.k_fold.split(indices)
else:
return [(indices, indices)]
def calculate_roc(thresholds,
embeddings1,
embeddings2,
actual_issame,
nrof_folds=10,
pca=0):
assert (embeddings1.shape[0] == embeddings2.shape[0])
assert (embeddings1.shape[1] == embeddings2.shape[1])
nrof_pairs = min(len(actual_issame), embeddings1.shape[0])
nrof_thresholds = len(thresholds)
k_fold = LFold(n_splits=nrof_folds, shuffle=False)
tprs = np.zeros((nrof_folds, nrof_thresholds))
fprs = np.zeros((nrof_folds, nrof_thresholds))
accuracy = np.zeros((nrof_folds))
indices = np.arange(nrof_pairs)
if pca == 0:
diff = np.subtract(embeddings1, embeddings2)
dist = np.sum(np.square(diff), 1)
for fold_idx, (train_set, test_set) in enumerate(k_fold.split(indices)):
if pca > 0:
print('doing pca on', fold_idx)
embed1_train = embeddings1[train_set]
embed2_train = embeddings2[train_set]
_embed_train = np.concatenate((embed1_train, embed2_train), axis=0)
pca_model = PCA(n_components=pca)
pca_model.fit(_embed_train)
embed1 = pca_model.transform(embeddings1)
embed2 = pca_model.transform(embeddings2)
embed1 = sklearn.preprocessing.normalize(embed1)
embed2 = sklearn.preprocessing.normalize(embed2)
diff = np.subtract(embed1, embed2)
dist = np.sum(np.square(diff), 1)
# Find the best threshold for the fold
acc_train = np.zeros((nrof_thresholds))
for threshold_idx, threshold in enumerate(thresholds):
_, _, acc_train[threshold_idx] = calculate_accuracy(
threshold, dist[train_set], actual_issame[train_set])
best_threshold_index = np.argmax(acc_train)
for threshold_idx, threshold in enumerate(thresholds):
tprs[fold_idx, threshold_idx], fprs[fold_idx, threshold_idx], _ = calculate_accuracy(
threshold, dist[test_set],
actual_issame[test_set])
_, _, accuracy[fold_idx] = calculate_accuracy(
thresholds[best_threshold_index], dist[test_set],
actual_issame[test_set])
tpr = np.mean(tprs, 0)
fpr = np.mean(fprs, 0)
return tpr, fpr, accuracy
def calculate_accuracy(threshold, dist, actual_issame):
predict_issame = np.less(dist, threshold)
tp = np.sum(np.logical_and(predict_issame, actual_issame))
fp = np.sum(np.logical_and(predict_issame, np.logical_not(actual_issame)))
tn = np.sum(
np.logical_and(np.logical_not(predict_issame),
np.logical_not(actual_issame)))
fn = np.sum(np.logical_and(np.logical_not(predict_issame), actual_issame))
tpr = 0 if (tp + fn == 0) else float(tp) / float(tp + fn)
fpr = 0 if (fp + tn == 0) else float(fp) / float(fp + tn)
acc = float(tp + tn) / dist.size
return tpr, fpr, acc
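# Worked toy example (hypothetical data): with threshold 1.0,
# dist = [0.5, 1.5, 0.2] and actual_issame = [True, False, True] give
# predictions [True, False, True] -> tp=2, fp=0, tn=1, fn=0,
# i.e. tpr=1.0, fpr=0.0, acc=1.0.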
def calculate_val(thresholds,
embeddings1,
embeddings2,
actual_issame,
far_target,
nrof_folds=10):
assert (embeddings1.shape[0] == embeddings2.shape[0])
assert (embeddings1.shape[1] == embeddings2.shape[1])
nrof_pairs = min(len(actual_issame), embeddings1.shape[0])
nrof_thresholds = len(thresholds)
k_fold = LFold(n_splits=nrof_folds, shuffle=False)
val = np.zeros(nrof_folds)
far = np.zeros(nrof_folds)
diff = np.subtract(embeddings1, embeddings2)
dist = np.sum(np.square(diff), 1)
indices = np.arange(nrof_pairs)
for fold_idx, (train_set, test_set) in enumerate(k_fold.split(indices)):
# Find the threshold that gives FAR = far_target
far_train = np.zeros(nrof_thresholds)
for threshold_idx, threshold in enumerate(thresholds):
_, far_train[threshold_idx] = calculate_val_far(
threshold, dist[train_set], actual_issame[train_set])
if np.max(far_train) >= far_target:
f = interpolate.interp1d(far_train, thresholds, kind='slinear')
threshold = f(far_target)
else:
threshold = 0.0
val[fold_idx], far[fold_idx] = calculate_val_far(
threshold, dist[test_set], actual_issame[test_set])
val_mean = np.mean(val)
far_mean = np.mean(far)
val_std = np.std(val)
return val_mean, val_std, far_mean
def calculate_val_far(threshold, dist, actual_issame):
predict_issame = np.less(dist, threshold)
true_accept = np.sum(np.logical_and(predict_issame, actual_issame))
false_accept = np.sum(
np.logical_and(predict_issame, np.logical_not(actual_issame)))
n_same = np.sum(actual_issame)
n_diff = np.sum(np.logical_not(actual_issame))
# print(true_accept, false_accept)
# print(n_same, n_diff)
val = float(true_accept) / float(n_same)
far = float(false_accept) / float(n_diff)
return val, far
def evaluate(embeddings, actual_issame, nrof_folds=10, pca=0):
# Calculate evaluation metrics
thresholds = np.arange(0, 4, 0.01)
embeddings1 = embeddings[0::2]
embeddings2 = embeddings[1::2]
tpr, fpr, accuracy = calculate_roc(thresholds,
embeddings1,
embeddings2,
np.asarray(actual_issame),
nrof_folds=nrof_folds,
pca=pca)
thresholds = np.arange(0, 4, 0.001)
val, val_std, far = calculate_val(thresholds,
embeddings1,
embeddings2,
np.asarray(actual_issame),
1e-3,
nrof_folds=nrof_folds)
return tpr, fpr, accuracy, val, val_std, far
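# Usage sketch (assumptions: verification pairs are stored as interleaved rows
# (2i, 2i+1) and embeddings are L2-normalized beforehand):
#   emb = sklearn.preprocessing.normalize(np.random.randn(1200, 512))
#   issame = np.random.rand(600) > 0.5
#   tpr, fpr, acc, val, val_std, far = evaluate(emb, issame, nrof_folds=10)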
@torch.no_grad()
def load_bin(path, image_size):
try:
with open(path, 'rb') as f:
bins, issame_list = pickle.load(f) # py2
except UnicodeDecodeError as e:
with open(path, 'rb') as f:
bins, issame_list = pickle.load(f, encoding='bytes') # py3
data_list = []
for flip in [0, 1]:
data = torch.empty((len(issame_list) * 2, 3, image_size[0], image_size[1]))
data_list.append(data)
for idx in range(len(issame_list) * 2):
_bin = bins[idx]
img = mx.image.imdecode(_bin)
if img.shape[1] != image_size[0]:
img = mx.image.resize_short(img, image_size[0])
img = nd.transpose(img, axes=(2, 0, 1))
for flip in [0, 1]:
if flip == 1:
img = mx.ndarray.flip(data=img, axis=2)
data_list[flip][idx][:] = torch.from_numpy(img.asnumpy())
if idx % 1000 == 0:
print('loading bin', idx)
print(data_list[0].shape)
return data_list, issame_list
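# Usage sketch (assuming an InsightFace-style verification set such as lfw.bin):
#   data_list, issame_list = load_bin("lfw.bin", (112, 112))
# data_list[0] holds the original crops and data_list[1] their horizontal flips,
# each of shape (2 * num_pairs, 3, 112, 112).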
@torch.no_grad()
def test(data_set, backbone, batch_size, nfolds=10):
print('testing verification..')
data_list = data_set[0]
issame_list = data_set[1]
embeddings_list = []
time_consumed = 0.0
for i in range(len(data_list)):
data = data_list[i]
embeddings = None
ba = 0
while ba < data.shape[0]:
bb = min(ba + batch_size, data.shape[0])
count = bb - ba
_data = data[bb - batch_size: bb]
time0 = datetime.datetime.now()
img = ((_data / 255) - 0.5) / 0.5
net_out: torch.Tensor = backbone(img)
_embeddings = net_out.detach().cpu().numpy()
time_now = datetime.datetime.now()
diff = time_now - time0
time_consumed += diff.total_seconds()
if embeddings is None:
embeddings = np.zeros((data.shape[0], _embeddings.shape[1]))
embeddings[ba:bb, :] = _embeddings[(batch_size - count):, :]
ba = bb
embeddings_list.append(embeddings)
_xnorm = 0.0
_xnorm_cnt = 0
for embed in embeddings_list:
for i in range(embed.shape[0]):
_em = embed[i]
_norm = np.linalg.norm(_em)
_xnorm += _norm
_xnorm_cnt += 1
_xnorm /= _xnorm_cnt
embeddings = embeddings_list[0].copy()
embeddings = sklearn.preprocessing.normalize(embeddings)
acc1 = 0.0
std1 = 0.0
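# NOTE: acc1/std1 are placeholders kept for the legacy return signature;
# only the flip-augmented accuracy (acc2/std2) below is actually computed.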
embeddings = embeddings_list[0] + embeddings_list[1]
embeddings = sklearn.preprocessing.normalize(embeddings)
print(embeddings.shape)
print('infer time', time_consumed)
_, _, accuracy, val, val_std, far = evaluate(embeddings, issame_list, nrof_folds=nfolds)
acc2, std2 = np.mean(accuracy), np.std(accuracy)
return acc1, std1, acc2, std2, _xnorm, embeddings_list
def dumpR(data_set,
backbone,
batch_size,
name='',
data_extra=None,
label_shape=None):
print('dump verification embedding..')
data_list = data_set[0]
issame_list = data_set[1]
embeddings_list = []
time_consumed = 0.0
for i in range(len(data_list)):
data = data_list[i]
embeddings = None
ba = 0
while ba < data.shape[0]:
bb = min(ba + batch_size, data.shape[0])
count = bb - ba
_data = nd.slice_axis(data, axis=0, begin=bb - batch_size, end=bb)
time0 = datetime.datetime.now()
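# NOTE: `_label`, `_data_extra`, and `model` below are module-level globals
# from the legacy MXNet pipeline; dumpR is not runnable standalone in this
# PyTorch port.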
if data_extra is None:
db = mx.io.DataBatch(data=(_data,), label=(_label,))
else:
db = mx.io.DataBatch(data=(_data, _data_extra),
label=(_label,))
model.forward(db, is_train=False)
net_out = model.get_outputs()
_embeddings = net_out[0].asnumpy()
time_now = datetime.datetime.now()
diff = time_now - time0
time_consumed += diff.total_seconds()
if embeddings is None:
embeddings = np.zeros((data.shape[0], _embeddings.shape[1]))
embeddings[ba:bb, :] = _embeddings[(batch_size - count):, :]
ba = bb
embeddings_list.append(embeddings)
embeddings = embeddings_list[0] + embeddings_list[1]
embeddings = sklearn.preprocessing.normalize(embeddings)
actual_issame = np.asarray(issame_list)
outname = os.path.join('temp.bin')
with open(outname, 'wb') as f:
pickle.dump((embeddings, issame_list),
f,
protocol=pickle.HIGHEST_PROTOCOL)
# if __name__ == '__main__':
#
# parser = argparse.ArgumentParser(description='do verification')
# # general
# parser.add_argument('--data-dir', default='', help='')
# parser.add_argument('--model',
# default='../model/softmax,50',
# help='path to load model.')
# parser.add_argument('--target',
# default='lfw,cfp_ff,cfp_fp,agedb_30',
# help='test targets.')
# parser.add_argument('--gpu', default=0, type=int, help='gpu id')
# parser.add_argument('--batch-size', default=32, type=int, help='')
# parser.add_argument('--max', default='', type=str, help='')
# parser.add_argument('--mode', default=0, type=int, help='')
# parser.add_argument('--nfolds', default=10, type=int, help='')
# args = parser.parse_args()
# image_size = [112, 112]
# print('image_size', image_size)
# ctx = mx.gpu(args.gpu)
# nets = []
# vec = args.model.split(',')
# prefix = args.model.split(',')[0]
# epochs = []
# if len(vec) == 1:
# pdir = os.path.dirname(prefix)
# for fname in os.listdir(pdir):
# if not fname.endswith('.params'):
# continue
# _file = os.path.join(pdir, fname)
# if _file.startswith(prefix):
# epoch = int(fname.split('.')[0].split('-')[1])
# epochs.append(epoch)
# epochs = sorted(epochs, reverse=True)
# if len(args.max) > 0:
# _max = [int(x) for x in args.max.split(',')]
# assert len(_max) == 2
# if len(epochs) > _max[1]:
# epochs = epochs[_max[0]:_max[1]]
#
# else:
# epochs = [int(x) for x in vec[1].split('|')]
# print('model number', len(epochs))
# time0 = datetime.datetime.now()
# for epoch in epochs:
# print('loading', prefix, epoch)
# sym, arg_params, aux_params = mx.model.load_checkpoint(prefix, epoch)
# # arg_params, aux_params = ch_dev(arg_params, aux_params, ctx)
# all_layers = sym.get_internals()
# sym = all_layers['fc1_output']
# model = mx.mod.Module(symbol=sym, context=ctx, label_names=None)
# # model.bind(data_shapes=[('data', (args.batch_size, 3, image_size[0], image_size[1]))], label_shapes=[('softmax_label', (args.batch_size,))])
# model.bind(data_shapes=[('data', (args.batch_size, 3, image_size[0],
# image_size[1]))])
# model.set_params(arg_params, aux_params)
# nets.append(model)
# time_now = datetime.datetime.now()
# diff = time_now - time0
# print('model loading time', diff.total_seconds())
#
# ver_list = []
# ver_name_list = []
# for name in args.target.split(','):
# path = os.path.join(args.data_dir, name + ".bin")
# if os.path.exists(path):
# print('loading.. ', name)
# data_set = load_bin(path, image_size)
# ver_list.append(data_set)
# ver_name_list.append(name)
#
# if args.mode == 0:
# for i in range(len(ver_list)):
# results = []
# for model in nets:
# acc1, std1, acc2, std2, xnorm, embeddings_list = test(
# ver_list[i], model, args.batch_size, args.nfolds)
# print('[%s]XNorm: %f' % (ver_name_list[i], xnorm))
# print('[%s]Accuracy: %1.5f+-%1.5f' % (ver_name_list[i], acc1, std1))
# print('[%s]Accuracy-Flip: %1.5f+-%1.5f' % (ver_name_list[i], acc2, std2))
# results.append(acc2)
# print('Max of [%s] is %1.5f' % (ver_name_list[i], np.max(results)))
# elif args.mode == 1:
# raise ValueError
# else:
# model = nets[0]
# dumpR(ver_list[0], model, args.batch_size, args.target)
# coding: utf-8
import os
import pickle
import matplotlib
import pandas as pd
matplotlib.use('Agg')
import matplotlib.pyplot as plt
import timeit
import sklearn
import argparse
import cv2
import numpy as np
import torch
from skimage import transform as trans
from backbones import get_model
from sklearn.metrics import roc_curve, auc
from menpo.visualize.viewmatplotlib import sample_colours_from_colourmap
from prettytable import PrettyTable
from pathlib import Path
import sys
import warnings
sys.path.insert(0, "../")
warnings.filterwarnings("ignore")
parser = argparse.ArgumentParser(description='do ijb test')
# general
parser.add_argument('--model-prefix', default='', help='path to load model.')
parser.add_argument('--image-path', default='', type=str, help='')
parser.add_argument('--result-dir', default='.', type=str, help='')
parser.add_argument('--batch-size', default=128, type=int, help='')
parser.add_argument('--network', default='iresnet50', type=str, help='')
parser.add_argument('--job', default='insightface', type=str, help='job name')
parser.add_argument('--target', default='IJBC', type=str, help='target, set to IJBC or IJBB')
args = parser.parse_args()
target = args.target
model_path = args.model_prefix
image_path = args.image_path
result_dir = args.result_dir
gpu_id = None
use_norm_score = True  # if True, TestMode(N1)
use_detector_score = True  # if True, TestMode(D1)
use_flip_test = True  # if True, TestMode(F1)
job = args.job
batch_size = args.batch_size
class Embedding(object):
def __init__(self, prefix, data_shape, batch_size=1):
image_size = (112, 112)
self.image_size = image_size
weight = torch.load(prefix)
resnet = get_model(args.network, dropout=0, fp16=False).cuda()
resnet.load_state_dict(weight)
model = torch.nn.DataParallel(resnet)
self.model = model
self.model.eval()
src = np.array([
[30.2946, 51.6963],
[65.5318, 51.5014],
[48.0252, 71.7366],
[33.5493, 92.3655],
[62.7299, 92.2041]], dtype=np.float32)
src[:, 0] += 8.0
self.src = src
self.batch_size = batch_size
self.data_shape = data_shape
def get(self, rimg, landmark):
assert landmark.shape[0] == 68 or landmark.shape[0] == 5
assert landmark.shape[1] == 2
if landmark.shape[0] == 68:
landmark5 = np.zeros((5, 2), dtype=np.float32)
landmark5[0] = (landmark[36] + landmark[39]) / 2
landmark5[1] = (landmark[42] + landmark[45]) / 2
landmark5[2] = landmark[30]
landmark5[3] = landmark[48]
landmark5[4] = landmark[54]
else:
landmark5 = landmark
tform = trans.SimilarityTransform()
tform.estimate(landmark5, self.src)
M = tform.params[0:2, :]
img = cv2.warpAffine(rimg,
M, (self.image_size[1], self.image_size[0]),
borderValue=0.0)
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
img_flip = np.fliplr(img)
img = np.transpose(img, (2, 0, 1)) # 3*112*112, RGB
img_flip = np.transpose(img_flip, (2, 0, 1))
input_blob = np.zeros((2, 3, self.image_size[1], self.image_size[0]), dtype=np.uint8)
input_blob[0] = img
input_blob[1] = img_flip
return input_blob
@torch.no_grad()
def forward_db(self, batch_data):
imgs = torch.Tensor(batch_data).cuda()
imgs.div_(255).sub_(0.5).div_(0.5)
feat = self.model(imgs)
feat = feat.reshape([self.batch_size, 2 * feat.shape[1]])
return feat.cpu().numpy()
# Split a list as evenly as possible into n parts; if n exceeds the number of
# elements, the surplus parts are empty lists.
def divideIntoNstrand(listTemp, n):
twoList = [[] for i in range(n)]
for i, e in enumerate(listTemp):
twoList[i % n].append(e)
return twoList
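# Example: divideIntoNstrand([1, 2, 3, 4, 5], 2) -> [[1, 3, 5], [2, 4]];
# divideIntoNstrand([1], 3) -> [[1], [], []].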
def read_template_media_list(path):
# ijb_meta = np.loadtxt(path, dtype=str)
ijb_meta = pd.read_csv(path, sep=' ', header=None).values
templates = ijb_meta[:, 1].astype(int)  # np.int was removed in recent NumPy
medias = ijb_meta[:, 2].astype(int)
return templates, medias
def read_template_pair_list(path):
# pairs = np.loadtxt(path, dtype=str)
pairs = pd.read_csv(path, sep=' ', header=None).values
# print(pairs.shape)
# print(pairs[:, 0].astype(np.int))
t1 = pairs[:, 0].astype(int)
t2 = pairs[:, 1].astype(int)
label = pairs[:, 2].astype(int)
return t1, t2, label
def read_image_feature(path):
with open(path, 'rb') as fid:
img_feats = pickle.load(fid)
return img_feats
def get_image_feature(img_path, files_list, model_path, epoch, gpu_id):
batch_size = args.batch_size
data_shape = (3, 112, 112)
files = files_list
print('files:', len(files))
rare_size = len(files) % batch_size
faceness_scores = []
batch = 0
img_feats = np.empty((len(files), 1024), dtype=np.float32)
batch_data = np.empty((2 * batch_size, 3, 112, 112))
embedding = Embedding(model_path, data_shape, batch_size)
for img_index, each_line in enumerate(files[:len(files) - rare_size]):
name_lmk_score = each_line.strip().split(' ')
img_name = os.path.join(img_path, name_lmk_score[0])
img = cv2.imread(img_name)
lmk = np.array([float(x) for x in name_lmk_score[1:-1]],
dtype=np.float32)
lmk = lmk.reshape((5, 2))
input_blob = embedding.get(img, lmk)
batch_data[2 * (img_index - batch * batch_size)][:] = input_blob[0]
batch_data[2 * (img_index - batch * batch_size) + 1][:] = input_blob[1]
if (img_index + 1) % batch_size == 0:
print('batch', batch)
img_feats[batch * batch_size:batch * batch_size +
batch_size][:] = embedding.forward_db(batch_data)
batch += 1
faceness_scores.append(name_lmk_score[-1])
batch_data = np.empty((2 * rare_size, 3, 112, 112))
embedding = Embedding(model_path, data_shape, rare_size)
for img_index, each_line in enumerate(files[len(files) - rare_size:]):
name_lmk_score = each_line.strip().split(' ')
img_name = os.path.join(img_path, name_lmk_score[0])
img = cv2.imread(img_name)
lmk = np.array([float(x) for x in name_lmk_score[1:-1]],
dtype=np.float32)
lmk = lmk.reshape((5, 2))
input_blob = embedding.get(img, lmk)
batch_data[2 * img_index][:] = input_blob[0]
batch_data[2 * img_index + 1][:] = input_blob[1]
if (img_index + 1) % rare_size == 0:
print('batch', batch)
img_feats[len(files) -
rare_size:][:] = embedding.forward_db(batch_data)
batch += 1
faceness_scores.append(name_lmk_score[-1])
faceness_scores = np.array(faceness_scores).astype(np.float32)
# img_feats = np.ones( (len(files), 1024), dtype=np.float32) * 0.01
# faceness_scores = np.ones( (len(files), ), dtype=np.float32 )
return img_feats, faceness_scores
def image2template_feature(img_feats=None, templates=None, medias=None):
# ==========================================================
# 1. face image feature l2 normalization. img_feats:[number_image x feats_dim]
# 2. compute media feature.
# 3. compute template feature.
# ==========================================================
unique_templates = np.unique(templates)
template_feats = np.zeros((len(unique_templates), img_feats.shape[1]))
for count_template, uqt in enumerate(unique_templates):
(ind_t,) = np.where(templates == uqt)
face_norm_feats = img_feats[ind_t]
face_medias = medias[ind_t]
unique_medias, unique_media_counts = np.unique(face_medias,
return_counts=True)
media_norm_feats = []
for u, ct in zip(unique_medias, unique_media_counts):
(ind_m,) = np.where(face_medias == u)
if ct == 1:
media_norm_feats += [face_norm_feats[ind_m]]
else: # image features from the same video will be aggregated into one feature
media_norm_feats += [
np.mean(face_norm_feats[ind_m], axis=0, keepdims=True)
]
media_norm_feats = np.array(media_norm_feats)
# media_norm_feats = media_norm_feats / np.sqrt(np.sum(media_norm_feats ** 2, -1, keepdims=True))
template_feats[count_template] = np.sum(media_norm_feats, axis=0)
if count_template % 2000 == 0:
print('Finish Calculating {} template features.'.format(
count_template))
# template_norm_feats = template_feats / np.sqrt(np.sum(template_feats ** 2, -1, keepdims=True))
template_norm_feats = sklearn.preprocessing.normalize(template_feats)
# print(template_norm_feats.shape)
return template_norm_feats, unique_templates
def verification(template_norm_feats=None,
unique_templates=None,
p1=None,
p2=None):
# ==========================================================
# Compute set-to-set Similarity Score.
# ==========================================================
template2id = np.zeros((max(unique_templates) + 1, 1), dtype=int)
for count_template, uqt in enumerate(unique_templates):
template2id[uqt] = count_template
score = np.zeros((len(p1),))  # cosine similarity score for each pair
total_pairs = np.array(range(len(p1)))
batchsize = 100000  # process pairs in small batches due to memory limitations
sublists = [
total_pairs[i:i + batchsize] for i in range(0, len(p1), batchsize)
]
total_sublists = len(sublists)
for c, s in enumerate(sublists):
feat1 = template_norm_feats[template2id[p1[s]]]
feat2 = template_norm_feats[template2id[p2[s]]]
similarity_score = np.sum(feat1 * feat2, -1)
score[s] = similarity_score.flatten()
if c % 10 == 0:
print('Finish {}/{} pairs.'.format(c, total_sublists))
return score
def verification2(template_norm_feats=None,
unique_templates=None,
p1=None,
p2=None):
template2id = np.zeros((max(unique_templates) + 1, 1), dtype=int)
for count_template, uqt in enumerate(unique_templates):
template2id[uqt] = count_template
score = np.zeros((len(p1),))  # cosine similarity score for each pair
total_pairs = np.array(range(len(p1)))
batchsize = 100000  # process pairs in small batches due to memory limitations
sublists = [
total_pairs[i:i + batchsize] for i in range(0, len(p1), batchsize)
]
total_sublists = len(sublists)
for c, s in enumerate(sublists):
feat1 = template_norm_feats[template2id[p1[s]]]
feat2 = template_norm_feats[template2id[p2[s]]]
similarity_score = np.sum(feat1 * feat2, -1)
score[s] = similarity_score.flatten()
if c % 10 == 0:
print('Finish {}/{} pairs.'.format(c, total_sublists))
return score
def read_score(path):
with open(path, 'rb') as fid:
img_feats = pickle.load(fid)
return img_feats
# Step 1: Load Meta Data
assert target == 'IJBC' or target == 'IJBB'
# =============================================================
# load image and template relationships for template feature embedding
# tid --> template id, mid --> media id
# format:
# image_name tid mid
# =============================================================
start = timeit.default_timer()
templates, medias = read_template_media_list(
os.path.join('%s/meta' % image_path,
'%s_face_tid_mid.txt' % target.lower()))
stop = timeit.default_timer()
print('Time: %.2f s. ' % (stop - start))
# =============================================================
# load template pairs for template-to-template verification
# tid : template id, label : 1/0
# format:
# tid_1 tid_2 label
# =============================================================
start = timeit.default_timer()
p1, p2, label = read_template_pair_list(
os.path.join('%s/meta' % image_path,
'%s_template_pair_label.txt' % target.lower()))
stop = timeit.default_timer()
print('Time: %.2f s. ' % (stop - start))
# Step 2: Get Image Features
# =============================================================
# load image features
# format:
# img_feats: [image_num x feats_dim] (227630, 512)
# =============================================================
start = timeit.default_timer()
img_path = '%s/loose_crop' % image_path
img_list_path = '%s/meta/%s_name_5pts_score.txt' % (image_path, target.lower())
img_list = open(img_list_path)
files = img_list.readlines()
# files_list = divideIntoNstrand(files, rank_size)
files_list = files
# img_feats
# for i in range(rank_size):
img_feats, faceness_scores = get_image_feature(img_path, files_list,
model_path, 0, gpu_id)
stop = timeit.default_timer()
print('Time: %.2f s. ' % (stop - start))
print('Feature Shape: ({} , {}) .'.format(img_feats.shape[0],
img_feats.shape[1]))
# Step 3: Get Template Features
# =============================================================
# compute template features from image features.
# =============================================================
start = timeit.default_timer()
# ==========================================================
# Norm feature before aggregation into template feature?
# Feature norm from embedding network and faceness score are able to decrease weights for noise samples (not face).
# ==========================================================
# 1. FaceScore (Feature Norm)
# 2. FaceScore (Detector)
if use_flip_test:
# concat --- F1
# img_input_feats = img_feats
# add --- F2
img_input_feats = img_feats[:, 0:img_feats.shape[1] //
2] + img_feats[:, img_feats.shape[1] // 2:]
else:
img_input_feats = img_feats[:, 0:img_feats.shape[1] // 2]
if use_norm_score:
img_input_feats = img_input_feats
else:
# normalise features to remove norm information
img_input_feats = img_input_feats / np.sqrt(
np.sum(img_input_feats ** 2, -1, keepdims=True))
if use_detector_score:
print(img_input_feats.shape, faceness_scores.shape)
img_input_feats = img_input_feats * faceness_scores[:, np.newaxis]
else:
img_input_feats = img_input_feats
template_norm_feats, unique_templates = image2template_feature(
img_input_feats, templates, medias)
stop = timeit.default_timer()
print('Time: %.2f s. ' % (stop - start))
# Step 4: Get Template Similarity Scores
# =============================================================
# compute verification scores between template pairs.
# =============================================================
start = timeit.default_timer()
score = verification(template_norm_feats, unique_templates, p1, p2)
stop = timeit.default_timer()
print('Time: %.2f s. ' % (stop - start))
save_path = os.path.join(result_dir, args.job)
# save_path = result_dir + '/%s_result' % target
if not os.path.exists(save_path):
os.makedirs(save_path)
score_save_file = os.path.join(save_path, "%s.npy" % target.lower())
np.save(score_save_file, score)
# Step 5: Get ROC Curves and TPR@FPR Table
files = [score_save_file]
methods = []
scores = []
for file in files:
methods.append(Path(file).stem)
scores.append(np.load(file))
methods = np.array(methods)
scores = dict(zip(methods, scores))
colours = dict(
zip(methods, sample_colours_from_colourmap(methods.shape[0], 'Set2')))
x_labels = [10 ** -6, 10 ** -5, 10 ** -4, 10 ** -3, 10 ** -2, 10 ** -1]
tpr_fpr_table = PrettyTable(['Methods'] + [str(x) for x in x_labels])
fig = plt.figure()
for method in methods:
fpr, tpr, _ = roc_curve(label, scores[method])
roc_auc = auc(fpr, tpr)
fpr = np.flipud(fpr)
tpr = np.flipud(tpr) # select largest tpr at same fpr
plt.plot(fpr,
tpr,
color=colours[method],
lw=1,
label=('[%s (AUC = %0.4f %%)]' %
(method.split('-')[-1], roc_auc * 100)))
tpr_fpr_row = []
tpr_fpr_row.append("%s-%s" % (method, target))
for fpr_iter in np.arange(len(x_labels)):
_, min_index = min(
list(zip(abs(fpr - x_labels[fpr_iter]), range(len(fpr)))))
tpr_fpr_row.append('%.2f' % (tpr[min_index] * 100))
tpr_fpr_table.add_row(tpr_fpr_row)
plt.xlim([10 ** -6, 0.1])
plt.ylim([0.3, 1.0])
plt.grid(linestyle='--', linewidth=1)
plt.xticks(x_labels)
plt.yticks(np.linspace(0.3, 1.0, 8, endpoint=True))
plt.xscale('log')
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('ROC on IJB')
plt.legend(loc="lower right")
fig.savefig(os.path.join(save_path, '%s.pdf' % target.lower()))
print(tpr_fpr_table)
from ptflops import get_model_complexity_info
from backbones import get_model
import argparse
if __name__ == '__main__':
parser = argparse.ArgumentParser(description='')
parser.add_argument('n', type=str, default="r100")
args = parser.parse_args()
net = get_model(args.n)
macs, params = get_model_complexity_info(
net, (3, 112, 112), as_strings=False,
print_per_layer_stat=True, verbose=True)
gmacs = macs / (1000**3)
print("%.3f GFLOPs"%gmacs)
print("%.3f Mparams"%(params/(1000**2)))
if hasattr(net, "extra_gflops"):
print("%.3f Extra-GFLOPs"%net.extra_gflops)
print("%.3f Total-GFLOPs"%(gmacs+net.extra_gflops))
import argparse
import cv2
import numpy as np
import torch
from backbones import get_model
@torch.no_grad()
def inference(weight, name, img):
if img is None:
img = np.random.randint(0, 255, size=(112, 112, 3), dtype=np.uint8)
else:
img = cv2.imread(img)
img = cv2.resize(img, (112, 112))
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
img = np.transpose(img, (2, 0, 1))
img = torch.from_numpy(img).unsqueeze(0).float()
img.div_(255).sub_(0.5).div_(0.5)
net = get_model(name, fp16=False)
net.load_state_dict(torch.load(weight))
net.eval()
feat = net(img).numpy()
print(feat)
if __name__ == "__main__":
parser = argparse.ArgumentParser(description='PyTorch ArcFace Training')
parser.add_argument('--network', type=str, default='r50', help='backbone network')
parser.add_argument('--weight', type=str, default='')
parser.add_argument('--img', type=str, default=None)
args = parser.parse_args()
inference(args.weight, args.network, args.img)
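# Usage sketch (assuming this script is saved as inference.py and a trained
# checkpoint exists at the given path):
#   python inference.py --network r50 --weight backbone.pth --img face.jpg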
import torch
import math
class CombinedMarginLoss(torch.nn.Module):
def __init__(self,
s,
m1,
m2,
m3,
interclass_filtering_threshold=0):
super().__init__()
self.s = s
self.m1 = m1
self.m2 = m2
self.m3 = m3
self.interclass_filtering_threshold = interclass_filtering_threshold
# For ArcFace
self.cos_m = math.cos(self.m2)
self.sin_m = math.sin(self.m2)
self.theta = math.cos(math.pi - self.m2)
self.sinmm = math.sin(math.pi - self.m2) * self.m2
self.easy_margin = False
def forward(self, logits, labels):
index_positive = torch.where(labels != -1)[0]
if self.interclass_filtering_threshold > 0:
with torch.no_grad():
dirty = logits > self.interclass_filtering_threshold
dirty = dirty.float()
mask = torch.ones([index_positive.size(0), logits.size(1)], device=logits.device)
mask.scatter_(1, labels[index_positive], 0)
dirty[index_positive] *= mask
tensor_mul = 1 - dirty
logits = tensor_mul * logits
target_logit = logits[index_positive, labels[index_positive].view(-1)]
if self.m1 == 1.0 and self.m3 == 0.0:
with torch.no_grad():
target_logit.arccos_()
logits.arccos_()
final_target_logit = target_logit + self.m2
logits[index_positive, labels[index_positive].view(-1)] = final_target_logit
logits.cos_()
logits = logits * self.s
elif self.m3 > 0:
final_target_logit = target_logit - self.m3
logits[index_positive, labels[index_positive].view(-1)] = final_target_logit
logits = logits * self.s
else:
raise ValueError(f"unsupported margin combination: m1={self.m1}, m2={self.m2}, m3={self.m3}")
return logits
class ArcFace(torch.nn.Module):
""" ArcFace (https://arxiv.org/pdf/1801.07698v1.pdf):
"""
def __init__(self, s=64.0, margin=0.5):
super(ArcFace, self).__init__()
self.s = s
self.margin = margin
self.cos_m = math.cos(margin)
self.sin_m = math.sin(margin)
self.theta = math.cos(math.pi - margin)
self.sinmm = math.sin(math.pi - margin) * margin
self.easy_margin = False
def forward(self, logits: torch.Tensor, labels: torch.Tensor):
index = torch.where(labels != -1)[0]
target_logit = logits[index, labels[index].view(-1)]
with torch.no_grad():
target_logit.arccos_()
logits.arccos_()
final_target_logit = target_logit + self.margin
logits[index, labels[index].view(-1)] = final_target_logit
logits.cos_()
logits = logits * self.s
return logits
class CosFace(torch.nn.Module):
def __init__(self, s=64.0, m=0.40):
super(CosFace, self).__init__()
self.s = s
self.m = m
def forward(self, logits: torch.Tensor, labels: torch.Tensor):
index = torch.where(labels != -1)[0]
target_logit = logits[index, labels[index].view(-1)]
final_target_logit = target_logit - self.m
logits[index, labels[index].view(-1)] = final_target_logit
logits = logits * self.s
return logits
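# Minimal usage sketch (toy shapes; assumes `logits` are cosine similarities in
# [-1, 1] from L2-normalized features and class centers):
#   logits = torch.rand(8, 100) * 2 - 1
#   labels = torch.randint(0, 100, (8,))
#   out = ArcFace(s=64.0, margin=0.5)(logits.clone(), labels)  # shape (8, 100)
#   out = CosFace(s=64.0, m=0.4)(logits.clone(), labels)       # shape (8, 100)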
from torch.optim.lr_scheduler import _LRScheduler
from torch.optim import SGD
import torch
import warnings
class PolynomialLRWarmup(_LRScheduler):
def __init__(self, optimizer, warmup_iters, total_iters=5, power=1.0, last_epoch=-1, verbose=False):
self.total_iters = total_iters
self.power = power
self.warmup_iters = warmup_iters
# attributes must be set before the base __init__, which triggers get_lr()
super().__init__(optimizer, last_epoch=last_epoch, verbose=verbose)
def get_lr(self):
if not self._get_lr_called_within_step:
warnings.warn("To get the last learning rate computed by the scheduler, "
"please use `get_last_lr()`.", UserWarning)
if self.last_epoch == 0 or self.last_epoch > self.total_iters:
return [group["lr"] for group in self.optimizer.param_groups]
if self.last_epoch <= self.warmup_iters:
return [base_lr * self.last_epoch / self.warmup_iters for base_lr in self.base_lrs]
else:
l = self.last_epoch
w = self.warmup_iters
t = self.total_iters
decay_factor = ((1.0 - (l - w) / (t - w)) / (1.0 - (l - 1 - w) / (t - w))) ** self.power
return [group["lr"] * decay_factor for group in self.optimizer.param_groups]
def _get_closed_form_lr(self):
if self.last_epoch <= self.warmup_iters:
return [
base_lr * self.last_epoch / self.warmup_iters for base_lr in self.base_lrs]
else:
return [
(
base_lr * (1.0 - (min(self.total_iters, self.last_epoch) - self.warmup_iters) / (self.total_iters - self.warmup_iters)) ** self.power
)
for base_lr in self.base_lrs
]
if __name__ == "__main__":
class TestModule(torch.nn.Module):
def __init__(self) -> None:
super().__init__()
self.linear = torch.nn.Linear(32, 32)
def forward(self, x):
return self.linear(x)
test_module = TestModule()
test_module_pfc = TestModule()
lr_pfc_weight = 1 / 3
base_lr = 10
total_steps = 1000
sgd = SGD([
{"params": test_module.parameters(), "lr": base_lr},
{"params": test_module_pfc.parameters(), "lr": base_lr * lr_pfc_weight}
], base_lr)
scheduler = PolynomialLRWarmup(sgd, total_steps//10, total_steps, power=2)
x = []
y = []
y_pfc = []
for i in range(total_steps):
scheduler.step()
lr = scheduler.get_last_lr()[0]
lr_pfc = scheduler.get_last_lr()[1]
x.append(i)
y.append(lr)
y_pfc.append(lr_pfc)
import matplotlib.pyplot as plt
fontsize=15
plt.figure(figsize=(6, 6))
plt.plot(x, y, linestyle='-', linewidth=2, )
plt.plot(x, y_pfc, linestyle='-', linewidth=2, )
plt.xlabel('Iterations') # x_label
plt.ylabel("Lr") # y_label
plt.savefig("tmp.png", dpi=600, bbox_inches='tight')
from __future__ import division
import datetime
import os
import os.path as osp
import glob
import numpy as np
import cv2
import sys
import onnxruntime
import onnx
import argparse
from onnx import numpy_helper
from insightface.data import get_image
class ArcFaceORT:
def __init__(self, model_path, cpu=False):
self.model_path = model_path
# providers = None will use available provider, for onnxruntime-gpu it will be "CUDAExecutionProvider"
self.providers = ['CPUExecutionProvider'] if cpu else None
# check() returns an error-message string on failure, or None on success; input_size is (w, h)
def check(self, track='cfat', test_img = None):
#default is cfat
max_model_size_mb=1024
max_feat_dim=512
max_time_cost=15
if track.startswith('ms1m'):
max_model_size_mb=1024
max_feat_dim=512
max_time_cost=10
elif track.startswith('glint'):
max_model_size_mb=1024
max_feat_dim=1024
max_time_cost=20
elif track.startswith('cfat'):
max_model_size_mb = 1024
max_feat_dim = 512
max_time_cost = 15
elif track.startswith('unconstrained'):
max_model_size_mb=1024
max_feat_dim=1024
max_time_cost=30
else:
return "track not found"
if not os.path.exists(self.model_path):
return "model_path not exists"
if not os.path.isdir(self.model_path):
return "model_path should be directory"
onnx_files = []
for _file in os.listdir(self.model_path):
if _file.endswith('.onnx'):
onnx_files.append(osp.join(self.model_path, _file))
if len(onnx_files)==0:
return "do not have onnx files"
self.model_file = sorted(onnx_files)[-1]
print('use onnx-model:', self.model_file)
try:
session = onnxruntime.InferenceSession(self.model_file, providers=self.providers)
except Exception:
return "load onnx failed"
input_cfg = session.get_inputs()[0]
input_shape = input_cfg.shape
print('input-shape:', input_shape)
if len(input_shape)!=4:
return "length of input_shape should be 4"
if not isinstance(input_shape[0], str):
#return "input_shape[0] should be str to support batch-inference"
print('reset input-shape[0] to None')
model = onnx.load(self.model_file)
model.graph.input[0].type.tensor_type.shape.dim[0].dim_param = 'None'
new_model_file = osp.join(self.model_path, 'zzzzrefined.onnx')
onnx.save(model, new_model_file)
self.model_file = new_model_file
print('use new onnx-model:', self.model_file)
try:
session = onnxruntime.InferenceSession(self.model_file, providers=self.providers)
except Exception:
return "load onnx failed"
input_cfg = session.get_inputs()[0]
input_shape = input_cfg.shape
print('new-input-shape:', input_shape)
self.image_size = tuple(input_shape[2:4][::-1])
#print('image_size:', self.image_size)
input_name = input_cfg.name
outputs = session.get_outputs()
output_names = []
for o in outputs:
output_names.append(o.name)
#print(o.name, o.shape)
if len(output_names)!=1:
return "number of output nodes should be 1"
self.session = session
self.input_name = input_name
self.output_names = output_names
#print(self.output_names)
model = onnx.load(self.model_file)
graph = model.graph
if len(graph.node)<8:
return "too small onnx graph"
input_size = (112,112)
self.crop = None
if track=='cfat':
crop_file = osp.join(self.model_path, 'crop.txt')
if osp.exists(crop_file):
lines = open(crop_file,'r').readlines()
if len(lines)!=6:
return "crop.txt should contain 6 lines"
lines = [int(x) for x in lines]
self.crop = lines[:4]
input_size = tuple(lines[4:6])
if input_size!=self.image_size:
return "input-size is inconsistant with onnx model input, %s vs %s"%(input_size, self.image_size)
self.model_size_mb = os.path.getsize(self.model_file) / float(1024*1024)
if self.model_size_mb > max_model_size_mb:
return "max model size exceed, given %.3f-MB"%self.model_size_mb
input_mean = None
input_std = None
if track=='cfat':
pn_file = osp.join(self.model_path, 'pixel_norm.txt')
if osp.exists(pn_file):
lines = open(pn_file,'r').readlines()
if len(lines)!=2:
return "pixel_norm.txt should contain 2 lines"
input_mean = float(lines[0])
input_std = float(lines[1])
if input_mean is not None or input_std is not None:
if input_mean is None or input_std is None:
return "please set input_mean and input_std simultaneously"
else:
find_sub = False
find_mul = False
for nid, node in enumerate(graph.node[:8]):
print(nid, node.name)
if node.name.startswith('Sub') or node.name.startswith('_minus'):
find_sub = True
if node.name.startswith('Mul') or node.name.startswith('_mul') or node.name.startswith('Div'):
find_mul = True
if find_sub and find_mul:
print("find sub and mul")
#mxnet arcface model
input_mean = 0.0
input_std = 1.0
else:
input_mean = 127.5
input_std = 127.5
self.input_mean = input_mean
self.input_std = input_std
for initn in graph.initializer:
weight_array = numpy_helper.to_array(initn)
dt = weight_array.dtype
if dt.itemsize<4:
return 'invalid weight type - (%s:%s)' % (initn.name, dt.name)
if test_img is None:
test_img = get_image('Tom_Hanks_54745')
test_img = cv2.resize(test_img, self.image_size)
else:
test_img = cv2.resize(test_img, self.image_size)
feat, cost = self.benchmark(test_img)
batch_result = self.check_batch(test_img)
batch_result_sum = float(np.sum(batch_result))
if batch_result_sum in [float('inf'), -float('inf')] or batch_result_sum != batch_result_sum:
print(batch_result)
print(batch_result_sum)
return "batch result output contains NaN!"
if len(feat.shape) < 2:
return "the shape of the feature must be two, but get {}".format(str(feat.shape))
if feat.shape[1] > max_feat_dim:
return "max feat dim exceed, given %d"%feat.shape[1]
self.feat_dim = feat.shape[1]
cost_ms = cost*1000
if cost_ms>max_time_cost:
return "max time cost exceed, given %.4f"%cost_ms
self.cost_ms = cost_ms
print('check stat:, model-size-mb: %.4f, feat-dim: %d, time-cost-ms: %.4f, input-mean: %.3f, input-std: %.3f'%(self.model_size_mb, self.feat_dim, self.cost_ms, self.input_mean, self.input_std))
return None
def check_batch(self, img):
if not isinstance(img, list):
imgs = [img, ] * 32
if self.crop is not None:
nimgs = []
for img in imgs:
nimg = img[self.crop[1]:self.crop[3], self.crop[0]:self.crop[2], :]
if nimg.shape[0] != self.image_size[1] or nimg.shape[1] != self.image_size[0]:
nimg = cv2.resize(nimg, self.image_size)
nimgs.append(nimg)
imgs = nimgs
blob = cv2.dnn.blobFromImages(
images=imgs, scalefactor=1.0 / self.input_std, size=self.image_size,
mean=(self.input_mean, self.input_mean, self.input_mean), swapRB=True)
net_out = self.session.run(self.output_names, {self.input_name: blob})[0]
return net_out
def meta_info(self):
return {'model-size-mb':self.model_size_mb, 'feature-dim':self.feat_dim, 'infer': self.cost_ms}
def forward(self, imgs):
if not isinstance(imgs, list):
imgs = [imgs]
input_size = self.image_size
if self.crop is not None:
nimgs = []
for img in imgs:
nimg = img[self.crop[1]:self.crop[3],self.crop[0]:self.crop[2],:]
if nimg.shape[0]!=input_size[1] or nimg.shape[1]!=input_size[0]:
nimg = cv2.resize(nimg, input_size)
nimgs.append(nimg)
imgs = nimgs
blob = cv2.dnn.blobFromImages(imgs, 1.0/self.input_std, input_size, (self.input_mean, self.input_mean, self.input_mean), swapRB=True)
net_out = self.session.run(self.output_names, {self.input_name : blob})[0]
return net_out
def benchmark(self, img):
input_size = self.image_size
if self.crop is not None:
nimg = img[self.crop[1]:self.crop[3],self.crop[0]:self.crop[2],:]
if nimg.shape[0]!=input_size[1] or nimg.shape[1]!=input_size[0]:
nimg = cv2.resize(nimg, input_size)
img = nimg
blob = cv2.dnn.blobFromImage(img, 1.0/self.input_std, input_size, (self.input_mean, self.input_mean, self.input_mean), swapRB=True)
costs = []
for _ in range(50):
ta = datetime.datetime.now()
net_out = self.session.run(self.output_names, {self.input_name : blob})[0]
tb = datetime.datetime.now()
cost = (tb-ta).total_seconds()
costs.append(cost)
costs = sorted(costs)
cost = costs[5]
return net_out, cost
if __name__ == '__main__':
parser = argparse.ArgumentParser(description='')
# general
parser.add_argument('workdir', help='submitted work dir', type=str)
parser.add_argument('--track', help='track name, for different challenge', type=str, default='cfat')
args = parser.parse_args()
handler = ArcFaceORT(args.workdir)
err = handler.check(args.track)
print('err:', err)
import argparse
import os
import pickle
import timeit
import cv2
import mxnet as mx
import numpy as np
import pandas as pd
import prettytable
import skimage.transform
import torch
from sklearn.metrics import roc_curve
from sklearn.preprocessing import normalize
from torch.utils.data import DataLoader
from onnx_helper import ArcFaceORT
SRC = np.array(
[
[30.2946, 51.6963],
[65.5318, 51.5014],
[48.0252, 71.7366],
[33.5493, 92.3655],
[62.7299, 92.2041]]
, dtype=np.float32)
SRC[:, 0] += 8.0
@torch.no_grad()
class AlignedDataSet(mx.gluon.data.Dataset):
def __init__(self, root, lines, align=True):
self.lines = lines
self.root = root
self.align = align
def __len__(self):
return len(self.lines)
def __getitem__(self, idx):
each_line = self.lines[idx]
name_lmk_score = each_line.strip().split(' ')
name = os.path.join(self.root, name_lmk_score[0])
img = cv2.cvtColor(cv2.imread(name), cv2.COLOR_BGR2RGB)
landmark5 = np.array([float(x) for x in name_lmk_score[1:-1]], dtype=np.float32).reshape((5, 2))
st = skimage.transform.SimilarityTransform()
st.estimate(landmark5, SRC)
img = cv2.warpAffine(img, st.params[0:2, :], (112, 112), borderValue=0.0)
img_1 = np.expand_dims(img, 0)
img_2 = np.expand_dims(np.fliplr(img), 0)
output = np.concatenate((img_1, img_2), axis=0).astype(np.float32)
output = np.transpose(output, (0, 3, 1, 2))
return torch.from_numpy(output)
@torch.no_grad()
def extract(model_root, dataset):
model = ArcFaceORT(model_path=model_root)
model.check()
feat_mat = np.zeros(shape=(len(dataset), 2 * model.feat_dim))
def collate_fn(data):
return torch.cat(data, dim=0)
data_loader = DataLoader(
dataset, batch_size=128, drop_last=False, num_workers=4, collate_fn=collate_fn, )
num_iter = 0
for batch in data_loader:
batch = batch.numpy()
batch = (batch - model.input_mean) / model.input_std
feat = model.session.run(model.output_names, {model.input_name: batch})[0]
feat = np.reshape(feat, (-1, model.feat_dim * 2))
feat_mat[128 * num_iter: 128 * num_iter + feat.shape[0], :] = feat
num_iter += 1
if num_iter % 50 == 0:
print(num_iter)
return feat_mat
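# Usage sketch (assumes `model_root` contains a single .onnx model that passes
# ArcFaceORT.check()):
#   dataset = AlignedDataSet(root=img_path, lines=files, align=True)
#   feats = extract("/path/to/model_root", dataset)  # (len(dataset), 2 * feat_dim)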
def read_template_media_list(path):
ijb_meta = pd.read_csv(path, sep=' ', header=None).values
templates = ijb_meta[:, 1].astype(int)  # np.int was removed in recent NumPy
medias = ijb_meta[:, 2].astype(int)
return templates, medias
def read_template_pair_list(path):
pairs = pd.read_csv(path, sep=' ', header=None).values
t1 = pairs[:, 0].astype(int)
t2 = pairs[:, 1].astype(int)
label = pairs[:, 2].astype(int)
return t1, t2, label
def read_image_feature(path):
with open(path, 'rb') as fid:
img_feats = pickle.load(fid)
return img_feats
def image2template_feature(img_feats=None,
templates=None,
medias=None):
unique_templates = np.unique(templates)
template_feats = np.zeros((len(unique_templates), img_feats.shape[1]))
for count_template, uqt in enumerate(unique_templates):
(ind_t,) = np.where(templates == uqt)
face_norm_feats = img_feats[ind_t]
face_medias = medias[ind_t]
unique_medias, unique_media_counts = np.unique(face_medias, return_counts=True)
media_norm_feats = []
for u, ct in zip(unique_medias, unique_media_counts):
(ind_m,) = np.where(face_medias == u)
if ct == 1:
media_norm_feats += [face_norm_feats[ind_m]]
else: # image features from the same video will be aggregated into one feature
media_norm_feats += [np.mean(face_norm_feats[ind_m], axis=0, keepdims=True), ]
media_norm_feats = np.array(media_norm_feats)
template_feats[count_template] = np.sum(media_norm_feats, axis=0)
if count_template % 2000 == 0:
print('Finish Calculating {} template features.'.format(
count_template))
template_norm_feats = normalize(template_feats)
return template_norm_feats, unique_templates
def verification(template_norm_feats=None,
unique_templates=None,
p1=None,
p2=None):
template2id = np.zeros((max(unique_templates) + 1, 1), dtype=int)
for count_template, uqt in enumerate(unique_templates):
template2id[uqt] = count_template
score = np.zeros((len(p1),))
total_pairs = np.array(range(len(p1)))
batchsize = 100000
sublists = [total_pairs[i: i + batchsize] for i in range(0, len(p1), batchsize)]
total_sublists = len(sublists)
for c, s in enumerate(sublists):
feat1 = template_norm_feats[template2id[p1[s]]]
feat2 = template_norm_feats[template2id[p2[s]]]
similarity_score = np.sum(feat1 * feat2, -1)
score[s] = similarity_score.flatten()
if c % 10 == 0:
print('Finish {}/{} pairs.'.format(c, total_sublists))
return score
def verification2(template_norm_feats=None,
unique_templates=None,
p1=None,
p2=None):
template2id = np.zeros((max(unique_templates) + 1, 1), dtype=int)
for count_template, uqt in enumerate(unique_templates):
template2id[uqt] = count_template
score = np.zeros((len(p1),))  # cosine similarity score for each pair
total_pairs = np.array(range(len(p1)))
batchsize = 100000  # process pairs in small batches due to memory limitations
sublists = [total_pairs[i:i + batchsize] for i in range(0, len(p1), batchsize)]
total_sublists = len(sublists)
for c, s in enumerate(sublists):
feat1 = template_norm_feats[template2id[p1[s]]]
feat2 = template_norm_feats[template2id[p2[s]]]
similarity_score = np.sum(feat1 * feat2, -1)
score[s] = similarity_score.flatten()
if c % 10 == 0:
print('Finish {}/{} pairs.'.format(c, total_sublists))
return score
def main(args):
use_norm_score = True  # if True, TestMode(N1)
use_detector_score = True  # if True, TestMode(D1)
use_flip_test = True  # if True, TestMode(F1)
assert args.target == 'IJBC' or args.target == 'IJBB'
start = timeit.default_timer()
templates, medias = read_template_media_list(
os.path.join('%s/meta' % args.image_path, '%s_face_tid_mid.txt' % args.target.lower()))
stop = timeit.default_timer()
print('Time: %.2f s. ' % (stop - start))
start = timeit.default_timer()
p1, p2, label = read_template_pair_list(
os.path.join('%s/meta' % args.image_path,
'%s_template_pair_label.txt' % args.target.lower()))
stop = timeit.default_timer()
print('Time: %.2f s. ' % (stop - start))
start = timeit.default_timer()
img_path = '%s/loose_crop' % args.image_path
img_list_path = '%s/meta/%s_name_5pts_score.txt' % (args.image_path, args.target.lower())
img_list = open(img_list_path)
files = img_list.readlines()
dataset = AlignedDataSet(root=img_path, lines=files, align=True)
img_feats = extract(args.model_root, dataset)
faceness_scores = []
for each_line in files:
name_lmk_score = each_line.split()
faceness_scores.append(name_lmk_score[-1])
faceness_scores = np.array(faceness_scores).astype(np.float32)
stop = timeit.default_timer()
print('Time: %.2f s. ' % (stop - start))
print('Feature Shape: ({} , {}) .'.format(img_feats.shape[0], img_feats.shape[1]))
start = timeit.default_timer()
if use_flip_test:
img_input_feats = img_feats[:, 0:img_feats.shape[1] // 2] + img_feats[:, img_feats.shape[1] // 2:]
else:
img_input_feats = img_feats[:, 0:img_feats.shape[1] // 2]
if use_norm_score:
img_input_feats = img_input_feats
else:
img_input_feats = img_input_feats / np.sqrt(np.sum(img_input_feats ** 2, -1, keepdims=True))
if use_detector_score:
print(img_input_feats.shape, faceness_scores.shape)
img_input_feats = img_input_feats * faceness_scores[:, np.newaxis]
else:
img_input_feats = img_input_feats
template_norm_feats, unique_templates = image2template_feature(
img_input_feats, templates, medias)
stop = timeit.default_timer()
print('Time: %.2f s. ' % (stop - start))
start = timeit.default_timer()
score = verification(template_norm_feats, unique_templates, p1, p2)
stop = timeit.default_timer()
print('Time: %.2f s. ' % (stop - start))
result_dir = args.model_root
save_path = os.path.join(result_dir, "{}_result".format(args.target))
if not os.path.exists(save_path):
os.makedirs(save_path)
score_save_file = os.path.join(save_path, "{}.npy".format(args.target))
np.save(score_save_file, score)
files = [score_save_file]
methods = []
scores = []
for file in files:
methods.append(os.path.basename(file))
scores.append(np.load(file))
methods = np.array(methods)
scores = dict(zip(methods, scores))
x_labels = [10 ** -6, 10 ** -5, 10 ** -4, 10 ** -3, 10 ** -2, 10 ** -1]
tpr_fpr_table = prettytable.PrettyTable(['Methods'] + [str(x) for x in x_labels])
for method in methods:
fpr, tpr, _ = roc_curve(label, scores[method])
fpr = np.flipud(fpr)
        tpr = np.flipud(tpr)  # select largest tpr at same fpr
tpr_fpr_row = []
tpr_fpr_row.append("%s-%s" % (method, args.target))
for fpr_iter in np.arange(len(x_labels)):
            min_index = np.argmin(np.abs(fpr - x_labels[fpr_iter]))
tpr_fpr_row.append('%.2f' % (tpr[min_index] * 100))
tpr_fpr_table.add_row(tpr_fpr_row)
print(tpr_fpr_table)
if __name__ == '__main__':
parser = argparse.ArgumentParser(description='do ijb test')
# general
parser.add_argument('--model-root', default='', help='path to load model.')
    parser.add_argument('--image-path', default='/train_tmp/IJB_release/IJBC', type=str, help='path to the IJB dataset root')
parser.add_argument('--target', default='IJBC', type=str, help='target, set to IJBC or IJBB')
main(parser.parse_args())
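A typical invocation of the evaluation script above is sketched below. The filename eval_ijbc.py and the model directory are illustrative assumptions; the flags correspond to the argparse options defined in the script.

```shell
# eval_ijbc.py is a hypothetical name for the script above
python eval_ijbc.py \
    --model-root work_dirs/ms1mv3_r50 \
    --image-path /train_tmp/IJB_release/IJBC \
    --target IJBC
```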
tensorboard
easydict
mxnet
onnx
scikit-learn
opencv-python
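Assuming the list above is saved as the repository's requirements file (requirement.txt is a guessed filename), the dependencies install with:

```shell
# requirement.txt is a hypothetical filename for the dependency list above
pip install -r requirement.txt
```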
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 torchrun --nproc_per_node=8 train_v2.py $@
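Assuming the torchrun line above is saved as a launcher script (run.sh is a guessed name), `$@` forwards any extra arguments, such as the config path, to train_v2.py:

```shell
# run.sh is a hypothetical filename; the config path is illustrative
bash run.sh configs/wf42m_pfc02_r100
```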
import argparse
import multiprocessing
import os
import time
import mxnet as mx
import numpy as np
def read_worker(args, q_in):
path_imgidx = os.path.join(args.input, "train.idx")
path_imgrec = os.path.join(args.input, "train.rec")
imgrec = mx.recordio.MXIndexedRecordIO(path_imgidx, path_imgrec, "r")
s = imgrec.read_idx(0)
header, _ = mx.recordio.unpack(s)
assert header.flag > 0
    # record 0 is a header; the image records occupy indices 1 .. header.label[0] - 1
    imgidx = np.array(range(1, int(header.label[0])))
np.random.shuffle(imgidx)
for idx in imgidx:
item = imgrec.read_idx(idx)
q_in.put(item)
q_in.put(None)
imgrec.close()
def write_worker(args, q_out):
pre_time = time.time()
if args.input[-1] == '/':
args.input = args.input[:-1]
dirname = os.path.dirname(args.input)
basename = os.path.basename(args.input)
output = os.path.join(dirname, f"shuffled_{basename}")
os.makedirs(output, exist_ok=True)
path_imgidx = os.path.join(output, "train.idx")
path_imgrec = os.path.join(output, "train.rec")
save_record = mx.recordio.MXIndexedRecordIO(path_imgidx, path_imgrec, "w")
more = True
count = 0
while more:
deq = q_out.get()
if deq is None:
more = False
else:
header, jpeg = mx.recordio.unpack(deq)
# TODO it is currently not fully developed
if isinstance(header.label, float):
label = header.label
else:
label = header.label[0]
header = mx.recordio.IRHeader(flag=header.flag, label=label, id=header.id, id2=header.id2)
save_record.write_idx(count, mx.recordio.pack(header, jpeg))
count += 1
if count % 10000 == 0:
cur_time = time.time()
print('save time:', cur_time - pre_time, ' count:', count)
pre_time = cur_time
    print('total records written:', count)
save_record.close()
def main(args):
queue = multiprocessing.Queue(10240)
read_process = multiprocessing.Process(target=read_worker, args=(args, queue))
read_process.daemon = True
read_process.start()
write_process = multiprocessing.Process(target=write_worker, args=(args, queue))
write_process.start()
write_process.join()
if __name__ == '__main__':
parser = argparse.ArgumentParser()
parser.add_argument('input', help='path to source rec.')
main(parser.parse_args())
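The script above streams records from the source .rec file in random order through a multiprocessing queue and rewrites them sequentially, producing a shuffled copy. A usage sketch, assuming it is saved as shuffle_rec.py (a guessed filename):

```shell
# reads <input>/train.rec and <input>/train.idx and writes the shuffled
# copy to a sibling directory named shuffled_<basename of input>
python shuffle_rec.py /data/WebFace42M_Root
```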
import numpy as np
import onnx
import torch
def convert_onnx(net, path_module, output, opset=11, simplify=False):
assert isinstance(net, torch.nn.Module)
    img = np.random.randint(0, 255, size=(112, 112, 3), dtype=np.int32)
    img = img.astype(np.float32)  # np.float was removed in recent NumPy versions
    img = (img / 255. - 0.5) / 0.5  # torch style norm
    img = img.transpose((2, 0, 1))
    img = torch.from_numpy(img).unsqueeze(0).float()
    weight = torch.load(path_module, map_location='cpu')  # load on CPU so export also works without a GPU
net.load_state_dict(weight, strict=True)
net.eval()
torch.onnx.export(net, img, output, input_names=["data"], keep_initializers_as_inputs=False, verbose=False, opset_version=opset)
model = onnx.load(output)
graph = model.graph
    graph.input[0].type.tensor_type.shape.dim[0].dim_param = 'None'  # make the batch dimension dynamic
if simplify:
from onnxsim import simplify
model, check = simplify(model)
assert check, "Simplified ONNX model could not be validated"
onnx.save(model, output)
if __name__ == '__main__':
import os
import argparse
from backbones import get_model
parser = argparse.ArgumentParser(description='ArcFace PyTorch to onnx')
parser.add_argument('input', type=str, help='input backbone.pth file or path')
parser.add_argument('--output', type=str, default=None, help='output onnx path')
parser.add_argument('--network', type=str, default=None, help='backbone network')
    parser.add_argument('--simplify', action='store_true', help='simplify the exported onnx model')  # type=bool would treat any non-empty string as True
args = parser.parse_args()
input_file = args.input
if os.path.isdir(input_file):
input_file = os.path.join(input_file, "model.pt")
assert os.path.exists(input_file)
# model_name = os.path.basename(os.path.dirname(input_file)).lower()
# params = model_name.split("_")
# if len(params) >= 3 and params[1] in ('arcface', 'cosface'):
# if args.network is None:
# args.network = params[2]
assert args.network is not None
print(args)
backbone_onnx = get_model(args.network, dropout=0.0, fp16=False, num_features=512)
if args.output is None:
args.output = os.path.join(os.path.dirname(args.input), "model.onnx")
convert_onnx(backbone_onnx, input_file, args.output, simplify=args.simplify)
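A conversion sketch, assuming the script above is saved as torch2onnx.py and a checkpoint exists at the path shown (both names are illustrative):

```shell
# writes model.onnx next to the checkpoint by default; r50 must be a
# backbone name known to backbones.get_model
python torch2onnx.py work_dirs/ms1mv3_r50/model.pt --network r50 --simplify
```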
import os
import sys
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from menpo.visualize.viewmatplotlib import sample_colours_from_colourmap
from prettytable import PrettyTable
from sklearn.metrics import roc_curve, auc
with open(sys.argv[1], "r") as f:
files = f.readlines()
files = [x.strip() for x in files]
image_path = "/train_tmp/IJB_release/IJBC"
def read_template_pair_list(path):
pairs = pd.read_csv(path, sep=' ', header=None).values
    t1 = pairs[:, 0].astype(int)  # np.int was removed in recent NumPy versions
    t2 = pairs[:, 1].astype(int)
    label = pairs[:, 2].astype(int)
return t1, t2, label
p1, p2, label = read_template_pair_list(
os.path.join('%s/meta' % image_path,
'%s_template_pair_label.txt' % 'ijbc'))
methods = []
scores = []
for file in files:
methods.append(file)
scores.append(np.load(file))
methods = np.array(methods)
scores = dict(zip(methods, scores))
colours = dict(
zip(methods, sample_colours_from_colourmap(methods.shape[0], 'Set2')))
x_labels = [10 ** -6, 10 ** -5, 10 ** -4, 10 ** -3, 10 ** -2, 10 ** -1]
tpr_fpr_table = PrettyTable(['Methods'] + [str(x) for x in x_labels])
fig = plt.figure()
for method in methods:
fpr, tpr, _ = roc_curve(label, scores[method])
roc_auc = auc(fpr, tpr)
fpr = np.flipud(fpr)
tpr = np.flipud(tpr) # select largest tpr at same fpr
plt.plot(fpr,
tpr,
color=colours[method],
lw=1,
label=('[%s (AUC = %0.4f %%)]' %
(method.split('-')[-1], roc_auc * 100)))
tpr_fpr_row = []
tpr_fpr_row.append(method)
for fpr_iter in np.arange(len(x_labels)):
        min_index = np.argmin(np.abs(fpr - x_labels[fpr_iter]))
tpr_fpr_row.append('%.2f' % (tpr[min_index] * 100))
tpr_fpr_table.add_row(tpr_fpr_row)
plt.xlim([10 ** -6, 0.1])
plt.ylim([0.3, 1.0])
plt.grid(linestyle='--', linewidth=1)
plt.xticks(x_labels)
plt.yticks(np.linspace(0.3, 1.0, 8, endpoint=True))
plt.xscale('log')
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('ROC on IJB')
plt.legend(loc="lower right")
fig.savefig('ijb_roc.pdf')  # the script otherwise discards the figure; the filename is arbitrary
print(tpr_fpr_table)
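The plotting script expects a single argument: a text file listing one saved .npy score file per line (for example, the scores written by the evaluation script earlier in this commit). A usage sketch, assuming the script is saved as plot_roc.py:

```shell
# score_files.txt lists one .npy score file per line
echo work_dirs/ms1mv3_r50/IJBC_result/IJBC.npy > score_files.txt
python plot_roc.py score_files.txt
```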