Commit e2666607 authored by mashun

particlenet
tf-keras/converted
tf-keras/original
*pycache*
FROM image.sourcefind.cn:5000/dcu/admin/base/tensorflow:2.13.1-py3.10-dtk24.04.3-ubuntu20.04
MIT License
Copyright (c) 2019 Huilin Qu
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
# ParticleNet
## Paper
`Jet Tagging via Particle Clouds`
* https://arxiv.org/pdf/1902.08570
## Model Architecture
This project uses ParticleNet(-Lite), which is built around the EdgeConv operation. EdgeConv exploits the local spatial structure of the particle cloud while preserving permutation invariance. The model stacks several EdgeConv blocks, each with a different number of neighbors and channel counts, to learn features at different scales.
<img src="readme_imgs/arch.png" style="zoom:70%">
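For reference, the default EdgeConv and fully-connected settings defined in `tf-keras/tf_keras_model.py` are listed below (copied from `get_particle_net` / `get_particle_net_lite`; the dict names here are only for illustration):

```python
# Defaults from tf-keras/tf_keras_model.py.
# conv_params: one (K, (C1, C2, C3)) tuple per EdgeConv block; fc_params: (units, dropout_rate).
PARTICLE_NET = {
    'conv_params': [(16, (64, 64, 64)),
                    (16, (128, 128, 128)),
                    (16, (256, 256, 256))],
    'conv_pooling': 'average',
    'fc_params': [(256, 0.1)],
}
PARTICLE_NET_LITE = {
    'conv_params': [(7, (32, 32, 32)),
                    (7, (64, 64, 64))],
    'conv_pooling': 'average',
    'fc_params': [(128, 0.1)],
}
```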
## Algorithm
The algorithm treats a jet as a "particle cloud", i.e. an unordered set of particles. Using EdgeConv operations together with dynamic graph updates, it effectively extracts the local spatial structure and features of the particle cloud, enabling accurate jet classification.
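To make the EdgeConv idea concrete, here is a minimal NumPy sketch (illustrative only; these function names are not part of the project) of the neighbor lookup and edge-feature construction performed inside each EdgeConv block: pairwise distances in the point space, k nearest neighbors per particle, then concatenation of each center feature with the neighbor-minus-center differences.

```python
import numpy as np

def knn_indices(points, k):
    # points: (N, P, C) coordinates, e.g. (eta, phi); returns (N, P, k) neighbor indices
    r = np.sum(points * points, axis=-1, keepdims=True)                      # (N, P, 1)
    d = r - 2.0 * points @ points.transpose(0, 2, 1) + r.transpose(0, 2, 1)  # squared distances (N, P, P)
    return np.argsort(d, axis=-1)[:, :, 1:k + 1]                             # drop self at index 0

def edge_features(features, idx):
    # features: (N, P, C), idx: (N, P, k) -> (N, P, k, 2*C) EdgeConv input
    nbr = np.take_along_axis(features[:, None, :, :], idx[..., None], axis=2)  # (N, P, k, C)
    center = np.repeat(features[:, :, None, :], idx.shape[-1], axis=2)         # (N, P, k, C)
    return np.concatenate([center, nbr - center], axis=-1)
```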
## Environment Setup
Note: all files are located under the `tf-keras` directory.
### Docker (Option 1)
docker pull image.sourcefind.cn:5000/dcu/admin/base/tensorflow:2.13.1-py3.10-dtk24.04.3-ubuntu20.04
docker run --shm-size 50g --network=host --name=particlenet --privileged --device=/dev/kfd --device=/dev/dri --group-add video --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -v <absolute path to this project>:/home/ -v /opt/hyhal:/opt/hyhal:ro -it <your IMAGE ID> bash
pip install -r requirements.txt
### Dockerfile (Option 2)
docker build -t <IMAGE_NAME>:<TAG> .
docker run --shm-size 50g --network=host --name=particlenet --privileged --device=/dev/kfd --device=/dev/dri --group-add video --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -v <absolute path to this project>:/home/ -v /opt/hyhal:/opt/hyhal:ro -it <your IMAGE ID> bash
pip install -r requirements.txt
## Dataset
Available only under the `tf-keras` directory.
[zenodo](https://zenodo.org/records/2603256) | [SCNet high-speed download channel]()
Data layout:
```
original/
├── test.h5
├── train.h5
└── val.h5
```
Run `convert_dataset.ipynb` to preprocess the data; after conversion:
```
converted/
├── test_file_0.awkd
├── train_file_0.awkd
└── val_file_0.awkd
```
Tip: start the notebook server with `jupyter notebook --no-browser --ip=0.0.0.0 --allow-root`.
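To sanity-check the converted files, they can be opened with the awkward 0.x API that the training notebook also uses (a minimal sketch; field names follow `convert_dataset.ipynb`):

```python
import awkward  # awkward==0.14.0, as pinned in requirements.txt

with awkward.load('converted/train_file_0.awkd') as a:
    print(a['label'].shape)             # (n_jets, 2) one-hot labels
    print(a['part_etarel'].counts[:5])  # particle multiplicity of the first few jets
```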
## Training
`keras_train.ipynb`
## Inference
`keras_train.ipynb`
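For standalone inference, one option is to rebuild the model and reload a checkpoint written to `model_checkpoints/` during training (a sketch; the checkpoint epoch number is illustrative, and the dummy inputs should be replaced by a `Dataset` built from `converted/test_file_0.awkd` as in `keras_train.ipynb`):

```python
import numpy as np
from tf_keras_model import get_particle_net_lite

# Shapes follow the notebook defaults: pad_len=100, data_format='channel_last'.
input_shapes = {'points': (100, 2), 'features': (100, 4), 'mask': (100, 1)}
model = get_particle_net_lite(num_classes=2, input_shapes=input_shapes)
model.load_weights('model_checkpoints/particle_net_lite_model.030.h5')  # illustrative file name

# Replace these zero arrays with the padded test arrays produced by the Dataset class.
X = {k: np.zeros((8,) + shape, dtype='float32') for k, shape in input_shapes.items()}
scores = model.predict(X, batch_size=8)  # (8, 2) softmax scores
```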
## Results
### Accuracy
All results were obtained by training with the project's default parameters.
|Device|val_acc|
|:---:|:---:|
|k100ai|0.9229|
|gpu|0.9145|
## Applications
### Algorithm Category
`ai for science`
### Target Industries
`high-energy physics, healthcare, finance`
## Source Repository and Issue Reporting
* https://developer.sourcefind.cn/codes/modelzoo/particlenet_tensorflow
## References
* https://github.com/hqucms/ParticleNet
# ParticleNet
Implementation of the jet classification network in [ParticleNet: Jet Tagging via Particle Clouds](https://arxiv.org/abs/1902.08570).
------
**MXNet implementation**
- [model](mxnet/particle_net.py)
**[New] Keras/TensorFlow implementation**
- [model](tf-keras/tf_keras_model.py)
- Requires tensorflow>=2.0.0 or >=1.15rc2.
- A full training example is available in [tf-keras/keras_train.ipynb](tf-keras/keras_train.ipynb).
- The top tagging dataset can be obtained from [https://zenodo.org/record/2603256](https://zenodo.org/record/2603256) and converted with this [script](tf-keras/convert_dataset.ipynb).
## How to use the model
#### MXNet model
The ParticleNet model can be obtained by calling the `get_particle_net` function in [particle_net.py](mxnet/particle_net.py), which can return either an MXNet `Symbol` or an MXNet Gluon `HybridBlock`. The model takes three input arrays:
- `points`: the coordinates of the particles in the (eta, phi) space. It should be an array with a shape of (N, 2, P), where N is the batch size and P is the number of particles.
- `features`: the features of the particles. It should be an array with a shape of (N, C, P), where N is the batch size, C is the number of features, and P is the number of particles.
- `mask`: a mask array with a shape of (N, 1, P), taking a value of 0 for padded positions.
To have a simple implementation for batched training on GPUs, we use fixed-length input arrays for all the inputs, although in principle the ParticleNet architecture can handle a variable number of particles in each jet. Zero-padding is used for the `points` and `features` inputs so that they always have the same length, and a `mask` array indicates whether a position is occupied by a real particle or by a zero-padded value.
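For concreteness, a small NumPy sketch of what these fixed-length, zero-padded inputs look like (illustrative values; `P` is the padded length):

```python
import numpy as np

N, C, P = 2, 4, 5                      # batch size, feature channels, padded length
n_particles = [3, 5]                   # true multiplicity of each jet

points   = np.zeros((N, 2, P), dtype='float32')  # (eta, phi) of each particle
features = np.zeros((N, C, P), dtype='float32')
mask     = np.zeros((N, 1, P), dtype='float32')  # 1 for real particles, 0 for padding
for i, n in enumerate(n_particles):
    points[i, :, :n]   = np.random.randn(2, n)
    features[i, :, :n] = np.random.randn(C, n)
    mask[i, 0, :n]     = 1.0
```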
The implementation of a simplified model, ParticleNet-Lite, is also provided and can be accessed with the `get_particle_net_lite` function.
#### Keras/TensorFlow model
The use of the Keras/TensorFlow model is similar to the MXNet model. A full training example is available in [tf-keras/keras_train.ipynb](tf-keras/keras_train.ipynb).
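A condensed version of what the notebook does (shapes follow the converted top-tagging dataset with `data_format='channel_last'` and `pad_len=100`; `num_classes=2`):

```python
from tensorflow import keras
from tf_keras_model import get_particle_net_lite

input_shapes = {'points': (100, 2), 'features': (100, 4), 'mask': (100, 1)}
model = get_particle_net_lite(num_classes=2, input_shapes=input_shapes)
model.compile(loss='categorical_crossentropy',
              optimizer=keras.optimizers.Adam(learning_rate=1e-3),
              metrics=['accuracy'])
model.summary()
# model.fit(train_dataset.X, train_dataset.y,
#           validation_data=(val_dataset.X, val_dataset.y),
#           batch_size=1024, epochs=30, callbacks=[...])
```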
## Citation
If you use ParticleNet in your research, please cite the paper:
@article{Qu:2019gqs,
author = "Qu, Huilin and Gouskos, Loukas",
title = "{ParticleNet: Jet Tagging via Particle Clouds}",
year = "2019",
eprint = "1902.08570",
archivePrefix = "arXiv",
primaryClass = "hep-ph",
SLACcitation = "%%CITATION = ARXIV:1902.08570;%%"
}
## Acknowledgement
The ParticleNet model is developed based on the [Dynamic Graph CNN](https://arxiv.org/abs/1801.07829) model. The implementation of the EdgeConv operation in MXNet is adapted from the author's TensorFlow [implementation](https://github.com/WangYueFt/dgcnn), and also inspired by the MXNet [implementation](https://github.com/chinakook/PointCNN.MX) of PointCNN.
# Model code
modelCode=1089
# Model name
modelName=particlenet_tensorflow
# Model description
modelDescription=Accurate jet classification
# Application scenarios
appScenario=training,inference,ai for science,high-energy physics,healthcare,finance
# Framework type
frameType=tensorflow
import mxnet as mx
import mxnet.gluon.nn as nn
def get_shape(x):
if isinstance(x, mx.nd.NDArray):
return x.shape
elif isinstance(x, mx.symbol.Symbol):
_, x_shape, _ = x.infer_shape_partial()
return x_shape[0]
class Dense(nn.HybridBlock):
def __init__(self, output, drop_rate=0, activation='relu'):
super(Dense, self).__init__()
self.net = nn.Dense(units=output, flatten=False)
if activation is None:
self.act = None
else:
self.act = nn.Activation(activation)
self.drop = nn.Dropout(drop_rate) if drop_rate > 0 else None
def hybrid_forward(self, F, x):
x = self.net(x)
if self.act is not None:
x = self.act(x)
if self.drop is not None:
x = self.drop(x)
return x
class BatchDistanceMatrix(nn.HybridBlock):
def __init__(self):
super(BatchDistanceMatrix, self).__init__()
def hybrid_forward(self, F, A, B):
# A shape is (N, C, P_A), B shape is (N, C, P_B)
# D shape is (N, P_A, P_B)
r_A = F.sum(A * A, axis=1, keepdims=True) # (N, 1, P_A)
r_B = F.sum(B * B, axis=1, keepdims=True) # (N, 1, P_B)
m = F.batch_dot(F.transpose(A, axes=(0, 2, 1)), B) # (N, P_A, P_B)
D = F.broadcast_add(F.broadcast_sub(F.transpose(r_A, axes=(0, 2, 1)), 2 * m), r_B)
return D
class NearestNeighborsFromIndices(nn.HybridBlock):
def __init__(self, K, cpu_mode=False):
super(NearestNeighborsFromIndices, self).__init__()
self.K = K
self.cpu_mode = cpu_mode
def hybrid_forward(self, F, topk_indices, features):
# topk_indices: (N, P, K)
# features: (N, C, P)
queries_shape = get_shape(features)
batch_size = queries_shape[0]
channel_num = queries_shape[1]
point_num = queries_shape[2]
if self.cpu_mode:
# this gives a speed-up of ~2x for CPU inference
features = F.transpose(features, (0, 2, 1)) # (N, P, C)
point_indices = topk_indices # (N, P, K)
batch_indices = F.tile(F.reshape(F.arange(batch_size), (-1, 1, 1)), (1, point_num, self.K)) # (N, P, K)
indices = F.concat(batch_indices.expand_dims(0), point_indices.expand_dims(0), dim=0) # (2, N, P, K)
nn_fts = F.gather_nd(features, indices) # (N, P, K, C)
return F.transpose(nn_fts, (0, 3, 1, 2)) # (N, C, P, K)
else:
point_indices = topk_indices.expand_dims(axis=1).tile((1, channel_num, 1, 1)) # (N, C, P, K)
batch_indices = F.tile(F.reshape(F.arange(batch_size), (-1, 1, 1, 1)), (1, channel_num, point_num, self.K)) # (N, C, P, K)
channel_indices = F.tile(F.reshape(F.arange(channel_num), (1, -1, 1, 1)), (batch_size, 1, point_num, self.K)) # (N, C, P, K)
indices = F.concat(batch_indices.expand_dims(0), channel_indices.expand_dims(0), point_indices.expand_dims(0), dim=0) # (3, N, C, P, K)
return F.gather_nd(features, indices)
class EdgeConv(nn.HybridBlock):
def __init__(self, K, channels, in_channels=0, with_bn=True, activation='relu', pooling='average', cpu_mode=False):
"""EdgeConv
Args:
K: int, number of neighbors
in_channels: # of input channels
channels: tuple of output channels
pooling: pooling method ('max' or 'average')
Inputs:
points: (N, C_p, P)
features: (N, C_0, P)
Returns:
            transformed features: (N, C_out, P), C_out = channels[-1]
"""
super(EdgeConv, self).__init__()
self.K = K
self.pooling = pooling
if self.pooling not in ('max', 'average'):
raise RuntimeError('Pooling method should be "max" or "average"')
with self.name_scope():
self.batch_distance_matrix = BatchDistanceMatrix()
self.knn = NearestNeighborsFromIndices(K, cpu_mode=cpu_mode)
self.convs = []
self.bns = []
self.acts = []
for idx, C in enumerate(channels):
self.convs.append(nn.Conv2D(channels=C, kernel_size=(1, 1), strides=(1, 1), use_bias=False if with_bn else True, in_channels=2 * in_channels if idx == 0 else channels[idx - 1], weight_initializer=mx.init.Xavier(rnd_type='gaussian', factor_type="in", magnitude=2)))
self.register_child(self.convs[-1], 'conv_%d' % idx)
self.bns.append(nn.BatchNorm() if with_bn else None)
self.register_child(self.bns[-1], 'bn_%d' % idx)
self.acts.append(nn.Activation(activation) if activation else None)
self.register_child(self.acts[-1], 'act_%d' % idx)
if channels[-1] == in_channels:
self.sc_conv = None
else:
self.sc_conv = nn.Conv1D(channels=channels[-1], kernel_size=1, strides=1, use_bias=False if with_bn else True, in_channels=in_channels, weight_initializer=mx.init.Xavier(rnd_type='gaussian', factor_type="in", magnitude=2))
self.sc_bn = nn.BatchNorm() if with_bn else None
self.sc_act = nn.Activation(activation) if activation else None
def hybrid_forward(self, F, points, features):
# points: (N, C_p, P)
# features: (N, C_0, P)
# distances
D = self.batch_distance_matrix(points, points) # (N, P, P)
indices = F.topk(D, axis=-1, k=self.K + 1, ret_typ='indices', is_ascend=True, dtype='float32') # (N, P, K+1)
indices = F.slice_axis(indices, axis=-1, begin=1, end=None) # (N, P, K)
fts = features
knn_fts = self.knn(indices, fts) # (N, C, P, K)
knn_fts_center = F.tile(F.expand_dims(fts, axis=3), (1, 1, 1, self.K)) # (N, C, P, K)
knn_fts = F.concat(knn_fts_center, knn_fts - knn_fts_center, dim=1) # (N, C, P, K)
# conv
x = knn_fts
for conv, bn, act in zip(self.convs, self.bns, self.acts):
x = conv(x) # (N, C', P, K)
if bn:
x = bn(x)
if act:
x = act(x)
if self.pooling == 'max':
fts = F.max(x, axis=-1) # (N, C, P)
else:
fts = F.mean(x, axis=-1) # (N, C, P)
# shortcut
if self.sc_conv:
sc = self.sc_conv(features) # (N, C_out, P)
if self.sc_bn:
sc = self.sc_bn(sc)
else:
sc = features
if self.sc_act:
return self.sc_act(sc + fts) # (N, C_out, P)
else:
return sc + fts
class ParticleNet(nn.HybridBlock):
def __init__(self, setting, **kwargs):
super(ParticleNet, self).__init__(**kwargs)
self.conv_params = setting.conv_params
self.conv_pooling = setting.conv_pooling
self.fc_params = setting.fc_params
self.num_class = setting.num_class
with self.name_scope():
self.bn_fts = nn.BatchNorm()
self.xconvs = nn.HybridSequential()
for layer_idx, layer_param in enumerate(self.conv_params):
K, channels = layer_param
if layer_idx == 0:
in_channels = 0
else:
in_channels = self.conv_params[layer_idx - 1][1][-1]
xc = EdgeConv(K, channels, with_bn=True, activation='relu', pooling=self.conv_pooling, in_channels=in_channels, cpu_mode=getattr(setting, 'cpu_mode', False))
self.xconvs.add(xc)
if self.fc_params is not None:
self.fcs = nn.HybridSequential()
for layer_idx, layer_param in enumerate(self.fc_params):
channel_num, drop_rate = layer_param
self.fcs.add(Dense(channel_num, drop_rate))
self.fcs.add(Dense(self.num_class, activation=None))
def hybrid_forward(self, F, points, features=None, mask=None):
# points : (N, C_coord, P)
# features: (N, C_features, P)
# mask: (N, 1, P)
if mask is not None:
mask = (mask != 0) # 1 if valid
coord_shift = (mask == 0) * 99. # 99 if non-valid
fts = self.bn_fts(features)
for layer_idx, layer_param in enumerate(self.conv_params):
pts = F.broadcast_add(coord_shift, points) if layer_idx == 0 else F.broadcast_add(coord_shift, fts)
fts = self.xconvs[layer_idx](pts, fts)
if mask is not None:
fts = F.broadcast_mul(fts, mask)
pool = F.mean(fts, axis=-1) # (N, C)
if self.fc_params is not None:
logits = self.fcs(pool) # (N, num_classes)
return logits
else:
return pool
class _DotDict:
pass
def _split_batch_size(shape, n):
return (shape[0] // n,) + shape[1:]
def get_particle_net(num_classes, input_shapes=None, n_gpus=0, return_symbol=True):
r"""ParticleNet model from `"ParticleNet: Jet Tagging via Particle Clouds"
<https://arxiv.org/abs/1902.08570>`_ paper.
Parameters
----------
num_classes : int
Number of output classes.
input_shapes : dict
The shapes of each input (`points`, `features`, `mask`).
n_gpus : int, default 0
Number of GPUs used in the training; for CPU inference, set to 0.
return_symbol : bool, default True
Return a mxnet Symbol if set to True. Otherwise return a mxnet gluon HybridBlock.
"""
setting = _DotDict()
setting.num_class = num_classes
# conv_params: list of tuple in the format (K, (C1, C2, C3))
setting.conv_params = [
(16, (64, 64, 64)),
(16, (128, 128, 128)),
(16, (256, 256, 256)),
]
# conv_pooling: 'average' or 'max'
setting.conv_pooling = 'average'
# fc_params: list of tuples in the format (C, drop_rate)
setting.fc_params = [(256, 0.1)]
# cpu_mode: if running in the CPU inference mode
setting.cpu_mode = (n_gpus < 1)
net = ParticleNet(setting, prefix="ParticleNet_")
if not return_symbol:
return net
else:
net.hybridize()
n_devs = max(1, n_gpus)
points = mx.sym.var('points', shape=_split_batch_size(input_shapes['points'], n_devs))
features = mx.sym.var('features', shape=_split_batch_size(input_shapes['features'], n_devs))
mask = mx.sym.var('mask', shape=_split_batch_size(input_shapes['mask'], n_devs))
sym = net(points, features, mask)
softmax = mx.sym.SoftmaxOutput(data=sym, name='softmax')
return softmax
def get_particle_net_lite(num_classes, input_shapes=None, n_gpus=0, return_symbol=True):
r"""ParticleNet-Lite model from `"ParticleNet: Jet Tagging via Particle Clouds"
<https://arxiv.org/abs/1902.08570>`_ paper.
Parameters
----------
num_classes : int
Number of output classes.
input_shapes : dict
The shapes of each input (`points`, `features`, `mask`).
n_gpus : int, default 0
Number of GPUs used in the training; for CPU inference, set to 0.
return_symbol : bool, default True
Return a mxnet Symbol if set to True. Otherwise return a mxnet gluon HybridBlock.
"""
setting = _DotDict()
setting.num_class = num_classes
# conv_params: list of tuple in the format (K, (C1, C2, C3))
setting.conv_params = [
(7, (32, 32, 32)),
(7, (64, 64, 64)),
]
# conv_pooling: 'average' or 'max'
setting.conv_pooling = 'average'
# fc_params: list of tuples in the format (C, drop_rate)
setting.fc_params = [(128, 0.1)]
# cpu_mode: if running in the CPU inference mode
setting.cpu_mode = (n_gpus < 1)
net = ParticleNet(setting, prefix="ParticleNet_")
if not return_symbol:
return net
else:
net.hybridize()
n_devs = max(1, n_gpus)
points = mx.sym.var('points', shape=_split_batch_size(input_shapes['points'], n_devs))
features = mx.sym.var('features', shape=_split_batch_size(input_shapes['features'], n_devs))
mask = mx.sym.var('mask', shape=_split_batch_size(input_shapes['mask'], n_devs))
sym = net(points, features, mask)
softmax = mx.sym.SoftmaxOutput(data=sym, name='softmax')
return softmax
{
"cells": [
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"import pandas as pd\n",
"import numpy as np\n",
"import awkward\n",
"import uproot_methods"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import logging\n",
"logging.basicConfig(level=logging.DEBUG, format='[%(asctime)s] %(levelname)s: %(message)s')"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def _transform(dataframe, start=0, stop=-1, jet_size=0.8):\n",
" from collections import OrderedDict\n",
" v = OrderedDict()\n",
"\n",
" df = dataframe.iloc[start:stop]\n",
" def _col_list(prefix, max_particles=200):\n",
" return ['%s_%d'%(prefix,i) for i in range(max_particles)]\n",
" \n",
" _px = df[_col_list('PX')].values\n",
" _py = df[_col_list('PY')].values\n",
" _pz = df[_col_list('PZ')].values\n",
" _e = df[_col_list('E')].values\n",
" \n",
" mask = _e>0\n",
" n_particles = np.sum(mask, axis=1)\n",
"\n",
" px = awkward.JaggedArray.fromcounts(n_particles, _px[mask])\n",
" py = awkward.JaggedArray.fromcounts(n_particles, _py[mask])\n",
" pz = awkward.JaggedArray.fromcounts(n_particles, _pz[mask])\n",
" energy = awkward.JaggedArray.fromcounts(n_particles, _e[mask])\n",
"\n",
" p4 = uproot_methods.TLorentzVectorArray.from_cartesian(px, py, pz, energy)\n",
" pt = p4.pt\n",
"\n",
" jet_p4 = p4.sum()\n",
"\n",
" # outputs\n",
" _label = df['is_signal_new'].values\n",
" v['label'] = np.stack((_label, 1-_label), axis=-1)\n",
" v['train_val_test'] = df['ttv'].values\n",
" \n",
" v['jet_pt'] = jet_p4.pt\n",
" v['jet_eta'] = jet_p4.eta\n",
" v['jet_phi'] = jet_p4.phi\n",
" v['jet_mass'] = jet_p4.mass\n",
" v['n_parts'] = n_particles\n",
"\n",
" v['part_px'] = px\n",
" v['part_py'] = py\n",
" v['part_pz'] = pz\n",
" v['part_energy'] = energy\n",
"\n",
" v['part_pt_log'] = np.log(pt)\n",
" v['part_ptrel'] = pt/v['jet_pt']\n",
" v['part_logptrel'] = np.log(v['part_ptrel'])\n",
"\n",
" v['part_e_log'] = np.log(energy)\n",
" v['part_erel'] = energy/jet_p4.energy\n",
" v['part_logerel'] = np.log(v['part_erel'])\n",
"\n",
" v['part_raw_etarel'] = (p4.eta - v['jet_eta'])\n",
" _jet_etasign = np.sign(v['jet_eta'])\n",
" _jet_etasign[_jet_etasign==0] = 1\n",
" v['part_etarel'] = v['part_raw_etarel'] * _jet_etasign\n",
"\n",
" v['part_phirel'] = p4.delta_phi(jet_p4)\n",
" v['part_deltaR'] = np.hypot(v['part_etarel'], v['part_phirel'])\n",
"\n",
" def _make_image(var_img, rec, n_pixels = 64, img_ranges = [[-0.8, 0.8], [-0.8, 0.8]]):\n",
" wgt = rec[var_img]\n",
" x = rec['part_etarel']\n",
" y = rec['part_phirel']\n",
" img = np.zeros(shape=(len(wgt), n_pixels, n_pixels))\n",
" for i in range(len(wgt)):\n",
" hist2d, xedges, yedges = np.histogram2d(x[i], y[i], bins=[n_pixels, n_pixels], range=img_ranges, weights=wgt[i])\n",
" img[i] = hist2d\n",
" return img\n",
"\n",
"# v['img'] = _make_image('part_ptrel', v)\n",
"\n",
" return v"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def convert(source, destdir, basename, step=None, limit=None):\n",
" df = pd.read_hdf(source, key='table')\n",
" logging.info('Total events: %s' % str(df.shape[0]))\n",
" if limit is not None:\n",
" df = df.iloc[0:limit]\n",
" logging.info('Restricting to the first %s events:' % str(df.shape[0]))\n",
" if step is None:\n",
" step = df.shape[0]\n",
" idx=-1\n",
" while True:\n",
" idx+=1\n",
" start=idx*step\n",
" if start>=df.shape[0]: break\n",
" if not os.path.exists(destdir):\n",
" os.makedirs(destdir)\n",
" output = os.path.join(destdir, '%s_%d.awkd'%(basename, idx))\n",
" logging.info(output)\n",
" if os.path.exists(output):\n",
" logging.warning('... file already exist: continue ...')\n",
" continue\n",
" v=_transform(df, start=start, stop=start+step)\n",
" awkward.save(output, v, mode='x')"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"srcDir = 'original'\n",
"destDir = 'converted'"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# conver training file\n",
"convert(os.path.join(srcDir, 'train.h5'), destdir=destDir, basename='train_file')"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# conver validation file\n",
"convert(os.path.join(srcDir, 'val.h5'), destdir=destDir, basename='val_file')"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# conver testing file\n",
"convert(os.path.join(srcDir, 'test.h5'), destdir=destDir, basename='test_file')"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
{
"cells": [
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"import awkward"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import logging\n",
"logging.basicConfig(level=logging.INFO, format='[%(asctime)s] %(levelname)s: %(message)s')"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def stack_arrays(a, keys, axis=-1):\n",
" flat_arr = np.stack([a[k].flatten() for k in keys], axis=axis)\n",
" return awkward.JaggedArray.fromcounts(a[keys[0]].counts, flat_arr)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def pad_array(a, maxlen, value=0., dtype='float32'):\n",
" x = (np.ones((len(a), maxlen)) * value).astype(dtype)\n",
" for idx, s in enumerate(a):\n",
" if not len(s):\n",
" continue\n",
" trunc = s[:maxlen].astype(dtype)\n",
" x[idx, :len(trunc)] = trunc\n",
" return x"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"class Dataset(object):\n",
"\n",
" def __init__(self, filepath, feature_dict = {}, label='label', pad_len=100, data_format='channel_first'):\n",
" self.filepath = filepath\n",
" self.feature_dict = feature_dict\n",
" if len(feature_dict)==0:\n",
" feature_dict['points'] = ['part_etarel', 'part_phirel']\n",
" feature_dict['features'] = ['part_pt_log', 'part_e_log', 'part_etarel', 'part_phirel']\n",
" feature_dict['mask'] = ['part_pt_log']\n",
" self.label = label\n",
" self.pad_len = pad_len\n",
" assert data_format in ('channel_first', 'channel_last')\n",
" self.stack_axis = 1 if data_format=='channel_first' else -1\n",
" self._values = {}\n",
" self._label = None\n",
" self._load()\n",
"\n",
" def _load(self):\n",
" logging.info('Start loading file %s' % self.filepath)\n",
" counts = None\n",
" with awkward.load(self.filepath) as a:\n",
" self._label = a[self.label]\n",
" for k in self.feature_dict:\n",
" cols = self.feature_dict[k]\n",
" if not isinstance(cols, (list, tuple)):\n",
" cols = [cols]\n",
" arrs = []\n",
" for col in cols:\n",
" if counts is None:\n",
" counts = a[col].counts\n",
" else:\n",
" assert np.array_equal(counts, a[col].counts)\n",
" arrs.append(pad_array(a[col], self.pad_len))\n",
" self._values[k] = np.stack(arrs, axis=self.stack_axis)\n",
" logging.info('Finished loading file %s' % self.filepath)\n",
"\n",
"\n",
" def __len__(self):\n",
" return len(self._label)\n",
"\n",
" def __getitem__(self, key):\n",
" if key==self.label:\n",
" return self._label\n",
" else:\n",
" return self._values[key]\n",
" \n",
" @property\n",
" def X(self):\n",
" return self._values\n",
" \n",
" @property\n",
" def y(self):\n",
" return self._label\n",
"\n",
" def shuffle(self, seed=None):\n",
" if seed is not None:\n",
" np.random.seed(seed)\n",
" shuffle_indices = np.arange(self.__len__())\n",
" np.random.shuffle(shuffle_indices)\n",
" for k in self._values:\n",
" self._values[k] = self._values[k][shuffle_indices]\n",
" self._label = self._label[shuffle_indices]"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"train_dataset = Dataset('converted/train_file_0.awkd', data_format='channel_last')\n",
"val_dataset = Dataset('converted/val_file_0.awkd', data_format='channel_last')"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import tensorflow as tf\n",
"from tensorflow import keras\n",
"from tf_keras_model import get_particle_net, get_particle_net_lite"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"scrolled": true
},
"outputs": [],
"source": [
"model_type = 'particle_net_lite' # choose between 'particle_net' and 'particle_net_lite'\n",
"num_classes = train_dataset.y.shape[1]\n",
"input_shapes = {k:train_dataset[k].shape[1:] for k in train_dataset.X}\n",
"if 'lite' in model_type:\n",
" model = get_particle_net_lite(num_classes, input_shapes)\n",
"else:\n",
" model = get_particle_net(num_classes, input_shapes)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Training parameters\n",
"batch_size = 1024 if 'lite' in model_type else 384\n",
"epochs = 30"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def lr_schedule(epoch):\n",
" lr = 1e-3\n",
" if epoch > 10:\n",
" lr *= 0.1\n",
" elif epoch > 20:\n",
" lr *= 0.01\n",
" logging.info('Learning rate: %f'%lr)\n",
" return lr"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"scrolled": true
},
"outputs": [],
"source": [
"model.compile(loss='categorical_crossentropy',\n",
" optimizer=keras.optimizers.Adam(learning_rate=lr_schedule(0)),\n",
" metrics=['accuracy'])\n",
"model.summary()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Prepare model model saving directory.\n",
"import os\n",
"save_dir = 'model_checkpoints'\n",
"model_name = '%s_model.{epoch:03d}.h5' % model_type\n",
"if not os.path.isdir(save_dir):\n",
" os.makedirs(save_dir)\n",
"filepath = os.path.join(save_dir, model_name)\n",
"\n",
"# Prepare callbacks for model saving and for learning rate adjustment.\n",
"checkpoint = keras.callbacks.ModelCheckpoint(filepath=filepath,\n",
" monitor='val_acc',\n",
" verbose=1,\n",
" save_best_only=True)\n",
"\n",
"lr_scheduler = keras.callbacks.LearningRateScheduler(lr_schedule)\n",
"progress_bar = keras.callbacks.ProgbarLogger()\n",
"callbacks = [checkpoint, lr_scheduler, progress_bar]"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"scrolled": true
},
"outputs": [],
"source": [
"train_dataset.shuffle()\n",
"model.fit(train_dataset.X, train_dataset.y,\n",
" batch_size=batch_size,\n",
"# epochs=epochs,\n",
" epochs=1, # --- train only for 1 epoch here for demonstration ---\n",
" validation_data=(val_dataset.X, val_dataset.y),\n",
" shuffle=True,\n",
" callbacks=callbacks)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.12"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
awkward==0.14.0
numpy==1.23.2
pandas>=0.23.4
uproot-methods>=0.7.0
# tensorflow-gpu>=2.0.0
tables
jupyter
import tensorflow as tf
from tensorflow import keras
# A shape is (N, P_A, C), B shape is (N, P_B, C)
# D shape is (N, P_A, P_B)
def batch_distance_matrix_general(A, B):
with tf.name_scope('dmat'):
r_A = tf.reduce_sum(A * A, axis=2, keepdims=True)
r_B = tf.reduce_sum(B * B, axis=2, keepdims=True)
m = tf.matmul(A, tf.transpose(B, perm=(0, 2, 1)))
D = r_A - 2 * m + tf.transpose(r_B, perm=(0, 2, 1))
return D
def knn(num_points, k, topk_indices, features):
# topk_indices: (N, P, K)
# features: (N, P, C)
with tf.name_scope('knn'):
queries_shape = tf.shape(features)
batch_size = queries_shape[0]
batch_indices = tf.tile(tf.reshape(tf.range(batch_size), (-1, 1, 1, 1)), (1, num_points, k, 1))
indices = tf.concat([batch_indices, tf.expand_dims(topk_indices, axis=3)], axis=3) # (N, P, K, 2)
return tf.gather_nd(features, indices)
def edge_conv(points, features, num_points, K, channels, with_bn=True, activation='relu', pooling='average', name='edgeconv'):
"""EdgeConv
Args:
K: int, number of neighbors
in_channels: # of input channels
channels: tuple of output channels
pooling: pooling method ('max' or 'average')
Inputs:
points: (N, P, C_p)
features: (N, P, C_0)
Returns:
        transformed features: (N, P, C_out), C_out = channels[-1]
"""
with tf.name_scope('edgeconv'):
# distance
D = batch_distance_matrix_general(points, points) # (N, P, P)
_, indices = tf.nn.top_k(-D, k=K + 1) # (N, P, K+1)
indices = indices[:, :, 1:] # (N, P, K)
fts = features
knn_fts = knn(num_points, K, indices, fts) # (N, P, K, C)
knn_fts_center = tf.tile(tf.expand_dims(fts, axis=2), (1, 1, K, 1)) # (N, P, K, C)
knn_fts = tf.concat([knn_fts_center, tf.subtract(knn_fts, knn_fts_center)], axis=-1) # (N, P, K, 2*C)
x = knn_fts
for idx, channel in enumerate(channels):
x = keras.layers.Conv2D(channel, kernel_size=(1, 1), strides=1, data_format='channels_last',
use_bias=False if with_bn else True, kernel_initializer='glorot_normal', name='%s_conv%d' % (name, idx))(x)
if with_bn:
x = keras.layers.BatchNormalization(name='%s_bn%d' % (name, idx))(x)
if activation:
x = keras.layers.Activation(activation, name='%s_act%d' % (name, idx))(x)
if pooling == 'max':
fts = tf.reduce_max(x, axis=2) # (N, P, C')
else:
fts = tf.reduce_mean(x, axis=2) # (N, P, C')
# shortcut
sc = keras.layers.Conv2D(channels[-1], kernel_size=(1, 1), strides=1, data_format='channels_last',
use_bias=False if with_bn else True, kernel_initializer='glorot_normal', name='%s_sc_conv' % name)(tf.expand_dims(features, axis=2))
if with_bn:
sc = keras.layers.BatchNormalization(name='%s_sc_bn' % name)(sc)
sc = tf.squeeze(sc, axis=2)
if activation:
return keras.layers.Activation(activation, name='%s_sc_act' % name)(sc + fts) # (N, P, C')
else:
return sc + fts
def _particle_net_base(points, features=None, mask=None, setting=None, name='particle_net'):
# points : (N, P, C_coord)
# features: (N, P, C_features), optional
    # mask: (N, P, 1), optional
with tf.name_scope(name):
if features is None:
features = points
if mask is not None:
mask = tf.cast(tf.not_equal(mask, 0), dtype='float32') # 1 if valid
            coord_shift = tf.multiply(999., tf.cast(tf.equal(mask, 0), dtype='float32'))  # shift padded positions by 999 so they are never selected as neighbors
fts = tf.squeeze(keras.layers.BatchNormalization(name='%s_fts_bn' % name)(tf.expand_dims(features, axis=2)), axis=2)
for layer_idx, layer_param in enumerate(setting.conv_params):
K, channels = layer_param
pts = tf.add(coord_shift, points) if layer_idx == 0 else tf.add(coord_shift, fts)
fts = edge_conv(pts, fts, setting.num_points, K, channels, with_bn=True, activation='relu',
pooling=setting.conv_pooling, name='%s_%s%d' % (name, 'EdgeConv', layer_idx))
if mask is not None:
fts = tf.multiply(fts, mask)
pool = tf.reduce_mean(fts, axis=1) # (N, C)
if setting.fc_params is not None:
x = pool
for layer_idx, layer_param in enumerate(setting.fc_params):
units, drop_rate = layer_param
x = keras.layers.Dense(units, activation='relu')(x)
if drop_rate is not None and drop_rate > 0:
x = keras.layers.Dropout(drop_rate)(x)
out = keras.layers.Dense(setting.num_class, activation='softmax')(x)
return out # (N, num_classes)
else:
return pool
class _DotDict:
pass
def get_particle_net(num_classes, input_shapes):
r"""ParticleNet model from `"ParticleNet: Jet Tagging via Particle Clouds"
<https://arxiv.org/abs/1902.08570>`_ paper.
Parameters
----------
num_classes : int
Number of output classes.
input_shapes : dict
The shapes of each input (`points`, `features`, `mask`).
"""
setting = _DotDict()
setting.num_class = num_classes
# conv_params: list of tuple in the format (K, (C1, C2, C3))
setting.conv_params = [
(16, (64, 64, 64)),
(16, (128, 128, 128)),
(16, (256, 256, 256)),
]
# conv_pooling: 'average' or 'max'
setting.conv_pooling = 'average'
# fc_params: list of tuples in the format (C, drop_rate)
setting.fc_params = [(256, 0.1)]
setting.num_points = input_shapes['points'][0]
points = keras.Input(name='points', shape=input_shapes['points'])
features = keras.Input(name='features', shape=input_shapes['features']) if 'features' in input_shapes else None
mask = keras.Input(name='mask', shape=input_shapes['mask']) if 'mask' in input_shapes else None
outputs = _particle_net_base(points, features, mask, setting, name='ParticleNet')
return keras.Model(inputs=[points, features, mask], outputs=outputs, name='ParticleNet')
def get_particle_net_lite(num_classes, input_shapes):
r"""ParticleNet-Lite model from `"ParticleNet: Jet Tagging via Particle Clouds"
<https://arxiv.org/abs/1902.08570>`_ paper.
Parameters
----------
num_classes : int
Number of output classes.
input_shapes : dict
The shapes of each input (`points`, `features`, `mask`).
"""
setting = _DotDict()
setting.num_class = num_classes
# conv_params: list of tuple in the format (K, (C1, C2, C3))
setting.conv_params = [
(7, (32, 32, 32)),
(7, (64, 64, 64)),
]
# conv_pooling: 'average' or 'max'
setting.conv_pooling = 'average'
# fc_params: list of tuples in the format (C, drop_rate)
setting.fc_params = [(128, 0.1)]
setting.num_points = input_shapes['points'][0]
points = keras.Input(name='points', shape=input_shapes['points'])
features = keras.Input(name='features', shape=input_shapes['features']) if 'features' in input_shapes else None
mask = keras.Input(name='mask', shape=input_shapes['mask']) if 'mask' in input_shapes else None
outputs = _particle_net_base(points, features, mask, setting, name='ParticleNet')
return keras.Model(inputs=[points, features, mask], outputs=outputs, name='ParticleNet')