first init

Commit c6a27e0b authored Jan 07, 2025 by panhb
20 changed files
--- a/deploy/serving/README.md
+++ b/deploy/serving/README.md
+# Server-Side Inference Deployment
+
+Models trained with `PaddleDetection` can be deployed on the server side with [Serving](https://github.com/PaddlePaddle/Serving).  
+This tutorial deploys a model trained on the COCO dataset with the `configs/yolov3/yolov3_darknet53_270e_coco.yml` configuration.  
+The pretrained weights file is [yolov3_darknet53_270e_coco.pdparams](https://paddledet.bj.bcebos.com/models/yolov3_darknet53_270e_coco.pdparams).
+
+## 1. Verify the model first
+```
+python tools/infer.py -c configs/yolov3/yolov3_darknet53_270e_coco.yml --infer_img=demo/000000014439.jpg -o use_gpu=True weights=https://paddledet.bj.bcebos.com/models/yolov3_darknet53_270e_coco.pdparams
+```
+
+## 2. Install Paddle Serving
+Follow the installation guide in [PaddleServing](https://github.com/PaddlePaddle/Serving/tree/v0.7.0) to install it (version >= 0.7.0).
+
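+## 3. Export the model
+During training, PaddleDetection saves both the network's forward-pass parameters and optimizer-related parameters, whereas deployment only needs the forward-pass parameters. For details, see [Export Model](https://github.com/PaddlePaddle/PaddleDetection/blob/develop/deploy/EXPORT_MODEL.md).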
+
+```
+python tools/export_model.py -c configs/yolov3/yolov3_darknet53_270e_coco.yml -o weights=https://paddledet.bj.bcebos.com/models/yolov3_darknet53_270e_coco.pdparams --export_serving_model=True
+```
+
+The command above generates a `yolov3_darknet53_270e_coco` folder under `output_inference/`:
+```
+output_inference
+│   ├── yolov3_darknet53_270e_coco
+│   │   ├── infer_cfg.yml
+│   │   ├── model.pdiparams
+│   │   ├── model.pdiparams.info
+│   │   ├── model.pdmodel
+│   │   ├── serving_client
+│   │   │   ├── serving_client_conf.prototxt
+│   │   │   ├── serving_client_conf.stream.prototxt
+│   │   ├── serving_server
+│   │   │   ├── __model__
+│   │   │   ├── __params__
+│   │   │   ├── serving_server_conf.prototxt
+│   │   │   ├── serving_server_conf.stream.prototxt
+│   │   │   ├── ...
+```
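As a quick sanity check on the export step, the layout above can be verified before starting the server. The following is a minimal sketch and not part of PaddleDetection: `missing_exports` and the `EXPECTED` list are hypothetical names, and the list only covers a subset of the files that appear in the tree shown above.

```python
from pathlib import Path

# Subset of the files listed in the exported tree above (illustrative only).
EXPECTED = [
    "infer_cfg.yml",
    "model.pdiparams",
    "model.pdmodel",
    "serving_client/serving_client_conf.prototxt",
    "serving_server/serving_server_conf.prototxt",
]


def missing_exports(model_dir, expected=EXPECTED):
    """Return the expected files that are absent under model_dir."""
    root = Path(model_dir)
    return [rel for rel in expected if not (root / rel).is_file()]
```

Calling `missing_exports("output_inference/yolov3_darknet53_270e_coco")` returns an empty list when every listed file is present.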
+
+`serving_client_conf.prototxt` in the `serving_client` folder describes the model's inputs and outputs in detail.
+The content of `serving_client_conf.prototxt` is:
+```
+feed_var {
+  name: "im_shape"
+  alias_name: "im_shape"
+  is_lod_tensor: false
+  feed_type: 1
+  shape: 2
+}
+feed_var {
+  name: "image"
+  alias_name: "image"
+  is_lod_tensor: false
+  feed_type: 1
+  shape: 3
+  shape: 608
+  shape: 608
+}
+feed_var {
+  name: "scale_factor"
+  alias_name: "scale_factor"
+  is_lod_tensor: false
+  feed_type: 1
+  shape: 2
+}
+fetch_var {
+  name: "multiclass_nms3_0.tmp_0"
+  alias_name: "multiclass_nms3_0.tmp_0"
+  is_lod_tensor: true
+  fetch_type: 1
+  shape: -1
+}
+fetch_var {
+  name: "multiclass_nms3_0.tmp_2"
+  alias_name: "multiclass_nms3_0.tmp_2"
+  is_lod_tensor: false
+  fetch_type: 2
+}
+```
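The feed and fetch names in this file are what a client has to use, so it can help to list them programmatically. Below is a minimal sketch that scans the text line by line instead of using a protobuf parser; `list_vars` is a hypothetical helper, and it assumes the file follows the simple one-field-per-line layout shown above.

```python
def list_vars(prototxt_text):
    """Return (feed_aliases, fetch_aliases) found in the prototxt text."""
    feeds, fetches = [], []
    current = None  # list that the next alias_name belongs to
    for raw in prototxt_text.splitlines():
        line = raw.strip()
        if line.startswith("feed_var"):
            current = feeds
        elif line.startswith("fetch_var"):
            current = fetches
        elif line.startswith("alias_name:") and current is not None:
            # Take the text after the first colon and drop the quotes.
            current.append(line.split(":", 1)[1].strip().strip('"'))
    return feeds, fetches
```

For the file above this would report `im_shape`, `image`, and `scale_factor` as feeds and the two `multiclass_nms3_0.tmp_*` variables as fetches.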
+
+## 4. Start the PaddleServing service
+
+```
+cd output_inference/yolov3_darknet53_270e_coco/
+
+# GPU
+python -m paddle_serving_server.serve --model serving_server --port 9393 --gpu_ids 0
+
+# CPU
+python -m paddle_serving_server.serve --model serving_server --port 9393
+```
+
+## 5. Test the deployed service
+Prepare a `label_list.txt` file. An example `label_list.txt` looks like this:
+```
+person
+bicycle
+car
+motorcycle
+airplane
+bus
+train
+truck
+boat
+traffic light
+fire hydrant
+stop sign
+parking meter
+bench
+bird
+cat
+dog
+horse
+sheep
+cow
+elephant
+bear
+zebra
+giraffe
+backpack
+umbrella
+handbag
+tie
+suitcase
+frisbee
+skis
+snowboard
+sports ball
+kite
+baseball bat
+baseball glove
+skateboard
+surfboard
+tennis racket
+bottle
+wine glass
+cup
+fork
+knife
+spoon
+bowl
+banana
+apple
+sandwich
+orange
+broccoli
+carrot
+hot dog
+pizza
+donut
+cake
+chair
+couch
+potted plant
+bed
+dining table
+toilet
+tv
+laptop
+mouse
+remote
+keyboard
+cell phone
+microwave
+oven
+toaster
+sink
+refrigerator
+book
+clock
+vase
+scissors
+teddy bear
+hair drier
+toothbrush
+```
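The label file maps a class index (its zero-based line number) to a name. The sketch below shows how detections could be turned into readable results. It assumes each detection row has the layout `[class_id, score, xmin, ymin, xmax, ymax]`, which this diff does not show, and the names `parse_labels` and `describe` are hypothetical.

```python
def parse_labels(text):
    """Split label_list.txt content into a list, one label per non-empty line."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def describe(rows, labels, threshold=0.5):
    """Keep rows scoring at least threshold and attach the label name.

    rows: iterable of [class_id, score, xmin, ymin, xmax, ymax] (assumed layout).
    """
    results = []
    for class_id, score, xmin, ymin, xmax, ymax in rows:
        if score < threshold:
            continue
        results.append((labels[int(class_id)], float(score), (xmin, ymin, xmax, ymax)))
    return results
```

The threshold of 0.5 is an arbitrary default for illustration, not a value taken from `test_client.py`.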
+
+Set the `prototxt` file path to `serving_client/serving_client_conf.prototxt`.
+Set `fetch` to `fetch=["multiclass_nms3_0.tmp_0"]`.
+
+Test:
+```
+# Enter the directory
+cd output_inference/yolov3_darknet53_270e_coco/
+
+# The test script test_client.py automatically creates an output folder and writes two files into it: `bbox.json` and `000000014439.jpg`
+python ../../deploy/serving/test_client.py ../../deploy/serving/label_list.txt ../../demo/000000014439.jpg
+```
--- a/deploy/serving/cpp/README.md
+++ b/deploy/serving/cpp/README.md
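+# C++ Serving Inference Deployment
+
+## 1. Introduction
+Paddle Serving is PaddlePaddle's open-source serving framework. It provides two frameworks, C++ Serving and Python Pipeline:
+C++ Serving prioritizes maximum performance, while Python Pipeline prioritizes ease of secondary development.
+It aims to help deep learning developers and enterprises deliver high-performance, flexible, easy-to-use, industrial-grade online inference services and bring AI applications into production.
+
+For more about Paddle Serving, see the [official Paddle Serving repo](https://github.com/PaddlePaddle/Serving).
+
+This document describes how to deploy a model as a service with the C++ Serving framework, using yolov3_darknet53_270e_coco as the example.
+
+## 2. C++ Serving Inference Deployment
+
+### 2.1 Overview of the C++ serving sample program
+The sample programs for serving deployment are located in `deploy/serving/cpp`: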
+```shell
+deploy/
+├── serving/
+│   ├── python/                       # Python serving sample program directory
+│   │   ├──config.yml                 # Server-side model inference configuration
+│   │   ├──pipeline_http_client.py    # Client code
+│   │   ├──postprocess_ops.py         # User-defined postprocessing code
+│   │   ├──preprocess_ops.py          # User-defined preprocessing code
+│   │   ├──README.md                  # Documentation
+│   │   ├──web_service.py             # Server code
+│   ├── cpp/                          # C++ serving sample program directory
+│   │   ├──preprocess/                # C++ custom OPs
+│   │   ├──build_server.sh            # C++ Serving build script
+│   │   ├──serving_client.py          # Client code
+│   │   └── ...
+│   └── ...
+└── ...
+```
+
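+### 2.2 Environment setup
+Install the latest versions of the Paddle Serving packages:
+paddle-serving-client, paddle-serving-server, paddle-serving-app, and paddlepaddle (for the server and paddlepaddle, choose either the CPU or the GPU build).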
+```commandline
+pip install paddle-serving-client
+# pip install paddle-serving-server # CPU
+pip install paddle-serving-server-gpu # GPU; defaults to CUDA 10.2 + TensorRT 6, other environments need an explicit version number
+pip install paddle-serving-app
+# pip install paddlepaddle # CPU
+pip install paddlepaddle-gpu
+```
+You may need a mirror in mainland China (for example the Baidu mirror: add `-i https://mirror.baidu.com/pypi/simple` to the pip command) to speed up downloads.
+For Paddle Serving Server whl packages for other runtime environments, see the [download page](https://github.com/PaddlePaddle/Serving/blob/v0.7.0/doc/Latest_Packages_CN.md).
+For other PaddlePaddle versions, see the [official site](https://www.paddlepaddle.org.cn/install/quick?docurl=/documentation/docs/zh/install/pip/linux-pip.html).
+
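+### 2.3 Export the model for serving deployment
+For the export steps, see the [PaddleDetection deployment model export tutorial](../../EXPORT_MODEL.md).
+Exporting a serving model requires the extra `--export_serving_model True` argument, for example: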
+```commandline
+python tools/export_model.py -c configs/yolov3/yolov3_darknet53_270e_coco.yml \
+                             --export_serving_model True \
+                             -o weights=https://paddledet.bj.bcebos.com/models/yolov3_darknet53_270e_coco.pdparams
+```
+
+### 2.4 Build C++ Serving and start the server-side inference service
+Use the one-step build script `deploy/serving/cpp/build_server.sh` to build:
+```commandline
+bash deploy/serving/cpp/build_server.sh
+```
+Once the build, installation, and model export above are complete, start the inference service with:
+```commandline
+python -m paddle_serving_server.serve --model output_inference/yolov3_darknet53_270e_coco/serving_server --op yolov3_darknet53_270e_coco --port 9997 &
+```
+To develop a custom OP, see the [documentation](https://github.com/PaddlePaddle/Serving/blob/v0.8.3/doc/C%2B%2B_Serving/2%2B_model.md).
+
+### 2.5 Access the service from a client
+Once the inference service is running, start a client to access it with:
+```commandline
+python deploy/serving/cpp/serving_client.py --serving_client output_inference/yolov3_darknet53_270e_coco/serving_client --image_file demo/000000014439.jpg --http_port 9997
+```
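The custom OPs added in this commit read their input as a base64 string and decode it into an image (`Base2Mat`), so a client presumably has to send the image file's bytes in that form. A minimal sketch of the encoding step using only the standard library follows. The helper names are hypothetical, and how the string is attached to the request is left out because `serving_client.py` is not shown in this diff.

```python
import base64


def encode_image_bytes(raw):
    """Encode raw (still compressed) image bytes as an ASCII base64 string."""
    return base64.b64encode(raw).decode("ascii")


def encode_image_file(path):
    """Read an image file such as demo/000000014439.jpg and base64-encode it."""
    with open(path, "rb") as f:
        return encode_image_bytes(f.read())
```

The file is encoded as-is rather than decoded to pixels first, which matches the server side calling `cv::imdecode` on the decoded bytes.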
--- a/deploy/serving/cpp/build_server.sh
+++ b/deploy/serving/cpp/build_server.sh
+# Docker image to use:
+# registry.baidubce.com/paddlepaddle/paddle:latest-dev-cuda10.1-cudnn7-gcc82
+
+# Build Serving Server:
+
+# client and app can use the release versions directly
+
+# server includes custom OPs, so it has to be rebuilt
+
+apt-get update
+apt install -y libcurl4-openssl-dev libbz2-dev
+wget https://paddle-serving.bj.bcebos.com/others/centos_ssl.tar && tar xf centos_ssl.tar && rm -rf centos_ssl.tar && mv libcrypto.so.1.0.2k /usr/lib/libcrypto.so.1.0.2k && mv libssl.so.1.0.2k /usr/lib/libssl.so.1.0.2k && ln -sf /usr/lib/libcrypto.so.1.0.2k /usr/lib/libcrypto.so.10 && ln -sf /usr/lib/libssl.so.1.0.2k /usr/lib/libssl.so.10 && ln -sf /usr/lib/libcrypto.so.10 /usr/lib/libcrypto.so && ln -sf /usr/lib/libssl.so.10 /usr/lib/libssl.so
+
+# Install Go dependencies
+rm -rf /usr/local/go
+wget -qO- https://paddle-ci.cdn.bcebos.com/go1.17.2.linux-amd64.tar.gz | tar -xz -C /usr/local
+export GOROOT=/usr/local/go
+export GOPATH=/root/gopath
+export PATH=$PATH:$GOPATH/bin:$GOROOT/bin
+go env -w GO111MODULE=on
+go env -w GOPROXY=https://goproxy.cn,direct
+go install github.com/grpc-ecosystem/grpc-gateway/protoc-gen-grpc-gateway@v1.15.2
+go install github.com/grpc-ecosystem/grpc-gateway/protoc-gen-swagger@v1.15.2
+go install github.com/golang/protobuf/protoc-gen-go@v1.4.3
+go install google.golang.org/grpc@v1.33.0
+go env -w GO111MODULE=auto
+
+# Download the OpenCV library
+wget https://paddle-qa.bj.bcebos.com/PaddleServing/opencv3.tar.gz && tar -xvf opencv3.tar.gz && rm -rf opencv3.tar.gz
+export OPENCV_DIR=$PWD/opencv3
+
+# clone Serving
+git clone https://github.com/PaddlePaddle/Serving.git -b develop --depth=1
+cd Serving
+export Serving_repo_path=$PWD
+git submodule update --init --recursive
+python -m pip install -r python/requirements.txt
+
+# set env
+export PYTHON_INCLUDE_DIR=$(python -c "from distutils.sysconfig import get_python_inc; print(get_python_inc())")
+export PYTHON_LIBRARIES=$(python -c "import distutils.sysconfig as sysconfig; print(sysconfig.get_config_var('LIBDIR'))")
+export PYTHON_EXECUTABLE=`which python`
+
+export CUDA_PATH='/usr/local/cuda'
+export CUDNN_LIBRARY='/usr/local/cuda/lib64/'
+export CUDA_CUDART_LIBRARY='/usr/local/cuda/lib64/'
+export TENSORRT_LIBRARY_PATH='/usr/local/TensorRT6-cuda10.1-cudnn7/targets/x86_64-linux-gnu/'
+
+# Copy the custom OP code
+\cp ../deploy/serving/cpp/preprocess/*.h ${Serving_repo_path}/core/general-server/op
+\cp ../deploy/serving/cpp/preprocess/*.cpp ${Serving_repo_path}/core/general-server/op
+
+# Build the server and export SERVING_BIN
+mkdir server-build-gpu-opencv && cd server-build-gpu-opencv
+cmake -DPYTHON_INCLUDE_DIR=$PYTHON_INCLUDE_DIR \
+            -DPYTHON_LIBRARIES=$PYTHON_LIBRARIES \
+            -DPYTHON_EXECUTABLE=$PYTHON_EXECUTABLE \
+            -DCUDA_TOOLKIT_ROOT_DIR=${CUDA_PATH} \
+            -DCUDNN_LIBRARY=${CUDNN_LIBRARY} \
+            -DCUDA_CUDART_LIBRARY=${CUDA_CUDART_LIBRARY} \
+            -DTENSORRT_ROOT=${TENSORRT_LIBRARY_PATH} \
+            -DOPENCV_DIR=${OPENCV_DIR} \
+            -DWITH_OPENCV=ON \
+            -DSERVER=ON \
+            -DWITH_GPU=ON ..
+make -j32
+
+python -m pip install python/dist/paddle*
+export SERVING_BIN=$PWD/core/general-server/serving
+cd ../../
--- a/deploy/serving/cpp/preprocess/mask_rcnn_r50_fpn_1x_coco.cpp
+++ b/deploy/serving/cpp/preprocess/mask_rcnn_r50_fpn_1x_coco.cpp
+// Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+//     http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+#include "core/general-server/op/mask_rcnn_r50_fpn_1x_coco.h"
+#include "core/predictor/framework/infer.h"
+#include "core/predictor/framework/memory.h"
+#include "core/predictor/framework/resource.h"
+#include "core/util/include/timer.h"
+#include <algorithm>
+#include <iostream>
+#include <memory>
+#include <sstream>
+
+namespace baidu {
+namespace paddle_serving {
+namespace serving {
+
+using baidu::paddle_serving::Timer;
+using baidu::paddle_serving::predictor::InferManager;
+using baidu::paddle_serving::predictor::MempoolWrapper;
+using baidu::paddle_serving::predictor::PaddleGeneralModelConfig;
+using baidu::paddle_serving::predictor::general_model::Request;
+using baidu::paddle_serving::predictor::general_model::Response;
+using baidu::paddle_serving::predictor::general_model::Tensor;
+
+int mask_rcnn_r50_fpn_1x_coco::inference() {
+  VLOG(2) << "Going to run inference";
+  const std::vector<std::string> pre_node_names = pre_names();
+  if (pre_node_names.size() != 1) {
+    LOG(ERROR) << "This op(" << op_name()
+               << ") can only have one predecessor op, but received "
+               << pre_node_names.size();
+    return -1;
+  }
+  const std::string pre_name = pre_node_names[0];
+
+  const GeneralBlob *input_blob = get_depend_argument<GeneralBlob>(pre_name);
+  if (!input_blob) {
+    LOG(ERROR) << "input_blob is nullptr,error";
+    return -1;
+  }
+  uint64_t log_id = input_blob->GetLogId();
+  VLOG(2) << "(logid=" << log_id << ") Get precedent op name: " << pre_name;
+
+  GeneralBlob *output_blob = mutable_data<GeneralBlob>();
+  if (!output_blob) {
+    LOG(ERROR) << "output_blob is nullptr,error";
+    return -1;
+  }
+  output_blob->SetLogId(log_id);
+
+  if (!input_blob) {
+    LOG(ERROR) << "(logid=" << log_id
+               << ") Failed mutable depended argument, op:" << pre_name;
+    return -1;
+  }
+
+  const TensorVector *in = &input_blob->tensor_vector;
+  TensorVector *out = &output_blob->tensor_vector;
+
+  int batch_size = input_blob->_batch_size;
+  output_blob->_batch_size = batch_size;
+  VLOG(2) << "(logid=" << log_id << ") infer batch size: " << batch_size;
+
+  Timer timeline;
+  int64_t start = timeline.TimeStampUS();
+  timeline.Start();
+
+  // only support string type
+  char *total_input_ptr = static_cast<char *>(in->at(0).data.data());
+  std::string base64str = total_input_ptr;
+
+  cv::Mat img = Base2Mat(base64str);
+  cv::cvtColor(img, img, cv::COLOR_BGR2RGB);
+
+  // preprocess
+  Resize(&img, scale_factor_h, scale_factor_w, im_shape_h, im_shape_w);
+  Normalize(&img, mean_, scale_, is_scale_);
+  PadStride(&img, 32);
+  int input_shape_h = img.rows;
+  int input_shape_w = img.cols;
+  std::vector<float> input(1 * 3 * input_shape_h * input_shape_w, 0.0f);
+  Permute(img, input.data());
+
+  // create real_in
+  TensorVector *real_in = new TensorVector();
+  if (!real_in) {
+    LOG(ERROR) << "real_in is nullptr,error";
+    return -1;
+  }
+
+  int in_num = 0;
+  size_t databuf_size = 0;
+  void *databuf_data = NULL;
+  char *databuf_char = NULL;
+
+  // im_shape
+  std::vector<float> im_shape{static_cast<float>(im_shape_h),
+                              static_cast<float>(im_shape_w)};
+  databuf_size = 2 * sizeof(float);
+
+  databuf_data = MempoolWrapper::instance().malloc(databuf_size);
+  if (!databuf_data) {
+    LOG(ERROR) << "Malloc failed, size: " << databuf_size;
+    return -1;
+  }
+
+  memcpy(databuf_data, im_shape.data(), databuf_size);
+  databuf_char = reinterpret_cast<char *>(databuf_data);
+  paddle::PaddleBuf paddleBuf_0(databuf_char, databuf_size);
+  paddle::PaddleTensor tensor_in_0;
+  tensor_in_0.name = "im_shape";
+  tensor_in_0.dtype = paddle::PaddleDType::FLOAT32;
+  tensor_in_0.shape = {1, 2};
+  tensor_in_0.lod = in->at(0).lod;
+  tensor_in_0.data = paddleBuf_0;
+  real_in->push_back(tensor_in_0);
+
+  // image
+  in_num = 1 * 3 * input_shape_h * input_shape_w;
+  databuf_size = in_num * sizeof(float);
+
+  databuf_data = MempoolWrapper::instance().malloc(databuf_size);
+  if (!databuf_data) {
+    LOG(ERROR) << "Malloc failed, size: " << databuf_size;
+    return -1;
+  }
+
+  memcpy(databuf_data, input.data(), databuf_size);
+  databuf_char = reinterpret_cast<char *>(databuf_data);
+  paddle::PaddleBuf paddleBuf_1(databuf_char, databuf_size);
+  paddle::PaddleTensor tensor_in_1;
+  tensor_in_1.name = "image";
+  tensor_in_1.dtype = paddle::PaddleDType::FLOAT32;
+  tensor_in_1.shape = {1, 3, input_shape_h, input_shape_w};
+  tensor_in_1.lod = in->at(0).lod;
+  tensor_in_1.data = paddleBuf_1;
+  real_in->push_back(tensor_in_1);
+
+  // scale_factor
+  std::vector<float> scale_factor{scale_factor_h, scale_factor_w};
+  databuf_size = 2 * sizeof(float);
+
+  databuf_data = MempoolWrapper::instance().malloc(databuf_size);
+  if (!databuf_data) {
+    LOG(ERROR) << "Malloc failed, size: " << databuf_size;
+    return -1;
+  }
+
+  memcpy(databuf_data, scale_factor.data(), databuf_size);
+  databuf_char = reinterpret_cast<char *>(databuf_data);
+  paddle::PaddleBuf paddleBuf_2(databuf_char, databuf_size);
+  paddle::PaddleTensor tensor_in_2;
+  tensor_in_2.name = "scale_factor";
+  tensor_in_2.dtype = paddle::PaddleDType::FLOAT32;
+  tensor_in_2.shape = {1, 2};
+  tensor_in_2.lod = in->at(0).lod;
+  tensor_in_2.data = paddleBuf_2;
+  real_in->push_back(tensor_in_2);
+
+  if (InferManager::instance().infer(engine_name().c_str(), real_in, out,
+                                     batch_size)) {
+    LOG(ERROR) << "(logid=" << log_id
+               << ") Failed do infer in fluid model: " << engine_name().c_str();
+    return -1;
+  }
+
+  int64_t end = timeline.TimeStampUS();
+  CopyBlobInfo(input_blob, output_blob);
+  AddBlobInfo(output_blob, start);
+  AddBlobInfo(output_blob, end);
+  return 0;
+}
+
+void mask_rcnn_r50_fpn_1x_coco::Resize(cv::Mat *img, float &scale_factor_h,
+                                       float &scale_factor_w, int &im_shape_h,
+                                       int &im_shape_w) {
+  // keep_ratio
+  int im_size_max = std::max(img->rows, img->cols);
+  int im_size_min = std::min(img->rows, img->cols);
+  int target_size_max = std::max(im_shape_h, im_shape_w);
+  int target_size_min = std::min(im_shape_h, im_shape_w);
+  float scale_min =
+      static_cast<float>(target_size_min) / static_cast<float>(im_size_min);
+  float scale_max =
+      static_cast<float>(target_size_max) / static_cast<float>(im_size_max);
+  float scale_ratio = std::min(scale_min, scale_max);
+
+  // scale_factor
+  scale_factor_h = scale_ratio;
+  scale_factor_w = scale_ratio;
+
+  // Resize
+  cv::resize(*img, *img, cv::Size(), scale_ratio, scale_ratio, 2);
+  im_shape_h = img->rows;
+  im_shape_w = img->cols;
+}
+
+void mask_rcnn_r50_fpn_1x_coco::Normalize(cv::Mat *img,
+                                          const std::vector<float> &mean,
+                                          const std::vector<float> &scale,
+                                          const bool is_scale) {
+  // Normalize
+  double e = 1.0;
+  if (is_scale) {
+    e /= 255.0;
+  }
+  (*img).convertTo(*img, CV_32FC3, e);
+  for (int h = 0; h < img->rows; h++) {
+    for (int w = 0; w < img->cols; w++) {
+      img->at<cv::Vec3f>(h, w)[0] =
+          (img->at<cv::Vec3f>(h, w)[0] - mean[0]) / scale[0];
+      img->at<cv::Vec3f>(h, w)[1] =
+          (img->at<cv::Vec3f>(h, w)[1] - mean[1]) / scale[1];
+      img->at<cv::Vec3f>(h, w)[2] =
+          (img->at<cv::Vec3f>(h, w)[2] - mean[2]) / scale[2];
+    }
+  }
+}
+
+void mask_rcnn_r50_fpn_1x_coco::PadStride(cv::Mat *img, int stride_) {
+  // PadStride
+  if (stride_ <= 0)
+    return;
+  int rh = img->rows;
+  int rw = img->cols;
+  int nh = (rh / stride_) * stride_ + (rh % stride_ != 0) * stride_;
+  int nw = (rw / stride_) * stride_ + (rw % stride_ != 0) * stride_;
+  cv::copyMakeBorder(*img, *img, 0, nh - rh, 0, nw - rw, cv::BORDER_CONSTANT,
+                     cv::Scalar(0));
+}
+
+void mask_rcnn_r50_fpn_1x_coco::Permute(const cv::Mat &img, float *data) {
+  // Permute
+  int rh = img.rows;
+  int rw = img.cols;
+  int rc = img.channels();
+  for (int i = 0; i < rc; ++i) {
+    cv::extractChannel(img, cv::Mat(rh, rw, CV_32FC1, data + i * rh * rw), i);
+  }
+}
+
+cv::Mat mask_rcnn_r50_fpn_1x_coco::Base2Mat(std::string &base64_data) {
+  cv::Mat img;
+  std::string s_mat;
+  s_mat = base64Decode(base64_data.data(), base64_data.size());
+  std::vector<char> base64_img(s_mat.begin(), s_mat.end());
+  img = cv::imdecode(base64_img, cv::IMREAD_COLOR); // CV_LOAD_IMAGE_COLOR
+  return img;
+}
+
+std::string mask_rcnn_r50_fpn_1x_coco::base64Decode(const char *Data,
+                                                    int DataByte) {
+  const char DecodeTable[] = {
+      0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
+      0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
+      0,  0,  0,  0,  0,  0,  0,  0,  0,
+      62, // '+'
+      0,  0,  0,
+      63,                                     // '/'
+      52, 53, 54, 55, 56, 57, 58, 59, 60, 61, // '0'-'9'
+      0,  0,  0,  0,  0,  0,  0,  0,  1,  2,  3,  4,  5,  6,  7,  8,  9,
+      10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, // 'A'-'Z'
+      0,  0,  0,  0,  0,  0,  26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36,
+      37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, // 'a'-'z'
+  };
+
+  std::string strDecode;
+  int nValue;
+  int i = 0;
+  while (i < DataByte) {
+    if (*Data != '\r' && *Data != '\n') {
+      nValue = DecodeTable[*Data++] << 18;
+      nValue += DecodeTable[*Data++] << 12;
+      strDecode += (nValue & 0x00FF0000) >> 16;
+      if (*Data != '=') {
+        nValue += DecodeTable[*Data++] << 6;
+        strDecode += (nValue & 0x0000FF00) >> 8;
+        if (*Data != '=') {
+          nValue += DecodeTable[*Data++];
+          strDecode += nValue & 0x000000FF;
+        }
+      }
+      i += 4;
+    } else // carriage return or line feed: skip
+    {
+      Data++;
+      i++;
+    }
+  }
+  return strDecode;
+}
+
+DEFINE_OP(mask_rcnn_r50_fpn_1x_coco);
+
+} // namespace serving
+} // namespace paddle_serving
+} // namespace baidu
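The arithmetic in `Resize` and `PadStride` above is easier to check in isolation. The sketch below restates it in plain Python without OpenCV. The function names are hypothetical, and the default target of 1333 x 800 is taken from the header's `im_shape_h` and `im_shape_w`.

```python
def keep_ratio_scale(rows, cols, target_h=1333, target_w=800):
    """Scale that fits the image inside the target while keeping aspect ratio."""
    im_max, im_min = max(rows, cols), min(rows, cols)
    target_max, target_min = max(target_h, target_w), min(target_h, target_w)
    # The short side may grow up to target_min, the long side up to target_max;
    # the smaller of the two ratios satisfies both limits.
    return min(target_min / im_min, target_max / im_max)


def pad_to_stride(size, stride=32):
    """Round size up to the next multiple of stride (no-op if stride <= 0)."""
    if stride <= 0:
        return size
    return -(-size // stride) * stride  # ceiling division, then scale back
```

For a 480 x 640 input the short side is the limiting one, so the scale is 800 / 480. Both `scale_factor_h` and `scale_factor_w` are set to this same value in the C++ code.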
--- a/deploy/serving/cpp/preprocess/mask_rcnn_r50_fpn_1x_coco.h
+++ b/deploy/serving/cpp/preprocess/mask_rcnn_r50_fpn_1x_coco.h
+// Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+//     http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+#pragma once
+#include "core/general-server/general_model_service.pb.h"
+#include "core/general-server/op/general_infer_helper.h"
+#include "paddle_inference_api.h" // NOLINT
+#include <string>
+#include <vector>
+
+#include "opencv2/core.hpp"
+#include "opencv2/imgcodecs.hpp"
+#include "opencv2/imgproc.hpp"
+#include <chrono>
+#include <iomanip>
+#include <iostream>
+#include <ostream>
+#include <vector>
+
+#include <cstring>
+#include <fstream>
+#include <numeric>
+
+namespace baidu {
+namespace paddle_serving {
+namespace serving {
+
+class mask_rcnn_r50_fpn_1x_coco
+    : public baidu::paddle_serving::predictor::OpWithChannel<GeneralBlob> {
+public:
+  typedef std::vector<paddle::PaddleTensor> TensorVector;
+
+  DECLARE_OP(mask_rcnn_r50_fpn_1x_coco);
+
+  int inference();
+
+private:
+  // preprocess
+  std::vector<float> mean_ = {0.485f, 0.456f, 0.406f};
+  std::vector<float> scale_ = {0.229f, 0.224f, 0.225f};
+  bool is_scale_ = true;
+  int im_shape_h = 1333;
+  int im_shape_w = 800;
+  float scale_factor_h = 1.0f;
+  float scale_factor_w = 1.0f;
+
+  void Resize(cv::Mat *img, float &scale_factor_h, float &scale_factor_w,
+              int &im_shape_h, int &im_shape_w);
+  void Normalize(cv::Mat *img, const std::vector<float> &mean,
+                 const std::vector<float> &scale, const bool is_scale);
+  void PadStride(cv::Mat *img, int stride_ = -1);
+  void Permute(const cv::Mat &img, float *data);
+
+  // read pics
+  cv::Mat Base2Mat(std::string &base64_data);
+  std::string base64Decode(const char *Data, int DataByte);
+};
+
+} // namespace serving
+} // namespace paddle_serving
+} // namespace baidu
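The `mean_`, `scale_`, and `is_scale_` members declared above drive `Normalize`, and `Permute` rearranges the result from HWC to CHW. A minimal sketch of both steps on plain Python lists is given below. The names are hypothetical, and Python floats are 64-bit whereas the OP works in 32-bit floats, so the values agree only approximately.

```python
MEAN = (0.485, 0.456, 0.406)  # same values as mean_ in the header
STD = (0.229, 0.224, 0.225)   # same values as scale_ in the header


def normalize_pixel(rgb, mean=MEAN, std=STD, is_scale=True):
    """Optionally scale 0..255 values to 0..1, then apply (x - mean) / std."""
    e = 1.0 / 255.0 if is_scale else 1.0
    return tuple((v * e - m) / s for v, m, s in zip(rgb, mean, std))


def hwc_to_chw(img):
    """Flatten a nested list img[row][col][channel] into channel-major order."""
    channels = len(img[0][0])
    return [px[c] for c in range(channels) for row in img for px in row]
```

The channel-major output corresponds to the `{1, 3, H, W}` shape that the OP assigns to the `image` tensor.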
--- a/deploy/serving/cpp/preprocess/picodet_lcnet_1_5x_416_coco.cpp
+++ b/deploy/serving/cpp/preprocess/picodet_lcnet_1_5x_416_coco.cpp
+// Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+//     http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+#include "core/general-server/op/picodet_lcnet_1_5x_416_coco.h"
+#include "core/predictor/framework/infer.h"
+#include "core/predictor/framework/memory.h"
+#include "core/predictor/framework/resource.h"
+#include "core/util/include/timer.h"
+#include <algorithm>
+#include <iostream>
+#include <memory>
+#include <sstream>
+
+namespace baidu {
+namespace paddle_serving {
+namespace serving {
+
+using baidu::paddle_serving::Timer;
+using baidu::paddle_serving::predictor::InferManager;
+using baidu::paddle_serving::predictor::MempoolWrapper;
+using baidu::paddle_serving::predictor::PaddleGeneralModelConfig;
+using baidu::paddle_serving::predictor::general_model::Request;
+using baidu::paddle_serving::predictor::general_model::Response;
+using baidu::paddle_serving::predictor::general_model::Tensor;
+
+int picodet_lcnet_1_5x_416_coco::inference() {
+  VLOG(2) << "Going to run inference";
+  const std::vector<std::string> pre_node_names = pre_names();
+  if (pre_node_names.size() != 1) {
+    LOG(ERROR) << "This op(" << op_name()
+               << ") can only have one predecessor op, but received "
+               << pre_node_names.size();
+    return -1;
+  }
+  const std::string pre_name = pre_node_names[0];
+
+  const GeneralBlob *input_blob = get_depend_argument<GeneralBlob>(pre_name);
+  if (!input_blob) {
+    LOG(ERROR) << "input_blob is nullptr,error";
+    return -1;
+  }
+  uint64_t log_id = input_blob->GetLogId();
+  VLOG(2) << "(logid=" << log_id << ") Get precedent op name: " << pre_name;
+
+  GeneralBlob *output_blob = mutable_data<GeneralBlob>();
+  if (!output_blob) {
+    LOG(ERROR) << "output_blob is nullptr,error";
+    return -1;
+  }
+  output_blob->SetLogId(log_id);
+
+  if (!input_blob) {
+    LOG(ERROR) << "(logid=" << log_id
+               << ") Failed mutable depended argument, op:" << pre_name;
+    return -1;
+  }
+
+  const TensorVector *in = &input_blob->tensor_vector;
+  TensorVector *out = &output_blob->tensor_vector;
+
+  int batch_size = input_blob->_batch_size;
+  output_blob->_batch_size = batch_size;
+  VLOG(2) << "(logid=" << log_id << ") infer batch size: " << batch_size;
+
+  Timer timeline;
+  int64_t start = timeline.TimeStampUS();
+  timeline.Start();
+
+  // only support string type
+  char *total_input_ptr = static_cast<char *>(in->at(0).data.data());
+  std::string base64str = total_input_ptr;
+
+  cv::Mat img = Base2Mat(base64str);
+  cv::cvtColor(img, img, cv::COLOR_BGR2RGB);
+
+  // preprocess
+  std::vector<float> input(1 * 3 * im_shape_h * im_shape_w, 0.0f);
+  preprocess_det(img, input.data(), scale_factor_h, scale_factor_w, im_shape_h,
+                 im_shape_w, mean_, scale_, is_scale_);
+
+  // create real_in
+  TensorVector *real_in = new TensorVector();
+  if (!real_in) {
+    LOG(ERROR) << "real_in is nullptr,error";
+    return -1;
+  }
+
+  int in_num = 0;
+  size_t databuf_size = 0;
+  void *databuf_data = NULL;
+  char *databuf_char = NULL;
+
+  // image
+  in_num = 1 * 3 * im_shape_h * im_shape_w;
+  databuf_size = in_num * sizeof(float);
+
+  databuf_data = MempoolWrapper::instance().malloc(databuf_size);
+  if (!databuf_data) {
+    LOG(ERROR) << "Malloc failed, size: " << databuf_size;
+    return -1;
+  }
+
+  memcpy(databuf_data, input.data(), databuf_size);
+  databuf_char = reinterpret_cast<char *>(databuf_data);
+  paddle::PaddleBuf paddleBuf(databuf_char, databuf_size);
+  paddle::PaddleTensor tensor_in;
+  tensor_in.name = "image";
+  tensor_in.dtype = paddle::PaddleDType::FLOAT32;
+  tensor_in.shape = {1, 3, im_shape_h, im_shape_w};
+  tensor_in.lod = in->at(0).lod;
+  tensor_in.data = paddleBuf;
+  real_in->push_back(tensor_in);
+
+  // scale_factor
+  std::vector<float> scale_factor{scale_factor_h, scale_factor_w};
+  databuf_size = 2 * sizeof(float);
+
+  databuf_data = MempoolWrapper::instance().malloc(databuf_size);
+  if (!databuf_data) {
+    LOG(ERROR) << "Malloc failed, size: " << databuf_size;
+    return -1;
+  }
+
+  memcpy(databuf_data, scale_factor.data(), databuf_size);
+  databuf_char = reinterpret_cast<char *>(databuf_data);
+  paddle::PaddleBuf paddleBuf_2(databuf_char, databuf_size);
+  paddle::PaddleTensor tensor_in_2;
+  tensor_in_2.name = "scale_factor";
+  tensor_in_2.dtype = paddle::PaddleDType::FLOAT32;
+  tensor_in_2.shape = {1, 2};
+  tensor_in_2.lod = in->at(0).lod;
+  tensor_in_2.data = paddleBuf_2;
+  real_in->push_back(tensor_in_2);
+
+  if (InferManager::instance().infer(engine_name().c_str(), real_in, out,
+                                     batch_size)) {
+    LOG(ERROR) << "(logid=" << log_id
+               << ") Failed do infer in fluid model: " << engine_name().c_str();
+    return -1;
+  }
+
+  int64_t end = timeline.TimeStampUS();
+  CopyBlobInfo(input_blob, output_blob);
+  AddBlobInfo(output_blob, start);
+  AddBlobInfo(output_blob, end);
+  return 0;
+}
+
+void picodet_lcnet_1_5x_416_coco::preprocess_det(
+    const cv::Mat &img, float *data, float &scale_factor_h,
+    float &scale_factor_w, int im_shape_h, int im_shape_w,
+    const std::vector<float> &mean, const std::vector<float> &scale,
+    const bool is_scale) {
+  // scale_factor
+  scale_factor_h =
+      static_cast<float>(im_shape_h) / static_cast<float>(img.rows);
+  scale_factor_w =
+      static_cast<float>(im_shape_w) / static_cast<float>(img.cols);
+
+  // Resize
+  cv::Mat resize_img;
+  cv::resize(img, resize_img, cv::Size(im_shape_w, im_shape_h), 0, 0, 2);
+
+  // Normalize
+  double e = 1.0;
+  if (is_scale) {
+    e /= 255.0;
+  }
+  cv::Mat img_fp;
+  (resize_img).convertTo(img_fp, CV_32FC3, e);
+  for (int h = 0; h < im_shape_h; h++) {
+    for (int w = 0; w < im_shape_w; w++) {
+      img_fp.at<cv::Vec3f>(h, w)[0] =
+          (img_fp.at<cv::Vec3f>(h, w)[0] - mean[0]) / scale[0];
+      img_fp.at<cv::Vec3f>(h, w)[1] =
+          (img_fp.at<cv::Vec3f>(h, w)[1] - mean[1]) / scale[1];
+      img_fp.at<cv::Vec3f>(h, w)[2] =
+          (img_fp.at<cv::Vec3f>(h, w)[2] - mean[2]) / scale[2];
+    }
+  }
+
+  // Permute
+  int rh = img_fp.rows;
+  int rw = img_fp.cols;
+  int rc = img_fp.channels();
+  for (int i = 0; i < rc; ++i) {
+    cv::extractChannel(img_fp, cv::Mat(rh, rw, CV_32FC1, data + i * rh * rw),
+                       i);
+  }
+}
+
+cv::Mat picodet_lcnet_1_5x_416_coco::Base2Mat(std::string &base64_data) {
+  cv::Mat img;
+  std::string s_mat;
+  s_mat = base64Decode(base64_data.data(), base64_data.size());
+  std::vector<char> base64_img(s_mat.begin(), s_mat.end());
+  img = cv::imdecode(base64_img, cv::IMREAD_COLOR); // CV_LOAD_IMAGE_COLOR
+  return img;
+}
+
+std::string picodet_lcnet_1_5x_416_coco::base64Decode(const char *Data,
+                                                      int DataByte) {
+  const char DecodeTable[] = {
+      0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
+      0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
+      0,  0,  0,  0,  0,  0,  0,  0,  0,
+      62, // '+'
+      0,  0,  0,
+      63,                                     // '/'
+      52, 53, 54, 55, 56, 57, 58, 59, 60, 61, // '0'-'9'
+      0,  0,  0,  0,  0,  0,  0,  0,  1,  2,  3,  4,  5,  6,  7,  8,  9,
+      10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, // 'A'-'Z'
+      0,  0,  0,  0,  0,  0,  26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36,
+      37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, // 'a'-'z'
+  };
+
+  std::string strDecode;
+  int nValue;
+  int i = 0;
+  while (i < DataByte) {
+    if (*Data != '\r' && *Data != '\n') {
+      nValue = DecodeTable[*Data++] << 18;
+      nValue += DecodeTable[*Data++] << 12;
+      strDecode += (nValue & 0x00FF0000) >> 16;
+      if (*Data != '=') {
+        nValue += DecodeTable[*Data++] << 6;
+        strDecode += (nValue & 0x0000FF00) >> 8;
+        if (*Data != '=') {
+          nValue += DecodeTable[*Data++];
+          strDecode += nValue & 0x000000FF;
+        }
+      }
+      i += 4;
+    } else // CR/LF: skip it
+    {
+      Data++;
+      i++;
+    }
+  }
+  return strDecode;
+}
+
+DEFINE_OP(picodet_lcnet_1_5x_416_coco);
+
+} // namespace serving
+} // namespace paddle_serving
+} // namespace baidu
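Review note: `base64Decode` above indexes `DecodeTable` (123 entries) directly with a `char`, so a byte greater than `'z'` or a negative `char` would read outside the table, and a truncated input can advance `Data` past the buffer. The following is a minimal standalone sketch of a bounds-checked decoder, not the commit's implementation. The names `SextetOf` and `DecodeBase64` are hypothetical and not part of Paddle Serving, and returning an empty string on malformed input is an assumed policy.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper: 6-bit value of a base64 character, or -1 if invalid.
static int SextetOf(unsigned char c) {
  if (c >= 'A' && c <= 'Z') return c - 'A';
  if (c >= 'a' && c <= 'z') return c - 'a' + 26;
  if (c >= '0' && c <= '9') return c - '0' + 52;
  if (c == '+') return 62;
  if (c == '/') return 63;
  return -1;
}

// Hypothetical decoder: skips CR/LF, stops at '=' padding, and returns an
// empty string on any other unexpected character or a dangling single sextet.
std::string DecodeBase64(const std::string &in) {
  std::string out;
  std::uint32_t acc = 0; // bit accumulator; only the low bits are consumed
  int bits = 0;          // number of not-yet-emitted bits held in acc
  for (unsigned char c : in) {
    if (c == '\r' || c == '\n') continue;
    if (c == '=') break;
    const int v = SextetOf(c);
    if (v < 0) return std::string();
    acc = (acc << 6) | static_cast<std::uint32_t>(v);
    bits += 6;
    if (bits >= 8) {
      bits -= 8;
      out.push_back(static_cast<char>((acc >> bits) & 0xFF));
    }
  }
  // Six leftover bits mean one stray character that cannot form a byte.
  if (bits >= 6) return std::string();
  return out;
}
```

Because the decoder walks a `std::string` with a range-based loop, it never reads past the end of the input, and the character classification replaces the lookup table so no index can fall outside it.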
--- a/deploy/serving/cpp/preprocess/picodet_lcnet_1_5x_416_coco.h
+++ b/deploy/serving/cpp/preprocess/picodet_lcnet_1_5x_416_coco.h
+// Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+//     http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+#pragma once
+#include "core/general-server/general_model_service.pb.h"
+#include "core/general-server/op/general_infer_helper.h"
+#include "paddle_inference_api.h" // NOLINT
+#include <string>
+#include <vector>
+
+#include "opencv2/core.hpp"
+#include "opencv2/imgcodecs.hpp"
+#include "opencv2/imgproc.hpp"
+#include <chrono>
+#include <iomanip>
+#include <iostream>
+#include <ostream>
+#include <vector>
+
+#include <cstring>
+#include <fstream>
+#include <numeric>
+
+namespace baidu {
+namespace paddle_serving {
+namespace serving {
+
+class picodet_lcnet_1_5x_416_coco
+    : public baidu::paddle_serving::predictor::OpWithChannel<GeneralBlob> {
+public:
+  typedef std::vector<paddle::PaddleTensor> TensorVector;
+
+  DECLARE_OP(picodet_lcnet_1_5x_416_coco);
+
+  int inference();
+
+private:
+  // preprocess
+  std::vector<float> mean_ = {0.485f, 0.456f, 0.406f};
+  std::vector<float> scale_ = {0.229f, 0.224f, 0.225f};
+  bool is_scale_ = true;
+  int im_shape_h = 416;
+  int im_shape_w = 416;
+  float scale_factor_h = 1.0f;
+  float scale_factor_w = 1.0f;
+  void preprocess_det(const cv::Mat &img, float *data, float &scale_factor_h,
+                      float &scale_factor_w, int im_shape_h, int im_shape_w,
+                      const std::vector<float> &mean,
+                      const std::vector<float> &scale, const bool is_scale);
+
+  // read pics
+  cv::Mat Base2Mat(std::string &base64_data);
+  std::string base64Decode(const char *Data, int DataByte);
+};
+
+} // namespace serving
+} // namespace paddle_serving
+} // namespace baidu
--- a/deploy/serving/cpp/preprocess/ppyolo_mbv3_large_coco.cpp
+++ b/deploy/serving/cpp/preprocess/ppyolo_mbv3_large_coco.cpp
+// Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+//     http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+#include "core/general-server/op/ppyolo_mbv3_large_coco.h"
+#include "core/predictor/framework/infer.h"
+#include "core/predictor/framework/memory.h"
+#include "core/predictor/framework/resource.h"
+#include "core/util/include/timer.h"
+#include <algorithm>
+#include <iostream>
+#include <memory>
+#include <sstream>
+
+namespace baidu {
+namespace paddle_serving {
+namespace serving {
+
+using baidu::paddle_serving::Timer;
+using baidu::paddle_serving::predictor::InferManager;
+using baidu::paddle_serving::predictor::MempoolWrapper;
+using baidu::paddle_serving::predictor::PaddleGeneralModelConfig;
+using baidu::paddle_serving::predictor::general_model::Request;
+using baidu::paddle_serving::predictor::general_model::Response;
+using baidu::paddle_serving::predictor::general_model::Tensor;
+
+int ppyolo_mbv3_large_coco::inference() {
+  VLOG(2) << "Going to run inference";
+  const std::vector<std::string> pre_node_names = pre_names();
+  if (pre_node_names.size() != 1) {
+    LOG(ERROR) << "This op(" << op_name()
+               << ") can only have one predecessor op, but received "
+               << pre_node_names.size();
+    return -1;
+  }
+  const std::string pre_name = pre_node_names[0];
+
+  const GeneralBlob *input_blob = get_depend_argument<GeneralBlob>(pre_name);
+  if (!input_blob) {
+    LOG(ERROR) << "input_blob is nullptr,error";
+    return -1;
+  }
+  uint64_t log_id = input_blob->GetLogId();
+  VLOG(2) << "(logid=" << log_id << ") Get precedent op name: " << pre_name;
+
+  GeneralBlob *output_blob = mutable_data<GeneralBlob>();
+  if (!output_blob) {
+    LOG(ERROR) << "output_blob is nullptr,error";
+    return -1;
+  }
+  output_blob->SetLogId(log_id);
+
+  const TensorVector *in = &input_blob->tensor_vector;
+  TensorVector *out = &output_blob->tensor_vector;
+
+  int batch_size = input_blob->_batch_size;
+  output_blob->_batch_size = batch_size;
+  VLOG(2) << "(logid=" << log_id << ") infer batch size: " << batch_size;
+
+  Timer timeline;
+  int64_t start = timeline.TimeStampUS();
+  timeline.Start();
+
+  // only support string type
+  char *total_input_ptr = static_cast<char *>(in->at(0).data.data());
+  std::string base64str = total_input_ptr;
+
+  cv::Mat img = Base2Mat(base64str);
+  cv::cvtColor(img, img, cv::COLOR_BGR2RGB);
+
+  // preprocess
+  std::vector<float> input(1 * 3 * im_shape_h * im_shape_w, 0.0f);
+  preprocess_det(img, input.data(), scale_factor_h, scale_factor_w, im_shape_h,
+                 im_shape_w, mean_, scale_, is_scale_);
+
+  // create real_in
+  TensorVector *real_in = new TensorVector();
+  if (!real_in) {
+    LOG(ERROR) << "real_in is nullptr,error";
+    return -1;
+  }
+
+  int in_num = 0;
+  size_t databuf_size = 0;
+  void *databuf_data = nullptr;
+  char *databuf_char = nullptr;
+
+  // im_shape
+  std::vector<float> im_shape{static_cast<float>(im_shape_h),
+                              static_cast<float>(im_shape_w)};
+  databuf_size = 2 * sizeof(float);
+
+  databuf_data = MempoolWrapper::instance().malloc(databuf_size);
+  if (!databuf_data) {
+    LOG(ERROR) << "Malloc failed, size: " << databuf_size;
+    return -1;
+  }
+
+  memcpy(databuf_data, im_shape.data(), databuf_size);
+  databuf_char = reinterpret_cast<char *>(databuf_data);
+  paddle::PaddleBuf paddleBuf_0(databuf_char, databuf_size);
+  paddle::PaddleTensor tensor_in_0;
+  tensor_in_0.name = "im_shape";
+  tensor_in_0.dtype = paddle::PaddleDType::FLOAT32;
+  tensor_in_0.shape = {1, 2};
+  tensor_in_0.lod = in->at(0).lod;
+  tensor_in_0.data = paddleBuf_0;
+  real_in->push_back(tensor_in_0);
+
+  // image
+  in_num = 1 * 3 * im_shape_h * im_shape_w;
+  databuf_size = in_num * sizeof(float);
+
+  databuf_data = MempoolWrapper::instance().malloc(databuf_size);
+  if (!databuf_data) {
+    LOG(ERROR) << "Malloc failed, size: " << databuf_size;
+    return -1;
+  }
+
+  memcpy(databuf_data, input.data(), databuf_size);
+  databuf_char = reinterpret_cast<char *>(databuf_data);
+  paddle::PaddleBuf paddleBuf_1(databuf_char, databuf_size);
+  paddle::PaddleTensor tensor_in_1;
+  tensor_in_1.name = "image";
+  tensor_in_1.dtype = paddle::PaddleDType::FLOAT32;
+  tensor_in_1.shape = {1, 3, im_shape_h, im_shape_w};
+  tensor_in_1.lod = in->at(0).lod;
+  tensor_in_1.data = paddleBuf_1;
+  real_in->push_back(tensor_in_1);
+
+  // scale_factor
+  std::vector<float> scale_factor{scale_factor_h, scale_factor_w};
+  databuf_size = 2 * sizeof(float);
+
+  databuf_data = MempoolWrapper::instance().malloc(databuf_size);
+  if (!databuf_data) {
+    LOG(ERROR) << "Malloc failed, size: " << databuf_size;
+    return -1;
+  }
+
+  memcpy(databuf_data, scale_factor.data(), databuf_size);
+  databuf_char = reinterpret_cast<char *>(databuf_data);
+  paddle::PaddleBuf paddleBuf_2(databuf_char, databuf_size);
+  paddle::PaddleTensor tensor_in_2;
+  tensor_in_2.name = "scale_factor";
+  tensor_in_2.dtype = paddle::PaddleDType::FLOAT32;
+  tensor_in_2.shape = {1, 2};
+  tensor_in_2.lod = in->at(0).lod;
+  tensor_in_2.data = paddleBuf_2;
+  real_in->push_back(tensor_in_2);
+
+  if (InferManager::instance().infer(engine_name().c_str(), real_in, out,
+                                     batch_size)) {
+    LOG(ERROR) << "(logid=" << log_id
+               << ") Failed do infer in fluid model: " << engine_name().c_str();
+    return -1;
+  }
+
+  int64_t end = timeline.TimeStampUS();
+  CopyBlobInfo(input_blob, output_blob);
+  AddBlobInfo(output_blob, start);
+  AddBlobInfo(output_blob, end);
+  return 0;
+}
+
+void ppyolo_mbv3_large_coco::preprocess_det(const cv::Mat &img, float *data,
+                                            float &scale_factor_h,
+                                            float &scale_factor_w,
+                                            int im_shape_h, int im_shape_w,
+                                            const std::vector<float> &mean,
+                                            const std::vector<float> &scale,
+                                            const bool is_scale) {
+  // scale_factor
+  scale_factor_h =
+      static_cast<float>(im_shape_h) / static_cast<float>(img.rows);
+  scale_factor_w =
+      static_cast<float>(im_shape_w) / static_cast<float>(img.cols);
+
+  // Resize
+  cv::Mat resize_img;
+  cv::resize(img, resize_img, cv::Size(im_shape_w, im_shape_h), 0, 0,
+             cv::INTER_CUBIC);
+
+  // Normalize
+  double e = 1.0;
+  if (is_scale) {
+    e /= 255.0;
+  }
+  cv::Mat img_fp;
+  (resize_img).convertTo(img_fp, CV_32FC3, e);
+  for (int h = 0; h < im_shape_h; h++) {
+    for (int w = 0; w < im_shape_w; w++) {
+      img_fp.at<cv::Vec3f>(h, w)[0] =
+          (img_fp.at<cv::Vec3f>(h, w)[0] - mean[0]) / scale[0];
+      img_fp.at<cv::Vec3f>(h, w)[1] =
+          (img_fp.at<cv::Vec3f>(h, w)[1] - mean[1]) / scale[1];
+      img_fp.at<cv::Vec3f>(h, w)[2] =
+          (img_fp.at<cv::Vec3f>(h, w)[2] - mean[2]) / scale[2];
+    }
+  }
+
+  // Permute
+  int rh = img_fp.rows;
+  int rw = img_fp.cols;
+  int rc = img_fp.channels();
+  for (int i = 0; i < rc; ++i) {
+    cv::extractChannel(img_fp, cv::Mat(rh, rw, CV_32FC1, data + i * rh * rw),
+                       i);
+  }
+}
+
+cv::Mat ppyolo_mbv3_large_coco::Base2Mat(std::string &base64_data) {
+  cv::Mat img;
+  std::string s_mat;
+  s_mat = base64Decode(base64_data.data(), base64_data.size());
+  std::vector<char> base64_img(s_mat.begin(), s_mat.end());
+  img = cv::imdecode(base64_img, cv::IMREAD_COLOR); // CV_LOAD_IMAGE_COLOR
+  return img;
+}
+
+std::string ppyolo_mbv3_large_coco::base64Decode(const char *Data,
+                                                 int DataByte) {
+  const char DecodeTable[] = {
+      0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
+      0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
+      0,  0,  0,  0,  0,  0,  0,  0,  0,
+      62, // '+'
+      0,  0,  0,
+      63,                                     // '/'
+      52, 53, 54, 55, 56, 57, 58, 59, 60, 61, // '0'-'9'
+      0,  0,  0,  0,  0,  0,  0,  0,  1,  2,  3,  4,  5,  6,  7,  8,  9,
+      10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, // 'A'-'Z'
+      0,  0,  0,  0,  0,  0,  26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36,
+      37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, // 'a'-'z'
+  };
+
+  std::string strDecode;
+  int nValue;
+  int i = 0;
+  while (i < DataByte) {
+    if (*Data != '\r' && *Data != '\n') {
+      nValue = DecodeTable[*Data++] << 18;
+      nValue += DecodeTable[*Data++] << 12;
+      strDecode += (nValue & 0x00FF0000) >> 16;
+      if (*Data != '=') {
+        nValue += DecodeTable[*Data++] << 6;
+        strDecode += (nValue & 0x0000FF00) >> 8;
+        if (*Data != '=') {
+          nValue += DecodeTable[*Data++];
+          strDecode += nValue & 0x000000FF;
+        }
+      }
+      i += 4;
+    } else // CR/LF: skip it
+    {
+      Data++;
+      i++;
+    }
+  }
+  return strDecode;
+}
+
+DEFINE_OP(ppyolo_mbv3_large_coco);
+
+} // namespace serving
+} // namespace paddle_serving
+} // namespace baidu
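Review note: `preprocess_det` above multiplies pixels by 1/255 when `is_scale` is set, applies the per-channel `(x - mean) / scale`, and then splits the interleaved HWC image into planar CHW. Below is a minimal OpenCV-free sketch of that arithmetic, useful for checking values by hand. `NormalizeToCHW` is a hypothetical name, and the sketch assumes 8-bit, 3-channel interleaved input that has already been resized to the target shape.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: normalize interleaved HWC bytes into planar CHW floats.
std::vector<float> NormalizeToCHW(const std::vector<std::uint8_t> &hwc,
                                  int height, int width,
                                  const std::vector<float> &mean,
                                  const std::vector<float> &scale,
                                  bool is_scale) {
  const int channels = 3;
  // Same factor the convertTo() call applies: 1/255 when is_scale is set.
  const float e = is_scale ? 1.0f / 255.0f : 1.0f;
  const std::size_t plane = static_cast<std::size_t>(height) * width;
  std::vector<float> chw(channels * plane, 0.0f);
  for (std::size_t p = 0; p < plane; ++p) {
    for (int c = 0; c < channels; ++c) {
      const float v = hwc[p * channels + c] * e;
      // Channel c of pixel p lands in plane c at offset p (HWC -> CHW).
      chw[c * plane + p] = (v - mean[c]) / scale[c];
    }
  }
  return chw;
}
```

For a 1x2 image with pixels `(255, 0, 255)` and `(0, 255, 0)`, mean and scale both 0.5, the output planes are approximately `[1, -1]`, `[-1, 1]`, `[1, -1]`.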
--- a/deploy/serving/cpp/preprocess/ppyolo_mbv3_large_coco.h
+++ b/deploy/serving/cpp/preprocess/ppyolo_mbv3_large_coco.h
+// Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+//     http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+#pragma once
+#include "core/general-server/general_model_service.pb.h"
+#include "core/general-server/op/general_infer_helper.h"
+#include "paddle_inference_api.h" // NOLINT
+#include <string>
+#include <vector>
+
+#include "opencv2/core.hpp"
+#include "opencv2/imgcodecs.hpp"
+#include "opencv2/imgproc.hpp"
+#include <chrono>
+#include <iomanip>
+#include <iostream>
+#include <ostream>
+#include <vector>
+
+#include <cstring>
+#include <fstream>
+#include <numeric>
+
+namespace baidu {
+namespace paddle_serving {
+namespace serving {
+
+class ppyolo_mbv3_large_coco
+    : public baidu::paddle_serving::predictor::OpWithChannel<GeneralBlob> {
+public:
+  typedef std::vector<paddle::PaddleTensor> TensorVector;
+
+  DECLARE_OP(ppyolo_mbv3_large_coco);
+
+  int inference();
+
+private:
+  // preprocess
+  std::vector<float> mean_ = {0.485f, 0.456f, 0.406f};
+  std::vector<float> scale_ = {0.229f, 0.224f, 0.225f};
+  bool is_scale_ = true;
+  int im_shape_h = 320;
+  int im_shape_w = 320;
+  float scale_factor_h = 1.0f;
+  float scale_factor_w = 1.0f;
+  void preprocess_det(const cv::Mat &img, float *data, float &scale_factor_h,
+                      float &scale_factor_w, int im_shape_h, int im_shape_w,
+                      const std::vector<float> &mean,
+                      const std::vector<float> &scale, const bool is_scale);
+
+  // read pics
+  cv::Mat Base2Mat(std::string &base64_data);
+  std::string base64Decode(const char *Data, int DataByte);
+};
+
+} // namespace serving
+} // namespace paddle_serving
+} // namespace baidu
--- a/deploy/serving/cpp/preprocess/ppyoloe_crn_s_300e_coco.cpp
+++ b/deploy/serving/cpp/preprocess/ppyoloe_crn_s_300e_coco.cpp
+// Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+//     http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+#include "core/general-server/op/ppyoloe_crn_s_300e_coco.h"
+#include "core/predictor/framework/infer.h"
+#include "core/predictor/framework/memory.h"
+#include "core/predictor/framework/resource.h"
+#include "core/util/include/timer.h"
+#include <algorithm>
+#include <iostream>
+#include <memory>
+#include <sstream>
+
+namespace baidu {
+namespace paddle_serving {
+namespace serving {
+
+using baidu::paddle_serving::Timer;
+using baidu::paddle_serving::predictor::InferManager;
+using baidu::paddle_serving::predictor::MempoolWrapper;
+using baidu::paddle_serving::predictor::PaddleGeneralModelConfig;
+using baidu::paddle_serving::predictor::general_model::Request;
+using baidu::paddle_serving::predictor::general_model::Response;
+using baidu::paddle_serving::predictor::general_model::Tensor;
+
+int ppyoloe_crn_s_300e_coco::inference() {
+  VLOG(2) << "Going to run inference";
+  const std::vector<std::string> pre_node_names = pre_names();
+  if (pre_node_names.size() != 1) {
+    LOG(ERROR) << "This op(" << op_name()
+               << ") can only have one predecessor op, but received "
+               << pre_node_names.size();
+    return -1;
+  }
+  const std::string pre_name = pre_node_names[0];
+
+  const GeneralBlob *input_blob = get_depend_argument<GeneralBlob>(pre_name);
+  if (!input_blob) {
+    LOG(ERROR) << "input_blob is nullptr,error";
+    return -1;
+  }
+  uint64_t log_id = input_blob->GetLogId();
+  VLOG(2) << "(logid=" << log_id << ") Get precedent op name: " << pre_name;
+
+  GeneralBlob *output_blob = mutable_data<GeneralBlob>();
+  if (!output_blob) {
+    LOG(ERROR) << "output_blob is nullptr,error";
+    return -1;
+  }
+  output_blob->SetLogId(log_id);
+
+  const TensorVector *in = &input_blob->tensor_vector;
+  TensorVector *out = &output_blob->tensor_vector;
+
+  int batch_size = input_blob->_batch_size;
+  output_blob->_batch_size = batch_size;
+  VLOG(2) << "(logid=" << log_id << ") infer batch size: " << batch_size;
+
+  Timer timeline;
+  int64_t start = timeline.TimeStampUS();
+  timeline.Start();
+
+  // only support string type
+  char *total_input_ptr = static_cast<char *>(in->at(0).data.data());
+  std::string base64str = total_input_ptr;
+
+  cv::Mat img = Base2Mat(base64str);
+  cv::cvtColor(img, img, cv::COLOR_BGR2RGB);
+
+  // preprocess
+  std::vector<float> input(1 * 3 * im_shape_h * im_shape_w, 0.0f);
+  preprocess_det(img, input.data(), scale_factor_h, scale_factor_w, im_shape_h,
+                 im_shape_w, mean_, scale_, is_scale_);
+
+  // create real_in
+  TensorVector *real_in = new TensorVector();
+  if (!real_in) {
+    LOG(ERROR) << "real_in is nullptr,error";
+    return -1;
+  }
+
+  int in_num = 0;
+  size_t databuf_size = 0;
+  void *databuf_data = nullptr;
+  char *databuf_char = nullptr;
+
+  // image
+  in_num = 1 * 3 * im_shape_h * im_shape_w;
+  databuf_size = in_num * sizeof(float);
+
+  databuf_data = MempoolWrapper::instance().malloc(databuf_size);
+  if (!databuf_data) {
+    LOG(ERROR) << "Malloc failed, size: " << databuf_size;
+    return -1;
+  }
+
+  memcpy(databuf_data, input.data(), databuf_size);
+  databuf_char = reinterpret_cast<char *>(databuf_data);
+  paddle::PaddleBuf paddleBuf(databuf_char, databuf_size);
+  paddle::PaddleTensor tensor_in;
+  tensor_in.name = "image";
+  tensor_in.dtype = paddle::PaddleDType::FLOAT32;
+  tensor_in.shape = {1, 3, im_shape_h, im_shape_w};
+  tensor_in.lod = in->at(0).lod;
+  tensor_in.data = paddleBuf;
+  real_in->push_back(tensor_in);
+
+  // scale_factor
+  std::vector<float> scale_factor{scale_factor_h, scale_factor_w};
+  databuf_size = 2 * sizeof(float);
+
+  databuf_data = MempoolWrapper::instance().malloc(databuf_size);
+  if (!databuf_data) {
+    LOG(ERROR) << "Malloc failed, size: " << databuf_size;
+    return -1;
+  }
+
+  memcpy(databuf_data, scale_factor.data(), databuf_size);
+  databuf_char = reinterpret_cast<char *>(databuf_data);
+  paddle::PaddleBuf paddleBuf_2(databuf_char, databuf_size);
+  paddle::PaddleTensor tensor_in_2;
+  tensor_in_2.name = "scale_factor";
+  tensor_in_2.dtype = paddle::PaddleDType::FLOAT32;
+  tensor_in_2.shape = {1, 2};
+  tensor_in_2.lod = in->at(0).lod;
+  tensor_in_2.data = paddleBuf_2;
+  real_in->push_back(tensor_in_2);
+
+  if (InferManager::instance().infer(engine_name().c_str(), real_in, out,
+                                     batch_size)) {
+    LOG(ERROR) << "(logid=" << log_id
+               << ") Failed do infer in fluid model: " << engine_name().c_str();
+    return -1;
+  }
+
+  int64_t end = timeline.TimeStampUS();
+  CopyBlobInfo(input_blob, output_blob);
+  AddBlobInfo(output_blob, start);
+  AddBlobInfo(output_blob, end);
+  return 0;
+}
+
+void ppyoloe_crn_s_300e_coco::preprocess_det(const cv::Mat &img, float *data,
+                                             float &scale_factor_h,
+                                             float &scale_factor_w,
+                                             int im_shape_h, int im_shape_w,
+                                             const std::vector<float> &mean,
+                                             const std::vector<float> &scale,
+                                             const bool is_scale) {
+  // scale_factor
+  scale_factor_h =
+      static_cast<float>(im_shape_h) / static_cast<float>(img.rows);
+  scale_factor_w =
+      static_cast<float>(im_shape_w) / static_cast<float>(img.cols);
+
+  // Resize
+  cv::Mat resize_img;
+  cv::resize(img, resize_img, cv::Size(im_shape_w, im_shape_h), 0, 0,
+             cv::INTER_CUBIC);
+
+  // Normalize
+  double e = 1.0;
+  if (is_scale) {
+    e /= 255.0;
+  }
+  cv::Mat img_fp;
+  (resize_img).convertTo(img_fp, CV_32FC3, e);
+  for (int h = 0; h < im_shape_h; h++) {
+    for (int w = 0; w < im_shape_w; w++) {
+      img_fp.at<cv::Vec3f>(h, w)[0] =
+          (img_fp.at<cv::Vec3f>(h, w)[0] - mean[0]) / scale[0];
+      img_fp.at<cv::Vec3f>(h, w)[1] =
+          (img_fp.at<cv::Vec3f>(h, w)[1] - mean[1]) / scale[1];
+      img_fp.at<cv::Vec3f>(h, w)[2] =
+          (img_fp.at<cv::Vec3f>(h, w)[2] - mean[2]) / scale[2];
+    }
+  }
+
+  // Permute
+  int rh = img_fp.rows;
+  int rw = img_fp.cols;
+  int rc = img_fp.channels();
+  for (int i = 0; i < rc; ++i) {
+    cv::extractChannel(img_fp, cv::Mat(rh, rw, CV_32FC1, data + i * rh * rw),
+                       i);
+  }
+}
+
+cv::Mat ppyoloe_crn_s_300e_coco::Base2Mat(std::string &base64_data) {
+  cv::Mat img;
+  std::string s_mat;
+  s_mat = base64Decode(base64_data.data(), base64_data.size());
+  std::vector<char> base64_img(s_mat.begin(), s_mat.end());
+  img = cv::imdecode(base64_img, cv::IMREAD_COLOR); // CV_LOAD_IMAGE_COLOR
+  return img;
+}
+
+std::string ppyoloe_crn_s_300e_coco::base64Decode(const char *Data,
+                                                  int DataByte) {
+  const char DecodeTable[] = {
+      0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
+      0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
+      0,  0,  0,  0,  0,  0,  0,  0,  0,
+      62, // '+'
+      0,  0,  0,
+      63,                                     // '/'
+      52, 53, 54, 55, 56, 57, 58, 59, 60, 61, // '0'-'9'
+      0,  0,  0,  0,  0,  0,  0,  0,  1,  2,  3,  4,  5,  6,  7,  8,  9,
+      10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, // 'A'-'Z'
+      0,  0,  0,  0,  0,  0,  26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36,
+      37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, // 'a'-'z'
+  };
+
+  std::string strDecode;
+  int nValue;
+  int i = 0;
+  while (i < DataByte) {
+    if (*Data != '\r' && *Data != '\n') {
+      nValue = DecodeTable[*Data++] << 18;
+      nValue += DecodeTable[*Data++] << 12;
+      strDecode += (nValue & 0x00FF0000) >> 16;
+      if (*Data != '=') {
+        nValue += DecodeTable[*Data++] << 6;
+        strDecode += (nValue & 0x0000FF00) >> 8;
+        if (*Data != '=') {
+          nValue += DecodeTable[*Data++];
+          strDecode += nValue & 0x000000FF;
+        }
+      }
+      i += 4;
+    } else // CR/LF: skip it
+    {
+      Data++;
+      i++;
+    }
+  }
+  return strDecode;
+}
+
+DEFINE_OP(ppyoloe_crn_s_300e_coco);
+
+} // namespace serving
+} // namespace paddle_serving
+} // namespace baidu
--- a/deploy/serving/cpp/preprocess/ppyoloe_crn_s_300e_coco.h
+++ b/deploy/serving/cpp/preprocess/ppyoloe_crn_s_300e_coco.h
+// Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+//     http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+#pragma once
+#include "core/general-server/general_model_service.pb.h"
+#include "core/general-server/op/general_infer_helper.h"
+#include "paddle_inference_api.h" // NOLINT
+#include <string>
+#include <vector>
+
+#include "opencv2/core.hpp"
+#include "opencv2/imgcodecs.hpp"
+#include "opencv2/imgproc.hpp"
+#include <chrono>
+#include <iomanip>
+#include <iostream>
+#include <ostream>
+#include <vector>
+
+#include <cstring>
+#include <fstream>
+#include <numeric>
+
+namespace baidu {
+namespace paddle_serving {
+namespace serving {
+
+class ppyoloe_crn_s_300e_coco
+    : public baidu::paddle_serving::predictor::OpWithChannel<GeneralBlob> {
+public:
+  typedef std::vector<paddle::PaddleTensor> TensorVector;
+
+  DECLARE_OP(ppyoloe_crn_s_300e_coco);
+
+  int inference();
+
+private:
+  // preprocess
+  std::vector<float> mean_ = {0.485f, 0.456f, 0.406f};
+  std::vector<float> scale_ = {0.229f, 0.224f, 0.225f};
+  bool is_scale_ = true;
+  int im_shape_h = 640;
+  int im_shape_w = 640;
+  float scale_factor_h = 1.0f;
+  float scale_factor_w = 1.0f;
+  void preprocess_det(const cv::Mat &img, float *data, float &scale_factor_h,
+                      float &scale_factor_w, int im_shape_h, int im_shape_w,
+                      const std::vector<float> &mean,
+                      const std::vector<float> &scale, const bool is_scale);
+
+  // read pics
+  cv::Mat Base2Mat(std::string &base64_data);
+  std::string base64Decode(const char *Data, int DataByte);
+};
+
+} // namespace serving
+} // namespace paddle_serving
+} // namespace baidu
--- a/deploy/serving/cpp/preprocess/tinypose_128x96.cpp
+++ b/deploy/serving/cpp/preprocess/tinypose_128x96.cpp
+// Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+//     http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+#include "core/general-server/op/tinypose_128x96.h"
+#include "core/predictor/framework/infer.h"
+#include "core/predictor/framework/memory.h"
+#include "core/predictor/framework/resource.h"
+#include "core/util/include/timer.h"
+#include <algorithm>
+#include <iostream>
+#include <memory>
+#include <sstream>
+
+namespace baidu {
+namespace paddle_serving {
+namespace serving {
+
+using baidu::paddle_serving::Timer;
+using baidu::paddle_serving::predictor::InferManager;
+using baidu::paddle_serving::predictor::MempoolWrapper;
+using baidu::paddle_serving::predictor::PaddleGeneralModelConfig;
+using baidu::paddle_serving::predictor::general_model::Request;
+using baidu::paddle_serving::predictor::general_model::Response;
+using baidu::paddle_serving::predictor::general_model::Tensor;
+
+int tinypose_128x96::inference() {
+  VLOG(2) << "Going to run inference";
+  const std::vector<std::string> pre_node_names = pre_names();
+  if (pre_node_names.size() != 1) {
+    LOG(ERROR) << "This op(" << op_name()
+               << ") can only have one predecessor op, but received "
+               << pre_node_names.size();
+    return -1;
+  }
+  const std::string pre_name = pre_node_names[0];
+
+  const GeneralBlob *input_blob = get_depend_argument<GeneralBlob>(pre_name);
+  if (!input_blob) {
+    LOG(ERROR) << "input_blob is nullptr,error";
+    return -1;
+  }
+  uint64_t log_id = input_blob->GetLogId();
+  VLOG(2) << "(logid=" << log_id << ") Get precedent op name: " << pre_name;
+
+  GeneralBlob *output_blob = mutable_data<GeneralBlob>();
+  if (!output_blob) {
+    LOG(ERROR) << "output_blob is nullptr,error";
+    return -1;
+  }
+  output_blob->SetLogId(log_id);
+
+  const TensorVector *in = &input_blob->tensor_vector;
+  TensorVector *out = &output_blob->tensor_vector;
+
+  int batch_size = input_blob->_batch_size;
+  output_blob->_batch_size = batch_size;
+  VLOG(2) << "(logid=" << log_id << ") infer batch size: " << batch_size;
+
+  Timer timeline;
+  int64_t start = timeline.TimeStampUS();
+  timeline.Start();
+
+  // only support string type
+  char *total_input_ptr = static_cast<char *>(in->at(0).data.data());
+  std::string base64str = total_input_ptr;
+
+  cv::Mat img = Base2Mat(base64str);
+  cv::cvtColor(img, img, cv::COLOR_BGR2RGB);
+
+  // preprocess
+  std::vector<float> input(1 * 3 * im_shape_h * im_shape_w, 0.0f);
+  preprocess_det(img, input.data(), scale_factor_h, scale_factor_w, im_shape_h,
+                 im_shape_w, mean_, scale_, is_scale_);
+
+  // create real_in
+  TensorVector *real_in = new TensorVector();
+  if (!real_in) {
+    LOG(ERROR) << "real_in is nullptr,error";
+    return -1;
+  }
+
+  int in_num = 0;
+  size_t databuf_size = 0;
+  void *databuf_data = nullptr;
+  char *databuf_char = nullptr;
+
+  // image
+  in_num = 1 * 3 * im_shape_h * im_shape_w;
+  databuf_size = in_num * sizeof(float);
+
+  databuf_data = MempoolWrapper::instance().malloc(databuf_size);
+  if (!databuf_data) {
+    LOG(ERROR) << "Malloc failed, size: " << databuf_size;
+    return -1;
+  }
+
+  memcpy(databuf_data, input.data(), databuf_size);
+  databuf_char = reinterpret_cast<char *>(databuf_data);
+  paddle::PaddleBuf paddleBuf(databuf_char, databuf_size);
+  paddle::PaddleTensor tensor_in;
+  tensor_in.name = "image";
+  tensor_in.dtype = paddle::PaddleDType::FLOAT32;
+  tensor_in.shape = {1, 3, im_shape_h, im_shape_w};
+  tensor_in.lod = in->at(0).lod;
+  tensor_in.data = paddleBuf;
+  real_in->push_back(tensor_in);
+
+  if (InferManager::instance().infer(engine_name().c_str(), real_in, out,
+                                     batch_size)) {
+    LOG(ERROR) << "(logid=" << log_id
+               << ") Failed do infer in fluid model: " << engine_name().c_str();
+    return -1;
+  }
+
+  int64_t end = timeline.TimeStampUS();
+  CopyBlobInfo(input_blob, output_blob);
+  AddBlobInfo(output_blob, start);
+  AddBlobInfo(output_blob, end);
+  return 0;
+}
+
+void tinypose_128x96::preprocess_det(const cv::Mat &img, float *data,
+                                     float &scale_factor_h,
+                                     float &scale_factor_w, int im_shape_h,
+                                     int im_shape_w,
+                                     const std::vector<float> &mean,
+                                     const std::vector<float> &scale,
+                                     const bool is_scale) {
+  // Resize
+  cv::Mat resize_img;
+  cv::resize(img, resize_img, cv::Size(im_shape_w, im_shape_h), 0, 0,
+             cv::INTER_LINEAR);
+
+  // Normalize
+  double e = 1.0;
+  if (is_scale) {
+    e /= 255.0;
+  }
+  cv::Mat img_fp;
+  (resize_img).convertTo(img_fp, CV_32FC3, e);
+  for (int h = 0; h < im_shape_h; h++) {
+    for (int w = 0; w < im_shape_w; w++) {
+      img_fp.at<cv::Vec3f>(h, w)[0] =
+          (img_fp.at<cv::Vec3f>(h, w)[0] - mean[0]) / scale[0];
+      img_fp.at<cv::Vec3f>(h, w)[1] =
+          (img_fp.at<cv::Vec3f>(h, w)[1] - mean[1]) / scale[1];
+      img_fp.at<cv::Vec3f>(h, w)[2] =
+          (img_fp.at<cv::Vec3f>(h, w)[2] - mean[2]) / scale[2];
+    }
+  }
+
+  // Permute
+  int rh = img_fp.rows;
+  int rw = img_fp.cols;
+  int rc = img_fp.channels();
+  for (int i = 0; i < rc; ++i) {
+    cv::extractChannel(img_fp, cv::Mat(rh, rw, CV_32FC1, data + i * rh * rw),
+                       i);
+  }
+}
+
+cv::Mat tinypose_128x96::Base2Mat(std::string &base64_data) {
+  cv::Mat img;
+  std::string s_mat;
+  s_mat = base64Decode(base64_data.data(), base64_data.size());
+  std::vector<char> base64_img(s_mat.begin(), s_mat.end());
+  img = cv::imdecode(base64_img, cv::IMREAD_COLOR); // CV_LOAD_IMAGE_COLOR
+  return img;
+}
+
+std::string tinypose_128x96::base64Decode(const char *Data, int DataByte) {
+  const char DecodeTable[] = {
+      0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
+      0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
+      0,  0,  0,  0,  0,  0,  0,  0,  0,
+      62, // '+'
+      0,  0,  0,
+      63,                                     // '/'
+      52, 53, 54, 55, 56, 57, 58, 59, 60, 61, // '0'-'9'
+      0,  0,  0,  0,  0,  0,  0,  0,  1,  2,  3,  4,  5,  6,  7,  8,  9,
+      10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, // 'A'-'Z'
+      0,  0,  0,  0,  0,  0,  26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36,
+      37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, // 'a'-'z'
+  };
+
+  std::string strDecode;
+  int nValue;
+  int i = 0;
+  while (i < DataByte) {
+    if (*Data != '\r' && *Data != '\n') {
+      nValue = DecodeTable[*Data++] << 18;
+      nValue += DecodeTable[*Data++] << 12;
+      strDecode += (nValue & 0x00FF0000) >> 16;
+      if (*Data != '=') {
+        nValue += DecodeTable[*Data++] << 6;
+        strDecode += (nValue & 0x0000FF00) >> 8;
+        if (*Data != '=') {
+          nValue += DecodeTable[*Data++];
+          strDecode += nValue & 0x000000FF;
+        }
+      }
+      i += 4;
+    } else // CR/LF: skip
+    {
+      Data++;
+      i++;
+    }
+  }
+  return strDecode;
+}
+
+DEFINE_OP(tinypose_128x96);
+
+} // namespace serving
+} // namespace paddle_serving
+} // namespace baidu
--- a/deploy/serving/cpp/preprocess/tinypose_128x96.h
+++ b/deploy/serving/cpp/preprocess/tinypose_128x96.h
+// Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+//     http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+#pragma once
+#include "core/general-server/general_model_service.pb.h"
+#include "core/general-server/op/general_infer_helper.h"
+#include "paddle_inference_api.h" // NOLINT
+#include <string>
+#include <vector>
+
+#include "opencv2/core.hpp"
+#include "opencv2/imgcodecs.hpp"
+#include "opencv2/imgproc.hpp"
+#include <chrono>
+#include <iomanip>
+#include <iostream>
+#include <ostream>
+#include <vector>
+
+#include <cstring>
+#include <fstream>
+#include <numeric>
+
+namespace baidu {
+namespace paddle_serving {
+namespace serving {
+
+class tinypose_128x96
+    : public baidu::paddle_serving::predictor::OpWithChannel<GeneralBlob> {
+public:
+  typedef std::vector<paddle::PaddleTensor> TensorVector;
+
+  DECLARE_OP(tinypose_128x96);
+
+  int inference();
+
+private:
+  // preprocess
+  std::vector<float> mean_ = {0.485f, 0.456f, 0.406f};
+  std::vector<float> scale_ = {0.229f, 0.224f, 0.225f};
+  bool is_scale_ = true;
+  int im_shape_h = 128;
+  int im_shape_w = 96;
+  float scale_factor_h = 1.0f;
+  float scale_factor_w = 1.0f;
+  void preprocess_det(const cv::Mat &img, float *data, float &scale_factor_h,
+                      float &scale_factor_w, int im_shape_h, int im_shape_w,
+                      const std::vector<float> &mean,
+                      const std::vector<float> &scale, const bool is_scale);
+
+  // read pics
+  cv::Mat Base2Mat(std::string &base64_data);
+  std::string base64Decode(const char *Data, int DataByte);
+};
+
+} // namespace serving
+} // namespace paddle_serving
+} // namespace baidu
--- a/deploy/serving/cpp/preprocess/yolov3_darknet53_270e_coco.cpp
+++ b/deploy/serving/cpp/preprocess/yolov3_darknet53_270e_coco.cpp
+// Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+//     http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+#include "core/general-server/op/yolov3_darknet53_270e_coco.h"
+#include "core/predictor/framework/infer.h"
+#include "core/predictor/framework/memory.h"
+#include "core/predictor/framework/resource.h"
+#include "core/util/include/timer.h"
+#include <algorithm>
+#include <iostream>
+#include <memory>
+#include <sstream>
+
+namespace baidu {
+namespace paddle_serving {
+namespace serving {
+
+using baidu::paddle_serving::Timer;
+using baidu::paddle_serving::predictor::InferManager;
+using baidu::paddle_serving::predictor::MempoolWrapper;
+using baidu::paddle_serving::predictor::PaddleGeneralModelConfig;
+using baidu::paddle_serving::predictor::general_model::Request;
+using baidu::paddle_serving::predictor::general_model::Response;
+using baidu::paddle_serving::predictor::general_model::Tensor;
+
+int yolov3_darknet53_270e_coco::inference() {
+  VLOG(2) << "Going to run inference";
+  const std::vector<std::string> pre_node_names = pre_names();
+  if (pre_node_names.size() != 1) {
+    LOG(ERROR) << "This op(" << op_name()
+               << ") can only have one predecessor op, but received "
+               << pre_node_names.size();
+    return -1;
+  }
+  const std::string pre_name = pre_node_names[0];
+
+  const GeneralBlob *input_blob = get_depend_argument<GeneralBlob>(pre_name);
+  if (!input_blob) {
+    LOG(ERROR) << "input_blob is nullptr,error";
+    return -1;
+  }
+  uint64_t log_id = input_blob->GetLogId();
+  VLOG(2) << "(logid=" << log_id << ") Get precedent op name: " << pre_name;
+
+  GeneralBlob *output_blob = mutable_data<GeneralBlob>();
+  if (!output_blob) {
+    LOG(ERROR) << "output_blob is nullptr,error";
+    return -1;
+  }
+  output_blob->SetLogId(log_id);
+
+  if (!input_blob) {
+    LOG(ERROR) << "(logid=" << log_id
+               << ") Failed mutable depended argument, op:" << pre_name;
+    return -1;
+  }
+
+  const TensorVector *in = &input_blob->tensor_vector;
+  TensorVector *out = &output_blob->tensor_vector;
+
+  int batch_size = input_blob->_batch_size;
+  output_blob->_batch_size = batch_size;
+  VLOG(2) << "(logid=" << log_id << ") infer batch size: " << batch_size;
+
+  Timer timeline;
+  int64_t start = timeline.TimeStampUS();
+  timeline.Start();
+
+  // only support string type
+  char *total_input_ptr = static_cast<char *>(in->at(0).data.data());
+  std::string base64str = total_input_ptr;
+
+  cv::Mat img = Base2Mat(base64str);
+  cv::cvtColor(img, img, cv::COLOR_BGR2RGB);
+
+  // preprocess
+  std::vector<float> input(1 * 3 * im_shape_h * im_shape_w, 0.0f);
+  preprocess_det(img, input.data(), scale_factor_h, scale_factor_w, im_shape_h,
+                 im_shape_w, mean_, scale_, is_scale_);
+
+  // create real_in
+  TensorVector *real_in = new TensorVector();
+  if (!real_in) {
+    LOG(ERROR) << "real_in is nullptr,error";
+    return -1;
+  }
+
+  int in_num = 0;
+  size_t databuf_size = 0;
+  void *databuf_data = NULL;
+  char *databuf_char = NULL;
+
+  // im_shape
+  std::vector<float> im_shape{static_cast<float>(im_shape_h),
+                              static_cast<float>(im_shape_w)};
+  databuf_size = 2 * sizeof(float);
+
+  databuf_data = MempoolWrapper::instance().malloc(databuf_size);
+  if (!databuf_data) {
+    LOG(ERROR) << "Malloc failed, size: " << databuf_size;
+    return -1;
+  }
+
+  memcpy(databuf_data, im_shape.data(), databuf_size);
+  databuf_char = reinterpret_cast<char *>(databuf_data);
+  paddle::PaddleBuf paddleBuf_0(databuf_char, databuf_size);
+  paddle::PaddleTensor tensor_in_0;
+  tensor_in_0.name = "im_shape";
+  tensor_in_0.dtype = paddle::PaddleDType::FLOAT32;
+  tensor_in_0.shape = {1, 2};
+  tensor_in_0.lod = in->at(0).lod;
+  tensor_in_0.data = paddleBuf_0;
+  real_in->push_back(tensor_in_0);
+
+  // image
+  in_num = 1 * 3 * im_shape_h * im_shape_w;
+  databuf_size = in_num * sizeof(float);
+
+  databuf_data = MempoolWrapper::instance().malloc(databuf_size);
+  if (!databuf_data) {
+    LOG(ERROR) << "Malloc failed, size: " << databuf_size;
+    return -1;
+  }
+
+  memcpy(databuf_data, input.data(), databuf_size);
+  databuf_char = reinterpret_cast<char *>(databuf_data);
+  paddle::PaddleBuf paddleBuf_1(databuf_char, databuf_size);
+  paddle::PaddleTensor tensor_in_1;
+  tensor_in_1.name = "image";
+  tensor_in_1.dtype = paddle::PaddleDType::FLOAT32;
+  tensor_in_1.shape = {1, 3, im_shape_h, im_shape_w};
+  tensor_in_1.lod = in->at(0).lod;
+  tensor_in_1.data = paddleBuf_1;
+  real_in->push_back(tensor_in_1);
+
+  // scale_factor
+  std::vector<float> scale_factor{scale_factor_h, scale_factor_w};
+  databuf_size = 2 * sizeof(float);
+
+  databuf_data = MempoolWrapper::instance().malloc(databuf_size);
+  if (!databuf_data) {
+    LOG(ERROR) << "Malloc failed, size: " << databuf_size;
+    return -1;
+  }
+
+  memcpy(databuf_data, scale_factor.data(), databuf_size);
+  databuf_char = reinterpret_cast<char *>(databuf_data);
+  paddle::PaddleBuf paddleBuf_2(databuf_char, databuf_size);
+  paddle::PaddleTensor tensor_in_2;
+  tensor_in_2.name = "scale_factor";
+  tensor_in_2.dtype = paddle::PaddleDType::FLOAT32;
+  tensor_in_2.shape = {1, 2};
+  tensor_in_2.lod = in->at(0).lod;
+  tensor_in_2.data = paddleBuf_2;
+  real_in->push_back(tensor_in_2);
+
+  if (InferManager::instance().infer(engine_name().c_str(), real_in, out,
+                                     batch_size)) {
+    LOG(ERROR) << "(logid=" << log_id
+               << ") Failed do infer in fluid model: " << engine_name().c_str();
+    return -1;
+  }
+
+  int64_t end = timeline.TimeStampUS();
+  CopyBlobInfo(input_blob, output_blob);
+  AddBlobInfo(output_blob, start);
+  AddBlobInfo(output_blob, end);
+  return 0;
+}
+
+void yolov3_darknet53_270e_coco::preprocess_det(const cv::Mat &img, float *data,
+                                                float &scale_factor_h,
+                                                float &scale_factor_w,
+                                                int im_shape_h, int im_shape_w,
+                                                const std::vector<float> &mean,
+                                                const std::vector<float> &scale,
+                                                const bool is_scale) {
+  // scale_factor
+  scale_factor_h =
+      static_cast<float>(im_shape_h) / static_cast<float>(img.rows);
+  scale_factor_w =
+      static_cast<float>(im_shape_w) / static_cast<float>(img.cols);
+
+  // Resize
+  cv::Mat resize_img;
+  cv::resize(img, resize_img, cv::Size(im_shape_w, im_shape_h), 0, 0,
+             cv::INTER_CUBIC);
+
+  // Normalize
+  double e = 1.0;
+  if (is_scale) {
+    e /= 255.0;
+  }
+  cv::Mat img_fp;
+  (resize_img).convertTo(img_fp, CV_32FC3, e);
+  for (int h = 0; h < im_shape_h; h++) {
+    for (int w = 0; w < im_shape_w; w++) {
+      img_fp.at<cv::Vec3f>(h, w)[0] =
+          (img_fp.at<cv::Vec3f>(h, w)[0] - mean[0]) / scale[0];
+      img_fp.at<cv::Vec3f>(h, w)[1] =
+          (img_fp.at<cv::Vec3f>(h, w)[1] - mean[1]) / scale[1];
+      img_fp.at<cv::Vec3f>(h, w)[2] =
+          (img_fp.at<cv::Vec3f>(h, w)[2] - mean[2]) / scale[2];
+    }
+  }
+
+  // Permute
+  int rh = img_fp.rows;
+  int rw = img_fp.cols;
+  int rc = img_fp.channels();
+  for (int i = 0; i < rc; ++i) {
+    cv::extractChannel(img_fp, cv::Mat(rh, rw, CV_32FC1, data + i * rh * rw),
+                       i);
+  }
+}
+
+cv::Mat yolov3_darknet53_270e_coco::Base2Mat(std::string &base64_data) {
+  cv::Mat img;
+  std::string s_mat;
+  s_mat = base64Decode(base64_data.data(), base64_data.size());
+  std::vector<char> base64_img(s_mat.begin(), s_mat.end());
+  img = cv::imdecode(base64_img, cv::IMREAD_COLOR); // CV_LOAD_IMAGE_COLOR
+  return img;
+}
+
+std::string yolov3_darknet53_270e_coco::base64Decode(const char *Data,
+                                                     int DataByte) {
+  const char DecodeTable[] = {
+      0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
+      0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
+      0,  0,  0,  0,  0,  0,  0,  0,  0,
+      62, // '+'
+      0,  0,  0,
+      63,                                     // '/'
+      52, 53, 54, 55, 56, 57, 58, 59, 60, 61, // '0'-'9'
+      0,  0,  0,  0,  0,  0,  0,  0,  1,  2,  3,  4,  5,  6,  7,  8,  9,
+      10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, // 'A'-'Z'
+      0,  0,  0,  0,  0,  0,  26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36,
+      37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, // 'a'-'z'
+  };
+
+  std::string strDecode;
+  int nValue;
+  int i = 0;
+  while (i < DataByte) {
+    if (*Data != '\r' && *Data != '\n') {
+      nValue = DecodeTable[*Data++] << 18;
+      nValue += DecodeTable[*Data++] << 12;
+      strDecode += (nValue & 0x00FF0000) >> 16;
+      if (*Data != '=') {
+        nValue += DecodeTable[*Data++] << 6;
+        strDecode += (nValue & 0x0000FF00) >> 8;
+        if (*Data != '=') {
+          nValue += DecodeTable[*Data++];
+          strDecode += nValue & 0x000000FF;
+        }
+      }
+      i += 4;
+    } else // CR/LF: skip
+    {
+      Data++;
+      i++;
+    }
+  }
+  return strDecode;
+}
+
+DEFINE_OP(yolov3_darknet53_270e_coco);
+
+} // namespace serving
+} // namespace paddle_serving
+} // namespace baidu
--- a/deploy/serving/cpp/preprocess/yolov3_darknet53_270e_coco.h
+++ b/deploy/serving/cpp/preprocess/yolov3_darknet53_270e_coco.h
+// Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+//     http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+#pragma once
+#include "core/general-server/general_model_service.pb.h"
+#include "core/general-server/op/general_infer_helper.h"
+#include "paddle_inference_api.h" // NOLINT
+#include <string>
+#include <vector>
+
+#include "opencv2/core.hpp"
+#include "opencv2/imgcodecs.hpp"
+#include "opencv2/imgproc.hpp"
+#include <chrono>
+#include <iomanip>
+#include <iostream>
+#include <ostream>
+#include <vector>
+
+#include <cstring>
+#include <fstream>
+#include <numeric>
+
+namespace baidu {
+namespace paddle_serving {
+namespace serving {
+
+class yolov3_darknet53_270e_coco
+    : public baidu::paddle_serving::predictor::OpWithChannel<GeneralBlob> {
+public:
+  typedef std::vector<paddle::PaddleTensor> TensorVector;
+
+  DECLARE_OP(yolov3_darknet53_270e_coco);
+
+  int inference();
+
+private:
+  // preprocess
+  std::vector<float> mean_ = {0.485f, 0.456f, 0.406f};
+  std::vector<float> scale_ = {0.229f, 0.224f, 0.225f};
+  bool is_scale_ = true;
+  int im_shape_h = 608;
+  int im_shape_w = 608;
+  float scale_factor_h = 1.0f;
+  float scale_factor_w = 1.0f;
+  void preprocess_det(const cv::Mat &img, float *data, float &scale_factor_h,
+                      float &scale_factor_w, int im_shape_h, int im_shape_w,
+                      const std::vector<float> &mean,
+                      const std::vector<float> &scale, const bool is_scale);
+
+  // read pics
+  cv::Mat Base2Mat(std::string &base64_data);
+  std::string base64Decode(const char *Data, int DataByte);
+};
+
+} // namespace serving
+} // namespace paddle_serving
+} // namespace baidu
--- a/deploy/serving/cpp/serving_client.py
+++ b/deploy/serving/cpp/serving_client.py
+# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import os
+import glob
+import base64
+import argparse
+from paddle_serving_client import Client
+from paddle_serving_client.proto import general_model_config_pb2 as m_config
+import google.protobuf.text_format
+
+parser = argparse.ArgumentParser(description="args for paddleserving")
+parser.add_argument(
+    "--serving_client", type=str, help="the directory of serving_client")
+parser.add_argument("--image_dir", type=str)
+parser.add_argument("--image_file", type=str)
+parser.add_argument("--http_port", type=int, default=9997)
+parser.add_argument(
+    "--threshold", type=float, default=0.5, help="Threshold of score.")
+args = parser.parse_args()
+
+
+def get_test_images(infer_dir, infer_img):
+    """
+    Get image path list in TEST mode
+    """
+    assert infer_img is not None or infer_dir is not None, \
+        "--image_file or --image_dir should be set"
+    assert infer_img is None or os.path.isfile(infer_img), \
+            "{} is not a file".format(infer_img)
+    assert infer_dir is None or os.path.isdir(infer_dir), \
+            "{} is not a directory".format(infer_dir)
+
+    # infer_img has a higher priority
+    if infer_img and os.path.isfile(infer_img):
+        return [infer_img]
+
+    images = set()
+    infer_dir = os.path.abspath(infer_dir)
+    assert os.path.isdir(infer_dir), \
+        "infer_dir {} is not a directory".format(infer_dir)
+    exts = ['jpg', 'jpeg', 'png', 'bmp']
+    exts += [ext.upper() for ext in exts]
+    for ext in exts:
+        images.update(glob.glob('{}/*.{}'.format(infer_dir, ext)))
+    images = list(images)
+
+    assert len(images) > 0, "no image found in {}".format(infer_dir)
+    print("Found {} inference images in total.".format(len(images)))
+
+    return images
+
+
+def postprocess(fetch_dict, fetch_vars, draw_threshold=0.5):
+    result = []
+    if "conv2d_441.tmp_1" in fetch_dict:
+        heatmap = fetch_dict["conv2d_441.tmp_1"]
+        print(heatmap)
+        result.append(heatmap)
+    else:
+        bboxes = fetch_dict[fetch_vars[0]]
+        for bbox in bboxes:
+            if bbox[0] > -1 and bbox[1] > draw_threshold:
+                print(f"{int(bbox[0])} {bbox[1]} "
+                      f"{bbox[2]} {bbox[3]} {bbox[4]} {bbox[5]}")
+                result.append(f"{int(bbox[0])} {bbox[1]} "
+                              f"{bbox[2]} {bbox[3]} {bbox[4]} {bbox[5]}")
+    return result
+
+
+def get_model_vars(client_config_dir):
+    # read original serving_client_conf.prototxt
+    client_config_file = os.path.join(client_config_dir,
+                                      "serving_client_conf.prototxt")
+    with open(client_config_file, 'r') as f:
+        model_var = google.protobuf.text_format.Merge(
+            str(f.read()), m_config.GeneralModelConfig())
+    # modify feed_var to run core/general-server/op/
+    del model_var.feed_var[:]
+    feed_var = m_config.FeedVar()
+    feed_var.name = "input"
+    feed_var.alias_name = "input"
+    feed_var.is_lod_tensor = False
+    feed_var.feed_type = 20
+    feed_var.shape.extend([1])
+    model_var.feed_var.extend([feed_var])
+    with open(
+            os.path.join(client_config_dir, "serving_client_conf_cpp.prototxt"),
+            "w") as f:
+        f.write(str(model_var))
+    # get feed_vars/fetch_vars
+    feed_vars = [var.name for var in model_var.feed_var]
+    fetch_vars = [var.name for var in model_var.fetch_var]
+    return feed_vars, fetch_vars
+
+
+if __name__ == '__main__':
+    url = f"127.0.0.1:{args.http_port}"
+    logid = 10000
+    img_list = get_test_images(args.image_dir, args.image_file)
+    feed_vars, fetch_vars = get_model_vars(args.serving_client)
+
+    client = Client()
+    client.load_client_config(
+        os.path.join(args.serving_client, "serving_client_conf_cpp.prototxt"))
+    client.connect([url])
+
+    for img_file in img_list:
+        with open(img_file, 'rb') as file:
+            image_data = file.read()
+        image = base64.b64encode(image_data).decode('utf8')
+        fetch_dict = client.predict(
+            feed={feed_vars[0]: image}, fetch=fetch_vars)
+        result = postprocess(fetch_dict, fetch_vars, args.threshold)
--- a/deploy/serving/cpp/serving_client_conf.prototxt
+++ b/deploy/serving/cpp/serving_client_conf.prototxt
+feed_var {
+  name: "input"
+  alias_name: "input"
+  is_lod_tensor: false
+  feed_type: 20
+  shape: 1
+}
+fetch_var {
+  name: "multiclass_nms3_0.tmp_0"
+  alias_name: "multiclass_nms3_0.tmp_0"
+  is_lod_tensor: true
+  fetch_type: 1
+  shape: -1
+}
+fetch_var {
+  name: "multiclass_nms3_0.tmp_2"
+  alias_name: "multiclass_nms3_0.tmp_2"
+  is_lod_tensor: false
+  fetch_type: 2
+}
\ No newline at end of file
--- a/deploy/serving/label_list.txt
+++ b/deploy/serving/label_list.txt
+person
+bicycle
+car
+motorcycle
+airplane
+bus
+train
+truck
+boat
+traffic light
+fire hydrant
+stop sign
+parking meter
+bench
+bird
+cat
+dog
+horse
+sheep
+cow
+elephant
+bear
+zebra
+giraffe
+backpack
+umbrella
+handbag
+tie
+suitcase
+frisbee
+skis
+snowboard
+sports ball
+kite
+baseball bat
+baseball glove
+skateboard
+surfboard
+tennis racket
+bottle
+wine glass
+cup
+fork
+knife
+spoon
+bowl
+banana
+apple
+sandwich
+orange
+broccoli
+carrot
+hot dog
+pizza
+donut
+cake
+chair
+couch
+potted plant
+bed
+dining table
+toilet
+tv
+laptop
+mouse
+remote
+keyboard
+cell phone
+microwave
+oven
+toaster
+sink
+refrigerator
+book
+clock
+vase
+scissors
+teddy bear
+hair drier
+toothbrush
\ No newline at end of file
--- a/deploy/serving/python/README.md
+++ b/deploy/serving/python/README.md
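+# Python Serving Deployment
+
+## 1. Introduction
+Paddle Serving is PaddlePaddle's open-source serving framework. It provides two frameworks, C++ Serving and Python Pipeline:
+C++ Serving targets maximum performance, while Python Pipeline favors ease of custom development.
+It aims to give deep learning developers and enterprises a high-performance, flexible, easy-to-use, industrial-grade online inference service, and to help bring AI applications into production.
+
+For more about Paddle Serving, see the [official Paddle Serving repo](https://github.com/PaddlePaddle/Serving).
+
+This document describes how to deploy a model as a service (using yolov3_darknet53_270e_coco as the example) with the Python Pipeline framework.
+
+## 2. Python Serving Deployment
+
+### 2.1 Python serving example layout
+The serving example programs are located in `deploy/serving/python`: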
+```shell
+deploy/
+├── serving/
+│   ├── python/                       # Python serving example directory
+│   │   ├──config.yml                 # Server-side model prediction configuration
+│   │   ├──pipeline_http_client.py    # Client code
+│   │   ├──postprocess_ops.py         # User-defined postprocessing code
+│   │   ├──preprocess_ops.py          # User-defined preprocessing code
+│   │   ├──README.md                  # Documentation
+│   │   ├──web_service.py             # Server code
+│   ├── cpp/                          # C++ serving example directory
+│   │   ├──preprocess/                # C++ custom ops
+│   │   ├──build_server.sh            # C++ Serving build script
+│   │   ├──serving_client.py          # Client code
+│   │   └── ...
+│   └── ...
+└── ...
+```
+
+### 2.2 Environment setup
+Install the latest versions of the four required packages:
+paddle-serving-server (CPU or GPU build),
+paddle-serving-client, paddle-serving-app, and paddlepaddle (CPU or GPU build).
+```commandline
+pip install paddle-serving-client
+# pip install paddle-serving-server # CPU
+pip install paddle-serving-server-gpu # GPU; defaults to CUDA 10.2 + TensorRT 6, specify the version explicitly for other environments
+pip install paddle-serving-app
+# pip install paddlepaddle # CPU
+pip install paddlepaddle-gpu
+```
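+You may need a mirror in mainland China (for example the Baidu mirror, by adding `-i https://mirror.baidu.com/pypi/simple` to the pip command) to speed up downloads.
+For Paddle Serving Server whl packages for other runtime environments, see the [download page](https://github.com/PaddlePaddle/Serving/blob/v0.7.0/doc/Latest_Packages_CN.md).
+For other PaddlePaddle versions, see the [official website](https://www.paddlepaddle.org.cn/install/quick?docurl=/documentation/docs/zh/install/pip/linux-pip.html).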
+
+### 2.3 Exporting the serving model
+For the export steps, see the [PaddleDetection model export tutorial](../../EXPORT_MODEL.md).
+Exporting a serving model requires the `--export_serving_model True` argument, for example:
+```commandline
+python tools/export_model.py -c configs/yolov3/yolov3_darknet53_270e_coco.yml \
+                             --export_serving_model True \
+                             -o weights=https://paddledet.bj.bcebos.com/models/yolov3_darknet53_270e_coco.pdparams
+```
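
As a quick sanity check after export, the sketch below looks for the two serving config files that the top-level serving README lists under the exported model directory. The helper name `missing_serving_artifacts` is hypothetical and not part of PaddleDetection; the checked paths are assumed from that directory listing.

```python
import os

# Files the export is expected to produce, relative to the model directory
# (assumed from the directory listing in the serving README).
EXPECTED_FILES = [
    os.path.join("serving_client", "serving_client_conf.prototxt"),
    os.path.join("serving_server", "serving_server_conf.prototxt"),
]


def missing_serving_artifacts(model_dir):
    """Return the expected files that are absent under model_dir."""
    return [
        rel_path for rel_path in EXPECTED_FILES
        if not os.path.isfile(os.path.join(model_dir, rel_path))
    ]


if __name__ == "__main__":
    missing = missing_serving_artifacts(
        "output_inference/yolov3_darknet53_270e_coco")
    print("missing files:", missing)
```

An empty list suggests the export included the serving artifacts; any listed path usually means `--export_serving_model True` was not passed.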
+
+### 2.4 Starting the server-side prediction service
+Once the environment is set up and the model is exported, start the prediction service with:
+```commandline
+python deploy/serving/python/web_service.py --model_dir output_inference/yolov3_darknet53_270e_coco &
+```
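+Server-side prediction settings can be changed in [config.yml](./config.yml).
+Developers only need to care about the following settings: http_port (the service's HTTP port), device_type (compute hardware type), and devices (compute hardware IDs).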
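
To illustrate how `device_type` and `devices` interact, here is a minimal sketch that follows the comments in `config.yml`: an explicit `device_type` wins, otherwise an empty `devices` string means CPU (0) and a non-empty one means GPU (1). The functions `resolve_device_type` and `parse_device_ids` are hypothetical helpers written for this example and are not the framework's own logic.

```python
def resolve_device_type(device_type, devices):
    """Return the effective device type code for a local_service_conf entry.

    Assumed rule, taken from the config.yml comments: an explicit
    device_type is used as-is; otherwise empty/missing devices means
    CPU (0) and a non-empty devices string means GPU (1).
    """
    if device_type is not None:
        return int(device_type)
    if devices is None or str(devices).strip() == "":
        return 0
    return 1


def parse_device_ids(devices):
    """Split a devices string such as "0,1,2" into a list of integer IDs."""
    if devices is None or str(devices).strip() == "":
        return []
    return [int(part) for part in str(devices).split(",") if part.strip()]


if __name__ == "__main__":
    # Same values as the shipped config.yml: device_type unset, devices "0".
    conf = {"device_type": None, "devices": "0"}
    print(resolve_device_type(conf["device_type"], conf["devices"]))
    print(parse_device_ids(conf["devices"]))
```

With the shipped values (`device_type` empty, `devices: "0"`), this rule selects GPU prediction on card 0.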
+
+### 2.5 Starting the client
+Once the prediction service is running, start the client to access it with:
+```commandline
+python deploy/serving/python/pipeline_http_client.py --image_file demo/000000014439.jpg
+```
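
For readers who want to call the service without the bundled client, the sketch below builds a request body by base64-encoding the image bytes. `pipeline_http_client.py` is not shown in this section, so the URL path `/ppdet/prediction` and the `key`/`value` JSON layout are assumptions based on common Paddle Serving Pipeline usage. The port 18093 and op name `ppdet` come from `config.yml`.

```python
import base64
import json


def build_request(image_bytes, key="image_0"):
    """Build a JSON body carrying the image as base64 text.

    The {"key": [...], "value": [...]} layout is an assumption; check
    pipeline_http_client.py for the format the service actually expects.
    """
    encoded = base64.b64encode(image_bytes).decode("utf8")
    return json.dumps({"key": [key], "value": [encoded]})


def prediction_url(http_port=18093, op_name="ppdet", host="127.0.0.1"):
    """Compose the assumed prediction endpoint from config.yml values."""
    return "http://{}:{}/{}/prediction".format(host, http_port, op_name)


if __name__ == "__main__":
    # Placeholder bytes; in practice read them from demo/000000014439.jpg.
    body = build_request(b"placeholder image bytes")
    print(prediction_url())
    print(sorted(json.loads(body).keys()))
```

The body can then be sent with any HTTP client as a POST request to the printed URL.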
--- a/deploy/serving/python/config.yml
+++ b/deploy/serving/python/config.yml
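+# worker_num: maximum concurrency. When build_dag_each_worker=True, the framework creates worker_num processes, each building its own gRPC server and DAG.
+# When build_dag_each_worker=False, the framework sets max_workers=worker_num for the main thread's gRPC thread pool.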
+worker_num: 20
+
+# HTTP port. rpc_port and http_port must not both be empty. When rpc_port is available and http_port is empty, no http_port is generated automatically.
+http_port: 18093
+rpc_port: 9993
+
+dag:
+    # op resource type: True for the thread model, False for the process model
+    is_thread_op: False
+op:
+    # op name; must match the name argument used to initialize TIPCExampleService in web_service
+    ppdet:
+        # concurrency: thread concurrency when is_thread_op=True, otherwise process concurrency
+        concurrency: 1
+
+            # when the op config has no server_endpoints, the local service config is read from local_service_conf
+        local_service_conf:
+
+            # model path
+            model_config: "./serving_server"
+
+            # compute hardware type: when omitted, determined by devices (CPU/GPU); 0=cpu, 1=gpu, 2=tensorRT, 3=arm cpu, 4=kunlun xpu
+            device_type:
+
+            # compute hardware IDs: CPU prediction when devices is "" or omitted; GPU prediction when devices is "0" or "0,1,2", listing the GPU cards to use
+            devices: "0" # "0,1"
+
+            # client type: brpc, grpc, or local_predictor. local_predictor does not start a Serving service and predicts in-process
+            client_type: local_predictor