Commit a9444597 authored by LDOUBLEV's avatar LDOUBLEV
Browse files

Merge branch 'dygraph' of https://github.com/PaddlePaddle/PaddleOCR into kie_doc

parents 29a096e2 0e325b5a
This diff is collapsed.
......@@ -8,6 +8,8 @@ PPOCRLabel is a semi-automatic graphic annotation tool suitable for OCR field, w
### Recent Update
- 2022.01:(by [PeterH0323](https://github.com/peterh0323)
- Improve user experience: prompt for the number of files and labels, optimize interaction, and fix bugs such as only use CPU when inference
- 2021.11.17:
- Support install and start PPOCRLabel through the whl package (by [d2623587501](https://github.com/d2623587501))
- Dataset segmentation: Divide the annotation file into training, verification and testing parts (refer to section 3.5 below, by [MrCuiHao](https://github.com/MrCuiHao))
......@@ -110,7 +112,7 @@ python PPOCRLabel.py
6. Click 're-Recognition', model will rewrite ALL recognition results in ALL detection box<sup>[3]</sup>.
7. Double click the result in 'recognition result' list to manually change inaccurate recognition results.
7. Single click the result in 'recognition result' list to manually change inaccurate recognition results.
8. **Click "Check", the image status will switch to "√",then the program automatically jump to the next.**
......@@ -143,15 +145,17 @@ python PPOCRLabel.py
### 3.1 Shortcut keys
| Shortcut keys | Description |
| ------------------------ | ------------------------------------------------ |
|--------------------------|--------------------------------------------------|
| Ctrl + Shift + R | Re-recognize all the labels of the current image |
| W | Create a rect box |
| Q | Create a four-points box |
| X | Rotate the box anti-clockwise |
| C | Rotate the box clockwise |
| Ctrl + E | Edit label of the selected box |
| Ctrl + R | Re-recognize the selected box |
| Ctrl + C | Copy and paste the selected box |
| Ctrl + Left Mouse Button | Multi select the label box |
| Backspace | Delete the selected box |
| Alt + X | Delete the selected box |
| Ctrl + V | Check image |
| Ctrl + Shift + d | Delete image |
| D | Next image |
......
......@@ -8,6 +8,8 @@ PPOCRLabel是一款适用于OCR领域的半自动化图形标注工具,内置P
#### 近期更新
- 2022.01:(by [PeterH0323](https://github.com/peterh0323)
- 提升用户体验:新增文件与标记数目提示、优化交互、修复gpu使用等问题
- 2021.11.17:
- 新增支持通过whl包安装和启动PPOCRLabel(by [d2623587501](https://github.com/d2623587501)
- 标注数据集切分:对标注数据进行训练、验证与测试集划分(参考下方3.5节,by [MrCuiHao](https://github.com/MrCuiHao)
......@@ -102,7 +104,7 @@ python PPOCRLabel.py --lang ch
4. 手动标注:点击 “矩形标注”(推荐直接在英文模式下点击键盘中的 “W”),用户可对当前图片中模型未检出的部分进行手动绘制标记框。点击键盘Q,则使用四点标注模式(或点击“编辑” - “四点标注”),用户依次点击4个点后,双击左键表示标注完成。
5. 标记框绘制完成后,用户点击 “确认”,检测框会先被预分配一个 “待识别” 标签。
6. 重新识别:将图片中的所有检测画绘制/调整完成后,点击 “重新识别”,PPOCR模型会对当前图片中的**所有检测框**重新识别<sup>[3]</sup>
7. 内容更改:击识别结果,对不准确的识别结果进行手动更改。
7. 内容更改:击识别结果,对不准确的识别结果进行手动更改。
8. **确认标记:点击 “确认”,图片状态切换为 “√”,跳转至下一张。**
9. 删除:点击 “删除图像”,图片将会被删除至回收站。
10. 导出结果:用户可以通过菜单中“文件-导出标记结果”手动导出,同时也可以点击“文件 - 自动导出标记结果”开启自动导出。手动确认过的标记将会被存放在所打开图片文件夹下的*Label.txt*中。在菜单栏点击 “文件” - "导出识别结果"后,会将此类图片的识别训练数据保存在*crop_img*文件夹下,识别标签保存在*rec_gt.txt*<sup>[4]</sup>
......@@ -131,23 +133,25 @@ python PPOCRLabel.py --lang ch
### 3.1 快捷键
| 快捷键 | 说明 |
| ---------------- | ---------------------------- |
| Ctrl + shift + R | 对当前图片的所有标记重新识别 |
| W | 新建矩形框 |
| Q | 新建四点框 |
| Ctrl + E | 编辑所选框标签 |
| Ctrl + R | 重新识别所选标记 |
| 快捷键 | 说明 |
|------------------|----------------|
| Ctrl + shift + R | 对当前图片的所有标记重新识别 |
| W | 新建矩形框 |
| Q | 新建四点框 |
| X | 框逆时针旋转 |
| C | 框顺时针旋转 |
| Ctrl + E | 编辑所选框标签 |
| Ctrl + R | 重新识别所选标记 |
| Ctrl + C | 复制并粘贴选中的标记框 |
| Ctrl + 鼠标左键 | 多选标记框 |
| Backspace | 删除所选框 |
| Ctrl + V | 确认本张图片标记 |
| Ctrl + Shift + d | 删除本张图片 |
| D | 下一张图片 |
| A | 上一张图片 |
| Ctrl++ | 缩小 |
| Ctrl-- | 放大 |
| ↑→↓← | 移动标记框 |
| Ctrl + 鼠标左键 | 多选标记框 |
| Alt + X | 删除所选框 |
| Ctrl + V | 确认本张图片标记 |
| Ctrl + Shift + d | 删除本张图片 |
| D | 下一张图片 |
| A | 上一张图片 |
| Ctrl++ | 缩小 |
| Ctrl-- | 放大 |
| ↑→↓← | 移动标记框 |
### 3.2 内置模型
......
# Copyright (c) <2015-Present> Tzutalin
# Copyright (C) 2013 MIT, Computer Science and Artificial Intelligence Laboratory. Bryan Russell, Antonio Torralba,
# William T. Freeman. Permission is hereby granted, free of charge, to any person obtaining a copy of this software and
# associated documentation files (the "Software"), to deal in the Software without restriction, including without
# limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the
# Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
# The above copyright notice and this permission notice shall be included in all copies or substantial portions of
# the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT
# NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT
# SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF
# CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
# THE SOFTWARE.
import sys
try:
from PyQt5.QtWidgets import QWidget, QHBoxLayout, QComboBox
except ImportError:
# needed for py3+qt4
# Ref:
# http://pyqt.sourceforge.net/Docs/PyQt4/incompatible_apis.html
# http://stackoverflow.com/questions/21217399/pyqt4-qtcore-qvariant-object-instead-of-a-string
if sys.version_info.major >= 3:
import sip
sip.setapi('QVariant', 2)
from PyQt4.QtGui import QWidget, QHBoxLayout, QComboBox
class ComboBox(QWidget):
def __init__(self, parent=None, items=[]):
super(ComboBox, self).__init__(parent)
layout = QHBoxLayout()
self.cb = QComboBox()
self.items = items
self.cb.addItems(self.items)
self.cb.currentIndexChanged.connect(parent.comboSelectionChanged)
layout.addWidget(self.cb)
self.setLayout(layout)
def update_items(self, items):
self.items = items
self.cb.clear()
self.cb.addItems(self.items)
......@@ -6,6 +6,8 @@ except ImportError:
from PyQt4.QtGui import *
from PyQt4.QtCore import *
import time
import datetime
import json
import cv2
import numpy as np
......@@ -80,8 +82,9 @@ class AutoDialog(QDialog):
self.parent = parent
self.ocr = ocr
self.mImgList = mImgList
self.lender = lenbar
self.pb = QProgressBar()
self.pb.setRange(0, lenbar)
self.pb.setRange(0, self.lender)
self.pb.setValue(0)
layout = QVBoxLayout()
......@@ -108,10 +111,16 @@ class AutoDialog(QDialog):
self.thread_1.progressBarValue.connect(self.handleProgressBarSingal)
self.thread_1.listValue.connect(self.handleListWidgetSingal)
self.thread_1.endsignal.connect(self.handleEndsignalSignal)
self.time_start = time.time() # save start time
def handleProgressBarSingal(self, i):
self.pb.setValue(i)
# calculate time left of auto labeling
avg_time = (time.time() - self.time_start) / i # Use average time to prevent time fluctuations
time_left = str(datetime.timedelta(seconds=avg_time * (self.lender - i))).split(".")[0] # Remove microseconds
self.setWindowTitle("PPOCRLabel -- " + f"Time Left: {time_left}") # show
def handleListWidgetSingal(self, i):
self.listWidget.addItem(i)
titem = self.listWidget.item(self.listWidget.count() - 1)
......
......@@ -11,19 +11,13 @@
# CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
# THE SOFTWARE.
try:
from PyQt5.QtGui import *
from PyQt5.QtCore import *
from PyQt5.QtWidgets import *
except ImportError:
from PyQt4.QtGui import *
from PyQt4.QtCore import *
#from PyQt4.QtOpenGL import *
import copy
from PyQt5.QtCore import Qt, pyqtSignal, QPointF, QPoint
from PyQt5.QtGui import QPainter, QBrush, QColor, QPixmap
from PyQt5.QtWidgets import QWidget, QMenu, QApplication
from libs.shape import Shape
from libs.utils import distance
import copy
CURSOR_DEFAULT = Qt.ArrowCursor
CURSOR_POINT = Qt.PointingHandCursor
......@@ -31,8 +25,6 @@ CURSOR_DRAW = Qt.CrossCursor
CURSOR_MOVE = Qt.ClosedHandCursor
CURSOR_GRAB = Qt.OpenHandCursor
# class Canvas(QGLWidget):
class Canvas(QWidget):
zoomRequest = pyqtSignal(int)
......@@ -129,7 +121,6 @@ class Canvas(QWidget):
def selectedVertex(self):
return self.hVertex is not None
def mouseMoveEvent(self, ev):
"""Update line with last point and current coordinates."""
pos = self.transformPos(ev.pos())
......@@ -333,7 +324,6 @@ class Canvas(QWidget):
self.movingShape = False
def endMove(self, copy=False):
assert self.selectedShapes and self.selectedShapesCopy
assert len(self.selectedShapesCopy) == len(self.selectedShapes)
......@@ -410,7 +400,6 @@ class Canvas(QWidget):
self.selectionChanged.emit(shapes)
self.update()
def selectShapePoint(self, point, multiple_selection_mode):
"""Select the first shape created which contains this point."""
if self.selectedVertex(): # A vertex is marked for selection.
......@@ -494,7 +483,6 @@ class Canvas(QWidget):
else:
shape.moveVertexBy(index, shiftPos)
def boundedMoveShape(self, shapes, pos):
if type(shapes).__name__ != 'list': shapes = [shapes]
if self.outOfPixmap(pos):
......@@ -515,6 +503,7 @@ class Canvas(QWidget):
if dp:
for shape in shapes:
shape.moveBy(dp)
shape.close()
self.prevPoint = pos
return True
return False
......@@ -728,6 +717,31 @@ class Canvas(QWidget):
self.moveOnePixel('Up')
elif key == Qt.Key_Down and self.selectedShapes:
self.moveOnePixel('Down')
elif key == Qt.Key_X and self.selectedShapes:
for i in range(len(self.selectedShapes)):
self.selectedShape = self.selectedShapes[i]
if self.rotateOutOfBound(0.01):
continue
self.selectedShape.rotate(0.01)
self.shapeMoved.emit()
self.update()
elif key == Qt.Key_C and self.selectedShapes:
for i in range(len(self.selectedShapes)):
self.selectedShape = self.selectedShapes[i]
if self.rotateOutOfBound(-0.01):
continue
self.selectedShape.rotate(-0.01)
self.shapeMoved.emit()
self.update()
def rotateOutOfBound(self, angle):
for shape in range(len(self.selectedShapes)):
self.selectedShape = self.selectedShapes[shape]
for i, p in enumerate(self.selectedShape.points):
if self.outOfPixmap(self.selectedShape.rotatePoint(p, angle)):
return True
return False
def moveOnePixel(self, direction):
# print(self.selectedShape.points)
......
import sys, time
from PyQt5 import QtWidgets
from PyQt5.QtGui import *
from PyQt5.QtCore import *
from PyQt5.QtWidgets import *
# !/usr/bin/env python
# -*- coding: utf-8 -*-
from PyQt5.QtCore import QModelIndex
from PyQt5.QtWidgets import QListWidget
class EditInList(QListWidget):
def __init__(self):
super(EditInList,self).__init__()
# click to edit
self.clicked.connect(self.item_clicked)
super(EditInList, self).__init__()
self.edited_item = None
def item_clicked(self, modelindex: QModelIndex):
try:
if self.edited_item is not None:
self.closePersistentEditor(self.edited_item)
except:
self.edited_item = self.currentItem()
def item_clicked(self, modelindex: QModelIndex) -> None:
self.edited_item = self.currentItem()
self.closePersistentEditor(self.edited_item)
item = self.item(modelindex.row())
# time.sleep(0.2)
self.edited_item = item
self.openPersistentEditor(item)
# time.sleep(0.2)
self.editItem(item)
self.edited_item = self.item(modelindex.row())
self.openPersistentEditor(self.edited_item)
self.editItem(self.edited_item)
def mouseDoubleClickEvent(self, event):
# close edit
for i in range(self.count()):
self.closePersistentEditor(self.item(i))
pass
def leaveEvent(self, event):
# close edit
for i in range(self.count()):
self.closePersistentEditor(self.item(i))
\ No newline at end of file
self.closePersistentEditor(self.item(i))
......@@ -10,19 +10,14 @@
# SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF
# CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
# THE SOFTWARE.
#!/usr/bin/python
# !/usr/bin/python
# -*- coding: utf-8 -*-
import math
import sys
try:
from PyQt5.QtGui import *
from PyQt5.QtCore import *
except ImportError:
from PyQt4.QtGui import *
from PyQt4.QtCore import *
from PyQt5.QtCore import QPointF
from PyQt5.QtGui import QColor, QPen, QPainterPath, QFont
from libs.utils import distance
import sys
DEFAULT_LINE_COLOR = QColor(0, 255, 0, 128)
DEFAULT_FILL_COLOR = QColor(255, 0, 0, 128)
......@@ -59,6 +54,8 @@ class Shape(object):
self.difficult = difficult
self.paintLabel = paintLabel
self.locked = False
self.direction = 0
self.center = None
self._highlightIndex = None
self._highlightMode = self.NEAR_VERTEX
self._highlightSettings = {
......@@ -74,7 +71,24 @@ class Shape(object):
# is used for drawing the pending line a different color.
self.line_color = line_color
def rotate(self, theta):
for i, p in enumerate(self.points):
self.points[i] = self.rotatePoint(p, theta)
self.direction -= theta
self.direction = self.direction % (2 * math.pi)
def rotatePoint(self, p, theta):
order = p - self.center
cosTheta = math.cos(theta)
sinTheta = math.sin(theta)
pResx = cosTheta * order.x() + sinTheta * order.y()
pResy = - sinTheta * order.x() + cosTheta * order.y()
pRes = QPointF(self.center.x() + pResx, self.center.y() + pResy)
return pRes
def close(self):
self.center = QPointF((self.points[0].x() + self.points[2].x()) / 2,
(self.points[0].y() + self.points[2].y()) / 2)
self._closed = True
def reachMaxPoints(self):
......@@ -83,7 +97,9 @@ class Shape(object):
return False
def addPoint(self, point):
if not self.reachMaxPoints(): # 4个点时发出close信号
if self.reachMaxPoints():
self.close()
else:
self.points.append(point)
def popPoint(self):
......@@ -112,7 +128,7 @@ class Shape(object):
# Uncommenting the following line will draw 2 paths
# for the 1st vertex, and make it non-filled, which
# may be desirable.
#self.drawVertex(vrtx_path, 0)
# self.drawVertex(vrtx_path, 0)
for i, p in enumerate(self.points):
line_path.lineTo(p)
......@@ -136,9 +152,9 @@ class Shape(object):
font.setPointSize(8)
font.setBold(True)
painter.setFont(font)
if(self.label == None):
if self.label is None:
self.label = ""
if(min_y < MIN_Y_LABEL):
if min_y < MIN_Y_LABEL:
min_y += MIN_Y_LABEL
painter.drawText(min_x, min_y, self.label)
......@@ -198,6 +214,8 @@ class Shape(object):
def copy(self):
shape = Shape("%s" % self.label)
shape.points = [p for p in self.points]
shape.center = self.center
shape.direction = self.direction
shape.fill = self.fill
shape.selected = self.selected
shape._closed = self._closed
......
......@@ -33,17 +33,17 @@ PaddleOCR aims to create multilingual, awesome, leading, and practical OCR tools
- [more](./doc/doc_en/update_en.md)
## Features
- PP-OCR series of high-quality pre-trained models, comparable to commercial effects
- PP-OCR - A series of high-quality pre-trained models, comparable to commercial products
- Ultra lightweight PP-OCRv2 series models: detection (3.1M) + direction classifier (1.4M) + recognition 8.5M) = 13.0M
- Ultra lightweight PP-OCR mobile series models: detection (3.0M) + direction classifier (1.4M) + recognition (5.0M) = 9.4M
- General PP-OCR server series models: detection (47.1M) + direction classifier (1.4M) + recognition (94.9M) = 143.4M
- Support Chinese, English, and digit recognition, vertical text recognition, and long text recognition
- Support multi-language recognition: about 80 languages like Korean, Japanese, German, French, etc
- Support multi-lingual recognition: about 80 languages like Korean, Japanese, German, French, etc
- PP-Structure: a document structurize system
- support layout analysis and table recognition (support export to Excel)
- support key information extraction
- support DocVQA
- Rich toolkits related to the OCR areas
- Support layout analysis and table recognition (support export to Excel)
- Support key information extraction
- Support DocVQA
- Rich OCR toolkit
- Semi-automatic data annotation tool, i.e., PPOCRLabel: support fast and efficient data annotation
- Data synthesis tool, i.e., Style-Text: easy to synthesize a large number of images which are similar to the target scene image
- Support user-defined training, provides rich predictive inference deployment solutions
......@@ -62,7 +62,7 @@ The above pictures are the visualizations of the general ppocr_server model. For
<a name="Community"></a>
## Community
- Scan the QR code below with your Wechat, you can access to official technical exchange group. Look forward to your participation.
- Scan the QR code below with your Wechat, you can join the official technical discussion group. Looking forward to your participation.
<div align="center">
<img src="https://raw.githubusercontent.com/PaddlePaddle/PaddleOCR/dygraph/doc/joinus.PNG" width = "200" height = "200" />
......@@ -120,8 +120,8 @@ For a new language request, please refer to [Guideline for new language_requests
- [PP-Structure: Information Extraction](./ppstructure/README.md)
- [Layout Parser](./ppstructure/layout/README.md)
- [Table Recognition](./ppstructure/table/README.md)
- [DocVQA](https://github.com/PaddlePaddle/PaddleOCR/tree/release/2.4/ppstructure/vqa)
- [Key Information Extraction](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.4/ppstructure/docs/kie.md)
- [DocVQA](./ppstructure/vqa/README.md)
- [Key Information Extraction](./ppstructure/docs/kie.md)
- Academic Circles
- [Two-stage Algorithm](./doc/doc_en/algorithm_overview_en.md)
- [PGNet Algorithm](./doc/doc_en/pgnet_en.md)
......
......@@ -99,8 +99,8 @@ PaddleOCR旨在打造一套丰富、领先、且实用的OCR工具库,助力
- [PP-Structure信息提取](./ppstructure/README_ch.md)
- [版面分析](./ppstructure/layout/README_ch.md)
- [表格识别](./ppstructure/table/README_ch.md)
- [DocVQA](https://github.com/PaddlePaddle/PaddleOCR/tree/release/2.4/ppstructure/vqa)
- [关键信息提取](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.4/ppstructure/docs/kie.md)
- [DocVQA](./ppstructure/vqa/README_ch.md)
- [关键信息提取](./ppstructure/docs/kie.md)
- OCR学术圈
- [两阶段模型介绍与下载](./doc/doc_ch/algorithm_overview.md)
- [端到端PGNet算法](./doc/doc_ch/pgnet.md)
......
......@@ -160,6 +160,7 @@ public class Predictor {
for (String content : contents) {
wordLabels.add(content);
}
wordLabels.add(" ");
Log.i(TAG, "Word label size: " + wordLabels.size());
} catch (Exception e) {
Log.e(TAG, e.getMessage());
......
# Server-side C++ Inference
This chapter introduces the C++ deployment method of the PaddleOCR model, and the corresponding python predictive deployment method refers to [document](../../doc/doc_ch/inference.md).
C++ is better than python in terms of performance calculation. Therefore, in most CPU and GPU deployment scenarios, C++ deployment is mostly used.
This section will introduce how to configure the C++ environment and complete it in the Linux\Windows (CPU\GPU) environment
PaddleOCR model deployment.
This chapter introduces the C++ deployment steps of the PaddleOCR model. The corresponding Python predictive deployment method refers to [document](../../doc/doc_ch/inference.md).
C++ is better than python in terms of performance. Therefore, in CPU and GPU deployment scenarios, C++ deployment is mostly used.
This section will introduce how to configure the C++ environment and deploy PaddleOCR in Linux (CPU\GPU) environment. For Windows deployment please refer to [Windows](./docs/windows_vs2019_build.md) compilation guidelines.
## 1. Prepare the Environment
......@@ -15,7 +14,7 @@ PaddleOCR model deployment.
### 1.1 Compile OpenCV
* First of all, you need to download the source code compiled package in the Linux environment from the opencv official website. Taking opencv3.4.7 as an example, the download command is as follows.
* First of all, you need to download the source code compiled package in the Linux environment from the OpenCV official website. Taking OpenCV 3.4.7 as an example, the download command is as follows.
```bash
cd deploy/cpp_infer
......@@ -23,9 +22,9 @@ wget https://paddleocr.bj.bcebos.com/libs/opencv/opencv-3.4.7.tar.gz
tar -xf opencv-3.4.7.tar.gz
```
Finally, you can see the folder of `opencv-3.4.7/` in the current directory.
Finally, you will see the folder of `opencv-3.4.7/` in the current directory.
* Compile opencv, the opencv source path (`root_path`) and installation path (`install_path`) should be set by yourself. Enter the opencv source code path and compile it in the following way.
* Compile OpenCV, the OpenCV source path (`root_path`) and installation path (`install_path`) should be set by yourself. Enter the OpenCV source code path and compile it in the following way.
```shell
......@@ -58,11 +57,11 @@ make -j
make install
```
Among them, `root_path` is the downloaded opencv source code path, and `install_path` is the installation path of opencv. After `make install` is completed, the opencv header file and library file will be generated in this folder for later OCR source code compilation.
In the above commands, `root_path` is the downloaded OpenCV source code path, and `install_path` is the installation path of OpenCV. After `make install` is completed, the OpenCV header file and library file will be generated in this folder for later OCR source code compilation.
The final file structure under the opencv installation path is as follows.
The final file structure under the OpenCV installation path is as follows.
```
opencv3/
......@@ -79,20 +78,20 @@ opencv3/
#### 1.2.1 Direct download and installation
[Paddle inference library official website](https://paddle-inference.readthedocs.io/en/latest/user_guides/download_lib.html). You can view and select the appropriate version of the inference library on the official website.
[Paddle inference library official website](https://paddle-inference.readthedocs.io/en/latest/user_guides/download_lib.html). You can review and select the appropriate version of the inference library on the official website.
* After downloading, use the following method to uncompress.
* After downloading, use the following command to extract files.
```
tar -xf paddle_inference.tgz
```
Finally you can see the following files in the folder of `paddle_inference/`.
Finally you will see the the folder of `paddle_inference/` in the current path.
#### 1.2.2 Compile from the source code
* If you want to get the latest Paddle inference library features, you can download the latest code from Paddle github repository and compile the inference library from the source code. It is recommended to download the inference library with paddle version greater than or equal to 2.0.1.
* You can refer to [Paddle inference library] (https://www.paddlepaddle.org.cn/documentation/docs/en/advanced_guide/inference_deployment/inference/build_and_install_lib_en.html) to get the Paddle source code from github, and then compile To generate the latest inference library. The method of using git to access the code is as follows.
#### 1.2.2 Compile the inference source code
* If you want to get the latest Paddle inference library features, you can download the latest code from Paddle GitHub repository and compile the inference library from the source code. It is recommended to download the inference library with paddle version greater than or equal to 2.0.1.
* You can refer to [Paddle inference library] (https://www.paddlepaddle.org.cn/documentation/docs/en/advanced_guide/inference_deployment/inference/build_and_install_lib_en.html) to get the Paddle source code from GitHub, and then compile To generate the latest inference library. The method of using git to access the code is as follows.
```shell
......@@ -100,7 +99,7 @@ git clone https://github.com/PaddlePaddle/Paddle.git
git checkout develop
```
* After entering the Paddle directory, the commands to compile the paddle inference library are as follows.
* Enter the Paddle directory and run the following commands to compile the paddle inference library.
```shell
rm -rf build
......@@ -133,14 +132,14 @@ build/paddle_inference_install_dir/
|-- version.txt
```
Among them, `paddle` is the Paddle library required for C++ prediction later, and `version.txt` contains the version information of the current inference library.
`paddle` is the Paddle library required for C++ prediction later, and `version.txt` contains the version information of the current inference library.
## 2. Compile and Run the Demo
### 2.1 Export the inference model
* You can refer to [Model inference](../../doc/doc_ch/inference.md)export the inference model. After the model is exported, assuming it is placed in the `inference` directory, the directory structure is as follows.
* You can refer to [Model inference](../../doc/doc_ch/inference.md) and export the inference model. After the model is exported, assuming it is placed in the `inference` directory, the directory structure is as follows.
```
inference/
......@@ -171,20 +170,28 @@ CUDA_LIB_DIR=your_cuda_lib_dir
CUDNN_LIB_DIR=your_cudnn_lib_dir
```
`OPENCV_DIR` is the opencv installation path; `LIB_DIR` is the download (`paddle_inference` folder)
`OPENCV_DIR` is the OpenCV installation path; `LIB_DIR` is the download (`paddle_inference` folder)
or the generated Paddle inference library path (`build/paddle_inference_install_dir` folder);
`CUDA_LIB_DIR` is the cuda library file path, in docker; it is `/usr/local/cuda/lib64`; `CUDNN_LIB_DIR` is the cudnn library file path, in docker it is `/usr/lib/x86_64-linux-gnu/`.
`CUDA_LIB_DIR` is the CUDA library file path, in docker; it is `/usr/local/cuda/lib64`; `CUDNN_LIB_DIR` is the cuDNN library file path, in docker it is `/usr/lib/x86_64-linux-gnu/`.
* After the compilation is completed, an executable file named `ppocr` will be generated in the `build` folder.
### Run the demo
Execute the built executable file:
Execute the built executable file:
```shell
./build/ppocr <mode> [--param1] [--param2] [...]
```
Here, `mode` is a required parameter,and the value range is ['det', 'rec', 'system'], representing using detection only, using recognition only and using the end-to-end system respectively. Specifically,
`mode` is a required parameter,and the valid values are
mode value | Model used
-----|------
det | Detection only
rec | Recognition only
system | End-to-end system
Specifically,
##### 1. run det demo:
```shell
......@@ -214,9 +221,9 @@ Here, `mode` is a required parameter,and the value range is ['det', 'rec', 'sy
--image_dir=../../doc/imgs/12.jpg
```
More parameters are as follows,
More parameters are as follows,
- common parameters
- Common parameters
|parameter|data type|default|meaning|
| --- | --- | --- | --- |
......@@ -226,7 +233,7 @@ More parameters are as follows,
|cpu_math_library_num_threads|int|10|Number of threads when using CPU inference. When machine cores is enough, the large the value, the faster the inference speed|
|use_mkldnn|bool|true|Whether to use mkdlnn library|
- detection related parameters
- Detection related parameters
|parameter|data type|default|meaning|
| --- | --- | --- | --- |
......@@ -238,7 +245,7 @@ More parameters are as follows,
|use_polygon_score|bool|false|Whether to use polygon box to calculate bbox score, false means to use rectangle box to calculate. Use rectangular box to calculate faster, and polygonal box more accurate for curved text area.|
|visualize|bool|true|Whether to visualize the results,when it is set as true, The prediction result will be save in the image file `./ocr_vis.png`.|
- classifier related parameters
- Classifier related parameters
|parameter|data type|default|meaning|
| --- | --- | --- | --- |
......@@ -246,7 +253,7 @@ More parameters are as follows,
|cls_model_dir|string|-|Address of direction classifier inference model|
|cls_thresh|float|0.9|Score threshold of the direction classifier|
- recogniton related parameters
- Recognition related parameters
|parameter|data type|default|meaning|
| --- | --- | --- | --- |
......@@ -265,4 +272,4 @@ The detection results will be shown on the screen, which is as follows.
### 2.3 Notes
* Paddle2.0.0 inference model library is recommended for this toturial.
* Paddle 2.0.0 inference model library is recommended for this tutorial.
English | [简体中文](README_cn.md)
## Introduction
Many users hope package the PaddleOCR service into a docker image, so that it can be quickly released and used in the docker or k8s environment.
Many users hope package the PaddleOCR service into a docker image, so that it can be quickly released and used in the docker or K8s environment.
This page provides some standardized code to achieve this goal. You can quickly publish the PaddleOCR project into a callable Restful API service through the following steps. (At present, the deployment based on the HubServing mode is implemented first, and author plans to increase the deployment of the PaddleServing mode in the futrue)
This page provides some standardized code to achieve this goal. You can quickly publish the PaddleOCR project into a callable Restful API service through the following steps. (At present, the deployment based on the HubServing mode is implemented first, and author plans to increase the deployment of the PaddleServing mode in the future)
## 1. Prerequisites
......@@ -14,7 +14,7 @@ c. NVIDIA Container Toolkit(GPU,Docker 19.03+ can skip this)
d. cuDNN 7.6+(GPU)
## 2. Build Image
a. Goto Dockerfile directory(ps:Need to distinguish between cpu and gpu version, the following takes cpu as an example, gpu version needs to replace the keyword)
a. Go to Dockerfile directory(PS: Need to distinguish between CPU and GPU version, the following takes CPU as an example, GPU version needs to replace the keyword)
```
cd deploy/docker/hubserving/cpu
```
......@@ -42,13 +42,13 @@ docker logs -f paddle_ocr
```
## 4. Test
a. Calculate the Base64 encoding of the picture to be recognized (if you just test, you can use a free online tool, like:https://freeonlinetools24.com/base64-image/
a. Calculate the Base64 encoding of the picture to be recognized (For test purpose, you can use a free online tool such as https://freeonlinetools24.com/base64-image/ )
b. Post a service request(sample request in sample_request.txt)
```
curl -H "Content-Type:application/json" -X POST --data "{\"images\": [\"Input image Base64 encode(need to delete the code 'data:image/jpg;base64,')\"]}" http://localhost:8868/predict/ocr_system
```
c. Get resposne(If the call is successful, the following result will be returned)
c. Get response(If the call is successful, the following result will be returned)
```
{"msg":"","results":[[{"confidence":0.8403433561325073,"text":"约定","text_region":[[345,377],[641,390],[634,540],[339,528]]},{"confidence":0.8131805658340454,"text":"最终相遇","text_region":[[356,532],[624,530],[624,596],[356,598]]}]],"status":"0"}
```
# Tutorial of PaddleOCR Mobile deployment
This tutorial will introduce how to use [Paddle Lite](https://github.com/PaddlePaddle/Paddle-Lite) to deploy paddleOCR ultra-lightweight Chinese and English detection models on mobile phones.
This tutorial will introduce how to use [Paddle Lite](https://github.com/PaddlePaddle/Paddle-Lite) to deploy PaddleOCR ultra-lightweight Chinese and English detection models on mobile phones.
paddle-lite is a lightweight inference engine for PaddlePaddle. It provides efficient inference capabilities for mobile phones and IoTs, and extensively integrates cross-platform hardware to provide lightweight deployment solutions for end-side deployment issues.
paddle-lite is a lightweight inference engine for PaddlePaddle. It provides efficient inference capabilities for mobile phones and IoT, and extensively integrates cross-platform hardware to provide lightweight deployment solutions for end-side deployment issues.
## 1. Preparation
......
......@@ -22,6 +22,7 @@ PaddleOCR提供2种服务部署方式:
- [环境准备](#环境准备)
- [模型转换](#模型转换)
- [Paddle Serving pipeline部署](#部署)
- [Windows用户](#Windows用户)
- [FAQ](#FAQ)
<a name="环境准备"></a>
......@@ -187,9 +188,10 @@ python3 -m paddle_serving_client.convert --dirname ./ch_PP-OCRv2_rec_infer/ \
2021-05-13 03:42:36,979 chl2(In: ['rec'], Out: ['@DAGExecutor']) size[0/0]
```
## WINDOWS用户
<a name="Windows用户"></a>
## Windows用户
Windows用户不能使用上述的启动方式,需要使用Web Service,详情参见[Windows平台使用Paddle Serving指导](https://github.com/PaddlePaddle/Serving/blob/develop/doc/WINDOWS_TUTORIAL_CN.md)
Windows用户不能使用上述的启动方式,需要使用Web Service,详情参见[Windows平台使用Paddle Serving指导](https://github.com/PaddlePaddle/Serving/blob/develop/doc/Windows_Tutorial_CN.md)
**WINDOWS只能使用0.5.0版本的CPU模式**
......
......@@ -28,14 +28,14 @@ python3 setup.py install
```
### 2. Download Pretrain Model
### 2. Download Pre-trained Model
Model prune needs to load pre-trained models.
PaddleOCR also provides a series of [models](../../../doc/doc_en/models_list_en.md). Developers can choose their own models or use their own models according to their needs.
### 3. Pruning sensitivity analysis
After the pre-training model is loaded, sensitivity analysis is performed on each network layer of the model to understand the redundancy of each network layer, and save a sensitivity file which named: sen.pickle. After that, user could load the sensitivity file via the [methods provided by PaddleSlim](https://github.com/PaddlePaddle/PaddleSlim/blob/develop/paddleslim/prune/sensitive.py#L221) and determining the pruning ratio of each network layer automatically. For specific details of sensitivity analysis, see:[Sensitivity analysis](https://github.com/PaddlePaddle/PaddleSlim/blob/develop/docs/zh_cn/tutorials/image_classification_sensitivity_analysis_tutorial.md)
After the pre-trained model is loaded, sensitivity analysis is performed on each network layer of the model to understand the redundancy of each network layer, and save a sensitivity file which named: sen.pickle. After that, user could load the sensitivity file via the [methods provided by PaddleSlim](https://github.com/PaddlePaddle/PaddleSlim/blob/develop/paddleslim/prune/sensitive.py#L221) and determining the pruning ratio of each network layer automatically. For specific details of sensitivity analysis, see:[Sensitivity analysis](https://github.com/PaddlePaddle/PaddleSlim/blob/develop/docs/zh_cn/tutorials/image_classification_sensitivity_analysis_tutorial.md)
The data format of sensitivity file:
sen.pickle(Dict){
'layer_weight_name_0': sens_of_each_ratio(Dict){'pruning_ratio_0': acc_loss, 'pruning_ratio_1': acc_loss}
......@@ -47,7 +47,7 @@ PaddleOCR also provides a series of [models](../../../doc/doc_en/models_list_en.
'conv10_expand_weights': {0.1: 0.006509952684312718, 0.2: 0.01827734339798862, 0.3: 0.014528405644659832, 0.6: 0.06536008804270439, 0.8: 0.11798612250664964, 0.7: 0.12391408417493704, 0.4: 0.030615754498018757, 0.5: 0.047105205602406594}
'conv10_linear_weights': {0.1: 0.05113190831455035, 0.2: 0.07705573833558801, 0.3: 0.12096721757739311, 0.6: 0.5135061352930738, 0.8: 0.7908166677143281, 0.7: 0.7272187676899062, 0.4: 0.1819252083008504, 0.5: 0.3728054727792405}
}
The function would return a dict after loading the sensitivity file. The keys of the dict are name of parameters in each layer. And the value of key is the information about pruning sensitivity of correspoding layer. In example, pruning 10% filter of the layer corresponding to conv10_expand_weights would lead to 0.65% degradation of model performance. The details could be seen at: [Sensitivity analysis](https://github.com/PaddlePaddle/PaddleSlim/blob/develop/docs/zh_cn/algo/algo.md#2-%E5%8D%B7%E7%A7%AF%E6%A0%B8%E5%89%AA%E8%A3%81%E5%8E%9F%E7%90%86)
The function would return a dict after loading the sensitivity file. The keys of the dict are name of parameters in each layer. And the value of key is the information about pruning sensitivity of corresponding layer. In example, pruning 10% filter of the layer corresponding to conv10_expand_weights would lead to 0.65% degradation of model performance. The details could be seen at: [Sensitivity analysis](https://github.com/PaddlePaddle/PaddleSlim/blob/develop/docs/zh_cn/algo/algo.md#2-%E5%8D%B7%E7%A7%AF%E6%A0%B8%E5%89%AA%E8%A3%81%E5%8E%9F%E7%90%86)
Enter the PaddleOCR root directory,perform sensitivity analysis on the model with the following command:
......
## Introduction
Generally, a more complex model would achive better performance in the task, but it also leads to some redundancy in the model.
Generally, a more complex model would achieve better performance in the task, but it also leads to some redundancy in the model.
Quantization is a technique that reduces this redundancy by reducing the full precision data to a fixed number,
so as to reduce model calculation complexity and improve model inference performance.
......@@ -31,14 +31,14 @@ python setup.py install
```
### 2. Download Pretrain Model
PaddleOCR provides a series of trained [models](../../../doc/doc_en/models_list_en.md).
### 2. Download Pre-trained Model
PaddleOCR provides a series of pre-trained [models](../../../doc/doc_en/models_list_en.md).
If the model to be quantified is not in the list, you need to follow the [Regular Training](../../../doc/doc_en/quickstart_en.md) method to get the trained model.
### 3. Quant-Aware Training
Quantization training includes offline quantization training and online quantization training.
Online quantization training is more effective. It is necessary to load the pre-training model.
Online quantization training is more effective. It is necessary to load the pre-trained model.
After the quantization strategy is defined, the model can be quantified.
The code for quantization training is located in `slim/quantization/quant.py`. For example, to train a detection model, the training instructions are as follows:
......@@ -54,7 +54,7 @@ python deploy/slim/quantization/quant.py -c configs/det/ch_ppocr_v2.0/ch_det_mv3
### 4. Export inference model
After getting the model after pruning and finetuning we, can export it as inference_model for predictive deployment:
Once we got the model after pruning and fine-tuning, we can export it as an inference model for the deployment of predictive tasks:
```bash
python deploy/slim/quantization/export_model.py -c configs/det/ch_ppocr_v2.0/ch_det_mv3_db_v2.0.yml -o Global.checkpoints=output/quant_model/best_accuracy Global.save_inference_dir=./output/quant_inference_model
......
......@@ -61,18 +61,18 @@ PaddleOCR基于动态图开源的文本识别算法列表:
|模型|骨干网络|Avg Accuracy|模型存储命名|下载链接|
|---|---|---|---|---|
|Rosetta|Resnet34_vd|80.9%|rec_r34_vd_none_none_ctc|[训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/rec_r34_vd_none_none_ctc_v2.0_train.tar)|
|Rosetta|MobileNetV3|78.05%|rec_mv3_none_none_ctc|[训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/rec_mv3_none_none_ctc_v2.0_train.tar)|
|CRNN|Resnet34_vd|82.76%|rec_r34_vd_none_bilstm_ctc|[训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/rec_r34_vd_none_bilstm_ctc_v2.0_train.tar)|
|CRNN|MobileNetV3|79.97%|rec_mv3_none_bilstm_ctc|[训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/rec_mv3_none_bilstm_ctc_v2.0_train.tar)|
|StarNet|Resnet34_vd|84.44%|rec_r34_vd_tps_bilstm_ctc|[训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/rec_r34_vd_tps_bilstm_ctc_v2.0_train.tar)|
|StarNet|MobileNetV3|81.42%|rec_mv3_tps_bilstm_ctc|[训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/rec_mv3_tps_bilstm_ctc_v2.0_train.tar)|
|RARE|MobileNetV3|82.5%|rec_mv3_tps_bilstm_att |[训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/rec_mv3_tps_bilstm_att_v2.0_train.tar)|
|RARE|Resnet34_vd|83.6%|rec_r34_vd_tps_bilstm_att |[训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/rec_r34_vd_tps_bilstm_att_v2.0_train.tar)|
|SRN|Resnet50_vd_fpn| 88.52% | rec_r50fpn_vd_none_srn | [训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/rec_r50_vd_srn_train.tar) |
|NRTR|NRTR_MTB| 84.3% | rec_mtb_nrtr | [训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/rec_mtb_nrtr_train.tar) |
|SAR|Resnet31| 87.2% | rec_r31_sar | [训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.1/rec/rec_r31_sar_train.tar) |
|SEED|Aster_Resnet| 85.2% | rec_resnet_stn_bilstm_att | [训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.1/rec/rec_resnet_stn_bilstm_att.tar) |
|Rosetta|Resnet34_vd|79.11%|rec_r34_vd_none_none_ctc|[训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/rec_r34_vd_none_none_ctc_v2.0_train.tar)|
|Rosetta|MobileNetV3|75.80%|rec_mv3_none_none_ctc|[训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/rec_mv3_none_none_ctc_v2.0_train.tar)|
|CRNN|Resnet34_vd|81.04%|rec_r34_vd_none_bilstm_ctc|[训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/rec_r34_vd_none_bilstm_ctc_v2.0_train.tar)|
|CRNN|MobileNetV3|77.95%|rec_mv3_none_bilstm_ctc|[训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/rec_mv3_none_bilstm_ctc_v2.0_train.tar)|
|StarNet|Resnet34_vd|82.85%|rec_r34_vd_tps_bilstm_ctc|[训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/rec_r34_vd_tps_bilstm_ctc_v2.0_train.tar)|
|StarNet|MobileNetV3|79.28%|rec_mv3_tps_bilstm_ctc|[训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/rec_mv3_tps_bilstm_ctc_v2.0_train.tar)|
|RARE|Resnet34_vd|83.98%|rec_r34_vd_tps_bilstm_att |[训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/rec_r34_vd_tps_bilstm_att_v2.0_train.tar)|
|RARE|MobileNetV3|81.76%|rec_mv3_tps_bilstm_att |[训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/rec_mv3_tps_bilstm_att_v2.0_train.tar)|
|SRN|Resnet50_vd_fpn| 86.31% | rec_r50fpn_vd_none_srn | [训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/rec_r50_vd_srn_train.tar) |
|NRTR|NRTR_MTB| 84.21% | rec_mtb_nrtr | [训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/rec_mtb_nrtr_train.tar) |
|SAR|Resnet31| 87.20% | rec_r31_sar | [训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.1/rec/rec_r31_sar_train.tar) |
|SEED|Aster_Resnet| 85.35% | rec_resnet_stn_bilstm_att | [训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.1/rec/rec_resnet_stn_bilstm_att.tar) |
<a name="2"></a>
......
......@@ -14,12 +14,12 @@ Demo测试的时候使用的是NDK 20b版本,20版本以上均可以支持编
1. Start a new Android Studio project
在项目模版中选择 Native C++ 选择PaddleOCR/depoly/android_demo 路径
在项目模版中选择 Native C++ 选择PaddleOCR/deploy/android_demo 路径
进入项目后会自动编译,第一次编译会花费较长的时间,建议添加代理加速下载。
**代理添加:**
选择 Android Studio -> Perferences -> Appearance & Behavior -> System Settings -> HTTP Proxy -> Manual proxy configuration
选择 Android Studio -> Preferences -> Appearance & Behavior -> System Settings -> HTTP Proxy -> Manual proxy configuration
![](../demo/proxy.png)
......
......@@ -16,7 +16,7 @@ PaddleOCR的Python代码遵循 [PEP8规范](https://www.python.org/dev/peps/pep-
- 空格
- 空格应该加在逗号、分号、冒号,而非他们的
- 空格应该加在逗号、分号、冒号,而非他们的
```python
# 正确:
......@@ -334,4 +334,4 @@ git push origin new_branch
2)如果评审意见比较多:
- 请给出总体的修改情况。
- 请采用`start a review`进行回复,而非直接回复的方式。原因是每个回复都会发送一封邮件,会造成邮件灾难。
\ No newline at end of file
- 请采用`start a review`进行回复,而非直接回复的方式。原因是每个回复都会发送一封邮件,会造成邮件灾难。
Markdown is supported
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment