Commit 89eb5e4b authored by wangsen

init commit

---
name: Issue template
about: Issue template for code error.
title: ''
labels: ''
assignees: ''
---
Please provide all of the following information so the problem can be located quickly:
- System Environment:
- Version: Paddle: PaddleOCR: Related components:
- Command Code:
- Complete Error Message:
# Byte-compiled / optimized / DLL files
__pycache__/
.ipynb_checkpoints/
*.py[cod]
*$py.class
# C extensions
*.so
inference/
inference_results/
output/
*.DS_Store
*.vs
*.user
*~
*.vscode
*.idea
*.log
.clang-format
.clang_format.hook
build/
dist/
paddleocr.egg-info/
/deploy/android_demo/app/OpenCV/
/deploy/android_demo/app/PaddleLite/
/deploy/android_demo/app/.cxx/
/deploy/android_demo/app/cache/
test_tipc/web/models/
test_tipc/web/node_modules/
- repo: https://github.com/PaddlePaddle/mirrors-yapf.git
  sha: 0d79c0c469bab64f7229c9aca2b1186ef47f0e37
  hooks:
    - id: yapf
      files: \.py$
- repo: https://github.com/pre-commit/pre-commit-hooks
  sha: a11d9314b22d8f8c7556443875b731ef05965464
  hooks:
    - id: check-merge-conflict
    - id: check-symlinks
    - id: detect-private-key
      files: (?!.*paddle)^.*$
    - id: end-of-file-fixer
      files: \.md$
    - id: trailing-whitespace
      files: \.md$
- repo: https://github.com/Lucas-C/pre-commit-hooks
  sha: v1.0.1
  hooks:
    - id: forbid-crlf
      files: \.md$
    - id: remove-crlf
      files: \.md$
    - id: forbid-tabs
      files: \.md$
    - id: remove-tabs
      files: \.md$
- repo: local
  hooks:
    - id: clang-format
      name: clang-format
      description: Format files with ClangFormat
      entry: bash .clang_format.hook -i
      language: system
      files: \.(c|cc|cxx|cpp|cu|h|hpp|hxx|cuh|proto)$
[style]
based_on_style = pep8
column_limit = 80
Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserved
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
1. Definitions.
"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.
"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.
"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
"control" means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.
"You" (or "Your") shall mean an individual or Legal Entity
exercising permissions granted by this License.
"Source" form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.
"Object" form shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.
"Work" shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(an example is provided in the Appendix below).
"Derivative Works" shall mean any work, whether in Source or Object
form, that is based on (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and Derivative Works thereof.
"Contribution" shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner. For the purposes of this definition, "submitted"
means any form of electronic, verbal, or written communication sent
to the Licensor or its representatives, including but not limited to
communication on electronic mailing lists, source code control systems,
and issue tracking systems that are managed by, or on behalf of, the
Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise
designated in writing by the copyright owner as "Not a Contribution."
"Contributor" shall mean Licensor and any individual or Legal Entity
on behalf of whom a Contribution has been received by Licensor and
subsequently incorporated within the Work.
2. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to reproduce, prepare Derivative Works of,
publicly display, publicly perform, sublicense, and distribute the
Work and such Derivative Works in Source or Object form.
3. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted. If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.
4. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:
(a) You must give any other recipients of the Work or
Derivative Works a copy of this License; and
(b) You must cause any modified files to carry prominent notices
stating that You changed the files; and
(c) You must retain, in the Source form of any Derivative Works
that You distribute, all copyright, patent, trademark, and
attribution notices from the Source form of the Work,
excluding those notices that do not pertain to any part of
the Derivative Works; and
(d) If the Work includes a "NOTICE" text file as part of its
distribution, then any Derivative Works that You distribute must
include a readable copy of the attribution notices contained
within such NOTICE file, excluding those notices that do not
pertain to any part of the Derivative Works, in at least one
of the following places: within a NOTICE text file distributed
as part of the Derivative Works; within the Source form or
documentation, if provided along with the Derivative Works; or,
within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and
do not modify the License. You may add Your own attribution
notices within Derivative Works that You distribute, alongside
or as an addendum to the NOTICE text from the Work, provided
that such additional attribution notices cannot be construed
as modifying the License.
You may add Your own copyright statement to Your modifications and
may provide additional or different license terms and conditions
for use, reproduction, or distribution of Your modifications, or
for any such Derivative Works as a whole, provided Your use,
reproduction, and distribution of the Work otherwise complies with
the conditions stated in this License.
5. Submission of Contributions. Unless You explicitly state otherwise,
any Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be under the terms and conditions of
this License, without any additional terms or conditions.
Notwithstanding the above, nothing herein shall supersede or modify
the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.
6. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.
7. Disclaimer of Warranty. Unless required by applicable law or
agreed to in writing, Licensor provides the Work (and each
Contributor provides its Contributions) on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied, including, without limitation, any warranties or conditions
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE. You are solely responsible for determining the
appropriateness of using or redistributing the Work and assume any
risks associated with Your exercise of permissions under this License.
8. Limitation of Liability. In no event and under no legal theory,
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.
9. Accepting Warranty or Additional Liability. While redistributing
the Work or Derivative Works thereof, You may choose to offer,
and charge a fee for, acceptance of support, warranty, indemnity,
or other liability obligations and/or rights consistent with this
License. However, in accepting such obligations, You may act only
on Your own behalf and on Your sole responsibility, not on behalf
of any other Contributor, and only if You agree to indemnify,
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.
END OF TERMS AND CONDITIONS
APPENDIX: How to apply the Apache License to your work.
To apply the Apache License to your work, attach the following
boilerplate notice, with the fields enclosed by brackets "[]"
replaced with your own identifying information. (Don't include
the brackets!) The text should be enclosed in the appropriate
comment syntax for the file format. We also recommend that a
file or class name and description of purpose be included on the
same "printed page" as the copyright notice for easier
identification within third-party archives.
Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserved.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
include LICENSE
include README.md
recursive-include ppocr/utils *.*
recursive-include ppocr/data *.py
recursive-include ppocr/postprocess *.py
recursive-include tools/infer *.py
recursive-include tools __init__.py
recursive-include ppocr/utils/e2e_utils *.py
recursive-include ppstructure *.py
# ex: set ts=8 noet:
all: qt5 test
test: testpy3
testpy2:
	python -m unittest discover tests
testpy3:
	python3 -m unittest discover tests
qt4: qt4py2
qt5: qt5py3
qt4py2:
	pyrcc4 -py2 -o libs/resources.py resources.qrc
qt4py3:
	pyrcc4 -py3 -o libs/resources.py resources.qrc
qt5py3:
	pyrcc5 -o libs/resources.py resources.qrc
clean:
	rm -rf ~/.labelImgSettings.pkl *.pyc dist labelImg.egg-info __pycache__ build
pip_upload:
	python3 setup.py upload
long_description:
	restview --long-description
.PHONY: all
# Copyright (c) <2015-Present> Tzutalin
# Copyright (C) 2013 MIT, Computer Science and Artificial Intelligence Laboratory. Bryan Russell, Antonio Torralba,
# William T. Freeman. Permission is hereby granted, free of charge, to any person obtaining a copy of this software and
# associated documentation files (the "Software"), to deal in the Software without restriction, including without
# limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the
# Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
# The above copyright notice and this permission notice shall be included in all copies or substantial portions of
# the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT
# NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT
# SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF
# CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
# THE SOFTWARE.
# !/usr/bin/env python
# -*- coding: utf-8 -*-
# pyrcc5 -o libs/resources.py resources.qrc
import argparse
import ast
import codecs
import json
import os.path
import platform
import subprocess
import sys
import xlrd
from functools import partial
from PyQt5.QtCore import QSize, Qt, QPoint, QByteArray, QTimer, QFileInfo, QPointF, QProcess
from PyQt5.QtGui import QImage, QCursor, QPixmap, QImageReader
from PyQt5.QtWidgets import QMainWindow, QListWidget, QVBoxLayout, QToolButton, QHBoxLayout, QDockWidget, QWidget, \
    QSlider, QGraphicsOpacityEffect, QMessageBox, QListView, QScrollArea, QWidgetAction, QApplication, QLabel, QGridLayout, \
    QFileDialog, QListWidgetItem, QComboBox, QDialog, QAction, QMenu
__dir__ = os.path.dirname(os.path.abspath(__file__))
sys.path.append(__dir__)
sys.path.append(os.path.abspath(os.path.join(__dir__, '../..')))
sys.path.append(os.path.abspath(os.path.join(__dir__, '../PaddleOCR')))
sys.path.append("..")
from paddleocr import PaddleOCR, PPStructure
from libs.constants import *
from libs.utils import *
from libs.labelColor import label_colormap
from libs.settings import Settings
from libs.shape import Shape, DEFAULT_LINE_COLOR, DEFAULT_FILL_COLOR, DEFAULT_LOCK_COLOR
from libs.stringBundle import StringBundle
from libs.canvas import Canvas
from libs.zoomWidget import ZoomWidget
from libs.autoDialog import AutoDialog
from libs.labelDialog import LabelDialog
from libs.colorDialog import ColorDialog
from libs.ustr import ustr
from libs.hashableQListWidgetItem import HashableQListWidgetItem
from libs.editinlist import EditInList
from libs.unique_label_qlist_widget import UniqueLabelQListWidget
from libs.keyDialog import KeyDialog
__appname__ = 'PPOCRLabel'
LABEL_COLORMAP = label_colormap()
class MainWindow(QMainWindow):
    FIT_WINDOW, FIT_WIDTH, MANUAL_ZOOM = list(range(3))

    def __init__(self,
                 lang="ch",
                 gpu=False,
                 kie_mode=False,
                 default_filename=None,
                 default_predefined_class_file=None,
                 default_save_dir=None):
        super(MainWindow, self).__init__()
        self.setWindowTitle(__appname__)
        self.setWindowState(Qt.WindowMaximized)  # start maximized
        self.activateWindow()  # bring PPOCRLabel to the front when activated
        # Load settings in the main thread
        self.settings = Settings()
        self.settings.load()
        settings = self.settings
        self.lang = lang
        # Load string bundle for i18n
        if lang not in ['ch', 'en']:
            lang = 'en'
        self.stringBundle = StringBundle.getBundle(localeStr='zh-CN' if lang == 'ch' else 'en')  # 'en'
        getStr = lambda strId: self.stringBundle.getString(strId)
        # KIE setting
        self.kie_mode = kie_mode
        self.key_previous_text = ""
        self.existed_key_cls_set = set()
        self.key_dialog_tip = getStr('keyDialogTip')
        self.defaultSaveDir = default_save_dir
        self.ocr = PaddleOCR(use_pdserving=False,
                             use_angle_cls=True,
                             det=True,
                             cls=True,
                             use_gpu=gpu,
                             lang=lang,
                             show_log=False)
        self.table_ocr = PPStructure(use_pdserving=False,
                                     use_gpu=gpu,
                                     lang=lang,
                                     layout=False,
                                     show_log=False)
        # Run both models once on a bundled sample image, if present
        # (presumably a warm-up); the results are discarded.
        if os.path.exists('./data/paddle.png'):
            result = self.ocr.ocr('./data/paddle.png', cls=True, det=True)
            result = self.table_ocr('./data/paddle.png', return_ocr_result_in_table=True)
        # For loading all images under a directory
        self.mImgList = []
        self.mImgList5 = []
        self.dirname = None
        self.labelHist = []
        self.lastOpenDir = None
        self.result_dic = []
        self.result_dic_locked = []
        self.changeFileFolder = False
        self.haveAutoReced = False
        self.labelFile = None
        self.currIndex = 0
        # Whether we need to save or not.
        self.dirty = False
        self._noSelectionSlot = False
        self._beginner = True
        self.screencastViewer = self.getAvailableScreencastViewer()
        self.screencast = "https://github.com/PaddlePaddle/PaddleOCR"
        # Load predefined classes to the list
        self.loadPredefinedClasses(default_predefined_class_file)
        # Main widgets and related state.
        self.labelDialog = LabelDialog(parent=self, listItem=self.labelHist)
        self.autoDialog = AutoDialog(parent=self)
        self.itemsToShapes = {}
        self.shapesToItems = {}
        self.itemsToShapesbox = {}
        self.shapesToItemsbox = {}
        self.prevLabelText = getStr('tempLabel')
        self.noLabelText = getStr('nullLabel')
        self.model = 'paddle'
        self.PPreader = None
        self.autoSaveNum = 5
        # ================== File List ==================
        filelistLayout = QVBoxLayout()
        filelistLayout.setContentsMargins(0, 0, 0, 0)
        self.fileListWidget = QListWidget()
        self.fileListWidget.itemClicked.connect(self.fileitemDoubleClicked)
        self.fileListWidget.setIconSize(QSize(25, 25))
        filelistLayout.addWidget(self.fileListWidget)
        fileListContainer = QWidget()
        fileListContainer.setLayout(filelistLayout)
        self.fileListName = getStr('fileList')
        self.fileDock = QDockWidget(self.fileListName, self)
        self.fileDock.setObjectName(getStr('files'))
        self.fileDock.setWidget(fileListContainer)
        self.addDockWidget(Qt.LeftDockWidgetArea, self.fileDock)
        # ================== Key List ==================
        if self.kie_mode:
            self.keyList = UniqueLabelQListWidget()
            # set key list height
            key_list_height = int(QApplication.desktop().height() // 4)
            if key_list_height < 50:
                key_list_height = 50
            self.keyList.setMaximumHeight(key_list_height)
            self.keyListDockName = getStr('keyListTitle')
            self.keyListDock = QDockWidget(self.keyListDockName, self)
            self.keyListDock.setWidget(self.keyList)
            self.keyListDock.setFeatures(QDockWidget.NoDockWidgetFeatures)
            filelistLayout.addWidget(self.keyListDock)
        self.AutoRecognition = QToolButton()
        self.AutoRecognition.setToolButtonStyle(Qt.ToolButtonTextBesideIcon)
        self.AutoRecognition.setIcon(newIcon('Auto'))
        autoRecLayout = QHBoxLayout()
        autoRecLayout.setContentsMargins(0, 0, 0, 0)
        autoRecLayout.addWidget(self.AutoRecognition)
        autoRecContainer = QWidget()
        autoRecContainer.setLayout(autoRecLayout)
        filelistLayout.addWidget(autoRecContainer)
        # ================== Right Area ==================
        listLayout = QVBoxLayout()
        listLayout.setContentsMargins(0, 0, 0, 0)
        # Buttons
        self.editButton = QToolButton()
        self.reRecogButton = QToolButton()
        self.reRecogButton.setIcon(newIcon('reRec', 30))
        self.reRecogButton.setToolButtonStyle(Qt.ToolButtonTextBesideIcon)
        self.tableRecButton = QToolButton()
        self.tableRecButton.setToolButtonStyle(Qt.ToolButtonTextBesideIcon)
        self.newButton = QToolButton()
        self.newButton.setToolButtonStyle(Qt.ToolButtonTextBesideIcon)
        self.createpolyButton = QToolButton()
        self.createpolyButton.setToolButtonStyle(Qt.ToolButtonTextBesideIcon)
        self.SaveButton = QToolButton()
        self.SaveButton.setToolButtonStyle(Qt.ToolButtonTextBesideIcon)
        self.DelButton = QToolButton()
        self.DelButton.setToolButtonStyle(Qt.ToolButtonTextBesideIcon)
        leftTopToolBox = QGridLayout()
        leftTopToolBox.addWidget(self.newButton, 0, 0, 1, 1)
        leftTopToolBox.addWidget(self.createpolyButton, 0, 1, 1, 1)
        leftTopToolBox.addWidget(self.reRecogButton, 1, 0, 1, 1)
        leftTopToolBox.addWidget(self.tableRecButton, 1, 1, 1, 1)
        leftTopToolBoxContainer = QWidget()
        leftTopToolBoxContainer.setLayout(leftTopToolBox)
        listLayout.addWidget(leftTopToolBoxContainer)
        # ================== Label List ==================
        # Create and add a widget for showing current label items
        self.labelList = EditInList()
        labelListContainer = QWidget()
        labelListContainer.setLayout(listLayout)
        self.labelList.itemSelectionChanged.connect(self.labelSelectionChanged)
        self.labelList.clicked.connect(self.labelList.item_clicked)
        # Connect to itemChanged to detect checkbox changes.
        self.labelList.itemChanged.connect(self.labelItemChanged)
        self.labelListDockName = getStr('recognitionResult')
        self.labelListDock = QDockWidget(self.labelListDockName, self)
        self.labelListDock.setWidget(self.labelList)
        self.labelListDock.setFeatures(QDockWidget.NoDockWidgetFeatures)
        listLayout.addWidget(self.labelListDock)
        # ================== Detection Box ==================
        self.BoxList = QListWidget()
        # self.BoxList.itemActivated.connect(self.boxSelectionChanged)
        self.BoxList.itemSelectionChanged.connect(self.boxSelectionChanged)
        self.BoxList.itemDoubleClicked.connect(self.editBox)
        # Connect to itemChanged to detect checkbox changes.
        self.BoxList.itemChanged.connect(self.boxItemChanged)
        self.BoxListDockName = getStr('detectionBoxposition')
        self.BoxListDock = QDockWidget(self.BoxListDockName, self)
        self.BoxListDock.setWidget(self.BoxList)
        self.BoxListDock.setFeatures(QDockWidget.NoDockWidgetFeatures)
        listLayout.addWidget(self.BoxListDock)
        # ================== Lower Right Area ==================
        leftbtmtoolbox = QHBoxLayout()
        leftbtmtoolbox.addWidget(self.SaveButton)
        leftbtmtoolbox.addWidget(self.DelButton)
        leftbtmtoolboxcontainer = QWidget()
        leftbtmtoolboxcontainer.setLayout(leftbtmtoolbox)
        listLayout.addWidget(leftbtmtoolboxcontainer)
        self.dock = QDockWidget(getStr('boxLabelText'), self)
        self.dock.setObjectName(getStr('labels'))
        self.dock.setWidget(labelListContainer)
        # ================== Zoom Bar ==================
        self.imageSlider = QSlider(Qt.Horizontal)
        self.imageSlider.valueChanged.connect(self.CanvasSizeChange)
        self.imageSlider.setMinimum(-9)
        self.imageSlider.setMaximum(510)
        self.imageSlider.setSingleStep(1)
        self.imageSlider.setTickPosition(QSlider.TicksBelow)
        self.imageSlider.setTickInterval(1)
        op = QGraphicsOpacityEffect()
        op.setOpacity(0.2)
        self.imageSlider.setGraphicsEffect(op)
        self.imageSlider.setStyleSheet("background-color:transparent")
        self.imageSliderDock = QDockWidget(getStr('ImageResize'), self)
        self.imageSliderDock.setObjectName(getStr('IR'))
        self.imageSliderDock.setWidget(self.imageSlider)
        self.imageSliderDock.setFeatures(QDockWidget.DockWidgetFloatable)
        self.imageSliderDock.setAttribute(Qt.WA_TranslucentBackground)
        self.addDockWidget(Qt.RightDockWidgetArea, self.imageSliderDock)
        self.zoomWidget = ZoomWidget()
        self.colorDialog = ColorDialog(parent=self)
        self.zoomWidgetValue = self.zoomWidget.value()
        self.msgBox = QMessageBox()
        # ================== Thumbnail ==================
        hlayout = QHBoxLayout()
        m = (0, 0, 0, 0)
        hlayout.setSpacing(0)
        hlayout.setContentsMargins(*m)
        self.preButton = QToolButton()
        self.preButton.setIcon(newIcon("prev", 40))
        self.preButton.setIconSize(QSize(40, 100))
        self.preButton.clicked.connect(self.openPrevImg)
        self.preButton.setStyleSheet('border: none;')
        self.preButton.setShortcut('a')
        self.iconlist = QListWidget()
        self.iconlist.setViewMode(QListView.IconMode)
        self.iconlist.setFlow(QListView.TopToBottom)
        self.iconlist.setSpacing(10)
        self.iconlist.setIconSize(QSize(50, 50))
        self.iconlist.setMovement(QListView.Static)
        self.iconlist.setResizeMode(QListView.Adjust)
        self.iconlist.itemClicked.connect(self.iconitemDoubleClicked)
        self.iconlist.setStyleSheet("QListWidget{ background-color:transparent; border: none;}")
        self.iconlist.setHorizontalScrollBarPolicy(Qt.ScrollBarAlwaysOff)
        self.nextButton = QToolButton()
        self.nextButton.setIcon(newIcon("next", 40))
        self.nextButton.setIconSize(QSize(40, 100))
        self.nextButton.setStyleSheet('border: none;')
        self.nextButton.clicked.connect(self.openNextImg)
        self.nextButton.setShortcut('d')
        hlayout.addWidget(self.preButton)
        hlayout.addWidget(self.iconlist)
        hlayout.addWidget(self.nextButton)
        iconListContainer = QWidget()
        iconListContainer.setLayout(hlayout)
        iconListContainer.setFixedHeight(100)
        # ================== Canvas ==================
        self.canvas = Canvas(parent=self)
        self.canvas.zoomRequest.connect(self.zoomRequest)
        self.canvas.setDrawingShapeToSquare(settings.get(SETTING_DRAW_SQUARE, False))
        scroll = QScrollArea()
        scroll.setWidget(self.canvas)
        scroll.setWidgetResizable(True)
        self.scrollBars = {
            Qt.Vertical: scroll.verticalScrollBar(),
            Qt.Horizontal: scroll.horizontalScrollBar()
        }
        self.scrollArea = scroll
        self.canvas.scrollRequest.connect(self.scrollRequest)
        self.canvas.newShape.connect(partial(self.newShape, False))
        self.canvas.shapeMoved.connect(self.updateBoxlist)  # self.setDirty
        self.canvas.selectionChanged.connect(self.shapeSelectionChanged)
        self.canvas.drawingPolygon.connect(self.toggleDrawingSensitive)
        centerLayout = QVBoxLayout()
        centerLayout.setContentsMargins(0, 0, 0, 0)
        centerLayout.addWidget(scroll)
        centerLayout.addWidget(iconListContainer, 0, Qt.AlignCenter)
        centerContainer = QWidget()
        centerContainer.setLayout(centerLayout)
        self.setCentralWidget(centerContainer)
        self.addDockWidget(Qt.RightDockWidgetArea, self.dock)
        self.dock.setFeatures(QDockWidget.DockWidgetClosable | QDockWidget.DockWidgetFloatable)
        self.fileDock.setFeatures(QDockWidget.NoDockWidgetFeatures)
        # ================== Actions ==================
        action = partial(newAction, self)
        quit = action(getStr('quit'), self.close,
                      'Ctrl+Q', 'quit', getStr('quitApp'))
        opendir = action(getStr('openDir'), self.openDirDialog,
                         'Ctrl+u', 'open', getStr('openDir'))
        open_dataset_dir = action(getStr('openDatasetDir'), self.openDatasetDirDialog,
                                  'Ctrl+p', 'open', getStr('openDatasetDir'), enabled=False)
        save = action(getStr('save'), self.saveFile,
                      'Ctrl+V', 'verify', getStr('saveDetail'), enabled=False)
        alcm = action(getStr('choosemodel'), self.autolcm,
                      'Ctrl+M', 'next', getStr('tipchoosemodel'))
        deleteImg = action(getStr('deleteImg'), self.deleteImg, 'Ctrl+Shift+D', 'close', getStr('deleteImgDetail'),
                           enabled=True)
        resetAll = action(getStr('resetAll'), self.resetAll, None, 'resetall', getStr('resetAllDetail'))
        color1 = action(getStr('boxLineColor'), self.chooseColor,
                        'Ctrl+L', 'color_line', getStr('boxLineColorDetail'))
        createMode = action(getStr('crtBox'), self.setCreateMode,
                            'w', 'new', getStr('crtBoxDetail'), enabled=False)
        editMode = action('&Edit\nRectBox', self.setEditMode,
                          'Ctrl+J', 'edit', u'Move and edit boxes', enabled=False)
        create = action(getStr('crtBox'), self.createShape,
                        'w', 'objects', getStr('crtBoxDetail'), enabled=False)
        delete = action(getStr('delBox'), self.deleteSelectedShape,
                        'backspace', 'delete', getStr('delBoxDetail'), enabled=False)
        copy = action(getStr('dupBox'), self.copySelectedShape,
                      'Ctrl+C', 'copy', getStr('dupBoxDetail'),
                      enabled=False)
        hideAll = action(getStr('hideBox'), partial(self.togglePolygons, False),
                         'Ctrl+H', 'hide', getStr('hideAllBoxDetail'),
                         enabled=False)
        showAll = action(getStr('showBox'), partial(self.togglePolygons, True),
                         'Ctrl+A', 'hide', getStr('showAllBoxDetail'),
                         enabled=False)
        help = action(getStr('tutorial'), self.showTutorialDialog, None, 'help', getStr('tutorialDetail'))
        showInfo = action(getStr('info'), self.showInfoDialog, None, 'help', getStr('info'))
        showSteps = action(getStr('steps'), self.showStepsDialog, None, 'help', getStr('steps'))
        showKeys = action(getStr('keys'), self.showKeysDialog, None, 'help', getStr('keys'))
        zoom = QWidgetAction(self)
        zoom.setDefaultWidget(self.zoomWidget)
        self.zoomWidget.setWhatsThis(
            u"Zoom in or out of the image. Also accessible with"
            " %s and %s from the canvas." % (fmtShortcut("Ctrl+[-+]"),
                                             fmtShortcut("Ctrl+Wheel")))
        self.zoomWidget.setEnabled(False)
        zoomIn = action(getStr('zoomin'), partial(self.addZoom, 10),
                        'Ctrl++', 'zoom-in', getStr('zoominDetail'), enabled=False)
        zoomOut = action(getStr('zoomout'), partial(self.addZoom, -10),
                         'Ctrl+-', 'zoom-out', getStr('zoomoutDetail'), enabled=False)
        zoomOrg = action(getStr('originalsize'), partial(self.setZoom, 100),
                         'Ctrl+=', 'zoom', getStr('originalsizeDetail'), enabled=False)
        fitWindow = action(getStr('fitWin'), self.setFitWindow,
                           'Ctrl+F', 'fit-window', getStr('fitWinDetail'),
                           checkable=True, enabled=False)
        fitWidth = action(getStr('fitWidth'), self.setFitWidth,
                          'Ctrl+Shift+F', 'fit-width', getStr('fitWidthDetail'),
                          checkable=True, enabled=False)
        # Group zoom controls into a list for easier toggling.
        zoomActions = (self.zoomWidget, zoomIn, zoomOut,
                       zoomOrg, fitWindow, fitWidth)
        self.zoomMode = self.MANUAL_ZOOM
        self.scalers = {
            self.FIT_WINDOW: self.scaleFitWindow,
            self.FIT_WIDTH: self.scaleFitWidth,
            # Set to one to scale to 100% when loading files.
            self.MANUAL_ZOOM: lambda: 1,
        }
        # ================== New Actions ==================
        edit = action(getStr('editLabel'), self.editLabel,
                      'Ctrl+E', 'edit', getStr('editLabelDetail'), enabled=False)
        AutoRec = action(getStr('autoRecognition'), self.autoRecognition,
                         '', 'Auto', getStr('autoRecognition'), enabled=False)
        reRec = action(getStr('reRecognition'), self.reRecognition,
                       'Ctrl+Shift+R', 'reRec', getStr('reRecognition'), enabled=False)
        singleRere = action(getStr('singleRe'), self.singleRerecognition,
                            'Ctrl+R', 'reRec', getStr('singleRe'), enabled=False)
        createpoly = action(getStr('creatPolygon'), self.createPolygon,
                            'q', 'new', getStr('creatPolygon'), enabled=False)
        tableRec = action(getStr('TableRecognition'), self.TableRecognition,
                          '', 'Auto', getStr('TableRecognition'), enabled=False)
        cellreRec = action(getStr('cellreRecognition'), self.cellreRecognition,
                           '', 'reRec', getStr('cellreRecognition'), enabled=False)
        saveRec = action(getStr('saveRec'), self.saveRecResult,
                         '', 'save', getStr('saveRec'), enabled=False)
        saveLabel = action(getStr('saveLabel'), self.saveLabelFile,
                           'Ctrl+S', 'save', getStr('saveLabel'), enabled=False)
        exportJSON = action(getStr('exportJSON'), self.exportJSON,
                            '', 'save', getStr('exportJSON'), enabled=False)
        undoLastPoint = action(getStr("undoLastPoint"), self.canvas.undoLastPoint,
                               'Ctrl+Z', "undo", getStr("undoLastPoint"), enabled=False)
        rotateLeft = action(getStr("rotateLeft"), partial(self.rotateImgAction, 1),
                            'Ctrl+Alt+L', "rotateLeft", getStr("rotateLeft"), enabled=False)
        rotateRight = action(getStr("rotateRight"), partial(self.rotateImgAction, -1),
                             'Ctrl+Alt+R', "rotateRight", getStr("rotateRight"), enabled=False)
        undo = action(getStr("undo"), self.undoShapeEdit,
                      'Ctrl+Z', "undo", getStr("undo"), enabled=False)
        change_cls = action(getStr("keyChange"), self.change_box_key,
                            'Ctrl+X', "edit", getStr("keyChange"), enabled=False)
        lock = action(getStr("lockBox"), self.lockSelectedShape,
                      None, "lock", getStr("lockBoxDetail"), enabled=False)
        self.editButton.setDefaultAction(edit)
        self.newButton.setDefaultAction(create)
        self.createpolyButton.setDefaultAction(createpoly)
        self.DelButton.setDefaultAction(deleteImg)
        self.SaveButton.setDefaultAction(save)
        self.AutoRecognition.setDefaultAction(AutoRec)
        self.reRecogButton.setDefaultAction(reRec)
        self.tableRecButton.setDefaultAction(tableRec)
        # self.preButton.setDefaultAction(openPrevImg)
        # self.nextButton.setDefaultAction(openNextImg)
# ================== Zoom layout ==================
zoomLayout = QHBoxLayout()
zoomLayout.addStretch()
self.zoominButton = QToolButton()
self.zoominButton.setToolButtonStyle(Qt.ToolButtonTextBesideIcon)
self.zoominButton.setDefaultAction(zoomIn)
self.zoomoutButton = QToolButton()
self.zoomoutButton.setToolButtonStyle(Qt.ToolButtonTextBesideIcon)
self.zoomoutButton.setDefaultAction(zoomOut)
self.zoomorgButton = QToolButton()
self.zoomorgButton.setToolButtonStyle(Qt.ToolButtonTextBesideIcon)
self.zoomorgButton.setDefaultAction(zoomOrg)
zoomLayout.addWidget(self.zoominButton)
zoomLayout.addWidget(self.zoomorgButton)
zoomLayout.addWidget(self.zoomoutButton)
zoomContainer = QWidget()
zoomContainer.setLayout(zoomLayout)
zoomContainer.setGeometry(0, 0, 30, 150)
shapeLineColor = action(getStr('shapeLineColor'), self.chshapeLineColor,
icon='color_line', tip=getStr('shapeLineColorDetail'),
enabled=False)
shapeFillColor = action(getStr('shapeFillColor'), self.chshapeFillColor,
icon='color', tip=getStr('shapeFillColorDetail'),
enabled=False)
# Label list context menu.
labelMenu = QMenu()
addActions(labelMenu, (edit, delete))
self.labelList.setContextMenuPolicy(Qt.CustomContextMenu)
self.labelList.customContextMenuRequested.connect(self.popLabelListMenu)
# Draw squares/rectangles
self.drawSquaresOption = QAction(getStr('drawSquares'), self)
self.drawSquaresOption.setCheckable(True)
self.drawSquaresOption.setChecked(settings.get(SETTING_DRAW_SQUARE, False))
self.drawSquaresOption.triggered.connect(self.toogleDrawSquare)
# Store actions for further handling.
self.actions = struct(save=save, resetAll=resetAll, deleteImg=deleteImg,
lineColor=color1, create=create, createpoly=createpoly, tableRec=tableRec, delete=delete, edit=edit, copy=copy,
saveRec=saveRec, singleRere=singleRere, AutoRec=AutoRec, reRec=reRec, cellreRec=cellreRec,
createMode=createMode, editMode=editMode,
shapeLineColor=shapeLineColor, shapeFillColor=shapeFillColor,
zoom=zoom, zoomIn=zoomIn, zoomOut=zoomOut, zoomOrg=zoomOrg,
fitWindow=fitWindow, fitWidth=fitWidth,
zoomActions=zoomActions, saveLabel=saveLabel, change_cls=change_cls,
undo=undo, undoLastPoint=undoLastPoint, open_dataset_dir=open_dataset_dir,
rotateLeft=rotateLeft, rotateRight=rotateRight, lock=lock, exportJSON=exportJSON,
fileMenuActions=(opendir, open_dataset_dir, saveLabel, exportJSON, resetAll, quit),
beginner=(), advanced=(),
editMenu=(createpoly, edit, copy, delete, singleRere, cellreRec, None, undo, undoLastPoint,
None, rotateLeft, rotateRight, None, color1, self.drawSquaresOption, lock,
None, change_cls),
beginnerContext=(
create, createpoly, edit, copy, delete, singleRere, cellreRec, rotateLeft, rotateRight, lock, change_cls),
advancedContext=(createMode, editMode, edit, copy,
delete, shapeLineColor, shapeFillColor),
onLoadActive=(create, createpoly, createMode, editMode),
onShapesPresent=(hideAll, showAll))
# menus
self.menus = struct(
file=self.menu('&' + getStr('mfile')),
edit=self.menu('&' + getStr('medit')),
view=self.menu('&' + getStr('mview')),
autolabel=self.menu('&PaddleOCR'),
help=self.menu('&' + getStr('mhelp')),
recentFiles=QMenu('Open &Recent'),
labelList=labelMenu)
self.lastLabel = None
# Add option to enable/disable labels being displayed at the top of bounding boxes
self.displayLabelOption = QAction(getStr('displayLabel'), self)
self.displayLabelOption.setShortcut("Ctrl+Shift+P")
self.displayLabelOption.setCheckable(True)
self.displayLabelOption.setChecked(settings.get(SETTING_PAINT_LABEL, False))
self.displayLabelOption.triggered.connect(self.togglePaintLabelsOption)
self.labelDialogOption = QAction(getStr('labelDialogOption'), self)
self.labelDialogOption.setShortcut("Ctrl+Shift+L")
self.labelDialogOption.setCheckable(True)
self.labelDialogOption.setChecked(settings.get(SETTING_PAINT_LABEL, False))
self.labelDialogOption.triggered.connect(self.speedChoose)
self.autoSaveOption = QAction(getStr('autoSaveMode'), self)
self.autoSaveOption.setCheckable(True)
self.autoSaveOption.setChecked(settings.get(SETTING_PAINT_LABEL, False))
self.autoSaveOption.triggered.connect(self.autoSaveFunc)
addActions(self.menus.file,
(opendir, open_dataset_dir, None, saveLabel, saveRec, exportJSON, self.autoSaveOption, None, resetAll, deleteImg,
quit))
addActions(self.menus.help, (showKeys, showSteps, showInfo))
addActions(self.menus.view, (
self.displayLabelOption, self.labelDialogOption,
None,
hideAll, showAll, None,
zoomIn, zoomOut, zoomOrg, None,
fitWindow, fitWidth))
addActions(self.menus.autolabel, (AutoRec, reRec, cellreRec, alcm, None, help))
self.menus.file.aboutToShow.connect(self.updateFileMenu)
# Custom context menu for the canvas widget:
addActions(self.canvas.menus[0], self.actions.beginnerContext)
self.statusBar().showMessage('%s started.' % __appname__)
self.statusBar().show()
# Application state.
self.image = QImage()
self.filePath = ustr(default_filename)
self.lastOpenDir = None
self.recentFiles = []
self.maxRecent = 7
self.lineColor = None
self.fillColor = None
self.zoom_level = 100
self.fit_window = False
# Add Chris
self.difficult = False
# Fix the compatibility issue between Qt4 and Qt5: convert the QStringList to a Python list.
if settings.get(SETTING_RECENT_FILES):
if have_qstring():
recentFileQStringList = settings.get(SETTING_RECENT_FILES)
self.recentFiles = [ustr(i) for i in recentFileQStringList]
else:
self.recentFiles = recentFileQStringList = settings.get(SETTING_RECENT_FILES)
size = settings.get(SETTING_WIN_SIZE, QSize(1200, 800))
position = QPoint(0, 0)
saved_position = settings.get(SETTING_WIN_POSE, position)
# Fix the multiple monitors issue
for i in range(QApplication.desktop().screenCount()):
if QApplication.desktop().availableGeometry(i).contains(saved_position):
position = saved_position
break
self.resize(size)
self.move(position)
saveDir = ustr(settings.get(SETTING_SAVE_DIR, None))
self.lastOpenDir = ustr(settings.get(SETTING_LAST_OPEN_DIR, None))
self.restoreState(settings.get(SETTING_WIN_STATE, QByteArray()))
Shape.line_color = self.lineColor = QColor(settings.get(SETTING_LINE_COLOR, DEFAULT_LINE_COLOR))
Shape.fill_color = self.fillColor = QColor(settings.get(SETTING_FILL_COLOR, DEFAULT_FILL_COLOR))
self.canvas.setDrawingColor(self.lineColor)
# Add chris
Shape.difficult = self.difficult
# ADD:
# Populate the File menu dynamically.
self.updateFileMenu()
# Loading the file may take some time, so defer it until the Qt event loop is running.
if self.filePath and os.path.isdir(self.filePath):
self.queueEvent(partial(self.importDirImages, self.filePath or ""))
elif self.filePath:
self.queueEvent(partial(self.loadFile, self.filePath or ""))
self.keyDialog = None
# Callbacks:
self.zoomWidget.valueChanged.connect(self.paintCanvas)
self.populateModeActions()
# Display cursor coordinates at the right of status bar
self.labelCoordinates = QLabel('')
self.statusBar().addPermanentWidget(self.labelCoordinates)
# Open the directory dialog if the default file path is a directory
if self.filePath and os.path.isdir(self.filePath):
self.openDirDialog(dirpath=self.filePath, silent=True)
def menu(self, title, actions=None):
menu = self.menuBar().addMenu(title)
if actions:
addActions(menu, actions)
return menu
def keyReleaseEvent(self, event):
if event.key() == Qt.Key_Control:
self.canvas.setDrawingShapeToSquare(False)
def keyPressEvent(self, event):
if event.key() == Qt.Key_Control:
# Draw rectangle if Ctrl is pressed
self.canvas.setDrawingShapeToSquare(True)
def noShapes(self):
return not self.itemsToShapes
def populateModeActions(self):
self.canvas.menus[0].clear()
addActions(self.canvas.menus[0], self.actions.beginnerContext)
self.menus.edit.clear()
actions = (self.actions.create,) # if self.beginner() else (self.actions.createMode, self.actions.editMode)
addActions(self.menus.edit, actions + self.actions.editMenu)
def setDirty(self):
self.dirty = True
self.actions.save.setEnabled(True)
def setClean(self):
self.dirty = False
self.actions.save.setEnabled(False)
self.actions.create.setEnabled(True)
self.actions.createpoly.setEnabled(True)
def toggleActions(self, value=True):
"""Enable/Disable widgets which depend on an opened image."""
for z in self.actions.zoomActions:
z.setEnabled(value)
for action in self.actions.onLoadActive:
action.setEnabled(value)
def queueEvent(self, function):
QTimer.singleShot(0, function)
def status(self, message, delay=5000):
self.statusBar().showMessage(message, delay)
def resetState(self):
self.itemsToShapes.clear()
self.shapesToItems.clear()
self.itemsToShapesbox.clear() # ADD
self.shapesToItemsbox.clear()
self.labelList.clear()
self.BoxList.clear()
self.filePath = None
self.imageData = None
self.labelFile = None
self.canvas.resetState()
self.labelCoordinates.clear()
# self.comboBox.cb.clear()
self.result_dic = []
def currentItem(self):
items = self.labelList.selectedItems()
if items:
return items[0]
return None
def currentBox(self):
items = self.BoxList.selectedItems()
if items:
return items[0]
return None
def addRecentFile(self, filePath):
if filePath in self.recentFiles:
self.recentFiles.remove(filePath)
elif len(self.recentFiles) >= self.maxRecent:
self.recentFiles.pop()
self.recentFiles.insert(0, filePath)
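# Illustrative sketch (not part of PPOCRLabel): the most-recently-used list that
# addRecentFile maintains, written as a standalone function so the ordering rules
# are easy to check. The name `add_recent` and the sample file names are made up
# for this example.

```python
def add_recent(recent, path, max_recent=7):
    """Move `path` to the front of `recent`, keeping at most `max_recent` entries."""
    if path in recent:
        # Re-opening a known file only changes its position.
        recent.remove(path)
    elif len(recent) >= max_recent:
        # The list is full: drop the oldest entry, which sits at the end.
        recent.pop()
    recent.insert(0, path)
    return recent


history = []
for name in ["a.png", "b.png", "c.png", "a.png"]:
    add_recent(history, name, max_recent=3)
print(history)
```

# Because a duplicate is removed before the length check, re-opening a file never
# evicts another entry.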
def beginner(self):
return self._beginner
def advanced(self):
return not self.beginner()
def getAvailableScreencastViewer(self):
osName = platform.system()
if osName == 'Windows':
return ['C:\\Program Files\\Internet Explorer\\iexplore.exe']
elif osName == 'Linux':
return ['xdg-open']
elif osName == 'Darwin':
return ['open']
## Callbacks ##
def showTutorialDialog(self):
subprocess.Popen(self.screencastViewer + [self.screencast])
def showInfoDialog(self):
from libs.__init__ import __version__
msg = u'Name:{0} \nApp Version:{1} \n{2} '.format(__appname__, __version__, sys.version_info)
QMessageBox.information(self, u'Information', msg)
def showStepsDialog(self):
msg = stepsInfo(self.lang)
QMessageBox.information(self, u'Information', msg)
def showKeysDialog(self):
msg = keysInfo(self.lang)
QMessageBox.information(self, u'Information', msg)
def createShape(self):
assert self.beginner()
self.canvas.setEditing(False)
self.actions.create.setEnabled(False)
self.actions.createpoly.setEnabled(False)
self.canvas.fourpoint = False
def createPolygon(self):
assert self.beginner()
self.canvas.setEditing(False)
self.canvas.fourpoint = True
self.actions.create.setEnabled(False)
self.actions.createpoly.setEnabled(False)
self.actions.undoLastPoint.setEnabled(True)
def rotateImg(self, filename, k, _value):
self.actions.rotateRight.setEnabled(_value)
pix = cv2.imread(filename)
pix = np.rot90(pix, k)
cv2.imwrite(filename, pix)
self.canvas.update()
self.loadFile(filename)
def rotateImgWarn(self):
if self.lang == 'ch':
self.msgBox.warning(self, "提示", "\n 该图片已经有标注框,旋转操作会打乱标注,建议清除标注框后旋转。")
else:
self.msgBox.warning(self, "Warning", "\n This image already has label boxes, "
"and rotating it will misalign them. "
"Clear the label boxes before rotating.")
def rotateImgAction(self, k=1, _value=False):
filename = self.mImgList[self.currIndex]
if os.path.exists(filename):
if self.itemsToShapesbox:
self.rotateImgWarn()
else:
self.saveFile()
self.dirty = False
self.rotateImg(filename=filename, k=k, _value=True)
else:
self.rotateImgWarn()
self.actions.rotateRight.setEnabled(False)
self.actions.rotateLeft.setEnabled(False)
def toggleDrawingSensitive(self, drawing=True):
"""In the middle of drawing, toggling between modes should be disabled."""
self.actions.editMode.setEnabled(not drawing)
if not drawing and self.beginner():
# Cancel creation.
print('Cancel creation.')
self.canvas.setEditing(True)
self.canvas.restoreCursor()
self.actions.create.setEnabled(True)
self.actions.createpoly.setEnabled(True)
def toggleDrawMode(self, edit=True):
self.canvas.setEditing(edit)
self.actions.createMode.setEnabled(edit)
self.actions.editMode.setEnabled(not edit)
def setCreateMode(self):
assert self.advanced()
self.toggleDrawMode(False)
def setEditMode(self):
assert self.advanced()
self.toggleDrawMode(True)
self.labelSelectionChanged()
def updateFileMenu(self):
currFilePath = self.filePath
def exists(filename):
return os.path.exists(filename)
menu = self.menus.recentFiles
menu.clear()
files = [f for f in self.recentFiles if f !=
currFilePath and exists(f)]
for i, f in enumerate(files):
icon = newIcon('labels')
action = QAction(
icon, '&%d %s' % (i + 1, QFileInfo(f).fileName()), self)
action.triggered.connect(partial(self.loadRecent, f))
menu.addAction(action)
def popLabelListMenu(self, point):
self.menus.labelList.exec_(self.labelList.mapToGlobal(point))
def editLabel(self):
if not self.canvas.editing():
return
item = self.currentItem()
if not item:
return
text = self.labelDialog.popUp(item.text())
if text is not None:
item.setText(text)
# item.setBackground(generateColorByText(text))
self.setDirty()
self.updateComboBox()
# =================== detection box related functions ===================
def boxItemChanged(self, item):
shape = self.itemsToShapesbox[item]
box = ast.literal_eval(item.text())
# print('shape in labelItemChanged is',shape.points)
if box != [(int(p.x()), int(p.y())) for p in shape.points]:
# shape.points = box
shape.points = [QPointF(p[0], p[1]) for p in box]
# QPointF(x,y)
# shape.line_color = generateColorByText(shape.label)
self.setDirty()
else: # User probably changed item visibility
self.canvas.setShapeVisible(shape, True) # item.checkState() == Qt.Checked
def editBox(self): # ADD
if not self.canvas.editing():
return
item = self.currentBox()
if not item:
return
text = self.labelDialog.popUp(item.text())
imageSize = str(self.image.size())
width, height = self.image.width(), self.image.height()
if text:
try:
text_list = ast.literal_eval(text)  # parse literals only; never execute user input
except Exception:
msg_box = QMessageBox(QMessageBox.Warning, 'Warning', 'Please enter the correct format')
msg_box.exec_()
return
if len(text_list) < 4:
msg_box = QMessageBox(QMessageBox.Warning, 'Warning', 'Please enter the coordinates of 4 points')
msg_box.exec_()
return
for box in text_list:
if box[0] > width or box[0] < 0 or box[1] > height or box[1] < 0:
msg_box = QMessageBox(QMessageBox.Warning, 'Warning', 'Out of picture size')
msg_box.exec_()
return
item.setText(text)
# item.setBackground(generateColorByText(text))
self.setDirty()
self.updateComboBox()
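# Illustrative sketch (not part of PPOCRLabel): the checks editBox applies to
# user-typed coordinates, collected into one standalone function. It assumes the
# text is a Python literal such as "[(0, 0), (10, 0), (10, 5), (0, 5)]". The name
# `parse_box` and the error messages are hypothetical.

```python
import ast


def parse_box(text, width, height):
    """Return a list of (x, y) points, or raise ValueError with a reason."""
    try:
        # literal_eval accepts only literals, so typed text is never executed.
        points = ast.literal_eval(text)
    except (ValueError, SyntaxError) as exc:
        raise ValueError("not a valid Python literal") from exc
    if not isinstance(points, (list, tuple)) or len(points) < 4:
        raise ValueError("need the coordinates of at least 4 points")
    result = []
    for point in points:
        is_pair = isinstance(point, (list, tuple)) and len(point) == 2
        if not is_pair or not all(isinstance(v, (int, float)) for v in point):
            raise ValueError("each point must be an (x, y) pair of numbers")
        x, y = point
        if not (0 <= x <= width and 0 <= y <= height):
            raise ValueError("point lies outside the image")
        result.append((x, y))
    return result


print(parse_box("[(0, 0), (10, 0), (10, 5), (0, 5)]", width=20, height=10))
```

# Raising one exception type lets a caller show a single warning dialog for every
# failure case.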
def updateBoxlist(self):
self.canvas.selectedShapes_hShape = []
if self.canvas.hShape is not None:
self.canvas.selectedShapes_hShape = self.canvas.selectedShapes + [self.canvas.hShape]
else:
self.canvas.selectedShapes_hShape = self.canvas.selectedShapes
for shape in self.canvas.selectedShapes_hShape:
item = self.shapesToItemsbox[shape] # listitem
text = [(int(p.x()), int(p.y())) for p in shape.points]
item.setText(str(text))
self.actions.undo.setEnabled(True)
self.setDirty()
def indexTo5Files(self, currIndex):
if currIndex < 2:
return self.mImgList[:5]
elif currIndex > len(self.mImgList) - 3:
return self.mImgList[-5:]
else:
return self.mImgList[currIndex - 2: currIndex + 3]
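# Illustrative sketch (not part of PPOCRLabel): the five-item window that
# indexTo5Files selects around the current image, as a standalone function.
# The name `window_of_five` is made up for this example.

```python
def window_of_five(items, index):
    """Return up to five items centred on `index`, clamped at both ends."""
    if index < 2:
        return items[:5]          # too close to the start: first five
    if index > len(items) - 3:
        return items[-5:]         # too close to the end: last five
    return items[index - 2: index + 3]


files = ["img%d.png" % i for i in range(10)]
for i in (0, 5, 9):
    print(i, window_of_five(files, i))
```

# Lists shorter than five items are returned whole, since slicing never raises
# for out-of-range bounds.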
# Tzutalin 20160906 : Add file list and dock to move faster
def fileitemDoubleClicked(self, item=None):
self.currIndex = self.mImgList.index(ustr(os.path.join(os.path.abspath(self.dirname), item.text())))
filename = self.mImgList[self.currIndex]
if filename:
self.mImgList5 = self.indexTo5Files(self.currIndex)
# self.additems5(None)
self.loadFile(filename)
def iconitemDoubleClicked(self, item=None):
self.currIndex = self.mImgList.index(ustr(os.path.join(item.toolTip())))
filename = self.mImgList[self.currIndex]
if filename:
self.mImgList5 = self.indexTo5Files(self.currIndex)
# self.additems5(None)
self.loadFile(filename)
def CanvasSizeChange(self):
if len(self.mImgList) > 0 and self.imageSlider.hasFocus():
self.zoomWidget.setValue(self.imageSlider.value())
def shapeSelectionChanged(self, selected_shapes):
self._noSelectionSlot = True
for shape in self.canvas.selectedShapes:
shape.selected = False
self.labelList.clearSelection()
self.canvas.selectedShapes = selected_shapes
for shape in self.canvas.selectedShapes:
shape.selected = True
self.shapesToItems[shape].setSelected(True)
self.shapesToItemsbox[shape].setSelected(True)
self.labelList.scrollToItem(self.currentItem()) # QAbstractItemView.EnsureVisible
self.BoxList.scrollToItem(self.currentBox())
if self.kie_mode:
if len(self.canvas.selectedShapes) == 1 and self.keyList.count() > 0:
selected_key_item_row = self.keyList.findItemsByLabel(self.canvas.selectedShapes[0].key_cls,
get_row=True)
if isinstance(selected_key_item_row, list) and len(selected_key_item_row) == 0:
key_text = self.canvas.selectedShapes[0].key_cls
item = self.keyList.createItemFromLabel(key_text)
self.keyList.addItem(item)
rgb = self._get_rgb_by_label(key_text, self.kie_mode)
self.keyList.setItemLabel(item, key_text, rgb)
selected_key_item_row = self.keyList.findItemsByLabel(self.canvas.selectedShapes[0].key_cls,
get_row=True)
self.keyList.setCurrentRow(selected_key_item_row)
self._noSelectionSlot = False
n_selected = len(selected_shapes)
self.actions.singleRere.setEnabled(n_selected)
self.actions.cellreRec.setEnabled(n_selected)
self.actions.delete.setEnabled(n_selected)
self.actions.copy.setEnabled(n_selected)
self.actions.edit.setEnabled(n_selected == 1)
self.actions.lock.setEnabled(n_selected)
self.actions.change_cls.setEnabled(n_selected)
def addLabel(self, shape):
shape.paintLabel = self.displayLabelOption.isChecked()
item = HashableQListWidgetItem(shape.label)
item.setFlags(item.flags() | Qt.ItemIsUserCheckable)
# Checked means the shape is not marked difficult.
item.setCheckState(Qt.Unchecked if shape.difficult else Qt.Checked)
# item.setBackground(generateColorByText(shape.label))
self.itemsToShapes[item] = shape
self.shapesToItems[shape] = item
self.labelList.addItem(item)
# print('item in add label is ',[(p.x(), p.y()) for p in shape.points], shape.label)
# ADD for box
item = HashableQListWidgetItem(str([(int(p.x()), int(p.y())) for p in shape.points]))
self.itemsToShapesbox[item] = shape
self.shapesToItemsbox[shape] = item
self.BoxList.addItem(item)
for action in self.actions.onShapesPresent:
action.setEnabled(True)
self.updateComboBox()
# Update the item counts shown in the dock titles
self.BoxListDock.setWindowTitle(self.BoxListDockName + f" ({self.BoxList.count()})")
self.labelListDock.setWindowTitle(self.labelListDockName + f" ({self.labelList.count()})")
def remLabels(self, shapes):
if shapes is None:
# print('rm empty label')
return
for shape in shapes:
item = self.shapesToItems[shape]
self.labelList.takeItem(self.labelList.row(item))
del self.shapesToItems[shape]
del self.itemsToShapes[item]
self.updateComboBox()
# ADD:
item = self.shapesToItemsbox[shape]
self.BoxList.takeItem(self.BoxList.row(item))
del self.shapesToItemsbox[shape]
del self.itemsToShapesbox[item]
self.updateComboBox()
def loadLabels(self, shapes):
s = []
for label, points, line_color, key_cls, difficult in shapes:
shape = Shape(label=label, line_color=line_color, key_cls=key_cls)
for x, y in points:
# Ensure the labels are within the bounds of the image. If not, fix them.
x, y, snapped = self.canvas.snapPointToCanvas(x, y)
if snapped:
self.setDirty()
shape.addPoint(QPointF(x, y))
shape.difficult = difficult
# shape.locked = False
shape.close()
s.append(shape)
self._update_shape_color(shape)
self.addLabel(shape)
self.updateComboBox()
self.canvas.loadShapes(s)
def singleLabel(self, shape):
if shape is None:
# print('rm empty label')
return
item = self.shapesToItems[shape]
item.setText(shape.label)
self.updateComboBox()
# ADD:
item = self.shapesToItemsbox[shape]
item.setText(str([(int(p.x()), int(p.y())) for p in shape.points]))
self.updateComboBox()
def updateComboBox(self):
# Get the unique labels and add them to the Combobox.
itemsTextList = [str(self.labelList.item(i).text()) for i in range(self.labelList.count())]
uniqueTextList = list(set(itemsTextList))
# Add a null row for showing all the labels
uniqueTextList.append("")
uniqueTextList.sort()
# self.comboBox.update_items(uniqueTextList)
def saveLabels(self, annotationFilePath, mode='Auto'):
# mode == 'Auto' means the labels are taken entirely from self.result_dic, the output of the OCR model
annotationFilePath = ustr(annotationFilePath)
def format_shape(s):
# print('s in saveLabels is ',s)
return dict(label=s.label, # str
line_color=s.line_color.getRgb(),
fill_color=s.fill_color.getRgb(),
points=[(int(p.x()), int(p.y())) for p in s.points], # QPointF converted to int pairs
difficult=s.difficult, # bool
key_cls=s.key_cls) # str
if mode == 'Auto':
shapes = []
else:
shapes = [format_shape(shape) for shape in self.canvas.shapes if shape.line_color != DEFAULT_LOCK_COLOR]
# Different annotation formats can be added here
for box in self.result_dic:
trans_dic = {"label": box[1][0], "points": box[0], "difficult": False}
if self.kie_mode:
if len(box) == 3:
trans_dic.update({"key_cls": box[2]})
else:
trans_dic.update({"key_cls": "None"})
if trans_dic["label"] == "" and mode == 'Auto':
continue
shapes.append(trans_dic)
try:
trans_dic = []
for box in shapes:
trans_dict = {"transcription": box['label'], "points": box['points'], "difficult": box['difficult']}
if self.kie_mode:
trans_dict.update({"key_cls": box['key_cls']})
trans_dic.append(trans_dict)
self.PPlabel[annotationFilePath] = trans_dic
if mode == 'Auto':
self.Cachelabel[annotationFilePath] = trans_dic
# else:
# self.labelFile.save(annotationFilePath, shapes, self.filePath, self.imageData,
# self.lineColor.getRgb(), self.fillColor.getRgb())
# print('Image:{0} -> Annotation:{1}'.format(self.filePath, annotationFilePath))
return True
except Exception:
self.errorMessage(u'Error saving label data', u'Error saving label data')
return False
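# Illustrative sketch (not part of PPOCRLabel): the last step of saveLabels, which
# turns internal shape dictionaries into annotation records with the keys
# "transcription", "points", "difficult" and, in KIE mode, "key_cls". The name
# `to_annotation` and the "None" fallback for a missing key class are assumptions
# made for this example.

```python
def to_annotation(shapes, kie_mode=False):
    """Convert shape dicts (label/points/difficult) into annotation records."""
    records = []
    for shape in shapes:
        record = {
            "transcription": shape["label"],
            "points": shape["points"],
            "difficult": shape["difficult"],
        }
        if kie_mode:
            # Assumed fallback: shapes without a key class are stored as "None".
            record["key_cls"] = shape.get("key_cls", "None")
        records.append(record)
    return records


sample = [{"label": "Total", "points": [[0, 0], [9, 0], [9, 4], [0, 4]], "difficult": False}]
print(to_annotation(sample))
print(to_annotation(sample, kie_mode=True))
```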
def copySelectedShape(self):
for shape in self.canvas.copySelectedShape():
self.addLabel(shape)
# fix copy and delete
# self.shapeSelectionChanged(True)
def labelSelectionChanged(self):
if self._noSelectionSlot:
return
if self.canvas.editing():
selected_shapes = []
for item in self.labelList.selectedItems():
selected_shapes.append(self.itemsToShapes[item])
if selected_shapes:
self.canvas.selectShapes(selected_shapes)
else:
self.canvas.deSelectShape()
def boxSelectionChanged(self):
if self._noSelectionSlot:
# self.BoxList.scrollToItem(self.currentBox(), QAbstractItemView.PositionAtCenter)
return
if self.canvas.editing():
selected_shapes = []
for item in self.BoxList.selectedItems():
selected_shapes.append(self.itemsToShapesbox[item])
if selected_shapes:
self.canvas.selectShapes(selected_shapes)
else:
self.canvas.deSelectShape()
def labelItemChanged(self, item):
shape = self.itemsToShapes[item]
label = item.text()
if label != shape.label:
shape.label = item.text()
# shape.line_color = generateColorByText(shape.label)
self.setDirty()
elif (item.checkState() == Qt.Unchecked) != bool(shape.difficult): # check state no longer matches the difficult flag
shape.difficult = item.checkState() == Qt.Unchecked
self.setDirty()
else: # User probably changed item visibility
self.canvas.setShapeVisible(shape, True) # item.checkState() == Qt.Checked
# self.actions.save.setEnabled(True)
# Callback functions:
def newShape(self, value=True):
"""Pop-up and give focus to the label editor.
position MUST be in global coordinates.
"""
if len(self.labelHist) > 0:
self.labelDialog = LabelDialog(parent=self, listItem=self.labelHist)
if value:
text = self.labelDialog.popUp(text=self.prevLabelText)
self.lastLabel = text
else:
text = self.prevLabelText
if text is not None:
self.prevLabelText = self.stringBundle.getString('tempLabel')
shape = self.canvas.setLastLabel(text, None, None, None) # generate_color, generate_color
if self.kie_mode:
key_text, _ = self.keyDialog.popUp(self.key_previous_text)
if key_text is not None:
shape = self.canvas.setLastLabel(text, None, None, key_text) # generate_color, generate_color
self.key_previous_text = key_text
if not self.keyList.findItemsByLabel(key_text):
item = self.keyList.createItemFromLabel(key_text)
self.keyList.addItem(item)
rgb = self._get_rgb_by_label(key_text, self.kie_mode)
self.keyList.setItemLabel(item, key_text, rgb)
self._update_shape_color(shape)
self.keyDialog.addLabelHistory(key_text)
self.addLabel(shape)
if self.beginner(): # Switch to edit mode.
self.canvas.setEditing(True)
self.actions.create.setEnabled(True)
self.actions.createpoly.setEnabled(True)
self.actions.undoLastPoint.setEnabled(False)
self.actions.undo.setEnabled(True)
else:
self.actions.editMode.setEnabled(True)
self.setDirty()
else:
# self.canvas.undoLastLine()
self.canvas.resetAllLines()
def _update_shape_color(self, shape):
r, g, b = self._get_rgb_by_label(shape.key_cls, self.kie_mode)
shape.line_color = QColor(r, g, b)
shape.vertex_fill_color = QColor(r, g, b)
shape.hvertex_fill_color = QColor(255, 255, 255)
shape.fill_color = QColor(r, g, b, 128)
shape.select_line_color = QColor(255, 255, 255)
shape.select_fill_color = QColor(r, g, b, 155)
def _get_rgb_by_label(self, label, kie_mode):
shift_auto_shape_color = 2 # use for random color
if kie_mode and label != "None":
item = self.keyList.findItemsByLabel(label)[0]
label_id = self.keyList.indexFromItem(item).row() + 1
label_id += shift_auto_shape_color
return LABEL_COLORMAP[label_id % len(LABEL_COLORMAP)]
else:
return (0, 255, 0)
def scrollRequest(self, delta, orientation):
units = - delta / (8 * 15)
bar = self.scrollBars[orientation]
bar.setValue(int(bar.value() + bar.singleStep() * units)) # setValue expects an int
def setZoom(self, value):
self.actions.fitWidth.setChecked(False)
self.actions.fitWindow.setChecked(False)
self.zoomMode = self.MANUAL_ZOOM
self.zoomWidget.setValue(value)
def addZoom(self, increment=10):
self.setZoom(int(self.zoomWidget.value() + increment)) # increment may be a float
self.imageSlider.setValue(int(self.zoomWidget.value() + increment)) # set zoom slider value
def zoomRequest(self, delta):
# get the current scrollbar positions
# calculate the percentages ~ coordinates
h_bar = self.scrollBars[Qt.Horizontal]
v_bar = self.scrollBars[Qt.Vertical]
# get the current maximum, to know the difference after zooming
h_bar_max = h_bar.maximum()
v_bar_max = v_bar.maximum()
# get the cursor position and canvas size
# calculate the desired movement from 0 to 1
# where 0 = move left
# 1 = move right
# up and down analogous
cursor = QCursor()
pos = cursor.pos()
relative_pos = QWidget.mapFromGlobal(self, pos)
cursor_x = relative_pos.x()
cursor_y = relative_pos.y()
w = self.scrollArea.width()
h = self.scrollArea.height()
# the scaling from 0 to 1 has some padding
# you don't have to hit the very leftmost pixel for a maximum-left movement
margin = 0.1
move_x = (cursor_x - margin * w) / (w - 2 * margin * w)
move_y = (cursor_y - margin * h) / (h - 2 * margin * h)
# clamp the values from 0 to 1
move_x = min(max(move_x, 0), 1)
move_y = min(max(move_y, 0), 1)
# zoom in
units = delta / (8 * 15)
scale = 10
self.addZoom(scale * units)
# get the difference in scrollbar values
# this is how far we can move
d_h_bar_max = h_bar.maximum() - h_bar_max
d_v_bar_max = v_bar.maximum() - v_bar_max
# get the new scrollbar values
new_h_bar_value = h_bar.value() + move_x * d_h_bar_max
new_v_bar_value = v_bar.value() + move_y * d_v_bar_max
h_bar.setValue(int(new_h_bar_value)) # setValue expects an int
v_bar.setValue(int(new_v_bar_value))
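# Illustrative sketch (not part of PPOCRLabel): the arithmetic zoomRequest uses to
# keep the view anchored near the cursor, with the Qt objects replaced by plain
# numbers. The names `cursor_fraction` and `new_scroll_value` are made up for this
# example.

```python
def cursor_fraction(cursor, extent, margin=0.1):
    """Map a cursor coordinate to 0..1, with a padded band at each edge."""
    raw = (cursor - margin * extent) / (extent - 2 * margin * extent)
    return min(max(raw, 0.0), 1.0)  # clamp to the 0..1 range


def new_scroll_value(old_value, fraction, old_max, new_max):
    """Shift the scrollbar by the cursor's share of the range gained by zooming."""
    return int(old_value + fraction * (new_max - old_max))


move_x = cursor_fraction(cursor=50, extent=100)
print(move_x, new_scroll_value(40, move_x, old_max=100, new_max=200))
```

# A cursor at the far left leaves the scrollbar where it was, and a cursor at the
# far right moves it by the full increase in range.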
def setFitWindow(self, value=True):
if value:
self.actions.fitWidth.setChecked(False)
self.zoomMode = self.FIT_WINDOW if value else self.MANUAL_ZOOM
self.adjustScale()
def setFitWidth(self, value=True):
if value:
self.actions.fitWindow.setChecked(False)
self.zoomMode = self.FIT_WIDTH if value else self.MANUAL_ZOOM
self.adjustScale()
def togglePolygons(self, value):
for item, shape in self.itemsToShapes.items():
self.canvas.setShapeVisible(shape, value)
def loadFile(self, filePath=None):
"""Load the specified file, or the last opened file if None."""
if self.dirty:
self.mayContinue()
self.resetState()
self.canvas.setEnabled(False)
if filePath is None:
filePath = self.settings.get(SETTING_FILENAME)
# Make sure that filePath is a regular python string, rather than QString
filePath = ustr(filePath)
# Fix bug: an index error occurred when opening a new file after selecting a directory.
unicodeFilePath = ustr(filePath)
# unicodeFilePath = os.path.abspath(unicodeFilePath)
# Tzutalin 20160906 : Add file list and dock to move faster
# Highlight the file item
if unicodeFilePath and self.fileListWidget.count() > 0:
if unicodeFilePath in self.mImgList:
index = self.mImgList.index(unicodeFilePath)
fileWidgetItem = self.fileListWidget.item(index)
print('unicodeFilePath is', unicodeFilePath)
fileWidgetItem.setSelected(True)
self.iconlist.clear()
self.additems5(None)
for i in range(self.iconlist.count()): # the icon strip may hold fewer than 5 items
item_tooltip = self.iconlist.item(i).toolTip()
# print(i,"---",item_tooltip)
if item_tooltip == ustr(filePath):
titem = self.iconlist.item(i)
titem.setSelected(True)
self.iconlist.scrollToItem(titem)
break
else:
self.fileListWidget.clear()
self.mImgList.clear()
self.iconlist.clear()
# if unicodeFilePath and self.iconList.count() > 0:
# if unicodeFilePath in self.mImgList:
if unicodeFilePath and os.path.exists(unicodeFilePath):
self.canvas.verified = False
cvimg = cv2.imdecode(np.fromfile(unicodeFilePath, dtype=np.uint8), 1)
height, width, depth = cvimg.shape
cvimg = cv2.cvtColor(cvimg, cv2.COLOR_BGR2RGB)
image = QImage(cvimg.data, width, height, width * depth, QImage.Format_RGB888)
if image.isNull():
self.errorMessage(u'Error opening file',
u"<p>Make sure <i>%s</i> is a valid image file." % unicodeFilePath)
self.status("Error reading %s" % unicodeFilePath)
return False
self.status("Loaded %s" % os.path.basename(unicodeFilePath))
self.image = image
self.filePath = unicodeFilePath
self.canvas.loadPixmap(QPixmap.fromImage(image))
if self.validFilestate(filePath) is True:
self.setClean()
else:
self.dirty = False
self.actions.save.setEnabled(True)
if len(self.canvas.lockedShapes) != 0:
self.actions.save.setEnabled(True)
self.setDirty()
self.canvas.setEnabled(True)
self.adjustScale(initial=True)
self.paintCanvas()
self.addRecentFile(self.filePath)
self.toggleActions(True)
self.showBoundingBoxFromPPlabel(filePath)
self.setWindowTitle(__appname__ + ' ' + filePath)
# Default : select last item if there is at least one item
if self.labelList.count():
self.labelList.setCurrentItem(self.labelList.item(self.labelList.count() - 1))
self.labelList.item(self.labelList.count() - 1).setSelected(True)
# Show the current image position and the total image count in the file dock title
select_indexes = self.fileListWidget.selectedIndexes()
if len(select_indexes) > 0:
self.fileDock.setWindowTitle(self.fileListName + f" ({select_indexes[0].row() + 1}"
f"/{self.fileListWidget.count()})")
# Update the item counts shown in the dock titles
self.BoxListDock.setWindowTitle(self.BoxListDockName + f" ({self.BoxList.count()})")
self.labelListDock.setWindowTitle(self.labelListDockName + f" ({self.labelList.count()})")
self.canvas.setFocus(True)
return True
return False
def showBoundingBoxFromPPlabel(self, filePath):
width, height = self.image.width(), self.image.height()
imgidx = self.getImglabelidx(filePath)
shapes = []
# box['ratio'] of the shapes saved in lockedShapes contains the ratio of the
# four corner coordinates of the shapes to the height and width of the image
for box in self.canvas.lockedShapes:
key_cls = 'None' if not self.kie_mode else box['key_cls']
if self.canvas.isInTheSameImage:
shapes.append((box['transcription'], [[s[0] * width, s[1] * height] for s in box['ratio']],
DEFAULT_LOCK_COLOR, key_cls, box['difficult']))
else:
shapes.append(('锁定框:待检测', [[s[0] * width, s[1] * height] for s in box['ratio']], # label text means "Locked box: pending detection"
DEFAULT_LOCK_COLOR, key_cls, box['difficult']))
if imgidx in self.PPlabel.keys():
for box in self.PPlabel[imgidx]:
key_cls = 'None' if not self.kie_mode else box.get('key_cls', 'None')
shapes.append((box['transcription'], box['points'], None, key_cls, box.get('difficult', False)))
self.loadLabels(shapes)
self.canvas.verified = False
def validFilestate(self, filePath):
if filePath not in self.fileStatedict.keys():
return None
elif self.fileStatedict[filePath] == 1:
return True
else:
return False
def resizeEvent(self, event):
if self.canvas and not self.image.isNull() \
and self.zoomMode != self.MANUAL_ZOOM:
self.adjustScale()
super(MainWindow, self).resizeEvent(event)
def paintCanvas(self):
assert not self.image.isNull(), "cannot paint null image"
self.canvas.scale = 0.01 * self.zoomWidget.value()
self.canvas.adjustSize()
self.canvas.update()
def adjustScale(self, initial=False):
value = self.scalers[self.FIT_WINDOW if initial else self.zoomMode]()
self.zoomWidget.setValue(int(100 * value))
self.imageSlider.setValue(self.zoomWidget.value()) # set zoom slider value
def scaleFitWindow(self):
"""Figure out the size of the pixmap in order to fit the main widget."""
e = 2.0 # So that no scrollbars are generated.
w1 = self.centralWidget().width() - e
h1 = self.centralWidget().height() - e - 110
a1 = w1 / h1
# Calculate a new scale value based on the pixmap's aspect ratio.
w2 = self.canvas.pixmap.width() - 0.0
h2 = self.canvas.pixmap.height() - 0.0
a2 = w2 / h2
return w1 / w2 if a2 >= a1 else h1 / h2
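# A minimal sketch of the fit-to-window computation above, written as a pure
# function so it can be checked without Qt. The name fit_window_scale and its
# parameters are hypothetical; margin and reserved_h mirror the constants 2.0
# and 110 in scaleFitWindow (the source does not say what the 110 pixels are
# reserved for).

```python
def fit_window_scale(view_w, view_h, pix_w, pix_h, margin=2.0, reserved_h=110.0):
    """Return the zoom factor that fits a pix_w x pix_h image into the view."""
    w1 = view_w - margin
    h1 = view_h - margin - reserved_h
    view_aspect = w1 / h1
    pix_aspect = pix_w / pix_h
    # An image that is relatively wider than the view is limited by width,
    # a relatively taller one by height.
    return w1 / pix_w if pix_aspect >= view_aspect else h1 / pix_h


wide = fit_window_scale(1002, 612, 2000, 500)   # width-limited
tall = fit_window_scale(1002, 612, 400, 1000)   # height-limited
```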
def scaleFitWidth(self):
# The epsilon does not seem to work too well here.
w = self.centralWidget().width() - 2.0
return w / self.canvas.pixmap.width()
def closeEvent(self, event):
if not self.mayContinue():
event.ignore()
else:
settings = self.settings
# If images were loaded from a directory, do not reopen a single file on the next start
if self.dirname is None:
settings[SETTING_FILENAME] = self.filePath if self.filePath else ''
else:
settings[SETTING_FILENAME] = ''
settings[SETTING_WIN_SIZE] = self.size()
settings[SETTING_WIN_POSE] = self.pos()
settings[SETTING_WIN_STATE] = self.saveState()
settings[SETTING_LINE_COLOR] = self.lineColor
settings[SETTING_FILL_COLOR] = self.fillColor
settings[SETTING_RECENT_FILES] = self.recentFiles
settings[SETTING_ADVANCE_MODE] = not self._beginner
if self.defaultSaveDir and os.path.exists(self.defaultSaveDir):
settings[SETTING_SAVE_DIR] = ustr(self.defaultSaveDir)
else:
settings[SETTING_SAVE_DIR] = ''
if self.lastOpenDir and os.path.exists(self.lastOpenDir):
settings[SETTING_LAST_OPEN_DIR] = self.lastOpenDir
else:
settings[SETTING_LAST_OPEN_DIR] = ''
settings[SETTING_PAINT_LABEL] = self.displayLabelOption.isChecked()
settings[SETTING_DRAW_SQUARE] = self.drawSquaresOption.isChecked()
settings.save()
try:
self.saveLabelFile()
except Exception:  # best-effort save on exit; do not swallow SystemExit/KeyboardInterrupt
pass
def loadRecent(self, filename):
if self.mayContinue():
print(filename, "======")
self.loadFile(filename)
def scanAllImages(self, folderPath):
extensions = ['.%s' % fmt.data().decode("ascii").lower() for fmt in QImageReader.supportedImageFormats()]
images = []
for file in os.listdir(folderPath):
if file.lower().endswith(tuple(extensions)):
relativePath = os.path.join(folderPath, file)
path = ustr(os.path.abspath(relativePath))
images.append(path)
natural_sort(images, key=lambda x: x.lower())
return images
def openDirDialog(self, _value=False, dirpath=None, silent=False):
if not self.mayContinue():
return
defaultOpenDirPath = dirpath if dirpath else '.'
if self.lastOpenDir and os.path.exists(self.lastOpenDir):
defaultOpenDirPath = self.lastOpenDir
else:
defaultOpenDirPath = os.path.dirname(self.filePath) if self.filePath else '.'
if not silent:
targetDirPath = ustr(QFileDialog.getExistingDirectory(self,
'%s - Open Directory' % __appname__,
defaultOpenDirPath,
QFileDialog.ShowDirsOnly | QFileDialog.DontResolveSymlinks))
else:
targetDirPath = ustr(defaultOpenDirPath)
self.lastOpenDir = targetDirPath
self.importDirImages(targetDirPath)
def openDatasetDirDialog(self):
if self.lastOpenDir and os.path.exists(self.lastOpenDir):
if platform.system() == 'Windows':
os.startfile(self.lastOpenDir)
else:
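import shlex
# quote the path so that directories containing spaces or shell characters work
os.system('open ' + shlex.quote(os.path.normpath(self.lastOpenDir)))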
defaultOpenDirPath = self.lastOpenDir
else:
if self.lang == 'ch':
self.msgBox.warning(self, "提示", "\n 原文件夹已不存在,请重新选择数据集路径!")
else:
self.msgBox.warning(self, "Warning",
"\n The original folder no longer exists, please choose the dataset path again!")
self.actions.open_dataset_dir.setEnabled(False)
defaultOpenDirPath = os.path.dirname(self.filePath) if self.filePath else '.'
def init_key_list(self, label_dict):
if not self.kie_mode:
return
# load key_cls
for image, info in label_dict.items():
for box in info:
if "key_cls" not in box:
box.update({"key_cls": "None"})
self.existed_key_cls_set.add(box["key_cls"])
if len(self.existed_key_cls_set) > 0:
for key_text in self.existed_key_cls_set:
if not self.keyList.findItemsByLabel(key_text):
item = self.keyList.createItemFromLabel(key_text)
self.keyList.addItem(item)
rgb = self._get_rgb_by_label(key_text, self.kie_mode)
self.keyList.setItemLabel(item, key_text, rgb)
if self.keyDialog is None:
# key list dialog
self.keyDialog = KeyDialog(
text=self.key_dialog_tip,
parent=self,
labels=self.existed_key_cls_set,
sort_labels=True,
show_text_field=True,
completion="startswith",
fit_to_content={'column': True, 'row': False},
flags=None
)
def importDirImages(self, dirpath, isDelete=False):
if not self.mayContinue() or not dirpath:
return
if self.defaultSaveDir and self.defaultSaveDir != dirpath:
self.saveLabelFile()
if not isDelete:
self.loadFilestate(dirpath)
self.PPlabelpath = dirpath + '/Label.txt'
self.PPlabel = self.loadLabelFile(self.PPlabelpath)
self.Cachelabelpath = dirpath + '/Cache.cach'
self.Cachelabel = self.loadLabelFile(self.Cachelabelpath)
if self.Cachelabel:
self.PPlabel = dict(self.Cachelabel, **self.PPlabel)
self.init_key_list(self.PPlabel)
self.lastOpenDir = dirpath
self.dirname = dirpath
self.defaultSaveDir = dirpath
self.statusBar().showMessage('%s started. Annotation will be saved to %s' %
(__appname__, self.defaultSaveDir))
self.statusBar().show()
self.filePath = None
self.fileListWidget.clear()
self.mImgList = self.scanAllImages(dirpath)
self.mImgList5 = self.mImgList[:5]
self.openNextImg()
doneicon = newIcon('done')
closeicon = newIcon('close')
for imgPath in self.mImgList:
filename = os.path.basename(imgPath)
if self.validFilestate(imgPath) is True:
item = QListWidgetItem(doneicon, filename)
else:
item = QListWidgetItem(closeicon, filename)
self.fileListWidget.addItem(item)
print('DirPath in importDirImages is', dirpath)
self.iconlist.clear()
self.additems5(dirpath)
self.changeFileFolder = True
self.haveAutoReced = False
self.AutoRecognition.setEnabled(True)
self.reRecogButton.setEnabled(True)
self.tableRecButton.setEnabled(True)
self.actions.AutoRec.setEnabled(True)
self.actions.reRec.setEnabled(True)
self.actions.tableRec.setEnabled(True)
self.actions.open_dataset_dir.setEnabled(True)
self.actions.rotateLeft.setEnabled(True)
self.actions.rotateRight.setEnabled(True)
self.fileListWidget.setCurrentRow(0) # set list index to first
self.fileDock.setWindowTitle(self.fileListName + f" (1/{self.fileListWidget.count()})") # show image count
def openPrevImg(self, _value=False):
if len(self.mImgList) <= 0:
return
if self.filePath is None:
return
currIndex = self.mImgList.index(self.filePath)
self.mImgList5 = self.mImgList[:5]
if currIndex - 1 >= 0:
filename = self.mImgList[currIndex - 1]
self.mImgList5 = self.indexTo5Files(currIndex - 1)
if filename:
self.loadFile(filename)
def openNextImg(self, _value=False):
if not self.mayContinue():
return
if len(self.mImgList) <= 0:
return
filename = None
if self.filePath is None:
filename = self.mImgList[0]
self.mImgList5 = self.mImgList[:5]
else:
currIndex = self.mImgList.index(self.filePath)
if currIndex + 1 < len(self.mImgList):
filename = self.mImgList[currIndex + 1]
self.mImgList5 = self.indexTo5Files(currIndex + 1)
else:
self.mImgList5 = self.indexTo5Files(currIndex)
if filename:
print('file name in openNext is ', filename)
self.loadFile(filename)
def updateFileListIcon(self, filename):
pass
def saveFile(self, _value=False, mode='Manual'):
# mode='Manual' is used when the user clicks "Save" explicitly; it also marks the image as checked
if self.filePath:
imgidx = self.getImglabelidx(self.filePath)
self._saveFile(imgidx, mode=mode)
def saveLockedShapes(self):
self.canvas.lockedShapes = []
self.canvas.selectedShapes = []
for s in self.canvas.shapes:
if s.line_color == DEFAULT_LOCK_COLOR:
self.canvas.selectedShapes.append(s)
self.lockSelectedShape()
for s in list(self.canvas.shapes):  # iterate over a copy; shapes are removed inside the loop
if s.line_color == DEFAULT_LOCK_COLOR:
self.canvas.selectedShapes.remove(s)
self.canvas.shapes.remove(s)
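# Removing items from a list while looping over that same list skips the
# element that follows each removal. The sketch below shows the pitfall and a
# snapshot-based alternative; remove_locked and the string "shapes" are
# hypothetical stand-ins for the canvas shapes.

```python
def remove_locked(shapes, is_locked):
    """Remove every locked shape in place and return the removed ones."""
    removed = []
    for shape in list(shapes):  # iterate over a snapshot, mutate the original
        if is_locked(shape):
            shapes.remove(shape)
            removed.append(shape)
    return removed


def is_locked(name):
    return name.startswith("lock")


# Naive version: "lock-b" survives because the iterator steps past it.
naive = ["lock-a", "lock-b", "free-c"]
for s in naive:
    if is_locked(s):
        naive.remove(s)

shapes = ["lock-a", "lock-b", "free-c", "lock-d"]
removed = remove_locked(shapes, is_locked)
```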
def _saveFile(self, annotationFilePath, mode='Manual'):
if len(self.canvas.lockedShapes) != 0:
self.saveLockedShapes()
if mode == 'Manual':
self.result_dic_locked = []
img = cv2.imread(self.filePath)
width, height = self.image.width(), self.image.height()
for shape in self.canvas.lockedShapes:
box = [[int(p[0] * width), int(p[1] * height)] for p in shape['ratio']]
# assert len(box) == 4
result = [(shape['transcription'], 1)]
result.insert(0, box)
self.result_dic_locked.append(result)
self.result_dic += self.result_dic_locked
self.result_dic_locked = []
if annotationFilePath and self.saveLabels(annotationFilePath, mode=mode):
self.setClean()
self.statusBar().showMessage('Saved to %s' % annotationFilePath)
self.statusBar().show()
currIndex = self.mImgList.index(self.filePath)
item = self.fileListWidget.item(currIndex)
item.setIcon(newIcon('done'))
self.fileStatedict[self.filePath] = 1
if len(self.fileStatedict) % self.autoSaveNum == 0:
self.saveFilestate()
self.savePPlabel(mode='Auto')
self.fileListWidget.insertItem(int(currIndex), item)
if not self.canvas.isInTheSameImage:
self.openNextImg()
self.actions.saveRec.setEnabled(True)
self.actions.saveLabel.setEnabled(True)
self.actions.exportJSON.setEnabled(True)
elif mode == 'Auto':
if annotationFilePath and self.saveLabels(annotationFilePath, mode=mode):
self.setClean()
self.statusBar().showMessage('Saved to %s' % annotationFilePath)
self.statusBar().show()
def closeFile(self, _value=False):
if not self.mayContinue():
return
self.resetState()
self.setClean()
self.toggleActions(False)
self.canvas.setEnabled(False)
self.actions.saveAs.setEnabled(False)
def deleteImg(self):
deletePath = self.filePath
if deletePath is not None:
deleteInfo = self.deleteImgDialog()
if deleteInfo == QMessageBox.Yes:
if platform.system() == 'Windows':
from win32com.shell import shell, shellcon
shell.SHFileOperation((0, shellcon.FO_DELETE, deletePath, None,
shellcon.FOF_SILENT | shellcon.FOF_ALLOWUNDO | shellcon.FOF_NOCONFIRMATION,
None, None))
# linux
elif platform.system() == 'Linux':
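import shlex
# quote the path so that file names with spaces or shell characters are handled safely
cmd = 'trash ' + shlex.quote(deletePath)
os.system(cmd)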
# macOS
elif platform.system() == 'Darwin':
import subprocess
absPath = os.path.abspath(deletePath).replace('\\', '\\\\').replace('"', '\\"')
cmd = ['osascript', '-e',
'tell app "Finder" to move {the POSIX file "' + absPath + '"} to trash']
print(cmd)
subprocess.call(cmd, stdout=subprocess.DEVNULL)  # avoids leaking an open file handle
if self.filePath in self.fileStatedict.keys():
self.fileStatedict.pop(self.filePath)
imgidx = self.getImglabelidx(self.filePath)
if imgidx in self.PPlabel.keys():
self.PPlabel.pop(imgidx)
self.openNextImg()
self.importDirImages(self.lastOpenDir, isDelete=True)
def deleteImgDialog(self):
yes, cancel = QMessageBox.Yes, QMessageBox.Cancel
msg = u'The image will be moved to the recycle bin'
return QMessageBox.warning(self, u'Attention', msg, yes | cancel)
def resetAll(self):
self.settings.reset()
self.close()
proc = QProcess()
proc.startDetached(os.path.abspath(__file__))
def mayContinue(self):
if not self.dirty:
return True
else:
discardChanges = self.discardChangesDialog()
if discardChanges == QMessageBox.No:
return True
elif discardChanges == QMessageBox.Yes:
self.canvas.isInTheSameImage = True
self.saveFile()
self.canvas.isInTheSameImage = False
return True
else:
return False
def discardChangesDialog(self):
yes, no, cancel = QMessageBox.Yes, QMessageBox.No, QMessageBox.Cancel
if self.lang == 'ch':
msg = u'您有未保存的变更, 您想保存再继续吗?\n点击 "No" 丢弃所有未保存的变更.'
else:
msg = u'You have unsaved changes. Would you like to save them before continuing?\nClick "No" to discard all unsaved changes.'
return QMessageBox.warning(self, u'Attention', msg, yes | no | cancel)
def errorMessage(self, title, message):
return QMessageBox.critical(self, title,
'<p><b>%s</b></p>%s' % (title, message))
def currentPath(self):
return os.path.dirname(self.filePath) if self.filePath else '.'
def chooseColor(self):
color = self.colorDialog.getColor(self.lineColor, u'Choose line color',
default=DEFAULT_LINE_COLOR)
if color:
self.lineColor = color
Shape.line_color = color
self.canvas.setDrawingColor(color)
self.canvas.update()
self.setDirty()
def deleteSelectedShape(self):
self.remLabels(self.canvas.deleteSelected())
self.actions.undo.setEnabled(True)
self.setDirty()
if self.noShapes():
for action in self.actions.onShapesPresent:
action.setEnabled(False)
self.BoxListDock.setWindowTitle(self.BoxListDockName + f" ({self.BoxList.count()})")
self.labelListDock.setWindowTitle(self.labelListDockName + f" ({self.labelList.count()})")
def chshapeLineColor(self):
color = self.colorDialog.getColor(self.lineColor, u'Choose line color',
default=DEFAULT_LINE_COLOR)
if color:
for shape in self.canvas.selectedShapes: shape.line_color = color
self.canvas.update()
self.setDirty()
def chshapeFillColor(self):
color = self.colorDialog.getColor(self.fillColor, u'Choose fill color',
default=DEFAULT_FILL_COLOR)
if color:
for shape in self.canvas.selectedShapes: shape.fill_color = color
self.canvas.update()
self.setDirty()
def copyShape(self):
self.canvas.endMove(copy=True)
self.addLabel(self.canvas.selectedShape)
self.setDirty()
def moveShape(self):
self.canvas.endMove(copy=False)
self.setDirty()
def loadPredefinedClasses(self, predefClassesFile):
if os.path.exists(predefClassesFile) is True:
with codecs.open(predefClassesFile, 'r', 'utf8') as f:
for line in f:
line = line.strip()
if self.labelHist is None:
self.labelHist = [line]
else:
self.labelHist.append(line)
def togglePaintLabelsOption(self):
for shape in self.canvas.shapes:
shape.paintLabel = self.displayLabelOption.isChecked()
def toogleDrawSquare(self):
self.canvas.setDrawingShapeToSquare(self.drawSquaresOption.isChecked())
def additems(self, dirpath):
for file in self.mImgList:
pix = QPixmap(file)
_, filename = os.path.split(file)
filename, _ = os.path.splitext(filename)
item = QListWidgetItem(QIcon(pix.scaled(100, 100, Qt.IgnoreAspectRatio, Qt.FastTransformation)),
filename[:10])
item.setToolTip(file)
self.iconlist.addItem(item)
def additems5(self, dirpath):
for file in self.mImgList5:
pix = QPixmap(file)
_, filename = os.path.split(file)
filename, _ = os.path.splitext(filename)
pfilename = filename[:10]
if len(pfilename) < 10:
lentoken = 12 - len(pfilename)
prelen = lentoken // 2
bfilename = prelen * " " + pfilename + (lentoken - prelen) * " "
# item = QListWidgetItem(QIcon(pix.scaled(100, 100, Qt.KeepAspectRatio, Qt.SmoothTransformation)),filename[:10])
item = QListWidgetItem(QIcon(pix.scaled(100, 100, Qt.IgnoreAspectRatio, Qt.FastTransformation)), pfilename)
# item.setForeground(QBrush(Qt.white))
item.setToolTip(file)
self.iconlist.addItem(item)
owidth = 0
for index in range(len(self.mImgList5)):
item = self.iconlist.item(index)
itemwidget = self.iconlist.visualItemRect(item)
owidth += itemwidget.width()
self.iconlist.setMinimumWidth(owidth + 50)
def gen_quad_from_poly(self, poly):
"""
Generate min area quad from poly.
"""
point_num = poly.shape[0]
min_area_quad = np.zeros((4, 2), dtype=np.float32)
rect = cv2.minAreaRect(poly.astype(
np.int32)) # (center (x,y), (width, height), angle of rotation)
box = np.array(cv2.boxPoints(rect))
first_point_idx = 0
min_dist = float('inf')  # the summed corner distance can exceed 1e4 on large images
for i in range(4):
dist = np.linalg.norm(box[(i + 0) % 4] - poly[0]) + \
np.linalg.norm(box[(i + 1) % 4] - poly[point_num // 2 - 1]) + \
np.linalg.norm(box[(i + 2) % 4] - poly[point_num // 2]) + \
np.linalg.norm(box[(i + 3) % 4] - poly[-1])
if dist < min_dist:
min_dist = dist
first_point_idx = i
for i in range(4):
min_area_quad[i] = box[(first_point_idx + i) % 4]
bbox_new = min_area_quad.tolist()
bbox = []
for box in bbox_new:
box = list(map(int, box))
bbox.append(box)
return bbox
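# The corner-ordering step of gen_quad_from_poly can be sketched without
# OpenCV: try the four cyclic rotations of the rectangle corners and keep the
# one whose corners lie closest to four reference points. align_quad is a
# hypothetical name; in the method above the reference points are poly[0],
# poly[n // 2 - 1], poly[n // 2] and poly[-1]. math.dist needs Python 3.8+.

```python
import math


def align_quad(box, anchors):
    """Rotate the corner order of `box` so it best matches `anchors`."""
    best_offset, best_dist = 0, float("inf")
    for offset in range(4):
        dist = sum(math.dist(box[(offset + k) % 4], anchors[k]) for k in range(4))
        if dist < best_dist:
            best_offset, best_dist = offset, dist
    return [box[(best_offset + k) % 4] for k in range(4)]


# The rectangle corners start at the top-right; the polygon starts top-left.
box = [(10, 0), (10, 5), (0, 5), (0, 0)]
anchors = [(0, 0), (10, 0), (10, 5), (0, 5)]
aligned = align_quad(box, anchors)
```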
def getImglabelidx(self, filePath):
if platform.system() == 'Windows':
spliter = '\\'
else:
spliter = '/'
filepathsplit = filePath.split(spliter)[-2:]
return filepathsplit[0] + '/' + filepathsplit[1]
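# The label key is always "<parent folder>/<file name>" with a forward slash,
# whatever the platform separator. A sketch of the same idea with pathlib;
# image_label_key and its windows flag are hypothetical, used here only so
# both path flavours can be shown on any machine.

```python
from pathlib import PurePosixPath, PureWindowsPath


def image_label_key(file_path, windows=False):
    """Return 'folder/file' built from the last two path components."""
    path = PureWindowsPath(file_path) if windows else PurePosixPath(file_path)
    return "/".join(path.parts[-2:])


posix_key = image_label_key("/data/invoices/img_001.jpg")
windows_key = image_label_key(r"C:\data\invoices\img_001.jpg", windows=True)
```

# Both calls yield the same key, which is what allows a Label.txt written on
# one platform to be matched against images opened on another.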
def autoRecognition(self):
assert self.mImgList is not None
print('Using model from ', self.model)
uncheckedList = [i for i in self.mImgList if i not in self.fileStatedict.keys()]
self.autoDialog = AutoDialog(parent=self, ocr=self.ocr, mImgList=uncheckedList, lenbar=len(uncheckedList))
self.autoDialog.popUp()
self.currIndex = len(self.mImgList) - 1
self.loadFile(self.filePath) # ADD
self.haveAutoReced = True
self.AutoRecognition.setEnabled(False)
self.actions.AutoRec.setEnabled(False)
self.setDirty()
self.saveCacheLabel()
self.init_key_list(self.Cachelabel)
def reRecognition(self):
img = cv2.imread(self.filePath)
# org_box = [dic['points'] for dic in self.PPlabel[self.getImglabelidx(self.filePath)]]
if self.canvas.shapes:
self.result_dic = []
self.result_dic_locked = [] # result_dic_locked stores the ocr result of self.canvas.lockedShapes
rec_flag = 0
for shape in self.canvas.shapes:
box = [[int(p.x()), int(p.y())] for p in shape.points]
kie_cls = shape.key_cls
if len(box) > 4:
box = self.gen_quad_from_poly(np.array(box))
assert len(box) == 4
img_crop = get_rotate_crop_image(img, np.array(box, np.float32))
if img_crop is None:
msg = 'Can not recognise the detection box in ' + self.filePath + '. Please change manually'
QMessageBox.information(self, "Information", msg)
return
result = self.ocr.ocr(img_crop, cls=True, det=False)
if result[0][0] != '':
if shape.line_color == DEFAULT_LOCK_COLOR:
shape.label = result[0][0]
result.insert(0, box)
if self.kie_mode:
result.append(kie_cls)
self.result_dic_locked.append(result)
else:
result.insert(0, box)
if self.kie_mode:
result.append(kie_cls)
self.result_dic.append(result)
else:
print('Can not recognise the box')
if shape.line_color == DEFAULT_LOCK_COLOR:
shape.label = result[0][0]
if self.kie_mode:
self.result_dic_locked.append([box, (self.noLabelText, 0), kie_cls])
else:
self.result_dic_locked.append([box, (self.noLabelText, 0)])
else:
if self.kie_mode:
self.result_dic.append([box, (self.noLabelText, 0), kie_cls])
else:
self.result_dic.append([box, (self.noLabelText, 0)])
try:
if self.noLabelText == shape.label or result[1][0] == shape.label:
print('label no change')
else:
rec_flag += 1
except IndexError as e:
print('Can not recognise the box')
if (len(self.result_dic) > 0 and rec_flag > 0) or self.canvas.lockedShapes:
self.canvas.isInTheSameImage = True
self.saveFile(mode='Auto')
self.loadFile(self.filePath)
self.canvas.isInTheSameImage = False
self.setDirty()
elif len(self.result_dic) == len(self.canvas.shapes) and rec_flag == 0:
if self.lang == 'ch':
QMessageBox.information(self, "Information", "识别结果保持一致!")
else:
QMessageBox.information(self, "Information", "The recognition result remains unchanged!")
else:
print('Can not recognise in ', self.filePath)
else:
QMessageBox.information(self, "Information", "Draw a box!")
def singleRerecognition(self):
img = cv2.imread(self.filePath)
for shape in self.canvas.selectedShapes:
box = [[int(p.x()), int(p.y())] for p in shape.points]
if len(box) > 4:
box = self.gen_quad_from_poly(np.array(box))
assert len(box) == 4
img_crop = get_rotate_crop_image(img, np.array(box, np.float32))
if img_crop is None:
msg = 'Can not recognise the detection box in ' + self.filePath + '. Please change manually'
QMessageBox.information(self, "Information", msg)
return
result = self.ocr.ocr(img_crop, cls=True, det=False)
if result[0][0] != '':
result.insert(0, box)
print('result in reRec is ', result)
if result[1][0] == shape.label:
print('label no change')
else:
shape.label = result[1][0]
else:
print('Can not recognise the box')
if self.noLabelText == shape.label:
print('label no change')
else:
shape.label = self.noLabelText
self.singleLabel(shape)
self.setDirty()
def TableRecognition(self):
'''
Table Recognition
'''
from paddleocr.ppstructure.table.predict_table import to_excel
import time
start = time.time()
img = cv2.imread(self.filePath)
res = self.table_ocr(img, return_ocr_result_in_table=True)
TableRec_excel_dir = self.lastOpenDir + '/tableRec_excel_output/'
os.makedirs(TableRec_excel_dir, exist_ok=True)
filename, _ = os.path.splitext(os.path.basename(self.filePath))
excel_path = TableRec_excel_dir + '{}.xlsx'.format(filename)
if res is None:
msg = 'Can not recognise the table in ' + self.filePath + '. Please change manually'
QMessageBox.information(self, "Information", msg)
to_excel('', excel_path) # create an empty excel
return
# save res
# ONLY SUPPORT ONE TABLE in one image
hasTable = False
for region in res:
if region['type'] == 'Table':
if region['res']['boxes'] is None:
msg = 'Can not recognise the detection box in ' + self.filePath + '. Please change manually'
QMessageBox.information(self, "Information", msg)
to_excel('', excel_path) # create an empty excel
return
hasTable = True
# save table ocr result on PPOCRLabel
# clear all old annotations before saving the result
self.itemsToShapes.clear()
self.shapesToItems.clear()
self.itemsToShapesbox.clear() # ADD
self.shapesToItemsbox.clear()
self.labelList.clear()
self.BoxList.clear()
self.result_dic = []
self.result_dic_locked = []
shapes = []
result_len = len(region['res']['boxes'])
for i in range(result_len):
bbox = np.array(region['res']['boxes'][i])
rec_text = region['res']['rec_res'][i][0]
# polys to rectangles
x1, y1 = np.min(bbox[:, 0]), np.min(bbox[:, 1])
x2, y2 = np.max(bbox[:, 0]), np.max(bbox[:, 1])
rext_bbox = [[x1, y1], [x2, y1], [x2, y2], [x1, y2]]
# save bbox to shape
shape = Shape(label=rec_text, line_color=DEFAULT_LINE_COLOR, key_cls=None)
for point in rext_bbox:
x, y = point
# Ensure the labels are within the bounds of the image.
# If not, fix them.
x, y, snapped = self.canvas.snapPointToCanvas(x, y)
shape.addPoint(QPointF(x, y))
shape.difficult = False
# shape.locked = False
shape.close()
self.addLabel(shape)
shapes.append(shape)
self.setDirty()
self.canvas.loadShapes(shapes)
# save HTML result to excel
try:
to_excel(region['res']['html'], excel_path)
except Exception:
print('Cannot save the Excel file; it may be open in another program (permission denied)')
break
if not hasTable:
msg = 'Can not recognise the table in ' + self.filePath + '. Please change manually'
QMessageBox.information(self, "Information", msg)
to_excel('', excel_path) # create an empty excel
return
# automatically open excel annotation file
if platform.system() == 'Windows':
try:
import win32com.client
except ImportError:
print("Cannot open the .xlsx file automatically: the win32com module (pywin32) is not installed")
try:
xl = win32com.client.Dispatch("Excel.Application")
xl.Visible = True
xl.Workbooks.Open(excel_path)
# excelEx = "You need to show the excel executable at this point"
# subprocess.Popen([excelEx, excel_path])
# os.startfile(excel_path)
except Exception:
print("Cannot open the .xlsx file: Excel could not be started or the file does not exist")
else:
import shlex
# quote the path so that file names with spaces or shell characters work
os.system('open ' + shlex.quote(os.path.normpath(excel_path)))
print('time cost: ', time.time() - start)
def cellreRecognition(self):
'''
re-recognise text in a cell
'''
img = cv2.imread(self.filePath)
for shape in self.canvas.selectedShapes:
box = [[int(p.x()), int(p.y())] for p in shape.points]
if len(box) > 4:
box = self.gen_quad_from_poly(np.array(box))
assert len(box) == 4
# pad around bbox for better text recognition accuracy
_box = boxPad(box, img.shape, 6)
img_crop = get_rotate_crop_image(img, np.array(_box, np.float32))
if img_crop is None:
msg = 'Can not recognise the detection box in ' + self.filePath + '. Please change manually'
QMessageBox.information(self, "Information", msg)
return
# merge the text result in the cell
texts = ''
probs = 0.  # the cell probability is the average over every text box in the cell
bboxes = self.ocr.ocr(img_crop, det=True, rec=False, cls=False)
if len(bboxes) > 0:
bboxes.reverse() # top row text at first
for _bbox in bboxes:
patch = get_rotate_crop_image(img_crop, np.array(_bbox, np.float32))
rec_res = self.ocr.ocr(patch, det=False, rec=True, cls=False)
text = rec_res[0][0]
if text != '':
texts += text + ('' if text[0].isalpha() else ' ') # add space between english word
probs += rec_res[0][1]
probs = probs / len(bboxes)
result = [(texts.strip(), probs)]
if result[0][0] != '':
result.insert(0, box)
print('result in reRec is ', result)
if result[1][0] == shape.label:
print('label no change')
else:
shape.label = result[1][0]
else:
print('Can not recognise the box')
if self.noLabelText == shape.label:
print('label no change')
else:
shape.label = self.noLabelText
self.singleLabel(shape)
self.setDirty()
def exportJSON(self):
'''
export PPLabel and CSV to JSON (PubTabNet)
'''
import pandas as pd
from libs.dataPartitionDialog import DataPartitionDialog
# data partition user input
partitionDialog = DataPartitionDialog(parent=self)
partitionDialog.exec()
if partitionDialog.getStatus() == False:
return
# automatically save annotations
self.saveFilestate()
self.savePPlabel(mode='Auto')
# load box annotations
labeldict = {}
if not os.path.exists(self.PPlabelpath):
msg = 'ERROR, Can not find Label.txt'
QMessageBox.information(self, "Information", msg)
return
else:
with open(self.PPlabelpath, 'r', encoding='utf-8') as f:
data = f.readlines()
for each in data:
file, label = each.split('\t')
if label:
label = label.replace('false', 'False')
label = label.replace('true', 'True')
labeldict[file] = eval(label)
else:
labeldict[file] = []
train_split, val_split, test_split = partitionDialog.getDataPartition()
# validate the partition percentages
if train_split + val_split + test_split > 100:
msg = "The sum of training, validation and testing data must not exceed 100%"
QMessageBox.information(self, "Information", msg)
return
print(train_split, val_split, test_split)
train_split, val_split, test_split = float(train_split) / 100., float(val_split) / 100., float(test_split) / 100.
train_id = int(len(labeldict) * train_split)
val_id = int(len(labeldict) * (train_split + val_split))
print('Data partition: train:', train_id,
'validation:', val_id - train_id,
'test:', len(labeldict) - val_id)
TableRec_excel_dir = os.path.join(self.lastOpenDir, 'tableRec_excel_output')
json_results = []
imgid = 0
for image_path in labeldict.keys():
# load the Excel (.xlsx) table-structure annotation
filename, _ = os.path.splitext(os.path.basename(image_path))
csv_path = os.path.join(TableRec_excel_dir, filename + '.xlsx')
if not os.path.exists(csv_path):
continue
excel = xlrd.open_workbook(csv_path)
sheet0 = excel.sheet_by_index(0) # only sheet 0
merged_cells = sheet0.merged_cells # (0,1,1,3) start row, end row, start col, end col
html_list = [['td'] * sheet0.ncols for i in range(sheet0.nrows)]
for merged in merged_cells:
html_list = expand_list(merged, html_list)
token_list = convert_token(html_list)
# load box annotations
cells = []
for anno in labeldict[image_path]:
tokens = list(anno['transcription'])
obb = anno['points']
hbb = OBB2HBB(np.array(obb)).tolist()
cells.append({'tokens': tokens, 'bbox': hbb})
# data split
if imgid < train_id:
split = 'train'
elif imgid < val_id:
split = 'val'
else:
split = 'test'
# save dict
html = {'structure': {'tokens': token_list}, 'cell': cells}
json_results.append({'filename': os.path.basename(image_path), 'split': split, 'imgid': imgid, 'html': html})
imgid += 1
# save json
with open("{}/annotation.json".format(self.lastOpenDir), "w", encoding='utf-8') as fid:
fid.write(json.dumps(json_results, ensure_ascii=False))
msg = 'JSON successfully saved in {}/annotation.json'.format(self.lastOpenDir)
QMessageBox.information(self, "Information", msg)
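# The train/val/test assignment in exportJSON is index-based: percentages are
# turned into two cut-off indices and every image id below a cut-off falls
# into that split. A sketch with a hypothetical helper name, assign_splits;
# whatever is left after train and val goes to test.

```python
def assign_splits(n_items, train_pct, val_pct):
    """Return one split name per item index, mirroring the cut-off logic."""
    train_split = float(train_pct) / 100.0
    val_split = float(val_pct) / 100.0
    train_end = int(n_items * train_split)
    val_end = int(n_items * (train_split + val_split))
    splits = []
    for idx in range(n_items):
        if idx < train_end:
            splits.append("train")
        elif idx < val_end:
            splits.append("val")
        else:
            splits.append("test")
    return splits


splits = assign_splits(10, 50, 25)
counts = {name: splits.count(name) for name in ("train", "val", "test")}
```

# Note that int() truncates, so rounding remainders end up in the test split.
# In exportJSON the image id only advances for images that have an .xlsx
# file, so the real split sizes can differ from the printed estimate.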
def autolcm(self):
vbox = QVBoxLayout()
hbox = QHBoxLayout()
self.panel = QLabel()
self.panel.setText(self.stringBundle.getString('choseModelLg'))
self.panel.setAlignment(Qt.AlignLeft)
self.comboBox = QComboBox()
self.comboBox.setObjectName("comboBox")
self.comboBox.addItems(['Chinese & English', 'English', 'French', 'German', 'Korean', 'Japanese'])
vbox.addWidget(self.panel)
vbox.addWidget(self.comboBox)
self.dialog = QDialog()
self.dialog.resize(300, 100)
self.okBtn = QPushButton(self.stringBundle.getString('ok'))
self.cancelBtn = QPushButton(self.stringBundle.getString('cancel'))
self.okBtn.clicked.connect(self.modelChoose)
self.cancelBtn.clicked.connect(self.cancel)
self.dialog.setWindowTitle(self.stringBundle.getString('choseModelLg'))
hbox.addWidget(self.okBtn)
hbox.addWidget(self.cancelBtn)
vbox.addWidget(self.panel)
vbox.addLayout(hbox)
self.dialog.setLayout(vbox)
self.dialog.setWindowModality(Qt.ApplicationModal)
self.dialog.exec_()
if self.filePath:
self.AutoRecognition.setEnabled(True)
self.actions.AutoRec.setEnabled(True)
def modelChoose(self):
print(self.comboBox.currentText())
lg_idx = {'Chinese & English': 'ch', 'English': 'en', 'French': 'french', 'German': 'german',
'Korean': 'korean', 'Japanese': 'japan'}
del self.ocr
self.ocr = PaddleOCR(use_pdserving=False, use_angle_cls=True, det=True, cls=True, use_gpu=False,
lang=lg_idx[self.comboBox.currentText()])
del self.table_ocr
self.table_ocr = PPStructure(use_pdserving=False,
use_gpu=False,
lang=lg_idx[self.comboBox.currentText()],
layout=False,
show_log=False)
self.dialog.close()
def cancel(self):
self.dialog.close()
def loadFilestate(self, saveDir):
self.fileStatepath = saveDir + '/fileState.txt'
self.fileStatedict = {}
if not os.path.exists(self.fileStatepath):
open(self.fileStatepath, 'w', encoding='utf-8').close()  # create an empty state file
else:
with open(self.fileStatepath, 'r', encoding='utf-8') as f:
states = f.readlines()
for each in states:
file, state = each.split('\t')
self.fileStatedict[file] = 1
self.actions.saveLabel.setEnabled(True)
self.actions.saveRec.setEnabled(True)
self.actions.exportJSON.setEnabled(True)
def saveFilestate(self):
with open(self.fileStatepath, 'w', encoding='utf-8') as f:
for key in self.fileStatedict:
f.write(key + '\t')
f.write(str(self.fileStatedict[key]) + '\n')
def loadLabelFile(self, labelpath):
labeldict = {}
if not os.path.exists(labelpath):
open(labelpath, 'w', encoding='utf-8').close()  # create an empty label file
else:
with open(labelpath, 'r', encoding='utf-8') as f:
data = f.readlines()
for each in data:
file, label = each.split('\t')
if label:
label = label.replace('false', 'False')
label = label.replace('true', 'True')
labeldict[file] = eval(label)
else:
labeldict[file] = []
return labeldict
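# The label file holds one "<key>\t<JSON list>" record per line. As a hedged
# alternative to replacing 'false'/'true' and calling eval(), the records can
# be parsed with json.loads, which also leaves a transcription such as
# "false alarm" untouched. parse_label_lines is a hypothetical helper, and the
# sketch assumes every record was written with json.dumps as savePPlabel does;
# files written in another style would not parse this way.

```python
import io
import json


def parse_label_lines(lines):
    """Parse '<key>\\t<json>' records into a dict of annotation lists."""
    labels = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        key, _, payload = line.partition("\t")
        labels[key] = json.loads(payload) if payload else []
    return labels


sample = (
    'imgs/a.jpg\t[{"transcription": "false alarm", '
    '"points": [[0, 0], [4, 0], [4, 2], [0, 2]], "difficult": false}]\n'
    'imgs/b.jpg\t\n'
)
labels = parse_label_lines(io.StringIO(sample))
```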
def savePPlabel(self, mode='Manual'):
savedfile = [self.getImglabelidx(i) for i in self.fileStatedict.keys()]
with open(self.PPlabelpath, 'w', encoding='utf-8') as f:
for key in self.PPlabel:
if key in savedfile and self.PPlabel[key] != []:
f.write(key + '\t')
f.write(json.dumps(self.PPlabel[key], ensure_ascii=False) + '\n')
if mode == 'Manual':
if self.lang == 'ch':
msg = '已将检查过的图片标签保存在 ' + self.PPlabelpath + " 文件中"
else:
msg = 'Images that have been checked are saved in ' + self.PPlabelpath
QMessageBox.information(self, "Information", msg)
def saveCacheLabel(self):
with open(self.Cachelabelpath, 'w', encoding='utf-8') as f:
for key in self.Cachelabel:
f.write(key + '\t')
f.write(json.dumps(self.Cachelabel[key], ensure_ascii=False) + '\n')
def saveLabelFile(self):
self.saveFilestate()
self.savePPlabel()
def saveRecResult(self):
if {} in [self.PPlabelpath, self.PPlabel, self.fileStatedict]:
QMessageBox.information(self, "Information", "Check the image first")
return
rec_gt_dir = os.path.dirname(self.PPlabelpath) + '/rec_gt.txt'
crop_img_dir = os.path.dirname(self.PPlabelpath) + '/crop_img/'
ques_img = []
if not os.path.exists(crop_img_dir):
os.mkdir(crop_img_dir)
with open(rec_gt_dir, 'w', encoding='utf-8') as f:
for key in self.fileStatedict:
idx = self.getImglabelidx(key)
try:
img = cv2.imread(key)
for i, label in enumerate(self.PPlabel[idx]):
if label['difficult']:
continue
img_crop = get_rotate_crop_image(img, np.array(label['points'], np.float32))
img_name = os.path.splitext(os.path.basename(idx))[0] + '_crop_' + str(i) + '.jpg'
cv2.imwrite(crop_img_dir + img_name, img_crop)
f.write('crop_img/' + img_name + '\t')
f.write(label['transcription'] + '\n')
except Exception as e:
ques_img.append(key)
print("Can not read image ", e)
if ques_img:
QMessageBox.information(self,
"Information",
"The following images can not be saved, please check the image path and labels.\n"
+ "".join(str(i) + '\n' for i in ques_img))
QMessageBox.information(self, "Information", "Cropped images have been saved in " + str(crop_img_dir))
    def speedChoose(self):
        if self.labelDialogOption.isChecked():
            self.canvas.newShape.disconnect()
            self.canvas.newShape.connect(partial(self.newShape, True))
        else:
            self.canvas.newShape.disconnect()
            self.canvas.newShape.connect(partial(self.newShape, False))
    def autoSaveFunc(self):
        if self.autoSaveOption.isChecked():
            self.autoSaveNum = 1  # Real auto_Save
            try:
                self.saveLabelFile()
            except Exception:
                # Saving is best-effort here; a failure must not block toggling the option
                pass
            print('The program will automatically save once after confirming an image')
        else:
            self.autoSaveNum = 5  # Used for backup
            print('The program will automatically save once after confirming 5 images (default)')
    def change_box_key(self):
        if not self.kie_mode:
            return
        key_text, _ = self.keyDialog.popUp(self.key_previous_text)
        if key_text is None:
            return
        self.key_previous_text = key_text
        for shape in self.canvas.selectedShapes:
            shape.key_cls = key_text
            if not self.keyList.findItemsByLabel(key_text):
                item = self.keyList.createItemFromLabel(key_text)
                self.keyList.addItem(item)
                rgb = self._get_rgb_by_label(key_text, self.kie_mode)
                self.keyList.setItemLabel(item, key_text, rgb)
            self._update_shape_color(shape)
            self.keyDialog.addLabelHistory(key_text)
    def undoShapeEdit(self):
        self.canvas.restoreShape()
        self.labelList.clear()
        self.BoxList.clear()
        self.loadShapes(self.canvas.shapes)
        self.actions.undo.setEnabled(self.canvas.isShapeRestorable)
    def loadShapes(self, shapes, replace=True):
        self._noSelectionSlot = True
        for shape in shapes:
            self.addLabel(shape)
        self.labelList.clearSelection()
        self._noSelectionSlot = False
        self.canvas.loadShapes(shapes, replace=replace)
        print("loadShapes")  # 1
    def lockSelectedShape(self):
        """Lock the selected shapes.
        Add self.canvas.selectedShapes to self.canvas.lockedShapes,
        which holds the ratio of the four coordinates of the locked shapes
        to the width and height of the image
        """
        width, height = self.image.width(), self.image.height()
        def format_shape(s):
            return dict(label=s.label,  # str
                        line_color=s.line_color.getRgb(),
                        fill_color=s.fill_color.getRgb(),
                        ratio=[[int(p.x()) / width, int(p.y()) / height] for p in s.points],  # QPointF
                        difficult=s.difficult,  # bool
                        key_cls=s.key_cls,  # str
                        )
        # lock
        if len(self.canvas.lockedShapes) == 0:
            for s in self.canvas.selectedShapes:
                s.line_color = DEFAULT_LOCK_COLOR
                s.locked = True
            shapes = [format_shape(shape) for shape in self.canvas.selectedShapes]
            trans_dic = []
            for box in shapes:
                trans_dict = {"transcription": box['label'], "ratio": box['ratio'], "difficult": box['difficult']}
                if self.kie_mode:
                    trans_dict.update({"key_cls": box["key_cls"]})
                trans_dic.append(trans_dict)
            self.canvas.lockedShapes = trans_dic
            self.actions.save.setEnabled(True)
        # unlock
        else:
            for s in self.canvas.shapes:
                s.line_color = DEFAULT_LINE_COLOR
            self.canvas.lockedShapes = []
            self.result_dic_locked = []
            self.setDirty()
            self.actions.save.setEnabled(True)
def inverted(color):
    return QColor(*[255 - v for v in color.getRgb()])
def read(filename, default=None):
    # Return the raw bytes of the file, or `default` if it cannot be read
    try:
        with open(filename, 'rb') as f:
            return f.read()
    except Exception:
        return default
def str2bool(v):
    return v.lower() in ("true", "t", "1")
def get_main_app(argv=[]):
    """
    Standard boilerplate Qt application code.
    Do everything but app.exec_() -- so that we can test the application in one thread
    """
    app = QApplication(argv)
    app.setApplicationName(__appname__)
    app.setWindowIcon(newIcon("app"))
    # Tzutalin 201705+: Accept extra arguments to change predefined class file
    arg_parser = argparse.ArgumentParser()
    arg_parser.add_argument("--lang", type=str, default='en', nargs="?")
    arg_parser.add_argument("--gpu", type=str2bool, default=True, nargs="?")
    arg_parser.add_argument("--kie", type=str2bool, default=False, nargs="?")
    arg_parser.add_argument("--predefined_classes_file",
                            default=os.path.join(os.path.dirname(__file__), "data", "predefined_classes.txt"),
                            nargs="?")
    args = arg_parser.parse_args(argv[1:])
    win = MainWindow(lang=args.lang,
                     gpu=args.gpu,
                     kie_mode=args.kie,
                     default_predefined_class_file=args.predefined_classes_file)
    win.show()
    return app, win
def main():
    """construct main app and run it"""
    app, _win = get_main_app(sys.argv)
    return app.exec_()
if __name__ == '__main__':
    resource_file = './libs/resources.py'
    if not os.path.exists(resource_file):
        # Compile the Qt resources on first run
        output = os.system('pyrcc5 -o libs/resources.py resources.qrc')
        assert output == 0, "Running pyrcc5 failed; please check whether resources.py exists in the " \
                            "libs directory"
    sys.exit(main())
English | [简体中文](README_ch.md)
# PPOCRLabelv2
PPOCRLabelv2 is a semi-automatic graphical annotation tool for the OCR field, with a built-in PP-OCR model that automatically detects and re-recognizes data. It is written in Python3 and PyQT5 and supports rectangular box, table, irregular text and key information annotation modes. Annotations can be used directly to train PP-OCR detection and recognition models.
| regular text annotation | table annotation |
| :-------------------------------------------------: | :--------------------------------------------: |
| <img src="./data/gif/steps_en.gif" width="80%"/> | <img src="./data/gif/table.gif" width="100%"/> |
| **irregular text annotation** | **key information annotation** |
| <img src="./data/gif/multi-point.gif" width="80%"/> | <img src="./data/gif/kie.gif" width="100%"/> |
### Recent Update
- 2022.05: Add table annotation, see `2.2 Table Annotation` for more information (by [whjdark](https://github.com/peterh0323); [Evezerest](https://github.com/Evezerest))
- 2022.02: (by [PeterH0323](https://github.com/peterh0323))
- Add KIE Mode by using `--kie`, for [detection + identification + keyword extraction] labeling.
- Improve user experience: support using `C` or `X` to rotate box, prompt for the number of files and labels, optimize interaction.
- 2021.11.17:
- Support installing and starting PPOCRLabel through the whl package (by [d2623587501](https://github.com/d2623587501))
- Dataset splitting: divide the annotation file into training, validation and test parts (refer to section 3.5 below, by [MrCuiHao](https://github.com/MrCuiHao))
- 2021.8.11:
- New functions: Open the dataset folder, image rotation (Note: Please delete the label box before rotating the image) (by [Wei-JL](https://github.com/Wei-JL))
- Added shortcut key description (Help-Shortcut Key), repaired the direction shortcut key movement function under batch processing (by [d2623587501](https://github.com/d2623587501))
- 2021.2.5: New batch processing and undo functions (by [Evezerest](https://github.com/Evezerest)):
- **Batch processing function**: Press and hold the Ctrl key to select the box, you can move, copy, and delete in batches.
- **Undo function**: In the process of drawing a four-point label box or after editing the box, press Ctrl+Z to undo the previous operation.
- Fix image rotation and size problems, optimize the process of editing the label box (by [ninetailskim](https://github.com/ninetailskim), [edencfc](https://github.com/edencfc)).
- 2021.1.11: Optimize the labeling experience (by [edencfc](https://github.com/edencfc)):
- Users can choose whether to pop up the label input dialog after drawing the detection box in "View - Pop-up Label Input Dialog".
- The recognition result scrolls synchronously when users click the related detection box.
- Click to modify the recognition result. (If you can't change the result, please switch to the system default input method, or switch back to the original input method again.)
- 2020.12.18: Support re-recognition of a single label box (by [ninetailskim](https://github.com/ninetailskim)), improve shortcut keys.
## 1. Installation and Run
### 1.1 Install PaddlePaddle
```bash
pip3 install --upgrade pip
# If you have cuda9 or cuda10 installed on your machine, please run the following command to install
python3 -m pip install paddlepaddle-gpu -i https://mirror.baidu.com/pypi/simple
# If you only have cpu on your machine, please run the following command to install
python3 -m pip install paddlepaddle -i https://mirror.baidu.com/pypi/simple
```
For other software version requirements, please refer to the instructions in the [Installation Document](https://www.paddlepaddle.org.cn/install/quick).
### 1.2 Install and Run PPOCRLabel
PPOCRLabel can be started in two ways: from the whl package or from the Python script. The whl package is more convenient to start, while the Python script is better suited to secondary development.
#### Windows
```bash
pip install PPOCRLabel # install
# Select label mode and run
PPOCRLabel # [Normal mode] for [detection + recognition] labeling
PPOCRLabel --kie True # [KIE mode] for [detection + recognition + keyword extraction] labeling
```
> If you get the error `OSError: [WinError 126] The specified module could not be found` when installing shapely on Windows, please try downloading the Shapely whl file from http://www.lfd.uci.edu/~gohlke/pythonlibs/#shapely.
>
> Reference: [Solve shapely installation on windows](https://stackoverflow.com/questions/44398265/install-shapely-oserror-winerror-126-the-specified-module-could-not-be-found)
>
#### Ubuntu Linux
```bash
pip3 install PPOCRLabel
pip3 install trash-cli
# Select label mode and run
PPOCRLabel # [Normal mode] for [detection + recognition] labeling
PPOCRLabel --kie True # [KIE mode] for [detection + recognition + keyword extraction] labeling
```
#### MacOS
```bash
pip3 install PPOCRLabel
pip3 install opencv-contrib-python-headless==4.2.0.32
# Select label mode and run
PPOCRLabel # [Normal mode] for [detection + recognition] labeling
PPOCRLabel --kie True # [KIE mode] for [detection + recognition + keyword extraction] labeling
```
#### 1.2.2 Run PPOCRLabel by Python Script
If you modify the PPOCRLabel file (for example, to specify a new built-in model), running the Python script is the most convenient way to see the results. If you still want to start from the whl package, you need to uninstall the whl package from the current environment and then rebuild it as described in the next section.
```bash
cd ./PPOCRLabel # Switch to the PPOCRLabel directory
# Select label mode and run
python PPOCRLabel.py # [Normal mode] for [detection + recognition] labeling
python PPOCRLabel.py --kie True # [KIE mode] for [detection + recognition + keyword extraction] labeling
```
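The flags used above are parsed with `argparse`. The sketch below mirrors the `--lang`, `--gpu` and `--kie` definitions found in `PPOCRLabel.py` to show how a value such as `--kie True` becomes a boolean. `parse_flags` is a hypothetical helper name used only for this illustration, and the `--predefined_classes_file` flag is left out.

```python
import argparse


def str2bool(v):
    # Same rule as in PPOCRLabel.py: only "true", "t" and "1" count as True.
    return v.lower() in ("true", "t", "1")


def parse_flags(argv):
    # Hypothetical helper: reproduces only three of the flags defined by the tool.
    parser = argparse.ArgumentParser()
    parser.add_argument("--lang", type=str, default="en", nargs="?")
    parser.add_argument("--gpu", type=str2bool, default=True, nargs="?")
    parser.add_argument("--kie", type=str2bool, default=False, nargs="?")
    return parser.parse_args(argv)


normal = parse_flags([])
kie = parse_flags(["--lang", "ch", "--kie", "True"])
print(normal.lang, normal.gpu, normal.kie)
print(kie.lang, kie.gpu, kie.kie)
```

Because the flags are declared with `nargs="?"` and no `const`, passing `--kie` without a value leaves it as `None` in this sketch, so it seems safest to always write `--kie True` explicitly.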
#### 1.2.3 Build and Install the Whl Package Locally
Build and install a new whl package. Here 1.0.2 is the version number; you can specify a new version in `setup.py`.
```bash
cd PaddleOCR/PPOCRLabel
python3 setup.py bdist_wheel
pip3 install dist/PPOCRLabel-1.0.2-py2.py3-none-any.whl
```
## 2. Usage
### 2.1 Steps
1. Build and launch using the instructions above.
2. Click 'Open Dir' in Menu/File to select the folder of pictures.<sup>[1]</sup>
3. Click 'Auto recognition' to annotate, with the PP-OCR model, the images marked with 'X'<sup>[2]</sup> before the file name.
4. Create Box:
4.1 Click 'Create RectBox' or press 'W' in English keyboard mode to draw a new rectangular detection box. Press and release the left mouse button to select the text region to annotate.
4.2 Press 'Q' to enter four-point labeling mode, which lets you create any four-point shape by clicking four points in succession with the left mouse button. DOUBLE CLICK the left mouse button to signal that labeling is complete.
5. After the label box is drawn, click "OK"; the detection box is pre-assigned a "TEMPORARY" label.
6. Click 're-Recognition'; the model will rewrite ALL recognition results in ALL detection boxes<sup>[3]</sup>.
7. Single-click a result in the 'recognition result' list to manually correct inaccurate recognition results.
8. **Click "Check"; the image status switches to "√" and the program automatically jumps to the next image.**
9. Click "Delete Image" to move the image to the recycle bin.
10. Labeling result: you can export the label result manually through the menu "File - Export Label", and the program also exports automatically if "File - Auto export Label Mode" is selected. Manually checked labels are stored in *Label.txt* under the opened picture folder. After you click "File"-"Export Recognition Results" in the menu bar, the recognition training data of these pictures is saved in the *crop_img* folder, and the recognition labels are saved in *rec_gt.txt*<sup>[4]</sup>.
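As a rough illustration of what the recognition export in step 10 writes, the sketch below follows the naming used by `saveRecResult` in `PPOCRLabel.py`: each box that is not flagged as difficult yields a crop named `<image stem>_crop_<index>.jpg` and one tab-separated line in *rec_gt.txt*. `rec_gt_lines` is a hypothetical helper and the sample labels are made up; no images are read or written here.

```python
import os


def rec_gt_lines(image_path, labels):
    # Hypothetical helper mirroring the naming used when recognition results are exported.
    stem = os.path.splitext(os.path.basename(image_path))[0]
    lines = []
    for i, label in enumerate(labels):
        if label["difficult"]:
            continue  # boxes flagged as difficult are skipped
        img_name = stem + "_crop_" + str(i) + ".jpg"
        lines.append("crop_img/" + img_name + "\t" + label["transcription"])
    return lines


# Made-up labels for illustration only.
sample = [
    {"transcription": "hello", "difficult": False},
    {"transcription": "???", "difficult": True},
    {"transcription": "world", "difficult": False},
]
for line in rec_gt_lines("demo/word_001.png", sample):
    print(line)
```

Note that the crop index follows the position of the box in the label list, so skipped boxes leave gaps in the numbering.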
### 2.2 Table Annotation
Table annotation aims to extract the structure of a table in a picture and convert it to Excel format, so the annotation must be done together with external software that can edit Excel files.
In PPOCRLabel, complete the text information labeling (text and position); in the Excel file, complete the table structure labeling. The recommended steps are:
1. Table annotation: After opening the table picture, click the `Table Recognition` button in the upper right corner of PPOCRLabel. This calls the table recognition model in PP-Structure to label the table automatically and opens Excel at the same time.
2. Change the recognition result: **label each cell** (i.e. all the text in a cell is marked as one box). Right-click the box and click `Cell Re-recognition` to let the model recognise the text within the cell automatically.
3. Mark the table structure: for each cell that contains text, **mark it with any identifier (such as `1`) in Excel**, and make sure that the merged cell structure is the same as in the original picture.
4. Export JSON format annotation: close all Excel files corresponding to table images, then click `File`-`Export table JSON annotation` to obtain the JSON annotation results.
### 2.3 Note
[1] PPOCRLabel uses the opened folder as the project. The pictures are not displayed in the folder selection dialog; instead, the pictures under the folder are imported directly into the program after clicking "Open Dir".
[2] The image status indicates whether the user has saved the image manually. If it has not been saved manually it is "X", otherwise it is "√". PPOCRLabel will not relabel pictures with a status of "√".
[3] After clicking "Re-recognize", the model will overwrite ALL recognition results in the picture. Therefore, a recognition result that was changed manually may change again after re-recognition.
[4] The files produced by PPOCRLabel are stored under the opened picture folder and include the following. Please do not change their contents manually, otherwise the program may behave abnormally.
| File name | Description |
| :-----------: | :----------------------------------------------------------: |
| Label.txt | The detection label file, which can be used directly for PP-OCR detection model training. After the user saves 5 label results, the file is exported automatically. It is also written when the user closes the application or changes the file folder. |
| fileState.txt | The picture status file, which saves the names of the images in the current folder that have been manually confirmed by the user. |
| Cache.cach | Cache file that saves the results of model recognition. |
| rec_gt.txt | The recognition label file, which can be used directly for PP-OCR recognition model training. It is generated after the user clicks "File"-"Export recognition result" in the menu bar. |
| crop_img | The recognition data, generated at the same time as *rec_gt.txt* |
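For readers who only want to inspect *Label.txt* (rather than edit it), the sketch below parses one line in the layout written by `savePPlabel`: the image key, a tab, then a JSON list of boxes. The field names `transcription`, `points` and `difficult` are the ones read back by the export code; `parse_label_line` is a hypothetical helper and the sample line is invented for illustration.

```python
import json


def parse_label_line(line):
    # Each line is "<image key>\t<JSON list of boxes>".
    key, payload = line.rstrip("\n").split("\t", 1)
    return key, json.loads(payload)


# Made-up sample line; real files are produced by PPOCRLabel.
sample_line = "demo/word_001.png\t" + json.dumps(
    [{"transcription": "hello",
      "points": [[0, 0], [10, 0], [10, 5], [0, 5]],
      "difficult": False}],
    ensure_ascii=False,
) + "\n"

key, boxes = parse_label_line(sample_line)
print(key, len(boxes), boxes[0]["transcription"])
```

Splitting on the first tab only keeps the parser safe if a transcription itself ever contains a tab inside the JSON payload.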
## 3. Explanation
### 3.1 Shortcut keys
| Shortcut keys | Description |
|--------------------------|--------------------------------------------------|
| Ctrl + Shift + R | Re-recognize all the labels of the current image |
| W | Create a rect box |
| Q | Create a multi-points box |
| X | Rotate the box anti-clockwise |
| C | Rotate the box clockwise |
| Ctrl + E | Edit label of the selected box |
| Ctrl + X | Change key class of the box in `--kie` mode |
| Ctrl + R | Re-recognize the selected box |
| Ctrl + C | Copy and paste the selected box |
| Ctrl + Left Mouse Button | Multi select the label box |
| Backspace | Delete the selected box |
| Ctrl + V | Check image |
| Ctrl + Shift + D | Delete image |
| D | Next image |
| A | Previous image |
| Ctrl++ | Zoom in |
| Ctrl-- | Zoom out |
| ↑→↓← | Move selected box |
### 3.2 Built-in Model
- Default model: PPOCRLabel uses the Chinese and English ultra-lightweight OCR model in PaddleOCR by default, which supports Chinese, English and number recognition, and multi-language detection.
- Model language switching: The built-in model language can be changed by clicking "PaddleOCR"-"Choose OCR Model" in the menu bar. Currently supported languages include French, German, Korean, and Japanese. For specific model download links, please refer to [PaddleOCR Model List](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/doc/doc_en/models_list_en.md#multilingual-recognition-modelupdating)
- **Custom Model**: If users want to replace the built-in model with their own inference model, they can follow the [Custom Model Code Usage](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.3/doc/doc_en/whl_en.md#31-use-by-code) and modify the [Instantiation of PaddleOCR class](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.5/PPOCRLabel/PPOCRLabel.py#L97) in PPOCRLabel.py: add the parameter `det_model_dir` to `self.ocr = PaddleOCR(use_pdserving=False, use_angle_cls=True, det=True, cls=True, use_gpu=gpu, lang=lang) `
### 3.3 Export Label Result
PPOCRLabel supports three ways to export Label.txt:
- Automatic export: After selecting "File - Auto Export Label Mode", the program writes the annotations into Label.txt every time the user confirms an image. If this option is not turned on, the file is exported automatically once the user has manually checked 5 images.
> Automatic export mode is turned off by default
- Manual export: Click "File-Export Marking Results" to manually export the label.
- Export on close: Label.txt is also written when the application is closed.
### 3.4 Export Partial Recognition Results
For data that is difficult to recognize, **uncheck** the corresponding tags in the recognition results checkbox so that their recognition results are not exported. An unchecked recognition result is saved with `difficult` set to `True` in the label file `Label.txt`.
> *Note: The status of the checkboxes in the recognition results still needs to be saved manually by clicking the Save button.*
### 3.5 Dataset division
- Enter the following commands in the terminal to run the dataset division script:
```
cd ./PPOCRLabel # Change the directory to the PPOCRLabel folder
python gen_ocr_train_val_test.py --trainValTestRatio 6:2:2 --datasetRootPath ../train_data
```
Parameter Description:
- `trainValTestRatio` is the ratio of the number of images in the training set, validation set, and test set. Set it according to your actual situation; the default is `6:2:2`.
- `datasetRootPath` is the storage path of the complete dataset labeled by PPOCRLabel. The default path is `PaddleOCR/train_data`. Before splitting, the dataset should have the following structure:
```
|-train_data
  |-crop_img
    |- word_001_crop_0.png
    |- word_002_crop_0.jpg
    |- word_003_crop_0.jpg
    | ...
  | Label.txt
  | rec_gt.txt
  |- word_001.png
  |- word_002.jpg
  |- word_003.jpg
  | ...
```
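To make the effect of `trainValTestRatio` concrete, here is a minimal sketch of a shuffle-then-cut split by cumulative ratio, in the spirit of `gen_ocr_train_val_test.py`. `split_by_ratio` and its `seed` parameter are hypothetical and not part of the script; unlike the script, the sketch normalizes by the sum of the three parts and copies no files.

```python
import random


def split_by_ratio(records, ratio="6:2:2", seed=None):
    # Hypothetical helper: shuffle the records, then cut them by cumulative ratio.
    parts = [float(p) for p in ratio.split(":")]
    total = sum(parts)
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n = len(shuffled)
    train, val, test = [], [], []
    for index, record in enumerate(shuffled):
        # Compare index / n against the cumulative ratios without dividing.
        if index * total < parts[0] * n:
            train.append(record)
        elif index * total < (parts[0] + parts[1]) * n:
            val.append(record)
        else:
            test.append(record)
    return train, val, test


records = ["img_%d.jpg" % i for i in range(10)]
train, val, test = split_by_ratio(records, "6:2:2", seed=0)
print(len(train), len(val), len(test))
```

With 10 records and `6:2:2` this yields 6, 2 and 2 items; which records land in which set depends on the shuffle.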
### 3.6 Error message
- If paddleocr is installed from the whl package, it takes priority over calling the PaddleOCR class through paddleocr.py, which may cause an exception if the whl package is not up to date.
- For Linux users, if you get an error starting with **objc[XXXXX]** when opening the software, it indicates that your opencv version is too new. It is recommended to install version 4.2:
```
pip install opencv-python==4.2.0.32
```
- If you get an error starting with **Missing string id**, you need to recompile the resources:
```
pyrcc5 -o libs/resources.py resources.qrc
```
- If you get the error `module 'cv2' has no attribute 'INTER_NEAREST'`, you need to uninstall all opencv related packages first, and then reinstall version 4.2.0.32 of headless opencv:
```
pip install opencv-contrib-python-headless==4.2.0.32
```
## 4. Related
1. [Tzutalin. LabelImg. Git code (2015)](https://github.com/tzutalin/labelImg)
[English](README.md) | 简体中文
# PPOCRLabelv2
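PPOCRLabel is a semi-automatic graphical annotation tool for the OCR field, with a built-in PP-OCR model that automatically annotates and re-recognizes data. It is written in Python3 and PyQT5 and supports rectangular box, table, irregular text and key information annotation modes. The exported format can be used directly to train PaddleOCR detection and recognition models.
| Regular annotation | Table annotation |
| :---------------------------------------------------: | :----------------------------------------------: |
| <img src="./data/gif/steps_en.gif" width="80%"/> | <img src="./data/gif/table.gif" width="100%"/> |
| **Irregular text annotation** | **Key information annotation** |
| <img src="./data/gif/multi-point.gif" width="80%"/> | <img src="./data/gif/kie.gif" width="100%"/> |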
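#### Recent Update
- 2022.05: Add table annotation; see `2.2 Table Annotation` below for usage (by [whjdark](https://github.com/peterh0323); [Evezerest](https://github.com/Evezerest))
- 2022.02: Add key information annotation and improve the labeling experience (by [PeterH0323](https://github.com/peterh0323))
- New: use `--kie` to enter KIE mode, for [detection + recognition + keyword extraction] labeling.
- Improved user experience: prompt for the number of files and labels, optimize interaction, fix GPU usage and other issues.
- New function: use `C` and `X` to rotate the label box.
- 2021.11.17:
- Support installing and starting PPOCRLabel through the whl package (by [d2623587501](https://github.com/d2623587501))
- Dataset splitting: divide the annotated data into training, validation and test sets (see section 3.5 below, by [MrCuiHao](https://github.com/MrCuiHao))
- 2021.8.11:
- New functions: open the dataset folder; rotate the image by 90 degrees via right click (note: the image must not contain any label box before rotation; by [Wei-JL](https://github.com/Wei-JL))
- Added shortcut key description (Help - Shortcut Key) and fixed the arrow-key movement function under batch processing (by [d2623587501](https://github.com/d2623587501))
- 2021.2.5: New batch processing and undo functions (by [Evezerest](https://github.com/Evezerest))
- **Batch processing**: hold the Ctrl key to select label boxes, then move, copy, delete or re-recognize them in batches.
- **Undo**: while drawing a four-point label box or after editing a box, press Ctrl+Z to undo the previous operation.
- Fixed image rotation and size problems and optimized the process of editing label boxes (by [ninetailskim](https://github.com/ninetailskim), [edencfc](https://github.com/edencfc))
- 2021.1.11: Optimize the labeling experience (by [edencfc](https://github.com/edencfc)):
- Users can choose in "View - Pop-up Label Input Dialog" whether the label input dialog pops up after a detection box is drawn.
- The recognition results scroll in sync with the detection boxes.
- Recognition results are now modified with a single click. (If you cannot modify a result, switch to the system default input method, or switch back to the original input method.)
- 2020.12.18: Support re-recognition of a single label box (by [ninetailskim](https://github.com/ninetailskim)), improve shortcut keys.
If you have other ideas for improving the tool, you are welcome to sign up for related changes through the [community regular contest](https://github.com/PaddlePaddle/PaddleOCR/issues/4982) and earn points that can be exchanged for rewards.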
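## 1. Installation and Run
### 1.1 Install PaddlePaddle
```bash
pip3 install --upgrade pip
# If you have CUDA9 or CUDA10 installed on your machine, run the following command to install
python3 -m pip install paddlepaddle-gpu -i https://mirror.baidu.com/pypi/simple
# If you only have a CPU on your machine, run the following command to install
python3 -m pip install paddlepaddle -i https://mirror.baidu.com/pypi/simple
```
For other version requirements, please refer to the instructions in the [Installation Document](https://www.paddlepaddle.org.cn/install/quick).
### 1.2 Install and Run PPOCRLabel
PPOCRLabel can be started in two ways: from the whl package or from the Python script. The whl package is more convenient to start, while the Python script is better suited to secondary development.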
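#### 1.2.1 Install and Run by whl Package
##### Windows
```bash
pip install PPOCRLabel # install
# Select label mode and run
PPOCRLabel --lang ch # [Normal mode] for [detection + recognition] labeling
PPOCRLabel --lang ch --kie True # [KIE mode] for [detection + recognition + keyword extraction] labeling
```
> Note: installing PPOCRLabel through the whl package automatically downloads the `paddleocr` whl package. Its shapely dependency may fail with the error `[WinError 126] The specified module could not be found`; in that case it is recommended to download and install shapely from [here](https://www.lfd.uci.edu/~gohlke/pythonlibs/#shapely)
##### Ubuntu Linux
```bash
pip3 install PPOCRLabel
pip3 install trash-cli
# Select label mode and run
PPOCRLabel --lang ch # [Normal mode] for [detection + recognition] labeling
PPOCRLabel --lang ch --kie True # [KIE mode] for [detection + recognition + keyword extraction] labeling
```
##### MacOS
```bash
pip3 install PPOCRLabel
pip3 install opencv-contrib-python-headless==4.2.0.32 # If the download is too slow, add "-i https://mirror.baidu.com/pypi/simple"
# Select label mode and run
PPOCRLabel --lang ch # [Normal mode] for [detection + recognition] labeling
PPOCRLabel --lang ch --kie True # [KIE mode] for [detection + recognition + keyword extraction] labeling
```
> If the installation above runs into problems, refer to section 3.6 Error message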
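#### 1.2.2 Run PPOCRLabel by Python Script
If you modify the PPOCRLabel file (for example, to specify a new built-in model), running the Python script is the most convenient way to see the results. If you still want to start from the whl package, you need to uninstall the whl package from the current environment and then rebuild it as described in the next section.
```bash
cd ./PPOCRLabel # Switch to the PPOCRLabel directory
python PPOCRLabel.py --lang ch
```
#### 1.2.3 Build and Install the Whl Package Locally
Build and install a new whl package. Here 1.0.2 is the version number; you can specify a new version in `setup.py`.
```bash
cd PaddleOCR/PPOCRLabel
python3 setup.py bdist_wheel
pip3 install dist/PPOCRLabel-1.0.2-py2.py3-none-any.whl -i https://mirror.baidu.com/pypi/simple
```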
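## 2. Usage
### 2.1 Steps
> If you only need to annotate text content and position, the following steps are recommended:
1. Install and run: use the commands above to install and run the program.
2. Open folder: click "File" - "Open Dir" in the menu bar to select the folder of images to be labeled<sup>[1]</sup>.
3. Auto annotation: click "Auto recognition" to annotate, with the PP-OCR ultra-lightweight model, the images whose status<sup>[2]</sup> before the file name is "X".
4. Manual annotation: click "Create RectBox" (pressing "W" in English keyboard mode is recommended) to draw label boxes manually for the parts of the current image that the model did not detect. Press Q to use four-point labeling mode (or click "Edit" - "Four-point labeling"); click 4 points in succession, then double-click the left mouse button to signal that labeling is complete.
5. After the label box is drawn, click "OK"; the detection box is first pre-assigned a "TEMPORARY" label.
6. Re-recognition: after all detection boxes in the image have been drawn or adjusted, click "re-Recognition"; the PP-OCR model re-recognizes **all detection boxes** in the current image<sup>[3]</sup>.
7. Change content: single-click a recognition result to manually correct inaccurate recognition results.
8. **Confirm labels: click "Check"; the image status switches to "√" and the program jumps to the next image.**
9. Delete: click "Delete Image" to move the image to the recycle bin.
10. Export results: you can export manually through "File - Export Label" in the menu, or click "File - Auto export Label Mode" to enable automatic export. Manually confirmed labels are stored in *Label.txt* under the opened image folder. After you click "File" - "Export Recognition Results" in the menu bar, the recognition training data of these images is saved in the *crop_img* folder, and the recognition labels are saved in *rec_gt.txt*<sup>[4]</sup>.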
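### 2.2 Table Annotation
Table annotation targets the structured extraction of tables, converting the table in an image to Excel format, so the annotation must be done together with external software that opens Excel.
In PPOCRLabel, complete the text information labeling of the table (text and position); in the Excel file, complete the table structure labeling. The recommended steps are:
1. Table recognition: after opening the table image, click the `Table Recognition` button in the upper right corner. The software calls the table recognition model in PP-Structure to label the table automatically and opens Excel at the same time.
2. Change the recognition result: **add label boxes cell by cell** (i.e. all the text in one cell is marked as one box). Right-click a label box and click `Cell Re-recognition` to let the model recognize the text in the cell automatically.
3. Mark the table structure: for each cell of the table image that contains text, **mark it with any identifier (such as `1`) in Excel**, and make sure the merged cells in Excel match the original image.
4. Export JSON format: close all Excel files corresponding to table images, then click `File`-`Export table JSON annotation` to obtain the JSON annotation results.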
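### 2.3 Note
[1] PPOCRLabel uses the **folder** as the basic labeling unit. The images are not displayed in the folder selection dialog; instead, the images under the folder are imported directly into the program after clicking "Select Folder".
[2] The image status indicates whether the user has saved the image manually: "X" if it has not been saved manually, "√" if it has. After clicking "Auto recognition", PPOCRLabel does not relabel images with a status of "√".
[3] After clicking "re-Recognition", the model overwrites the recognition results in the image. Therefore, a recognition result that was changed manually may change again after re-recognition.
[4] The files produced by PPOCRLabel are stored under the labeled image folder and include the following. Please do not change their contents manually, otherwise the program may behave abnormally.
| File name | Description |
| :-----------: | :----------------------------------------------------------: |
| Label.txt | Detection labels, which can be used directly for PP-OCR detection model training. The program writes the file automatically after every 5 confirmed detection results. It is also written when the user closes the application or switches the file path. |
| fileState.txt | The image status file, which saves the names of the images in the current folder that have been manually confirmed by the user. |
| Cache.cach | Cache file that saves the results of automatic model recognition. |
| rec_gt.txt | Recognition labels, which can be used directly for PP-OCR recognition model training. Generated after the user clicks "File" - "Export Recognition Results" in the menu bar. |
| crop_img | Recognition data: the images cropped according to the detection boxes. Generated at the same time as rec_gt.txt. |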
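## 3. Explanation
### 3.1 Shortcut keys
| Shortcut keys | Description |
|------------------|---------------------------------|
| Ctrl + Shift + R | Re-recognize all the labels of the current image |
| W | Create a rect box |
| Q | Create a multi-points box |
| X | Rotate the box anti-clockwise |
| C | Rotate the box clockwise |
| Ctrl + E | Edit label of the selected box |
| Ctrl + X | Change key class of the box in `--kie` mode |
| Ctrl + R | Re-recognize the selected box |
| Ctrl + C | Copy and paste the selected box |
| Ctrl + Left Mouse Button | Multi select the label box |
| Backspace | Delete the selected box |
| Ctrl + V | Check image |
| Ctrl + Shift + D | Delete image |
| D | Next image |
| A | Previous image |
| Ctrl++ | Zoom in |
| Ctrl-- | Zoom out |
| ↑→↓← | Move selected box |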
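### 3.2 Built-in Model
- Default model: PPOCRLabel uses the Chinese and English ultra-lightweight OCR model in PaddleOCR by default, which supports Chinese, English and number recognition, and multi-language detection.
- Model language switching: the built-in model language can be changed through "PaddleOCR" - "Choose OCR Model" in the menu bar. Currently supported languages include French, German, Korean, and Japanese. For specific model download links, please refer to the [PaddleOCR Model List](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/doc/doc_ch/models_list.md).
- **Custom model**: to replace the built-in model with your own inference model, follow [Custom Model Code Usage](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/doc/doc_ch/whl.md#%E8%87%AA%E5%AE%9A%E4%B9%89%E6%A8%A1%E5%9E%8B) and modify the [instantiation of the PaddleOCR class](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.3/PPOCRLabel/PPOCRLabel.py#L116) in PPOCRLabel.py. For example, to specify a detection model, pass your own model as `det_model_dir` in `self.ocr = PaddleOCR(det=True, cls=True, use_gpu=gpu, lang=lang) `.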
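### 3.3 Export Label Result
PPOCRLabel supports three ways to export:
- Automatic export: after clicking "File - Auto export Label Mode", the program writes the annotations into Label.txt every time the user confirms an image. If this option is not turned on, the file is exported automatically once the user has manually confirmed 5 images.
> Automatic export is turned off by default
- Manual export: click "File - Export Label" to export the labels manually.
- Export when the application is closed
### 3.4 Export Partial Recognition Results
For data that is difficult to recognize, **uncheck** the corresponding tags in the recognition results checkbox so that their recognition results are not exported. An unchecked recognition result is saved with `difficult` set to `True` in the label file `Label.txt`.
> *Note: the status of the checkboxes in the recognition results is only kept after the user manually clicks Check.*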
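### 3.5 Dataset division
Enter the following commands in the terminal to run the dataset division script:
```
cd ./PPOCRLabel # Change the directory to the PPOCRLabel folder
python gen_ocr_train_val_test.py --trainValTestRatio 6:2:2 --datasetRootPath ../train_data
```
Parameter description:
- `trainValTestRatio` is the ratio of the number of images in the training set, validation set, and test set. Set it according to your actual situation; the default is `6:2:2`.
- `datasetRootPath` is the storage path of the complete dataset labeled by PPOCRLabel. The default path is `PaddleOCR/train_data`. Before splitting, the dataset should have the following structure:
```
|-train_data
  |-crop_img
    |- word_001_crop_0.png
    |- word_002_crop_0.jpg
    |- word_003_crop_0.jpg
    | ...
  | Label.txt
  | rec_gt.txt
  |- word_001.png
  |- word_002.jpg
  |- word_003.jpg
  | ...
```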
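### 3.6 Error message
- If paddleocr is also installed from the whl package, it takes priority over calling the PaddleOCR class through paddleocr.py, which may cause an exception if the whl package is not up to date.
- PPOCRLabel **does not support automatic annotation of images with Chinese file names**.
- For Linux users: if you get an error starting with **objc[XXXXX]** when opening the software, it indicates that your opencv version is too new. It is recommended to install version 4.2:
```
pip install opencv-python==4.2.0.32
```
- If you get an error starting with `Missing string id`, you need to recompile the resources:
```
pyrcc5 -o libs/resources.py resources.qrc
```
- If you get the error `module 'cv2' has no attribute 'INTER_NEAREST'`, you need to uninstall all opencv related packages first, and then reinstall version 4.2.0.32 of headless opencv:
```
pip install opencv-contrib-python-headless==4.2.0.32
```
## 4. Related
1. [Tzutalin. LabelImg. Git code (2015)](https://github.com/tzutalin/labelImg)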
# coding:utf8
import os
import shutil
import random
import argparse
# Delete the existing train/val/test split folder and recreate it as an empty folder
def isCreateOrDeleteFolder(path, flag):
    flagPath = os.path.join(path, flag)
    if os.path.exists(flagPath):
        shutil.rmtree(flagPath)
    os.makedirs(flagPath)
    flagAbsPath = os.path.abspath(flagPath)
    return flagAbsPath
def splitTrainVal(root, absTrainRootPath, absValRootPath, absTestRootPath, trainTxt, valTxt, testTxt, flag):
    # Split the data into training, validation and test sets by the given ratio
    dataAbsPath = os.path.abspath(root)
    if flag == "det":
        labelFilePath = os.path.join(dataAbsPath, args.detLabelFileName)
    elif flag == "rec":
        labelFilePath = os.path.join(dataAbsPath, args.recLabelFileName)
    # Read all label records, closing the file before shuffling
    with open(labelFilePath, "r", encoding="UTF-8") as labelFileRead:
        labelFileContent = labelFileRead.readlines()
    random.shuffle(labelFileContent)
    labelRecordLen = len(labelFileContent)
    # Cumulative cut points, normalized by the ratio sum so "6:2:2" and "60:20:20" behave alike
    trainValTestRatio = [float(r) for r in args.trainValTestRatio.split(":")]
    ratioSum = sum(trainValTestRatio)
    trainRatio = trainValTestRatio[0] / ratioSum
    valRatio = trainRatio + trainValTestRatio[1] / ratioSum
    for index, labelRecordInfo in enumerate(labelFileContent):
        imageRelativePath = labelRecordInfo.split('\t')[0]
        imageLabel = labelRecordInfo.split('\t')[1]
        imageName = os.path.basename(imageRelativePath)
        if flag == "det":
            imagePath = os.path.join(dataAbsPath, imageName)
        elif flag == "rec":
            # os.path.join keeps the path portable; a hard-coded "\\" only works on Windows
            imagePath = os.path.join(dataAbsPath, args.recImageDirName, imageName)
        curRatio = index / labelRecordLen
        if curRatio < trainRatio:
            imageCopyPath = os.path.join(absTrainRootPath, imageName)
            shutil.copy(imagePath, imageCopyPath)
            trainTxt.write("{}\t{}".format(imageCopyPath, imageLabel))
        elif curRatio < valRatio:
            imageCopyPath = os.path.join(absValRootPath, imageName)
            shutil.copy(imagePath, imageCopyPath)
            valTxt.write("{}\t{}".format(imageCopyPath, imageLabel))
        else:
            imageCopyPath = os.path.join(absTestRootPath, imageName)
            shutil.copy(imagePath, imageCopyPath)
            testTxt.write("{}\t{}".format(imageCopyPath, imageLabel))
# Remove the file if it exists
def removeFile(path):
    if os.path.exists(path):
        os.remove(path)
def genDetRecTrainVal(args):
    detAbsTrainRootPath = isCreateOrDeleteFolder(args.detRootPath, "train")
    detAbsValRootPath = isCreateOrDeleteFolder(args.detRootPath, "val")
    detAbsTestRootPath = isCreateOrDeleteFolder(args.detRootPath, "test")
    recAbsTrainRootPath = isCreateOrDeleteFolder(args.recRootPath, "train")
    recAbsValRootPath = isCreateOrDeleteFolder(args.recRootPath, "val")
    recAbsTestRootPath = isCreateOrDeleteFolder(args.recRootPath, "test")
    removeFile(os.path.join(args.detRootPath, "train.txt"))
    removeFile(os.path.join(args.detRootPath, "val.txt"))
    removeFile(os.path.join(args.detRootPath, "test.txt"))
    removeFile(os.path.join(args.recRootPath, "train.txt"))
    removeFile(os.path.join(args.recRootPath, "val.txt"))
    removeFile(os.path.join(args.recRootPath, "test.txt"))
    detTrainTxt = open(os.path.join(args.detRootPath, "train.txt"), "a", encoding="UTF-8")
    detValTxt = open(os.path.join(args.detRootPath, "val.txt"), "a", encoding="UTF-8")
    detTestTxt = open(os.path.join(args.detRootPath, "test.txt"), "a", encoding="UTF-8")
    recTrainTxt = open(os.path.join(args.recRootPath, "train.txt"), "a", encoding="UTF-8")
    recValTxt = open(os.path.join(args.recRootPath, "val.txt"), "a", encoding="UTF-8")
    recTestTxt = open(os.path.join(args.recRootPath, "test.txt"), "a", encoding="UTF-8")
    splitTrainVal(args.datasetRootPath, detAbsTrainRootPath, detAbsValRootPath, detAbsTestRootPath, detTrainTxt, detValTxt,
                  detTestTxt, "det")
    # Only the top level of the dataset folder is searched for the crop_img directory
    for root, dirs, files in os.walk(args.datasetRootPath):
        for dir in dirs:
            if dir == 'crop_img':
                splitTrainVal(root, recAbsTrainRootPath, recAbsValRootPath, recAbsTestRootPath, recTrainTxt, recValTxt,
                              recTestTxt, "rec")
            else:
                continue
        break
    # Close the split files so that every record is flushed to disk
    for txt in (detTrainTxt, detValTxt, detTestTxt, recTrainTxt, recValTxt, recTestTxt):
        txt.close()
if __name__ == "__main__":
    # Purpose: split the detection and recognition data into training, validation and test sets
    # Note: adjust the parameters to your own paths and needs. Image data is often labeled in batches by
    # several people, with each batch placed in its own folder and labeled with PPOCRLabel, so several
    # labeled image folders may need to be merged and split into training, validation and test sets
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--trainValTestRatio",
        type=str,
        default="6:2:2",
        help="ratio of trainset:valset:testset")
    parser.add_argument(
        "--datasetRootPath",
        type=str,
        default="../train_data/",
        help="path to the dataset marked by ppocrlabel, E.g, dataset folder named 1,2,3..."
    )
    parser.add_argument(
        "--detRootPath",
        type=str,
        default="../train_data/det",
        help="the path where the divided detection dataset is placed")
    parser.add_argument(
        "--recRootPath",
        type=str,
        default="../train_data/rec",
        help="the path where the divided recognition dataset is placed"
    )
    parser.add_argument(
        "--detLabelFileName",
        type=str,
        default="Label.txt",
        help="the name of the detection annotation file")
    parser.add_argument(
        "--recLabelFileName",
        type=str,
        default="rec_gt.txt",
        help="the name of the recognition annotation file"
    )
    parser.add_argument(
        "--recImageDirName",
        type=str,
        default="crop_img",
        help="the name of the folder where the cropped recognition dataset is located"
    )
    args = parser.parse_args()
    genDetRecTrainVal(args)
__version_info__ = ('1', '0', '0')
__version__ = '.'.join(__version_info__)