"...git@developer.sourcefind.cn:modelzoo/qwen_lmdeploy.git" did not exist on "7283781e605bf29fb075bfaf12748e11d99e16bb"
Unverified Commit fee2c17b authored by MissPenguin's avatar MissPenguin Committed by GitHub
Browse files

Merge branch 'develop' into develop

parents da75ef8b bad9f6cd
# Android Demo 快速测试
### 1. 安装最新版本的Android Studio
可以从 https://developer.android.com/studio 下载。本Demo使用是4.0版本Android Studio编写。
### 2. 创建新项目
Demo测试的时候使用的是NDK 20b版本,20版本以上均可以支持编译成功。
如果您是初学者,可以用以下方式安装和测试NDK编译环境。
点击 File -> New ->New Project, 新建 "Native C++" project
1. Start a new Android Studio project
在项目模版中选择 Native C++ 选择PaddleOCR/depoly/android_demo 路径
进入项目后会自动编译,第一次编译会花费较长的时间,建议添加代理加速下载。
**代理添加:**
选择 Android Studio -> Perferences -> Appearance & Behavior -> System Settings -> HTTP Proxy -> Manual proxy configuration
![](../demo/proxy.png)
2. 开始编译
点击编译按钮,连接手机,跟着Android Studio的引导完成操作。
在 Android Studio 里看到下图,表示编译完成:
![](../demo/build.png)
**提示:** 此时如果出现下列找不到OpenCV的报错信息,请重新点击编译,编译完成后退出项目,再次进入。
![](../demo/error.png)
### 3. 发送到手机端
完成编译,点击运行,在手机端查看效果。
### 4. 如何自定义demo图片
1. 图片存放路径:android_demo/app/src/main/assets/images
将自定义图片放置在该路径下
2. 配置文件: android_demo/app/src/main/res/values/strings.xml
修改 IMAGE_PATH_DEFAULT 为自定义图片名即可
# 获得更多支持
前往[端计算模型生成平台EasyEdge](https://ai.baidu.com/easyedge/app/open_source_demo?referrerUrl=paddlelite),获得更多开发支持:
- Demo APP:可使用手机扫码安装,方便手机端快速体验文字识别
- SDK:模型被封装为适配不同芯片硬件和操作系统SDK,包括完善的接口,方便进行二次开发
......@@ -60,6 +60,8 @@
| beta1 | 设置一阶矩估计的指数衰减率 | 0.9 | \ |
| beta2 | 设置二阶矩估计的指数衰减率 | 0.999 | \ |
| decay | 是否使用decay | \ | \ |
| function(decay) | 设置decay方式 | cosine_decay | 目前只支持cosin_decay |
| step_each_epoch | 每个epoch包含多少次迭代 | 20 | 计算方式:total_image_num / (batch_size_per_card * card_size) |
| total_epoch | 总共迭代多少个epoch | 1000 | 与Global.epoch_num 一致 |
| function(decay) | 设置decay方式 | - | 目前支持cosine_decay与piecewise_decay |
| step_each_epoch | 每个epoch包含多少次迭代, cosine_decay时有效 | 20 | 计算方式:total_image_num / (batch_size_per_card * card_size) |
| total_epoch | 总共迭代多少个epoch, cosine_decay时有效 | 1000 | 与Global.epoch_num 一致 |
| boundaries | 学习率下降时的迭代次数间隔, piecewise_decay时有效 | - | 参数为列表形式 |
| decay_rate | 学习率衰减系数, piecewise_decay时有效 | - | \ |
......@@ -26,7 +26,7 @@ wget -P ./train_data/ https://paddleocr.bj.bcebos.com/dataset/test_icdar2015_la
提供的标注文件格式为,其中中间是"\t"分隔:
```
" 图像文件名 json.dumps编码的图像标注信息"
ch4_test_images/img_61.jpg [{"transcription": "MASA", "points": [[310, 104], [416, 141], [418, 216], [312, 179]], ...}]
ch4_test_images/img_61.jpg [{"transcription": "MASA", "points": [[310, 104], [416, 141], [418, 216], [312, 179]]}, {...}]
```
json.dumps编码前的图像标注信息是包含多个字典的list,字典中的 `points` 表示文本框的四个点的坐标(x, y),从左上角的点开始顺时针排列。
`transcription` 表示当前文本框的文字,在文本检测任务中并不需要这个信息。
......
......@@ -3,15 +3,15 @@
经测试PaddleOCR可在glibc 2.23上运行,您也可以测试其他glibc版本或安装glic 2.23
PaddleOCR 工作环境
- PaddlePaddle 1.7+
- python3
- python3.7
- glibc 2.23
- cuDNN 7.6+ (GPU)
建议使用我们提供的docker运行PaddleOCR,有关docker使用请参考[链接](https://docs.docker.com/get-started/)
建议使用我们提供的docker运行PaddleOCR,有关docker、nvidia-docker使用请参考[链接](https://docs.docker.com/get-started/)
*如您希望使用 mac 或 windows直接运行预测代码,可以从第2步开始执行。*
1. (建议)准备docker环境。第一次使用这个镜像,会自动下载该镜像,请耐心等待。
**1. (建议)准备docker环境。第一次使用这个镜像,会自动下载该镜像,请耐心等待。**
```
# 切换到工作目录下
cd /home/Projects
......@@ -21,10 +21,10 @@ cd /home/Projects
如果您希望在CPU环境下使用docker,使用docker而不是nvidia-docker创建docker
sudo docker run --name ppocr -v $PWD:/paddle --network=host -it hub.baidubce.com/paddlepaddle/paddle:latest-gpu-cuda9.0-cudnn7-dev /bin/bash
如果您的机器安装的是CUDA9,请运行以下命令创建容器
如果使用CUDA9,请运行以下命令创建容器
sudo nvidia-docker run --name ppocr -v $PWD:/paddle --network=host -it hub.baidubce.com/paddlepaddle/paddle:latest-gpu-cuda9.0-cudnn7-dev /bin/bash
如果您的机器安装的是CUDA10,请运行以下命令创建容器
如果使用CUDA10,请运行以下命令创建容器
sudo nvidia-docker run --name ppocr -v $PWD:/paddle --network=host -it hub.baidubce.com/paddlepaddle/paddle:latest-gpu-cuda10.0-cudnn7-dev /bin/bash
您也可以访问[DockerHub](https://hub.docker.com/r/paddlepaddle/paddle/tags/)获取与您机器适配的镜像。
......@@ -47,7 +47,7 @@ docker images
hub.baidubce.com/paddlepaddle/paddle latest-gpu-cuda9.0-cudnn7-dev f56310dcc829
```
2. 安装PaddlePaddle Fluid v1.7
**2. 安装PaddlePaddle Fluid v1.7**
```
pip3 install --upgrade pip
......@@ -64,7 +64,7 @@ python3 -m pip install paddlepaddle==1.7.2 -i https://pypi.tuna.tsinghua.edu.cn/
更多的版本需求,请参照[安装文档](https://www.paddlepaddle.org.cn/install/quick)中的说明进行操作。
```
3. 克隆PaddleOCR repo代码
**3. 克隆PaddleOCR repo代码**
```
【推荐】git clone https://github.com/PaddlePaddle/PaddleOCR
......@@ -75,8 +75,11 @@ git clone https://gitee.com/paddlepaddle/PaddleOCR
注:码云托管代码可能无法实时同步本github项目更新,存在3~5天延时,请优先使用推荐方式。
```
4. 安装第三方库
**4. 安装第三方库**
```
cd PaddleOCR
pip3 install -r requirments.txt
```
注意,windows环境下,建议从[这里](https://www.lfd.uci.edu/~gohlke/pythonlibs/#shapely)下载shapely安装包完成安装,
直接通过pip安装的shapely库可能出现`[winRrror 126] 找不到指定模块的问题`
......@@ -21,12 +21,11 @@ ln -sf <path/to/dataset> <path/to/paddle_ocr>/train_data/dataset
* 使用自己数据集:
若您希望使用自己的数据进行训练,请参考下文组织您的数据。
- 训练集
首先请将训练图片放入同一个文件夹(train_images),并用一个txt文件(rec_gt_train.txt)记录图片路径和标签。
* 注意: 默认请将图片路径和图片标签用 \t 分割,如用其他方式分割将造成训练报错
**注意:** 默认请将图片路径和图片标签用 \t 分割,如用其他方式分割将造成训练报错
```
" 图像文件名 图像标注信息 "
......@@ -41,12 +40,9 @@ PaddleOCR 提供了一份用于训练 icdar2015 数据集的标签文件,通
wget -P ./train_data/ic15_data https://paddleocr.bj.bcebos.com/dataset/rec_gt_train.txt
# 测试集标签
wget -P ./train_data/ic15_data https://paddleocr.bj.bcebos.com/dataset/rec_gt_test.txt
```
最终训练集应有如下文件结构:
```
|-train_data
|-ic15_data
......@@ -150,7 +146,7 @@ PaddleOCR支持训练和评估交替进行, 可以在 `configs/rec/rec_icdar15_t
如果验证集很大,测试将会比较耗时,建议减少评估次数,或训练完再进行评估。
* 提示: 可通过 -c 参数选择 `configs/rec/` 路径下的多种模型配置进行训练,PaddleOCR支持的识别算法有:
**提示:** 可通过 -c 参数选择 `configs/rec/` 路径下的多种模型配置进行训练,PaddleOCR支持的识别算法有:
| 配置文件 | 算法名称 | backbone | trans | seq | pred |
......
......@@ -28,21 +28,38 @@ deploy/hubserving/ocr_system/
# 安装paddlehub
pip3 install paddlehub --upgrade -i https://pypi.tuna.tsinghua.edu.cn/simple
# 设置环境变量
# 在Linux下设置环境变量
export PYTHONPATH=.
```
# 在Windows下设置环境变量
SET PYTHONPATH=.
```
### 2. 安装服务模块
PaddleOCR提供3种服务模块,根据需要安装所需模块。如:
PaddleOCR提供3种服务模块,根据需要安装所需模块。
安装检测服务模块:
```hub install deploy/hubserving/ocr_det/```
* 在Linux环境下,安装示例如下:
```shell
# 安装检测服务模块:
hub install deploy/hubserving/ocr_det/
或,安装识别服务模块:
```hub install deploy/hubserving/ocr_rec/```
# 或,安装识别服务模块:
hub install deploy/hubserving/ocr_rec/
或,安装检测+识别串联服务模块:
```hub install deploy/hubserving/ocr_system/```
# 或,安装检测+识别串联服务模块:
hub install deploy/hubserving/ocr_system/
```
* 在Windows环境下(文件夹的分隔符为`\`),安装示例如下:
```shell
# 安装检测服务模块:
hub install deploy\hubserving\ocr_det\
# 或,安装识别服务模块:
hub install deploy\hubserving\ocr_rec\
# 或,安装检测+识别串联服务模块:
hub install deploy\hubserving\ocr_system\
```
### 3. 启动服务
#### 方式1. 命令行命令启动(仅支持CPU)
......@@ -69,9 +86,9 @@ $ hub serving start --modules [Module1==Version1, Module2==Version2, ...] \
#### 方式2. 配置文件启动(支持CPU、GPU)
**启动命令:**
```hub serving start --config/-c config.json```
```hub serving start -c config.json```
其中,`config.json`格式如下:
其中,`config.json`格式如下:
```python
{
"modules_info": {
......@@ -124,7 +141,7 @@ hub serving start -c deploy/hubserving/ocr_system/config.json
## 返回结果格式说明
返回结果为列表(list),列表中的每一项为词典(dict),词典一共可能包含3种字段,信息如下:
|字段名称|数据类型|意义|
|字段名称|数据类型|意义|
|-|-|-|
|text|str|文本内容|
|confidence|float| 文本识别置信度|
......@@ -134,9 +151,9 @@ hub serving start -c deploy/hubserving/ocr_system/config.json
|字段名/模块名|ocr_det|ocr_rec|ocr_system|
|-|-|-|-|
|text||✔|✔|
|confidence||✔|✔|
|text_region|✔||✔|
|text||✔|✔|
|confidence||✔|✔|
|text_region|✔||✔|
**说明:** 如果需要增加、删除、修改返回字段,可在相应模块的`module.py`文件中进行修改,完整流程参考下一节自定义修改服务模块。
......@@ -157,4 +174,3 @@ hub serving start -c deploy/hubserving/ocr_system/config.json
- 5、重新启动服务
```hub serving start -m ocr_system```
# 更新
- 2020.7.23 发布7月21日B站直播课回放和PPT,PaddleOCR开源大礼包全面解读,[获取地址](https://aistudio.baidu.com/aistudio/course/introduce/1519)
- 2020.7.15 添加基于EasyEdge和Paddle-Lite的移动端DEMO,支持iOS和Android系统
- 2020.7.15 完善预测部署,添加基于C++预测引擎推理、服务化部署和端侧部署方案,以及超轻量级中文OCR模型预测耗时Benchmark
- 2020.7.15 整理OCR相关数据集、常用数据标注以及合成工具
......
# Android Demo quick start
### 1. Install the latest version of Android Studio
It can be downloaded from https://developer.android.com/studio . This Demo is written by Android Studio version 4.0.
### 2. Create a new project
The NDK version 20b is used in the demo test, and the compilation can be successfully supported for version 20 and above.
If you are a beginner, you can install and test the NDK compilation environment in the following ways.
File -> New ->New Project to create "Native C++" project
1. Start a new Android Studio project
Select Native C++ in the project template, select Paddle OCR/deploy/android_demo path
After entering the project, it will be automatically compiled. The first compilation
will take a long time. It is recommended to add an agent to speed up the download.
**Agent add:**
Android Studio -> Perferences -> Appearance & Behavior -> System Settings -> HTTP Proxy -> Manual proxy configuration
![](../demo/proxy.png)
2. Start compilation
Click the compile button, connect the phone, and follow the instructions of Android Studio to complete the operation.
When you see the following picture in Android Studio, the compilation is complete:
![](../demo/build.png)
**Tip:** At this time, if the following error message that OpenCV cannot be found appears, please re-click compile,
exit the project after compiling, and enter again.
![](../demo/error.png)
### 3. Send to mobile
Complete the compilation, click Run, and check the effect on the mobile phone.
### 4. How to customize the demo picture
1. Image storage path: android_demo/app/src/main/assets/images
Place the custom picture under this path
2. Configuration file: android_demo/app/src/main/res/values/strings.xml
Modify IMAGE_PATH_DEFAULT to a custom picture name
# Get more support
Go to [EasyEdge](https://ai.baidu.com/easyedge/app/open_source_demo?referrerUrl=paddlelite) to get more development support:
- Demo APP: You can use your mobile phone to scan the code to install, which is convenient for the mobile terminal to quickly experience text recognition
- SDK: The model is packaged to adapt to different chip hardware and operating system SDKs, including a complete interface to facilitate secondary development
......@@ -60,6 +60,8 @@ Take `rec_icdar15_train.yml` as an example:
| beta1 | Set the exponential decay rate for the 1st moment estimates | 0.9 | \ |
| beta2 | Set the exponential decay rate for the 2nd moment estimates | 0.999 | \ |
| decay | Whether to use decay | \ | \ |
| function(decay) | Set the decay function | cosine_decay | Only support cosine_decay |
| step_each_epoch | The number of steps in an epoch. | 20 | Calculation :total_image_num / (batch_size_per_card * card_size) |
| total_epoch | The number of epochs | 1000 | Consistent with Global.epoch_num |
| function(decay) | Set the decay function | cosine_decay | Support cosine_decay and piecewise_decay |
| step_each_epoch | The number of steps in an epoch. Used in cosine_decay | 20 | Calculation :total_image_num / (batch_size_per_card * card_size) |
| total_epoch | The number of epochs. Used in cosine_decay | 1000 | Consistent with Global.epoch_num |
| boundaries | The step intervals to reduce learning rate. Used in piecewise_decay | - | The format is list |
| decay_rate | Learning rate decay rate. Used in piecewise_decay | - | \ |
......@@ -25,7 +25,7 @@ After decompressing the data set and downloading the annotation file, PaddleOCR/
The provided annotation file format is as follow, seperated by "\t":
```
" Image file name Image annotation information encoded by json.dumps"
ch4_test_images/img_61.jpg [{"transcription": "MASA", "points": [[310, 104], [416, 141], [418, 216], [312, 179]], ...}]
ch4_test_images/img_61.jpg [{"transcription": "MASA", "points": [[310, 104], [416, 141], [418, 216], [312, 179]]}, {...}]
```
The image annotation after json.dumps() encoding is a list containing multiple dictionaries. The `points` in the dictionary represent the coordinates (x, y) of the four points of the text box, arranged clockwise from the point at the upper left corner.
......
......@@ -4,28 +4,28 @@ After testing, paddleocr can run on glibc 2.23. You can also test other glibc ve
PaddleOCR working environment:
- PaddlePaddle1.7
- python3
- python3.7
- glibc 2.23
It is recommended to use the docker provided by us to run PaddleOCR, please refer to the use of docker [link](https://docs.docker.com/get-started/).
*If you want to directly run the prediction code on mac or windows, you can start from step 2.*
1. (Recommended) Prepare a docker environment. The first time you use this image, it will be downloaded automatically. Please be patient.
**1. (Recommended) Prepare a docker environment. The first time you use this image, it will be downloaded automatically. Please be patient.**
```
# Switch to the working directory
cd /home/Projects
# You need to create a docker container for the first run, and do not need to run the current command when you run it again
# Create a docker container named ppocr and map the current directory to the /paddle directory of the container
#If you want to use docker in a CPU environment, use docker instead of nvidia-docker to create docker
#If using CPU, use docker instead of nvidia-docker to create docker
sudo docker run --name ppocr -v $PWD:/paddle --network=host -it hub.baidubce.com/paddlepaddle/paddle:latest-gpu-cuda9.0-cudnn7-dev /bin/bash
```
If you have cuda9 installed on your machine, please run the following command to create a container:
If using CUDA9, please run the following command to create a container:
```
sudo nvidia-docker run --name ppocr -v $PWD:/paddle --network=host -it hub.baidubce.com/paddlepaddle/paddle:latest-gpu-cuda9.0-cudnn7-dev /bin/bash
```
If you have cuda10 installed on your machine, please run the following command to create a container:
If using CUDA10, please run the following command to create a container:
```
sudo nvidia-docker run --name ppocr -v $PWD:/paddle --network=host -it hub.baidubce.com/paddlepaddle/paddle:latest-gpu-cuda10.0-cudnn7-dev /bin/bash
```
......@@ -49,7 +49,7 @@ docker images
hub.baidubce.com/paddlepaddle/paddle latest-gpu-cuda9.0-cudnn7-dev f56310dcc829
```
2. Install PaddlePaddle Fluid v1.7 (the higher version is not supported yet, the adaptation work is in progress)
**2. Install PaddlePaddle Fluid v1.7 (the higher version is not supported yet, the adaptation work is in progress)**
```
pip3 install --upgrade pip
......@@ -65,7 +65,7 @@ python3 -m pip install paddlepaddle==1.7.2 -i https://pypi.tuna.tsinghua.edu.cn/
For more software version requirements, please refer to the instructions in [Installation Document](https://www.paddlepaddle.org.cn/install/quick) for operation.
3. Clone PaddleOCR repo
**3. Clone PaddleOCR repo**
```
# Recommend
git clone https://github.com/PaddlePaddle/PaddleOCR
......@@ -77,8 +77,14 @@ git clone https://gitee.com/paddlepaddle/PaddleOCR
# Note: The cloud-hosting code may not be able to synchronize the update with this GitHub project in real time. There might be a delay of 3-5 days. Please give priority to the recommended method.
```
4. Install third-party libraries
**4. Install third-party libraries**
```
cd PaddleOCR
pip3 install -r requirments.txt
```
If you getting this error `OSError: [WinError 126] The specified module could not be found` when you install shapely on windows.
Please try to download Shapely whl file using [http://www.lfd.uci.edu/~gohlke/pythonlibs/#shapely](http://www.lfd.uci.edu/~gohlke/pythonlibs/#shapely).
Reference: [Solve shapely installation on windows](https://stackoverflow.com/questions/44398265/install-shapely-oserror-winerror-126-the-specified-module-could-not-be-found)
......@@ -29,25 +29,38 @@ The following steps take the 2-stage series service as an example. If only the d
# Install paddlehub
pip3 install paddlehub --upgrade -i https://pypi.tuna.tsinghua.edu.cn/simple
# Set environment variables
# Set environment variables on Linux
export PYTHONPATH=.
```
# Set environment variables on Windows
SET PYTHONPATH=.
```
### 2. Install Service Module
PaddleOCR provides 3 kinds of service modules, install the required modules according to your needs. Such as:
PaddleOCR provides 3 kinds of service modules, install the required modules according to your needs.
Install the detection service module:
* On Linux platform, the examples are as follows.
```shell
# Install the detection service module:
hub install deploy/hubserving/ocr_det/
```
Or, install the recognition service module:
```shell
# Or, install the recognition service module:
hub install deploy/hubserving/ocr_rec/
```
Or, install the 2-stage series service module:
```shell
# Or, install the 2-stage series service module:
hub install deploy/hubserving/ocr_system/
```
```
* On Windows platform, the examples are as follows.
```shell
# Install the detection service module:
hub install deploy\hubserving\ocr_det\
# Or, install the recognition service module:
hub install deploy\hubserving\ocr_rec\
# Or, install the 2-stage series service module:
hub install deploy\hubserving\ocr_system\
```
### 3. Start service
#### Way 1. Start with command line parameters (CPU only)
......@@ -119,7 +132,7 @@ python tools/test_hubserving.py server_url image_path
```
Two parameters need to be passed to the script:
- **server_url**:service address,format of which is
- **server_url**:service address,format of which is
`http://[ip_address]:[port]/predict/[module_name]`
For example, if the detection, recognition and 2-stage serial services are started with provided configuration files, the respective `server_url` would be:
`http://127.0.0.1:8866/predict/ocr_det`
......@@ -135,7 +148,7 @@ python tools/test_hubserving.py http://127.0.0.1:8868/predict/ocr_system ./doc/i
## Returned result format
The returned result is a list. Each item in the list is a dict. The dict may contain three fields. The information is as follows:
|field name|data type|description|
|field name|data type|description|
|-|-|-|
|text|str|text content|
|confidence|float|text recognition confidence|
......@@ -145,9 +158,9 @@ The fields returned by different modules are different. For example, the results
|field name/module name|ocr_det|ocr_rec|ocr_system|
|-|-|-|-|
|text||✔|✔|
|confidence||✔|✔|
|text_region|✔||✔|
|text||✔|✔|
|confidence||✔|✔|
|text_region|✔||✔|
**Note:** If you need to add, delete or modify the returned fields, you can modify the file `module.py` of the corresponding module. For the complete process, refer to the user-defined modification service module in the next section.
......
# RECENT UPDATES
- 2020.7.23, Release the playback and PPT of live class on BiliBili station, PaddleOCR Introduction, [address](https://aistudio.baidu.com/aistudio/course/introduce/1519)
- 2020.7.15, Add mobile App demo , support both iOS and Android ( based on easyedge and Paddle Lite)
- 2020.7.15, Improve the deployment ability, add the C + + inference , serving deployment. In addtion, the benchmarks of the ultra-lightweight Chinese OCR model are provided.
- 2020.7.15, Add several related datasets, data annotation and synthesis tools.
......
......@@ -41,7 +41,7 @@ class TrainReader(object):
"absence process_function in Reader"
self.process = create_module(params['process_function'])(params)
def __call__(self, process_id):
def __call__(self, process_id):
def sample_iter_reader():
with open(self.label_file_path, "rb") as fin:
label_infor_list = fin.readlines()
......
......@@ -17,7 +17,7 @@ import cv2
import numpy as np
import json
import sys
from ppocr.utils.utility import initial_logger
from ppocr.utils.utility import initial_logger, check_and_read_gif
logger = initial_logger()
from .data_augment import AugmentData
......@@ -100,7 +100,9 @@ class DBProcessTrain(object):
def __call__(self, label_infor):
img_path, gt_label = self.convert_label_infor(label_infor)
imgvalue = cv2.imread(img_path)
imgvalue, flag = check_and_read_gif(img_path)
if not flag:
imgvalue = cv2.imread(img_path)
if imgvalue is None:
logger.info("{} does not exist!".format(img_path))
return None
......
......@@ -17,6 +17,7 @@ import cv2
import numpy as np
import json
import sys
import os
class EASTProcessTrain(object):
def __init__(self, params):
......@@ -52,7 +53,7 @@ class EASTProcessTrain(object):
label_infor = label_infor.decode()
label_infor = label_infor.encode('utf-8').decode('utf-8-sig')
substr = label_infor.strip("\n").split("\t")
img_path = self.img_set_dir + substr[0]
img_path = os.path.join(self.img_set_dir, substr[0])
label = json.loads(substr[1])
nBox = len(label)
wordBBs, txts, txt_tags = [], [], []
......
......@@ -185,6 +185,7 @@ class SimpleReader(object):
if params['mode'] != 'test':
self.img_set_dir = params['img_set_dir']
self.label_file_path = params['label_file_path']
self.use_gpu = params['use_gpu']
self.char_ops = params['char_ops']
self.image_shape = params['image_shape']
self.loss_type = params['loss_type']
......@@ -213,6 +214,15 @@ class SimpleReader(object):
if self.mode != 'train':
process_id = 0
def get_device_num():
if self.use_gpu:
gpus = os.environ.get("CUDA_VISIBLE_DEVICES", 1)
gpu_num = len(gpus.split(','))
return gpu_num
else:
cpu_num = os.environ.get("CPU_NUM", 1)
return int(cpu_num)
def sample_iter_reader():
if self.mode != 'train' and self.infer_img is not None:
image_file_list = get_image_file_list(self.infer_img)
......@@ -233,10 +243,16 @@ class SimpleReader(object):
img_num = len(label_infor_list)
img_id_list = list(range(img_num))
random.shuffle(img_id_list)
if sys.platform == "win32":
if sys.platform == "win32" and self.num_workers != 1:
print("multiprocess is not fully compatible with Windows."
"num_workers will be 1.")
self.num_workers = 1
if self.batch_size * get_device_num(
) * self.num_workers > img_num:
raise Exception(
"The number of the whole data ({}) is smaller than the batch_size * devices_num * num_workers ({})".
format(img_num, self.batch_size * get_device_num() *
self.num_workers))
for img_id in range(process_id, img_num, self.num_workers):
label_infor = label_infor_list[img_id_list[img_id]]
substr = label_infor.decode('utf-8').strip("\n").split("\t")
......
......@@ -360,7 +360,7 @@ def process_image(img,
text = char_ops.encode(label)
if len(text) == 0 or len(text) > max_text_length:
logger.info(
"Warning in ppocr/data/rec/img_tools.py:line362: Wrong data type."
"Warning in ppocr/data/rec/img_tools.py: Wrong data type."
"Excepted string with length between 1 and {}, but "
"got '{}'. Label is '{}'".format(max_text_length,
len(text), label))
......
......@@ -31,16 +31,28 @@ __all__ = [
class MobileNetV3():
def __init__(self, params):
self.scale = params['scale']
model_name = params['model_name']
self.scale = params.get("scale", 0.5)
model_name = params.get("model_name", "small")
large_stride = params.get("large_stride", [1, 2, 2, 2])
small_stride = params.get("small_stride", [2, 2, 2, 2])
assert isinstance(large_stride, list), "large_stride type must " \
"be list but got {}".format(type(large_stride))
assert isinstance(small_stride, list), "small_stride type must " \
"be list but got {}".format(type(small_stride))
assert len(large_stride) == 4, "large_stride length must be " \
"4 but got {}".format(len(large_stride))
assert len(small_stride) == 4, "small_stride length must be " \
"4 but got {}".format(len(small_stride))
self.inplanes = 16
if model_name == "large":
self.cfg = [
# k, exp, c, se, nl, s,
[3, 16, 16, False, 'relu', 1],
[3, 64, 24, False, 'relu', (2, 1)],
[3, 16, 16, False, 'relu', large_stride[0]],
[3, 64, 24, False, 'relu', (large_stride[1], 1)],
[3, 72, 24, False, 'relu', 1],
[5, 72, 40, True, 'relu', (2, 1)],
[5, 72, 40, True, 'relu', (large_stride[2], 1)],
[5, 120, 40, True, 'relu', 1],
[5, 120, 40, True, 'relu', 1],
[3, 240, 80, False, 'hard_swish', 1],
......@@ -49,7 +61,7 @@ class MobileNetV3():
[3, 184, 80, False, 'hard_swish', 1],
[3, 480, 112, True, 'hard_swish', 1],
[3, 672, 112, True, 'hard_swish', 1],
[5, 672, 160, True, 'hard_swish', (2, 1)],
[5, 672, 160, True, 'hard_swish', (large_stride[3], 1)],
[5, 960, 160, True, 'hard_swish', 1],
[5, 960, 160, True, 'hard_swish', 1],
]
......@@ -58,15 +70,15 @@ class MobileNetV3():
elif model_name == "small":
self.cfg = [
# k, exp, c, se, nl, s,
[3, 16, 16, True, 'relu', (2, 1)],
[3, 72, 24, False, 'relu', (2, 1)],
[3, 16, 16, True, 'relu', (small_stride[0], 1)],
[3, 72, 24, False, 'relu', (small_stride[1], 1)],
[3, 88, 24, False, 'relu', 1],
[5, 96, 40, True, 'hard_swish', (2, 1)],
[5, 96, 40, True, 'hard_swish', (small_stride[2], 1)],
[5, 240, 40, True, 'hard_swish', 1],
[5, 240, 40, True, 'hard_swish', 1],
[5, 120, 48, True, 'hard_swish', 1],
[5, 144, 48, True, 'hard_swish', 1],
[5, 288, 96, True, 'hard_swish', (2, 1)],
[5, 288, 96, True, 'hard_swish', (small_stride[3], 1)],
[5, 576, 96, True, 'hard_swish', 1],
[5, 576, 96, True, 'hard_swish', 1],
]
......@@ -78,7 +90,7 @@ class MobileNetV3():
supported_scale = [0.35, 0.5, 0.75, 1.0, 1.25]
assert self.scale in supported_scale, \
"supported scale are {} but input scale is {}".format(supported_scale, scale)
"supported scales are {} but input scale is {}".format(supported_scale, self.scale)
def __call__(self, input):
scale = self.scale
......
......@@ -32,6 +32,7 @@ class CTCPredict(object):
self.char_num = params['char_num']
self.encoder = SequenceEncoder(params)
self.encoder_type = params['encoder_type']
self.fc_decay = params.get("fc_decay", 0.0004)
def __call__(self, inputs, labels=None, mode=None):
encoder_features = self.encoder(inputs)
......@@ -39,7 +40,7 @@ class CTCPredict(object):
encoder_features = fluid.layers.concat(encoder_features, axis=1)
name = "ctc_fc"
para_attr, bias_attr = get_para_bias_attr(
l2_decay=0.0004, k=encoder_features.shape[1], name=name)
l2_decay=self.fc_decay, k=encoder_features.shape[1], name=name)
predict = fluid.layers.fc(input=encoder_features,
size=self.char_num + 1,
param_attr=para_attr,
......
Markdown is supported
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment