"...git@developer.sourcefind.cn:chenpangpang/open-webui.git" did not exist on "07283ca0cf4a0e6b7559382ec1c90a153f233cd1"
Commit 2735e9e3 authored by LDOUBLEV

Merge branch 'dygraph' of https://github.com/PaddlePaddle/PaddleOCR into dyg_db

parents 493a7171 52671b7d
@@ -4,16 +4,18 @@
 PaddleOCR aims to build a rich, leading, and practical OCR toolkit that helps users train better models and put them to use in real applications.
 **Recent updates**
+- 2020.12.07 Added 5 frequently asked questions to the [FAQ](./doc/doc_ch/FAQ.md), bringing the total to 124; updates are planned every Monday, so stay tuned.
+- 2020.11.25 Released the semi-automatic annotation tool [PPOCRLabel](./PPOCRLabel/README.md), which helps developers label data efficiently; its output format plugs directly into PP-OCR training.
 - 2020.9.22 Published the PP-OCR technical report: https://arxiv.org/abs/2009.09941
-- 2020.9.19 Released the ultra-lightweight compressed ppocr_mobile_slim models, 3.5M overall (see [PP-OCR Pipline](#PP-OCR)), suitable for deployment on mobile devices. [Model download](#模型下载)
+- 2020.9.19 Released the ultra-lightweight compressed ppocr_mobile_slim models, 3.5M overall (see [PP-OCR Pipeline](#PP-OCR)), suitable for deployment on mobile devices. [Model download](#模型下载)
 - 2020.9.17 Released the ultra-lightweight ppocr_mobile series and general-purpose ppocr_server series Chinese/English OCR models, with accuracy comparable to commercial offerings. [Model download](#模型下载)
 - 2020.9.17 Released the [English recognition model](./doc/doc_ch/models_list.md#英文识别模型) and [multilingual recognition models](doc/doc_ch/models_list.md#多语言识别模型); `German, French, Japanese, Korean` are now supported, and more languages will follow.
-- 2020.8.26 Added 84 common OCR questions and answers; see the [FAQ](./doc/doc_ch/FAQ.md)
 - 2020.8.24 PaddleOCR can now be installed and used as a whl package; see the [PaddleOCR package usage guide](./doc/doc_ch/whl.md) (a usage sketch follows this hunk)
 - 2020.8.21 Uploaded the replay and slides of the August 18 Bilibili live course, Lesson 2: an easy-to-learn, easy-to-use OCR toolkit. [Get them here](https://aistudio.baidu.com/aistudio/education/group/info/1519)
 - [More](./doc/doc_ch/update.md)
 ## Features
 - High-quality PP-OCR pretrained models with accurate recognition
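The 2020.8.24 entry above refers to installing PaddleOCR as a whl package. A minimal usage sketch, assuming `pip install paddleocr` has already been run and a local test image exists at the path shown (the image path is illustrative; the linked whl guide is the authoritative reference for options):

```python
# Minimal sketch of the whl-package workflow referenced in the 2020.8.24 update.
# Assumes: `pip install paddleocr` has completed and ./doc/imgs/11.jpg exists locally.
from paddleocr import PaddleOCR

# use_angle_cls enables the direction classifier; lang="ch" selects the Chinese/English models.
ocr = PaddleOCR(use_angle_cls=True, lang="ch")
result = ocr.ocr("./doc/imgs/11.jpg", cls=True)
for line in result:
    print(line)  # each entry holds a text box, the recognized text, and a confidence score
```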
@@ -48,15 +50,14 @@ PaddleOCR aims to build a rich, leading, and practical OCR toolkit, helping
 - Try the code: start with [Quick installation](./doc/doc_ch/installation.md)
 <a name="模型下载"></a>
-## PP-OCR 1.1 series model list (updated September 17)
+## PP-OCR 2.0 series model list (update
 | Model description | Model name | Recommended scenario | Detection model | Direction classifier | Recognition model |
 | ------------ | --------------- | ----------------|---- | ---------- | -------- |
-| Chinese/English ultra-lightweight OCR model (8.1M) | ch_ppocr_mobile_v1.1_xx | Mobile & server | [Inference model](https://paddleocr.bj.bcebos.com/20-09-22/mobile/det/ch_ppocr_mobile_v1.1_det_infer.tar) / [Pretrained model](https://paddleocr.bj.bcebos.com/20-09-22/mobile/det/ch_ppocr_mobile_v1.1_det_train.tar) | [Inference model](https://paddleocr.bj.bcebos.com/20-09-22/cls/ch_ppocr_mobile_v1.1_cls_infer.tar) / [Pretrained model](https://paddleocr.bj.bcebos.com/20-09-22/cls/ch_ppocr_mobile_v1.1_cls_train.tar) | [Inference model](https://paddleocr.bj.bcebos.com/20-09-22/mobile/rec/ch_ppocr_mobile_v1.1_rec_infer.tar) / [Pretrained model](https://paddleocr.bj.bcebos.com/20-09-22/mobile/rec/ch_ppocr_mobile_v1.1_rec_pre.tar) |
+| Chinese/English ultra-lightweight OCR model (8.1M) | ch_ppocr_mobile_v2.0_xx | Mobile & server | [Inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_infer.tar) / [Pretrained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_train.tar) | [Inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar) / [Pretrained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_train.tar) | [Inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_infer.tar) / [Pretrained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_pre.tar) |
-| Chinese/English general-purpose OCR model (155.1M) | ch_ppocr_server_v1.1_xx | Server | [Inference model](https://paddleocr.bj.bcebos.com/20-09-22/server/det/ch_ppocr_server_v1.1_det_infer.tar) / [Pretrained model](https://paddleocr.bj.bcebos.com/20-09-22/server/det/ch_ppocr_server_v1.1_det_train.tar) | [Inference model](https://paddleocr.bj.bcebos.com/20-09-22/cls/ch_ppocr_mobile_v1.1_cls_infer.tar) / [Pretrained model](https://paddleocr.bj.bcebos.com/20-09-22/cls/ch_ppocr_mobile_v1.1_cls_train.tar) | [Inference model](https://paddleocr.bj.bcebos.com/20-09-22/server/rec/ch_ppocr_server_v1.1_rec_infer.tar) / [Pretrained model](https://paddleocr.bj.bcebos.com/20-09-22/server/rec/ch_ppocr_server_v1.1_rec_pre.tar) |
+| Chinese/English general-purpose OCR model (143M) | ch_ppocr_server_v2.0_xx | Server | [Inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_det_infer.tar) / [Pretrained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_det_train.tar) | [Inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar) / [Pretrained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_train.tar) | [Inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_rec_infer.tar) / [Pretrained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_rec_pre.tar) |
-| Chinese/English ultra-lightweight compressed OCR model (3.5M) | ch_ppocr_mobile_slim_v1.1_xx | Mobile | [Inference model](https://paddleocr.bj.bcebos.com/20-09-22/mobile-slim/det/ch_ppocr_mobile_v1.1_det_prune_infer.tar) / [Slim model](https://paddleocr.bj.bcebos.com/20-09-22/mobile/lite/ch_ppocr_mobile_v1.1_det_prune_opt.nb) | [Inference model](https://paddleocr.bj.bcebos.com/20-09-22/cls/ch_ppocr_mobile_v1.1_cls_quant_infer.tar) / [Slim model](https://paddleocr.bj.bcebos.com/20-09-22/mobile/lite/ch_ppocr_mobile_v1.1_cls_quant_opt.nb) | [Inference model](https://paddleocr.bj.bcebos.com/20-09-22/mobile-slim/rec/ch_ppocr_mobile_v1.1_rec_quant_infer.tar) / [Slim model](https://paddleocr.bj.bcebos.com/20-09-22/mobile/lite/ch_ppocr_mobile_v1.1_rec_quant_opt.nb) |
-For more model downloads (including multilingual models), see [PP-OCR v1.1 series model downloads](./doc/doc_ch/models_list.md)
+For more model downloads (including multilingual models), see [PP-OCR v2.0 series model downloads](./doc/doc_ch/models_list.md)
 ## Tutorials
 - [Quick installation](./doc/doc_ch/installation.md)
@@ -141,6 +142,7 @@ PP-OCR is a practical ultra-lightweight OCR system, built mainly from DB text detection, detection box
 ## Contributing
 Contributions to PaddleOCR are very welcome, and we greatly appreciate your feedback.
 - Many thanks to [Khanh Tran](https://github.com/xxxpsyduck) and [Karl Horky](https://github.com/karlhorky) for contributing to and revising the English documentation
 - Many thanks to [zhangxin](https://github.com/ZhangXinNan) ([Blog](https://blog.csdn.net/sdlypyzq)) for contributing a new visualization method, adding .gitignore, and removing the need to set the PYTHONPATH environment variable manually
 - Many thanks to [lyl120117](https://github.com/lyl120117) for contributing code that prints the network structure
@@ -148,3 +150,6 @@ PP-OCR is a practical ultra-lightweight OCR system, built mainly from DB text detection, detection box
 - Many thanks to [authorfu](https://github.com/authorfu) for contributing the Android demo and [xiadeye](https://github.com/xiadeye) for the iOS demo code
 - Many thanks to [BeyondYourself](https://github.com/BeyondYourself) for many excellent suggestions and for simplifying parts of the PaddleOCR code style.
 - Many thanks to [tangmq](https://gitee.com/tangmq) for adding Dockerized deployment to PaddleOCR, enabling quick publication of a callable RESTful API service.
+- Many thanks to [lijinhan](https://github.com/lijinhan) for contributing a Java Spring Boot example that calls the OCR Hubserving interface for serving-based OCR deployment.
+- Many thanks to [Mejans](https://github.com/Mejans) for adding the dictionary and corpus for the new language Occitan.
+- Many thanks to [Evezerest](https://github.com/Evezerest), [ninetailskim](https://github.com/ninetailskim), [edencfc](https://github.com/edencfc), [BeyondYourself](https://github.com/BeyondYourself), and [1084667371](https://github.com/1084667371) for contributing the complete PPOCRLabel code.
@@ -8,7 +8,6 @@ Global:
  # evaluation is run every 5000 iterations after the 4000th iteration
  eval_batch_step: [0, 1000]
  # if pretrained_model is saved in static mode, load_static_weights must set to True
-  load_static_weights: True
  cal_metric_during_train: True
  pretrained_model:
  checkpoints:
Global:
  use_gpu: true
  epoch_num: 1200
  log_smooth_window: 20
  print_batch_step: 2
  save_model_dir: ./output/det_r50_vd/
  save_epoch_step: 1200
  # evaluation is run every 5000 iterations after the 4000th iteration
  eval_batch_step: 8
  # if pretrained_model is saved in static mode, load_static_weights must set to True
  load_static_weights: True
  cal_metric_during_train: False
  pretrained_model: ./pretrain_models/ResNet50_vd_ssld_pretrained/
  checkpoints:
  save_inference_dir:
  use_visualdl: True
  infer_img: doc/imgs_en/img_10.jpg
  save_res_path: ./output/det_db/predicts_db.txt

Optimizer:
  name: Adam
  beta1: 0.9
  beta2: 0.999
  learning_rate:
    lr: 0.001
  regularizer:
    name: 'L2'
    factor: 0

Architecture:
  type: det
  algorithm: DB
  Transform:
  Backbone:
    name: ResNet
    layers: 50
  Neck:
    name: FPN
    out_channels: 256
  Head:
    name: DBHead
    k: 50

Loss:
  name: DBLoss
  balance_loss: true
  main_loss_type: DiceLoss
  alpha: 5
  beta: 10
  ohem_ratio: 3

PostProcess:
  name: DBPostProcess
  thresh: 0.3
  box_thresh: 0.6
  max_candidates: 1000
  unclip_ratio: 1.5

Metric:
  name: DetMetric
  main_indicator: hmean

TRAIN:
  dataset:
    name: SimpleDataSet
    data_dir: ./detection/
    file_list:
      - ./detection/train_icdar2015_label.txt # dataset1
    ratio_list: [1.0]
    transforms:
      - DecodeImage: # load image
          img_mode: BGR
          channel_first: False
      - DetLabelEncode: # Class handling label
      - IaaAugment:
          augmenter_args:
            - { 'type': Fliplr, 'args': { 'p': 0.5 } }
            - { 'type': Affine, 'args': { 'rotate': [ -10,10 ] } }
            - { 'type': Resize,'args': { 'size': [ 0.5,3 ] } }
      - EastRandomCropData:
          size: [ 640,640 ]
          max_tries: 50
          keep_ratio: true
      - MakeBorderMap:
          shrink_ratio: 0.4
          thresh_min: 0.3
          thresh_max: 0.7
      - MakeShrinkMap:
          shrink_ratio: 0.4
          min_text_size: 8
      - NormalizeImage:
          scale: 1./255.
          mean: [ 0.485, 0.456, 0.406 ]
          std: [ 0.229, 0.224, 0.225 ]
          order: 'hwc'
      - ToCHWImage:
      - keepKeys:
          keep_keys: ['image','threshold_map','threshold_mask','shrink_map','shrink_mask'] # dataloader will return list in this order
  loader:
    shuffle: True
    drop_last: False
    batch_size: 16
    num_workers: 8

EVAL:
  dataset:
    name: SimpleDataSet
    data_dir: ./detection/
    file_list:
      - ./detection/test_icdar2015_label.txt
    transforms:
      - DecodeImage: # load image
          img_mode: BGR
          channel_first: False
      - DetLabelEncode: # Class handling label
      - DetResizeForTest:
          image_shape: [736,1280]
      - NormalizeImage:
          scale: 1./255.
          mean: [ 0.485, 0.456, 0.406 ]
          std: [ 0.229, 0.224, 0.225 ]
          order: 'hwc'
      - ToCHWImage:
      - keepKeys:
          keep_keys: ['image','shape','polys','ignore_tags']
  loader:
    shuffle: False
    drop_last: False
    batch_size: 1 # must be 1
    num_workers: 8
\ No newline at end of file
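The detection config above is a complete training recipe. A minimal sketch of launching training with it, assuming the YAML is saved under an illustrative path and the repository's tools/train.py entry point is used:

```python
# Minimal sketch: launch DB detection training with a config like the one above.
# The config path below is illustrative; adjust it to wherever the YAML is saved.
import subprocess

subprocess.run(
    ["python3", "tools/train.py", "-c", "configs/det/det_r50_vd_db.yml"],
    check=True,  # fail loudly if the training process exits with an error
)
```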
@@ -11,7 +11,7 @@ Global:
  load_static_weights: True
  cal_metric_during_train: False
  pretrained_model: ./pretrain_models/MobileNetV3_large_x0_5_pretrained
-  checkpoints: #./output/det_db_0.001_DiceLoss_256_pp_config_2.0b_4gpu/best_accuracy
+  checkpoints:
  save_inference_dir:
  use_visualdl: False
  infer_img: doc/imgs_en/img_10.jpg
@@ -11,7 +11,7 @@ Global:
  load_static_weights: True
  cal_metric_during_train: False
  pretrained_model: ./pretrain_models/ResNet18_vd_pretrained
-  checkpoints: #./output/det_db_0.001_DiceLoss_256_pp_config_2.0b_4gpu/best_accuracy
+  checkpoints:
  save_inference_dir:
  use_visualdl: False
  infer_img: doc/imgs_en/img_10.jpg
@@ -11,7 +11,7 @@ Global:
  load_static_weights: True
  cal_metric_during_train: False
  pretrained_model: ./pretrain_models/MobileNetV3_large_x0_5_pretrained
-  checkpoints: #./output/det_db_0.001_DiceLoss_256_pp_config_2.0b_4gpu/best_accuracy
+  checkpoints:
  save_inference_dir:
  use_visualdl: False
  infer_img: doc/imgs_en/img_10.jpg
@@ -3,7 +3,7 @@ Global:
  epoch_num: 1200
  log_smooth_window: 20
  print_batch_step: 10
-  save_model_dir: ./output/det_rc/det_r50_vd/
+  save_model_dir: ./output/det_r50_vd/
  save_epoch_step: 1200
  # evaluation is run every 5000 iterations after the 4000th iteration
  eval_batch_step: [5000,4000]
Global:
  use_gpu: false
  epoch_num: 500
  log_smooth_window: 20
  print_batch_step: 10
  save_model_dir: ./output/rec/mv3_none_bilstm_ctc/
  save_epoch_step: 500
  # evaluation is run every 5000 iterations after the 4000th iteration
  eval_batch_step: 127
  # if pretrained_model is saved in static mode, load_static_weights must set to True
  load_static_weights: True
  cal_metric_during_train: True
  pretrained_model:
  checkpoints:
  save_inference_dir:
  use_visualdl: False
  infer_img: doc/imgs_words/ch/word_1.jpg
  # for data or label process
  max_text_length: 80
  character_dict_path: ppocr/utils/ppocr_keys_v1.txt
  character_type: 'ch'
  use_space_char: False
  infer_mode: False
  use_tps: False

Optimizer:
  name: Adam
  beta1: 0.9
  beta2: 0.999
  learning_rate:
    lr: 0.001
  regularizer:
    name: 'L2'
    factor: 0.00001

Architecture:
  type: rec
  algorithm: CRNN
  Transform:
  Backbone:
    name: MobileNetV3
    scale: 0.5
    model_name: small
    small_stride: [ 1, 2, 2, 2 ]
  Neck:
    name: SequenceEncoder
    encoder_type: fc
    hidden_size: 96
  Head:
    name: CTC
    fc_decay: 0.00001

Loss:
  name: CTCLoss

PostProcess:
  name: CTCLabelDecode

Metric:
  name: RecMetric
  main_indicator: acc

TRAIN:
  dataset:
    name: SimpleDataSet
    data_dir: ./rec
    file_list:
      - ./rec/train.txt # dataset1
    ratio_list: [ 0.4,0.6 ]
    transforms:
      - DecodeImage: # load image
          img_mode: BGR
          channel_first: False
      - CTCLabelEncode: # Class handling label
      - RecAug:
      - RecResizeImg:
          image_shape: [ 3,32,320 ]
      - keepKeys:
          keep_keys: [ 'image','label','length' ] # dataloader will return list in this order
  loader:
    batch_size: 256
    shuffle: True
    drop_last: True
    num_workers: 8

EVAL:
  dataset:
    name: SimpleDataSet
    data_dir: ./rec
    file_list:
      - ./rec/val.txt
    transforms:
      - DecodeImage: # load image
          img_mode: BGR
          channel_first: False
      - CTCLabelEncode: # Class handling label
      - RecResizeImg:
          image_shape: [ 3,32,320 ]
      - keepKeys:
          keep_keys: [ 'image','label','length' ] # dataloader will return list in this order
  loader:
    shuffle: False
    drop_last: False
    batch_size: 256
    num_workers: 8
Global:
  use_gpu: false
  epoch_num: 500
  log_smooth_window: 20
  print_batch_step: 10
  save_model_dir: ./output/rec/res34_none_bilstm_ctc/
  save_epoch_step: 500
  # evaluation is run every 5000 iterations after the 4000th iteration
  eval_batch_step: 127
  # if pretrained_model is saved in static mode, load_static_weights must set to True
  load_static_weights: True
  cal_metric_during_train: True
  pretrained_model:
  checkpoints:
  save_inference_dir:
  use_visualdl: False
  infer_img: doc/imgs_words/ch/word_1.jpg
  # for data or label process
  max_text_length: 80
  character_dict_path: ppocr/utils/ppocr_keys_v1.txt
  character_type: 'ch'
  use_space_char: False
  infer_mode: False
  use_tps: False

Optimizer:
  name: Adam
  beta1: 0.9
  beta2: 0.999
  learning_rate:
    lr: 0.001
  regularizer:
    name: 'L2'
    factor: 0.00001

Architecture:
  type: rec
  algorithm: CRNN
  Transform:
  Backbone:
    name: ResNet
    layers: 34
  Neck:
    name: SequenceEncoder
    encoder_type: fc
    hidden_size: 96
  Head:
    name: CTC
    fc_decay: 0.00001

Loss:
  name: CTCLoss

PostProcess:
  name: CTCLabelDecode

Metric:
  name: RecMetric
  main_indicator: acc

TRAIN:
  dataset:
    name: SimpleDataSet
    data_dir: ./rec
    file_list:
      - ./rec/train.txt # dataset1
    ratio_list: [ 0.4,0.6 ]
    transforms:
      - DecodeImage: # load image
          img_mode: BGR
          channel_first: False
      - CTCLabelEncode: # Class handling label
      - RecAug:
      - RecResizeImg:
          image_shape: [ 3,32,320 ]
      - keepKeys:
          keep_keys: [ 'image','label','length' ] # dataloader will return list in this order
  loader:
    batch_size: 256
    shuffle: True
    drop_last: True
    num_workers: 8

EVAL:
  dataset:
    name: SimpleDataSet
    data_dir: ./rec
    file_list:
      - ./rec/val.txt
    transforms:
      - DecodeImage: # load image
          img_mode: BGR
          channel_first: False
      - CTCLabelEncode: # Class handling label
      - RecResizeImg:
          image_shape: [ 3,32,320 ]
      - keepKeys:
          keep_keys: [ 'image','label','length' ] # dataloader will return list in this order
  loader:
    shuffle: False
    drop_last: False
    batch_size: 256
    num_workers: 8
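The two recognition configs above are launched the same way as the detection recipe. A minimal sketch, assuming the ResNet-34 variant is saved under an illustrative path and that tools/train.py accepts `-o` key=value overrides for Global settings, as in mainline PaddleOCR:

```python
# Minimal sketch: launch CRNN recognition training with the ResNet-34 config above.
# The config path is illustrative; the -o override flips use_gpu without editing the YAML.
import subprocess

subprocess.run(
    [
        "python3", "tools/train.py",
        "-c", "configs/rec/rec_r34_none_bilstm_ctc.yml",
        "-o", "Global.use_gpu=True",
    ],
    check=True,
)
```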
@@ -3,7 +3,7 @@ Global:
  epoch_num: 500
  log_smooth_window: 20
  print_batch_step: 10
-  save_model_dir: ./output/rec_chinese_common_v1.1
+  save_model_dir: ./output/rec_chinese_common_v2.0
  save_epoch_step: 3
  # evaluation is run every 5000 iterations after the 4000th iteration
  eval_batch_step: [0, 2000]
@@ -3,7 +3,7 @@ Global:
  epoch_num: 500
  log_smooth_window: 20
  print_batch_step: 10
-  save_model_dir: ./output/rec_chinese_lite_v1.1
+  save_model_dir: ./output/rec_chinese_lite_v2.0
  save_epoch_step: 3
  # evaluation is run every 5000 iterations after the 4000th iteration
  eval_batch_step: [0, 2000]
@@ -19,7 +19,7 @@ Global:
  character_type: ch
  max_text_length: 25
  infer_mode: False
-  use_space_char: False
+  use_space_char: True
Optimizer:
Global:
-  use_gpu: true
+  use_gpu: True
  epoch_num: 500
  log_smooth_window: 20
  print_batch_step: 10
@@ -15,7 +15,7 @@ Global:
  use_visualdl: False
  infer_img:
  # for data or label process
-  character_dict_path: ppocr/utils/dict/ic15_dict.txt
+  character_dict_path: ppocr/utils/dict/en_dict.txt
  character_type: ch
  max_text_length: 25
  infer_mode: False
Global:
-  use_gpu: true
+  use_gpu: True
  epoch_num: 500
  log_smooth_window: 20
  print_batch_step: 10
@@ -9,9 +9,9 @@ Global:
  eval_batch_step: [0, 2000]
  # if pretrained_model is saved in static mode, load_static_weights must set to True
  cal_metric_during_train: True
  pretrained_model:
  checkpoints:
  save_inference_dir:
  use_visualdl: False
  infer_img:
  # for data or label process
@@ -19,7 +19,7 @@ Global:
  character_type: french
  max_text_length: 25
  infer_mode: False
-  use_space_char: True
+  use_space_char: False
Optimizer:
Global:
-  use_gpu: true
+  use_gpu: True
  epoch_num: 500
  log_smooth_window: 20
  print_batch_step: 10
@@ -19,7 +19,7 @@ Global:
  character_type: german
  max_text_length: 25
  infer_mode: False
-  use_space_char: True
+  use_space_char: False
Optimizer:
Global:
-  use_gpu: true
+  use_gpu: True
  epoch_num: 500
  log_smooth_window: 20
  print_batch_step: 10
@@ -19,7 +19,7 @@ Global:
  character_type: japan
  max_text_length: 25
  infer_mode: False
-  use_space_char: True
+  use_space_char: False
Optimizer:
Global:
-  use_gpu: true
+  use_gpu: True
  epoch_num: 500
  log_smooth_window: 20
  print_batch_step: 10
@@ -19,7 +19,7 @@ Global:
  character_type: korean
  max_text_length: 25
  infer_mode: False
-  use_space_char: True
+  use_space_char: False
Optimizer:
Global:
-  use_gpu: false
+  use_gpu: true
-  epoch_num: 500
+  epoch_num: 72
  log_smooth_window: 20
  print_batch_step: 10
-  save_model_dir: ./output/rec/res34_none_none_ctc/
+  save_model_dir: ./output/rec/ic15/
-  save_epoch_step: 500
+  save_epoch_step: 3
-  # evaluation is run every 5000 iterations after the 4000th iteration
+  # evaluation is run every 2000 iterations
-  eval_batch_step: 127
+  eval_batch_step: [0, 2000]
  # if pretrained_model is saved in static mode, load_static_weights must set to True
-  load_static_weights: True
  cal_metric_during_train: True
  pretrained_model:
  checkpoints:
  save_inference_dir:
  use_visualdl: False
-  infer_img: doc/imgs_words/ch/word_1.jpg
+  infer_img: doc/imgs_words_en/word_10.png
  # for data or label process
-  max_text_length: 80
-  character_dict_path: ppocr/utils/ppocr_keys_v1.txt
-  character_type: 'ch'
-  use_space_char: False
-  infer_mode: False
-  use_tps: False
+  character_dict_path: ppocr/utils/ic15_dict.txt
+  character_type: ch
+  max_text_length: 25
+  infer_mode: False
+  use_space_char: False
Optimizer:
  name: Adam
  beta1: 0.9
  beta2: 0.999
-  learning_rate:
-    lr: 0.001
+  lr:
+    learning_rate: 0.0005
  regularizer:
    name: 'L2'
-    factor: 0.00001
+    factor: 0
Architecture:
-  type: rec
+  model_type: rec
  algorithm: CRNN
  Transform:
  Backbone:
@@ -43,10 +40,11 @@ Architecture:
    layers: 34
  Neck:
    name: SequenceEncoder
-    encoder_type: reshape
+    encoder_type: rnn
+    hidden_size: 256
  Head:
-    name: CTC
+    name: CTCHead
-    fc_decay: 0.00001
+    fc_decay: 0
Loss:
  name: CTCLoss
@@ -58,46 +56,42 @@ Metric:
  name: RecMetric
  main_indicator: acc
-TRAIN:
+Train:
  dataset:
    name: SimpleDataSet
-    data_dir: ./rec
-    file_list:
-      - ./rec/train.txt # dataset1
-    ratio_list: [ 0.4,0.6 ]
+    data_dir: ./train_data/
+    label_file_list: ["./train_data/train_list.txt"]
    transforms:
      - DecodeImage: # load image
          img_mode: BGR
          channel_first: False
      - CTCLabelEncode: # Class handling label
-      - RecAug:
      - RecResizeImg:
-          image_shape: [ 3,32,320 ]
+          image_shape: [3, 32, 100]
-      - keepKeys:
+      - KeepKeys:
-          keep_keys: [ 'image','label','length' ] # dataloader will return list in this order
+          keep_keys: ['image', 'label', 'length'] # dataloader will return list in this order
  loader:
-    batch_size: 256
    shuffle: True
+    batch_size_per_card: 256
    drop_last: True
    num_workers: 8
-EVAL:
+Eval:
  dataset:
    name: SimpleDataSet
-    data_dir: ./rec
-    file_list:
-      - ./rec/val.txt
+    data_dir: ./train_data/
+    label_file_list: ["./train_data/train_list.txt"]
    transforms:
      - DecodeImage: # load image
          img_mode: BGR
          channel_first: False
      - CTCLabelEncode: # Class handling label
      - RecResizeImg:
-          image_shape: [ 3,32,320 ]
+          image_shape: [3, 32, 100]
-      - keepKeys:
+      - KeepKeys:
-          keep_keys: [ 'image','label','length' ] # dataloader will return list in this order
+          keep_keys: ['image', 'label', 'length'] # dataloader will return list in this order
  loader:
    shuffle: False
    drop_last: False
-    batch_size: 256
+    batch_size_per_card: 256
-    num_workers: 8
+    num_workers: 4
@@ -81,7 +81,8 @@ cv::Mat Classifier::Run(cv::Mat &img) {
void Classifier::LoadModel(const std::string &model_dir) {
  AnalysisConfig config;
-  config.SetModel(model_dir + "/model", model_dir + "/params");
+  config.SetModel(model_dir + "/inference.pdmodel",
+                  model_dir + "/inference.pdiparams");
  if (this->use_gpu_) {
    config.EnableUseGpu(this->gpu_mem_, this->gpu_id_);
@@ -18,7 +18,8 @@ namespace PaddleOCR {
void DBDetector::LoadModel(const std::string &model_dir) {
  AnalysisConfig config;
-  config.SetModel(model_dir + "/model", model_dir + "/params");
+  config.SetModel(model_dir + "/inference.pdmodel",
+                  model_dir + "/inference.pdiparams");
  if (this->use_gpu_) {
    config.EnableUseGpu(this->gpu_mem_, this->gpu_id_);
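Both C++ hunks switch model loading from the old `model`/`params` file pair to the dygraph export files `inference.pdmodel` and `inference.pdiparams`. A minimal sketch of producing such files with the repository's export script; the paths and override keys below are assumptions based on mainline PaddleOCR:

```python
# Minimal sketch: export a trained checkpoint to inference.pdmodel / inference.pdiparams,
# the file pair the C++ code above now expects. All paths are illustrative.
import subprocess

subprocess.run(
    [
        "python3", "tools/export_model.py",
        "-c", "configs/det/det_r50_vd_db.yml",
        "-o", "Global.pretrained_model=./output/det_r50_vd/best_accuracy",
        "Global.save_inference_dir=./inference/det_db/",
    ],
    check=True,
)
```

The resulting directory can then be passed as `model_dir` to `Classifier::LoadModel` or `DBDetector::LoadModel` shown above.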