# PaddleStructure

Install layoutparser:
```sh
wget  https://paddleocr.bj.bcebos.com/whl/layoutparser-0.0.0-py3-none-any.whl
pip3 install layoutparser-0.0.0-py3-none-any.whl
```

## 1. Pipeline overview

PaddleStructure is a toolkit for OCR on documents with complex layouts. Its pipeline is shown below:
![pipeline](../doc/table/pipeline.png)

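In PaddleStructure, an image first goes through layout analysis by layoutparser. Layout analysis classifies the regions of the image, and each region is then sent to the OCR flow that matches its category.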

layoutparser currently outputs five categories:
1. Text
2. Title
3. Figure
4. List
5. Table
   
Categories 1-4 go through the conventional OCR flow; category 5 goes through the table OCR flow.
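
The routing rule above can be sketched in a few lines of Python. This is only an illustration of the dispatch logic, not PaddleStructure's actual code: the function names `choose_flow` and `route_regions`, the flow labels `general_ocr` and `table_ocr`, and the `(region_type, payload)` tuple shape are all hypothetical.

```python
# Hypothetical sketch: none of these names come from the PaddleStructure API.
TABLE_CLASS = "Table"
GENERAL_CLASSES = {"Text", "Title", "Figure", "List"}


def choose_flow(region_type):
    """Return the label of the OCR flow that should handle a layout class."""
    if region_type == TABLE_CLASS:
        return "table_ocr"
    if region_type in GENERAL_CLASSES:
        return "general_ocr"
    raise ValueError(f"unknown layout class: {region_type!r}")


def route_regions(regions):
    """Group (region_type, payload) pairs by the flow that handles them."""
    routed = {"general_ocr": [], "table_ocr": []}
    for region_type, payload in regions:
        routed[choose_flow(region_type)].append(payload)
    return routed


# Three made-up regions: two go to general OCR, one to table OCR.
example = [("Title", "r0"), ("Table", "r1"), ("Text", "r2")]
print(route_regions(example))
```

Raising on an unknown class, rather than silently falling back to general OCR, is a design choice of this sketch so that a new layout category is noticed immediately.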

## 2. LayoutParser

[Documentation](layout/README.md)

## 3. Table OCR

[Documentation](table/README_ch.md)

## 4. PaddleStructure whl package

### 4.1 Usage

4.1.1 Usage in code
```python
import os

import cv2
from PIL import Image
from paddlestructure import PaddleStructure, draw_result, save_res

table_engine = PaddleStructure(show_log=True)

save_folder = './output/table'
img_path = '../doc/table/1.png'
img = cv2.imread(img_path)
result = table_engine(img)
# Save the results to save_folder, named after the image file (here: '1')
save_res(result, save_folder, os.path.basename(img_path).split('.')[0])

for line in result:
    print(line)

# Draw the results on the image; font_path must point to a local font file
font_path = 'path/to/PaddleOCR/doc/fonts/simfang.ttf'
image = Image.open(img_path).convert('RGB')
im_show = draw_result(image, result, font_path=font_path)
im_show = Image.fromarray(im_show)
im_show.save('result.jpg')
```
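
The example above derives the output name with `os.path.basename(img_path).split('.')[0]`, which keeps only the text before the first dot. The sketch below, using only the standard library, shows what that yields and how `os.path.splitext` differs when a file name contains more than one dot. The helper names `stem_by_split` and `stem_by_splitext` and the file name `page.v2.png` are made up for illustration.

```python
import os


def stem_by_split(path):
    """Name before the FIRST dot, as in the usage example."""
    return os.path.basename(path).split('.')[0]


def stem_by_splitext(path):
    """Name with only the LAST extension removed."""
    return os.path.splitext(os.path.basename(path))[0]


print(stem_by_split('../doc/table/1.png'))            # 1
print(stem_by_split('../doc/table/page.v2.png'))      # page
print(stem_by_splitext('../doc/table/page.v2.png'))   # page.v2
```

If input images may have dotted names, the `splitext` variant avoids two different images mapping to the same output name.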

4.1.2 Command-line usage
```bash
paddlestructure --image_dir=../doc/table/1.png
```

### Parameters
Most parameters are the same as in the paddleocr whl package; see the [whl package documentation](../doc/doc_ch/whl.md).

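| Field               | Description                                                                 | Default                                     |
|---------------------|-----------------------------------------------------------------------------|---------------------------------------------|
| output              | Directory where the Excel files and recognition results are saved           | ./output/table                              |
| structure_max_len   | Size the long side of the image is resized to for structure model prediction | 488                                         |
| structure_model_dir | Path to the structure inference model                                       | None                                        |
| structure_char_type | Path to the dictionary used by the structure model                          | ../ppocr/utils/dict/table_structure_dict.tx |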
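
To make `structure_max_len` concrete, here is a minimal sketch of the arithmetic behind a long-side resize. It assumes the aspect ratio is preserved and the result is rounded to the nearest pixel. Whether PaddleStructure rounds this way, pads the image, or upsamples small images is not stated in this document, and `resize_long_side` is a hypothetical helper, not part of the package.

```python
def resize_long_side(height, width, max_len=488):
    """Scale (height, width) so the longer side equals max_len.

    Assumption: aspect ratio is kept and sizes are rounded to whole pixels.
    """
    if height <= 0 or width <= 0:
        raise ValueError("height and width must be positive")
    ratio = max_len / max(height, width)
    new_height = max(1, round(height * ratio))
    new_width = max(1, round(width * ratio))
    return new_height, new_width


# A 976x488 image is halved so that its long side becomes 488.
print(resize_long_side(976, 488))   # (488, 244)
```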