# PaddleStructure

Install layoutparser:
```sh
wget  https://paddleocr.bj.bcebos.com/whl/layoutparser-0.0.0-py3-none-any.whl
pip3 install layoutparser-0.0.0-py3-none-any.whl
```

## 1. Pipeline overview

PaddleStructure is a toolkit for OCR on documents with complex layouts. Its pipeline is shown below:
![pipeline](../doc/table/pipeline.png)

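In PaddleStructure, an image first goes through layout analysis by layoutparser. Layout analysis classifies the regions of the image, and each region is then processed by the OCR flow that corresponds to its class.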

layoutparser currently outputs five classes:
1. Text
2. Title
3. Figure
4. List
5. Table
   
Classes 1-4 go through the conventional OCR flow; class 5 goes through the table OCR flow.
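
The routing rule above can be sketched in a few lines of plain Python. This is only an illustration of the dispatch logic, not PaddleStructure's actual implementation: `select_branch`, `run_pipeline`, the region dictionaries with a `"type"` key, and the two handler callables are all hypothetical stand-ins, and the real layoutparser output format may differ.

```python
# The five region classes listed above.
LAYOUT_CLASSES = ("Text", "Title", "Figure", "List", "Table")


def select_branch(region_type):
    """Return the name of the OCR branch a layout class is routed to."""
    if region_type not in LAYOUT_CLASSES:
        raise ValueError(f"unknown layout class: {region_type!r}")
    # Only Table regions take the table OCR branch; the rest use plain OCR.
    return "table_ocr" if region_type == "Table" else "ocr"


def run_pipeline(regions, ocr_fn, table_fn):
    """Dispatch every region to the handler of its branch.

    `regions` is a hypothetical list of dicts with a "type" key;
    `ocr_fn` and `table_fn` are stand-ins for the two OCR flows.
    """
    handlers = {"ocr": ocr_fn, "table_ocr": table_fn}
    results = []
    for region in regions:
        branch = select_branch(region["type"])
        results.append({
            "type": region["type"],
            "branch": branch,
            "res": handlers[branch](region),
        })
    return results


if __name__ == "__main__":
    demo_regions = [{"type": "Title"}, {"type": "Table"}]
    out = run_pipeline(
        demo_regions,
        ocr_fn=lambda region: "plain text",
        table_fn=lambda region: "table structure",
    )
    for item in out:
        print(item["type"], "->", item["branch"])
```

Keeping the class-to-branch decision in one small function means a new layout class only needs to be handled in one place.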

## 2. LayoutParser

[Documentation](layout/README.md)

## 3. Table OCR

[Documentation](table/README_ch.md)

## 4. PaddleStructure whl package

### 4.1 Usage

#### 4.1.1 Use from Python code
```python
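import cv2
from PIL import Image
from paddlestructure import PaddleStructure, draw_result

# Create the engine; Excel files and recognition results are saved under `output`
table_engine = PaddleStructure(
    output='./output/table',
    show_log=True)

img_path = '../doc/table/1.png'
img = cv2.imread(img_path)
if img is None:  # cv2.imread returns None when the file cannot be read
    raise FileNotFoundError(img_path)

# Run layout analysis and OCR, then print each result
result = table_engine(img)
for line in result:
    print(line)

# Draw the results onto the image and save the visualization
font_path = 'path/to/PaddleOCR/doc/fonts/simfang.ttf'
image = Image.open(img_path).convert('RGB')
im_show = draw_result(image, result, font_path=font_path)
im_show = Image.fromarray(im_show)
im_show.save('result.jpg')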
```

#### 4.1.2 Use from the command line
```bash
paddlestructure --image_dir=../doc/table/1.png
```

### Parameters
Most parameters are the same as in the paddleocr whl package; see the [whl package documentation](../doc/doc_ch/whl.md).

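| Field                  | Description                                                                      | Default          |
|------------------------|----------------------------------------------------------------------------------|------------------|
| output                 | Directory where the Excel files and recognition results are saved                | ./output/table   |
| structure_max_len      | Size the long side of the image is resized to for structure model prediction     | 488              |
| structure_model_dir    | Path to the structure inference model                                            | None             |
| structure_char_type    | Path to the dictionary used by the structure model                               | ../ppocr/utils/dict/table_structure_dict.txt |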
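
As a rough illustration of how these fields might be combined with the command-line entry point, the sketch below assembles a `paddlestructure` command from a dictionary of overrides. It assumes that each field in the table is accepted as a `--name=value` flag in the same way as `--image_dir`; this document does not confirm that, so check it against the installed version. `build_command` and `DEFAULTS` are hypothetical helpers, not part of the package.

```python
import shlex

# Defaults for three of the fields in the table above.
DEFAULTS = {
    "output": "./output/table",
    "structure_max_len": 488,
    "structure_model_dir": None,
}


def build_command(image_dir, **overrides):
    """Build an argument list for the `paddlestructure` CLI.

    Assumption: table fields are passed as `--name=value`, like `--image_dir`.
    Fields whose value is None are left out of the command.
    """
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise KeyError(f"unknown parameter(s): {sorted(unknown)}")
    params = {**DEFAULTS, **overrides}
    cmd = ["paddlestructure", f"--image_dir={image_dir}"]
    for name, value in params.items():
        if value is None:
            continue
        cmd.append(f"--{name}={value}")
    return cmd


if __name__ == "__main__":
    cmd = build_command("../doc/table/1.png", structure_max_len=600)
    # Print the command instead of running it, so the sketch works
    # without the package installed.
    print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` once the flags have been verified.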