# PaddleStructure

Install layoutparser:
```sh
wget  https://paddleocr.bj.bcebos.com/whl/layoutparser-0.0.0-py3-none-any.whl
pip3 install layoutparser-0.0.0-py3-none-any.whl
```

## 1. Introduction to the pipeline

PaddleStructure is a toolkit for OCR on documents with complex layouts. The pipeline is as follows:

![pipeline](../doc/table/pipeline.png)

In PaddleStructure, the image is first analyzed by layoutparser. Layout analysis classifies each region of the image, and each region is then processed by the OCR pipeline that matches its category.

Currently, layoutparser outputs five categories:
1. Text
2. Title
3. Figure
4. List
5. Table

Categories 1-4 follow the traditional OCR process, and category 5 follows the Table OCR process.
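
The routing rule above can be sketched in a few lines. This is only an illustration of the dispatch logic: `choose_pipeline` and the pipeline labels `"ocr"` and `"table_ocr"` are hypothetical names, not part of the PaddleStructure API.

```python
# Illustrative sketch only: choose_pipeline and the pipeline labels
# are hypothetical and are not part of the PaddleStructure API.
TABLE_CATEGORY = "Table"
TEXT_CATEGORIES = {"Text", "Title", "Figure", "List"}


def choose_pipeline(category):
    """Return the label of the pipeline that handles a layout category."""
    if category == TABLE_CATEGORY:
        return "table_ocr"
    if category in TEXT_CATEGORIES:
        return "ocr"
    raise ValueError(f"unknown layout category: {category!r}")


# Route a few example regions by their category.
for category in ["Title", "Text", "Table", "Figure"]:
    print(f"{category} -> {choose_pipeline(category)}")
```

Raising an error for an unknown category, rather than silently falling back to plain OCR, is a design choice of this sketch; it makes a change in the category set visible immediately.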

## 2. LayoutParser


## 3. Table OCR

See the [Table OCR documentation](table/README.md).

## 4. Prediction by inference engine

Run the following command to perform inference:
```sh
python3 table/predict_system.py \
    --det_model_dir=path/to/det_model_dir \
    --rec_model_dir=path/to/rec_model_dir \
    --table_model_dir=path/to/table_model_dir \
    --image_dir=../doc/table/1.png \
    --rec_char_dict_path=../ppocr/utils/dict/table_dict.txt \
    --table_char_dict_path=../ppocr/utils/dict/table_structure_dict.txt \
    --rec_char_type=EN \
    --det_limit_side_len=736 \
    --det_limit_type=min \
    --output ../output/table
```
After the command finishes, each image has a directory with the same name under the directory specified by the `output` parameter. Each table in the image is saved as an Excel file, and the file name is the coordinates of the table in the image.
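
As a rough sketch of how the saved results could be collected afterwards, the helper below lists the spreadsheet files written for one image. `list_table_files` is a hypothetical helper, not part of PaddleStructure. It assumes the per-image directory is named after the image file name without its extension (mirroring the `save_res` call in section 5.1.1) and that the Excel files use the `.xlsx` suffix; adjust both if the actual output differs.

```python
import os
from pathlib import Path


def list_table_files(output_dir, image_path, suffix=".xlsx"):
    """List the spreadsheet files saved for one image (hypothetical helper).

    Assumes the layout <output_dir>/<image name without extension>/ and
    that tables are saved with the given suffix; both are assumptions.
    """
    image_name = os.path.basename(image_path).split(".")[0]
    image_dir = Path(output_dir) / image_name
    if not image_dir.is_dir():
        return []
    return sorted(p for p in image_dir.iterdir() if p.suffix == suffix)


# Prints nothing if the directory does not exist yet.
for path in list_table_files("../output/table", "../doc/table/1.png"):
    print(path)
```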

## 5. PaddleStructure whl package introduction

### 5.1 Use

#### 5.1.1 Use by code
```python
import os

import cv2
from PIL import Image
from paddlestructure import PaddleStructure, draw_result, save_res

table_engine = PaddleStructure(show_log=True)

save_folder = './output/table'
img_path = '../doc/table/1.png'
img = cv2.imread(img_path)
result = table_engine(img)
# Save the results under save_folder, using the image name without its extension
save_res(result, save_folder, os.path.basename(img_path).split('.')[0])

# Print each entry of the result
for line in result:
    print(line)

# Draw the results on the image; font_path must point to a font file
font_path = 'path/to/PaddleOCR/doc/fonts/simfang.ttf'
image = Image.open(img_path).convert('RGB')
im_show = draw_result(image, result, font_path=font_path)
im_show = Image.fromarray(im_show)
im_show.save('result.jpg')
```

#### 5.1.2 Use by command line
```bash
paddlestructure --image_dir=../doc/table/1.png
```

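### Parameter description
Most parameters are the same as in the paddleocr whl package; see the [whl package documentation](../doc/doc_ch/whl.md).

| Parameter           | Description                                                                          | Default                                     |
|---------------------|--------------------------------------------------------------------------------------|---------------------------------------------|
| output              | Path where the Excel files and recognition results are saved                         | ./output/table                              |
| structure_max_len   | Size to which the long side of the image is resized for structure model prediction   | 488                                         |
| structure_model_dir | Path to the structure inference model                                                | None                                        |
| structure_char_type | Path to the dictionary used by the structure model                                   | ../ppocr/utils/dict/table_structure_dict.tx |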