# PaddleStructure

Install layoutparser:
```sh
pip3 install https://paddleocr.bj.bcebos.com/whl/layoutparser-0.0.0-py3-none-any.whl
```

## 1. Introduction to pipeline

PaddleStructure is a toolkit for OCR on documents with complex layouts. The pipeline is as follows:

![pipeline](../doc/table/pipeline.jpg)

In PaddleStructure, the image is first analyzed by layoutparser. Layout analysis classifies each region of the image, and each region is then passed to the OCR process that matches its category.

Currently layoutparser will output five categories:
1. Text
2. Title
3. Figure
4. List
5. Table
   
Categories 1-4 follow the traditional OCR process, and category 5 follows the Table OCR process.
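
The routing described above can be sketched as a simple dispatch on the region category. This is a minimal illustration, not PaddleStructure's actual code: `run_text_ocr`, `run_table_ocr`, `dispatch`, and the dictionary shape of a region are all hypothetical names chosen for the example.

```python
# Hypothetical stand-ins for the two OCR branches; real engines would go here.
def run_text_ocr(region):
    return {"type": region["type"], "pipeline": "text_ocr"}


def run_table_ocr(region):
    return {"type": region["type"], "pipeline": "table_ocr"}


# The four categories that follow the traditional OCR process.
TEXT_OCR_TYPES = {"Text", "Title", "Figure", "List"}


def dispatch(regions):
    """Send each layout region to the OCR branch that matches its category."""
    results = []
    for region in regions:
        if region["type"] == "Table":
            results.append(run_table_ocr(region))
        elif region["type"] in TEXT_OCR_TYPES:
            results.append(run_text_ocr(region))
        else:
            raise ValueError(f"Unknown layout category: {region['type']}")
    return results
```

Raising on an unknown category is a design choice for the sketch; a real pipeline might skip such regions instead.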

## 2. LayoutParser


## 3. Table OCR

See the [Table OCR documentation](table/README.md).

## 4. Prediction by inference engine

Use the following command to run inference:
```sh
python3 predict_system.py \
    --det_model_dir=path/to/det_model_dir \
    --rec_model_dir=path/to/rec_model_dir \
    --table_model_dir=path/to/table_model_dir \
    --image_dir=../doc/table/1.png \
    --rec_char_dict_path=../ppocr/utils/dict/table_dict.txt \
    --table_char_dict_path=../ppocr/utils/dict/table_structure_dict.txt \
    --rec_char_type=EN \
    --det_limit_side_len=736 \
    --det_limit_type=min \
    --output ../output/table
```
After the run finishes, each image has a directory with the same name under the directory specified by the `output` parameter. Each table in the image is saved as an Excel file, and the file name is the coordinates of the table in the image.
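
To inspect these results from a script, something like the following sketch could be used. It assumes the per-image directory is named after the image file without its extension and that the tables are saved with an `.xls` or `.xlsx` extension; neither detail is confirmed here, and `list_table_files` is a hypothetical helper, not part of PaddleStructure.

```python
from pathlib import Path


def list_table_files(output_dir, image_path):
    """Return the spreadsheet files saved for one input image, sorted by name."""
    # Assumption: the result directory is named after the image file stem.
    image_dir = Path(output_dir) / Path(image_path).stem
    if not image_dir.is_dir():
        return []
    # Assumption: tables are stored as .xls or .xlsx files.
    return sorted(
        p for p in image_dir.iterdir() if p.suffix.lower() in {".xls", ".xlsx"}
    )
```

For example, `list_table_files("../output/table", "../doc/table/1.png")` would look in `../output/table/1/` and return an empty list if that directory does not exist.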

## 5. PaddleStructure whl package introduction

### 5.1 Use

#### 5.1.1 Use by code
```python
import os

import cv2
from PIL import Image
from paddlestructure import PaddleStructure, draw_result, save_res

table_engine = PaddleStructure(show_log=True)

save_folder = './output/table'
img_path = '../doc/table/1.png'

# Read the image and run the engine on it
img = cv2.imread(img_path)
result = table_engine(img)

# Save the results under save_folder, using the image name without its extension
save_res(result, save_folder, os.path.basename(img_path).split('.')[0])

for line in result:
    print(line)

# Draw the results on the image; font_path must point to an existing font file
font_path = 'path/to/PaddleOCR/doc/fonts/simfang.ttf'
image = Image.open(img_path).convert('RGB')
im_show = draw_result(image, result, font_path=font_path)
im_show = Image.fromarray(im_show)
im_show.save('result.jpg')
```

#### 5.1.2 Use by command line
```bash
paddlestructure --image_dir=../doc/table/1.png
```

### 5.2 Parameter Description
Most parameters are the same as in the paddleocr whl package; see the [whl package documentation](../doc/doc_ch/whl.md).

| Parameter           | Description                                                                                 | Default                                     |
|---------------------|---------------------------------------------------------------------------------------------|---------------------------------------------|
| output              | Path where the Excel files and recognition results are saved                                | ./output/table                              |
| structure_max_len   | Length to which the long side of the image is resized for table structure model prediction | 488                                         |
| structure_model_dir | Path to the table structure inference model                                                 | None                                        |
| structure_char_type | Dictionary path used by the table structure model                                           | ../ppocr/utils/dict/table_structure_dict.tx |
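
To give a feel for what `structure_max_len` implies, the sketch below computes the target size when the long side of an image is scaled to a given length. It assumes the aspect ratio is preserved and that no padding is applied; the actual preprocessing in the table structure model may differ, and `resize_to_long_side` is a hypothetical helper.

```python
def resize_to_long_side(width, height, max_len=488):
    """Return (new_width, new_height) with the longer side scaled to max_len.

    Assumption: the aspect ratio is preserved and no padding is added.
    """
    scale = max_len / max(width, height)
    # Clamp to at least one pixel so extreme aspect ratios stay valid.
    new_width = max(1, round(width * scale))
    new_height = max(1, round(height * scale))
    return new_width, new_height
```

With the default of 488, an image of 976 x 488 pixels would be scaled by 0.5 on both sides.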