# Service deployment

PaddleOCR provides two service deployment methods:
- Based on **HubServing**: already integrated into PaddleOCR ([code](https://github.com/PaddlePaddle/PaddleOCR/tree/develop/deploy/hubserving)). This tutorial covers this method.
- Based on **PaddleServing**: see the PaddleServing official website for details ([demo](https://github.com/PaddlePaddle/Serving/tree/develop/python/examples/ocr)). It will also be integrated into PaddleOCR later.  

The service deployment directory includes three service packages: detection, recognition, and the two-stage series connection (detection followed by recognition). Install and start the service package that matches your needs. The directory is as follows:  
```
deploy/hubserving/
  └─  ocr_det     detection module service package
  └─  ocr_rec     recognition module service package
  └─  ocr_system  two-stage series connection service package
```

Each service package contains the files listed below (three required, plus an optional `config.json`). Taking the 2-stage series connection service package as an example, the directory is as follows:  
```
deploy/hubserving/ocr_system/
  └─  __init__.py    Empty file, required
  └─  config.json    Configuration file, optional, passed in as a parameter when using configuration to start the service
  └─  module.py      Main module file, required, contains the complete logic of the service
  └─  params.py      Parameter file, required, including parameters such as model path, pre- and post-processing parameters
```
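
The layout above can be checked before installing a package. The following is a minimal sketch and not part of PaddleOCR: the helper name `missing_required_files` is hypothetical, and the required/optional split is taken only from the listing above.

```python
from pathlib import Path

# Files marked "required" in the listing above; config.json is optional.
REQUIRED_FILES = ("__init__.py", "module.py", "params.py")


def missing_required_files(package_dir):
    """Return the required file names that are absent from package_dir."""
    root = Path(package_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_required_files("deploy/hubserving/ocr_system")
    print("missing files:", missing or "none")
```

Run from the PaddleOCR repository root, an empty result suggests the package directory is complete.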

## Quick start service
The following steps take the 2-stage series service as an example. If only the detection service or recognition service is needed, replace the corresponding file path.

### 1. Prepare the environment
```shell
# Install paddlehub  
pip3 install paddlehub --upgrade -i https://pypi.tuna.tsinghua.edu.cn/simple

# Set environment variables on Linux
export PYTHONPATH=.

# Set environment variables on Windows
SET PYTHONPATH=.
```

### 2. Install Service Module
PaddleOCR provides 3 kinds of service modules. Install the ones you need.

* On Linux platform, the examples are as follows.
```shell
# Install the detection service module:
hub install deploy/hubserving/ocr_det/

# Or, install the recognition service module:
hub install deploy/hubserving/ocr_rec/

# Or, install the 2-stage series service module:
hub install deploy/hubserving/ocr_system/
```

* On Windows platform, the examples are as follows.
```shell
# Install the detection service module:
hub install deploy\hubserving\ocr_det\

# Or, install the recognition service module:
hub install deploy\hubserving\ocr_rec\

# Or, install the 2-stage series service module:
hub install deploy\hubserving\ocr_system\
```

### 3. Start service
#### Way 1. Start with command line parameters (CPU only)

**start command:**  
```shell
hub serving start --modules [Module1==Version1, Module2==Version2, ...] \
                  --port XXXX \
                  --use_multiprocess \
                  --workers XXXX
```  
**parameters:**  

|parameter|usage|
|-|-|
|--modules/-m|Modules pre-installed for PaddleHub Serving, listed as one or more `Module==Version` pairs<br>*`When Version is not specified, the latest version is selected by default`*|
|--port/-p|Service port; the default is 8866|
|--use_multiprocess|Enables concurrent mode. The default is single-process mode; concurrent mode is recommended for multi-core CPU machines<br>*`Windows operating system only supports single-process mode`*|
|--workers|The number of concurrent tasks in concurrent mode; the default is `2*cpu_count-1`, where `cpu_count` is the number of CPU cores|
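
As a quick illustration of the `--workers` default quoted in the table, the sketch below evaluates `2*cpu_count-1`. It is not PaddleHub's code: the function name `default_workers` is hypothetical, and using `os.cpu_count()` (logical cores) as the core count is an assumption.

```python
import os


def default_workers(cpu_count=None):
    """Default worker count quoted in the table: 2 * cpu_count - 1."""
    if cpu_count is None:
        # os.cpu_count() may return None; fall back to 1 core in that case.
        cpu_count = os.cpu_count() or 1
    return 2 * cpu_count - 1


if __name__ == "__main__":
    print(default_workers(4))  # 2 * 4 - 1 = 7
    print(default_workers())   # depends on the machine
```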

For example, start the 2-stage series service:  
```shell
hub serving start -m ocr_system
```  

This completes the deployment of a service API, using the default port number 8866.  

#### Way 2. Start with configuration file (CPU, GPU)
**start command:**  
```shell
hub serving start --config/-c config.json
```  
The format of `config.json` is as follows:
```json
{
    "modules_info": {
        "ocr_system": {
            "init_args": {
                "version": "1.0.0",
                "use_gpu": true
            },
            "predict_args": {
            }
        }
    },
    "port": 8868,
    "use_multiprocess": false,
    "workers": 2
}
```
- The configurable parameters in `init_args` are consistent with the `_initialize` function interface in `module.py`. In particular, **setting `use_gpu` to `true` starts the service on the GPU**.
- The configurable parameters in `predict_args` are consistent with the `predict` function interface in `module.py`.

**Note:**  
- When the service is started with a configuration file, other command line parameters are ignored.
- If you use GPU prediction (that is, `use_gpu` is set to `true`), set the environment variable `CUDA_VISIBLE_DEVICES` before starting the service, for example `export CUDA_VISIBLE_DEVICES=0`. It does not need to be set otherwise.
- **`use_gpu` and `use_multiprocess` cannot be `true` at the same time.**  
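
The constraints in the notes above can be checked before starting the service. The following is a minimal sketch under stated assumptions: `check_config` is a hypothetical helper, not part of PaddleHub, and it only inspects the keys shown in the sample `config.json`.

```python
import json
import os

# Sample taken from the config.json format shown above.
SAMPLE_CONFIG = """
{
    "modules_info": {
        "ocr_system": {
            "init_args": {"version": "1.0.0", "use_gpu": true},
            "predict_args": {}
        }
    },
    "port": 8868,
    "use_multiprocess": false,
    "workers": 2
}
"""


def check_config(config, environ=None):
    """Return a list of problems found in a config.json-style dict."""
    environ = os.environ if environ is None else environ
    problems = []
    use_multiprocess = bool(config.get("use_multiprocess", False))
    for name, info in config.get("modules_info", {}).items():
        use_gpu = bool(info.get("init_args", {}).get("use_gpu", False))
        if use_gpu and use_multiprocess:
            problems.append(f"{name}: use_gpu and use_multiprocess are both true")
        if use_gpu and "CUDA_VISIBLE_DEVICES" not in environ:
            problems.append(f"{name}: CUDA_VISIBLE_DEVICES is not set")
    return problems


if __name__ == "__main__":
    print(check_config(json.loads(SAMPLE_CONFIG)) or "no problems found")
```

Passing `environ` explicitly keeps the check independent of the current shell, which makes it easy to try different settings.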

For example, use GPU card No. 3 to start the 2-stage series service:
```shell
export CUDA_VISIBLE_DEVICES=3
hub serving start -c deploy/hubserving/ocr_system/config.json
```  

## Send prediction requests
After the service starts, you can use the following command to send a prediction request to obtain the prediction result:  
```shell
python tools/test_hubserving.py server_url image_path
```  

Two parameters need to be passed to the script:
- **server_url**: service address, in the format  
`http://[ip_address]:[port]/predict/[module_name]`  
For example, if the detection, recognition and 2-stage serial services are started with the provided configuration files, the respective `server_url` values would be:  
`http://127.0.0.1:8866/predict/ocr_det`  
`http://127.0.0.1:8867/predict/ocr_rec`  
`http://127.0.0.1:8868/predict/ocr_system`  
- **image_path**: test image path, which can be a single image path or an image directory path
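
The `server_url` format described above can be assembled programmatically. This is an illustrative sketch only: `build_server_url` and `PROVIDED_PORTS` are hypothetical names, and the port mapping simply mirrors the three example URLs listed above.

```python
# Ports from the example URLs above (services started with the provided configs).
PROVIDED_PORTS = {"ocr_det": 8866, "ocr_rec": 8867, "ocr_system": 8868}


def build_server_url(module_name, ip_address="127.0.0.1", port=None):
    """Format http://[ip_address]:[port]/predict/[module_name]."""
    if port is None:
        port = PROVIDED_PORTS[module_name]
    return f"http://{ip_address}:{port}/predict/{module_name}"


if __name__ == "__main__":
    for name in PROVIDED_PORTS:
        print(build_server_url(name))
```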

**Example:**
```shell
python tools/test_hubserving.py http://127.0.0.1:8868/predict/ocr_system ./doc/imgs/
```
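
For callers that prefer to send the request from their own code, the sketch below shows one possible client using only the Python standard library. The request body shape (`{"images": [<base64 string>]}`) and the JSON content type are assumptions about what the service expects, not something this document specifies; check `tools/test_hubserving.py` for the actual format. The function names are hypothetical.

```python
import base64
import json
import urllib.request


def build_request_body(image_bytes):
    """Encode raw image bytes as JSON (assumed shape: {"images": [base64]})."""
    encoded = base64.b64encode(image_bytes).decode("utf8")
    return json.dumps({"images": [encoded]}).encode("utf8")


def predict(server_url, image_path, timeout=30):
    """POST one image to server_url and return the decoded JSON response."""
    with open(image_path, "rb") as f:
        body = build_request_body(f.read())
    request = urllib.request.Request(
        server_url,
        data=body,
        headers={"Content-type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf8"))
```

With a running service, a call might look like `predict("http://127.0.0.1:8868/predict/ocr_system", "path/to/image.jpg")`, where the image path is a placeholder.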

## Returned result format
The returned result is a list in which each item is a dict. A dict may contain up to three fields:

|field name|data type|description|
|-|-|-|
|text|str|text content|
|confidence|float|text recognition confidence|
|text_region|list|text location coordinates|

The fields returned by different modules are different. For example, the results returned by the text recognition service module do not contain `text_region`. The details are as follows:

|field name/module name|ocr_det|ocr_rec|ocr_system|
|-|-|-|-|
|text||✔|✔|
|confidence||✔|✔|
|text_region|✔||✔|
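
Because the available fields depend on the module, client code should not assume that all three are present. The following is a minimal sketch: `describe_item` is a hypothetical helper, and the sample values (including the coordinate layout of `text_region`) are made up for illustration.

```python
def describe_item(item):
    """Format one result dict, skipping fields the module did not return."""
    parts = []
    if "text" in item:
        parts.append(f"text={item['text']!r}")
    if "confidence" in item:
        parts.append(f"confidence={item['confidence']:.3f}")
    if "text_region" in item:
        parts.append(f"region={item['text_region']}")
    return ", ".join(parts)


if __name__ == "__main__":
    # Made-up sample values shaped like ocr_rec and ocr_det output.
    samples = [
        {"text": "hello", "confidence": 0.98},
        {"text_region": [[0, 0], [10, 0], [10, 5], [0, 5]]},
    ]
    for item in samples:
        print(describe_item(item))
```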

**Note:** If you need to add, delete or modify the returned fields, you can modify the file `module.py` of the corresponding module. For the complete process, refer to the next section, "User defined service module modification".

## User defined service module modification
If you need to modify the service logic, the following steps are generally required (take the modification of `ocr_system` for example):

- 1. Stop service
```shell
hub serving stop --port/-p XXXX
```
- 2. Modify the code in the corresponding files, like `module.py` and `params.py`, according to the actual needs.  
For example, if you need to replace the model used by the deployed service, you need to modify the model path parameters `det_model_dir` and `rec_model_dir` in `params.py`. Other related parameters may need to be modified at the same time. Please modify and debug according to the actual situation. After modifying, it is recommended to run `module.py` directly to debug it before starting the service.  
- 3. Uninstall old service module
```shell
hub uninstall ocr_system
```
- 4. Install modified service module
```shell
hub install deploy/hubserving/ocr_system/
```
- 5. Restart service
```shell
hub serving start -m ocr_system
```
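
Steps 1, 3, 4 and 5 above can be scripted once the code changes in step 2 are done. The sketch below is only one way to do it: the function names are hypothetical, it assumes the `hub` command is on `PATH`, and it uses only the commands shown in the steps above.

```python
import subprocess


def redeploy_commands(module_name, module_dir, port):
    """Return the commands for steps 1, 3, 4 and 5 as argument lists."""
    return [
        ["hub", "serving", "stop", "--port", str(port)],
        ["hub", "uninstall", module_name],
        ["hub", "install", module_dir],
        ["hub", "serving", "start", "-m", module_name],
    ]


def redeploy(module_name, module_dir, port):
    """Run each command in order and stop at the first failure."""
    for command in redeploy_commands(module_name, module_dir, port):
        subprocess.run(command, check=True)


if __name__ == "__main__":
    # Print the commands instead of running them; 8866 is the default port.
    for command in redeploy_commands("ocr_system", "deploy/hubserving/ocr_system/", 8866):
        print(" ".join(command))
```

Because `check=True` raises on a non-zero exit status, a failed uninstall or install prevents the service from being restarted with a stale module.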