# Flower Classification: From PaddleLabel to PaddleClas

Annotating data with PaddleLabel + training and predicting with PaddleClas = a quickly completed flower classification task

---

## 1. Data Preparation

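- First, annotate your custom flower dataset with `PaddleLabel`; next, split it with the `Split Dataset` feature; finally, export the dataset.
- Put everything exported by `PaddleLabel` into a folder of your own, for example `flower_clas_dataset`. The directory structure is as follows: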

```
├── flower_clas_dataset
│   ├── image
│   │   ├── flower1.jpg
│   │   ├── flower2.jpg
│   │   ├── ...
│   ├── labels.txt
│   ├── test_list.txt
│   ├── train_list.txt
│   ├── val_list.txt
```
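Before training, it can help to confirm that every image referenced by a split list actually exists in the exported folder. The following is a minimal sketch, not part of PaddleLabel or PaddleClas. It assumes each line of `train_list.txt`, `val_list.txt`, and `test_list.txt` has the form `<relative image path> <label id>`; the helper names `check_split` and `demo` are hypothetical.

```python
import tempfile
from pathlib import Path


def check_split(dataset_dir, list_name):
    """Return the image paths listed in one split file that do not exist on disk."""
    root = Path(dataset_dir)
    missing = []
    for line in (root / list_name).read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Assumed line format: "<relative image path> <label id>"
        rel_path = line.rsplit(maxsplit=1)[0]
        if not (root / rel_path).is_file():
            missing.append(rel_path)
    return missing


def demo():
    """Build a tiny throwaway dataset and check it (illustration only)."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp) / "flower_clas_dataset"
        (root / "image").mkdir(parents=True)
        (root / "image" / "flower1.jpg").write_bytes(b"")
        # flower2.jpg is listed but deliberately not created.
        (root / "train_list.txt").write_text(
            "image/flower1.jpg 0\nimage/flower2.jpg 1\n", encoding="utf-8"
        )
        return check_split(root, "train_list.txt")


if __name__ == "__main__":
    print(demo())
```

On a real export, calling `check_split("flower_clas_dataset", "train_list.txt")` should return an empty list if the assumed format holds and no images are missing.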

## 2. Training

### 2.1 Install the required libraries

**2.1.1 Install paddlepaddle**

```
# If your machine has CUDA 9 or CUDA 10 installed, run the following command instead
# pip install paddlepaddle-gpu -i https://mirror.baidu.com/pypi/simple
# If your machine is CPU-only, run the following command
pip install paddlepaddle
```

**2.1.2 Install paddleclas and its dependencies**

```
git clone https://gitee.com/paddlepaddle/PaddleClas.git -b release/2.2
cd PaddleClas
pip install -r requirements.txt
python setup.py install
```

### 2.2 Prepare the custom flower classification dataset

```
cd ./PaddleClas/dataset/
mkdir flower_clas_dataset
cd ../../
cp -r ./flower_clas_dataset/* ./PaddleClas/dataset/flower_clas_dataset
```

### 2.3 Modify the configuration file

> PaddleClas/ppcls/configs/quick_start/new_user/ShuffleNetV2_x0_25.yaml

```
# global configs
Global:
  checkpoints: null
  pretrained_model: null
  output_dir: ./output/
  device: cpu
  save_interval: 20
  eval_during_train: True
  eval_interval: 10
  epochs: 100
  print_batch_step: 10
  use_visualdl: True
  # used for static mode and model export
  image_shape: [3, 224, 224]
  save_inference_dir: ./inference

# model architecture
Arch:
  name: ShuffleNetV2_x0_25
  class_num: 3

# loss function config for training/eval process
Loss:
  Train:
    - CELoss:
        weight: 1.0
  Eval:
    - CELoss:
        weight: 1.0


Optimizer:
  name: Momentum
  momentum: 0.9
  lr:
    name: Cosine
    learning_rate: 0.0125
    warmup_epoch: 5
  regularizer:
    name: 'L2'
    coeff: 0.00001


# data loader for train and eval
DataLoader:
  Train:
    dataset:
      name: ImageNetDataset
      image_root: ./dataset/
      cls_label_path: ./dataset/train_list.txt
      transform_ops:
        - DecodeImage:
            to_rgb: True
            channel_first: False
        - RandCropImage:
            size: 224
        - RandFlipImage:
            flip_code: 1
        - NormalizeImage:
            scale: 1.0/255.0
            mean: [0.485, 0.456, 0.406]
            std: [0.229, 0.224, 0.225]
            order: ''

    sampler:
      name: DistributedBatchSampler
      batch_size: 16
      drop_last: False
      shuffle: True
    loader:
      num_workers: 0
      use_shared_memory: True

  Eval:
    dataset:
      name: ImageNetDataset
      image_root: ./dataset/
      cls_label_path: ./dataset/val_list.txt
      transform_ops:
        - DecodeImage:
            to_rgb: True
            channel_first: False
        - ResizeImage:
            resize_short: 256
        - CropImage:
            size: 224
        - NormalizeImage:
            scale: 1.0/255.0
            mean: [0.485, 0.456, 0.406]
            std: [0.229, 0.224, 0.225]
            order: ''
    sampler:
      name: DistributedBatchSampler
      batch_size: 32
      drop_last: False
      shuffle: False
    loader:
      num_workers: 0
      use_shared_memory: True

Infer:
  infer_imgs: dataset/predict_demo.jpg
  batch_size: 10
  transforms:
    - DecodeImage:
        to_rgb: True
        channel_first: False
    - ResizeImage:
        resize_short: 256
    - CropImage:
        size: 224
    - NormalizeImage:
        scale: 1.0/255.0
        mean: [0.485, 0.456, 0.406]
        std: [0.229, 0.224, 0.225]
        order: ''
    - ToCHWImage:
  PostProcess:
    name: Topk
    topk: 3

Metric:
  Train:
    - TopkAcc:
        topk: [1, 3]
  Eval:
    - TopkAcc:
        topk: [1, 3]
```
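Note that with `class_num: 3`, the top-3 figure reported by `topk: [1, 3]` is always 1.0, because every class falls within the top 3; top-1 accuracy is the informative number here. The sketch below illustrates this in plain Python. It is not PaddleClas's `TopkAcc` implementation, and the function name `topk_accuracy` and the scores are made up for illustration.

```python
def topk_accuracy(scores, labels, k):
    """Fraction of samples whose true label is among the k highest scores."""
    hits = 0
    for row, label in zip(scores, labels):
        # Indices of the k largest scores in this row.
        top = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        hits += label in top
    return hits / len(labels)


# Hypothetical scores for four images over three classes.
scores = [
    [0.7, 0.2, 0.1],
    [0.1, 0.3, 0.6],
    [0.5, 0.4, 0.1],
    [0.2, 0.2, 0.6],
]
labels = [0, 1, 1, 2]

top1 = topk_accuracy(scores, labels, 1)  # two of four images are correct
top3 = topk_accuracy(scores, labels, 3)  # every label is within the top 3
print(top1, top3)
```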

### 2.4 Add the class mapping file

> PaddleClas/ppcls/configs/quick_start/new_user/label.txt

```
sunflower
rose
dandelion
```

### 2.5 Start training

```
export CUDA_VISIBLE_DEVICES=0
# Start training
python PaddleClas/tools/train.py -c ./PaddleClas/ppcls/configs/quick_start/new_user/ShuffleNetV2_x0_25.yaml
```

## 3. Model Evaluation

### 3.1 Evaluation

```
python PaddleClas/tools/eval.py -c ./PaddleClas/ppcls/configs/quick_start/new_user/ShuffleNetV2_x0_25.yaml
```

### 3.2 Prediction

```
python3 PaddleClas/tools/infer.py \
    -c ./PaddleClas/ppcls/configs/quick_start/new_user/ShuffleNetV2_x0_25.yaml \
    -o Infer.infer_imgs=dataset/predict_demo.jpg \
    -o Global.pretrained_model=output/ShuffleNetV2_x0_25/latest
```

The sample image used for prediction is:

<img src="https://ai-studio-static-online.cdn.bcebos.com/e7d6cabc46434205891cfc0c125b8dcec511e622469c49a5b8ec48051f7dd997" width="50%" height="50%">

The prediction result is:

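> {'class_ids': [0, 1, 2], 'scores': [0.89812, 0.09476, 0.00712], 'file_name': 'dataset/predict_demo.jpg', 'label_names': []}
>
> In other words, class 0 has the highest probability, 0.89812. Class 0 corresponds to sunflower, so the predicted result is sunflower, which is correct.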
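The `label_names` field in the printed result is empty, so the class ids have to be mapped to names by hand. A minimal sketch of that lookup follows. It assumes that line `i` of `label.txt` (counting from 0) names class id `i`, which agrees with class 0 being sunflower; the helper names `load_label_names` and `attach_label_names` are hypothetical and not part of PaddleClas.

```python
def load_label_names(lines):
    """Map class id -> name, assuming one name per line and ids starting at 0."""
    names = [line.strip() for line in lines if line.strip()]
    return dict(enumerate(names))


def attach_label_names(result, id_to_name):
    """Return a copy of a prediction dict with 'label_names' filled in."""
    named = dict(result)
    named["label_names"] = [id_to_name[i] for i in result["class_ids"]]
    return named


# Contents of label.txt from section 2.4.
label_lines = ["sunflower", "rose", "dandelion"]

# The prediction printed above.
result = {
    "class_ids": [0, 1, 2],
    "scores": [0.89812, 0.09476, 0.00712],
    "file_name": "dataset/predict_demo.jpg",
    "label_names": [],
}

named = attach_label_names(result, load_label_names(label_lines))
print(named["label_names"])
```

In practice the lines would be read from the label file, for example with `open(path, encoding="utf-8").read().splitlines()`.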

## Recommended Third-Party Tutorial on AI Studio

[Quick hands-on demo](https://aistudio.baidu.com/aistudio/projectdetail/4337003)