ModelZoo / CRNN_Paddle
Commit ed43fc11, authored May 16, 2025 by wanglch
Initial commit
Pipeline #2703 canceled
Showing 20 changed files with 2565 additions and 0 deletions
configs/rec/PP-FormuaNet/PP-FormulaNet-S.yaml (+117 -0)
configs/rec/PP-FormuaNet/PP-FormulaNet_plus-L.yaml (+122 -0)
configs/rec/PP-FormuaNet/PP-FormulaNet_plus-M.yaml (+119 -0)
configs/rec/PP-FormuaNet/PP-FormulaNet_plus-S.yaml (+120 -0)
configs/rec/PP-OCRv3/PP-OCRv3_mobile_rec.yml (+134 -0)
configs/rec/PP-OCRv3/PP-OCRv3_mobile_rec_distillation.yml (+209 -0)
configs/rec/PP-OCRv3/en_PP-OCRv3_mobile_rec.yml (+134 -0)
configs/rec/PP-OCRv3/multi_language/.gitkeep (+0 -0)
configs/rec/PP-OCRv3/multi_language/arabic_PP-OCRv3_mobile_rec.yml (+133 -0)
configs/rec/PP-OCRv3/multi_language/chinese_cht_PP-OCRv3_mobile_rec.yaml (+133 -0)
configs/rec/PP-OCRv3/multi_language/cyrillic_PP-OCRv3_mobile_rec.yml (+133 -0)
configs/rec/PP-OCRv3/multi_language/devanagari_PP-OCRv3_mobile_rec.yml (+133 -0)
configs/rec/PP-OCRv3/multi_language/japan_PP-OCRv3_mobile_rec.yml (+133 -0)
configs/rec/PP-OCRv3/multi_language/ka_PP-OCRv3_mobile_rec.yml (+133 -0)
configs/rec/PP-OCRv3/multi_language/korean_PP-OCRv3_mobile_rec.yml (+133 -0)
configs/rec/PP-OCRv3/multi_language/latin_PP-OCRv3_mobile_rec.yml (+133 -0)
configs/rec/PP-OCRv3/multi_language/ta_PP-OCRv3_mobile_rec.yml (+133 -0)
configs/rec/PP-OCRv3/multi_language/te_PP-OCRv3_mobile_rec.yml (+133 -0)
configs/rec/PP-OCRv4/PP-OCRv4_mobile_rec.yml (+140 -0)
configs/rec/PP-OCRv4/PP-OCRv4_mobile_rec_ampO2_ultra.yml (+140 -0)
Too many changes to show. To preserve performance, only 378 of 378+ files are displayed.
configs/rec/PP-FormuaNet/PP-FormulaNet-S.yaml (new file, mode 100644)

Global:
  model_name: PP-FormulaNet-S # To use static model for inference.
  use_gpu: True
  epoch_num: 20
  log_smooth_window: 10
  print_batch_step: 10
  save_model_dir: ./output/rec/pp_formulanet_s/
  save_epoch_step: 2
  # evaluation is run every 179 iterations (1 epoch)(batch_size = 56) # max_seq_len: 1024
  eval_batch_step: [0, 179]
  cal_metric_during_train: True
  pretrained_model:
  checkpoints:
  save_inference_dir:
  use_visualdl: False
  infer_img: doc/datasets/pme_demo/0000013.png
  infer_mode: False
  use_space_char: False
  rec_char_dict_path: &rec_char_dict_path ppocr/utils/dict/unimernet_tokenizer
  max_new_tokens: &max_new_tokens 1024
  input_size: &input_size [384, 384]
  save_res_path: ./output/rec/predicts_pp_formulanet_s.txt
  allow_resize_largeImg: False
  start_ema: True
  d2s_train_image_shape: [1, 384, 384]

Optimizer:
  name: AdamW
  beta1: 0.9
  beta2: 0.999
  weight_decay: 0.05
  lr:
    name: LinearWarmupCosine
    learning_rate: 0.0001

Architecture:
  model_type: rec
  algorithm: PP-FormulaNet-S
  in_channels: 3
  Transform:
  Backbone:
    name: PPHGNetV2_B4_Formula
    class_num: 1024
  Head:
    name: PPFormulaNet_Head
    max_new_tokens: *max_new_tokens
    decoder_start_token_id: 0
    decoder_ffn_dim: 1536
    decoder_hidden_size: 384
    decoder_layers: 2
    temperature: 0.2
    do_sample: False
    top_p: 0.95
    encoder_hidden_size: 2048
    is_export: False
    length_aware: True
    use_parallel: True
    parallel_step: 3

Loss:
  name: PPFormulaNet_S_Loss
  parallel_step: 3

PostProcess:
  name: UniMERNetDecode
  rec_char_dict_path: *rec_char_dict_path

Metric:
  name: LaTeXOCRMetric
  main_indicator: exp_rate
  cal_bleu_score: True

Train:
  dataset:
    name: SimpleDataSet
    data_dir: ./ocr_rec_latexocr_dataset_example
    label_file_list: ["./ocr_rec_latexocr_dataset_example/train.txt"]
    transforms:
      - UniMERNetImgDecode:
          input_size: *input_size
      - UniMERNetTrainTransform:
      - LatexImageFormat:
      - UniMERNetLabelEncode:
          rec_char_dict_path: *rec_char_dict_path
          max_seq_len: *max_new_tokens
      - KeepKeys:
          keep_keys: ['image', 'label', 'attention_mask']
  loader:
    shuffle: False
    drop_last: False
    batch_size_per_card: 14
    num_workers: 0
    collate_fn: UniMERNetCollator

Eval:
  dataset:
    name: SimpleDataSet
    data_dir: ./ocr_rec_latexocr_dataset_example
    label_file_list: ["./ocr_rec_latexocr_dataset_example/val.txt"]
    transforms:
      - UniMERNetImgDecode:
          input_size: *input_size
      - UniMERNetTestTransform:
      - LatexImageFormat:
      - UniMERNetLabelEncode:
          max_seq_len: *max_new_tokens
          rec_char_dict_path: *rec_char_dict_path
      - KeepKeys:
          keep_keys: ['image', 'label', 'attention_mask', 'filename']
  loader:
    shuffle: False
    drop_last: False
    batch_size_per_card: 30
    num_workers: 0
    collate_fn: UniMERNetCollator
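
The comment on `eval_batch_step` ties 179 iterations to one epoch at a global batch size of 56. A minimal sketch of that arithmetic follows. The card count (4, from 56 / 14 per card) and the dataset size of 10,000 samples are assumptions chosen to reproduce the comment; neither is stated in the config.

```python
import math


def iterations_per_epoch(num_samples: int, batch_size_per_card: int,
                         num_cards: int, drop_last: bool = False) -> int:
    """Number of training steps needed to pass over the dataset once."""
    global_batch = batch_size_per_card * num_cards
    if drop_last:
        # The incomplete final batch is discarded.
        return num_samples // global_batch
    # The incomplete final batch still counts as one step.
    return math.ceil(num_samples / global_batch)


# Hypothetical values: 10,000 samples and 4 cards are assumptions.
steps = iterations_per_epoch(10_000, batch_size_per_card=14, num_cards=4)
print(steps)
```

If the real dataset or card count differs, the second entry of `eval_batch_step` would need to change to keep evaluation aligned with epoch boundaries.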
configs/rec/PP-FormuaNet/PP-FormulaNet_plus-L.yaml (new file, mode 100644)

Global:
  model_name: PP-FormulaNet_plus-L # To use static model for inference.
  use_gpu: True
  epoch_num: 10
  log_smooth_window: 10
  print_batch_step: 10
  save_model_dir: ./output/rec/pp_formulanet_plus_l/
  save_epoch_step: 2
  # evaluation is run every 417 iterations (1 epoch)(batch_size = 24) # max_seq_len: 1024
  eval_batch_step: [0, 417]
  cal_metric_during_train: True
  pretrained_model:
  checkpoints:
  save_inference_dir:
  use_visualdl: False
  infer_img: doc/datasets/pme_demo/0000013.png
  infer_mode: False
  use_space_char: False
  rec_char_dict_path: &rec_char_dict_path ppocr/utils/dict/unimernet_tokenizer
  max_new_tokens: &max_new_tokens 2560
  input_size: &input_size [768, 768]
  save_res_path: ./output/rec/predicts_pp_formulanet_plus_l.txt
  allow_resize_largeImg: False
  start_ema: True
  d2s_train_image_shape: [1, 768, 768]

Optimizer:
  name: AdamW
  beta1: 0.9
  beta2: 0.999
  weight_decay: 0.05
  lr:
    name: LinearWarmupCosine
    learning_rate: 0.0001

Architecture:
  model_type: rec
  algorithm: PP-FormulaNet_plus-L
  in_channels: 3
  Transform:
  Backbone:
    name: Vary_VIT_B_Formula
    image_size: 768
    encoder_embed_dim: 768
    encoder_depth: 12
    encoder_num_heads: 12
    encoder_global_attn_indexes: [2, 5, 8, 11]
  Head:
    name: PPFormulaNet_Head
    max_new_tokens: *max_new_tokens
    decoder_start_token_id: 0
    decoder_ffn_dim: 2048
    decoder_hidden_size: 512
    decoder_layers: 8
    temperature: 0.2
    do_sample: False
    top_p: 0.95
    encoder_hidden_size: 1024
    is_export: False
    length_aware: False
    use_parallel: False
    parallel_step: 0

Loss:
  name: PPFormulaNet_L_Loss

PostProcess:
  name: UniMERNetDecode
  rec_char_dict_path: *rec_char_dict_path

Metric:
  name: LaTeXOCRMetric
  main_indicator: exp_rate
  cal_bleu_score: True

Train:
  dataset:
    name: SimpleDataSet
    data_dir: ./ocr_rec_latexocr_dataset_example
    label_file_list: ["./ocr_rec_latexocr_dataset_example/train.txt"]
    transforms:
      - UniMERNetImgDecode:
          input_size: *input_size
          random_padding: True
          random_resize: True
          random_crop: True
      - UniMERNetTrainTransform:
      - LatexImageFormat:
      - UniMERNetLabelEncode:
          rec_char_dict_path: *rec_char_dict_path
          max_seq_len: *max_new_tokens
      - KeepKeys:
          keep_keys: ['image', 'label', 'attention_mask']
  loader:
    shuffle: False
    drop_last: False
    batch_size_per_card: 3
    num_workers: 0
    collate_fn: UniMERNetCollator

Eval:
  dataset:
    name: SimpleDataSet
    data_dir: ./ocr_rec_latexocr_dataset_example
    label_file_list: ["./ocr_rec_latexocr_dataset_example/val.txt"]
    transforms:
      - UniMERNetImgDecode:
          input_size: *input_size
      - UniMERNetTestTransform:
      - LatexImageFormat:
      - UniMERNetLabelEncode:
          max_seq_len: *max_new_tokens
          rec_char_dict_path: *rec_char_dict_path
      - KeepKeys:
          keep_keys: ['image', 'label', 'attention_mask', 'filename']
  loader:
    shuffle: False
    drop_last: False
    batch_size_per_card: 10
    num_workers: 0
    collate_fn: UniMERNetCollator
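
The backbone above sets `encoder_depth: 12` with `encoder_global_attn_indexes: [2, 5, 8, 11]`. In SAM-style ViT encoders, the blocks at those indexes use global attention and the remaining blocks use windowed attention. Whether `Vary_VIT_B_Formula` follows exactly that convention is an assumption here, not something the config confirms. Under that assumption, the per-block layout can be sketched as:

```python
def attention_plan(depth, global_attn_indexes):
    """Label each encoder block as 'global' or 'windowed' (assumed semantics)."""
    global_set = set(global_attn_indexes)
    return ["global" if i in global_set else "windowed" for i in range(depth)]


plan = attention_plan(12, [2, 5, 8, 11])
for i, kind in enumerate(plan):
    print(f"block {i:2d}: {kind}")
```

With these values every third block would attend globally, which keeps most of the encoder's attention cost local at a 768 x 768 input.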
configs/rec/PP-FormuaNet/PP-FormulaNet_plus-M.yaml (new file, mode 100644)

Global:
  model_name: PP-FormulaNet_plus-M # To use static model for inference.
  use_gpu: True
  epoch_num: 20
  log_smooth_window: 10
  print_batch_step: 10
  save_model_dir: ./output/rec/pp_formulanet_plus_m/
  save_epoch_step: 2
  # evaluation is run every 179 iterations (1 epoch)(batch_size = 56) # max_seq_len: 1024
  eval_batch_step: [0, 179]
  cal_metric_during_train: True
  pretrained_model:
  checkpoints:
  save_inference_dir:
  use_visualdl: False
  infer_img: doc/datasets/pme_demo/0000013.png
  infer_mode: False
  use_space_char: False
  rec_char_dict_path: &rec_char_dict_path ppocr/utils/dict/unimernet_tokenizer
  max_new_tokens: &max_new_tokens 2560
  input_size: &input_size [384, 384]
  save_res_path: ./output/rec/predicts_pp_formulanet_plus_m.txt
  allow_resize_largeImg: False
  start_ema: True
  d2s_train_image_shape: [1, 384, 384]

Optimizer:
  name: AdamW
  beta1: 0.9
  beta2: 0.999
  weight_decay: 0.05
  lr:
    name: LinearWarmupCosine
    learning_rate: 0.0001

Architecture:
  model_type: rec
  algorithm: PP-FormulaNet_plus-M
  in_channels: 3
  Transform:
  Backbone:
    name: PPHGNetV2_B6_Formula
    class_num: 1024
  Head:
    name: PPFormulaNet_Head
    max_new_tokens: *max_new_tokens
    decoder_start_token_id: 0
    decoder_ffn_dim: 2048
    decoder_hidden_size: 512
    decoder_layers: 6
    temperature: 0.2
    do_sample: False
    top_p: 0.95
    encoder_hidden_size: 2048
    is_export: False
    length_aware: False
    use_parallel: False
    parallel_step: 0

Loss:
  name: PPFormulaNet_L_Loss

PostProcess:
  name: UniMERNetDecode
  rec_char_dict_path: *rec_char_dict_path

Metric:
  name: LaTeXOCRMetric
  main_indicator: exp_rate
  cal_bleu_score: True

Train:
  dataset:
    name: SimpleDataSet
    data_dir: ./ocr_rec_latexocr_dataset_example
    label_file_list: ["./ocr_rec_latexocr_dataset_example/train.txt"]
    transforms:
      - UniMERNetImgDecode:
          input_size: *input_size
          random_padding: True
          random_resize: True
          random_crop: True
      - UniMERNetTrainTransform:
      - LatexImageFormat:
      - UniMERNetLabelEncode:
          rec_char_dict_path: *rec_char_dict_path
          max_seq_len: *max_new_tokens
      - KeepKeys:
          keep_keys: ['image', 'label', 'attention_mask']
  loader:
    shuffle: False
    drop_last: False
    batch_size_per_card: 14
    num_workers: 0
    collate_fn: UniMERNetCollator

Eval:
  dataset:
    name: SimpleDataSet
    data_dir: ./ocr_rec_latexocr_dataset_example
    label_file_list: ["./ocr_rec_latexocr_dataset_example/val.txt"]
    transforms:
      - UniMERNetImgDecode:
          input_size: *input_size
      - UniMERNetTestTransform:
      - LatexImageFormat:
      - UniMERNetLabelEncode:
          max_seq_len: *max_new_tokens
          rec_char_dict_path: *rec_char_dict_path
      - KeepKeys:
          keep_keys: ['image', 'label', 'attention_mask', 'filename']
  loader:
    shuffle: False
    drop_last: False
    batch_size_per_card: 30
    num_workers: 0
    collate_fn: UniMERNetCollator
configs/rec/PP-FormuaNet/PP-FormulaNet_plus-S.yaml (new file, mode 100644)

Global:
  model_name: PP-FormulaNet_plus-S # To use static model for inference.
  use_gpu: True
  epoch_num: 20
  log_smooth_window: 10
  print_batch_step: 10
  save_model_dir: ./output/rec/pp_formulanet_plus_s/
  save_epoch_step: 2
  # evaluation is run every 179 iterations (1 epoch)(batch_size = 56) # max_seq_len: 1024
  eval_batch_step: [0, 179]
  cal_metric_during_train: True
  pretrained_model:
  checkpoints:
  save_inference_dir:
  use_visualdl: False
  infer_img: doc/datasets/pme_demo/0000013.png
  infer_mode: False
  use_space_char: False
  rec_char_dict_path: &rec_char_dict_path ppocr/utils/dict/unimernet_tokenizer
  max_new_tokens: &max_new_tokens 1024
  input_size: &input_size [384, 384]
  save_res_path: ./output/rec/predicts_pp_formulanet_plus_s.txt
  allow_resize_largeImg: False
  start_ema: True
  d2s_train_image_shape: [1, 384, 384]

Optimizer:
  name: AdamW
  beta1: 0.9
  beta2: 0.999
  weight_decay: 0.05
  lr:
    name: LinearWarmupCosine
    learning_rate: 0.0001

Architecture:
  model_type: rec
  algorithm: PP-FormulaNet_plus-S
  in_channels: 3
  Transform:
  Backbone:
    name: PPHGNetV2_B4_Formula
    class_num: 1024
  Head:
    name: PPFormulaNet_Head
    max_new_tokens: *max_new_tokens
    decoder_start_token_id: 0
    decoder_ffn_dim: 1536
    decoder_hidden_size: 384
    decoder_layers: 2
    temperature: 0.2
    do_sample: False
    top_p: 0.95
    encoder_hidden_size: 2048
    is_export: False
    length_aware: True
    use_parallel: True
    parallel_step: 3

Loss:
  name: PPFormulaNet_S_Loss
  parallel_step: 3

PostProcess:
  name: UniMERNetDecode
  rec_char_dict_path: *rec_char_dict_path

Metric:
  name: LaTeXOCRMetric
  main_indicator: exp_rate
  cal_bleu_score: True

Train:
  dataset:
    name: SimpleDataSet
    data_dir: ./ocr_rec_latexocr_dataset_example
    label_file_list: ["./ocr_rec_latexocr_dataset_example/train.txt"]
    transforms:
      - UniMERNetImgDecode:
          input_size: *input_size
          random_padding: True
          random_resize: True
          random_crop: True
      - UniMERNetTrainTransform:
      - LatexImageFormat:
      - UniMERNetLabelEncode:
          rec_char_dict_path: *rec_char_dict_path
          max_seq_len: *max_new_tokens
      - KeepKeys:
          keep_keys: ['image', 'label', 'attention_mask']
  loader:
    shuffle: False
    drop_last: False
    batch_size_per_card: 14
    num_workers: 0
    collate_fn: UniMERNetCollator

Eval:
  dataset:
    name: SimpleDataSet
    data_dir: ./ocr_rec_latexocr_dataset_example
    label_file_list: ["./ocr_rec_latexocr_dataset_example/val.txt"]
    transforms:
      - UniMERNetImgDecode:
          input_size: *input_size
      - UniMERNetTestTransform:
      - LatexImageFormat:
      - UniMERNetLabelEncode:
          max_seq_len: *max_new_tokens
          rec_char_dict_path: *rec_char_dict_path
      - KeepKeys:
          keep_keys: ['image', 'label', 'attention_mask', 'filename']
  loader:
    shuffle: False
    drop_last: False
    batch_size_per_card: 30
    num_workers: 0
    collate_fn: UniMERNetCollator
configs/rec/PP-OCRv3/PP-OCRv3_mobile_rec.yml (new file, mode 100644)

Global:
  model_name: PP-OCRv3_mobile_rec # To use static model for inference.
  debug: false
  use_gpu: true
  epoch_num: 500
  log_smooth_window: 20
  print_batch_step: 10
  save_model_dir: ./output/rec_ppocr_v3
  save_epoch_step: 3
  eval_batch_step: [0, 2000]
  cal_metric_during_train: true
  pretrained_model:
  checkpoints:
  save_inference_dir:
  use_visualdl: false
  infer_img: doc/imgs_words/ch/word_1.jpg
  character_dict_path: ppocr/utils/ppocr_keys_v1.txt
  max_text_length: &max_text_length 25
  infer_mode: false
  use_space_char: true
  distributed: true
  save_res_path: ./output/rec/predicts_ppocrv3.txt
  d2s_train_image_shape: [3, 48, 320]

Optimizer:
  name: Adam
  beta1: 0.9
  beta2: 0.999
  lr:
    name: Cosine
    learning_rate: 0.001
    warmup_epoch: 5
  regularizer:
    name: L2
    factor: 3.0e-05

Architecture:
  model_type: rec
  algorithm: SVTR_LCNet
  Transform:
  Backbone:
    name: MobileNetV1Enhance
    scale: 0.5
    last_conv_stride: [1, 2]
    last_pool_type: avg
    last_pool_kernel_size: [2, 2]
  Head:
    name: MultiHead
    head_list:
      - CTCHead:
          Neck:
            name: svtr
            dims: 64
            depth: 2
            hidden_dims: 120
            use_guide: True
          Head:
            fc_decay: 0.00001
      - SARHead:
          enc_dim: 512
          max_text_length: *max_text_length

Loss:
  name: MultiLoss
  loss_config_list:
    - CTCLoss:
    - SARLoss:

PostProcess:
  name: CTCLabelDecode

Metric:
  name: RecMetric
  main_indicator: acc
  ignore_space: False

Train:
  dataset:
    name: SimpleDataSet
    data_dir: ./train_data/
    ext_op_transform_idx: 1
    label_file_list:
    - ./train_data/train_list.txt
    transforms:
    - DecodeImage:
        img_mode: BGR
        channel_first: false
    - RecConAug:
        prob: 0.5
        ext_data_num: 2
        image_shape: [48, 320, 3]
        max_text_length: *max_text_length
    - RecAug:
    - MultiLabelEncode:
    - RecResizeImg:
        image_shape: [3, 48, 320]
    - KeepKeys:
        keep_keys:
        - image
        - label_ctc
        - label_sar
        - length
        - valid_ratio
  loader:
    shuffle: true
    batch_size_per_card: 128
    drop_last: true
    num_workers: 4

Eval:
  dataset:
    name: SimpleDataSet
    data_dir: ./train_data
    label_file_list:
    - ./train_data/val_list.txt
    transforms:
    - DecodeImage:
        img_mode: BGR
        channel_first: false
    - MultiLabelEncode:
    - RecResizeImg:
        image_shape: [3, 48, 320]
    - KeepKeys:
        keep_keys:
        - image
        - label_ctc
        - label_sar
        - length
        - valid_ratio
  loader:
    shuffle: false
    drop_last: false
    batch_size_per_card: 128
    num_workers: 4
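
The optimizer above uses a `Cosine` learning-rate schedule with `learning_rate: 0.001`, `warmup_epoch: 5`, and `epoch_num: 500`. A minimal sketch of what such a schedule looks like at epoch granularity is given below. It is a generic linear-warmup-plus-cosine formulation, not the framework's implementation, which typically updates per iteration and may differ in its warmup start value and cosine period.

```python
import math


def cosine_lr_with_warmup(epoch, base_lr=0.001, warmup_epoch=5, epoch_num=500):
    """Generic linear warmup followed by cosine decay (illustrative only)."""
    if epoch < warmup_epoch:
        # Ramp linearly so the last warmup epoch reaches base_lr.
        return base_lr * (epoch + 1) / warmup_epoch
    # Fraction of the post-warmup training that has elapsed, in [0, 1].
    progress = (epoch - warmup_epoch) / max(1, epoch_num - warmup_epoch)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))


for e in (0, 4, 5, 250, 499):
    print(e, cosine_lr_with_warmup(e))
```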
configs/rec/PP-OCRv3/PP-OCRv3_mobile_rec_distillation.yml (new file, mode 100644)

Global:
  debug: false
  use_gpu: true
  epoch_num: 800
  log_smooth_window: 20
  print_batch_step: 10
  save_model_dir: ./output/rec_ppocr_v3_distillation
  save_epoch_step: 3
  eval_batch_step: [0, 2000]
  cal_metric_during_train: true
  pretrained_model:
  checkpoints:
  save_inference_dir:
  use_visualdl: false
  infer_img: doc/imgs_words/ch/word_1.jpg
  character_dict_path: ppocr/utils/ppocr_keys_v1.txt
  max_text_length: &max_text_length 25
  infer_mode: false
  use_space_char: true
  distributed: true
  save_res_path: ./output/rec/predicts_ppocrv3_distillation.txt
  d2s_train_image_shape: [3, 48, -1]

Optimizer:
  name: Adam
  beta1: 0.9
  beta2: 0.999
  lr:
    name: Piecewise
    decay_epochs: [700]
    values: [0.0005, 0.00005]
    warmup_epoch: 5
  regularizer:
    name: L2
    factor: 3.0e-05

Architecture:
  model_type: &model_type "rec"
  name: DistillationModel
  algorithm: Distillation
  Models:
    Teacher:
      pretrained:
      freeze_params: false
      return_all_feats: true
      model_type: *model_type
      algorithm: SVTR_LCNet
      Transform:
      Backbone:
        name: MobileNetV1Enhance
        scale: 0.5
        last_conv_stride: [1, 2]
        last_pool_type: avg
        last_pool_kernel_size: [2, 2]
      Head:
        name: MultiHead
        head_list:
          - CTCHead:
              Neck:
                name: svtr
                dims: 64
                depth: 2
                hidden_dims: 120
                use_guide: True
              Head:
                fc_decay: 0.00001
          - SARHead:
              enc_dim: 512
              max_text_length: *max_text_length
    Student:
      pretrained:
      freeze_params: false
      return_all_feats: true
      model_type: *model_type
      algorithm: SVTR_LCNet
      Transform:
      Backbone:
        name: MobileNetV1Enhance
        scale: 0.5
        last_conv_stride: [1, 2]
        last_pool_type: avg
        last_pool_kernel_size: [2, 2]
      Head:
        name: MultiHead
        head_list:
          - CTCHead:
              Neck:
                name: svtr
                dims: 64
                depth: 2
                hidden_dims: 120
                use_guide: True
              Head:
                fc_decay: 0.00001
          - SARHead:
              enc_dim: 512
              max_text_length: *max_text_length

Loss:
  name: CombinedLoss
  loss_config_list:
  - DistillationDMLLoss:
      weight: 1.0
      act: "softmax"
      use_log: true
      model_name_pairs:
      - ["Student", "Teacher"]
      key: head_out
      multi_head: True
      dis_head: ctc
      name: dml_ctc
  - DistillationDMLLoss:
      weight: 0.5
      act: "softmax"
      use_log: true
      model_name_pairs:
      - ["Student", "Teacher"]
      key: head_out
      multi_head: True
      dis_head: sar
      name: dml_sar
  - DistillationDistanceLoss:
      weight: 1.0
      mode: "l2"
      model_name_pairs:
      - ["Student", "Teacher"]
      key: backbone_out
  - DistillationCTCLoss:
      weight: 1.0
      model_name_list: ["Student", "Teacher"]
      key: head_out
      multi_head: True
  - DistillationSARLoss:
      weight: 1.0
      model_name_list: ["Student", "Teacher"]
      key: head_out
      multi_head: True

PostProcess:
  name: DistillationCTCLabelDecode
  model_name: ["Student", "Teacher"]
  key: head_out
  multi_head: True

Metric:
  name: DistillationMetric
  base_metric_name: RecMetric
  main_indicator: acc
  key: "Student"
  ignore_space: False

Train:
  dataset:
    name: SimpleDataSet
    data_dir: ./train_data/
    ext_op_transform_idx: 1
    label_file_list:
    - ./train_data/train_list.txt
    transforms:
    - DecodeImage:
        img_mode: BGR
        channel_first: false
    - RecConAug:
        prob: 0.5
        ext_data_num: 2
        image_shape: [48, 320, 3]
        max_text_length: *max_text_length
    - RecAug:
    - MultiLabelEncode:
    - RecResizeImg:
        image_shape: [3, 48, 320]
    - KeepKeys:
        keep_keys:
        - image
        - label_ctc
        - label_sar
        - length
        - valid_ratio
  loader:
    shuffle: true
    batch_size_per_card: 128
    drop_last: true
    num_workers: 4

Eval:
  dataset:
    name: SimpleDataSet
    data_dir: ./train_data
    label_file_list:
    - ./train_data/val_list.txt
    transforms:
    - DecodeImage:
        img_mode: BGR
        channel_first: false
    - MultiLabelEncode:
    - RecResizeImg:
        image_shape: [3, 48, 320]
    - KeepKeys:
        keep_keys:
        - image
        - label_ctc
        - label_sar
        - length
        - valid_ratio
  loader:
    shuffle: false
    drop_last: false
    batch_size_per_card: 128
    num_workers: 4
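
The distillation config replaces the cosine schedule with `Piecewise`, using `decay_epochs: [700]` and `values: [0.0005, 0.00005]`. The lookup this implies can be sketched as below. The sketch ignores `warmup_epoch` and works per epoch, which is a simplification, and the function name is hypothetical.

```python
import bisect


def piecewise_lr(epoch, decay_epochs=(700,), values=(0.0005, 0.00005)):
    """Return values[i], where i counts the boundaries at or before `epoch`."""
    # One more value than boundaries: values[0] applies before the first one.
    assert len(values) == len(decay_epochs) + 1
    return values[bisect.bisect_right(decay_epochs, epoch)]


print(piecewise_lr(0))    # first segment
print(piecewise_lr(699))  # still the first segment
print(piecewise_lr(700))  # second segment starts at the boundary
```

With `epoch_num: 800`, this leaves the last 100 epochs at the lower rate.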
configs/rec/PP-OCRv3/en_PP-OCRv3_mobile_rec.yml (new file, mode 100644)

Global:
  model_name: en_PP-OCRv3_mobile_rec # To use static model for inference.
  debug: false
  use_gpu: true
  epoch_num: 500
  log_smooth_window: 20
  print_batch_step: 10
  save_model_dir: ./output/v3_en_mobile
  save_epoch_step: 3
  eval_batch_step: [0, 2000]
  cal_metric_during_train: true
  pretrained_model:
  checkpoints:
  save_inference_dir:
  use_visualdl: false
  infer_img: doc/imgs_words/ch/word_1.jpg
  character_dict_path: ppocr/utils/en_dict.txt
  max_text_length: &max_text_length 25
  infer_mode: false
  use_space_char: true
  distributed: true
  save_res_path: ./output/rec/predicts_ppocrv3_en.txt

Optimizer:
  name: Adam
  beta1: 0.9
  beta2: 0.999
  lr:
    name: Cosine
    learning_rate: 0.001
    warmup_epoch: 5
  regularizer:
    name: L2
    factor: 3.0e-05

Architecture:
  model_type: rec
  algorithm: SVTR_LCNet
  Transform:
  Backbone:
    name: MobileNetV1Enhance
    scale: 0.5
    last_conv_stride: [1, 2]
    last_pool_type: avg
    last_pool_kernel_size: [2, 2]
  Head:
    name: MultiHead
    head_list:
      - CTCHead:
          Neck:
            name: svtr
            dims: 64
            depth: 2
            hidden_dims: 120
            use_guide: True
          Head:
            fc_decay: 0.00001
      - SARHead:
          enc_dim: 512
          max_text_length: *max_text_length

Loss:
  name: MultiLoss
  loss_config_list:
    - CTCLoss:
    - SARLoss:

PostProcess:
  name: CTCLabelDecode

Metric:
  name: RecMetric
  main_indicator: acc
  ignore_space: False

Train:
  dataset:
    name: SimpleDataSet
    data_dir: ./train_data/
    ext_op_transform_idx: 1
    label_file_list:
    - ./train_data/train_list.txt
    transforms:
    - DecodeImage:
        img_mode: BGR
        channel_first: false
    - RecConAug:
        prob: 0.5
        ext_data_num: 2
        image_shape: [48, 320, 3]
        max_text_length: *max_text_length
    - RecAug:
    - MultiLabelEncode:
    - RecResizeImg:
        image_shape: [3, 48, 320]
    - KeepKeys:
        keep_keys:
        - image
        - label_ctc
        - label_sar
        - length
        - valid_ratio
  loader:
    shuffle: true
    batch_size_per_card: 128
    drop_last: true
    num_workers: 4

Eval:
  dataset:
    name: SimpleDataSet
    data_dir: ./train_data
    label_file_list:
    - ./train_data/val_list.txt
    transforms:
    - DecodeImage:
        img_mode: BGR
        channel_first: false
    - MultiLabelEncode:
    - RecResizeImg:
        image_shape: [3, 48, 320]
    - KeepKeys:
        keep_keys:
        - image
        - label_ctc
        - label_sar
        - length
        - valid_ratio
  loader:
    shuffle: false
    drop_last: false
    batch_size_per_card: 128
    num_workers: 4
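
The `PostProcess` section selects `CTCLabelDecode`, which turns per-timestep class predictions into text. The usual greedy CTC rule (collapse consecutive repeats, then drop blanks) can be sketched as follows. Placing the blank at index 0 and the tiny character set are assumptions for illustration; the real decoder reads its characters from `character_dict_path`.

```python
def ctc_greedy_decode(indices, charset, blank=0):
    """Collapse repeated indices and remove blanks (greedy CTC decoding)."""
    chars = []
    prev = None
    for idx in indices:
        # Emit a character only when it is not blank and differs from
        # the previous timestep's index.
        if idx != blank and idx != prev:
            chars.append(charset[idx])
        prev = idx
    return "".join(chars)


# Hypothetical character set with the blank symbol at index 0.
charset = ["<blank>", "c", "a", "t"]
print(ctc_greedy_decode([1, 1, 0, 2, 2, 0, 0, 3], charset))
print(ctc_greedy_decode([3, 0, 3], charset))
```

The second call shows why the blank exists: it separates two genuine occurrences of the same character that would otherwise be collapsed into one.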
configs/rec/PP-OCRv3/multi_language/.gitkeep (new file, mode 100644, empty)
configs/rec/PP-OCRv3/multi_language/arabic_PP-OCRv3_mobile_rec.yml (new file, mode 100644)

Global:
  model_name: arabic_PP-OCRv3_mobile_rec # To use static model for inference.
  debug: false
  use_gpu: true
  epoch_num: 500
  log_smooth_window: 20
  print_batch_step: 10
  save_model_dir: ./output/v3_arabic_mobile
  save_epoch_step: 3
  eval_batch_step: [0, 2000]
  cal_metric_during_train: true
  pretrained_model:
  checkpoints:
  save_inference_dir:
  use_visualdl: false
  infer_img: ./doc/imgs_words/arabic/ar_2.jpg
  character_dict_path: ppocr/utils/dict/arabic_dict.txt
  max_text_length: &max_text_length 25
  infer_mode: false
  use_space_char: true
  distributed: true
  save_res_path: ./output/rec/predicts_ppocrv3_arabic.txt

Optimizer:
  name: Adam
  beta1: 0.9
  beta2: 0.999
  lr:
    name: Cosine
    learning_rate: 0.001
    warmup_epoch: 5
  regularizer:
    name: L2
    factor: 3.0e-05

Architecture:
  model_type: rec
  algorithm: SVTR_LCNet
  Transform:
  Backbone:
    name: MobileNetV1Enhance
    scale: 0.5
    last_conv_stride: [1, 2]
    last_pool_type: avg
    last_pool_kernel_size: [2, 2]
  Head:
    name: MultiHead
    head_list:
      - CTCHead:
          Neck:
            name: svtr
            dims: 64
            depth: 2
            hidden_dims: 120
            use_guide: True
          Head:
            fc_decay: 0.00001
      - SARHead:
          enc_dim: 512
          max_text_length: *max_text_length

Loss:
  name: MultiLoss
  loss_config_list:
    - CTCLoss:
    - SARLoss:

PostProcess:
  name: CTCLabelDecode

Metric:
  name: RecMetric
  main_indicator: acc
  ignore_space: False

Train:
  dataset:
    name: SimpleDataSet
    data_dir: ./train_data/
    ext_op_transform_idx: 1
    label_file_list:
    - ./train_data/train_list.txt
    transforms:
    - DecodeImage:
        img_mode: BGR
        channel_first: false
    - RecConAug:
        prob: 0.5
        ext_data_num: 2
        image_shape: [48, 320, 3]
    - RecAug:
    - MultiLabelEncode:
    - RecResizeImg:
        image_shape: [3, 48, 320]
    - KeepKeys:
        keep_keys:
        - image
        - label_ctc
        - label_sar
        - length
        - valid_ratio
  loader:
    shuffle: true
    batch_size_per_card: 128
    drop_last: true
    num_workers: 4

Eval:
  dataset:
    name: SimpleDataSet
    data_dir: ./train_data
    label_file_list:
    - ./train_data/val_list.txt
    transforms:
    - DecodeImage:
        img_mode: BGR
        channel_first: false
    - MultiLabelEncode:
    - RecResizeImg:
        image_shape: [3, 48, 320]
    - KeepKeys:
        keep_keys:
        - image
        - label_ctc
        - label_sar
        - length
        - valid_ratio
  loader:
    shuffle: false
    drop_last: false
    batch_size_per_card: 128
    num_workers: 4
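
`RecResizeImg` is configured with `image_shape: [3, 48, 320]`, and the kept keys include `valid_ratio`. A common way to reach a fixed shape for text-line images is to resize to the target height while keeping the aspect ratio, cap the width, and pad the remainder. The sketch below shows only that arithmetic; it is an assumption about the operator's behavior, not a copy of it, and the function name is hypothetical.

```python
import math


def resize_plan(src_h, src_w, target_h=48, max_w=320):
    """Return (resized_width, right_padding, valid_ratio) for one image."""
    ratio = src_w / src_h
    # Keep the aspect ratio at the target height, but never exceed max_w.
    resized_w = min(max_w, math.ceil(target_h * ratio))
    padding = max_w - resized_w
    # Fraction of the final width that holds real image content.
    valid_ratio = min(1.0, resized_w / max_w)
    return resized_w, padding, valid_ratio


print(resize_plan(32, 100))  # short line: resized, then padded on the right
print(resize_plan(32, 400))  # long line: width capped at max_w, no padding
```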
configs/rec/PP-OCRv3/multi_language/chinese_cht_PP-OCRv3_mobile_rec.yaml (new file, mode 100644)

Global:
  model_name: chinese_cht_PP-OCRv3_mobile_rec # To use static model for inference.
  debug: false
  use_gpu: true
  epoch_num: 500
  log_smooth_window: 20
  print_batch_step: 10
  save_model_dir: ./output/v3_chinese_cht_mobile
  save_epoch_step: 3
  eval_batch_step: [0, 2000]
  cal_metric_during_train: true
  pretrained_model:
  checkpoints:
  save_inference_dir:
  use_visualdl: false
  infer_img: doc/imgs_words/ch/word_1.jpg
  character_dict_path: ppocr/utils/dict/chinese_cht_dict.txt
  max_text_length: &max_text_length 25
  infer_mode: false
  use_space_char: true
  distributed: true
  save_res_path: ./output/rec/predicts_ppocrv3_chinese_cht.txt

Optimizer:
  name: Adam
  beta1: 0.9
  beta2: 0.999
  lr:
    name: Cosine
    learning_rate: 0.001
    warmup_epoch: 5
  regularizer:
    name: L2
    factor: 3.0e-05

Architecture:
  model_type: rec
  algorithm: SVTR_LCNet
  Transform:
  Backbone:
    name: MobileNetV1Enhance
    scale: 0.5
    last_conv_stride: [1, 2]
    last_pool_type: avg
    last_pool_kernel_size: [2, 2]
  Head:
    name: MultiHead
    head_list:
      - CTCHead:
          Neck:
            name: svtr
            dims: 64
            depth: 2
            hidden_dims: 120
            use_guide: True
          Head:
            fc_decay: 0.00001
      - SARHead:
          enc_dim: 512
          max_text_length: *max_text_length

Loss:
  name: MultiLoss
  loss_config_list:
    - CTCLoss:
    - SARLoss:

PostProcess:
  name: CTCLabelDecode

Metric:
  name: RecMetric
  main_indicator: acc
  ignore_space: False

Train:
  dataset:
    name: SimpleDataSet
    data_dir: ./train_data/
    ext_op_transform_idx: 1
    label_file_list:
    - ./train_data/train_list.txt
    transforms:
    - DecodeImage:
        img_mode: BGR
        channel_first: false
    - RecConAug:
        prob: 0.5
        ext_data_num: 2
        image_shape: [48, 320, 3]
    - RecAug:
    - MultiLabelEncode:
    - RecResizeImg:
        image_shape: [3, 48, 320]
    - KeepKeys:
        keep_keys:
        - image
        - label_ctc
        - label_sar
        - length
        - valid_ratio
  loader:
    shuffle: true
    batch_size_per_card: 128
    drop_last: true
    num_workers: 4

Eval:
  dataset:
    name: SimpleDataSet
    data_dir: ./train_data
    label_file_list:
    - ./train_data/val_list.txt
    transforms:
    - DecodeImage:
        img_mode: BGR
        channel_first: false
    - MultiLabelEncode:
    - RecResizeImg:
        image_shape: [3, 48, 320]
    - KeepKeys:
        keep_keys:
        - image
        - label_ctc
        - label_sar
        - length
        - valid_ratio
  loader:
    shuffle: false
    drop_last: false
    batch_size_per_card: 128
    num_workers: 4
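
These PP-OCRv3 configs set `eval_batch_step: [0, 2000]`, a pair that reads as a start iteration and an interval. A minimal sketch of the resulting trigger is shown below. The exact comparison used by the training loop (for example whether the start step itself triggers an evaluation) is an assumption here.

```python
def should_evaluate(global_step, eval_batch_step=(0, 2000)):
    """True when an evaluation run is due at this training iteration."""
    start, interval = eval_batch_step
    # Assumed rule: evaluate every `interval` steps once past `start`.
    return global_step > start and (global_step - start) % interval == 0


due = [step for step in range(0, 6001) if should_evaluate(step)]
print(due)
```

Unlike the PP-FormulaNet configs, where the interval is matched to one epoch, a fixed interval of 2000 iterations is independent of dataset size.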
configs/rec/PP-OCRv3/multi_language/cyrillic_PP-OCRv3_mobile_rec.yml (new file, mode 100644)

Global:
  model_name: cyrillic_PP-OCRv3_mobile_rec # To use static model for inference.
  debug: false
  use_gpu: true
  epoch_num: 500
  log_smooth_window: 20
  print_batch_step: 10
  save_model_dir: ./output/v3_cyrillic_mobile
  save_epoch_step: 3
  eval_batch_step: [0, 2000]
  cal_metric_during_train: true
  pretrained_model:
  checkpoints:
  save_inference_dir:
  use_visualdl: false
  infer_img: doc/imgs_words/ch/word_1.jpg
  character_dict_path: ppocr/utils/dict/cyrillic_dict.txt
  max_text_length: &max_text_length 25
  infer_mode: false
  use_space_char: true
  distributed: true
  save_res_path: ./output/rec/predicts_ppocrv3_cyrillic.txt

Optimizer:
  name: Adam
  beta1: 0.9
  beta2: 0.999
  lr:
    name: Cosine
    learning_rate: 0.001
    warmup_epoch: 5
  regularizer:
    name: L2
    factor: 3.0e-05

Architecture:
  model_type: rec
  algorithm: SVTR_LCNet
  Transform:
  Backbone:
    name: MobileNetV1Enhance
    scale: 0.5
    last_conv_stride: [1, 2]
    last_pool_type: avg
    last_pool_kernel_size: [2, 2]
  Head:
    name: MultiHead
    head_list:
      - CTCHead:
          Neck:
            name: svtr
            dims: 64
            depth: 2
            hidden_dims: 120
            use_guide: True
          Head:
            fc_decay: 0.00001
      - SARHead:
          enc_dim: 512
          max_text_length: *max_text_length

Loss:
  name: MultiLoss
  loss_config_list:
    - CTCLoss:
    - SARLoss:

PostProcess:
  name: CTCLabelDecode

Metric:
  name: RecMetric
  main_indicator: acc
  ignore_space: False

Train:
  dataset:
    name: SimpleDataSet
    data_dir: ./train_data/
    ext_op_transform_idx: 1
    label_file_list:
    - ./train_data/train_list.txt
    transforms:
    - DecodeImage:
        img_mode: BGR
        channel_first: false
    - RecConAug:
        prob: 0.5
        ext_data_num: 2
        image_shape: [48, 320, 3]
    - RecAug:
    - MultiLabelEncode:
    - RecResizeImg:
        image_shape: [3, 48, 320]
    - KeepKeys:
        keep_keys:
        - image
        - label_ctc
        - label_sar
        - length
        - valid_ratio
  loader:
    shuffle: true
    batch_size_per_card: 128
    drop_last: true
    num_workers: 4

Eval:
  dataset:
    name: SimpleDataSet
    data_dir: ./train_data
    label_file_list:
    - ./train_data/val_list.txt
    transforms:
    - DecodeImage:
        img_mode: BGR
        channel_first: false
    - MultiLabelEncode:
    - RecResizeImg:
        image_shape: [3, 48, 320]
    - KeepKeys:
        keep_keys:
        - image
        - label_ctc
        - label_sar
        - length
        - valid_ratio
  loader:
    shuffle: false
    drop_last: false
    batch_size_per_card: 128
    num_workers: 4
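
Each transform pipeline ends with `KeepKeys`, which names the fields that survive into the batch: `image`, `label_ctc`, `label_sar`, `length`, `valid_ratio`. A minimal sketch of such a final step follows, assuming each sample is a dict and the output is an ordered list. Both the dict layout and the placeholder values are hypothetical.

```python
def keep_keys(sample, keys):
    """Project a sample dict onto `keys`, preserving the configured order."""
    return [sample[k] for k in keys]


# Hypothetical sample as it might look after the earlier transforms.
sample = {
    "image": "<CHW array>",
    "label": "text",
    "label_ctc": [5, 9, 2],
    "label_sar": [1, 5, 9, 2, 0],
    "length": 3,
    "valid_ratio": 0.47,
    "img_path": "word_1.jpg",
}
kept = keep_keys(sample, ["image", "label_ctc", "label_sar", "length", "valid_ratio"])
print(kept)
```

Because the result is positional, the order of `keep_keys` in the config would matter to whatever consumes the batch.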
configs/rec/PP-OCRv3/multi_language/devanagari_PP-OCRv3_mobile_rec.yml
0 → 100644
View file @
ed43fc11
Global:
  model_name: devanagari_PP-OCRv3_mobile_rec # To use static model for inference.
  debug: false
  use_gpu: true
  epoch_num: 500
  log_smooth_window: 20
  print_batch_step: 10
  save_model_dir: ./output/v3_devanagari_mobile
  save_epoch_step: 3
  eval_batch_step: [0, 2000]
  cal_metric_during_train: true
  pretrained_model:
  checkpoints:
  save_inference_dir:
  use_visualdl: false
  infer_img: doc/imgs_words/ch/word_1.jpg
  character_dict_path: ppocr/utils/dict/devanagari_dict.txt
  max_text_length: &max_text_length 25
  infer_mode: false
  use_space_char: true
  distributed: true
  save_res_path: ./output/rec/predicts_ppocrv3_devanagari.txt

Optimizer:
  name: Adam
  beta1: 0.9
  beta2: 0.999
  lr:
    name: Cosine
    learning_rate: 0.001
    warmup_epoch: 5
  regularizer:
    name: L2
    factor: 3.0e-05

Architecture:
  model_type: rec
  algorithm: SVTR_LCNet
  Transform:
  Backbone:
    name: MobileNetV1Enhance
    scale: 0.5
    last_conv_stride: [1, 2]
    last_pool_type: avg
    last_pool_kernel_size: [2, 2]
  Head:
    name: MultiHead
    head_list:
      - CTCHead:
          Neck:
            name: svtr
            dims: 64
            depth: 2
            hidden_dims: 120
            use_guide: True
          Head:
            fc_decay: 0.00001
      - SARHead:
          enc_dim: 512
          max_text_length: *max_text_length

Loss:
  name: MultiLoss
  loss_config_list:
    - CTCLoss:
    - SARLoss:

PostProcess:
  name: CTCLabelDecode

Metric:
  name: RecMetric
  main_indicator: acc
  ignore_space: False

Train:
  dataset:
    name: SimpleDataSet
    data_dir: ./train_data/
    ext_op_transform_idx: 1
    label_file_list:
    - ./train_data/train_list.txt
    transforms:
    - DecodeImage:
        img_mode: BGR
        channel_first: false
    - RecConAug:
        prob: 0.5
        ext_data_num: 2
        image_shape: [48, 320, 3]
    - RecAug:
    - MultiLabelEncode:
    - RecResizeImg:
        image_shape: [3, 48, 320]
    - KeepKeys:
        keep_keys:
        - image
        - label_ctc
        - label_sar
        - length
        - valid_ratio
  loader:
    shuffle: true
    batch_size_per_card: 128
    drop_last: true
    num_workers: 4

Eval:
  dataset:
    name: SimpleDataSet
    data_dir: ./train_data
    label_file_list:
    - ./train_data/val_list.txt
    transforms:
    - DecodeImage:
        img_mode: BGR
        channel_first: false
    - MultiLabelEncode:
    - RecResizeImg:
        image_shape: [3, 48, 320]
    - KeepKeys:
        keep_keys:
        - image
        - label_ctc
        - label_sar
        - length
        - valid_ratio
  loader:
    shuffle: false
    drop_last: false
    batch_size_per_card: 128
    num_workers: 4
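The Optimizer block in this config pairs a Cosine learning-rate schedule (learning_rate 0.001) with warmup_epoch 5 over epoch_num 500. The sketch below shows roughly what such a schedule looks like per epoch. It is an illustration only: the linear warmup from zero, the cosine period equal to epoch_num, and the function name lr_at are assumptions made here, not taken from the training code behind these configs.

```python
import math

BASE_LR = 0.001      # Optimizer.lr.learning_rate
WARMUP_EPOCHS = 5    # Optimizer.lr.warmup_epoch
TOTAL_EPOCHS = 500   # Global.epoch_num


def lr_at(epoch: float) -> float:
    """Approximate learning rate at a (possibly fractional) epoch.

    Assumption: linear warmup from 0 to BASE_LR, then a half-cosine decay
    whose period is TOTAL_EPOCHS, counted from the end of warmup.
    """
    if epoch < WARMUP_EPOCHS:
        return BASE_LR * epoch / WARMUP_EPOCHS
    t = epoch - WARMUP_EPOCHS
    return 0.5 * BASE_LR * (1.0 + math.cos(math.pi * t / TOTAL_EPOCHS))


if __name__ == "__main__":
    for e in (0, 1, 5, 100, 250, 499):
        print(f"epoch {e:3d}: lr = {lr_at(e):.6f}")
```

A real trainer would normally step the schedule per iteration rather than per epoch, so the values here are only a coarse guide.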
configs/rec/PP-OCRv3/multi_language/japan_PP-OCRv3_mobile_rec.yml
0 → 100644
Global:
  model_name: japan_PP-OCRv3_mobile_rec # To use static model for inference.
  debug: false
  use_gpu: true
  epoch_num: 500
  log_smooth_window: 20
  print_batch_step: 10
  save_model_dir: ./output/v3_japan_mobile
  save_epoch_step: 3
  eval_batch_step: [0, 2000]
  cal_metric_during_train: true
  pretrained_model:
  checkpoints:
  save_inference_dir:
  use_visualdl: false
  infer_img: doc/imgs_words/ch/word_1.jpg
  character_dict_path: ppocr/utils/dict/japan_dict.txt
  max_text_length: &max_text_length 25
  infer_mode: false
  use_space_char: true
  distributed: true
  save_res_path: ./output/rec/predicts_ppocrv3_japan.txt

Optimizer:
  name: Adam
  beta1: 0.9
  beta2: 0.999
  lr:
    name: Cosine
    learning_rate: 0.001
    warmup_epoch: 5
  regularizer:
    name: L2
    factor: 3.0e-05

Architecture:
  model_type: rec
  algorithm: SVTR_LCNet
  Transform:
  Backbone:
    name: MobileNetV1Enhance
    scale: 0.5
    last_conv_stride: [1, 2]
    last_pool_type: avg
    last_pool_kernel_size: [2, 2]
  Head:
    name: MultiHead
    head_list:
      - CTCHead:
          Neck:
            name: svtr
            dims: 64
            depth: 2
            hidden_dims: 120
            use_guide: True
          Head:
            fc_decay: 0.00001
      - SARHead:
          enc_dim: 512
          max_text_length: *max_text_length

Loss:
  name: MultiLoss
  loss_config_list:
    - CTCLoss:
    - SARLoss:

PostProcess:
  name: CTCLabelDecode

Metric:
  name: RecMetric
  main_indicator: acc
  ignore_space: False

Train:
  dataset:
    name: SimpleDataSet
    data_dir: ./train_data/
    ext_op_transform_idx: 1
    label_file_list:
    - ./train_data/train_list.txt
    transforms:
    - DecodeImage:
        img_mode: BGR
        channel_first: false
    - RecConAug:
        prob: 0.5
        ext_data_num: 2
        image_shape: [48, 320, 3]
    - RecAug:
    - MultiLabelEncode:
    - RecResizeImg:
        image_shape: [3, 48, 320]
    - KeepKeys:
        keep_keys:
        - image
        - label_ctc
        - label_sar
        - length
        - valid_ratio
  loader:
    shuffle: true
    batch_size_per_card: 128
    drop_last: true
    num_workers: 4

Eval:
  dataset:
    name: SimpleDataSet
    data_dir: ./train_data
    label_file_list:
    - ./train_data/val_list.txt
    transforms:
    - DecodeImage:
        img_mode: BGR
        channel_first: false
    - MultiLabelEncode:
    - RecResizeImg:
        image_shape: [3, 48, 320]
    - KeepKeys:
        keep_keys:
        - image
        - label_ctc
        - label_sar
        - length
        - valid_ratio
  loader:
    shuffle: false
    drop_last: false
    batch_size_per_card: 128
    num_workers: 4
configs/rec/PP-OCRv3/multi_language/ka_PP-OCRv3_mobile_rec.yml
0 → 100644
Global:
  model_name: ka_PP-OCRv3_mobile_rec # To use static model for inference.
  debug: false
  use_gpu: true
  epoch_num: 500
  log_smooth_window: 20
  print_batch_step: 10
  save_model_dir: ./output/v3_ka_mobile
  save_epoch_step: 3
  eval_batch_step: [0, 2000]
  cal_metric_during_train: true
  pretrained_model:
  checkpoints:
  save_inference_dir:
  use_visualdl: false
  infer_img: doc/imgs_words/ch/word_1.jpg
  character_dict_path: ppocr/utils/dict/ka_dict.txt
  max_text_length: &max_text_length 25
  infer_mode: false
  use_space_char: true
  distributed: true
  save_res_path: ./output/rec/predicts_ppocrv3_ka.txt

Optimizer:
  name: Adam
  beta1: 0.9
  beta2: 0.999
  lr:
    name: Cosine
    learning_rate: 0.001
    warmup_epoch: 5
  regularizer:
    name: L2
    factor: 3.0e-05

Architecture:
  model_type: rec
  algorithm: SVTR_LCNet
  Transform:
  Backbone:
    name: MobileNetV1Enhance
    scale: 0.5
    last_conv_stride: [1, 2]
    last_pool_type: avg
    last_pool_kernel_size: [2, 2]
  Head:
    name: MultiHead
    head_list:
      - CTCHead:
          Neck:
            name: svtr
            dims: 64
            depth: 2
            hidden_dims: 120
            use_guide: True
          Head:
            fc_decay: 0.00001
      - SARHead:
          enc_dim: 512
          max_text_length: *max_text_length

Loss:
  name: MultiLoss
  loss_config_list:
    - CTCLoss:
    - SARLoss:

PostProcess:
  name: CTCLabelDecode

Metric:
  name: RecMetric
  main_indicator: acc
  ignore_space: False

Train:
  dataset:
    name: SimpleDataSet
    data_dir: ./train_data/
    ext_op_transform_idx: 1
    label_file_list:
    - ./train_data/train_list.txt
    transforms:
    - DecodeImage:
        img_mode: BGR
        channel_first: false
    - RecConAug:
        prob: 0.5
        ext_data_num: 2
        image_shape: [48, 320, 3]
    - RecAug:
    - MultiLabelEncode:
    - RecResizeImg:
        image_shape: [3, 48, 320]
    - KeepKeys:
        keep_keys:
        - image
        - label_ctc
        - label_sar
        - length
        - valid_ratio
  loader:
    shuffle: true
    batch_size_per_card: 128
    drop_last: true
    num_workers: 4

Eval:
  dataset:
    name: SimpleDataSet
    data_dir: ./train_data
    label_file_list:
    - ./train_data/val_list.txt
    transforms:
    - DecodeImage:
        img_mode: BGR
        channel_first: false
    - MultiLabelEncode:
    - RecResizeImg:
        image_shape: [3, 48, 320]
    - KeepKeys:
        keep_keys:
        - image
        - label_ctc
        - label_sar
        - length
        - valid_ratio
  loader:
    shuffle: false
    drop_last: false
    batch_size_per_card: 128
    num_workers: 4
configs/rec/PP-OCRv3/multi_language/korean_PP-OCRv3_mobile_rec.yml
0 → 100644
Global:
  model_name: korean_PP-OCRv3_mobile_rec # To use static model for inference.
  debug: false
  use_gpu: true
  epoch_num: 500
  log_smooth_window: 20
  print_batch_step: 10
  save_model_dir: ./output/v3_korean_mobile
  save_epoch_step: 3
  eval_batch_step: [0, 2000]
  cal_metric_during_train: true
  pretrained_model:
  checkpoints:
  save_inference_dir:
  use_visualdl: false
  infer_img: doc/imgs_words/ch/word_1.jpg
  character_dict_path: ppocr/utils/dict/korean_dict.txt
  max_text_length: &max_text_length 25
  infer_mode: false
  use_space_char: true
  distributed: true
  save_res_path: ./output/rec/predicts_ppocrv3_korean.txt

Optimizer:
  name: Adam
  beta1: 0.9
  beta2: 0.999
  lr:
    name: Cosine
    learning_rate: 0.001
    warmup_epoch: 5
  regularizer:
    name: L2
    factor: 3.0e-05

Architecture:
  model_type: rec
  algorithm: SVTR_LCNet
  Transform:
  Backbone:
    name: MobileNetV1Enhance
    scale: 0.5
    last_conv_stride: [1, 2]
    last_pool_type: avg
    last_pool_kernel_size: [2, 2]
  Head:
    name: MultiHead
    head_list:
      - CTCHead:
          Neck:
            name: svtr
            dims: 64
            depth: 2
            hidden_dims: 120
            use_guide: True
          Head:
            fc_decay: 0.00001
      - SARHead:
          enc_dim: 512
          max_text_length: *max_text_length

Loss:
  name: MultiLoss
  loss_config_list:
    - CTCLoss:
    - SARLoss:

PostProcess:
  name: CTCLabelDecode

Metric:
  name: RecMetric
  main_indicator: acc
  ignore_space: False

Train:
  dataset:
    name: SimpleDataSet
    data_dir: ./train_data/
    ext_op_transform_idx: 1
    label_file_list:
    - ./train_data/train_list.txt
    transforms:
    - DecodeImage:
        img_mode: BGR
        channel_first: false
    - RecConAug:
        prob: 0.5
        ext_data_num: 2
        image_shape: [48, 320, 3]
    - RecAug:
    - MultiLabelEncode:
    - RecResizeImg:
        image_shape: [3, 48, 320]
    - KeepKeys:
        keep_keys:
        - image
        - label_ctc
        - label_sar
        - length
        - valid_ratio
  loader:
    shuffle: true
    batch_size_per_card: 128
    drop_last: true
    num_workers: 4

Eval:
  dataset:
    name: SimpleDataSet
    data_dir: ./train_data
    label_file_list:
    - ./train_data/val_list.txt
    transforms:
    - DecodeImage:
        img_mode: BGR
        channel_first: false
    - MultiLabelEncode:
    - RecResizeImg:
        image_shape: [3, 48, 320]
    - KeepKeys:
        keep_keys:
        - image
        - label_ctc
        - label_sar
        - length
        - valid_ratio
  loader:
    shuffle: false
    drop_last: false
    batch_size_per_card: 128
    num_workers: 4
configs/rec/PP-OCRv3/multi_language/latin_PP-OCRv3_mobile_rec.yml
0 → 100644
Global:
  model_name: latin_PP-OCRv3_mobile_rec # To use static model for inference.
  debug: false
  use_gpu: true
  epoch_num: 500
  log_smooth_window: 20
  print_batch_step: 10
  save_model_dir: ./output/v3_latin_mobile
  save_epoch_step: 3
  eval_batch_step: [0, 2000]
  cal_metric_during_train: true
  pretrained_model:
  checkpoints:
  save_inference_dir:
  use_visualdl: false
  infer_img: doc/imgs_words/ch/word_1.jpg
  character_dict_path: ppocr/utils/dict/latin_dict.txt
  max_text_length: &max_text_length 25
  infer_mode: false
  use_space_char: true
  distributed: true
  save_res_path: ./output/rec/predicts_ppocrv3_latin.txt

Optimizer:
  name: Adam
  beta1: 0.9
  beta2: 0.999
  lr:
    name: Cosine
    learning_rate: 0.001
    warmup_epoch: 5
  regularizer:
    name: L2
    factor: 3.0e-05

Architecture:
  model_type: rec
  algorithm: SVTR_LCNet
  Transform:
  Backbone:
    name: MobileNetV1Enhance
    scale: 0.5
    last_conv_stride: [1, 2]
    last_pool_type: avg
    last_pool_kernel_size: [2, 2]
  Head:
    name: MultiHead
    head_list:
      - CTCHead:
          Neck:
            name: svtr
            dims: 64
            depth: 2
            hidden_dims: 120
            use_guide: True
          Head:
            fc_decay: 0.00001
      - SARHead:
          enc_dim: 512
          max_text_length: *max_text_length

Loss:
  name: MultiLoss
  loss_config_list:
    - CTCLoss:
    - SARLoss:

PostProcess:
  name: CTCLabelDecode

Metric:
  name: RecMetric
  main_indicator: acc
  ignore_space: False

Train:
  dataset:
    name: SimpleDataSet
    data_dir: ./train_data/
    ext_op_transform_idx: 1
    label_file_list:
    - ./train_data/train_list.txt
    transforms:
    - DecodeImage:
        img_mode: BGR
        channel_first: false
    - RecConAug:
        prob: 0.5
        ext_data_num: 2
        image_shape: [48, 320, 3]
    - RecAug:
    - MultiLabelEncode:
    - RecResizeImg:
        image_shape: [3, 48, 320]
    - KeepKeys:
        keep_keys:
        - image
        - label_ctc
        - label_sar
        - length
        - valid_ratio
  loader:
    shuffle: true
    batch_size_per_card: 128
    drop_last: true
    num_workers: 4

Eval:
  dataset:
    name: SimpleDataSet
    data_dir: ./train_data
    label_file_list:
    - ./train_data/val_list.txt
    transforms:
    - DecodeImage:
        img_mode: BGR
        channel_first: false
    - MultiLabelEncode:
    - RecResizeImg:
        image_shape: [3, 48, 320]
    - KeepKeys:
        keep_keys:
        - image
        - label_ctc
        - label_sar
        - length
        - valid_ratio
  loader:
    shuffle: false
    drop_last: false
    batch_size_per_card: 128
    num_workers: 4
configs/rec/PP-OCRv3/multi_language/ta_PP-OCRv3_mobile_rec.yml
0 → 100644
Global:
  model_name: ta_PP-OCRv3_mobile_rec # To use static model for inference.
  debug: false
  use_gpu: true
  epoch_num: 500
  log_smooth_window: 20
  print_batch_step: 10
  save_model_dir: ./output/v3_ta_mobile
  save_epoch_step: 3
  eval_batch_step: [0, 2000]
  cal_metric_during_train: true
  pretrained_model:
  checkpoints:
  save_inference_dir:
  use_visualdl: false
  infer_img: doc/imgs_words/ch/word_1.jpg
  character_dict_path: ppocr/utils/dict/ta_dict.txt
  max_text_length: &max_text_length 25
  infer_mode: false
  use_space_char: true
  distributed: true
  save_res_path: ./output/rec/predicts_ppocrv3_ta.txt

Optimizer:
  name: Adam
  beta1: 0.9
  beta2: 0.999
  lr:
    name: Cosine
    learning_rate: 0.001
    warmup_epoch: 5
  regularizer:
    name: L2
    factor: 3.0e-05

Architecture:
  model_type: rec
  algorithm: SVTR_LCNet
  Transform:
  Backbone:
    name: MobileNetV1Enhance
    scale: 0.5
    last_conv_stride: [1, 2]
    last_pool_type: avg
    last_pool_kernel_size: [2, 2]
  Head:
    name: MultiHead
    head_list:
      - CTCHead:
          Neck:
            name: svtr
            dims: 64
            depth: 2
            hidden_dims: 120
            use_guide: True
          Head:
            fc_decay: 0.00001
      - SARHead:
          enc_dim: 512
          max_text_length: *max_text_length

Loss:
  name: MultiLoss
  loss_config_list:
    - CTCLoss:
    - SARLoss:

PostProcess:
  name: CTCLabelDecode

Metric:
  name: RecMetric
  main_indicator: acc
  ignore_space: False

Train:
  dataset:
    name: SimpleDataSet
    data_dir: ./train_data/
    ext_op_transform_idx: 1
    label_file_list:
    - ./train_data/train_list.txt
    transforms:
    - DecodeImage:
        img_mode: BGR
        channel_first: false
    - RecConAug:
        prob: 0.5
        ext_data_num: 2
        image_shape: [48, 320, 3]
    - RecAug:
    - MultiLabelEncode:
    - RecResizeImg:
        image_shape: [3, 48, 320]
    - KeepKeys:
        keep_keys:
        - image
        - label_ctc
        - label_sar
        - length
        - valid_ratio
  loader:
    shuffle: true
    batch_size_per_card: 128
    drop_last: true
    num_workers: 4

Eval:
  dataset:
    name: SimpleDataSet
    data_dir: ./train_data
    label_file_list:
    - ./train_data/val_list.txt
    transforms:
    - DecodeImage:
        img_mode: BGR
        channel_first: false
    - MultiLabelEncode:
    - RecResizeImg:
        image_shape: [3, 48, 320]
    - KeepKeys:
        keep_keys:
        - image
        - label_ctc
        - label_sar
        - length
        - valid_ratio
  loader:
    shuffle: false
    drop_last: false
    batch_size_per_card: 128
    num_workers: 4
configs/rec/PP-OCRv3/multi_language/te_PP-OCRv3_mobile_rec.yml
0 → 100644
Global:
  model_name: te_PP-OCRv3_mobile_rec # To use static model for inference.
  debug: false
  use_gpu: true
  epoch_num: 500
  log_smooth_window: 20
  print_batch_step: 10
  save_model_dir: ./output/v3_te_mobile
  save_epoch_step: 3
  eval_batch_step: [0, 2000]
  cal_metric_during_train: true
  pretrained_model:
  checkpoints:
  save_inference_dir:
  use_visualdl: false
  infer_img: doc/imgs_words/ch/word_1.jpg
  character_dict_path: ppocr/utils/dict/te_dict.txt
  max_text_length: &max_text_length 25
  infer_mode: false
  use_space_char: true
  distributed: true
  save_res_path: ./output/rec/predicts_ppocrv3_te.txt

Optimizer:
  name: Adam
  beta1: 0.9
  beta2: 0.999
  lr:
    name: Cosine
    learning_rate: 0.001
    warmup_epoch: 5
  regularizer:
    name: L2
    factor: 3.0e-05

Architecture:
  model_type: rec
  algorithm: SVTR_LCNet
  Transform:
  Backbone:
    name: MobileNetV1Enhance
    scale: 0.5
    last_conv_stride: [1, 2]
    last_pool_type: avg
    last_pool_kernel_size: [2, 2]
  Head:
    name: MultiHead
    head_list:
      - CTCHead:
          Neck:
            name: svtr
            dims: 64
            depth: 2
            hidden_dims: 120
            use_guide: True
          Head:
            fc_decay: 0.00001
      - SARHead:
          enc_dim: 512
          max_text_length: *max_text_length

Loss:
  name: MultiLoss
  loss_config_list:
    - CTCLoss:
    - SARLoss:

PostProcess:
  name: CTCLabelDecode

Metric:
  name: RecMetric
  main_indicator: acc
  ignore_space: False

Train:
  dataset:
    name: SimpleDataSet
    data_dir: ./train_data/
    ext_op_transform_idx: 1
    label_file_list:
    - ./train_data/train_list.txt
    transforms:
    - DecodeImage:
        img_mode: BGR
        channel_first: false
    - RecConAug:
        prob: 0.5
        ext_data_num: 2
        image_shape: [48, 320, 3]
    - RecAug:
    - MultiLabelEncode:
    - RecResizeImg:
        image_shape: [3, 48, 320]
    - KeepKeys:
        keep_keys:
        - image
        - label_ctc
        - label_sar
        - length
        - valid_ratio
  loader:
    shuffle: true
    batch_size_per_card: 128
    drop_last: true
    num_workers: 4

Eval:
  dataset:
    name: SimpleDataSet
    data_dir: ./train_data
    label_file_list:
    - ./train_data/val_list.txt
    transforms:
    - DecodeImage:
        img_mode: BGR
        channel_first: false
    - MultiLabelEncode:
    - RecResizeImg:
        image_shape: [3, 48, 320]
    - KeepKeys:
        keep_keys:
        - image
        - label_ctc
        - label_sar
        - length
        - valid_ratio
  loader:
    shuffle: false
    drop_last: false
    batch_size_per_card: 128
    num_workers: 4
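The multi_language PP-OCRv3 configs shown in full on this page (devanagari, japan, ka, korean, latin, ta, te) appear to differ only in four Global keys, each built from the language tag. The sketch below reproduces those four values from a tag so the pattern is easy to check. The function name language_overrides and the LANGS list are hypothetical helpers introduced here, not part of the repository.

```python
# Language tags of the configs listed in full on this page.
LANGS = ["devanagari", "japan", "ka", "korean", "latin", "ta", "te"]


def language_overrides(lang: str) -> dict:
    """Return the Global keys that vary between the per-language configs.

    All other keys (optimizer, architecture, loaders) are identical in the
    listings shown here.
    """
    return {
        "model_name": f"{lang}_PP-OCRv3_mobile_rec",
        "save_model_dir": f"./output/v3_{lang}_mobile",
        "character_dict_path": f"ppocr/utils/dict/{lang}_dict.txt",
        "save_res_path": f"./output/rec/predicts_ppocrv3_{lang}.txt",
    }


if __name__ == "__main__":
    for lang in LANGS:
        print(lang, language_overrides(lang))
```

Note that infer_img stays at doc/imgs_words/ch/word_1.jpg in every file, so it likely needs changing by hand when running inference for a specific language.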
configs/rec/PP-OCRv4/PP-OCRv4_mobile_rec.yml
0 → 100644
Global:
  model_name: PP-OCRv4_mobile_rec # To use static model for inference.
  debug: false
  use_gpu: true
  epoch_num: 200
  log_smooth_window: 20
  print_batch_step: 10
  save_model_dir: ./output/rec_ppocr_v4
  save_epoch_step: 10
  eval_batch_step: [0, 2000]
  cal_metric_during_train: true
  pretrained_model:
  checkpoints:
  save_inference_dir:
  use_visualdl: false
  infer_img: doc/imgs_words/ch/word_1.jpg
  character_dict_path: ppocr/utils/ppocr_keys_v1.txt
  max_text_length: &max_text_length 25
  infer_mode: false
  use_space_char: true
  distributed: true
  save_res_path: ./output/rec/predicts_ppocrv3.txt
  d2s_train_image_shape: [3, 48, 320]

Optimizer:
  name: Adam
  beta1: 0.9
  beta2: 0.999
  lr:
    name: Cosine
    learning_rate: 0.001
    warmup_epoch: 5
  regularizer:
    name: L2
    factor: 3.0e-05

Architecture:
  model_type: rec
  algorithm: SVTR_LCNet
  Transform:
  Backbone:
    name: PPLCNetV3
    scale: 0.95
  Head:
    name: MultiHead
    head_list:
      - CTCHead:
          Neck:
            name: svtr
            dims: 120
            depth: 2
            hidden_dims: 120
            kernel_size: [1, 3]
            use_guide: True
          Head:
            fc_decay: 0.00001
      - NRTRHead:
          nrtr_dim: 384
          max_text_length: *max_text_length

Loss:
  name: MultiLoss
  loss_config_list:
    - CTCLoss:
    - NRTRLoss:

PostProcess:
  name: CTCLabelDecode

Metric:
  name: RecMetric
  main_indicator: acc

Train:
  dataset:
    name: MultiScaleDataSet
    ds_width: false
    data_dir: ./train_data/
    ext_op_transform_idx: 1
    label_file_list:
    - ./train_data/train_list.txt
    transforms:
    - DecodeImage:
        img_mode: BGR
        channel_first: false
    - RecConAug:
        prob: 0.5
        ext_data_num: 2
        image_shape: [48, 320, 3]
        max_text_length: *max_text_length
    - RecAug:
    - MultiLabelEncode:
        gtc_encode: NRTRLabelEncode
    - KeepKeys:
        keep_keys:
        - image
        - label_ctc
        - label_gtc
        - length
        - valid_ratio
  sampler:
    name: MultiScaleSampler
    scales: [[320, 32], [320, 48], [320, 64]]
    first_bs: &bs 192
    fix_bs: false
    divided_factor: [8, 16] # w, h
    is_training: True
  loader:
    shuffle: true
    batch_size_per_card: *bs
    drop_last: true
    num_workers: 8

Eval:
  dataset:
    name: SimpleDataSet
    data_dir: ./train_data
    label_file_list:
    - ./train_data/val_list.txt
    transforms:
    - DecodeImage:
        img_mode: BGR
        channel_first: false
    - MultiLabelEncode:
        gtc_encode: NRTRLabelEncode
    - RecResizeImg:
        image_shape: [3, 48, 320]
    - KeepKeys:
        keep_keys:
        - image
        - label_ctc
        - label_gtc
        - length
        - valid_ratio
  loader:
    shuffle: false
    drop_last: false
    batch_size_per_card: 128
    num_workers: 4
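The sampler block in this config lists three [w, h] scales, first_bs 192, fix_bs false and divided_factor [8, 16]. One plausible reading, sketched below, is that each scale is rounded down to a multiple of the divided factor and the batch size is scaled so that every batch holds about as many pixels as the first scale does at first_bs. That rule and the function name batch_plan are assumptions made for illustration; the config itself does not state how MultiScaleSampler derives batch sizes.

```python
from typing import List, Tuple


def batch_plan(
    scales: List[Tuple[int, int]],        # [(w, h), ...] as in sampler.scales
    first_bs: int,                        # sampler.first_bs
    divided_factor: Tuple[int, int],      # (w factor, h factor)
    fix_bs: bool = False,                 # sampler.fix_bs
) -> List[Tuple[int, int, int]]:
    """Return (w, h, batch_size) per scale under a constant-pixel-budget rule."""
    wf, hf = divided_factor
    # Round every dimension down to a multiple of its divided factor.
    dims = [((w // wf) * wf, (h // hf) * hf) for w, h in scales]
    base_w, base_h = dims[0]
    budget = base_w * base_h * first_bs   # pixels per batch at the first scale
    plan = []
    for w, h in dims:
        bs = first_bs if fix_bs else max(1, budget // (w * h))
        plan.append((w, h, bs))
    return plan


if __name__ == "__main__":
    for w, h, bs in batch_plan([(320, 32), (320, 48), (320, 64)], 192, (8, 16)):
        print(f"{w}x{h}: batch size {bs}")
```

Under this assumption the values in the config give batch sizes of 192, 128 and 96 for heights 32, 48 and 64, so taller inputs are traded for smaller batches.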
configs/rec/PP-OCRv4/PP-OCRv4_mobile_rec_ampO2_ultra.yml
0 → 100644
Global:
  debug: false
  use_gpu: true
  epoch_num: 200
  log_smooth_window: 20
  print_batch_step: 10
  save_model_dir: ./output/rec_ppocr_v4
  save_epoch_step: 10
  eval_batch_step: [0, 2000]
  cal_metric_during_train: true
  pretrained_model:
  checkpoints:
  save_inference_dir:
  use_visualdl: false
  infer_img: doc/imgs_words/ch/word_1.jpg
  character_dict_path: ppocr/utils/ppocr_keys_v1.txt
  max_text_length: &max_text_length 25
  infer_mode: false
  use_space_char: true
  distributed: true
  save_res_path: ./output/rec/predicts_ppocrv3.txt
  use_amp: True
  amp_level: O2

Optimizer:
  name: Adam
  beta1: 0.9
  beta2: 0.999
  lr:
    name: Cosine
    learning_rate: 0.001
    warmup_epoch: 5
  regularizer:
    name: L2
    factor: 3.0e-05

Architecture:
  model_type: rec
  algorithm: SVTR_LCNet
  Transform:
  Backbone:
    name: PPLCNetV3
    scale: 0.95
  Head:
    name: MultiHead
    head_list:
      - CTCHead:
          Neck:
            name: svtr
            dims: 120
            depth: 2
            hidden_dims: 120
            kernel_size: [1, 3]
            use_guide: True
          Head:
            fc_decay: 0.00001
      - NRTRHead:
          nrtr_dim: 384
          max_text_length: *max_text_length

Loss:
  name: MultiLoss
  loss_config_list:
    - CTCLoss:
    - NRTRLoss:

PostProcess:
  name: CTCLabelDecode

Metric:
  name: RecMetric
  main_indicator: acc

Train:
  dataset:
    name: MultiScaleDataSet
    ds_width: false
    data_dir: ./train_data/
    ext_op_transform_idx: 1
    label_file_list:
    - ./train_data/train_list.txt
    transforms:
    - DecodeImage:
        img_mode: BGR
        channel_first: false
    - RecConAug:
        prob: 0.5
        ext_data_num: 2
        image_shape: [48, 320, 3]
        max_text_length: *max_text_length
    - RecAug:
    - MultiLabelEncode:
        gtc_encode: NRTRLabelEncode
    - KeepKeys:
        keep_keys:
        - image
        - label_ctc
        - label_gtc
        - length
        - valid_ratio
  sampler:
    name: MultiScaleSampler
    scales: [[320, 32], [320, 48], [320, 64]]
    first_bs: &bs 384
    fix_bs: false
    divided_factor: [8, 16] # w, h
    is_training: True
  loader:
    shuffle: true
    batch_size_per_card: *bs
    drop_last: true
    num_workers: 16

Eval:
  dataset:
    name: SimpleDataSet
    data_dir: ./train_data
    label_file_list:
    - ./train_data/val_list.txt
    transforms:
    - DecodeImage:
        img_mode: BGR
        channel_first: false
    - MultiLabelEncode:
        gtc_encode: NRTRLabelEncode
    - RecResizeImg:
        image_shape: [3, 48, 320]
    - KeepKeys:
        keep_keys:
        - image
        - label_ctc
        - label_gtc
        - length
        - valid_ratio
  loader:
    shuffle: false
    drop_last: false
    batch_size_per_card: 128
    num_workers: 16