Commit 14f4bbb9 authored by Xiaomeng Zhao, committed by GitHub

Merge pull request #1089 from myhloli/dev

feat(pdf_parse): add OCR score to span data
parents 9675a574 7d4dfca2
...
@@ -215,6 +215,7 @@ def txt_spans_extract_v2(pdf_page, spans, all_bboxes, all_discarded_blocks, lang
     ocr_text, ocr_score = ocr_res[0][0]
     if ocr_score > 0.5 and len(ocr_text) > 0:
         span['content'] = ocr_text
+        span['score'] = ocr_score
     else:
         spans.remove(span)
...
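The change can be illustrated outside the parser with a minimal sketch. It assumes `ocr_res` is nested so that `ocr_res[0][0]` is a `(text, score)` pair, as the unpacking in the hunk suggests. The helper names `apply_ocr_result` and `keep_confident_spans` and the `min_score` parameter are hypothetical and not part of the commit. The sketch also builds a new list instead of calling `spans.remove(span)`, which is a choice made for this example only.

```python
def apply_ocr_result(span, ocr_res, min_score=0.5):
    """Copy OCR text and score onto the span if the result is usable.

    Returns True when the span was updated and should be kept,
    False when it should be dropped. (Hypothetical helper.)
    """
    # Assumed shape: ocr_res[0][0] == (text, score), as in the diff.
    ocr_text, ocr_score = ocr_res[0][0]
    if ocr_score > min_score and len(ocr_text) > 0:
        span['content'] = ocr_text
        span['score'] = ocr_score  # the field this commit adds
        return True
    return False


def keep_confident_spans(spans_with_results, min_score=0.5):
    """Return only the spans whose OCR result passed the threshold.

    Builds a new list rather than removing items from the input,
    which differs from the spans.remove(span) call in the commit.
    """
    return [
        span
        for span, ocr_res in spans_with_results
        if apply_ocr_result(span, ocr_res, min_score)
    ]


# Example input: one confident result, one low score, one empty text.
example = [
    ({'bbox': [0, 0, 10, 10]}, [[('hello', 0.9)]]),
    ({'bbox': [0, 10, 10, 20]}, [[('x', 0.3)]]),
    ({'bbox': [0, 20, 10, 30]}, [[('', 0.99)]]),
]
kept = keep_confident_spans(example)
```

With `span['score']` stored, later stages could apply a stricter cutoff or report per-span confidence without rerunning OCR.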