feat(pdf_parse): improve OCR processing and contrast filtering
- Rename empty_spans to need_ocr_spans for clarity
- Add calculate_contrast function to measure image contrast
- Filter out low-contrast spans to improve OCR accuracy
- Update OCR processing workflow to use the new filtering method
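
The contrast filter described above could look roughly like the sketch below. It is an illustration, not the code from this commit: it assumes contrast is measured as RMS contrast (standard deviation of grayscale intensities, scaled to 0..1), that each span is a dict carrying its cropped image under a hypothetical "image" key, and that the hypothetical helper filter_need_ocr_spans uses an arbitrary min_contrast threshold of 0.1. Only the names calculate_contrast and need_ocr_spans come from the commit message.

```python
import numpy as np


def calculate_contrast(img) -> float:
    """Return RMS contrast of an 8-bit image, scaled to the range 0..1.

    Assumption: the metric used by the actual commit may differ.
    """
    arr = np.asarray(img, dtype=np.float64)
    if arr.size == 0:
        return 0.0
    if arr.ndim == 3:
        # Collapse colour channels to a single grayscale plane.
        arr = arr[..., :3].mean(axis=2)
    return float(arr.std() / 255.0)


def filter_need_ocr_spans(need_ocr_spans, min_contrast=0.1):
    """Keep only spans whose image has enough contrast to be worth OCR.

    Hypothetical helper: the "image" key and the default threshold are
    placeholders chosen for this example.
    """
    return [
        span
        for span in need_ocr_spans
        if calculate_contrast(span["image"]) >= min_contrast
    ]


# Usage: a flat grey crop is dropped, a checkerboard crop is kept.
flat = np.full((8, 8), 128, dtype=np.uint8)
checker = (np.indices((8, 8)).sum(axis=0) % 2 * 255).astype(np.uint8)
spans = [{"id": 0, "image": flat}, {"id": 1, "image": checker}]
kept = filter_need_ocr_spans(spans)
```

Filtering before OCR means near-uniform crops, which tend to yield empty or noisy recognition results, never reach the OCR model.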