fix(pdf): improve ligature handling and text extraction
- Move ligature replacement function to pdf_parse_union_core_v2.py
- Optimize ligature replacement using a more efficient approach
- Modify text extraction flags to preserve ligatures in PDF content
- Remove unnecessary function from ocr_mkcontent.py
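
For illustration only, a single-pass ligature replacement of the kind described above might look like the sketch below. The commit's actual code is not shown here, so the names `LIGATURES` and `replace_ligatures` and the contents of the mapping are assumptions. The sketch uses `str.translate` with a precomputed table, which scans the text once rather than calling `str.replace` once per ligature.

```python
# Hypothetical mapping of Unicode ligature code points to plain letters.
# The table used in the actual commit may differ.
LIGATURES = {
    "\ufb00": "ff",
    "\ufb01": "fi",
    "\ufb02": "fl",
    "\ufb03": "ffi",
    "\ufb04": "ffl",
    "\ufb06": "st",
}

# Build the translation table once at import time so each call is a single pass.
_LIGATURE_TABLE = str.maketrans(LIGATURES)


def replace_ligatures(text: str) -> str:
    """Return text with known ligature characters expanded to plain letters."""
    return text.translate(_LIGATURE_TABLE)
```

This step only has an effect if the extraction stage keeps ligature characters intact, which is presumably why the extraction flags were changed in the same commit.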