    refactor(pdf_check): improve character detection using PyMuPDF · ac888156
    myhloli authored
    - Replace pdfminer with PyMuPDF for character detection
    - Implement new method detect_invalid_chars_by_pymupdf
    - Update check_invalid_chars in pdf_meta_scan.py to use new method
    - Add __replace_0xfffd function in pdf_parse_union_core_v2.py to handle special characters
    - Remove unused imports and update requirements.txt
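
The character check described in this commit can be sketched roughly as follows. This is a minimal illustration, not the code in the commit: it assumes that "invalid" characters are those extracted as U+FFFD (the Unicode replacement character), and the helper names, the 1% threshold, and the page limit are all hypothetical. Only `fitz.open` and `Page.get_text` are real PyMuPDF calls; the import is deferred so the pure-Python helpers work without PyMuPDF installed.

```python
REPLACEMENT_CHAR = "\ufffd"  # U+FFFD, emitted when a glyph cannot be mapped to text


def invalid_char_ratio(text: str) -> float:
    """Return the fraction of non-whitespace characters that are U+FFFD."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0
    return sum(c == REPLACEMENT_CHAR for c in chars) / len(chars)


def has_invalid_chars(text: str, threshold: float = 0.01) -> bool:
    """Flag text whose U+FFFD ratio exceeds the (assumed) threshold."""
    return invalid_char_ratio(text) > threshold


def replace_0xfffd(text: str) -> str:
    """Replace each U+FFFD with a space (assumed behaviour, for illustration)."""
    return text.replace(REPLACEMENT_CHAR, " ")


def extract_text_with_pymupdf(pdf_bytes: bytes, max_pages: int = 10) -> str:
    """Extract plain text from the first pages of a PDF using PyMuPDF."""
    import fitz  # PyMuPDF; imported lazily so the helpers above have no dependency

    with fitz.open(stream=pdf_bytes, filetype="pdf") as doc:
        page_total = min(max_pages, doc.page_count)
        return "".join(doc[i].get_text() for i in range(page_total))
```

A caller would pass the result of `extract_text_with_pymupdf(pdf_bytes)` to `has_invalid_chars` to decide whether the PDF's text layer is usable, and could apply `replace_0xfffd` to text that is kept anyway.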
pdf_check.py 3.1 KB