1. 13 Dec, 2024 1 commit
    • fix(pdf): improve ligature handling and text extraction · c638fc5d
      myhloli authored
      - Move ligature replacement function to pdf_parse_union_core_v2.py
      - Optimize ligature replacement using a more efficient approach
      - Modify text extraction flags to preserve ligatures in PDF content
      - Remove unnecessary function from ocr_mkcontent.py
  2. 12 Dec, 2024 1 commit
  3. 11 Dec, 2024 32 commits
  4. 10 Dec, 2024 6 commits