  1. 24 Dec, 2024 1 commit
    • feat(llm): add LLM-aided formula and text correction · c660fdc8
      myhloli authored
      - Add LLM-aided formula and text correction functionality
      - Update config reader to include LLM-aided settings
      - Create new LLM-aided processing module
      - Update main processing script to incorporate LLM-aided corrections
      - Modify download scripts to check for new config version
  2. 13 Dec, 2024 1 commit
    • fix(pdf): improve ligature handling and text extraction · c638fc5d
      myhloli authored
      - Move ligature replacement function to pdf_parse_union_core_v2.py
      - Optimize ligature replacement using a more efficient approach
      - Modify text extraction flags to preserve ligatures in PDF content
      - Remove unnecessary function from ocr_mkcontent.py
  3. 12 Dec, 2024 1 commit
  4. 11 Dec, 2024 14 commits
  5. 10 Dec, 2024 7 commits
  6. 09 Dec, 2024 3 commits
  7. 07 Dec, 2024 2 commits
  8. 06 Dec, 2024 10 commits
  9. 05 Dec, 2024 1 commit
    • perf(model): add threading lock for OCR model initialization · 04478095
      myhloli authored
      - Introduce a lock to synchronize access to OCR model initialization
      - This change improves thread safety when multiple threads access the OCR model concurrently
      - The lock ensures that the OCR model is initialized only once, even in multi-threaded scenarios
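
The lock-guarded initialization described in commit 04478095 can be illustrated with a minimal sketch. This is not the project's code: `FakeOCRModel`, `get_ocr_model`, and `load_from_threads` are hypothetical names standing in for the real OCR model and its accessor, and the double-checked pattern shown is one common way to get "initialize only once" behavior, assumed here for illustration.

```python
import threading


class FakeOCRModel:
    """Hypothetical stand-in for an expensive-to-load OCR model."""

    instances_created = 0

    def __init__(self):
        # Only ever incremented while the module-level lock is held.
        FakeOCRModel.instances_created += 1


_model = None
_model_lock = threading.Lock()


def get_ocr_model():
    """Return the shared model, creating it at most once across threads."""
    global _model
    if _model is None:  # fast path: skip the lock once initialized
        with _model_lock:
            if _model is None:  # re-check: another thread may have won the race
                _model = FakeOCRModel()
    return _model


def load_from_threads(n=8):
    """Request the model from n threads and collect what each one received."""
    results = []

    def worker():
        results.append(get_ocr_model())

    threads = [threading.Thread(target=worker) for _ in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Calling `load_from_threads()` should hand every thread the same object while the constructor runs a single time. The second `if _model is None` check inside the `with` block is what prevents a thread that was waiting on the lock from building a second model.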