1. 21 Nov, 2024 2 commits
      refactor(txt_parse): improve text extraction accuracy with new algorithm · 309be741
      myhloli authored
      - Implement new text extraction method (txt_spans_extract_v2) to enhance accuracy
      - Add character filling in spans for better text reconstruction
      - Introduce empty span handling using OCR for missed text
      - Optimize span filtering and overlap removal
      feat(ocr): improve text detection and OCR accuracy · b2e37a2d
      myhloli authored
      - Update OCR utils to handle different box formats and improve angle calculation
      - Modify PDF extraction kit to support OCR option and optimize processing flow
      - Enhance PPOCR model to sort and filter detection boxes, improving text splitting accuracy
  2. 19 Nov, 2024 1 commit
  3. 18 Nov, 2024 1 commit
      feat(ocr): improve handling of angled text boxes · 4fd966eb
      myhloli authored
      - Add calculate_is_angle function to detect angled text boxes
      - Update update_det_boxes and merge_det_boxes functions to handle angled text boxes
      - Modify angle detection logic in various parts of the code
  4. 15 Nov, 2024 1 commit