1. 01 Apr, 2025 1 commit
  2. 07 Mar, 2025 1 commit
      refactor(magic_pdf): replace PIL with NumPy for image processing · 1b34f7e4
      myhloli authored
      - Remove PIL usage across multiple files
      - Convert image processing functions to use NumPy arrays
      - Update crop_img function to work with NumPy arrays
      - Modify image loading and resizing to use NumPy and OpenCV
      - Clean up unused imports and comments related to PIL
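As an illustration of the kind of change this commit describes, here is a minimal sketch of cropping an image held as a NumPy array. It assumes an H×W×C `uint8` array and an `(x0, y0, x1, y1)` pixel box; `crop_array` is a hypothetical name and does not reproduce the project's actual `crop_img` signature.

```python
import numpy as np


def crop_array(image: np.ndarray, bbox: tuple[int, int, int, int]) -> np.ndarray:
    """Return a copy of the (x0, y0, x1, y1) region, clamped to the image bounds.

    Hypothetical helper for illustration; not the project's crop_img.
    """
    height, width = image.shape[:2]
    x0, y0, x1, y1 = bbox
    # Clamp the box so out-of-range coordinates do not wrap around.
    x0, x1 = max(0, x0), min(width, x1)
    y0, y1 = max(0, y0), min(height, y1)
    # NumPy arrays are indexed [row, column], i.e. [y, x].
    return image[y0:y1, x0:x1].copy()


page = np.zeros((100, 200, 3), dtype=np.uint8)  # blank 200x100 page
patch = crop_array(page, (10, 20, 250, 60))     # x1 exceeds the page width
```

Slicing replaces a call such as PIL's `Image.crop`, and the explicit `.copy()` keeps the patch independent of the source array.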
  3. 04 Mar, 2025 1 commit
  4. 03 Mar, 2025 2 commits
  5. 27 Feb, 2025 1 commit
  6. 09 Feb, 2025 1 commit
  7. 23 Jan, 2025 1 commit
  8. 22 Jan, 2025 1 commit
  9. 15 Jan, 2025 1 commit
  10. 14 Jan, 2025 1 commit
      feat(layout): improve title block handling and layout detection · c20e9a1e
      myhloli authored
      - Merge title blocks that are close to each other horizontally
      - Adjust line insertion logic for title blocks
      - Increase image size and decrease confidence threshold for layout detection
      - Update DocLayoutYOLO model weights
      - Refactor drawing of bounding boxes for different block types
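The first item of this commit, merging title blocks that sit close together horizontally, could be sketched roughly as follows. This is an assumption about the general technique, not the project's code: boxes are `(x0, y0, x1, y1)` tuples, and both `merge_close_titles` and the `max_gap` threshold are hypothetical.

```python
def merge_close_titles(bboxes, max_gap=10.0):
    """Merge boxes that overlap vertically and lie within max_gap horizontally.

    Single greedy pass; hypothetical helper for illustration only.
    """
    merged = []
    # Visit boxes top to bottom, then left to right.
    for box in sorted(bboxes, key=lambda b: (b[1], b[0])):
        for i, prev in enumerate(merged):
            overlaps_vertically = min(prev[3], box[3]) > max(prev[1], box[1])
            # Horizontal distance between the two boxes (negative if they overlap).
            gap = max(box[0], prev[0]) - min(box[2], prev[2])
            if overlaps_vertically and gap <= max_gap:
                # Replace the earlier box with the union of both.
                merged[i] = (
                    min(prev[0], box[0]),
                    min(prev[1], box[1]),
                    max(prev[2], box[2]),
                    max(prev[3], box[3]),
                )
                break
        else:
            merged.append(tuple(box))
    return merged


titles = [(0, 0, 50, 20), (55, 2, 120, 22), (0, 100, 80, 120)]
result = merge_close_titles(titles)
```

Because this is a single pass, a box that bridges two already-merged groups joins only the first one; a production version would likely repeat the pass until nothing changes.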
  11. 10 Jan, 2025 3 commits
  12. 09 Jan, 2025 1 commit
  13. 05 Jan, 2025 1 commit
      feat(tools): add character bounding box drawing functionality · f911a102
      myhloli authored
      - Add `draw_char_bbox` function to `draw_bbox.py` for drawing character bounding boxes
      - Integrate `draw_char_bbox` into `common.py` for use in PDF processing pipeline
      - Include option to draw character bounding boxes in debug mode
  14. 30 Dec, 2024 1 commit
      fix(npu): correct module name for NPU operations · 2684e775
      myhloli authored
      - Update `clean_memory.py` to use `torch_npu.npu` instead of `torch.npu`
      - Update `model_utils.py` to use `torch_npu.npu` instead of `torch.npu`
      - Simplify NPU availability check and bfloat16 support in `pdf_parse_union_core_v2.py`
  15. 26 Dec, 2024 2 commits
      refactor(device): optimize memory cleaning and device selection · 50f48417
      myhloli authored
      - Update clean_memory function to support both CUDA and NPU devices
      - Implement get_device function to centralize device selection logic
      - Modify model initialization and memory cleaning to use the selected device
      - Update RapidTableModel to support both RapidOCR and PaddleOCR engines
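A centralized device selector paired with a device-aware memory cleaner might look like the sketch below. The function names, the CUDA-then-NPU-then-CPU priority, and the guarded imports are assumptions made for illustration; the project's real `get_device` and `clean_memory` may differ.

```python
import gc
import importlib.util


def get_device_sketch() -> str:
    """Return 'cuda', 'npu', or 'cpu' depending on what is installed and usable."""
    if importlib.util.find_spec("torch") is None:
        return "cpu"
    import torch

    if torch.cuda.is_available():
        return "cuda"
    # torch_npu is the separate Ascend adapter package; it may not be installed.
    if importlib.util.find_spec("torch_npu") is not None:
        import torch_npu

        if torch_npu.npu.is_available():
            return "npu"
    return "cpu"


def clean_memory_sketch(device: str) -> None:
    """Release cached accelerator memory for the given device, then run the GC."""
    if device == "cuda":
        import torch

        torch.cuda.empty_cache()
    elif device == "npu":
        import torch_npu

        torch_npu.npu.empty_cache()
    gc.collect()


device = get_device_sketch()
clean_memory_sketch(device)
```

Resolving the device once and passing the resulting string around keeps backend checks out of model initialization and cleanup code.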
      feat(model): add npu support and optimize table model · 7990e7df
      myhloli authored
      - Add NPU support for memory cleaning and model initialization
      - Optimize table model initialization and prediction process
      - Update memory utils to support NPU
      - Add language parameter for table model
  16. 24 Dec, 2024 1 commit
      feat(llm): add LLM-aided formula and text correction · c660fdc8
      myhloli authored
      - Add LLM-aided formula and text correction functionality
      - Update config reader to include LLM-aided settings
      - Create new LLM-aided processing module
      - Update main processing script to incorporate LLM-aided corrections
      - Modify download scripts to check for new config version
  17. 11 Dec, 2024 2 commits
  18. 10 Dec, 2024 1 commit
  19. 03 Dec, 2024 2 commits
  20. 02 Dec, 2024 1 commit
  21. 29 Nov, 2024 2 commits
  22. 28 Nov, 2024 1 commit
      refactor(pdf_check): improve character detection using PyMuPDF · ac888156
      myhloli authored
      - Replace pdfminer with PyMuPDF for character detection
      - Implement new method detect_invalid_chars_by_pymupdf
      - Update check_invalid_chars in pdf_meta_scan.py to use new method
      - Add __replace_0xfffd function in pdf_parse_union_core_v2.py to handle special characters
      - Remove unused imports and update requirements.txt
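To illustrate the idea behind invalid-character detection and `0xfffd` handling, here is a small sketch that works on text that has already been extracted. It assumes that "invalid" means the Unicode replacement character U+FFFD and that it is swapped for a space; both helper names are hypothetical, and the project's `detect_invalid_chars_by_pymupdf` and `__replace_0xfffd` may behave differently.

```python
REPLACEMENT_CHAR = "\ufffd"  # U+FFFD, emitted when a glyph cannot be decoded


def invalid_char_ratio(text: str) -> float:
    """Share of non-whitespace characters that are U+FFFD (0.0 for empty text)."""
    visible = [ch for ch in text if not ch.isspace()]
    if not visible:
        return 0.0
    return sum(ch == REPLACEMENT_CHAR for ch in visible) / len(visible)


def replace_0xfffd_sketch(text: str, substitute: str = " ") -> str:
    """Replace every U+FFFD with a substitute string (a space by default)."""
    return text.replace(REPLACEMENT_CHAR, substitute)


sample = "ab\ufffdd"
ratio = invalid_char_ratio(sample)
cleaned = replace_0xfffd_sketch(sample)
```

A pipeline could compare a ratio like this against a threshold to decide whether a PDF's embedded text is usable or whether OCR is the safer route.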
  23. 27 Nov, 2024 2 commits
  24. 26 Nov, 2024 3 commits
  25. 25 Nov, 2024 1 commit
  26. 22 Nov, 2024 1 commit
  27. 21 Nov, 2024 1 commit
  28. 19 Nov, 2024 1 commit
  29. 18 Nov, 2024 1 commit
  30. 15 Nov, 2024 1 commit