  1. 16 Apr, 2025 1 commit
  2. 14 Apr, 2025 1 commit
  3. 12 Apr, 2025 1 commit
  4. 08 Apr, 2025 1 commit
  5. 03 Apr, 2025 1 commit
  6. 01 Apr, 2025 1 commit
  7. 07 Mar, 2025 1 commit
    • refactor(magic_pdf): replace PIL with NumPy for image processing · 1b34f7e4
      myhloli authored
      - Remove PIL usage across multiple files
      - Convert image processing functions to use NumPy arrays
      - Update crop_img function to work with NumPy arrays
      - Modify image loading and resizing to use NumPy and OpenCV
      - Clean up unused imports and comments related to PIL
  8. 04 Mar, 2025 1 commit
  9. 03 Mar, 2025 2 commits
  10. 27 Feb, 2025 1 commit
  11. 09 Feb, 2025 1 commit
  12. 23 Jan, 2025 1 commit
  13. 22 Jan, 2025 1 commit
  14. 15 Jan, 2025 1 commit
  15. 14 Jan, 2025 1 commit
    • feat(layout): improve title block handling and layout detection · c20e9a1e
      myhloli authored
      - Merge title blocks that are close to each other horizontally
      - Adjust line insertion logic for title blocks
      - Increase image size and decrease confidence threshold for layout detection
      - Update DocLayoutYOLO model weights
      - Refactor drawing of bounding boxes for different block types
  16. 10 Jan, 2025 3 commits
  17. 09 Jan, 2025 1 commit
  18. 05 Jan, 2025 1 commit
    • feat(tools): add character bounding box drawing functionality · f911a102
      myhloli authored
      - Add `draw_char_bbox` function to `draw_bbox.py` for drawing character bounding boxes
      - Integrate `draw_char_bbox` into `common.py` for use in PDF processing pipeline
      - Include option to draw character bounding boxes in debug mode
  19. 30 Dec, 2024 1 commit
    • fix(npu): correct module name for NPU operations · 2684e775
      myhloli authored
      - Update `clean_memory.py` to use `torch_npu.npu` instead of `torch.npu`
      - Update `model_utils.py` to use `torch_npu.npu` instead of `torch.npu`
      - Simplify NPU availability check and bfloat16 support in `pdf_parse_union_core_v2.py`
  20. 26 Dec, 2024 2 commits
    • refactor(device): optimize memory cleaning and device selection · 50f48417
      myhloli authored
      - Update clean_memory function to support both CUDA and NPU devices
      - Implement get_device function to centralize device selection logic
      - Modify model initialization and memory cleaning to use the selected device
      - Update RapidTableModel to support both RapidOCR and PaddleOCR engines
    • feat(model): add NPU support and optimize table model · 7990e7df
      myhloli authored
      - Add NPU support for memory cleaning and model initialization
      - Optimize table model initialization and prediction process
      - Update memory utils to support NPU
      - Add language parameter for table model
  21. 24 Dec, 2024 1 commit
    • feat(llm): add LLM-aided formula and text correction · c660fdc8
      myhloli authored
      - Add LLM-aided formula and text correction functionality
      - Update config reader to include LLM-aided settings
      - Create new LLM-aided processing module
      - Update main processing script to incorporate LLM-aided corrections
      - Modify download scripts to check for new config version
  22. 11 Dec, 2024 2 commits
  23. 10 Dec, 2024 1 commit
  24. 03 Dec, 2024 2 commits
  25. 02 Dec, 2024 1 commit
  26. 29 Nov, 2024 2 commits
  27. 28 Nov, 2024 1 commit
    • refactor(pdf_check): improve character detection using PyMuPDF · ac888156
      myhloli authored
      - Replace pdfminer with PyMuPDF for character detection
      - Implement new method detect_invalid_chars_by_pymupdf
      - Update check_invalid_chars in pdf_meta_scan.py to use new method
      - Add __replace_0xfffd function in pdf_parse_union_core_v2.py to handle special characters
      - Remove unused imports and update requirements.txt
  28. 27 Nov, 2024 2 commits
  29. 26 Nov, 2024 3 commits
  30. 25 Nov, 2024 1 commit