1. 07 Mar, 2025 1 commit
    • refactor(magic_pdf): replace PIL with NumPy for image processing · 1b34f7e4
      myhloli authored
      - Remove PIL usage across multiple files
      - Convert image processing functions to use NumPy arrays
      - Update crop_img function to work with NumPy arrays
      - Modify image loading and resizing to use NumPy and OpenCV
      - Clean up unused imports and comments related to PIL
  2. 11 Feb, 2025 1 commit
    • fix(model): move environment variable settings to global scope · f5112e21
      myhloli authored
      - Move environment variable settings for NPU, MPS, and other configurations to the global scope in doc_analyze_by_custom_model.py
      - Remove redundant environment variable settings in pdf_extract_kit.py
      - This change ensures consistent configuration across the application and avoids potential conflicts or duplicate settings
  3. 16 Jan, 2025 1 commit
    • feat(table): upgrade RapidTable to 1.0.3 and add sub-model support · 79c8a5c8
      myhloli authored
      - Update RapidTable dependency to version 1.0.3
      - Add support for sub-models in RapidTable
      - Update magic-pdf configuration to include table sub-model
      - Modify table model initialization to support sub-models
      - Update table prediction logic to handle new output format
  4. 15 Jan, 2025 1 commit
  5. 14 Jan, 2025 1 commit
    • feat(layout): improve title block handling and layout detection · c20e9a1e
      myhloli authored
      - Merge title blocks that are close to each other horizontally
      - Adjust line insertion logic for title blocks
      - Increase image size and decrease confidence threshold for layout detection
      - Update DocLayoutYOLO model weights
      - Refactor drawing of bounding boxes for different block types
  6. 10 Jan, 2025 1 commit
  7. 26 Dec, 2024 2 commits
    • refactor(device): optimize memory cleaning and device selection · 50f48417
      myhloli authored
      - Update clean_memory function to support both CUDA and NPU devices
      - Implement get_device function to centralize device selection logic
      - Modify model initialization and memory cleaning to use the selected device
      - Update RapidTableModel to support both RapidOCR and PaddleOCR engines
    • feat(model): add npu support and optimize table model · 7990e7df
      myhloli authored
      - Add NPU support for memory cleaning and model initialization
      - Optimize table model initialization and prediction process
      - Update memory utils to support NPU
      - Add language parameter for table model
  8. 16 Dec, 2024 1 commit
  9. 12 Dec, 2024 1 commit
  10. 11 Dec, 2024 1 commit
  11. 06 Dec, 2024 7 commits
  12. 22 Nov, 2024 1 commit
  13. 21 Nov, 2024 1 commit
    • feat(ocr): improve text detection and OCR accuracy · b2e37a2d
      myhloli authored
      - Update OCR utils to handle different box formats and improve angle calculation
      - Modify PDF extraction kit to support OCR option and optimize processing flow
      - Enhance PPOCR model to sort and filter detection boxes, improving text splitting accuracy
  14. 19 Nov, 2024 1 commit
  15. 15 Nov, 2024 1 commit
  16. 08 Nov, 2024 2 commits
    • feat(table): add RapidOCR support for RapidTable model · fe2c2c0d
      myhloli authored
      - Integrate RapidOCR with RapidTable model for table recognition
      - Improve memory management for devices with <= 8GB VRAM
      - Update table recognition process to use RapidOCR for RapidTable
      - Add rapidocr-paddle dependency in setup.py
    • feat(table): integrate RapidTable model for table recognition · 240fe99e
      myhloli authored
      - Add RapidTable model support for table recognition
      - Update table model configuration and initialization
      - Modify table recognition process to use RapidTable when specified
      - Add RapidTable dependency to setup.py
  17. 06 Nov, 2024 1 commit
  18. 04 Nov, 2024 2 commits
    • feat(table): upgrade StructEqTable model and integrate into PDF Extract Kit · 11f23843
      myhloli authored
      - Update StructTableModel to use the latest struct-eqtable library
      - Add support for HTML table extraction in PDF Extract Kit
      - Improve error handling and model initialization
      - Update dependencies in setup.py for struct-eqtable
    • Update pdf_extract_kit.py · fb6cb8b0
      ciaran authored
      Modify line 397 to ensure compatibility with CPU execution. Previously, specifying 'cpu' in config.json still raised a ValueError during demo execution because a cuda device was expected but 'cpu' was given.
  19. 28 Oct, 2024 3 commits
  20. 25 Oct, 2024 1 commit
  21. 24 Oct, 2024 1 commit
  22. 23 Oct, 2024 1 commit
    • feat(model): add support for DocLayout-YOLO model · 1279f2cd
      myhloli authored
      - Add new layout model option: DocLayout-YOLO
      - Implement model initialization and prediction for DocLayout-YOLO
      - Update configuration options to include new model
      - Modify existing code to support both LayoutLMv3 and DocLayout-YOLO models
      - Update Gradio app to support more custom switches
  23. 17 Oct, 2024 2 commits
  24. 14 Oct, 2024 1 commit
    • feat(list&index block): detect and merge list and index blocks · 1f1dd353
      myhloli authored
      - Add detection for list and index blocks in OCR processing
      - Implement merging of list and index blocks across pages
      - Update block types to include list and index categories
      - Adjust text merging logic to handle new block types
      - Modify layout drawing to distinguish list and index blocks
  25. 08 Oct, 2024 2 commits
  26. 06 Oct, 2024 1 commit
    • refactor(model): improve timing information and performance · be1b1ae7
      myhloli authored
      - Enhance timing output precision to two decimal places for better readability
      - Calculate and log document analysis speed in pages per second
      - Optimize logging for YOLO and table recognition processes
      - Remove unnecessary comments and improve code efficiency
  27. 29 Sep, 2024 1 commit