Commits · 230191c74e67e9483640ff4d81b2c2f4dbb13e15 · wangsen / MinerU

16 Jan, 2025 3 commits

refactor(model): update batch analyze logic for rapid table model · 452a9c0b

myhloli authored Jan 16, 2025

- Modify the batch analyze process to handle the rapid table model's output
- Add logic_points variable to capture additional output from rapid table prediction

452a9c0b

feat(table): upgrade RapidTable to 1.0.3 and add sub-model support · 79c8a5c8

myhloli authored Jan 16, 2025

- Update RapidTable dependency to version 1.0.3
- Add support for sub-models in RapidTable
- Update magic-pdf configuration to include table sub-model
- Modify table model initialization to support sub-models
- Update table prediction logic to handle new output format

79c8a5c8

fix(magic_pdf): correct end page index and improve error handling · f209ddea

myhloli authored Jan 16, 2025

- Adjust end_page_id calculation to prevent IndexError when accessing pages
- Enhance error handling in LLM post-processing by specifically catching JSONDecodeError

f209ddea
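
The two fixes in f209ddea can be illustrated with a minimal sketch. The helper names `resolve_end_page_id` and `parse_llm_json` are hypothetical and are not taken from the repository; the sketch only assumes that page indices are zero-based and that a missing or negative `end_page_id` means "through the last page".

```python
import json


def resolve_end_page_id(end_page_id, page_count):
    """Clamp a requested last-page index to a valid zero-based index.

    Hypothetical helper: None or a negative value is treated as
    "through the last page" (an assumption, not confirmed by the commit).
    """
    last_valid = page_count - 1
    if end_page_id is None or end_page_id < 0:
        return last_valid
    # Never return an index past the final page, which would raise IndexError.
    return min(end_page_id, last_valid)


def parse_llm_json(raw_text, fallback=None):
    """Parse model output as JSON, catching only json.JSONDecodeError."""
    try:
        return json.loads(raw_text)
    except json.JSONDecodeError:
        # Malformed output is handled here; any other exception still propagates.
        return fallback
```

Catching `json.JSONDecodeError` specifically, rather than a bare `Exception`, keeps unrelated failures visible instead of silently turning them into the fallback value.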

15 Jan, 2025 1 commit

feat(model): improve batch analysis logic and support npu · f3502226

myhloli authored Jan 15, 2025

- Add support for NPU (Neural Processing Unit) when available
- Implement batch analysis for GPU and NPU devices
- Optimize memory usage and improve performance
- Update logging and error handling

f3502226

14 Jan, 2025 2 commits

refactor(BatchAnalyze): comment out image rotation logic in doclayout_yolo · 902dcd2c

myhloli authored Jan 14, 2025

902dcd2c

feat(layout): improve title block handling and layout detection · c20e9a1e

myhloli authored Jan 14, 2025

- Merge title blocks that are close to each other horizontally
- Adjust line insertion logic for title blocks
- Increase image size and decrease confidence threshold for layout detection
- Update DocLayoutYOLO model weights
- Refactor drawing of bounding boxes for different block types

c20e9a1e

10 Jan, 2025 1 commit

fix(device): enable MPS support and fix related issues · 203b8f90

myhloli authored Jan 10, 2025

- Add MPS support for Apple Silicon devices
- Implement empty_cache() for MPS devices
- Set PYTORCH_ENABLE_MPS_FALLBACK environment variable
- Adjust MFR model device allocation for MPS

203b8f90
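
As a rough sketch of what the MPS changes in 203b8f90 involve: the environment variable and the `torch.mps.empty_cache()` call are real PyTorch features, but the function name `empty_device_cache` and its return convention are assumptions made for illustration. The variable is set before `torch` is imported, which is the safe ordering.

```python
import os

# Set before importing torch so that operators without an MPS kernel
# fall back to the CPU instead of raising.
os.environ.setdefault("PYTORCH_ENABLE_MPS_FALLBACK", "1")


def empty_device_cache(device):
    """Release cached allocator memory for a device string (best effort).

    Hypothetical helper; returns True only if a cache was actually cleared.
    """
    try:
        import torch
    except ImportError:
        return False  # torch is not installed, so there is nothing to free
    if device.startswith("cuda") and torch.cuda.is_available():
        torch.cuda.empty_cache()
        return True
    if device.startswith("mps") and torch.backends.mps.is_available():
        torch.mps.empty_cache()
        return True
    return False
```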

09 Jan, 2025 3 commits

refactor(magic_pdf): update OCR engine selection in RapidTableModel · bd1b7677

myhloli authored Jan 09, 2025

- Remove conditional logic for OCR engine selection
- Always use RapidOCR as the OCR engine
- Simplify the __init__ method by removing unused code

bd1b7677

refactor(model): remove unused YOLO v11 language detection model · a80ff051

myhloli authored Jan 09, 2025

- Remove YOLO v11 language detection model from model_configs.yaml
- Update language detection utils to use a fixed model path instead of dynamic configuration
- Remove unused model weight parameter for YOLO v11 language detection

a80ff051

refactor(langdetect): simplify language detection model and improve logging · 3271cf75

myhloli authored Jan 09, 2025

- Remove LangDetectMode and related conditional logic
- Use a single model weight for language detection
- Add logging for language detection results
- Update model initialization and prediction methods

3271cf75

08 Jan, 2025 2 commits

feat(model): add language detection model and update related modules · 735f3a70

myhloli authored Jan 08, 2025

- Add language detection model initialization and integration
- Update model list to include language detection
- Refactor language detection utils for better model management

735f3a70

feat(language-detection): improve language detection accuracy for specific languages · 356cb1f2

myhloli authored Jan 08, 2025

- Add separate models for Chinese/Japanese and English/French/German detection
- Implement mode-based detection to use appropriate models for different languages
- Update language detection process to use higher DPI for better accuracy
- Modify model initialization and prediction logic to support new language-specific models

356cb1f2

06 Jan, 2025 2 commits
- fix(table): handle empty OCR result in rapidtable · 12caa784
  myhloli authored Jan 06, 2025
  - Add check for empty OCR result when using PaddleOCR model
  - Assign None to ocr_result if no text is detected, preventing further errors
  12caa784
- refactor: remove unused method in MagicModel class · d13f3c6d
  icecraft authored Jan 06, 2025
  
  d13f3c6d
05 Jan, 2025 1 commit

fix(magic-pdf): update OCR model selection logic · 16a0a350

myhloli authored Jan 05, 2025

- Add missing 'else' statement in OCR model selection logic
- Ensure consistent formatting of 'if' statements for better readability
- Remove unnecessary empty line in the 'app.py' file

16a0a350

03 Jan, 2025 2 commits

refactor(ocr): comment out unnecessary log statement · 04febf52

myhloli authored Jan 03, 2025

- Remove logger.info() call for additional_ocr_params to reduce log verbosity

04febf52

feat(model): add onnxruntime support for paddleocr on cpu · 512adb67

myhloli authored Jan 03, 2025

- Implement ONNXModelSingleton to manage ONNX models
- Modify ModifiedPaddleOCR to use ONNX models on ARM CPUs without CUDA
- Update RapidTableModel to use RapidOCR with ONNXRuntime on CPU
- Add rapidocr_onnxruntime dependency in setup.py

512adb67
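
The singleton idea behind `ONNXModelSingleton` in 512adb67 can be sketched as a process-wide cache that builds each model once per key. The class below is not the repository's implementation: the name `ModelSingleton`, the `get_model` signature, and the `factory` callback are all hypothetical.

```python
import threading


class ModelSingleton:
    """Process-wide cache that builds each model at most once per key."""

    _instance = None
    _lock = threading.Lock()

    def __new__(cls):
        # Every construction returns the same shared instance.
        with cls._lock:
            if cls._instance is None:
                cls._instance = super().__new__(cls)
                cls._instance._models = {}
        return cls._instance

    def get_model(self, key, factory):
        """Return the cached model for key, calling factory() on first use."""
        with self._lock:
            if key not in self._models:
                self._models[key] = factory()
            return self._models[key]
```

In such a design a caller would pass a hashable key (for example a tuple of model name and language) together with a factory that loads the ONNX session, so repeated requests reuse the already loaded model.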

30 Dec, 2024 2 commits

refactor(magic_pdf): comment out npu-related code · 88b909e2

myhloli authored Dec 30, 2024

- Remove use_npu variable initialization
- Comment out device assignment and npu check
- Comment out use_npu parameter in ModifiedPaddleOCR constructor

88b909e2

fix(npu): correct module name for NPU operations · 2684e775

myhloli authored Dec 30, 2024

- Update `clean_memory.py` to use `torch_npu.npu` instead of `torch.npu`
- Update `model_utils.py` to use `torch_npu.npu` instead of `torch.npu`
- Simplify NPU availability check and bfloat16 support in `pdf_parse_union_core_v2.py`

2684e775

26 Dec, 2024 2 commits

refactor(device): optimize memory cleaning and device selection · 50f48417

myhloli authored Dec 26, 2024

- Update clean_memory function to support both CUDA and NPU devices
- Implement get_device function to centralize device selection logic
- Modify model initialization and memory cleaning to use the selected device
- Update RapidTableModel to support both RapidOCR and PaddleOCR engines

50f48417
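
A minimal sketch of centralized device selection and memory cleaning along the lines of 50f48417 follows. The function names `get_device` and `clean_memory` come from the commit message, but the bodies, the CUDA-before-NPU priority, and the CPU fallback are assumptions; the `torch_npu.npu` calls are assumed to mirror their `torch.cuda` counterparts.

```python
import gc
import importlib.util


def get_device():
    """Pick a device string: CUDA first, then NPU, else CPU (assumed order)."""
    if importlib.util.find_spec("torch") is None:
        return "cpu"
    import torch

    if torch.cuda.is_available():
        return "cuda"
    if importlib.util.find_spec("torch_npu") is not None:
        import torch_npu  # Ascend plugin that provides torch_npu.npu

        if torch_npu.npu.is_available():
            return "npu"
    return "cpu"


def clean_memory(device="cuda"):
    """Free cached accelerator memory for the device, then run Python GC."""
    try:
        if device.startswith("cuda"):
            import torch

            if torch.cuda.is_available():
                torch.cuda.empty_cache()
                torch.cuda.ipc_collect()
        elif device.startswith("npu"):
            import torch_npu

            if torch_npu.npu.is_available():
                torch_npu.npu.empty_cache()
    except ImportError:
        pass  # the backend is not installed; only the Python GC runs
    gc.collect()
```

Keeping the selection in one function means model initialization and memory cleaning can both call `get_device()` and stay in agreement about which backend is active.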

feat(model): add npu support and optimize table model · 7990e7df

myhloli authored Dec 26, 2024

- Add NPU support for memory cleaning and model initialization
- Optimize table model initialization and prediction process
- Update memory utils to support NPU
- Add language parameter for table model

7990e7df

18 Dec, 2024 2 commits

refactor(magic_pdf): move model config variables · 489f70e9

myhloli authored Dec 18, 2024

- Move __use_inside_model__ and __model_mode__ from operators/__init__.py to model/__init__.py
- These variables are more appropriately located in the model module since they relate to model configuration

489f70e9

refactor: refactor code · b2887ca0

icecraft authored Dec 18, 2024

b2887ca0

17 Dec, 2024 1 commit

feat(language-detection): add YOLOv11 language detection model · 20438bd2

myhloli authored Dec 17, 2024

- Add YOLOv11 language detection model for PDF documents
- Implement language detection in PymuDocDataset
- Update app.py to include 'auto' language option
- Create language detection utilities and constants

20438bd2

16 Dec, 2024 1 commit

refactor(magic_pdf): remove YOLO_VERBOSE setting and update YOLOv8 prediction verbosity · 9e4ebea9

myhloli authored Dec 16, 2024

- Remove YOLO_VERBOSE environment variable from multiple files
- Set verbose=False in YOLOv8 prediction method to suppress logger output

9e4ebea9

13 Dec, 2024 2 commits
- feat: add logging for detection time in BatchAnalyze when OCR is not applied · be010394
  Suven authored Dec 13, 2024
  
  be010394
- feat: enhance batch processing in BatchAnalyze with layout and OCR timing logs · 49bfdf07
  Suven authored Dec 13, 2024
  
  49bfdf07
12 Dec, 2024 3 commits
- fix: batch methods in DocLayoutYOLO and YOLOv8 models · 4fd1e41e
  Suven authored Dec 12, 2024
  
  4fd1e41e
- feat: add batch prediction methods for YOLOv8 and Unimernet models · 7ce9edc6
  Suven authored Dec 12, 2024
  
  7ce9edc6
- perf(layout): optimize layout detection for PDF extraction · 6a75d7dc
  myhloli authored Dec 12, 2024
  - Add initial setup for layout detection
  - Implement conditional cropping for tall images
  - Skip cropping for wide images to improve performance
  - Reuse Image object across layout detection steps
  6a75d7dc
11 Dec, 2024 2 commits
- feat(layout): improve layout detection for DocLayout_YOLO model · f5d812b3
  myhloli authored Dec 11, 2024
  - Implement image cropping and pasting technique to enhance layout detection
  - Adjust detected polygons to original image coordinates
  - Add comments for better code readability
  f5d812b3
- feat: remove pipe_auto_mode · 302a6950
  xu rui authored Dec 11, 2024
  
  302a6950
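
The coordinate adjustment mentioned in f5d812b3 (mapping polygons detected on a cropped image back to the original image) can be sketched as below. This is only an illustration under stated assumptions: the function name `to_original_coords`, the flat `[x0, y0, x1, y1, ...]` polygon layout, and the optional `scale` parameter are hypothetical and not taken from the repository.

```python
def to_original_coords(poly, offset_x, offset_y, scale=1.0):
    """Map a flat polygon [x0, y0, x1, y1, ...] from crop space to the original image.

    offset_x, offset_y: top-left corner of the crop inside the original image.
    scale: factor by which the crop was resized before detection (1.0 = no resize).
    """
    if len(poly) % 2 != 0:
        raise ValueError("polygon must contain an even number of values")
    # Even positions are x values, odd positions are y values.
    return [
        (value / scale) + (offset_x if index % 2 == 0 else offset_y)
        for index, value in enumerate(poly)
    ]
```

For example, a box detected at `[0, 0, 10, 0, 10, 20, 0, 20]` inside a crop whose top-left corner sits at `(100, 50)` maps to `[100, 50, 110, 50, 110, 70, 100, 70]` in the original image.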
10 Dec, 2024 4 commits

refactor(model): update import paths for PaddleOCR modules · 061c03a0

myhloli authored Dec 11, 2024

- Change import paths from paddleocr.ppocr to ppocr for utility functions
- Update import paths for logging and utility modules in ppocr_273_mod.py
- Modify import paths for tablemaster_paddle.py to use ppstructure instead of paddleocr.ppstructure

061c03a0

refactor(tablemaster): update import paths for TableSystem and init_args · 01cd633d

myhloli authored Dec 11, 2024

- Change import path for TableSystem from 'ppstructure.table.predict_table' to 'paddleocr.ppstructure.table.predict_table'
- Change import path for init_args from 'ppstructure.utility' to 'paddleocr.ppstructure.utility'

01cd633d

refactor(magic_pdf): update paddleocr module import paths · 56fad23d

myhloli authored Dec 11, 2024

- Modify import paths for paddleocr utilities in ocr_utils.py and ppocr_273_mod.py
- Change from `ppocr.utils.utility` to `paddleocr.ppocr.utils.utility`
- Update related import statements in two files to reflect the new path

56fad23d

fix(magic_pdf): disable PaddlePaddle signal handler · dd7f6781

myhloli authored Dec 10, 2024

- Import paddle module and disable its signal handler to prevent interference with other components
- This change addresses potential conflicts between PaddlePaddle and other libraries or system signals

dd7f6781

09 Dec, 2024 2 commits

refactor(magic_pdf): optimize environment setup and dependencies · a296ea41

myhloli authored Dec 09, 2024

- Add environment variables to disable albumentations and yolo updates
- Import torchtext and disable deprecation warnings
- Update unimernet to 0.2.2
- Specify ultralytics version as >=8.3.48
- Remove upper version limit for torch

a296ea41

fix: add parse_pdf_type and version · 57f9f9dc

icecraft authored Dec 09, 2024

57f9f9dc

07 Dec, 2024 1 commit
- fix: 1. ocr txt mode error 2. lose pdf_parse_type field · 87af738a
  sawmice authored Dec 07, 2024
  
  87af738a
06 Dec, 2024 1 commit

refactor(magic-pdf): optimize model initialization and concurrency control · 012a46e0

myhloli authored Dec 06, 2024

- Remove concurrency limit logic from app.py
- Update model initialization process in various modules
- Remove unused VRAM check for concurrency limit
- Refactor OCR model initialization in pdf_extract_kit.py
- Update txt_spans_extract_v2 function to use lang parameter instead of ocr_model

012a46e0