Commits · 79c8a5c8cb8653989505ef945828b08a8320641a · wangsen / MinerU

16 Jan, 2025 1 commit

feat(table): upgrade RapidTable to1.0.3 and add sub-model support · 79c8a5c8

myhloli authored Jan 16, 2025

- Update RapidTable dependency to version 1.0.3
- Add support for sub-models in RapidTable
- Update magic-pdf configuration to include table sub-model
- Modify table model initialization to support sub-models
- Update table prediction logic to handle new output format

79c8a5c8

14 Jan, 2025 1 commit

feat(layout): improve title block handling and layout detection · c20e9a1e

myhloli authored Jan 14, 2025

- Merge title blocks that are close to each other horizontally
- Adjust line insertion logic for title blocks- Increase image size and decrease confidence threshold for layout detection
- Update DocLayoutYOLO model weights
- Refactor drawing of bounding boxes for different block types

c20e9a1e

09 Jan, 2025 3 commits

refactor(magic_pdf): update OCR engine selection in RapidTableModel · bd1b7677

myhloli authored Jan 09, 2025

- Remove conditional logic for OCR engine selection
- Always use RapidOCR as the OCR engine
- Simplify the __init__ method by removing unused code

bd1b7677

refactor(model): remove unused YOLO v11 language detection model · a80ff051

myhloli authored Jan 09, 2025

- Remove YOLO v11 language detection model from model_configs.yaml
- Update language detection utils to use a fixed model path instead of dynamic configuration
- Remove unused model weight parameter for YOLO v11 language detection

a80ff051

refactor(langdetect): simplify language detection model and improve logging · 3271cf75

myhloli authored Jan 09, 2025

- Remove LangDetectMode and related conditional logic
- Use a single model weight for language detection
- Add logging for language detection results
- Update model initialization and prediction methods

3271cf75

08 Jan, 2025 2 commits

feat(model): add language detection model and update related modules · 735f3a70

myhloli authored Jan 08, 2025

- Add language detection model initialization and integration
- Update model list to include language detection
- Refactor language detection utils for better model management

735f3a70

feat(language-detection): improve language detection accuracy for specific languages · 356cb1f2

myhloli authored Jan 08, 2025

- Add separate models for Chinese/Japanese and English/French/German detection
- Implement mode-based detection to use appropriate models for different languages
- Update language detection process to use higher DPI for better accuracy
- Modify model initialization and prediction logic to support new language-specific models

356cb1f2

06 Jan, 2025 1 commit

fix(table): handle empty OCR result in rapidtable · 12caa784

myhloli authored Jan 06, 2025

- Add check for empty OCR result when using PaddleOCR model
- Assign None to ocr_result if no text is detected, preventing further errors

12caa784

05 Jan, 2025 1 commit

fix(magic-pdf): update OCR model selection logic · 16a0a350

myhloli authored Jan 05, 2025

- Add missing 'else' statement in OCR model selection logic
- Ensure consistent formatting of 'if' statements for better readability
- Remove unnecessary empty line in the 'app.py' file

16a0a350

03 Jan, 2025 2 commits

refactor(ocr): comment out unnecessary log statement · 04febf52
myhloli authored Jan 03, 2025
```
- Remove logger.info() call for additional_ocr_params to reduce log verbosity
```
04febf52

feat(model): add onnxruntime support for paddleocr on cpu · 512adb67

myhloli authored Jan 03, 2025

- Implement ONNXModelSingleton to manage ONNX models
- Modify ModifiedPaddleOCR to use ONNX models on ARM CPUs without CUDA
- Update RapidTableModel to use RapidOCR with ONNXRuntime on CPU
- Add rapidocr_onnxruntime dependency in setup.py

512adb67

30 Dec, 2024 2 commits

refactor(magic_pdf): comment out npu-related code · 88b909e2

myhloli authored Dec 30, 2024

- Remove use_npu variable initialization
- Comment out device assignment and npu check
- Comment out use_npu parameter in ModifiedPaddleOCR constructor

88b909e2

fix(npu): correct module name for NPU operations · 2684e775

myhloli authored Dec 30, 2024

- Update `clean_memory.py` to use `torch_npu.npu` instead of `torch.npu`
- Update `model_utils.py` to use `torch_npu.npu` instead of `torch.npu`
- Simplify NPU availability check and bfloat16 support in `pdf_parse_union_core_v2.py`

2684e775

26 Dec, 2024 2 commits

refactor(device): optimize memory cleaning and device selection · 50f48417

myhloli authored Dec 26, 2024

- Update clean_memory function to support both CUDA and NPU devices
- Implement get_device function to centralize device selection logic
- Modify model initialization and memory cleaning to use the selected device
- Update RapidTableModel to support both RapidOCR and PaddleOCR engines

50f48417

feat(model): add npu support and optimize table model · 7990e7df

myhloli authored Dec 26, 2024

- Add NPU support for memory cleaning and model initialization
- Optimize table model initialization and prediction process
- Update memory utils to support NPU
- Add language parameter for table model

7990e7df

17 Dec, 2024 1 commit

feat(language-detection): add YOLOv11 language detection model · 20438bd2

myhloli authored Dec 17, 2024

- Add YOLOv11 language detection model for PDF documents
- Implement language detection in PymuDocDataset
- Update app.py to include 'auto' language option
- Create language detection utilities and constants

20438bd2

16 Dec, 2024 1 commit

refactor(magic_pdf): remove YOLO_VERBOSE setting and update YOLOv8 prediction verbosity · 9e4ebea9

myhloli authored Dec 16, 2024

- Remove YOLO_VERBOSE environment variable from multiple files
- Set verbose=False in YOLOv8 prediction method to suppress logger output

9e4ebea9

12 Dec, 2024 2 commits
- fix: batch methods in DocLayoutYOLO and YOLOv8 models · 4fd1e41e
  Suven authored Dec 12, 2024
  
  4fd1e41e
- feat: add batch prediction methods for YOLOv8 and Unimernet models · 7ce9edc6
  Suven authored Dec 12, 2024
  
  7ce9edc6
10 Dec, 2024 3 commits

refactor(model): update import paths for PaddleOCR modules · 061c03a0

myhloli authored Dec 11, 2024

- Change import paths from paddleocr.ppocr to ppocr for utility functions
- Update import paths for logging and utility modules in ppocr_273_mod.py- Modify import paths for tablemaster_paddle.py to use ppstructure instead of paddleocr.ppstructure

061c03a0

refactor(tablemaster): update import paths for TableSystem and init_args · 01cd633d

myhloli authored Dec 11, 2024

- Change import path for TableSystem from 'ppstructure.table.predict_table' to 'paddleocr.ppstructure.table.predict_table'
- Change import path for init_args from 'ppstructure.utility' to 'paddleocr.ppstructure.utility'

01cd633d

refactor(magic_pdf): update paddleocr module import paths · 56fad23d

myhloli authored Dec 11, 2024

- Modify import paths for paddleocr utilities in ocr_utils.py and ppocr_273_mod.py
- Change from `ppocr.utils.utility` to `paddleocr.ppocr.utils.utility`
- Update related import statements in two files to reflect the new path

56fad23d

06 Dec, 2024 5 commits

refactor(magic-pdf): optimize model initialization and concurrency control · 012a46e0

myhloli authored Dec 06, 2024

- Remove concurrency limit logic from app.py
- Update model initialization process in various modules
- Remove unused VRAM check for concurrency limit
- Refactor OCR model initialization in pdf_extract_kit.py
- Update txt_spans_extract_v2 function to use lang parameter instead of ocr_model

012a46e0

refactor(model): implement thread-safe OCR model initialization · f2a92d57

myhloli authored Dec 06, 2024

- Add threading support for OCR model initialization
- Modify AtomModelSingleton to handle thread-specific instances
- Update PDFExtractKit and PDFParseUnionCoreV2 to use new thread-safe OCR initialization

f2a92d57

fix(model): simplify model initialization logic · a9723c61
myhloli authored Dec 06, 2024

a9723c61

refactor(magic_pdf): optimize model initialization and threading · 878f3de0

赵小蒙 authored Dec 06, 2024

- Remove unnecessary threading.Lock in AtomModelSingleton
- Add threading.Lock to CustomPEKModel for OCR processing
- Simplify model initialization logic in AtomModelSingleton

878f3de0

perf(model): optimize model initialization · ce592f8b

myhloli authored Dec 06, 2024

- Add condition to return existing model if already initialized
- Improve efficiency by avoiding redundant model creation

ce592f8b

05 Dec, 2024 1 commit

perf(model): add threading lock for OCR model initialization · 04478095

myhloli authored Dec 05, 2024

- Introduce a lock to synchronize access to OCR model initialization- This change improves thread safety when multiple threads access the OCR model concurrently
- The lock ensures that the OCR model is initialized only once, even in multi-threaded scenarios

04478095

03 Dec, 2024 2 commits

fix(vram): improve VRAM checking logic · 104273cc

myhloli authored Dec 03, 2024

- Update VRAM checking logic in app.py and model_utils.py
- Add None and type checks for VRAM values
- Adjust concurrency limit calculation in app.py
- Modify clean_vram function to handle cases with no VRAM information

104273cc

feat(gradio_app): implement dynamic concurrency limit based on VRAM · b1fe9d4f

myhloli authored Dec 03, 2024

- Add get_concurrency_limit function to calculate concurrency limit based on VRAM
- Update clean_vram function and rename to get_vram for better clarity
- Apply concurrency limit to the to_markdown function in the Gradio app

b1fe9d4f

29 Nov, 2024 1 commit
- refactor(ocr): Fix the error of paddleocr failing to initialize in a multi-threaded environment · 7f2f2c0f
  myhloli authored Nov 29, 2024
  
  7f2f2c0f
27 Nov, 2024 1 commit

refactor(ocr): remove unused functions and optimize OCR processing loop · 5f4410b4

myhloli authored Nov 27, 2024

- Remove unused function `calculate_angle_degrees`- Refactor `calculate_is_angle` to use directly in OCR processing
- Eliminate unnecessary loop index `idx` in OCR processing loops

5f4410b4

26 Nov, 2024 1 commit

feat(ocr): filter out low confidence ocr results · eb45a0e8

myhloli authored Nov 26, 2024

- Add confidence score threshold to filter out low confidence OCR results
- Improve OCR accuracy by ignoring less certain detections

eb45a0e8

22 Nov, 2024 1 commit

fix(table): add null check for OCR result in rapid table prediction · 18aa1a20

myhloli authored Nov 22, 2024

- Add a null check for OCR result in the predict method
- Return None values if OCR result is None to prevent further processing

18aa1a20

21 Nov, 2024 2 commits

refactor(txt_parse): improve text extraction accuracy with new algorithm · 309be741

myhloli authored Nov 21, 2024

- Implement new text extraction method (txt_spans_extract_v2) to enhance accuracy
- Add character filling in spans for better text reconstruction
- Introduce empty span handling using OCR for missed text
- Optimize span filtering and overlap removal

309be741

feat(ocr): improve text detection and OCR accuracy · b2e37a2d

myhloli authored Nov 21, 2024

- Update OCR utils to handle different box formats and improve angle calculation
- Modify PDF extraction kit to support OCR option and optimize processing flow
- Enhance PPOCR model to sort and filter detection boxes, improving text splitting accuracy

b2e37a2d

19 Nov, 2024 1 commit
- refactor: move some constants or enums defs to config folder · b492c19c
  icecraft authored Nov 19, 2024
  
  b492c19c
18 Nov, 2024 1 commit

feat(ocr): improve handling of angled text boxes · 4fd966eb

myhloli authored Nov 18, 2024

- Add calculate_is_angle function to detect angled text boxes
- Update update_det_boxes and merge_det_boxes functions to handle angled text boxes
- Modify angle detection logic in various parts of the code

4fd966eb

15 Nov, 2024 1 commit
- refactor(model): rename and restructure model modules · 08f46125
  myhloli authored Nov 15, 2024
  
  08f46125