Commits · 88b909e20e0e27da406e694bf0010ea229e54406 · wangsen / MinerU

30 Dec, 2024 2 commits

refactor(magic_pdf): comment out npu-related code · 88b909e2

myhloli authored Dec 30, 2024

- Remove use_npu variable initialization
- Comment out device assignment and npu check
- Comment out use_npu parameter in ModifiedPaddleOCR constructor

88b909e2

fix(npu): correct module name for NPU operations · 2684e775

myhloli authored Dec 30, 2024

- Update `clean_memory.py` to use `torch_npu.npu` instead of `torch.npu`
- Update `model_utils.py` to use `torch_npu.npu` instead of `torch.npu`
- Simplify NPU availability check and bfloat16 support in `pdf_parse_union_core_v2.py`

2684e775

26 Dec, 2024 2 commits

refactor(device): optimize memory cleaning and device selection · 50f48417

myhloli authored Dec 26, 2024

- Update clean_memory function to support both CUDA and NPU devices
- Implement get_device function to centralize device selection logic
- Modify model initialization and memory cleaning to use the selected device
- Update RapidTableModel to support both RapidOCR and PaddleOCR engines

50f48417
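
The device-selection and memory-cleaning changes described above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: only the names `get_device` and `clean_memory` and the `torch_npu.npu` module path come from the commit messages; the function bodies, return values, and fallback order are assumptions.

```python
import gc


def get_device() -> str:
    """Return 'cuda', 'npu', or 'cpu' (assumed return values, for illustration)."""
    try:
        import torch
    except ImportError:
        return "cpu"  # PyTorch is not installed at all
    if torch.cuda.is_available():
        return "cuda"
    try:
        # Ascend NPU calls live under torch_npu.npu rather than torch.npu
        import torch_npu
        if torch_npu.npu.is_available():
            return "npu"
    except Exception:
        pass  # plugin missing or unusable: fall through to CPU
    return "cpu"


def clean_memory(device: str = "cuda") -> None:
    """Release cached accelerator memory for the given device, then run the GC."""
    if device == "cuda":
        import torch
        if torch.cuda.is_available():
            torch.cuda.empty_cache()
            torch.cuda.ipc_collect()
    elif device == "npu":
        import torch_npu
        if torch_npu.npu.is_available():
            torch_npu.npu.empty_cache()
    gc.collect()
```

Calling `clean_memory(get_device())` keeps the device decision in one place, which appears to be the intent of centralizing the selection logic.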

feat(model): add npu support and optimize table model · 7990e7df

myhloli authored Dec 26, 2024

- Add NPU support for memory cleaning and model initialization
- Optimize table model initialization and prediction process
- Update memory utils to support NPU
- Add language parameter for table model

7990e7df

18 Dec, 2024 2 commits

refactor(magic_pdf): move model config variables · 489f70e9

myhloli authored Dec 18, 2024

- Move __use_inside_model__ and __model_mode__ from operators/__init__.py to model/__init__.py
- These variables are more appropriately located in the model module since they relate to model configuration

489f70e9

refactor: refactor code · b2887ca0

icecraft authored Dec 18, 2024

b2887ca0

17 Dec, 2024 1 commit

feat(language-detection): add YOLOv11 language detection model · 20438bd2

myhloli authored Dec 17, 2024

- Add YOLOv11 language detection model for PDF documents
- Implement language detection in PymuDocDataset
- Update app.py to include 'auto' language option
- Create language detection utilities and constants

20438bd2

16 Dec, 2024 1 commit

refactor(magic_pdf): remove YOLO_VERBOSE setting and update YOLOv8 prediction verbosity · 9e4ebea9

myhloli authored Dec 16, 2024

- Remove YOLO_VERBOSE environment variable from multiple files
- Set verbose=False in YOLOv8 prediction method to suppress logger output

9e4ebea9

13 Dec, 2024 2 commits
- feat: add logging for detection time in BatchAnalyze when OCR is not applied · be010394
  Suven authored Dec 13, 2024
  
  be010394
- feat: enhance batch processing in BatchAnalyze with layout and OCR timing logs · 49bfdf07
  Suven authored Dec 13, 2024
  
  49bfdf07
12 Dec, 2024 3 commits
- fix: batch methods in DocLayoutYOLO and YOLOv8 models · 4fd1e41e
  Suven authored Dec 12, 2024
  
  4fd1e41e
- feat: add batch prediction methods for YOLOv8 and Unimernet models · 7ce9edc6
  Suven authored Dec 12, 2024
  
  7ce9edc6
- perf(layout): optimize layout detection for PDF extraction · 6a75d7dc
  myhloli authored Dec 12, 2024
  
  - Add initial setup for layout detection
  - Implement conditional cropping for tall images
  - Skip cropping for wide images to improve performance
  - Reuse Image object across layout detection steps
  
  6a75d7dc
11 Dec, 2024 2 commits
- feat(layout): improve layout detection for DocLayout_YOLO model · f5d812b3
  myhloli authored Dec 11, 2024
  
  - Implement image cropping and pasting technique to enhance layout detection
  - Adjust detected polygons to original image coordinates
  - Add comments for better code readability
  
  f5d812b3
- feat: remove pipe_auto_mode · 302a6950
  xu rui authored Dec 11, 2024
  
  302a6950
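
The "adjust detected polygons to original image coordinates" step in f5d812b3 can be illustrated with a small sketch. It assumes a region is cropped from the page at `crop_origin`, pasted onto a canvas at `paste_origin`, and detection then runs on the canvas. The function name and both parameters are hypothetical and do not come from the repository.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def to_original_coords(
    poly: List[Point],
    crop_origin: Point,
    paste_origin: Point,
) -> List[Point]:
    """Map a polygon detected on the pasted canvas back to page coordinates."""
    # Undo the paste offset, then re-apply the crop offset.
    dx = crop_origin[0] - paste_origin[0]
    dy = crop_origin[1] - paste_origin[1]
    return [(x + dx, y + dy) for x, y in poly]


detected = [(60, 70), (160, 70), (160, 120), (60, 120)]
restored = to_original_coords(detected, crop_origin=(100, 200), paste_origin=(50, 50))
```

Here the detected box at (60, 70) on the canvas maps back to (110, 220) on the original page.
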
10 Dec, 2024 4 commits

refactor(model): update import paths for PaddleOCR modules · 061c03a0

myhloli authored Dec 11, 2024

- Change import paths from paddleocr.ppocr to ppocr for utility functions
- Update import paths for logging and utility modules in ppocr_273_mod.py
- Modify import paths for tablemaster_paddle.py to use ppstructure instead of paddleocr.ppstructure

061c03a0

refactor(tablemaster): update import paths for TableSystem and init_args · 01cd633d

myhloli authored Dec 11, 2024

- Change import path for TableSystem from 'ppstructure.table.predict_table' to 'paddleocr.ppstructure.table.predict_table'
- Change import path for init_args from 'ppstructure.utility' to 'paddleocr.ppstructure.utility'

01cd633d

refactor(magic_pdf): update paddleocr module import paths · 56fad23d

myhloli authored Dec 11, 2024

- Modify import paths for paddleocr utilities in ocr_utils.py and ppocr_273_mod.py
- Change from `ppocr.utils.utility` to `paddleocr.ppocr.utils.utility`
- Update related import statements in two files to reflect the new path

56fad23d
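
The three commits above move the same imports back and forth between `ppocr.*` and `paddleocr.ppocr.*`. One way to tolerate both layouts is a fallback import, sketched below. The helper `import_first` is hypothetical and is not part of the repository; only the two module paths in the usage comment are taken from the commit messages.

```python
import importlib
from types import ModuleType
from typing import Sequence


def import_first(candidates: Sequence[str]) -> ModuleType:
    """Import and return the first module in `candidates` that can be imported."""
    last_error = None
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError as exc:
            last_error = exc  # remember the failure and try the next layout
    raise ImportError(f"none of {list(candidates)} could be imported") from last_error


# Intended usage (requires PaddleOCR to be installed):
# utility = import_first(["paddleocr.ppocr.utils.utility", "ppocr.utils.utility"])
```

Whether a fallback is preferable to pinning a single PaddleOCR version depends on how many package layouts need to be supported.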

fix(magic_pdf): disable PaddlePaddle signal handler · dd7f6781

myhloli authored Dec 10, 2024

- Import paddle module and disable its signal handler to prevent interference with other components
- This change addresses potential conflicts between PaddlePaddle and other libraries or system signals

dd7f6781
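
A minimal sketch of the change described above, assuming it relies on PaddlePaddle's public `paddle.disable_signal_handler()` call. The wrapper function and its guarded import are illustrative additions, not the repository's code.

```python
def disable_paddle_signal_handler() -> bool:
    """Turn off PaddlePaddle's signal handlers if paddle is installed.

    Returns True when the handler was disabled and False when paddle is missing.
    """
    try:
        import paddle
    except ImportError:
        return False
    # Prevents Paddle's handlers from interfering with other libraries' signal handling.
    paddle.disable_signal_handler()
    return True
```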

09 Dec, 2024 2 commits

refactor(magic_pdf): optimize environment setup and dependencies · a296ea41

myhloli authored Dec 09, 2024

- Add environment variables to disable albumentations and yolo updates
- Import torchtext and disable deprecation warnings
- Update unimernet to 0.2.2
- Specify ultralytics version as >=8.3.48
- Remove upper version limit for torch

a296ea41

fix: add parse_pdf_type and version · 57f9f9dc

icecraft authored Dec 09, 2024

57f9f9dc

07 Dec, 2024 1 commit
- fix: (1) OCR txt mode error; (2) missing pdf_parse_type field · 87af738a
  sawmice authored Dec 07, 2024
  
  87af738a
06 Dec, 2024 9 commits

refactor(magic-pdf): optimize model initialization and concurrency control · 012a46e0

myhloli authored Dec 06, 2024

- Remove concurrency limit logic from app.py
- Update model initialization process in various modules
- Remove unused VRAM check for concurrency limit
- Refactor OCR model initialization in pdf_extract_kit.py
- Update txt_spans_extract_v2 function to use lang parameter instead of ocr_model

012a46e0

refactor(ocr): replace AtomModelSingleton with ocr_model_init for OCR model instantiation · 47a83d28

myhloli authored Dec 06, 2024

- Remove usage of AtomModelSingleton for OCR model creation
- Add ocr_model_init function to initialize OCR model
- Update OCR model initialization in pdf_extract_kit.py and pdf_parse_union_core_v2.py
- Modify txt_spans_extract_v2 function to accept ocr_model as a parameter
- Update parse_page_core function to use ocr_model instead of lang for OCR processing

47a83d28

refactor(model): implement thread-safe OCR model initialization · f2a92d57

myhloli authored Dec 06, 2024

- Add threading support for OCR model initialization
- Modify AtomModelSingleton to handle thread-specific instances
- Update PDFExtractKit and PDFParseUnionCoreV2 to use new thread-safe OCR initialization

f2a92d57
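
"Thread-specific instances" can be illustrated with `threading.local`, as in the sketch below. It does not reproduce `AtomModelSingleton`; the function `get_thread_model` and its `factory` argument are hypothetical names used only for illustration.

```python
import threading

# Each thread sees its own attributes on this object.
_thread_state = threading.local()


def get_thread_model(factory):
    """Return this thread's model, creating it with `factory` on first use."""
    model = getattr(_thread_state, "model", None)
    if model is None:
        model = factory()
        _thread_state.model = model
    return model
```

Because every thread builds its own instance, no lock is needed around inference, at the cost of one model copy per thread.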

refactor(magic_pdf): remove unused threading lock and model initialization code · a1744b77

myhloli authored Dec 06, 2024

- Remove threading.Lock import and usage
- Delete unused model initialization comments and code
- Simplify OCR model initialization in both pdf_extract_kit.py and pdf_parse_union_core_v2.py

a1744b77

refactor(model): replace AtomModelSingleton with ocr_model_init for OCR model initialization · 488660dd

myhloli authored Dec 06, 2024

- Remove usage of AtomModelSingleton for OCR model initialization
- Add import of ocr_model_init from model_init module
- Update OCR model initialization process to use ocr_model_init function
- Remove lock for OCR processing as it's no longer needed

488660dd

refactor(model): replace ModelSingleton with direct model initialization and improve threading · 6f636b6e

myhloli authored Dec 06, 2024

- Remove usage of ModelSingleton class
- Initialize model directly using custom_model_init function
- Add self._lock attribute to PDFExtractKit class for thread safety
- Replace local lock with self._lock for OCR processing

6f636b6e

fix(model): simplify model initialization logic · a9723c61

myhloli authored Dec 06, 2024

a9723c61

refactor(magic_pdf): optimize model initialization and threading · 878f3de0

赵小蒙 authored Dec 06, 2024

- Remove unnecessary threading.Lock in AtomModelSingleton
- Add threading.Lock to CustomPEKModel for OCR processing
- Simplify model initialization logic in AtomModelSingleton

878f3de0

perf(model): optimize model initialization · ce592f8b

myhloli authored Dec 06, 2024

- Add condition to return existing model if already initialized
- Improve efficiency by avoiding redundant model creation

ce592f8b

05 Dec, 2024 1 commit

perf(model): add threading lock for OCR model initialization · 04478095

myhloli authored Dec 05, 2024

- Introduce a lock to synchronize access to OCR model initialization
- This change improves thread safety when multiple threads access the OCR model concurrently
- The lock ensures that the OCR model is initialized only once, even in multi-threaded scenarios

04478095
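
The "initialize only once, even in multi-threaded scenarios" behaviour described above is commonly implemented with a lock and a second check inside it. The sketch below shows that pattern under assumed names (`get_ocr_model`, `factory`); it is not the code from this commit.

```python
import threading

_ocr_model = None
_ocr_lock = threading.Lock()


def get_ocr_model(factory):
    """Create the shared OCR model at most once, even under concurrent calls."""
    global _ocr_model
    if _ocr_model is None:          # fast path: skip the lock once initialized
        with _ocr_lock:
            if _ocr_model is None:  # re-check: another thread may have won the race
                _ocr_model = factory()
    return _ocr_model
```

The lock only serializes initialization; whether the model itself is safe to call from several threads at once is a separate question.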

03 Dec, 2024 5 commits

fix(vram): improve VRAM checking logic · 104273cc

myhloli authored Dec 03, 2024

- Update VRAM checking logic in app.py and model_utils.py
- Add None and type checks for VRAM values
- Adjust concurrency limit calculation in app.py
- Modify clean_vram function to handle cases with no VRAM information

104273cc

refactor: add docs · d44e7a28

xu rui authored Nov 29, 2024

d44e7a28

feat: add function definitions · 4a82d6a0

icecraft authored Nov 28, 2024

4a82d6a0

refactor: isolate inference and pipeline · a3a720ea

icecraft authored Nov 27, 2024

a3a720ea

feat(gradio_app): implement dynamic concurrency limit based on VRAM · b1fe9d4f

myhloli authored Dec 03, 2024

- Add get_concurrency_limit function to calculate concurrency limit based on VRAM
- Update clean_vram function and rename to get_vram for better clarity
- Apply concurrency limit to the to_markdown function in the Gradio app

b1fe9d4f
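
A VRAM-based concurrency limit, including the None and type checks mentioned in 104273cc, might look like the sketch below. Only the function name `get_concurrency_limit` comes from the commit message; the signature, the 8 GB-per-task budget, and the cap of 4 are assumptions chosen purely for illustration.

```python
def get_concurrency_limit(vram_gb, gb_per_task: float = 8.0, max_limit: int = 4) -> int:
    """Derive a concurrency limit from total VRAM in GB (thresholds are hypothetical)."""
    # Missing or non-numeric VRAM info: fall back to one task at a time.
    if isinstance(vram_gb, bool) or not isinstance(vram_gb, (int, float)):
        return 1
    if vram_gb <= 0:
        return 1
    # One slot per `gb_per_task` of VRAM, clamped to the range [1, max_limit].
    return max(1, min(max_limit, int(vram_gb // gb_per_task)))
```

With these assumed defaults, a 24 GB card yields a limit of 3, and unknown VRAM yields 1.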

29 Nov, 2024 1 commit
- refactor(ocr): Fix the error of paddleocr failing to initialize in a multi-threaded environment · 7f2f2c0f
  myhloli authored Nov 29, 2024
  
  7f2f2c0f
28 Nov, 2024 1 commit
- fix(lite_model): Adapt lite Mode to the Hybrid OCR Mode in Version 0.10 · 9b4d77dc
  myhloli authored Nov 28, 2024
  
  9b4d77dc
27 Nov, 2024 1 commit

refactor(ocr): remove unused functions and optimize OCR processing loop · 5f4410b4

myhloli authored Nov 27, 2024

- Remove unused function `calculate_angle_degrees`
- Refactor `calculate_is_angle` to use directly in OCR processing
- Eliminate unnecessary loop index `idx` in OCR processing loops

5f4410b4