- 08 Apr, 2025 1 commit
-
-
Xiaomeng Zhao authored
-
- 07 Apr, 2025 3 commits
-
-
Xiaomeng Zhao authored
docs: update torchvision version in CUDA installation guide
-
Xiaomeng Zhao authored
docs: update torchvision version in CUDA installation guide
-
myhloli authored
- Update torchvision version from 0.21.1 to 0.21.0 in Windows CUDA acceleration guides - Update both English and Chinese versions of the documentation
-
- 06 Apr, 2025 4 commits
-
-
Xiaomeng Zhao authored
build: remove accelerate dependency
-
myhloli authored
- Remove accelerate package from requirements.txt - This change ensures only necessary external dependencies are introduced
-
Xiaomeng Zhao authored
build(deps): add accelerate package and update requirements https://github.com/opendatalab/MinerU/issues/2112
-
myhloli authored
- Add accelerate package to support model training acceleration - Update requirements.txt to include new dependency
-
- 03 Apr, 2025 29 commits
-
-
Xiaomeng Zhao authored
master -> dev
-
myhloli authored
-
Xiaomeng Zhao authored
Release 1.3.0
-
Xiaomeng Zhao authored
docs(readme): update release notes for version 1.3.0
-
Xiaomeng Zhao authored
docs(readme): update release notes for version 1.3.0
-
myhloli authored
- Remove duplicate entries for paddleocr2torch and thread safety - Add new entry for real-time progress bar implementation - Update mfr model to unimernet(2503) - Extend torch version compatibility - Enhance CUDA support for various GPU models - Improve parsing speed on MPS devices
-
myhloli authored
- Update release notes in both English and Chinese README files - Highlight major optimizations and improvements in version 1.3.0 - Clarify compatibility changes for torch, CUDA, and Python versions - Emphasize performance improvements and parsing speed enhancements - Mention specific bug fixes and parsing effect optimizations
-
Xiaomeng Zhao authored
Release 1.3.0
-
Xiaomeng Zhao authored
fix: support non-pdf file in batch mode
-
Xiaomeng Zhao authored
fix: convert image with pymupdf
-
icecraft authored
-
Xiaomeng Zhao authored
fix: support non-pdf file in batch mode
-
icecraft authored
-
Xiaomeng Zhao authored
feat(web_api): update configuration and remove unused code
-
Xiaomeng Zhao authored
feat(web_api): update configuration and remove unused code
-
myhloli authored
- Comment out PaddlePaddle GPU installation in Dockerfile - Add OCR model download URL in download_models.py - Update config version in magic-pdf.json - Remove outdated information and simplify README.md - Remove volume creation for PaddleOCR models in Dockerfile
-
Xiaomeng Zhao authored
docs(user_guide): update installation guide and CUDA support
-
Xiaomeng Zhao authored
docs(user_guide): update installation guide and CUDA support
-
myhloli authored
- Update CUDA version requirements to 12.4 and higher - Add support for CUDA 12.6 and CANN environments - Update Python version requirements to 3.10-3.12 - Remove paddlepaddle-gpu installation and related instructions - Update magic-pdf installation command to use Aliyun mirror - Add storage requirements and update memory requirements - Update GPU hardware support list to include all GPUs with Tensor Cores - Add support for Apple Silicon
-
Xiaomeng Zhao authored
docs(readme): update changelog and compatibility information
-
Xiaomeng Zhao authored
docs(readme): update changelog and compatibility information
-
myhloli authored
- Update changelog for version 1.3.0 release - Clarify CUDA and GPU compatibility improvements - Add information about batch processing speed improvements - Update model download process and memory usage optimizations - Include link to batch processing demo script
-
Xiaomeng Zhao authored
feat(model): add tqdm progress bar to model prediction loops
-
Xiaomeng Zhao authored
feat(model): add tqdm progress bar to model prediction loops
-
myhloli authored
- Update table recognition logic to process each table individually - Refactor layout detection to use tqdm for progress tracking - Optimize OCR recognition by using a single tqdm wrapper - Improve MFR prediction with a more accurate progress bar - Simplify MFD prediction by removing unnecessary total calculation
-
myhloli authored
- Comment out OCR timing measurement code to improve readability and performance - Remove unnecessary logging of OCR processing time
-
myhloli authored
- Remove unused imports and comments - Increase MIN_BATCH_INFERENCE_SIZE from 100 to 200 - Comment out VRAM cleaning and logging in batch_analyze.py - Simplify code in doc_analyze_by_custom_model.py - Add tqdm progress bar in pdf_parse_union_core_v2.py - Enable tqdm in OCR processing
-
myhloli authored
- Remove outdated comments in table-config examples - Add tqdm to requirements in all Docker environments
-
myhloli authored
- Add tqdm progress bar to batch prediction loops in multiple model modules - Improve logging and error handling in batch analysis script - Update table model initialization to use default sub-model if none specified - Add tqdm dependency to requirements.txt
-
- 02 Apr, 2025 3 commits
-
-
Xiaomeng Zhao authored
feat(model): update Chinese OCR detection model to PP-OCRv3
-
Xiaomeng Zhao authored
feat(model): update Chinese OCR detection model to PP-OCRv3
-
myhloli authored
- Replace ch_PP-OCRv4_det_infer.pth with ch_PP-OCRv3_det_infer.pth in models_config.yml - Add new ch_PP-OCRv3_det_infer model configuration in arch_config.yaml
-