- 23 Jan, 2025 6 commits
-
-
Xiaomeng Zhao authored
docs(README): update online demo links and enhance documentation readability
-
myhloli authored
- Update online demo links in both English and Chinese README files
-
Xiaomeng Zhao authored
docs(readme): update changelog for v1.1.0 release
-
myhloli authored
docs(readme): update changelog for v1.1.0 release - Update model capabilities: upgrade to the latest doclayout_yolo(2501) and unimernet(2501) models - Improve performance: optimize resource usage and processing pipeline for faster parsing on high-end devices - Enhance parsing effects: add new heading classification feature to online demo - Refactor changelog structure for better readability and organization
-
Xiaomeng Zhao authored
feat(table-config): add sub_model configuration for rapid_table
-
myhloli authored
- Add sub_model configuration option for rapid_table model - Provide two sub_model options: slanet_plus and unitable
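A minimal sketch of how such an option could be validated. Only the two option names (`slanet_plus`, `unitable`) come from the commit; the `table_config` layout, the `resolve_sub_model` helper, and the default value are assumptions made for illustration.

```python
# The two sub_model options named in the commit message.
ALLOWED_SUB_MODELS = ("slanet_plus", "unitable")


def resolve_sub_model(table_config, default="slanet_plus"):
    """Return the configured sub_model, rejecting unknown values.

    `table_config` is a hypothetical dict; the default is an assumption.
    """
    sub_model = table_config.get("sub_model", default)
    if sub_model not in ALLOWED_SUB_MODELS:
        raise ValueError(f"unsupported sub_model: {sub_model!r}")
    return sub_model


# Hypothetical configuration fragment selecting the unitable sub-model.
table_config = {"model": "rapid_table", "sub_model": "unitable"}
selected = resolve_sub_model(table_config)
```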
-
- 22 Jan, 2025 8 commits
-
-
Xiaomeng Zhao authored
docs(url): update MinerU links in header
-
myhloli authored
- Change MinerU homepage link from 'https://mineru.org.cn/home?source=online' to 'https://mineru.net/home?source=online' - Change MinerU client download link from 'https://mineru.org.cn/client?source=online' to 'https://mineru.net/client?source=online'
-
Xiaomeng Zhao authored
docs(readme): update readme for 1.1.0
-
myhloli authored
-
myhloli authored
- Add timing measurement for formula, text, and title optimization using LLM - Log the execution time for each LLM aided process
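One way to time each LLM-aided step and log the result is a small context manager. This is only a sketch: the `log_elapsed` name, the step label, and the optional `sink` dict are hypothetical and not taken from the project.

```python
import logging
import time
from contextlib import contextmanager

logger = logging.getLogger(__name__)


@contextmanager
def log_elapsed(step_name, sink=None):
    """Log how long the wrapped block took; optionally record it in `sink`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        logger.info("%s took %.2f s", step_name, elapsed)
        if sink is not None:
            sink[step_name] = elapsed


timings = {}
with log_elapsed("llm_aided_title", timings):
    # Placeholder for the actual LLM-aided title optimization call.
    sum(range(1000))
```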
-
Xiaomeng Zhao authored
refactor(pdf_parse): uncomment char bbox validation logic
-
myhloli authored
- Add a check to return 0 when either bbox1_area or bbox2_area is zero - This prevents division by zero errors when calculating IoU
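The zero-area guard can be sketched as follows, assuming boxes are `(x0, y0, x1, y1)` tuples. The function names are illustrative, not the project's actual code.

```python
def bbox_area(bbox):
    """Area of an (x0, y0, x1, y1) box; degenerate boxes have area 0."""
    x0, y0, x1, y1 = bbox
    return max(0.0, x1 - x0) * max(0.0, y1 - y0)


def calculate_iou(bbox1, bbox2):
    """Intersection over union, returning 0 if either box has zero area."""
    bbox1_area = bbox_area(bbox1)
    bbox2_area = bbox_area(bbox2)
    # Guard described in the commit: avoid dividing by a zero union.
    if bbox1_area == 0 or bbox2_area == 0:
        return 0.0

    # Width and height of the overlapping rectangle (0 if disjoint).
    inter_w = max(0.0, min(bbox1[2], bbox2[2]) - max(bbox1[0], bbox2[0]))
    inter_h = max(0.0, min(bbox1[3], bbox2[3]) - max(bbox1[1], bbox2[1]))
    intersection = inter_w * inter_h

    return intersection / (bbox1_area + bbox2_area - intersection)
```

With both areas positive, the union is also positive, so the division is safe.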
-
myhloli authored
- Restore commented code for filtering out characters with invalid bounding boxes - This change may affect the filtering of unnecessary characters in PDF parsing
-
- 21 Jan, 2025 11 commits
-
-
Xiaomeng Zhao authored
perf(magic_pdf): optimize batch processing for GPU
-
myhloli authored
- Update conditions for batch ratio assignment:
  - 8 <= gpu_memory < 10: batch_ratio = 2
  - 10 <= gpu_memory <= 12: batch_ratio = 4
- This fix ensures proper batch ratio selection for GPU memory sizes
-
myhloli authored
- Improve batch ratio calculation based on GPU memory - Enhance performance for devices with 8GB or more VRAM
-
Xiaomeng Zhao authored
perf(magic_pdf): adjust batch ratio calculation for GPU memory
-
myhloli authored
- Reduce batch_ratio by 1 for better performance and stability - This change ensures more consistent memory usage when processing documents
-
Xiaomeng Zhao authored
perf(magic_pdf): optimize batch ratio calculation for GPU
-
myhloli authored
refactor(magic_pdf): adjust VRAM allocation and MFR batch size - Update VRAM allocation logic to use 'VIRTUAL_VRAM_SIZE' environment variable - Reduce MFR (Math Formula Recognition) batch size from 64 to 32
-
myhloli authored
- Update GPU memory check and batch ratio calculation logic - Add support for virtual VRAM size environment variable - Improve logging for GPU memory and batch ratio
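A rough sketch of an environment-variable override for the VRAM size. Only the variable name `VIRTUAL_VRAM_SIZE` comes from the commit; treating the value as a number of gigabytes, falling back to a detected value, and the `get_vram_gb` helper itself are assumptions.

```python
import logging
import os

logger = logging.getLogger(__name__)


def get_vram_gb(detected_gb, env=os.environ):
    """Return the VRAM size in GB, preferring VIRTUAL_VRAM_SIZE when set.

    `detected_gb` stands in for whatever the real GPU query reports.
    """
    raw = env.get("VIRTUAL_VRAM_SIZE")
    vram_gb = detected_gb
    if raw:
        try:
            vram_gb = float(raw)  # assumption: the value is given in GB
        except ValueError:
            logger.warning("ignoring invalid VIRTUAL_VRAM_SIZE=%r", raw)
    logger.info("gpu_memory: %s GB", vram_gb)
    return vram_gb
```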
-
myhloli authored
- Reduce YOLO_LAYOUT_BASE_BATCH_SIZE from 4 to 1 - Simplify batch ratio calculation for formula detection - Remove unused conditional logic in batch ratio determination
-
Xiaomeng Zhao authored
fix(models): update unimernet_small model path
-
myhloli authored
- Update model path from 'unimernet_small' to 'unimernet_small_2501' in multiple scripts and configuration files - This change affects download_models.py, download_models_hf.py, and model_configs.yaml
-
- 20 Jan, 2025 6 commits
-
-
Xiaomeng Zhao authored
fix(ocr): improve ONNX model initialization and error handling
-
myhloli authored
- Add key length validation for ONNX model initialization - Move import statements to the top of the file - Wrap model initialization in a try-except block for better error handling - Refactor code to improve readability and maintainability
-
Xiaomeng Zhao authored
feat(pdf_parse): remove tilted lines for better text extraction
-
myhloli authored
- Add remove_tilted_line function to filter out lines with angles between 2 and 88 degrees - Integrate the new function into the text extraction process - Improve the accuracy of text block processing by removing non-horizontal/vertical lines
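The filter could look roughly like this, assuming each line carries a PyMuPDF-style `dir` entry holding the `(cosine, sine)` of its writing direction. The block structure, the folding of angles into the 0 to 90 degree range, and the strict bounds are assumptions; the commit only states the 2 and 88 degree limits.

```python
import math


def line_tilt_degrees(direction):
    """Angle between the line and the nearest horizontal axis, in [0, 90]."""
    cosine, sine = direction
    angle = abs(math.degrees(math.atan2(sine, cosine))) % 180
    return min(angle, 180 - angle)


def remove_tilted_line(text_blocks):
    """Drop lines tilted strictly between 2 and 88 degrees from each block."""
    for block in text_blocks:
        block["lines"] = [
            line
            for line in block["lines"]
            if not (2 < line_tilt_degrees(line["dir"]) < 88)
        ]
    return text_blocks
```

Horizontal and vertical lines, and lines within about 2 degrees of either, pass through unchanged.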
-
Xiaomeng Zhao authored
LGTM
-
陆逊 authored
-
- 17 Jan, 2025 8 commits
-
-
Xiaomeng Zhao authored
feat(llm_aided): add reasonability check and fine-tuning guidelines
-
myhloli authored
- Added instructions for checking the reasonability of heading levels - Included guidelines for making fine adjustments based on context and logic - Emphasized the importance of aligning the final result with the document's actual structure
-
Xiaomeng Zhao authored
fix(magic_pdf): limit batch ratio for GPU memory
-
myhloli authored
- Commented out the original batch ratio calculation - Set a fixed batch ratio of 2 for GPUs with less than 8 GB memory - Increased batch ratio to 4 for GPUs with 8 GB or more memory
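The two-tier rule described in this commit amounts to the following sketch. The function name is hypothetical, and the thresholds reflect this commit only.

```python
def get_batch_ratio(gpu_memory_gb):
    """Fixed batch ratio: 4 for GPUs with 8 GB or more, otherwise 2."""
    return 4 if gpu_memory_gb >= 8 else 2
```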
-
Xiaomeng Zhao authored
refactor(model): update config version check to 1.1.1
-
myhloli authored
- Update the version check in download_models.py and download_models_hf.py - Change the threshold from '1.1.0' to '1.1.1' for model configuration updates
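A version threshold like this is safest when compared numerically rather than as raw strings. The sketch below assumes plain dotted numeric versions; `needs_config_update` is a hypothetical helper, and how the download scripts actually compare versions is not stated in the commit.

```python
def parse_version(version):
    """Turn a dotted version such as '1.1.0' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def needs_config_update(current_version, threshold="1.1.1"):
    """True when the stored config version is older than the threshold."""
    return parse_version(current_version) < parse_version(threshold)
```

Tuple comparison avoids the pitfall where the string '1.10.0' sorts before '1.1.1'.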
-
Xiaomeng Zhao authored
refactor(table): add device configuration for Unitable model
-
myhloli authored
- Import get_device function from magic_pdf.libs.config_reader - Update RapidTableModel initialization to include device parameter for Unitable model
-
- 16 Jan, 2025 1 commit
-
-
Xiaomeng Zhao authored
docs(README): update WeChat group link
-