- 19 Mar, 2025 1 commit
-
-
Xiaomeng Zhao authored
add support for more document types
-
- 14 Mar, 2025 2 commits
-
-
JesseChen1031 authored
-
JesseChen1031 authored
-
- 13 Mar, 2025 6 commits
- 12 Mar, 2025 1 commit
-
-
github-actions[bot] authored
-
- 11 Mar, 2025 4 commits
-
-
Xiaomeng Zhao authored
perf(inference): optimize batch processing for different GPU memory s…
-
myhloli authored
- Set NPUDTCompile to false for better performance on NPU
- Adjust batch ratio
-
Xiaomeng Zhao authored
fix(pre_proc): add Discarded block type to span block type compatibility
-
myhloli authored
- Include BlockType.Discarded in the list of compatible block types for ContentType.Text and ContentType.InlineEquation
- This change improves the OCR dictionary merging process by handling discarded blocks more effectively
-
- 07 Mar, 2025 1 commit
-
-
github-actions[bot] authored
-
- 06 Mar, 2025 1 commit
-
-
github-actions[bot] authored
-
- 04 Mar, 2025 5 commits
-
-
Xiaomeng Zhao authored
master->dev
-
myhloli authored
-
Xiaomeng Zhao authored
Release 1.2.2
-
Xiaomeng Zhao authored
refactor(magic_pdf): improve paragraph splitting logic and update dep…
-
myhloli authored
- Optimize paragraph splitting algorithm for better text block separation
- Update fast-langdetect dependency to ensure compatibility
-
- 03 Mar, 2025 19 commits
-
-
Xiaomeng Zhao authored
master -> dev
-
Xiaomeng Zhao authored
-
myhloli authored
-
Xiaomeng Zhao authored
Release 1.2.1
-
Xiaomeng Zhao authored
docs(readme): update changelog for v1.2.1 release
-
Xiaomeng Zhao authored
fix(readme): update changelog for v1.2.1 release
-
myhloli authored
- Update README.md and README_zh-CN.md with the latest changes
- Add details about bug fixes in version 1.2.1
- Include improvements for full-width to half-width conversion, caption matching, and formula span issues
-
Xiaomeng Zhao authored
perf(inference): adjust batch ratio for high GPU memory
-
myhloli authored
- Increase batch ratio to 8 for GPU memory >= 16GB
- Improve inference performance on systems with higher GPU memory
-
Xiaomeng Zhao authored
fix: caption match
-
Xiaomeng Zhao authored
refactor(pre_proc): allow interline equations to be associated with text blocks
-
myhloli authored
- Update OCR dictionary merge logic to include text blocks when processing interline equations
- This change improves the handling of equations that may be embedded within text content
-
icecraft authored
-
Xiaomeng Zhao authored
perf(mfr): improve Math Formula Recognition by sorting images by area
-
myhloli authored
- Simplify batch ratio logic for GPU memory >= 16GB
- Remove unnecessary conditions for 20GB and 40GB memory
-
myhloli authored
- Simplify batch ratio logic for GPU memory >= 16GB
- Remove unnecessary conditions for 20GB and 40GB memory
-
myhloli authored
- Sort detected images by area before processing to enhance MFR accuracy
- Implement stable sorting to maintain original order of images with equal
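The area-based stable sort described in this commit could look roughly like the sketch below. It is an illustration only, not the project's code: the function names `sort_indices_by_area` and `restore_order`, the `(width, height)` tuple representation, and the ascending order are all assumptions. It relies on the documented guarantee that Python's `sorted` is stable, so crops with equal area keep their original relative order.

```python
def sort_indices_by_area(sizes):
    """Return indices of `sizes` ordered by area (ascending, an assumption).

    `sizes` is a list of hypothetical (width, height) tuples. Because
    sorted() is stable, items with equal area keep their input order.
    """
    return sorted(range(len(sizes)), key=lambda i: sizes[i][0] * sizes[i][1])


def restore_order(sorted_results, order):
    """Map results computed in sorted order back to the original positions."""
    restored = [None] * len(order)
    for position, original_index in enumerate(order):
        restored[original_index] = sorted_results[position]
    return restored
```

With this shape, a recognizer would run over the crops in `order`, and `restore_order` would put the outputs back in detection order so downstream code sees no difference.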
-
myhloli authored
- Comment out @measure_time decorator for txt_spans_extract_v2 and sort_lines_by_model functions
- Remove logger.info for page_process_time
- Comment out PerformanceStats.print_stats call
-
myhloli authored
- Add performance_stats module to measure and print execution time statistics
- Implement measure_time decorator to track execution time of key functions
- Remove multi-threading in pdf parsing for better resource management
- Optimize pdf parsing logic for improved performance
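A timing decorator of the kind this commit describes might be sketched as follows. Only the names `measure_time`, `PerformanceStats`, and `print_stats` come from the commit messages; the storage layout, the `add` and `get_stats` methods, and the `parse_page` example function are assumptions made for illustration and may differ from the actual module.

```python
import time
from collections import defaultdict
from functools import wraps


class PerformanceStats:
    """Hypothetical collector of per-function execution times."""

    _durations = defaultdict(list)

    @classmethod
    def add(cls, name, seconds):
        cls._durations[name].append(seconds)

    @classmethod
    def get_stats(cls):
        # Summarize call count, total, and average time per function name.
        return {
            name: {
                "calls": len(values),
                "total": sum(values),
                "avg": sum(values) / len(values),
            }
            for name, values in cls._durations.items()
        }

    @classmethod
    def print_stats(cls):
        for name, s in cls.get_stats().items():
            print(f"{name}: calls={s['calls']} "
                  f"total={s['total']:.4f}s avg={s['avg']:.4f}s")


def measure_time(func):
    """Record the wall-clock duration of each call to `func`."""

    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            # Recorded even if func raises, so failures are still timed.
            PerformanceStats.add(func.__name__, time.perf_counter() - start)

    return wrapper


@measure_time
def parse_page(n):
    """Stand-in for an expensive function (hypothetical)."""
    return sum(range(n))
```

Calling `parse_page` a few times and then `PerformanceStats.print_stats()` would print one summary line per decorated function. Disabling the measurement, as a later commit does, then only requires commenting out the decorator and the `print_stats` call.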
-