- 01 Apr, 2025 1 commit
-
-
myhloli authored
- Enhance the logging of execution times by adding more detailed function identification - Implement class name and module name inclusion for better traceability
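A minimal sketch of the idea behind this commit, assuming a timing decorator that labels each measurement with the module and class of the wrapped function. The `measure_time` name appears elsewhere in this log; the `TIMINGS` store, the `Parser` class, and the body of the decorator are hypothetical and are not taken from the repository.

```python
import functools
import time

# Hypothetical in-memory store: identifier -> list of elapsed seconds.
TIMINGS = {}


def measure_time(func):
    """Record how long each call takes, keyed by module, class, and function name."""
    # __qualname__ includes the enclosing class (for example "Parser.run"),
    # and __module__ names the module where the function was defined.
    identifier = f"{func.__module__}.{func.__qualname__}"

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            # Store the elapsed time even if the wrapped call raises.
            elapsed = time.perf_counter() - start
            TIMINGS.setdefault(identifier, []).append(elapsed)

    return wrapper


class Parser:
    """Hypothetical class used only to show the class name in the identifier."""

    @measure_time
    def run(self, n):
        return sum(range(n))
```

Keying on `__module__` plus `__qualname__` keeps two methods with the same name in different classes from sharing one timing entry.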
-
- 24 Mar, 2025 5 commits
-
-
myhloli authored
- Remove unnecessary addition of 1 when calculating lines for block height - This change affects the logic for both potential double-column and triple-column structures
-
myhloli authored
- Remove unnecessary addition of 1 when calculating lines for block height - This change affects the logic for both potential double-column and triple-column structures
-
myhloli authored
- Add condition to check for identical or space characters when resolving overlaps - Skip non-conflicting character pairs to prevent unnecessary removals
-
icecraft authored
-
myhloli authored
- Comment out margin cropping to prevent errors with broken files - Refactor image resizing to preserve aspect ratio - Update padding calculation and application using OpenCV
-
- 22 Mar, 2025 1 commit
-
-
myhloli authored
- Replace deprecated importlib.resources.path with importlib.resources.files - Simplify code structure and improve readability - Remove unnecessary comments and empty lines
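For reference, a small sketch of the `importlib.resources.files` style that replaces `importlib.resources.path`. It reads a file from the standard-library `json` package only so the example is runnable anywhere; the helper names are hypothetical and the repository's actual resource names are not shown here.

```python
from importlib.resources import as_file, files

# Deprecated style, shown for comparison only:
#     with importlib.resources.path("json", "__init__.py") as p:
#         ...


def read_packaged_text(package: str, name: str) -> str:
    """Read a text resource bundled inside an importable package."""
    resource = files(package).joinpath(name)
    return resource.read_text(encoding="utf-8")


def packaged_file_exists(package: str, name: str) -> bool:
    """Check a resource through a real filesystem path, for APIs that need one."""
    resource = files(package).joinpath(name)
    # as_file yields a pathlib.Path, extracting to a temp file if necessary.
    with as_file(resource) as fs_path:
        return fs_path.is_file()
```

`files()` returns a traversable object, so `as_file` is only needed when a caller requires an actual path on disk.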
-
- 21 Mar, 2025 2 commits
-
-
myhloli authored
- Implement `remove_x_overlapping_chars` function in `ocr_span_list_modify.py` - Integrate the new function in `pdf_parse_union_core_v2.py` to process spans - Remove unnecessary character replacement functions and comments
-
myhloli authored
- Comment out LayoutLMv3, TableMaster, and StructEqTable models - Update MFR model path to unimernet_hf_small_2503 - Remove unused import in Unimernet.py
-
- 20 Mar, 2025 7 commits
-
-
myhloli authored
- Remove separate condition for GPU memory >= 24GB - Simplify logic to use a single threshold of 16GB
-
myhloli authored
- Increase batch ratio to 32 for GPU memory >= 24GB - Set batch ratio to 16 for GPU memory >= 16GB - Reduce batch ratio to 8 for GPU memory >= 12GB - Lower batch ratio to 4 for GPU memory >= 8GB - Set batch ratio to 2 for GPU memory >= 6GB - Keep batch ratio at 1 for lower GPU memory sizes
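The thresholds listed in this commit message can be sketched as a lookup table. This is an illustration of the stated tiers only; the function name and the choice of gigabytes as a float parameter are assumptions, and later commits in this log change the tiers again.

```python
def batch_ratio_for_memory(gpu_memory_gb: float) -> int:
    """Map available GPU memory (in GB) to a batch ratio, per the tiers above."""
    # (minimum memory in GB, batch ratio), checked from largest to smallest.
    thresholds = [(24, 32), (16, 16), (12, 8), (8, 4), (6, 2)]
    for min_gb, ratio in thresholds:
        if gpu_memory_gb >= min_gb:
            return ratio
    # Anything below 6 GB keeps the default ratio of 1.
    return 1
```

Keeping the tiers in one ordered list makes a later change, such as dropping the 24 GB tier, a one-line edit.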
-
myhloli authored
- Add bf_16_support check for CUDA and MPS devices - Use bfloat16 precision for layoutreader model on supported devices - Improve performance on devices with bf_16 support
-
myhloli authored
- Remove torchtext version check and deprecation warning handling from multiple files - This code was unnecessary and potentially caused issues when torchtext was not installed
-
myhloli authored
- Remove half() calls for DocLayoutYOLO and YOLOv8 models - This change prevents potential errors when running models on CPU
-
myhloli authored
- Update config version to 1.2.0 - Refactor model initialization in model_init.py - Update dependencies in requirements.txt files - Remove unused imports and models - Add conditional imports for table models
-
myhloli authored
- Add support for Apple M1 chips (mps device) - Refactor image processing for better performance and compatibility - Update model loading and inference for various devices - Adjust batch processing and memory management
-
- 19 Mar, 2025 2 commits
- 17 Mar, 2025 1 commit
-
-
myhloli authored
- Move title level determination to the beginning of the Title block processing - Add condition to include text_level only if it's not 0 - Adjust title level to 0 instead of 1 when it's less than 1
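One way to read the three changes in this commit, as a hedged sketch: the level is resolved first, values below 1 collapse to 0, and `text_level` is written only when the result is non-zero. The function name, the `raw_level` parameter, and the shape of the output dictionary are hypothetical; only the `text_level` key comes from the commit message.

```python
def title_block_to_content(text: str, raw_level: int) -> dict:
    """Build a content entry for a Title block (illustrative structure only)."""
    # Resolve the level before anything else; levels below 1 become 0.
    level = raw_level if raw_level >= 1 else 0

    content = {"type": "text", "text": text}
    # Only emit text_level when the block really has a heading level.
    if level != 0:
        content["text_level"] = level
    return content
```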
-
- 13 Mar, 2025 5 commits
- 12 Mar, 2025 1 commit
-
-
myhloli authored
- Remove unnecessary __getitem__ method - Simplify image cropping in detect_math_formula_region - Improve code readability and efficiency
-
- 11 Mar, 2025 2 commits
-
-
myhloli authored
- Set NPUDTCompile to false for better performance on NPU - Adjust batch ratio
-
myhloli authored
- Include BlockType.Discarded in the list of compatible block types for ContentType.Text and ContentType.InlineEquation - This change improves the OCR dictionary merging process by handling discarded blocks more effectively
-
- 10 Mar, 2025 1 commit
-
-
myhloli authored
- Remove unused @ImportPIL decorator from load_images_from_pdf function - Update image shape handling in YOLOv11.py for better compatibility - These changes improve code readability and performance without altering the original functionality.
-
- 07 Mar, 2025 2 commits
-
-
myhloli authored
- Replace PIL with cv2 for image processing - Fix issues with image cropping and resizing - Add boundary checks and error handling - Optimize code for better performance and readability
-
myhloli authored
- Remove PIL usage across multiple files - Convert image processing functions to use NumPy arrays - Update crop_img function to work with NumPy arrays - Modify image loading and resizing to use NumPy and OpenCV - Clean up unused imports and comments related to PIL
-
- 04 Mar, 2025 2 commits
- 03 Mar, 2025 8 commits
-
-
myhloli authored
-
myhloli authored
- Increase batch ratio to 8 for GPU memory >=16GB - Improve inference performance on systems with higher GPU memory
-
myhloli authored
- Update OCR dictionary merge logic to include text blocks when processing interline equations - This change improves the handling of equations that may be embedded within text content
-
icecraft authored
-
myhloli authored
- Simplify batch ratio logic for GPU memory >= 16GB - Remove unnecessary conditions for 20GB and 40GB memory
-
myhloli authored
- Simplify batch ratio logic for GPU memory >= 16GB - Remove unnecessary conditions for 20GB and 40GB memory
-
myhloli authored
- Sort detected images by area before processing to enhance MFR accuracy - Implement stable sorting to maintain original order of images with equal
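A minimal sketch of stable sorting by area. The commit message is cut off after "equal", so the tie-breaking key (equal area) and the ascending order are assumptions, as are the function name and the `(name, width, height)` tuple layout.

```python
def order_by_area(images):
    """Return indices of `images` sorted by area, smallest first.

    `images` is a list of (name, width, height) tuples. Python's sorted()
    is stable, so entries with the same area keep their input order.
    """
    return sorted(range(len(images)), key=lambda i: images[i][1] * images[i][2])


def restore_input_order(order, results):
    """Put per-image results, produced in sorted order, back in input order."""
    restored = [None] * len(order)
    for position, original_index in enumerate(order):
        restored[original_index] = results[position]
    return restored
```

Returning indices rather than the sorted items makes it possible to hand results back in the order the images were detected.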
-
myhloli authored
- Comment out @measure_time decorator for txt_spans_extract_v2 and sort_lines_by_model functions - Remove logger.info for page_process_time - Comment out PerformanceStats.print_stats call
-