- 16 Jan, 2025 1 commit
-
-
myhloli authored
- Adjust end_page_id calculation to prevent IndexError when accessing pages - Enhance error handling in LLM post-processing by specifically catching JSONDecodeError
-
- 15 Jan, 2025 5 commits
-
-
myhloli authored
- Rename and update merge_title_blocks function - Implement merge_two_bbox helper function - Refactor merging logic to preserve original block structure - Update function calls and integrate with existing pipeline
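
The commit names a `merge_two_bbox` helper but the log does not show its body. A minimal sketch, assuming boxes are `[x0, y0, x1, y1]` lists and that merging means taking the smallest box that encloses both:

```python
from typing import List

BBox = List[float]  # assumed layout: [x0, y0, x1, y1]


def merge_two_bbox(b1: BBox, b2: BBox) -> BBox:
    """Return the smallest box that encloses both inputs (assumed behavior)."""
    x0 = min(b1[0], b2[0])
    y0 = min(b1[1], b2[1])
    x1 = max(b1[2], b2[2])
    y1 = max(b1[3], b2[3])
    return [x0, y0, x1, y1]
```

Returning a new list rather than mutating either input matches the stated goal of preserving the original block structure.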
-
myhloli authored
- Add support for NPU (Neural Processing Unit) when available - Implement batch analysis for GPU and NPU devices - Optimize memory usage and improve performance - Update logging and error handling
-
myhloli authored
- Add `remove_invalid_surrogates` function to filter out invalid UTF-16 surrogate pairs - Integrate the new function into the `detect_lang` workflow - Include a test case with UTF-16 surrogates to verify the fix
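
The function name `remove_invalid_surrogates` comes from the commit; its body is not shown here. One way such a filter might look in Python, assuming the goal is to drop any code point in the surrogate range U+D800 to U+DFFF so the text can be safely encoded as UTF-8:

```python
def remove_invalid_surrogates(text: str) -> str:
    """Drop surrogate code points, which cannot be encoded as UTF-8.

    Sketch only: the project's real implementation may differ.
    """
    return "".join(ch for ch in text if not (0xD800 <= ord(ch) <= 0xDFFF))


if __name__ == "__main__":
    cleaned = remove_invalid_surrogates("abc\ud800def")
    cleaned.encode("utf-8")  # no UnicodeEncodeError after cleaning
    print(cleaned)
```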
-
myhloli authored
- Clarify the expected format for the optimized title list JSON output - Emphasize the need to return only the title levels in the specified format
-
myhloli authored
- Lower the IOU threshold in ocr_span_list_modify.py from 0.9 to 0.35 - This change aims to improve the detection of overlapping characters in OCR-processed PDFs
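
To illustrate what the lower threshold changes, here is a small sketch of intersection-over-union for two boxes. The function name `bbox_iou` and the `[x0, y0, x1, y1]` layout are assumptions for illustration, not the code in ocr_span_list_modify.py:

```python
from typing import Sequence


def bbox_iou(a: Sequence[float], b: Sequence[float]) -> float:
    """Intersection-over-union of two [x0, y0, x1, y1] boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


if __name__ == "__main__":
    iou = bbox_iou([0, 0, 10, 10], [0, 0, 10, 5])
    # This pair passes a 0.35 threshold but would not pass 0.9.
    print(iou, iou > 0.35, iou > 0.9)
```

A pair of boxes where one covers half of the other is treated as overlapping at 0.35 but not at 0.9, which is the kind of partial overlap the commit appears to target.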
-
- 14 Jan, 2025 4 commits
-
-
myhloli authored
- Add average line height calculation for title blocks - Include page number in title dictionary - Improve title optimization prompt for better hierarchy - Implement retry mechanism for JSON decoding errors - Add error logging for title count mismatch
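
The retry on JSON decoding errors could be sketched as below. This is an illustration under assumptions: `parse_json_with_retry`, the `fetch` callable (standing in for the LLM request), and the default of three attempts are hypothetical and do not come from the repository:

```python
import json
from typing import Any, Callable, Optional


def parse_json_with_retry(fetch: Callable[[], str], max_retries: int = 3) -> Optional[Any]:
    """Call fetch() until its output parses as JSON, up to max_retries times.

    `fetch` is a hypothetical stand-in for the call that asks the LLM
    for the optimized title list.
    """
    for attempt in range(1, max_retries + 1):
        raw = fetch()
        try:
            return json.loads(raw)
        except json.JSONDecodeError as exc:
            # Catch only decoding failures so other errors still surface.
            print(f"attempt {attempt}: invalid JSON ({exc.msg}), retrying")
    return None
```

Catching `json.JSONDecodeError` specifically, rather than a bare `Exception`, keeps unrelated failures visible to the caller.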
-
myhloli authored
-
myhloli authored
- Merge title blocks that are close to each other horizontally - Adjust line insertion logic for title blocks - Increase image size and decrease confidence threshold for layout detection - Update DocLayoutYOLO model weights - Refactor drawing of bounding boxes for different block types
-
Xiaomeng Zhao authored
-
- 10 Jan, 2025 4 commits
-
-
myhloli authored
-
myhloli authored
- Add enable flag check for formula, text, and title optimizations
-
myhloli authored
-
myhloli authored
- Add MPS support for Apple Silicon devices - Implement empty_cache() for MPS devices - Set PYTORCH_ENABLE_MPS_FALLBACK environment variable - Adjust MFR model device allocation for MPS
-
- 09 Jan, 2025 5 commits
-
-
myhloli authored
- Improve language detection by removing newline characters from the input text - Add error handling and fallback mechanism to deal with text containing control characters
-
myhloli authored
- Remove conditional logic for OCR engine selection - Always use RapidOCR as the OCR engine - Simplify the __init__ method by removing unused code
-
myhloli authored
- Remove YOLO v11 language detection model from model_configs.yaml - Update language detection utils to use a fixed model path instead of dynamic configuration - Remove unused model weight parameter for YOLO v11 language detection
-
myhloli authored
- Implement block sorting within image and table blocks - Ensure correct order of captions and footnotes within blocks - Improve overall document structure and parsing accuracy
-
myhloli authored
- Remove LangDetectMode and related conditional logic - Use a single model weight for language detection - Add logging for language detection results - Update model initialization and prediction methods
-
- 08 Jan, 2025 3 commits
-
-
myhloli authored
- Add language detection model initialization and integration - Update model list to include language detection - Refactor language detection utils for better model management
-
myhloli authored
- Add separate models for Chinese/Japanese and English/French/German detection - Implement mode-based detection to use appropriate models for different languages - Update language detection process to use higher DPI for better accuracy - Modify model initialization and prediction logic to support new language-specific models
-
myhloli authored
- Add logic to set any negative values in block['bbox'] to 0 - This prevents potential errors when processing PDF blocks
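
A minimal sketch of the clamping described above, assuming `block['bbox']` is a list of four numbers. The helper name `clamp_bbox` is hypothetical:

```python
from typing import Any, Dict


def clamp_bbox(block: Dict[str, Any]) -> Dict[str, Any]:
    """Replace any negative coordinate in block['bbox'] with 0, in place."""
    block["bbox"] = [max(0, v) for v in block["bbox"]]
    return block


if __name__ == "__main__":
    print(clamp_bbox({"bbox": [-1.5, 0, 10, 20]}))
```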
-
- 07 Jan, 2025 1 commit
-
-
myhloli authored
- Remove DropMode and MakeMode imports from user code - Set default drop_mode to DropMode.NONE in get_markdown and get_content_list methods - Remove md_make_mode parameter from get_content_list method - Add dump_middle_json method to PipeResult - Update examples in API documentation and demo script
-
- 06 Jan, 2025 3 commits
-
-
Xiaomeng Zhao authored
-
myhloli authored
- Add check for empty OCR result when using PaddleOCR model - Assign None to ocr_result if no text is detected, preventing further errors
-
icecraft authored
-
- 05 Jan, 2025 3 commits
-
-
myhloli authored
- Add `draw_char_bbox` function to `draw_bbox.py` for drawing character bounding boxes - Integrate `draw_char_bbox` into `common.py` for use in PDF processing pipeline - Include option to draw character bounding boxes in debug mode
-
myhloli authored
style(pdf_parse_union_core_v2): remove unnecessary spaces and improve code formatting - Remove extra space in conditional statement for character spacing logic - Adjust spacing in trigonometric checks for line direction - Improve overall code readability and consistency
-
myhloli authored
- Add missing 'else' statement in OCR model selection logic - Ensure consistent formatting of 'if' statements for better readability - Remove unnecessary empty line in the 'app.py' file
-
- 03 Jan, 2025 2 commits
-
-
myhloli authored
- Remove logger.info() call for additional_ocr_params to reduce log verbosity
-
myhloli authored
- Implement ONNXModelSingleton to manage ONNX models - Modify ModifiedPaddleOCR to use ONNX models on ARM CPUs without CUDA - Update RapidTableModel to use RapidOCR with ONNXRuntime on CPU - Add rapidocr_onnxruntime dependency in setup.py
-
- 02 Jan, 2025 2 commits
-
-
myhloli authored
- Update the logic for inserting spaces between characters - Consider the next character's position instead of the previous one - Adjust the spacing threshold to 25% of the average character width - Ignore spaces at the end of lines to prevent double spaces
-
myhloli authored
- Update the logic for inserting spaces between characters - Consider the next character's position instead of the previous one - Adjust the spacing threshold to 25% of the average character width - Ignore spaces at the end of lines to prevent double spaces
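
The spacing rule in this commit could look roughly like the sketch below. It assumes each character is a dict with a `c` string and a `bbox` of `[x0, y0, x1, y1]`, all on one line in reading order; `join_line_chars` and that data layout are hypothetical, and only the 25% threshold and the look-ahead to the next character come from the commit message:

```python
from typing import Dict, List


def join_line_chars(chars: List[Dict]) -> str:
    """Join the characters of one line, inserting a space wherever the gap
    to the NEXT character exceeds 25% of the average character width."""
    if not chars:
        return ""
    avg_width = sum(ch["bbox"][2] - ch["bbox"][0] for ch in chars) / len(chars)
    threshold = 0.25 * avg_width
    out = []
    for i, ch in enumerate(chars):
        out.append(ch["c"])
        if i + 1 < len(chars):
            nxt = chars[i + 1]
            gap = nxt["bbox"][0] - ch["bbox"][2]
            # Skip if either side is already a space, to avoid doubling.
            if gap > threshold and ch["c"] != " " and nxt["c"] != " ":
                out.append(" ")
    # Drop spaces at the end of the line.
    return "".join(out).rstrip(" ")
```

Looking ahead to the next character means the last character of a line never triggers an inserted space, which is one way to read the "ignore spaces at the end of lines" item.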
-
- 30 Dec, 2024 2 commits
-
-
myhloli authored
- Remove use_npu variable initialization - Comment out device assignment and npu check - Comment out use_npu parameter in ModifiedPaddleOCR constructor
-
myhloli authored
- Update `clean_memory.py` to use `torch_npu.npu` instead of `torch.npu` - Update `model_utils.py` to use `torch_npu.npu` instead of `torch.npu` - Simplify NPU availability check and bfloat16 support in `pdf_parse_union_core_v2.py`
-
- 27 Dec, 2024 1 commit
-
-
icecraft authored
-
- 26 Dec, 2024 2 commits
-
-
myhloli authored
- Update clean_memory function to support both CUDA and NPU devices - Implement get_device function to centralize device selection logic - Modify model initialization and memory cleaning to use the selected device - Update RapidTableModel to support both RapidOCR and PaddleOCR engines
-
myhloli authored
- Add NPU support for memory cleaning and model initialization - Optimize table model initialization and prediction process - Update memory utils to support NPU - Add language parameter for table model
-
- 25 Dec, 2024 2 commits
-
-
myhloli authored
- Comment out logging statements for title list, title completion, and length comparison - Improve code readability and reduce clutter by removing unused debug information
-
myhloli authored
- Implement llm_aided_title function to optimize document titles using LLM - Update pdf_parse_union_core_v2.py to include title optimization - Modify ocr_mkcontent.py to use optimized title levels - Add openai SDK dependency in setup.py
-