- 05 Jun, 2025 1 commit
seedclaimer authored
Fix missing sorted_boxes, merge_det_boxes, and update_det_boxes.
-
- 04 Jun, 2025 2 commits
- 29 May, 2025 1 commit
Xiaomeng Zhao authored
-
- 28 May, 2025 1 commit
speta authored
-
- 24 May, 2025 2 commits
- 22 Apr, 2025 1 commit
myhloli authored
  - Remove OCR engine instantiation inside the loop
  - Pass language directly to the table model instead of the OCR engine
  - Simplify code structure and improve readability
-
- 11 Apr, 2025 1 commit
myhloli authored
  - Update batch processing logic for improved efficiency
  - Refactor image analysis and inference methods
  - Optimize dataset handling and image retrieval
  - Improve error handling and logging in batch processes
-
- 09 Apr, 2025 2 commits
myhloli authored
  - Comment out the line that updates det_count in batch_analyze.py
  - Add a new OCR model configuration for Chinese (ch_lite) in models_config.yml
  - Update the Chinese OCR model configuration to use a different recognition model
-
myhloli authored
  - Add functions to calculate IoU, check if tables are inside each other, and merge tables
  - Implement table merging for high IoU tables
  - Add filtering to remove nested tables that don't overlap but cover a large area
  - Update table_res_list and layout_res to reflect these changes
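The IoU, containment, and merge helpers named in this commit can be sketched as follows. This is a minimal illustration, not the repository's code: the function names, the `(x0, y0, x1, y1)` box layout, and the `0.7` threshold are all assumptions made for the example.

```python
# Hypothetical helpers; boxes are assumed to be (x0, y0, x1, y1) tuples.

def box_area(box):
    """Area of a box; degenerate or inverted boxes count as zero."""
    x0, y0, x1, y1 = box
    return max(0.0, x1 - x0) * max(0.0, y1 - y0)


def calculate_iou(a, b):
    """Intersection over union of two boxes."""
    inter_box = (max(a[0], b[0]), max(a[1], b[1]),
                 min(a[2], b[2]), min(a[3], b[3]))
    inter = box_area(inter_box)
    union = box_area(a) + box_area(b) - inter
    return inter / union if union > 0 else 0.0


def is_inside(inner, outer):
    """True when `inner` lies entirely within `outer`."""
    return (inner[0] >= outer[0] and inner[1] >= outer[1]
            and inner[2] <= outer[2] and inner[3] <= outer[3])


def merge_boxes(a, b):
    """Smallest box that covers both inputs."""
    return (min(a[0], b[0]), min(a[1], b[1]),
            max(a[2], b[2]), max(a[3], b[3]))


def merge_high_iou_tables(tables, iou_threshold=0.7):
    """Fold each table into the first kept table it overlaps strongly with.

    The 0.7 threshold is an assumed value for illustration only.
    """
    merged = []
    for box in tables:
        for i, kept in enumerate(merged):
            if calculate_iou(box, kept) > iou_threshold:
                merged[i] = merge_boxes(box, kept)
                break
        else:
            merged.append(box)
    return merged
```

The nested-table filter mentioned in the commit is not reproduced here, since the log does not state its criteria; `is_inside` only shows the containment test such a filter would likely build on.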
-
- 08 Apr, 2025 1 commit
myhloli authored
  - Update OCR score formatting in batch_analyze.py and pdf_parse_union_core_v2.py
  - Change score rounding method to preserve three decimal places
  - Enhance accuracy representation without significantly altering the score value
-
- 03 Apr, 2025 3 commits
myhloli authored
  - Update table recognition logic to process each table individually
  - Refactor layout detection to use tqdm for progress tracking
  - Optimize OCR recognition by using a single tqdm wrapper
  - Improve MFR prediction with a more accurate progress bar
  - Simplify MFD prediction by removing unnecessary total calculation
-
myhloli authored
  - Remove unused imports and comments
  - Increase MIN_BATCH_INFERENCE_SIZE from 100 to 200
  - Comment out VRAM cleaning and logging in batch_analyze.py
  - Simplify code in doc_analyze_by_custom_model.py
  - Add tqdm progress bar in pdf_parse_union_core_v2.py
  - Enable tqdm in OCR processing
-
myhloli authored
  - Add tqdm progress bar to batch prediction loops in multiple model modules
  - Improve logging and error handling in batch analysis script
  - Update table model initialization to use default sub-model if none specified
  - Add tqdm dependency to requirements.txt
-
- 31 Mar, 2025 3 commits
myhloli authored
  - Replace direct OCR model access with AtomModelSingleton for better model management
  - Round OCR scores to 2 decimal places for consistency
  - Improve error handling and logging in batch analysis
  - Simplify OCR result processing in pdf_parse_union_core_v2.py
-
myhloli authored
  - Add support for multiple languages in OCR processing
  - Create separate lists for each language to improve processing efficiency
  - Update OCR model initialization to use PytorchPaddleOCR instead of ModifiedPaddleOCR
  - Modify get_ocr_result_list function to include language information
  - Improve logging for OCR detection and recognition
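The per-language lists described in this commit might look roughly like the sketch below. The item shape (a dict with `crop` and optional `lang` keys), the `"ch"` default, and the `get_model` callback are hypothetical stand-ins for the example; they are not taken from the repository.

```python
from collections import defaultdict


def group_by_language(ocr_items, default_lang="ch"):
    """Bucket OCR items by language; items without a language use the default.

    The "lang" key and the "ch" default are assumptions for this sketch.
    """
    groups = defaultdict(list)
    for item in ocr_items:
        groups[item.get("lang") or default_lang].append(item)
    return dict(groups)


def recognize_by_language(ocr_items, get_model):
    """Run recognition one language at a time.

    `get_model` is a hypothetical callback that returns a callable
    recognizer for a language; it is looked up once per language
    rather than once per item.
    """
    results = []
    for lang, items in group_by_language(ocr_items).items():
        model = get_model(lang)
        for item in items:
            results.append({**item, "text": model(item["crop"])})
    return results
```

Results come back grouped by language rather than in input order, so a caller that needs the original order would have to carry an index on each item.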
-
myhloli authored
  - Split OCR process into detection and recognition stages
  - Update batch analysis and document analysis pipelines
  - Modify OCR result formatting and handling
  - Remove unused imports and optimize code structure
-
- 26 Mar, 2025 1 commit
icecraft authored
-
- 07 Mar, 2025 1 commit
myhloli authored
  - Remove PIL usage across multiple files
  - Convert image processing functions to use NumPy arrays
  - Update crop_img function to work with NumPy arrays
  - Modify image loading and resizing to use NumPy and OpenCV
  - Clean up unused imports and comments related to PIL
-
- 21 Jan, 2025 1 commit
myhloli authored
  - Reduce YOLO_LAYOUT_BASE_BATCH_SIZE from 4 to 1
  - Simplify batch ratio calculation for formula detection
  - Remove unused conditional logic in batch ratio determination
-
- 16 Jan, 2025 1 commit
myhloli authored
  - Modify the batch analyze process to handle the rapid table model's output
  - Add logic_points variable to capture additional output from rapid table prediction
-
- 15 Jan, 2025 1 commit
myhloli authored
  - Add support for NPU (Neural Processing Unit) when available
  - Implement batch analysis for GPU and NPU devices
  - Optimize memory usage and improve performance
  - Update logging and error handling
-
- 14 Jan, 2025 1 commit
myhloli authored
-
- 26 Dec, 2024 1 commit
myhloli authored
  - Update clean_memory function to support both CUDA and NPU devices
  - Implement get_device function to centralize device selection logic
  - Modify model initialization and memory cleaning to use the selected device
  - Update RapidTableModel to support both RapidOCR and PaddleOCR engines
-
- 18 Dec, 2024 1 commit
icecraft authored
-
- 13 Dec, 2024 2 commits
- 12 Dec, 2024 2 commits