- 08 Apr, 2025 1 commit
-
-
myhloli authored
- Import os and pathlib modules to handle file paths - Define the path to the slanet-plus model - Update RapidTableInput initialization to include the model path
-
- 07 Apr, 2025 1 commit
-
-
myhloli authored
- Refactor VRAM detection logic for better readability and efficiency - Add fallback mechanism for unknown VRAM sizes - Improve device checking in get_vram function
-
- 03 Apr, 2025 3 commits
-
-
myhloli authored
- Update table recognition logic to process each table individually - Refactor layout detection to use tqdm for progress tracking - Optimize OCR recognition by using a single tqdm wrapper - Improve MFR prediction with a more accurate progress bar - Simplify MFD prediction by removing unnecessary total calculation
-
myhloli authored
- Remove unused imports and comments - Increase MIN_BATCH_INFERENCE_SIZE from 100 to 200 - Comment out VRAM cleaning and logging in batch_analyze.py - Simplify code in doc_analyze_by_custom_model.py - Add tqdm progress bar in pdf_parse_union_core_v2.py - Enable tqdm in OCR processing
-
myhloli authored
- Add tqdm progress bar to batch prediction loops in multiple model modules - Improve logging and error handling in batch analysis script - Update table model initialization to use default sub-model if none specified - Add tqdm dependency to requirements.txt
-
- 02 Apr, 2025 10 commits
-
-
myhloli authored
- Replace ch_PP-OCRv4_det_infer.pth with ch_PP-OCRv3_det_infer.pth in models_config.yml - Add new ch_PP-OCRv3_det_infer model configuration in arch_config.yaml
-
myhloli authored
- Remove unnecessary GPU checks and cuda() calls - Consolidate tensor device placement using .to(self.device) - Add warning suppression for cleaner output - Refactor conditional logic for better readability
-
myhloli authored
- Remove unnecessary imports and code in batch_demo.py - Update demo.py to use relative paths and improve code structure - Adjust output directory structure in both scripts - Remove redundant code and simplify functions
-
myhloli authored
- Update PyMuPDF to version <1.25.0 - Update pydantic to version <2.11 - Update transformers to version < 5.0.0 - Remove always_apply parameter from alb.ToGray in image processing
-
myhloli authored
- Update the default configuration path in pytorchocr_utility.py - Add required dependencies for paddleocr2pytorch in setup.py: - shapely - pyclipper - omegaconf
-
myhloli authored
- Remove unused UniMERNet and LayoutLMv3 model configurations - Update OCR model path and dictionary path for PaddleOCR - Modify README to update system requirements and installation instructions - Update setup.py to include new package data
-
myhloli authored
- Remove unused imports for concurrent.futures, multiprocessing, and paddle - Delete commented-out code - Update numpy dependency to remove upper version limit - Remove InferenceResult import that was commented out
-
myhloli authored
- Add newline at the beginning of arabic_dict.txt - Change mode of multiple dictionary files
-
myhloli authored
- Remove OCR utils, modified PaddleOCR, and StructEqTable model - Delete related import statements and model definitions - Update dependencies in setup.py to remove paddlepaddle and related OCR packages
-
myhloli authored
- Comment out print statements in base_ocr_v20.py and pytorch_paddle.py - Update table model initialization to use lang parameter instead of ocr_engine - Remove unused RapidOCR initialization in rapid_table.py
-
- 01 Apr, 2025 2 commits
-
-
myhloli authored
- Remove unused OCR dictionaries for Arabic, Belarusian, Bulgarian and Armenian languages - Update model configurations in arch_config.yaml: - Comment out 'out_channels' for various language models - Rename Arabic, Korean, Japanese, Tamil and Devanagari model configurations to use 'v3' instead of 'v4' - Delete ar_dict.txt, be_dict.txt and bg_dict.txt files - Update arabic_dict.txt to remove blank line at the start
-
myhloli authored
- Remove unused imports and code - Simplify model architecture by removing unnecessary components - Update initialization and forward pass logic - Rename variables for consistency
-
- 31 Mar, 2025 3 commits
-
-
myhloli authored
- Replace direct OCR model access with AtomModelSingleton for better model management - Round OCR scores to 2 decimal places for consistency - Improve error handling and logging in batch analysis - Simplify OCR result processing in pdf_parse_union_core_v2.py
-
myhloli authored
- Add support for multiple languages in OCR processing - Create separate lists for each language to improve processing efficiency - Update OCR model initialization to use PytorchPaddleOCR instead of ModifiedPaddleOCR - Modify get_ocr_result_list function to include language information - Improve logging for OCR detection and recognition
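
The "separate lists for each language" step can be pictured with a minimal sketch. The item layout (a dict with a `lang` key) and the function name `group_by_language` are assumptions made for illustration, not the project's actual data structures:

```python
from collections import defaultdict


def group_by_language(items):
    """Bucket OCR work items by language code, preserving input order.

    Each item is assumed (hypothetically) to be a dict with a "lang" key.
    """
    groups = defaultdict(list)
    for item in items:
        groups[item["lang"]].append(item)
    # Return a plain dict so callers cannot create buckets by accident.
    return dict(groups)


if __name__ == "__main__":
    crops = [
        {"lang": "ch", "id": 1},
        {"lang": "en", "id": 2},
        {"lang": "ch", "id": 3},
    ]
    for lang, batch in group_by_language(crops).items():
        print(lang, [c["id"] for c in batch])
```

Grouping this way would let each language's model be loaded once and run over its whole bucket, which is one plausible reading of the efficiency gain the commit mentions.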
-
myhloli authored
- Split OCR process into detection and recognition stages - Update batch analysis and document analysis pipelines - Modify OCR result formatting and handling - Remove unused imports and optimize code structure
-
- 27 Mar, 2025 1 commit
-
-
myhloli authored
- Add base model structure for OCR in pytorch - Implement data augmentation and transformation modules - Create utilities for dictionary handling and state dict conversion - Include post-processing modules for OCR - Add weight initialization and loading functions
-
- 26 Mar, 2025 2 commits
- 24 Mar, 2025 2 commits
- 22 Mar, 2025 1 commit
-
-
myhloli authored
- Replace deprecated importlib.resources.path with importlib.resources.files - Simplify code structure and improve readability - Remove unnecessary comments and empty lines
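
For readers unfamiliar with the migration, here is a minimal sketch of the `importlib.resources.files` style of access (available in Python 3.9+). The helper name `read_package_text` is hypothetical, and the standard-library `json` package stands in for the project's own package:

```python
from importlib.resources import files


def read_package_text(package: str, resource: str) -> str:
    """Read a text resource bundled inside an importable package.

    files() returns a Traversable; joinpath() selects the resource, so no
    context manager is needed, unlike the deprecated path() helper.
    """
    return files(package).joinpath(resource).read_text(encoding="utf-8")


if __name__ == "__main__":
    # The stdlib json package is used only as a stand-in example.
    text = read_package_text("json", "__init__.py")
    print(text.splitlines()[0])
```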
-
- 21 Mar, 2025 1 commit
-
-
myhloli authored
- Comment out LayoutLMv3, TableMaster, and StructEqTable models - Update MFR model path to unimernet_hf_small_2503 - Remove unused import in Unimernet.py
-
- 20 Mar, 2025 6 commits
-
-
myhloli authored
- Remove separate condition for GPU memory >= 24GB - Simplify logic to use a single threshold of 16GB
-
myhloli authored
- Increase batch ratio to 32 for GPU memory >= 24GB - Set batch ratio to 16 for GPU memory >= 16GB - Reduce batch ratio to 8 for GPU memory >= 12GB - Lower batch ratio to 4 for GPU memory >= 8GB - Set batch ratio to 2 for GPU memory >= 6GB - Keep batch ratio at 1 for lower GPU memory sizes
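
The threshold table in this commit message can be sketched as a small lookup. The function name `batch_ratio_for_vram` is hypothetical and the code is not taken from the repository; it only encodes the numbers stated above (a later commit in this log changes the 24GB case):

```python
def batch_ratio_for_vram(vram_gb: float) -> int:
    """Map available GPU memory in GB to a batch ratio.

    Thresholds are checked from largest to smallest, so the first match wins.
    """
    thresholds = [(24, 32), (16, 16), (12, 8), (8, 4), (6, 2)]
    for min_gb, ratio in thresholds:
        if vram_gb >= min_gb:
            return ratio
    # Anything below 6GB keeps the baseline ratio.
    return 1


if __name__ == "__main__":
    for gb in (4, 6, 8, 12, 16, 24):
        print(gb, batch_ratio_for_vram(gb))
```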
-
myhloli authored
- Remove torchtext version check and deprecation warning handling from multiple files - This code was unnecessary and potentially caused issues when torchtext was not installed
-
myhloli authored
- Remove half() calls for DocLayoutYOLO and YOLOv8 models - This change prevents potential errors when running models on CPU
-
myhloli authored
- Update config version to 1.2.0 - Refactor model initialization in model_init.py - Update dependencies in requirements.txt files - Remove unused imports and models - Add conditional imports for table models
-
myhloli authored
- Add support for Apple M1 chips (mps device) - Refactor image processing for better performance and compatibility - Update model loading and inference for various devices - Adjust batch processing and memory management
-
- 19 Mar, 2025 2 commits
- 13 Mar, 2025 4 commits
- 12 Mar, 2025 1 commit
-
-
myhloli authored
- Remove unnecessary __getitem__ method - Simplify image cropping in detect_math_formula_region - Improve code readability and efficiency
-