- 05 Jun, 2025 1 commit
seedclaimer authored
Fix missing sorted_boxes, merge_det_boxes, and update_det_boxes.
-
- 04 Jun, 2025 2 commits
- 29 May, 2025 1 commit
Xiaomeng Zhao authored
-
- 28 May, 2025 1 commit
speta authored
-
- 24 May, 2025 3 commits
- 23 May, 2025 2 commits
myhloli authored
-
myhloli authored
- Add PPHGNetV2_B4 backbone to the list of supported backbones - Introduce new OCR model configuration for PP-OCRv5 with PPHGNetV2_B4 - Update existing model configurations to use the new backbone - Modify RNN neck to support input with H > 1 - Adjust batch size for inference
-
- 22 May, 2025 1 commit
myhloli authored
- Add new PP-OCRv5 detection and recognition models - Update arch_config.yaml with new model architectures - Modify models_config.yml to include PP-OCRv5 models for ch_lite configuration - Change dictionary file for ch_lite to ppocrv5_dict.txt
-
- 19 May, 2025 1 commit
myhloli authored
-
- 14 May, 2025 1 commit
myhloli authored
-
- 09 May, 2025 2 commits
- 08 May, 2025 1 commit
myhloli authored
-
- 06 May, 2025 1 commit
myhloli authored
-
- 30 Apr, 2025 1 commit
myhloli authored
- Add logger info for each batch processed - Include batch number and page count in log message
-
- 29 Apr, 2025 3 commits
myhloli authored
-
myhloli authored
- Adjust the threshold for considering tables inside other tables from 2 to 3 - Add support for custom formula delimiters through user configuration - Pin pdfminer.six to version 20250324 to prevent parsing failures
-
myhloli authored
- Add regex patterns for replacing LaTeX symbols \fint and \up with their Unicode equivalents
-
- 28 Apr, 2025 1 commit
myhloli authored
- Add support for \(\) and \[\] delimiters in addition to $$ and $$ - Make LaTeX delimiter configuration more flexible and user-defined - Update configuration file to include LaTeX delimiter settings - Modify OCR content generation to use configurable delimiters
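A minimal sketch of what user-defined formula delimiters could look like. The config layout (`display`/`inline`, each with `left`/`right`) and the helper name `wrap_formula` are assumptions made for illustration, not taken from the commit itself:

```python
# Hypothetical default delimiters; a user config may override either kind.
DEFAULT_DELIMITERS = {
    "display": {"left": "$$", "right": "$$"},
    "inline": {"left": "$", "right": "$"},
}


def wrap_formula(latex, kind, config=None):
    """Wrap a LaTeX string in the delimiters configured for `kind`.

    `kind` is "display" or "inline"; `config` is an optional dict with the
    same shape as DEFAULT_DELIMITERS (assumed layout).
    """
    # Shallow merge: kinds missing from the user config keep their defaults.
    delimiters = {**DEFAULT_DELIMITERS, **(config or {})}
    pair = delimiters[kind]
    return f"{pair['left']}{latex}{pair['right']}"


# Example: switch inline math to \( \) while display math keeps $$ $$.
user_config = {"inline": {"left": "\\(", "right": "\\)"}}
inline_text = wrap_formula("x^2", "inline", user_config)
display_text = wrap_formula("E=mc^2", "display", user_config)
```

Keeping the delimiters in data rather than in the generation code means the output step only ever calls one helper, whichever pair the user picks.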
-
- 27 Apr, 2025 4 commits
myhloli authored
- Add \textunderscore to the list of LaTeX patterns - This allows the model to properly render underscore characters
-
myhloli authored
-
myhloli authored
- Improve \left and \right command handling in LaTeX formulas - Enhance environment type matching for array, matrix, and other structures - Refactor code for better readability and maintainability
-
myhloli authored
- Refactor LaTeX left/right pair fixing logic for better balance - Add environment detection and correction for common math environments - Implement more robust whitespace handling and command substitution - Optimize regex patterns for improved performance and readability
-
- 25 Apr, 2025 2 commits
myhloli authored
- Add functions to fix LaTeX left and right commands - Implement brace matching and repair in LaTeX formulas - Remove unnecessary whitespace and repair LaTeX code - Replace specific LaTeX commands with appropriate alternatives - Add logging for debugging purposes
-
myhloli authored
- Add functions to fix LaTeX left and right commands - Implement brace matching and repair in LaTeX formulas - Remove unnecessary whitespace and repair LaTeX code - Replace specific LaTeX commands with appropriate alternatives - Add logging for debugging purposes
-
- 24 Apr, 2025 1 commit
myhloli authored
- Preserve "\ " sequences during whitespace removal - Add temporary substitution to prevent incorrect processing of "\ " sequences - Restore "\ " sequences after removing unnecessary whitespace
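The protect-and-restore idea described in this commit can be sketched as follows. The placeholder string, the function name, and the choice of which whitespace counts as "unnecessary" (here: whitespace around braces, `^` and `_`) are assumptions for illustration, not the repository's actual code:

```python
import re

# Hypothetical placeholder: must not occur in real formulas and must not
# contain whitespace or any character the cleanup regex touches.
PLACEHOLDER = "\x00SP\x00"


def tidy_latex_whitespace(latex):
    """Strip whitespace around braces, ^ and _ while keeping "\\ " intact."""
    # 1. Protect the LaTeX control space "\ " behind a placeholder.
    #    (Simplification: a line break "\\" followed by a space would also
    #    match here; a fuller version would need to rule that case out.)
    protected = latex.replace("\\ ", PLACEHOLDER)
    # 2. Remove whitespace on either side of { } ^ _.
    cleaned = re.sub(r"\s*([{}^_])\s*", r"\1", protected)
    # 3. Restore the protected control spaces.
    return cleaned.replace(PLACEHOLDER, "\\ ")


result = tidy_latex_whitespace(r"a ^ { 2 }\ { b }")
```

Without step 1, the space in `\ {` would be removed and the formula would end up containing `\{`, an escaped literal brace, which changes its meaning.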
-
- 23 Apr, 2025 3 commits
myhloli authored
-
myhloli authored
- Replace get_device() function call with direct 'device' variable usage - Simplify device configuration in OCR model initialization
-
myhloli authored
- Add new Chinese OCR model (ch_PP-OCRv4_rec_server_doc_infer) for server-side use - Update language support in app.py to include new Chinese model - Modify models_config.yml to add new model configuration
-
- 22 Apr, 2025 3 commits
myhloli authored
-
myhloli authored
- Automatically change to ch_lite model when using CPU for Chinese OCR - This modification improves performance on CPU devices
-
myhloli authored
- Remove OCR engine instantiation inside the loop - Pass language directly to the table model instead of OCR engine - Simplify code structure and improve readability
-
- 21 Apr, 2025 2 commits
- 17 Apr, 2025 2 commits
- 16 Apr, 2025 1 commit
myhloli authored
- Temporarily disable Chinese font check for Windows systems - This change allows bypassing the font check when the required fonts are not present
-