  1. 08 Apr, 2025 1 commit
  2. 07 Apr, 2025 1 commit
  3. 03 Apr, 2025 3 commits
    • refactor(magic_pdf): optimize table recognition and layout detection · 1fd72f5f
      myhloli authored
      - Update table recognition logic to process each table individually
      - Refactor layout detection to use tqdm for progress tracking
      - Optimize OCR recognition by using a single tqdm wrapper
      - Improve MFR prediction with a more accurate progress bar
      - Simplify MFD prediction by removing unnecessary total calculation
      1fd72f5f
    • refactor(magic_pdf): optimize code and improve logging · 553f250f
      myhloli authored
      - Remove unused imports and comments
      - Increase MIN_BATCH_INFERENCE_SIZE from 100 to 200
      - Comment out VRAM cleaning and logging in batch_analyze.py
      - Simplify code in doc_analyze_by_custom_model.py
      - Add tqdm progress bar in pdf_parse_union_core_v2.py
      - Enable tqdm in OCR processing
      553f250f
    • feat(model): add tqdm progress bar to model prediction loops · 8e1c2339
      myhloli authored
      - Add tqdm progress bar to batch prediction loops in multiple model modules
      - Improve logging and error handling in batch analysis script
      - Update table model initialization to use default sub-model if none specified
      - Add tqdm dependency to requirements.txt
      8e1c2339
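
The first bullet of this commit can be pictured with a minimal sketch, assuming a generic batch loop. `predict_batch`, `run_batches` and the batch size are hypothetical names chosen for illustration, not functions from magic_pdf; only the `tqdm(iterable, total=..., desc=...)` call is the real tqdm interface. The import fallback exists only so the sketch still runs where tqdm is not installed.

```python
try:
    from tqdm import tqdm
except ImportError:  # fallback so the sketch runs without tqdm installed
    def tqdm(iterable, **kwargs):
        return iterable


def predict_batch(batch):
    """Hypothetical stand-in for a model call: one result per input item."""
    return [len(item) for item in batch]


def run_batches(items, batch_size=2, desc="Predict"):
    """Run predict_batch over fixed-size chunks, advancing the bar once per batch."""
    batches = [items[i:i + batch_size] for i in range(0, len(items), batch_size)]
    results = []
    for batch in tqdm(batches, total=len(batches), desc=desc):
        results.extend(predict_batch(batch))
    return results
```

Wrapping the list of batches, rather than the individual items, keeps the bar's total equal to the number of model calls.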
  4. 02 Apr, 2025 10 commits
  5. 01 Apr, 2025 2 commits
    • refactor(ocr): remove unused OCR dictionaries and update model configurations · 41f1fb8a
      myhloli authored
      - Remove unused OCR dictionaries for Arabic, Belarusian, Bulgarian and Armenian languages
      - Update model configurations in arch_config.yaml:
        - Comment out 'out_channels' for various language models
        - Rename Arabic, Korean, Japanese, Tamil and Devanagari model configurations to use 'v3' instead of 'v4'
      - Delete ar_dict.txt, be_dict.txt and bg_dict.txt files
      - Update arabic_dict.txt to remove blank line at the start
      41f1fb8a
    • refactor(ocr): remove unused code and simplify model architecture · b3d6785d
      myhloli authored
      - Remove unused imports and code
      - Simplify model architecture by removing unnecessary components
      - Update initialization and forward pass logic
      - Rename variables for consistency
      b3d6785d
  6. 31 Mar, 2025 3 commits
    • refactor(model): integrate AtomModelSingleton for OCR and improve OCR result handling · 59d6b195
      myhloli authored
      - Replace direct OCR model access with AtomModelSingleton for better model management
      - Round OCR scores to 2 decimal places for consistency
      - Improve error handling and logging in batch analysis
      - Simplify OCR result processing in pdf_parse_union_core_v2.py
      59d6b195
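
As a rough illustration of the first two bullets, here is a minimal sketch of a keyed singleton cache and of score rounding. It is not the AtomModelSingleton implementation: `ModelCache`, `get`, `factory` and `format_ocr_result` are hypothetical names, and the result dictionary layout is an assumption made for the example.

```python
class ModelCache:
    """Hypothetical keyed singleton: one shared cache, one model per key."""

    _instance = None

    def __new__(cls):
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            cls._instance._models = {}
        return cls._instance

    def get(self, name, factory, **kwargs):
        """Return the cached model for (name, kwargs), building it on first use.

        kwargs values must be hashable because they form part of the cache key.
        """
        key = (name, tuple(sorted(kwargs.items())))
        if key not in self._models:
            self._models[key] = factory(**kwargs)
        return self._models[key]


def format_ocr_result(text, score):
    """Round the confidence score to 2 decimal places before storing it."""
    return {"text": text, "score": round(score, 2)}
```

Callers that go through a shared cache like this load each model configuration once, instead of holding their own reference to an OCR model.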
    • feat(ocr): implement language-specific OCR processing · d7d85a28
      myhloli authored
      - Add support for multiple languages in OCR processing
      - Create separate lists for each language to improve processing efficiency
      - Update OCR model initialization to use PytorchPaddleOCR instead of ModifiedPaddleOCR
      - Modify get_ocr_result_list function to include language information
      - Improve logging for OCR detection and recognition
      d7d85a28
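
The "separate lists for each language" idea can be sketched as follows, assuming each text region is a dictionary that may carry a `lang` key. The region layout, the `group_by_language` name and the `"ch"` default are assumptions for illustration and are not taken from the commit.

```python
from collections import defaultdict


def group_by_language(regions, default_lang="ch"):
    """Bucket text regions by language so each bucket can share one OCR model."""
    buckets = defaultdict(list)
    for region in regions:
        # Regions without an explicit language fall back to the default bucket.
        buckets[region.get("lang", default_lang)].append(region)
    return dict(buckets)
```

Grouping first means a language-specific model is selected once per bucket rather than once per region.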
    • feat(ocr): implement separate detection and recognition processes · a330651d
      myhloli authored
      - Split OCR process into detection and recognition stages
      - Update batch analysis and document analysis pipelines
      - Modify OCR result formatting and handling
      - Remove unused imports and optimize code structure
      a330651d
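
One way to picture the detection/recognition split is the toy pipeline below. It is a sketch under stated assumptions: pages are plain strings, `detect` finds word spans instead of text boxes, and `recognize` upper-cases its input instead of running a model. All three function names are hypothetical and do not come from the repository.

```python
def detect(page):
    """Hypothetical detector: return (start, end) spans of words in a string page."""
    spans, start = [], None
    for i, ch in enumerate(page + " "):
        if ch != " " and start is None:
            start = i
        elif ch == " " and start is not None:
            spans.append((start, i))
            start = None
    return spans


def recognize(crops):
    """Hypothetical recognizer: take a batch of crops, return (text, score) pairs."""
    return [(crop.upper(), 1.0) for crop in crops]


def ocr_two_stage(pages):
    # Stage 1: run detection on every page and collect all crops.
    index, crops = [], []
    for page_no, page in enumerate(pages):
        for start, end in detect(page):
            index.append((page_no, (start, end)))
            crops.append(page[start:end])
    # Stage 2: recognize all crops in one batch, then map results back to pages.
    results = [[] for _ in pages]
    for (page_no, box), (text, score) in zip(index, recognize(crops)):
        results[page_no].append({"box": box, "text": text, "score": score})
    return results
```

Separating the stages lets recognition run over crops from many pages in a single batch, at the cost of keeping an index that maps each crop back to its page.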
  7. 27 Mar, 2025 1 commit
    • feat(model): add OCR model base structure and utilities · a7a899f6
      myhloli authored
      - Add base model structure for OCR in pytorch
      - Implement data augmentation and transformation modules
      - Create utilities for dictionary handling and state dict conversion
      - Include post-processing modules for OCR
      - Add weight initialization and loading functions
      a7a899f6
  8. 26 Mar, 2025 2 commits
  9. 24 Mar, 2025 2 commits
  10. 22 Mar, 2025 1 commit
  11. 21 Mar, 2025 1 commit
  12. 20 Mar, 2025 6 commits
  13. 19 Mar, 2025 2 commits
  14. 13 Mar, 2025 4 commits
  15. 12 Mar, 2025 1 commit