1. 20 Mar, 2025 1 commit
  2. 21 Jan, 2025 1 commit
    • fix(models): update unimernet_small model path · 2a3a006f
      myhloli authored
      - Update model path from 'unimernet_small' to 'unimernet_small_2501' in multiple scripts and configuration files
      - This change affects download_models.py, download_models_hf.py, and model_configs.yaml
  3. 14 Jan, 2025 1 commit
    • feat(layout): improve title block handling and layout detection · c20e9a1e
      myhloli authored
      - Merge title blocks that are close to each other horizontally
      - Adjust line insertion logic for title blocks
      - Increase image size and decrease confidence threshold for layout detection
      - Update DocLayoutYOLO model weights
      - Refactor drawing of bounding boxes for different block types
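The first bullet of this commit (merging title blocks that sit close together horizontally) can be illustrated with a minimal sketch. This is not the repository's implementation: the function names, the `(x0, y0, x1, y1)` box format, and both thresholds (`max_gap`, `min_v_overlap`) are assumptions chosen for illustration.

```python
from typing import List, Tuple

# Hypothetical box format: (x0, y0, x1, y1) in page coordinates.
BBox = Tuple[float, float, float, float]


def _is_horizontal_neighbor(a: BBox, b: BBox, max_gap: float, min_v_overlap: float) -> bool:
    """Return True if the boxes are horizontally close and share a text row."""
    # Horizontal gap between the boxes (negative when they overlap).
    gap = max(a[0], b[0]) - min(a[2], b[2])
    # Vertical overlap, measured relative to the shorter box.
    v_overlap = min(a[3], b[3]) - max(a[1], b[1])
    shorter = min(a[3] - a[1], b[3] - b[1])
    if shorter <= 0:
        return False
    return gap <= max_gap and (v_overlap / shorter) >= min_v_overlap


def merge_close_title_blocks(
    blocks: List[BBox], max_gap: float = 10.0, min_v_overlap: float = 0.5
) -> List[BBox]:
    """Greedily merge title boxes that are horizontal neighbors (assumed thresholds)."""
    merged: List[BBox] = []
    for box in sorted(blocks, key=lambda b: b[0]):
        for i, prev in enumerate(merged):
            if _is_horizontal_neighbor(prev, box, max_gap, min_v_overlap):
                # Replace the earlier box with the union of both boxes.
                merged[i] = (
                    min(prev[0], box[0]),
                    min(prev[1], box[1]),
                    max(prev[2], box[2]),
                    max(prev[3], box[3]),
                )
                break
        else:
            merged.append(box)
    return merged


title_boxes = [(0, 0, 50, 20), (55, 2, 100, 22), (0, 100, 40, 120)]
result = merge_close_title_blocks(title_boxes)
```

Here the first two boxes lie on the same row with a 5-unit gap, so they collapse into one box, while the third box on a lower row is left alone. A single greedy pass is used for brevity; a production version would likely repeat until no more merges occur.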
  4. 09 Jan, 2025 2 commits
  5. 08 Jan, 2025 1 commit
    • feat(language-detection): improve language detection accuracy for specific languages · 356cb1f2
      myhloli authored
      - Add separate models for Chinese/Japanese and English/French/German detection
      - Implement mode-based detection to use appropriate models for different languages
      - Update language detection process to use higher DPI for better accuracy
      - Modify model initialization and prediction logic to support new language-specific models
  6. 17 Dec, 2024 1 commit
  7. 08 Nov, 2024 1 commit
  8. 23 Oct, 2024 1 commit
    • feat(model): add support for DocLayout-YOLO model · 1279f2cd
      myhloli authored
      - Add new layout model option: DocLayout-YOLO
      - Implement model initialization and prediction for DocLayout-YOLO
      - Update configuration options to include new model
      - Modify existing code to support both LayoutLMv3 and DocLayout-YOLO models
      - Update Gradio app to support more custom switches
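Supporting both LayoutLMv3 and DocLayout-YOLO behind one configuration option, as this commit describes, is commonly done with a small registry keyed by model name. The sketch below only shows that pattern. The config keys (`model`, `weights`), the registry names, and the placeholder initializers are all hypothetical and do not reflect the repository's actual code or either model's API.

```python
from typing import Callable, Dict


def _init_layoutlmv3(weights: str) -> dict:
    # Placeholder: a real initializer would load LayoutLMv3 weights here.
    return {"name": "layoutlmv3", "weights": weights}


def _init_doclayout_yolo(weights: str) -> dict:
    # Placeholder: a real initializer would load DocLayout-YOLO weights here.
    return {"name": "doclayout_yolo", "weights": weights}


# Hypothetical registry mapping a config value to an initializer.
LAYOUT_MODEL_REGISTRY: Dict[str, Callable[[str], dict]] = {
    "layoutlmv3": _init_layoutlmv3,
    "doclayout_yolo": _init_doclayout_yolo,
}


def init_layout_model(config: dict) -> dict:
    """Pick the layout model named in the config (assumed keys: 'model', 'weights')."""
    name = config.get("model", "layoutlmv3")
    if name not in LAYOUT_MODEL_REGISTRY:
        raise ValueError(f"Unknown layout model: {name!r}")
    return LAYOUT_MODEL_REGISTRY[name](config.get("weights", ""))


layout_model = init_layout_model({"model": "doclayout_yolo", "weights": "weights.pt"})
```

Keeping the choice in one lookup table means that adding a third layout model only requires a new initializer and one registry entry.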
  9. 20 Sep, 2024 1 commit
  10. 12 Sep, 2024 1 commit
  11. 10 Sep, 2024 1 commit
    • refactor(pdf_extract_kit): update model config and weight paths for UniMERNet-0.2.0 · 4f340c44
      myhloli authored
      Update the paths to model weights and configuration files for the UniMERNet architecture
      in both the demo.yaml and model_configs.yaml files. Adjust the mfr_model_init function to
      reflect the new weight and configuration paths. The changes include specifying more detailed
      paths to the unimernet_base directory and changing the weight file extension to .pth.
  12. 02 Sep, 2024 2 commits
  13. 02 Aug, 2024 1 commit
    • feat(model inference): add table recognition and conversion to LaTeX (#284) · 37925f36
      Kaiwen Liu authored
      * # add table recognition using struct-eqtable
      ## Changelog
      31/07/2024
      - Support table recognition. Table images will be converted into HTML.
      
      ### how to use the new feature:
      set the attribute 'table-mode' to 'true' in magic-pdf.json
      
      ### caution:
      it takes 200s to 500s to convert a single table image using cpu
      
      * # add table recognition using struct-eqtable
      ## Changelog
      31/07/2024
      - Support table recognition. Table images will be converted into LaTeX.
      
      ### how to use the new feature:
      set the attribute 'table-mode' to 'true' in magic-pdf.json
      
      ### caution:
      it takes 200s to 500s to convert a single table image using cpu
      
      * # feat(model inference): add table recognition and conversion to LaTeX
      
      # What's Changed
      
      ### New Features
      
      - Add table content recognition, using the weights of [StructEqTable](https://github.com/UniModal4Reasoning/StructEqTable-Deploy) to convert table images to LaTeX.
      
      ### Instruction
      
      - pip install pypandoc struct-eqtable==0.1.0
      - Download [StructEqTable weights](https://huggingface.co/wanderkid/PDF-Extract-Kit/tree/main/models/TabRec) and put them under the models/ directory.
      - Edit the 'table-mode' value to turn on table recognition, which is turned off by default.
      - If you have not downloaded any models before, refer to [how to download models](docs/how_to_download_models_zh_cn.md).
      
      * add table recognition and conversion to LaTeX
      
      * add table recognition and conversion to LaTeX
      
      * add table recognition and conversion to LaTeX
      
      * add table recognition and conversion to LaTeX
      
      ---------
      Co-authored-by: liukaiwen <liukaiwen@pjlab.org.cn>
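The commit above says table recognition is enabled by setting 'table-mode' to 'true' in magic-pdf.json. A minimal sketch of doing that from a script follows. It assumes the value is a JSON boolean and that the file path is passed in by the caller; the helper name `enable_table_mode` and the example path are hypothetical, and the real file location depends on the installation.

```python
import json
from pathlib import Path


def enable_table_mode(config_path: Path) -> dict:
    """Set 'table-mode' to true in a magic-pdf.json style file and return the config."""
    if config_path.exists():
        config = json.loads(config_path.read_text(encoding="utf-8"))
    else:
        # Assumption: start from an empty config if the file does not exist yet.
        config = {}
    # Assumption: the switch is stored as a JSON boolean rather than the string "true".
    config["table-mode"] = True
    config_path.write_text(
        json.dumps(config, indent=4, ensure_ascii=False), encoding="utf-8"
    )
    return config
```

For example, `enable_table_mode(Path("magic-pdf.json"))` would update a config file in the current directory while leaving all other keys unchanged. Given the commit's caution that a single table image takes 200s to 500s on CPU, leaving the switch off may be preferable on machines without a GPU.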
  14. 01 Aug, 2024 2 commits
  15. 31 Jul, 2024 1 commit
    • # add table recognition using struct-eqtable · b29badc1
      liukaiwen authored
      ## Changelog
      31/07/2024
      - Support table recognition. Table images will be converted into HTML.
      
      ### how to use the new feature:
      set the attribute 'table-mode' to 'true' in magic-pdf.json
      
      ### caution:
      it takes 200s to 500s to convert a single table image using cpu
  16. 19 Jul, 2024 1 commit
  17. 12 Jul, 2024 2 commits
  18. 09 Jul, 2024 1 commit