  1. 02 Sep, 2024 2 commits
  2. 26 Aug, 2024 1 commit
  3. 22 Aug, 2024 1 commit
    • build(docker): update docker build step (#471) · 1fc0b76d
      Xiaomeng Zhao authored
      * build(docker): update base image to Ubuntu 22.04 and install PaddlePaddle
      
      Upgrade the Docker base image from ubuntu:latest to ubuntu:22.04 for improved
      performance and stability.
      
      Additionally, integrate PaddlePaddle GPU version 3.0.0b1
      into the Docker build for enhanced AI capabilities. The MinIO configuration file has
      also been updated to the latest version.
      
      * build(dockerfile): Updated the Dockerfile
      
      * build(Dockerfile): update Dockerfile
      
      * docs(docker): add instructions for quick deployment with Docker
      
      Include Docker-based deployment instructions in the README for both English and
      Chinese locales. This update provides users a quick-start guide to using Docker for
      deployment, with notes on GPU VRAM requirements and default acceleration features.
      
      * build(docker): Separate dependency installation, model download, and program setup into distinct layers.
  4. 20 Aug, 2024 2 commits
    • fix(ocr_mkcontent): revise table caption output (#397) · dd19f59e
      Xiaomeng Zhao authored
      
      
      * fix(ocr_mkcontent): revise table caption output
      
      - Ensure that table captions are properly included in the output.
      - Remove the redundant `table_caption` variable.
      
      * Update cla.yml
      
      * Update bug_report.yml
      
      * feat(cli): add debug option for detailed error handling
      
      Enable users to invoke the CLI command with a new debug flag to get detailed debugging information.
      
      * fix(pdf-extract-kit): adjust crop_paste parameters for better accuracy
      
      The crop_paste_x and crop_paste_y values in pdf_extract_kit.py have been modified
      to improve the accuracy and consistency of OCR processing. The new values are set to 25
      to ensure more precise image cropping and pasting, which leads to better OCR recognition
      results.
      
      * Update README_zh-CN.md (#404)
      
      correct FAQ url
      
      * Update README_zh-CN.md (#404) (#409) (#410)
      
      correct FAQ url
      Co-authored-by: sfk <18810651050@163.com>
      
      * Update FAQ_zh_cn.md
      
      add new issue
      
      * Update FAQ_en_us.md
      
      * Update README_Windows_CUDA_Acceleration_zh_CN.md
      
      * Update README_zh-CN.md
      
      * @Thepathakarpit has signed the CLA in opendatalab/MinerU#418
      
      * fix(pdf-extract-kit): increase crop_paste margin for OCR processing
      
      Double the crop_paste margin from 25 to 50 to ensure better OCR accuracy and
      handling of border cases. This change helps improve the overall quality of
      OCR'ed text by providing more context around the detected text areas.
      
      * fix(common): deep copy model list before drawing model bbox
      
      Use a deep copy of the original model list in `drow_model_bbox` to avoid potential
      modifications to the source data. This ensures the integrity of the original models
      is maintained while generating the model bounding boxes visualization.
      
      ---------
      Co-authored-by: sfk <18810651050@163.com>
      Co-authored-by: drunkpig <60862764+drunkpig@users.noreply.github.com>
      Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
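The deep-copy fix described in the last item of the commit above can be illustrated with a minimal sketch. The function name `draw_boxes_on_copy` and the `layout_dets` / `bbox` keys are hypothetical and chosen only for illustration; they are not taken from the project's code. The sketch assumes the drawing step rewrites box coordinates in place, which is the kind of mutation a deep copy protects against.

```python
import copy


def draw_boxes_on_copy(model_list):
    """Hypothetical stand-in for a drawing helper.

    It works on a deep copy so that any in-place changes made while
    preparing boxes for drawing never reach the caller's data.
    """
    working = copy.deepcopy(model_list)
    for page in working:
        for det in page["layout_dets"]:
            # Example of an in-place change: snap coordinates to integers.
            det["bbox"] = [round(v) for v in det["bbox"]]
    return working


# Hypothetical input: one page with one detection.
original = [{"layout_dets": [{"bbox": [1.4, 2.6, 10.2, 20.8]}]}]
drawn = draw_boxes_on_copy(original)

# The caller's list still holds the untouched float coordinates.
print(original[0]["layout_dets"][0]["bbox"])
print(drawn[0]["layout_dets"][0]["bbox"])
```

A shallow copy (`list(model_list)` or `copy.copy`) would not be enough in a case like this, because the nested dictionaries would still be shared with the source list.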
    • feat: rename the file generated by command line tools (#401) · c9a51491
      icecraft authored
      
      
      * feat: rename the file generated by command line tools
      
      * feat: add pdf filename as prefix to {span,layout,model}.pdf
      
      ---------
      Co-authored-by: icecraft <tmortred@gmail.com>
      Co-authored-by: icecraft <xurui1@pjlab.org.cn>
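The renaming described in the commit above (adding the PDF filename as a prefix to the span, layout, and model PDFs) might look roughly like the following sketch. The helper name `debug_output_names` and the underscore separator are assumptions; the commit message does not state the exact naming pattern.

```python
from pathlib import Path


def debug_output_names(pdf_path, kinds=("span", "layout", "model")):
    """Build prefixed output names for a given input PDF.

    Hypothetical helper: the "<stem>_<kind>.pdf" pattern is an assumed
    format, used here only to show the idea of prefixing.
    """
    stem = Path(pdf_path).stem  # filename without directory or extension
    return [f"{stem}_{kind}.pdf" for kind in kinds]


names = debug_output_names("/tmp/report.pdf")
print(names)
```

Prefixing with the input name keeps outputs from different PDFs from overwriting each other when they land in the same directory.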
  5. 13 Aug, 2024 3 commits
  6. 09 Aug, 2024 1 commit
  7. 02 Aug, 2024 2 commits
    • docs: specify absolute path for model weights configuration · 9778a461
      myhloli authored
      Update the README documents to clarify that the "models-dir" in the
      configuration should be an absolute path. Also, provide additional guidance
      for Windows users on how to correctly format the path to avoid common issues
      with path escaping in JSON files.
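The path-escaping issue mentioned in the commit above comes from JSON treating the backslash as an escape character. A minimal sketch of the idea, assuming a hypothetical Windows directory `C:\Users\example\models` (only the `models-dir` key comes from the commit message):

```python
import json
from pathlib import PureWindowsPath

# Hypothetical absolute Windows path to the downloaded model weights.
models_dir = r"C:\Users\example\models"

# The commit asks for an absolute path; PureWindowsPath can check this
# on any operating system without touching the file system.
is_absolute = PureWindowsPath(models_dir).is_absolute()

# json.dumps doubles every backslash so the file stays valid JSON.
config_text = json.dumps({"models-dir": models_dir}, indent=2)
print(config_text)

# Reading the text back restores the original single-backslash path.
loaded = json.loads(config_text)
print(loaded["models-dir"] == models_dir)
```

When editing the JSON file by hand, the same rule applies: each backslash in the path has to be written as two backslashes. Writing the path with forward slashes is a common alternative that avoids escaping, though whether a given tool accepts it is something to verify.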
    • docs: update model download instructions and CUDA acceleration setup · 3ef4d054
      myhloli authored
      Update the documentation to reflect the latest model download procedures, emphasize
      model file integrity checks, and expand the instructions for setting up CUDA acceleration
      on Ubuntu and Windows environments. The README files for each operating system have been
      enhanced with additional details to assist users in configuring and verifying their
      environments for optimal performance.
  8. 01 Aug, 2024 2 commits
    • docs(readme): update installation URLs for faster wheel downloads · 793e8d43
      myhloli authored
      Change the URLs in the installation instructions to new mirrors that are expected to provide faster downloads for users. This update affects the installation guides for both detectron2 and magic-pdf in the Chinese documentation.
    • refactor(readme): optimize detectron2 installation guide · 69ce578c
      myhloli authored
      Reorganize the installation instructions for Magic-PDF to clarify the dependency on
      detectron2 and provide a more straightforward installation process. The update includes
      separating the dependency installation from the package installation and adding a note
      about precompiled wheels for Python 3.10.
      
      BREAKING CHANGE: The installation guide now assumes basic familiarity with detectron2
      installation requirements. Users who need to compile detectron2 from source should refer
      to the official detectron2 documentation.
  9. 31 Jul, 2024 5 commits
    • Update README_zh-CN.md · 5e8d149f
      Xiaomeng Zhao authored
    • docs(readme): update PyTorch installation guide for CUDA 11.8 · 13d30a4f
      myhloli authored
      Update the PyTorch installation command in the README files for both English and Chinese
      versions to reflect the required version compatibility with CUDA 11.8. Include explicit
      instructions to specify the PyTorch version to avoid automatic installation of higher,
      unsupported versions. Additionally, clarify the importance of modifying the "device-mode"
      parameter in the magic-pdf.json configuration file for proper CUDA device selection.
    • fix(readme): specify supported PyTorch versions in install guide · fd60393d
      myhloli authored
      Update the PyTorch installation guide in both English and Chinese READMEs to explicitly
      recommend using torch==2.3.1 and torchvision==0.18.1 for CUDA 11.8. Emphasize the
      importance of specifying these versions to avoid compatibility issues with higher,
      unsupported versions.
    • docs(readme): update install command and add beta version notice · 891a9741
      myhloli authored
      - Change the pip install command in README_zh-CN.md to reflect the new version 0.6.2b1.
      - Include a notice about the pre-release of version 0.6.2beta, cautioning users that it
       has not undergone full QA testing and providing guidance on rolling back to version 0.6.1.
      - Verify the installed version with `magic-pdf --version` after installation to ensure
       the correct version is installed, addressing feedback about incorrect versions due to
       mirror source and dependency conflicts.
    • update discord link · cfcb1f47
      xuchao authored
  10. 30 Jul, 2024 1 commit
  11. 29 Jul, 2024 1 commit
  12. 28 Jul, 2024 1 commit
  13. 26 Jul, 2024 1 commit
  14. 24 Jul, 2024 2 commits
    • fix(readme): update wheel install guide for detectron2 · f8599d2b
      myhloli authored
      Update the installation guide in README_zh-CN.md to clarify that the pre-compiled
      whl packages are only compatible with 64-bit systems running Python 3.10 on
      Windows/Linux/macOS. Add a note warning that the packages are not compatible with
      32-bit systems or non-Mac ARM platforms, and suggest manual compilation for
      unsupported systems.
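The compatibility constraints listed in the commit above (64-bit, Python 3.10, Windows/Linux/macOS, no ARM except on Mac) can be expressed as a small pre-flight check. This is a sketch under assumptions: the function `wheel_supported` is hypothetical and not part of the project, and the set of ARM machine names it recognizes is a guess at common values reported by `platform.machine()`.

```python
import platform
import struct
import sys


def wheel_supported(py_version, pointer_bits, system, machine):
    """Return True if the described pre-compiled wheels should apply.

    Hypothetical check built only from the constraints in the commit
    message; it does not inspect any real wheel metadata.
    """
    if tuple(py_version[:2]) != (3, 10):
        return False  # wheels are described as Python 3.10 only
    if pointer_bits != 64:
        return False  # 32-bit systems are described as unsupported
    if system not in {"Windows", "Linux", "Darwin"}:
        return False
    name = machine.lower()
    is_arm = name.startswith("arm") or name == "aarch64"
    if is_arm and system != "Darwin":
        return False  # ARM is described as supported only on Mac
    return True


# Check the interpreter that is running this script.
current = wheel_supported(
    sys.version_info,
    struct.calcsize("P") * 8,  # pointer size in bits
    platform.system(),
    platform.machine(),
)
print("pre-compiled wheel expected to work:", current)
```

Passing the environment in as arguments, rather than reading it inside the function, makes it easy to try the check against platforms other than the current one.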
    • feat(readme): add instructions for config file setup and model weight path... · 3fc9943f
      myhloli authored
      feat(readme): add instructions for config file setup and model weight path configuration
      
      Update the README_zh-CN.md to include detailed steps for copying the
      configuration template to the user directory and configuring the 'models-dir'
      to point to the downloaded model weight files. This ensures users have a clear
      guideline on setting up the configuration file correctly, preventing program
      failures due to missing model files.
  15. 23 Jul, 2024 2 commits
  16. 22 Jul, 2024 1 commit
  17. 19 Jul, 2024 1 commit
  18. 18 Jul, 2024 1 commit
  19. 17 Jul, 2024 1 commit
  20. 15 Jul, 2024 1 commit
  21. 13 Jul, 2024 3 commits
  22. 12 Jul, 2024 5 commits