- 01 Aug, 2024 2 commits
-
-
liukaiwen authored
-
liukaiwen authored
# What's Changed
### New Features
- Add table content recognition. We use the weights of [StructEqTable](https://github.com/UniModal4Reasoning/StructEqTable-Deploy) to convert table images to LaTeX.
### Instructions
- Run `pip install pypandoc struct-eqtable==0.1.0`.
- Download the [StructEqTable weights](https://huggingface.co/wanderkid/PDF-Extract-Kit/tree/main/models/TabRec) and put them under the models/ directory.
- Edit the 'table-mode' value to turn on table recognition, which is turned off by default.
- If you have not downloaded any models before, refer to [how to download models](docs/how_to_download_models_zh_cn.md).
-
- 31 Jul, 2024 3 commits
-
-
liukaiwen authored
## Changelog 31/07/2024
- Support table recognition. Table images will be converted into LaTeX.
### How to use the new feature
Set the attribute 'table-mode' to 'true' in magic-pdf.json.
### Caution
It takes 200 s to 500 s to convert a single table image on CPU.
-
myhloli authored
-
liukaiwen authored
## Changelog 31/07/2024
- Support table recognition. Table images will be converted into HTML.
### How to use the new feature
Set the attribute 'table-mode' to 'true' in magic-pdf.json.
### Caution
It takes 200 s to 500 s to convert a single table image on CPU.
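The configuration step above can be sketched in Python. This is a minimal illustration, not the project's own tooling: the helper name `enable_table_mode` is hypothetical, the location of magic-pdf.json is left to the caller, and the value is assumed to be a JSON boolean (the changelog only says 'true').

```python
import json
from pathlib import Path


def enable_table_mode(config_path: str) -> dict:
    """Set 'table-mode' to true in a magic-pdf.json-style file (hypothetical helper)."""
    path = Path(config_path)
    # Read with an explicit encoding so non-ASCII values survive the round trip.
    config = json.loads(path.read_text(encoding="utf-8"))
    # Assumption: the option is stored as a JSON boolean.
    config["table-mode"] = True
    path.write_text(
        json.dumps(config, indent=4, ensure_ascii=False), encoding="utf-8"
    )
    return config
```

Calling `enable_table_mode("magic-pdf.json")` rewrites the file in place and leaves every other key untouched.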
-
- 30 Jul, 2024 2 commits
- 29 Jul, 2024 1 commit
-
-
myhloli authored
-
- 28 Jul, 2024 1 commit
-
-
myhloli authored
-
- 25 Jul, 2024 1 commit
-
-
myhloli authored
fix(pdf_extract_kit): specify utf-8 encoding when reading model config
Ensure the model configuration file is read with utf-8 encoding to support non-ASCII characters and prevent potential encoding errors.
-
- 24 Jul, 2024 5 commits
-
-
myhloli authored
Specify utf-8 encoding when opening the configuration file to ensure compatibility with files containing non-ASCII characters, avoiding potential encoding errors.
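A minimal sketch of the kind of change this commit describes, assuming a JSON configuration file. The function name `read_config` is illustrative and not taken from the repository.

```python
import json


def read_config(path: str) -> dict:
    """Read a JSON config file with an explicit utf-8 encoding (illustrative helper)."""
    # Without encoding="utf-8", open() falls back to the platform's locale
    # encoding, which can fail or mis-decode on files with non-ASCII characters.
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)
```

Passing the encoding explicitly makes the result independent of the machine's locale settings.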
-
赵小蒙 authored
-
myhloli authored
-
myhloli authored
fix(magic-pdf): add default values and improve warning logs for config options
Ensure that 'temp-output-dir', 'models-dir', and 'device-mode' have sensible default values in case they are not specified in the config file.
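The fallback behaviour described above might look roughly like the following. This is only a sketch: the default values and the name `get_option` are assumptions made for illustration, and the commit does not state which defaults were actually chosen.

```python
import logging
import tempfile

logger = logging.getLogger(__name__)

# Hypothetical defaults; the real values used by the project are not given here.
DEFAULTS = {
    "temp-output-dir": tempfile.gettempdir(),
    "models-dir": "/tmp/models",
    "device-mode": "cpu",
}


def get_option(config: dict, key: str):
    """Return config[key], or a default with a warning when the key is missing."""
    value = config.get(key)
    if value is None:
        default = DEFAULTS[key]
        logger.warning("'%s' not found in config, using default: %s", key, default)
        return default
    return value
```

For example, `get_option({}, "device-mode")` logs a warning and returns the assumed default, while an explicitly configured value is returned unchanged.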
-
myhloli authored
-
- 23 Jul, 2024 6 commits
- 22 Jul, 2024 3 commits
- 19 Jul, 2024 3 commits
- 18 Jul, 2024 1 commit
-
-
myhloli authored
-
- 17 Jul, 2024 4 commits
- 15 Jul, 2024 1 commit
-
-
myhloli authored
-
- 14 Jul, 2024 2 commits
-
-
myhloli authored
Improve the model loading mechanism in magic_pdf by implementing a Singleton pattern to reduce redundant model instantiation. Additionally, enhance the command-line interface to support input from list files, allowing batch processing of multiple PDF documents.
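As an illustration of list-file input, the sketch below reads PDF paths from a plain-text file. The file format (one path per line, blank lines ignored) and the name `read_pdf_list` are assumptions; the commit message does not specify them.

```python
from pathlib import Path
from typing import List


def read_pdf_list(list_file: str) -> List[str]:
    """Return the PDF paths named in a list file, one per line (assumed format)."""
    paths = []
    for line in Path(list_file).read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if line:  # skip blank lines
            paths.append(line)
    return paths
```

A batch run would then loop over the returned paths and process each document in turn.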
-
myhloli authored
Introduce a Singleton pattern to manage custom models in the magic_pdf module. This change improves efficiency by ensuring that a single instance of the custom model is created and reused, reducing the overhead of repeated instantiation for the same model configuration.
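The idea can be sketched as follows. This is a generic Singleton with a per-configuration cache, not the code from magic_pdf: `ModelSingleton`, `get_model`, and the stand-in `load_model` are hypothetical names.

```python
def load_model(config_key):
    """Hypothetical stand-in for an expensive model constructor."""
    return {"config": config_key}


class ModelSingleton:
    _instance = None  # the one shared manager object
    _models = {}      # cache of models keyed by configuration

    def __new__(cls):
        # Create the manager once and hand back the same object afterwards.
        if cls._instance is None:
            cls._instance = super().__new__(cls)
        return cls._instance

    def get_model(self, config_key):
        # Build a model only the first time a given configuration is requested.
        if config_key not in self._models:
            self._models[config_key] = load_model(config_key)
        return self._models[config_key]
```

Repeated calls such as `ModelSingleton().get_model(("ocr", True))` return the same cached object, so the expensive constructor runs once per distinct configuration.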
-
- 13 Jul, 2024 2 commits
- 12 Jul, 2024 3 commits
-
-
myhloli authored
-
myhloli authored
-
zhaoxiaomeng authored
-