Commit ab263aad authored by myhloli

docs(readme): update changelog for v1.1.0 release

- Update model capabilities: upgrade to latest doclayout_yolo(2501) and unimernet(2501) models
- Improve performance: optimize resource usage and processing pipeline for faster parsing on high-end devices
- Enhance parsing effects: add new heading classification feature to online demo
- Refactor changelog structure for better readability and organization
parent 6ff18b14
@@ -48,9 +48,12 @@ Easier to use: Just grab MinerU Desktop. No coding, no login, just a simple inte
  # Changelog
  - 2025/01/22 1.1.0 released. In this version we have focused on improving parsing accuracy and efficiency:
- - Upgraded to the latest doclayout_yolo(2501) model, enhancing layout recognition accuracy.
- - Upgraded to the latest unimernet(2501) model, improving formula recognition accuracy.
+ - Model capability upgrade (requires re-executing the [model download process](docs/how_to_download_models_en.md) to obtain incremental updates of model files)
+ - The layout recognition model has been upgraded to the latest `doclayout_yolo(2501)` model, improving layout recognition accuracy.
+ - The formula parsing model has been upgraded to the latest `unimernet(2501)` model, improving formula recognition accuracy.
+ - Performance optimization
  - On devices that meet certain configuration requirements (16GB+ VRAM), by optimizing resource usage and restructuring the processing pipeline, overall parsing speed has been increased by more than 50%.
+ - Parsing effect optimization
  - Added a new heading classification feature (testing version, enabled by default) to the online demo, which supports hierarchical classification of headings, thereby enhancing document structuring.
  - 2025/01/10 1.0.1 released. This is our first official release, where we have introduced a completely new API interface and enhanced compatibility through extensive refactoring, as well as a brand new automatic language identification feature:
  - New API Interface
......
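@@ -47,10 +47,13 @@
  # Changelog
  - 2025/01/22 1.1.0 released. In this version we focused on improving parsing accuracy and efficiency:
- - Upgraded to the latest doclayout_yolo(2501) model, improving layout recognition accuracy
- - Upgraded to the latest unimernet(2501) model, improving formula recognition accuracy
+ - Model capability upgrade (requires re-executing the [model download process](docs/how_to_download_models_zh_cn.md) to obtain incremental updates of model files)
+ - The layout recognition model has been upgraded to the latest `doclayout_yolo(2501)` model, improving layout recognition accuracy
+ - The formula parsing model has been upgraded to the latest `unimernet(2501)` model, improving formula recognition accuracy
+ - Performance optimization
  - On devices that meet certain configuration requirements (16GB+ VRAM), by optimizing resource usage and restructuring the processing pipeline, overall parsing speed has been increased by more than 50%
- - Added a new heading classification feature (testing version, enabled by default) to the online demo, which supports hierarchical classification of headings and improves document structuring
+ - Parsing effect optimization
+ - Added a new heading classification feature (testing version, enabled by default) to the online demo (mineru.net/huggingface/modelscope), which supports hierarchical classification of headings and improves document structuring
  - 2025/01/10 1.0.1 released. This is our first official release, in which extensive refactoring brought a completely new API interface and broader compatibility, as well as a brand new automatic language identification feature:
  - New API interface
  - For the data-side API, we introduced the Dataset class, designed to provide a powerful and flexible data processing framework. The framework currently supports multiple document formats, including images (.jpg and .png), PDF, Word (.doc and .docx), and PowerPoint (.ppt and .pptx), so that data processing tasks from simple to complex are effectively supported.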
......