docs(README): update release notes for version 1.2.0

- Update English and Chinese README files with the changelog for version 1.2.0
- Include details on performance optimizations, parsing improvements, and bug fixes
- Highlight specific enhancements for PDF document classification, watermark handling, and layout matching

Commit 2a466e03 authored Feb 24, 2025 by myhloli

Showing 2 changed files with 20 additions and 0 deletions

README.md +10 -0

README_zh-CN.md +10 -0

--- a/README.md
+++ b/README.md
@@ -47,6 +47,16 @@ Easier to use: Just grab MinerU Desktop. No coding, no login, just a simple inte
 </div>

 # Changelog
+
+- 2025/02/24 Release 1.2.0: This version includes several fixes and improvements to enhance parsing efficiency and accuracy:
+  - Performance Optimization
+    - Increased classification speed for PDF documents in auto mode.
+  - Parsing Optimization
+    - Improved parsing logic for documents containing watermarks, significantly enhancing the parsing results for such documents.
+    - Enhanced the matching logic for multiple images/tables and captions within a single page, improving the accuracy of image-text matching in complex layouts.
+  - Bug Fixes
+    - Fixed an issue where image/table spans were incorrectly filled into text blocks under certain conditions.
+    - Resolved an issue where title blocks were empty in some cases.
 - 2025/01/22 1.1.0 released. In this version we have focused on improving parsing accuracy and efficiency:
  - Model capability upgrade (requires re-executing the [model download process](docs/how_to_download_models_en.md) to obtain incremental updates of model files)
    - The layout recognition model has been upgraded to the latest `doclayout_yolo(2501)` model, improving layout recognition accuracy.

--- a/README_zh-CN.md
+++ b/README_zh-CN.md
@@ -46,6 +46,16 @@
 </div>

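 # Changelog
+- 2025/02/24 1.2.0 released. In this version we fixed several issues and improved parsing efficiency and accuracy:
+  - Performance Optimization
+    - Increased classification speed for PDF documents in auto mode
+    - Added high-performance plugin support for Huawei Ascend NPU acceleration mode, with end-to-end speedups of up to 300% in common scenarios [Application link](https://aicarrier.feishu.cn/share/base/form/shrcnb10VaoNQB8kQPA8DEfZC6d)
+  - Parsing Optimization
+    - Optimized the parsing logic for documents containing watermarks, significantly improving parsing results for such documents
+    - Improved the matching logic for multiple images/tables and captions within a single page, improving the accuracy of image-text matching in complex layouts
+  - Bug Fixes
+    - Fixed an error caused by image/table spans being filled into text blocks in certain cases
+    - Fixed an issue where title blocks were empty in certain cases
 - 2025/01/22 1.1.0 released. In this version we focused on improving parsing accuracy and efficiency:
  - Model capability upgrade (requires re-executing the [model download process](docs/how_to_download_models_zh_cn.md) to obtain incremental updates of model files)
    - The layout recognition model has been upgraded to the latest `doclayout_yolo(2501)` model, improving layout recognition accuracy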