wangsen
MinerU
Commits
d3542f6a71f2f56099bb91aa440e1cd243e063bb
mineru
magic_pdf
25 Apr, 2024
8 commits
add para_to_standard_format logic
· d3542f6a
赵小蒙
authored
Apr 25, 2024
d3542f6a
add init
· c7cca210
赵小蒙
authored
Apr 25, 2024
c7cca210
fix remove error
· f70289f9
赵小蒙
authored
Apr 25, 2024
f70289f9
fix remove error
· 1936703b
赵小蒙
authored
Apr 25, 2024
1936703b
change remove spans logic
· fcf94b2d
赵小蒙
authored
Apr 25, 2024
fcf94b2d
change some remove logic
· 91ee9911
赵小蒙
authored
Apr 25, 2024
91ee9911
change parse_union_pdf error output
· 7f0874de
赵小蒙
authored
Apr 25, 2024
7f0874de
fix interline_equations block
· 7631907f
赵小蒙
authored
Apr 25, 2024
7631907f
24 Apr, 2024
1 commit
change dropped_bbox drawing color
· 351a3ce1
赵小蒙
authored
Apr 24, 2024
351a3ce1
23 Apr, 2024
16 commits
fix content loss in para_split
· f4a7e0d7
liukaiwen
authored
Apr 23, 2024
f4a7e0d7
feat: draw block based on block_type
· 4aa48329
许瑞
authored
Apr 23, 2024
4aa48329
fix annotation
· 11462061
赵小蒙
authored
Apr 23, 2024
11462061
update confidence score 0.95->0.05
· 3f062ad7
赵小蒙
authored
Apr 23, 2024
3f062ad7
fix draw_span_bbox logic
· 49076f02
赵小蒙
authored
Apr 23, 2024
49076f02
fix draw_layout_bbox logic
· 60208b1b
赵小蒙
authored
Apr 23, 2024
60208b1b
output dir by method
· 39c65fe5
赵小蒙
authored
Apr 23, 2024
39c65fe5
update the error message shown when the model file is not found
· 21159040
kernel.h@qq.com
authored
Apr 23, 2024
21159040
filter by confidence score when initializing model data, default threshold 95%
· c460be91
赵小蒙
authored
Apr 23, 2024
c460be91
some ocr text boxes differ too much from their block boxes, lower fill threshold to 0.7
· ce992f27
赵小蒙
authored
Apr 23, 2024
ce992f27
add exception handling to the paragraph-splitting section of v2pipeline
· fa6e305c
赵小蒙
authored
Apr 23, 2024
fa6e305c
change imagedir to a relative path so images render in markdown
· 2edf2f8a
赵小蒙
authored
Apr 23, 2024
2edf2f8a
update para_split
· 778b1fb7
liukaiwen
authored
Apr 23, 2024
778b1fb7
avoid errors caused by empty para
· 81f73a3d
赵小蒙
authored
Apr 23, 2024
81f73a3d
change the pdf path
· ef0129ad
kernel.h@qq.com
authored
Apr 23, 2024
ef0129ad
update para_split
· 9528a839
liukaiwen
authored
Apr 23, 2024
9528a839
22 Apr, 2024
15 commits
use ocr uniformly to assemble markdown
· 1340a97a
赵小蒙
authored
Apr 22, 2024
1340a97a
when a text box overlaps a title box, trust the text box first
· 83641d3d
赵小蒙
authored
Apr 22, 2024
83641d3d
fix: remove_overlap leading zero height case
· ebc2f057
许瑞
authored
Apr 22, 2024
ebc2f057
fix ocr_mk_markdown_with_para_core_v2
· 52777b22
赵小蒙
authored
Apr 22, 2024
52777b22
update mm markdown assembly function
· d7128a9d
赵小蒙
authored
Apr 22, 2024
d7128a9d
update para_split
· 37483f0a
liukaiwen
authored
Apr 22, 2024
37483f0a
update para_split
· 4cc88d2b
liukaiwen
authored
Apr 22, 2024
4cc88d2b
fix parameter naming
· 8dcfe9ad
赵小蒙
authored
Apr 22, 2024
8dcfe9ad
update para_split
· 85153b02
liukaiwen
authored
Apr 22, 2024
85153b02
update para_split
· c60ca827
liukaiwen
authored
Apr 22, 2024
c60ca827
fix block overlap and nesting issues
· 55f358d1
赵小蒙
authored
Apr 22, 2024
55f358d1
feat: update cli
· b16599cd
许瑞
authored
Apr 22, 2024
b16599cd
fix block type field name
· 45ce99bf
赵小蒙
authored
Apr 22, 2024
add remove_overlaps_min_blocks logic
45ce99bf
update para_split
· e31066ba
liukaiwen
authored
Apr 22, 2024
e31066ba
add todo for the block nesting issue
· d8c5b7a7
赵小蒙
authored
Apr 22, 2024
d8c5b7a7