wangsen
MinerU
Commits
99055af3316ffe7b29a8e560fa1c472a2edf7f2e
mineru/magic_pdf/dict2md/ocr_mkcontent.py
21 Mar, 2024
4 commits
When concatenating English text, split any single word longer than 15 characters. Interline equations, images, and tables each occupy their own line
· 99055af3
赵小蒙
authored
Mar 21, 2024
Fix text enclosed in '[]' being recognized as a link
· 0dbbf9c3
赵小蒙
authored
Mar 21, 2024
Generate OCR markdown
· 056aed86
kernel.h@qq.com
authored
Mar 21, 2024
Update line_to_standard_format logic
· c5624ace
赵小蒙
authored
Mar 21, 2024
19 Mar, 2024
1 commit
Customize output for QA requirements
· ef267e09
赵小蒙
authored
Mar 19, 2024
15 Mar, 2024
4 commits
Add assembly logic for the standard format
· 051ee3c3
赵小蒙
authored
Mar 15, 2024
Unify the s3_image_save_path configuration
· f10b4a50
赵小蒙
authored
Mar 15, 2024
Update span_type classification in mk_mm_markdown2
· 195998a0
赵小蒙
authored
Mar 15, 2024
Change image addresses to the full path when making multimodal markdown
· f06a3213
赵小蒙
authored
Mar 15, 2024
14 Mar, 2024
5 commits
Implement paragraph segmentation within a layout
· 084e9328
xuchao
authored
Mar 14, 2024
Escape special characters when making markdown
· 59b0b0c3
赵小蒙
authored
Mar 14, 2024
Update the Spark pipeline for OCR mode
· 9bd6294b
赵小蒙
authored
Mar 14, 2024
Abstract the content type in OCR mode
· 26c23782
赵小蒙
authored
Mar 14, 2024
Draw the dropped bboxes in layout.pdf
· b6f051d8
赵小蒙
authored
Mar 14, 2024
12 Mar, 2024
1 commit
Add logic for generating multimodal markdown
· ec1a6ef7
赵小蒙
authored
Mar 12, 2024
07 Mar, 2024
1 commit
Fix an issue caused by a span that may have no content
· 00f3e329
赵小蒙
authored
Mar 07, 2024
06 Mar, 2024
1 commit
Add parsing support for the OCR version
· 701f3849
赵小蒙
authored
Mar 06, 2024