zhougaofeng / magic_pdf
Commit f3b09a0b
Authored Oct 25, 2024 by zhougaofeng
Update para_split_v2.py
Parent: c0a9d1c7
Showing 1 changed file with 6 additions and 6 deletions
magic_pdf/para/para_split_v2.py (+6 -6)
@@ -763,16 +763,16 @@ def para_split(pdf_info_dict, debug_mode, lang="en"):
 is_conn = __connect_para_inter_page(pre_page_paras, next_page_paras, pre_page_layout_bbox,
                                     next_page_layout_bbox, page_num, lang)
-if debug_able:
-    if is_conn:
-        logger.info(f"Connected paragraphs on page {page_num - 1} and page {page_num}")
+# if debug_able:
+#     if is_conn:
+#         logger.info(f"Connected paragraphs on page {page_num - 1} and page {page_num}")
 is_list_conn = __connect_list_inter_page(pre_page_paras, next_page_paras, pre_page_layout_bbox,
                                          next_page_layout_bbox, all_page_list_info[page_num - 1],
                                          all_page_list_info[page_num], page_num, lang)
-if debug_able:
-    if is_list_conn:
-        logger.info(f"Connected list paragraphs on page {page_num - 1} and page {page_num}")
+# if debug_able:
+#     if is_list_conn:
+#         logger.info(f"Connected list paragraphs on page {page_num - 1} and page {page_num}")
 """Some special content that could be merged may still have been missed; connect those paragraphs as well.
 1. In body text, a line is sometimes flush left while the next few lines are indented.
 ...
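The hunk above disables the cross-page debug logging by commenting it out. As a point of comparison only, here is a minimal sketch of keeping such logging behind an explicit flag instead. It is not the author's code: `report_connection`, its `kind` and `debug_enabled` parameters, and the logger name are hypothetical, the standard-library `logging` module stands in for whatever logger the project actually uses, and the message text is the English translation of the original log string.

```python
import logging
from typing import Optional

# Hypothetical logger name; the project's real logger is not shown in the diff.
logger = logging.getLogger("para_split_sketch")


def report_connection(is_conn: bool, page_num: int, kind: str = "paragraphs",
                      debug_enabled: bool = False) -> Optional[str]:
    """Log a cross-page connection only when debugging is switched on.

    Returns the logged message, or None when nothing was logged.
    """
    # Mirrors the nested `if debug_able: if is_conn:` checks from the diff.
    if not (debug_enabled and is_conn):
        return None
    message = f"Connected {kind} on page {page_num - 1} and page {page_num}"
    logger.info(message)
    return message
```

With a helper like this, both call sites in the hunk could pass their `is_conn` or `is_list_conn` result and the current `page_num`, and logging would stay silent unless the flag is set.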