wangsen / MinerU
Commit 6ab12348 (Unverified)
Authored Jun 13, 2025 by Xiaomeng Zhao
Committed by GitHub, Jun 13, 2025
Merge pull request #2625 from opendatalab/release-2.0.0
Release 2.0.0
Parents: 9487d33d, 4fbec469
Changes: 743
Showing 3 changed files with 0 additions and 377 deletions (+0 -377)
projects/web/src/pages/extract/components/pdf-upload/index.module.scss  +0 -115
projects/web/src/pages/extract/components/pdf-upload/index.tsx  +0 -116
projects/web/src/pages/extract/components/pdf-viewer/index.tsx  +0 -146
Too many changes to show. To preserve performance only 743 of 743+ files are displayed.
projects/web/src/pages/extract/components/pdf-upload/index.module.scss
deleted 100644 → 0 (view file @ 9487d33d)

.textBtn {
  background-image: none !important;
  background-clip: text;
  -webkit-background-clip: text;
  -webkit-text-fill-color: transparent;
  background: linear-gradient(111deg, #0D53DE -21.44%, #5246FF 102%) !important;
  background-clip: text !important;
  -webkit-background-clip: text !important;
  -webkit-text-fill-color: transparent !important;
  height: 1.5rem !important;
  font-weight: 600;
  height: 280px !important;
  width: 600px !important;
  overflow: hidden;
}

.uploadText {
  font-feature-settings: 'liga' off, 'clig' off;
  font-family: "PingFang SC";
  font-size: 18px;
  font-style: normal;
  font-weight: 600;
  line-height: 24px; /* 133.333% */
  background: linear-gradient(107deg, #38A0FF -24.14%, #0D53DE 30.09%, #5246FF 86.61%);
  background-clip: text;
  -webkit-background-clip: text;
  -webkit-text-fill-color: transparent;
}

.uploadDescText {
  font-size: 13px;
  line-height: 20px;
  font-weight: 400;
  background: linear-gradient(107deg, rgba(18, 19, 22, 0.6) -24.14%, rgba(18, 19, 22, 0.6) 100.09%);
  background-clip: text;
  -webkit-background-clip: text;
  -webkit-text-fill-color: transparent;
  margin-bottom: 1rem;
  margin-top: 0.5rem;
}

.linearText {
  font-size: 13px;
  line-height: 20px;
  font-weight: 400;
  background: linear-gradient(107deg, rgba(18, 19, 22, 0.6) -24.14%, rgba(18, 19, 22, 0.6) 100.09%);
  background-clip: text;
  -webkit-background-clip: text;
  -webkit-text-fill-color: transparent;

  &-item {
    font-weight: 400;
    font-size: 13px;
    line-height: 20px;
    margin-right: 1rem;
    background: linear-gradient(107deg, #38A0FF -24.14%, #0D53DE 30.09%, #5246FF 86.61%);
    background-clip: text;
    -webkit-background-clip: text;
    -webkit-text-fill-color: transparent;

    &:hover {
      background: #3477EB;
      background: linear-gradient(107deg, #3477EB -24.14%, #3477EB 100.09%);
      background-clip: text;
      -webkit-background-clip: text;
      -webkit-text-fill-color: transparent;
    }
  }
}

.uploadSection {
  border-radius: 12px;
  border: 1px dashed var(---Brand1-6, #0D53DE);
  background: linear-gradient(180deg, rgba(92, 147, 255, 0.10) -130.23%, rgba(255, 255, 255, 1) 83.57%);
  display: flex;
  flex-direction: column;
  justify-content: center;
  align-items: center;
  filter: blur(0px);
  height: 280px !important;
  width: 600px !important;

  &:hover {
    background: linear-gradient(180deg, rgb(245, 248, 255) -130.23%, rgb(245, 248, 255) 83.57%);
  }
}

.item {
  border-radius: 12px;
  border: 1px solid rgba(198, 217, 255, 0.20);
  background: linear-gradient(155deg, rgba(92, 147, 255, 0.10) -13.23%, rgba(255, 255, 255, 0.00) 83.57%);
  filter: blur(0px);
  padding: 42px 20px;
}

.customPopover {
  :global {
    .ant-popover-content,
    .ant-popover-inner {
      border-radius: 12px !important;
      overflow: hidden;
      box-shadow: 0px 8px 26px 0px rgba(0, 0, 0, 0.12);
    }

    .ant-popover-inner-content {
      padding: 12px !important;
    }

    .ant-popover-arrow {
      display: none !important;
    }
  }
}
projects/web/src/pages/extract/components/pdf-upload/index.tsx
deleted 100644 → 0 (view file @ 9487d33d)

import UploadBg from "@/assets/imgs/online.experience/file-upload-bg.svg";
import style from "./index.module.scss";
import { ExtractorUploadButton } from "../pdf-upload-button";
import { useNavigate } from "react-router-dom";
import cls from "classnames";
import { SubmitRes } from "@/api/extract";
import { Checkbox, Popover } from "antd";
import { useIntl } from "react-intl";
import IconFont from "@/components/icon-font";
import { ADD_TASK_LIST } from "@/constant/event";
import { useState } from "react";

const PdfUpload = () => {
  const navigate = useNavigate();
  const { formatMessage } = useIntl();
  const [checked, setChecked] = useState(false);

  const afterUploadSuccess = (data: SubmitRes) => {
    navigate(`/OpenSourceTools/Extractor/PDF/${data?.id}`);
    setTimeout(() => {
      document.dispatchEvent(
        new CustomEvent(ADD_TASK_LIST, {
          detail: data,
        })
      );
    }, 10);
  };

  const afterAsyncCheck = async () => {
    return Promise.resolve(true);
  };

  return (
    <div className="w-full h-full flex flex-col relative items-center relative">
      <div className="w-full h-full flex flex-col relative justify-center items-center translate-y-[-60px] z-0">
        <div className="mb-6 text-[1.5rem] text-[#121316] font-semibold">
          {formatMessage({ id: "extractor.pdf.title" })}
        </div>
        <div className="mb-12 text-[1.25rem] text-center text-[#121316]/[0.8] leading-[1.5rem] max-w-[48rem]">
          {formatMessage({ id: "extractor.pdf.subTitle" })}
        </div>
        <ExtractorUploadButton
          accept=".pdf"
          taskType="pdf"
          afterUploadSuccess={afterUploadSuccess}
          afterAsyncCheck={afterAsyncCheck}
          extractType={"pdf"}
          isOcr={checked}
          text={
            <div
              className={cls(
                style.uploadSection,
                "border-[1px] border-dashed border-[#0D53DE] rounded-xl flex flex-col items-center justify-center"
              )}
            >
              <img src={UploadBg} className="mb-4" />
              <span
                className={cls(style.uploadText, "text-[18px] leading-[20px]")}
              >
                {formatMessage({ id: "extractor.common.upload" })}
              </span>
              <span
                className={cls(style.uploadDescText, "!mb-0 flex items-center")}
                onClick={(e) => {
                  e.preventDefault();
                  e.stopPropagation();
                }}
              >
                <Checkbox
                  className="mr-1"
                  checked={checked}
                  onClick={() => setChecked(!checked)}
                />
                {formatMessage({ id: "extractor.pdf.ocr" })}
                <Popover
                  content={
                    <div className="max-w-[20rem]">
                      {formatMessage({
                        id: "extractor.pdf.ocr.popover",
                      })}
                    </div>
                  }
                  placement="right"
                  showArrow={false}
                  overlayClassName={style.customPopover}
                >
                  <IconFont
                    type="icon-QuestionCircleOutlined"
                    className="text-[#121316]/[0.6] ml-1 text-[16px] hover:text-[#0D53DE]"
                  />
                </Popover>
              </span>
              {/* <span className={cls(style.uploadDescText)}>
                {formatMessage({ id: "extractor.common.pdf.upload.tip" })}
              </span> */}
            </div>
          }
          className={style.textBtn}
          showIcon={false}
        />
      </div>
      <div className="absolute bottom-[1.5rem] text-[13px] text-[#121316]/[0.35] text-center leading-[20px] max-w-[64rem]">
        {formatMessage({
          id: "extractor.law",
        })}
      </div>
    </div>
  );
};

export default PdfUpload;
projects/web/src/pages/extract/components/pdf-viewer/index.tsx
deleted 100644 → 0 (view file @ 9487d33d)

import { MD_DRIVE_PDF } from "@/constant/event";
import { message } from "antd";
import { TaskIdProgress, TaskIdResItem } from "@/api/extract";
import React, { useEffect, useRef } from "react";
import { useLatest } from "ahooks";
import {
  DEFAULT_COLOR_SECTION,
  PDF_COLOR_PICKER,
} from "@/constant/pdf-color-picker";

interface PDFViewerState {
  page: number;
}

interface Bbox {
  type: "title" | "text" | "discarded" | "image";
  bbox: [number, number, number, number];
  color: any;
}

interface ExtractLayerItem {
  preproc_blocks: Bbox[];
  page_idx: number;
  page_size: [number, number];
  discarded_blocks: Bbox[];
}

// func
const formatJson = (layerList: ExtractLayerItem[]) => {
  return layerList?.map((i) => {
    let bboxes = [] as { type: string; bbox: number[]; color: any }[];
    i?.preproc_blocks?.forEach((item) => {
      bboxes.push({
        type: item.type,
        bbox: item.bbox,
        color: PDF_COLOR_PICKER?.[item.type] || DEFAULT_COLOR_SECTION,
      });
    });
    i?.discarded_blocks?.forEach((item) => {
      bboxes.push({
        type: item.type,
        bbox: item.bbox,
        color: PDF_COLOR_PICKER?.[item.type] || DEFAULT_COLOR_SECTION,
      });
    });
    return {
      ...i,
      bboxes,
    };
  });
};

const PDFViewer = ({
  taskInfo,
  onChange,
}: {
  taskInfo: TaskIdProgress & TaskIdResItem;
  onChange: (state: PDFViewerState) => void;
}) => {
  const iframeRef = useRef<HTMLIFrameElement>(null);
  const _layerData = useLatest(taskInfo?.content);
  const pdfUrl = taskInfo?.url;

  const sendMessageToIframe = (type: string, message: any) => {
    if (iframeRef.current) {
      iframeRef.current.contentWindow?.postMessage(
        {
          type,
          data: message,
        },
        import.meta.env.BASE_URL || "*"
      );
    }
  };

  useEffect(() => {
    const handleMessage = (event: MessageEvent) => {
      if (event?.data?.pageNum) {
        const num = event?.data?.pageNum || 1;
        sendMessageToIframe("pageChange", num);
      }
      if (event?.data?.pageNumDetail) {
        const pageNumDetail = event?.data?.pageNumDetail || 1;
        onChange?.({
          page: pageNumDetail,
        });
        sendMessageToIframe("pageNumDetail", pageNumDetail);
      }
      if (event?.data?.status) {
        const status = event?.data?.status;
        if (status === "loaded") {
          sendMessageToIframe(
            "initExtractLayerData",
            formatJson(_layerData?.current as any)
          );
          sendMessageToIframe("title", "");
        }
      }
      if (event?.data?.error) {
        message?.error(event?.data?.error);
      }
    };

    window.addEventListener("message", handleMessage);
    return () => {
      window.removeEventListener("message", handleMessage);
    };
  }, []);

  useEffect(() => {
    const handlePageChange = ({ detail }: CustomEvent) => {
      sendMessageToIframe("setPage", detail + 1);
    };
    document.addEventListener(MD_DRIVE_PDF, handlePageChange as EventListener);
    return () => {
      document.removeEventListener(
        MD_DRIVE_PDF,
        handlePageChange as EventListener
      );
    };
  }, []);

  return (
    <>
      {pdfUrl ? (
        <iframe
          ref={iframeRef}
          className="w-full border-0 h-full"
          src={`${
            import.meta.env.BASE_URL
          }pdfjs-dist/web/viewer.html?file=${encodeURIComponent(pdfUrl)}`}
        ></iframe>
      ) : null}
    </>
  );
};

export default PDFViewer;
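
Note on formatJson in the deleted pdf-viewer/index.tsx: for each page it merges preproc_blocks and discarded_blocks into a single bboxes list and attaches a color looked up by block type, falling back to a default. The following is a minimal standalone sketch of that idea, not the project's code. The names flattenBlocks, COLOR_BY_TYPE and DEFAULT_COLOR and the color values are hypothetical stand-ins, since the real PDF_COLOR_PICKER and DEFAULT_COLOR_SECTION values are not shown in this diff. The sketch also uses `??` where the original uses `||`.

```typescript
// Hypothetical stand-ins for the constants imported from
// "@/constant/pdf-color-picker"; the real values are not part of this diff.
type Color = string;
const DEFAULT_COLOR: Color = "#cccccc";
const COLOR_BY_TYPE: Record<string, Color> = {
  title: "#ff0000",
  text: "#00ff00",
};

interface Block {
  type: string;
  bbox: [number, number, number, number];
}

interface PageLayer {
  page_idx: number;
  preproc_blocks?: Block[];
  discarded_blocks?: Block[];
}

interface ColoredBox extends Block {
  color: Color;
}

// For each page, concatenate both block lists (kept blocks first, then
// discarded ones) and attach a color chosen by block type.
function flattenBlocks(
  pages: PageLayer[]
): (PageLayer & { bboxes: ColoredBox[] })[] {
  return pages.map((page) => {
    const blocks = [
      ...(page.preproc_blocks ?? []),
      ...(page.discarded_blocks ?? []),
    ];
    const bboxes: ColoredBox[] = blocks.map((b) => ({
      type: b.type,
      bbox: b.bbox,
      color: COLOR_BY_TYPE[b.type] ?? DEFAULT_COLOR,
    }));
    return { ...page, bboxes };
  });
}
```

Pages with no blocks yield an empty bboxes array, and the original page fields are preserved through the spread.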