ModelZoo / donut_pytorch
Commit f174bb02, authored Aug 04, 2022 by Geewook Kim
feat: loads the pre-trained weight from the official branch, related to #10
parent 7e451193
Showing 2 changed files with 7 additions and 7 deletions
README.md | +6 -6
donut/model.py | +1 -1
README.md
@@ -35,14 +35,14 @@ Gradio web demos are available! [
-| [CORD](https://github.com/clovaai/cord) (Document Parsing) | 0.7 / <br> 0.7 / <br> 1.2 | 93.9 / <br> 93.6 / <br> 93.5 | [donut-base-finetuned-cord-v2](https://huggingface.co/naver-clova-ix/donut-base-finetuned-cord-v2) (1280) / <br> [donut-base-finetuned-cord-v1](https://huggingface.co/naver-clova-ix/donut-base-finetuned-cord-v1) (1280) / <br> [donut-base-finetuned-cord-v1-2560](https://huggingface.co/naver-clova-ix/donut-base-finetuned-cord-v1-2560) | [gradio space web demo](https://huggingface.co/spaces/naver-clova-ix/donut-base-finetuned-cord-v2), <br> [google colab demo](https://colab.research.google.com/drive/1o07hty-3OQTvGnc_7lgQFLvvKQuLjqiw?usp=sharing) |
+| [CORD](https://github.com/clovaai/cord) (Document Parsing) | 0.7 / <br> 0.7 / <br> 1.2 | 93.9 / <br> 93.6 / <br> 93.5 | [donut-base-finetuned-cord-v2](https://huggingface.co/naver-clova-ix/donut-base-finetuned-cord-v2/tree/official) (1280) / <br> [donut-base-finetuned-cord-v1](https://huggingface.co/naver-clova-ix/donut-base-finetuned-cord-v1/tree/official) (1280) / <br> [donut-base-finetuned-cord-v1-2560](https://huggingface.co/naver-clova-ix/donut-base-finetuned-cord-v1-2560/tree/official) | [gradio space web demo](https://huggingface.co/spaces/naver-clova-ix/donut-base-finetuned-cord-v2), <br> [google colab demo](https://colab.research.google.com/drive/1o07hty-3OQTvGnc_7lgQFLvvKQuLjqiw?usp=sharing) |
-| [Train Ticket](https://github.com/beacandler/EATEN) (Document Parsing) | 0.6 | 98.8 | [donut-base-finetuned-zhtrainticket](https://huggingface.co/naver-clova-ix/donut-base-finetuned-zhtrainticket) | [google colab demo](https://colab.research.google.com/drive/16O-hMvGiXrYZnlXA_tfJ9_q760YcLoOj?usp=sharing) |
+| [Train Ticket](https://github.com/beacandler/EATEN) (Document Parsing) | 0.6 | 98.8 | [donut-base-finetuned-zhtrainticket](https://huggingface.co/naver-clova-ix/donut-base-finetuned-zhtrainticket/tree/official) | [google colab demo](https://colab.research.google.com/drive/16O-hMvGiXrYZnlXA_tfJ9_q760YcLoOj?usp=sharing) |
-| [RVL-CDIP](https://www.cs.cmu.edu/~aharley/rvl-cdip) (Document Classification) | 0.75 | 95.3 | [donut-base-finetuned-rvlcdip](https://huggingface.co/naver-clova-ix/donut-base-finetuned-rvlcdip) | [google colab demo](https://colab.research.google.com/drive/1xUDmLqlthx8A8rWKLMSLThZ7oeRJkDuU?usp=sharing) |
+| [RVL-CDIP](https://www.cs.cmu.edu/~aharley/rvl-cdip) (Document Classification) | 0.75 | 95.3 | [donut-base-finetuned-rvlcdip](https://huggingface.co/naver-clova-ix/donut-base-finetuned-rvlcdip/tree/official) | [google colab demo](https://colab.research.google.com/drive/1xUDmLqlthx8A8rWKLMSLThZ7oeRJkDuU?usp=sharing) |
-| [DocVQA Task1](https://rrc.cvc.uab.es/?ch=17) (Document VQA) | 0.78 | 67.5 | [donut-base-finetuned-docvqa](https://huggingface.co/naver-clova-ix/donut-base-finetuned-docvqa) | [google colab demo](https://colab.research.google.com/drive/1Z4WG8Wunj3HE0CERjt608ALSgSzRC9ig?usp=sharing) |
+| [DocVQA Task1](https://rrc.cvc.uab.es/?ch=17) (Document VQA) | 0.78 | 67.5 | [donut-base-finetuned-docvqa](https://huggingface.co/naver-clova-ix/donut-base-finetuned-docvqa/tree/official) | [google colab demo](https://colab.research.google.com/drive/1Z4WG8Wunj3HE0CERjt608ALSgSzRC9ig?usp=sharing) |
 The links to the pre-trained backbones are here:
-- [`donut-base`](https://huggingface.co/naver-clova-ix/donut-base): trained with 64 A100 GPUs (~2.5 days), number of layers (encoder: {2,2,14,2}, decoder: 4), input size 2560x1920, swin window size 10, IIT-CDIP (11M) and SynthDoG (ECJK, 0.5M x 4).
+- [`donut-base`](https://huggingface.co/naver-clova-ix/donut-base/tree/official): trained with 64 A100 GPUs (~2.5 days), number of layers (encoder: {2,2,14,2}, decoder: 4), input size 2560x1920, swin window size 10, IIT-CDIP (11M) and SynthDoG (ECJK, 0.5M x 4).
-- [`donut-proto`](https://huggingface.co/naver-clova-ix/donut-proto): (preliminary model) trained with 8 V100 GPUs (~5 days), number of layers (encoder: {2,2,18,2}, decoder: 4), input size 2048x1536, swin window size 8, and SynthDoG (EJK, 0.4M x 3).
+- [`donut-proto`](https://huggingface.co/naver-clova-ix/donut-proto/tree/official): (preliminary model) trained with 8 V100 GPUs (~5 days), number of layers (encoder: {2,2,18,2}, decoder: 4), input size 2048x1536, swin window size 8, and SynthDoG (EJK, 0.4M x 3).
 Please see [our paper](#how-to-cite) for more details.
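Note on the README hunk: every changed Hugging Face model link gains the same `/tree/official` suffix, which points at the `official` branch of the model repository rather than the default one. A minimal sketch of that URL pattern follows; `hub_tree_url` and `HUB_ROOT` are hypothetical names used only for illustration and are not part of this repository.

```python
# Hypothetical helper, for illustration only: builds the web URL of a
# Hugging Face model repository's file tree at a given branch.
HUB_ROOT = "https://huggingface.co"


def hub_tree_url(repo_id: str, revision: str = "official") -> str:
    """Return the file-tree URL of `repo_id` at branch `revision`."""
    return f"{HUB_ROOT}/{repo_id}/tree/{revision}"


if __name__ == "__main__":
    # Matches the updated `donut-base` link in the README diff.
    print(hub_tree_url("naver-clova-ix/donut-base"))
```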
donut/model.py
@@ -592,7 +592,7 @@ class DonutModel(PreTrainedModel):
                 Name of a pretrained model name either registered in huggingface.co. or saved in local,
                 e.g., `naver-clova-ix/donut-base`, or `naver-clova-ix/donut-base-finetuned-rvlcdip`
         """
-        model = super(DonutModel, cls).from_pretrained(pretrained_model_name_or_path, *model_args, **kwargs)
+        model = super(DonutModel, cls).from_pretrained(pretrained_model_name_or_path, revision="official", *model_args, **kwargs)
         # truncate or interplolate position embeddings of donut decoder
         max_length = kwargs.get("max_length", model.config.max_position_embeddings)
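Note on the `donut/model.py` hunk: the new call passes `revision="official"` as a fixed keyword ahead of `*model_args` and `**kwargs`. In plain Python, that call shape still forwards positional arguments normally, but a caller who also supplies `revision` gets a `TypeError` for a duplicated keyword. The sketch below illustrates this with stand-in classes. `Base`, `Hardcoded`, and `Overridable` are hypothetical and do not reproduce `PreTrainedModel`; the `setdefault` variant is only one possible alternative, not something this commit does.

```python
class Base:
    """Stand-in for the parent class (assumption: not the real PreTrainedModel)."""

    @classmethod
    def from_pretrained(cls, name_or_path, *model_args, **kwargs):
        # Record what the parent would have received.
        return {"path": name_or_path, "args": model_args, "kwargs": kwargs}


class Hardcoded(Base):
    @classmethod
    def from_pretrained(cls, name_or_path, *model_args, **kwargs):
        # Same call shape as the changed line: the keyword is fixed at the call site.
        return super().from_pretrained(
            name_or_path, revision="official", *model_args, **kwargs
        )


class Overridable(Base):
    @classmethod
    def from_pretrained(cls, name_or_path, *model_args, **kwargs):
        # Hypothetical variant: "official" is only a default the caller may override.
        kwargs.setdefault("revision", "official")
        return super().from_pretrained(name_or_path, *model_args, **kwargs)


if __name__ == "__main__":
    print(Hardcoded.from_pretrained("some/model")["kwargs"])
    try:
        Hardcoded.from_pretrained("some/model", revision="main")
    except TypeError as err:
        print("TypeError:", err)
    print(Overridable.from_pretrained("some/model", revision="main")["kwargs"])
```

With the hard-coded form, every load resolves against the `official` branch, which matches the intent stated in the commit message. The trade-off is that callers cannot select another revision through `kwargs`.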