ModelZoo / donut_pytorch
Commit d3f759b3, authored Aug 17, 2022 by Geewook Kim
docs: enhance comments
Parent: 0353dbf8

Changes: 3 changed files with 6 additions and 4 deletions
README.md (+2 -2)
config/train_docvqa.yaml (+2 -1)
config/train_rvlcdip.yaml (+2 -1)
README.md @ d3f759b3

@@ -41,8 +41,8 @@ Gradio web demos are available! [
 | [DocVQA Task1](https://rrc.cvc.uab.es/?ch=17) (Document VQA) | 0.78 | 67.5 | [donut-base-finetuned-docvqa](https://huggingface.co/naver-clova-ix/donut-base-finetuned-docvqa/tree/official) | [gradio space web demo](https://huggingface.co/spaces/nielsr/donut-docvqa), <br> [google colab demo](https://colab.research.google.com/drive/1Z4WG8Wunj3HE0CERjt608ALSgSzRC9ig?usp=sharing) |
 The links to the pre-trained backbones are here:
-- [`donut-base`](https://huggingface.co/naver-clova-ix/donut-base/tree/official): trained with 64 A100 GPUs (~2.5 days), number of layers (encoder: {2,2,14,2}, decoder: 4), input size 2560x1920, swin window size 10, IIT-CDIP (11M) and SynthDoG (ECJK, 0.5M x 4).
+- [`donut-base`](https://huggingface.co/naver-clova-ix/donut-base/tree/official): trained with 64 A100 GPUs (~2.5 days), number of layers (encoder: {2,2,14,2}, decoder: 4), input size 2560x1920, swin window size 10, IIT-CDIP (11M) and SynthDoG (English, Chinese, Japanese, Korean, 0.5M x 4).
-- [`donut-proto`](https://huggingface.co/naver-clova-ix/donut-proto/tree/official): (preliminary model) trained with 8 V100 GPUs (~5 days), number of layers (encoder: {2,2,18,2}, decoder: 4), input size 2048x1536, swin window size 8, and SynthDoG (EJK, 0.4M x 3).
+- [`donut-proto`](https://huggingface.co/naver-clova-ix/donut-proto/tree/official): (preliminary model) trained with 8 V100 GPUs (~5 days), number of layers (encoder: {2,2,18,2}, decoder: 4), input size 2048x1536, swin window size 8, and SynthDoG (English, Japanese, Korean, 0.4M x 3).
 Please see [our paper](#how-to-cite) for more details.
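The backbone descriptions in the README pair each input size with a Swin window size (2560x1920 with 10, 2048x1536 with 8). The sketch below is a quick sanity check that each window size evenly tiles the token grid at every encoder stage. It assumes a standard Swin layout (4x4 patch embedding and 2x patch merging between four stages), which the README text does not state. The names `stage_grids` and `window_fits` are hypothetical helpers, not part of the Donut code base, and this is not a claim about how the authors chose these values.

```python
# Assumption: standard Swin layout, i.e. 4x4 patch embedding followed by
# 2x patch merging between each of four stages. Not stated in the README.
PATCH_SIZE = 4
NUM_STAGES = 4


def stage_grids(input_size, patch_size=PATCH_SIZE, num_stages=NUM_STAGES):
    """Return the token grid (dim0, dim1) at each encoder stage."""
    dim0, dim1 = input_size
    grids = []
    for stage in range(num_stages):
        stride = patch_size * 2 ** stage  # 4, 8, 16, 32 under the assumption
        grids.append((dim0 // stride, dim1 // stride))
    return grids


def window_fits(input_size, window_size):
    """True if every stage grid splits evenly into square windows."""
    return all(
        a % window_size == 0 and b % window_size == 0
        for a, b in stage_grids(input_size)
    )


# Values taken from the README lines in the diff above.
backbones = {
    "donut-base": {"input_size": (2560, 1920), "window_size": 10},
    "donut-proto": {"input_size": (2048, 1536), "window_size": 8},
}

for name, cfg in backbones.items():
    print(name, stage_grids(cfg["input_size"])[-1], window_fits(**cfg))
```

Under these assumptions the last-stage grids are 80x60 and 64x48, and both window sizes divide every stage grid evenly. The check is symmetric in the two dimensions, so it does not depend on whether the README lists height or width first.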
config/train_docvqa.yaml @ d3f759b3

@@ -8,7 +8,8 @@ val_batch_sizes: [4]
 input_size: [2560, 1920]
 max_length: 128
 align_long_axis: False
-num_nodes: 8
+# num_nodes: 8 # memo: donut-base-finetuned-docvqa was trained with 8 nodes
+num_nodes: 1
 seed: 2022
 lr: 3e-5
 warmup_steps: 10000
config/train_rvlcdip.yaml @ d3f759b3

@@ -8,7 +8,8 @@ val_batch_sizes: [4]
 input_size: [2560, 1920]
 max_length: 8
 align_long_axis: False
-num_nodes: 8
+# num_nodes: 8 # memo: donut-base-finetuned-rvlcdip was trained with 8 nodes
+num_nodes: 1
 seed: 2022
 lr: 2e-5
 warmup_steps: 10000
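Both config changes lower the default `num_nodes` from 8 to 1 and keep the original value as a memo. One reason the memo may matter is that, in data-parallel training, the number of nodes scales the number of samples processed per optimizer step. The sketch below illustrates that arithmetic only. `GPUS_PER_NODE` and `PER_GPU_BATCH` are hypothetical values chosen for illustration; neither appears in the diff, and the helper names are not from the Donut code base.

```python
def world_size(num_nodes, gpus_per_node):
    """Total number of data-parallel workers across all nodes."""
    return num_nodes * gpus_per_node


def global_batch_size(per_gpu_batch, num_nodes, gpus_per_node):
    """Samples consumed per optimizer step, ignoring gradient accumulation."""
    return per_gpu_batch * world_size(num_nodes, gpus_per_node)


GPUS_PER_NODE = 8  # hypothetical: not given in the configs shown above
PER_GPU_BATCH = 1  # hypothetical: the training batch size is outside the diff

# Compare the memo value (8 nodes) with the new default (1 node).
for num_nodes in (8, 1):
    total = global_batch_size(PER_GPU_BATCH, num_nodes, GPUS_PER_NODE)
    print(f"num_nodes={num_nodes}: global batch size {total}")
```

With these illustrative numbers, a single-node run sees one eighth as many samples per step as the 8-node setup, so results from the new default may not match the published fine-tuned checkpoints unless other settings are adjusted.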