Commit d3f759b3 authored by Geewook Kim

docs: enhance comments

parent 0353dbf8
@@ -41,8 +41,8 @@ Gradio web demos are available! [![Demo](https://img.shields.io/badge/Demo-Gradi
| [DocVQA Task1](https://rrc.cvc.uab.es/?ch=17) (Document VQA) | 0.78 | 67.5 | [donut-base-finetuned-docvqa](https://huggingface.co/naver-clova-ix/donut-base-finetuned-docvqa/tree/official) | [gradio space web demo](https://huggingface.co/spaces/nielsr/donut-docvqa),<br>[google colab demo](https://colab.research.google.com/drive/1Z4WG8Wunj3HE0CERjt608ALSgSzRC9ig?usp=sharing) |
The links to the pre-trained backbones are here:
- [`donut-base`](https://huggingface.co/naver-clova-ix/donut-base/tree/official): trained with 64 A100 GPUs (~2.5 days), number of layers (encoder: {2,2,14,2}, decoder: 4), input size 2560x1920, swin window size 10, IIT-CDIP (11M) and SynthDoG (English, Chinese, Japanese, Korean, 0.5M x 4).
- [`donut-proto`](https://huggingface.co/naver-clova-ix/donut-proto/tree/official): (preliminary model) trained with 8 V100 GPUs (~5 days), number of layers (encoder: {2,2,18,2}, decoder: 4), input size 2048x1536, swin window size 8, and SynthDoG (English, Japanese, Korean, 0.4M x 3).
Please see [our paper](#how-to-cite) for more details.
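The input size and Swin window size listed for each backbone have to be compatible: after patch embedding and the 2x downsampling between stages, the feature map at every stage should be divisible by the window size. A quick sketch of that check (the patch size of 4 and the four-stage layout are assumptions from standard Swin, not stated in this diff):

```python
def stage_resolutions(input_hw, patch_size=4, num_stages=4):
    """Feature-map size at each Swin stage, assuming 2x downsampling between stages."""
    h, w = input_hw
    h, w = h // patch_size, w // patch_size  # patch embedding
    res = []
    for _ in range(num_stages):
        res.append((h, w))
        h, w = h // 2, w // 2  # patch merging between stages
    return res

# donut-base: input 2560x1920, window size 10
for h, w in stage_resolutions((2560, 1920)):
    assert h % 10 == 0 and w % 10 == 0, (h, w)

# donut-proto: input 2048x1536, window size 8
for h, w in stage_resolutions((2048, 1536)):
    assert h % 8 == 0 and w % 8 == 0, (h, w)
```

Both configurations pass, which is consistent with the input sizes and window sizes quoted above being chosen as a matched pair.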
...
@@ -8,7 +8,8 @@ val_batch_sizes: [4]
input_size: [2560, 1920]
max_length: 128
align_long_axis: False
# num_nodes: 8 # memo: donut-base-finetuned-docvqa was trained with 8 nodes
num_nodes: 1
seed: 2022
lr: 3e-5
warmup_steps: 10000
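Dropping `num_nodes` from 8 to 1 shrinks the global batch size by 8x, so the learning rate above may warrant rescaling if the per-device batch size is kept fixed. A hedged sketch of the usual linear-scaling rule (the per-GPU batch size of 4 and 8 GPUs per node are illustrative assumptions, not values from this config):

```python
def global_batch_size(per_gpu_batch, gpus_per_node, num_nodes):
    # total samples per optimizer step across all devices
    return per_gpu_batch * gpus_per_node * num_nodes

def scaled_lr(base_lr, base_global, new_global):
    # linear scaling rule: lr proportional to global batch size
    return base_lr * new_global / base_global

base = global_batch_size(4, 8, 8)  # hypothetical original 8-node setup -> 256
new = global_batch_size(4, 8, 1)   # single-node setup -> 32
print(scaled_lr(3e-5, base, new))  # 3.75e-06
```

Whether to rescale is a judgment call; the memo comment only records the original node count, not a recommended single-node learning rate.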
...
@@ -8,7 +8,8 @@ val_batch_sizes: [4]
input_size: [2560, 1920]
max_length: 8
align_long_axis: False
# num_nodes: 8 # memo: donut-base-finetuned-rvlcdip was trained with 8 nodes
num_nodes: 1
seed: 2022
lr: 2e-5
warmup_steps: 10000
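The `# num_nodes: 8` memo line is an ordinary YAML comment, so only the active `num_nodes: 1` reaches the trainer. A minimal sketch of that behavior using a hand-rolled comment-stripping parse (this is a simplification for illustration, not the YAML loader the repo actually uses):

```python
def effective_config(lines):
    """Drop comments and blanks; keep flat key: value pairs as strings."""
    out = {}
    for line in lines:
        line = line.split("#", 1)[0].strip()  # everything after '#' is a comment
        if ":" in line:
            key, value = line.split(":", 1)
            out[key.strip()] = value.strip()
    return out

cfg = effective_config([
    "# num_nodes: 8 # memo: donut-base-finetuned-rvlcdip was trained with 8 nodes",
    "num_nodes: 1",
    "seed: 2022",
    "lr: 2e-5",
])
# cfg == {"num_nodes": "1", "seed": "2022", "lr": "2e-5"}
```

The commented line is dropped entirely, so the memo documents the original training setup without affecting the parsed configuration.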
...