Unverified Commit 26769fed authored by Geewook Kim's avatar Geewook Kim Committed by GitHub
Browse files

Update README.md

parent 9451b1c9
......@@ -40,8 +40,8 @@ Gradio web demos are available! [![Demo](https://img.shields.io/badge/Demo-Gradi
| [DocVQA Task1](https://rrc.cvc.uab.es/?ch=17) (Document VQA) | 0.78 | 67.5 | [donut-base-finetuned-docvqa](https://huggingface.co/naver-clova-ix/donut-base-finetuned-docvqa) | [google colab demo](https://colab.research.google.com/drive/1Z4WG8Wunj3HE0CERjt608ALSgSzRC9ig?usp=sharing) |
The links to the pre-trained backbones are here:
- [`donut-base`](https://huggingface.co/naver-clova-ix/donut-base): trained with 64 A100 GPUs, number of layers (encoder: {2,2,14,2}, decoder: 4), input size 2560x1920, swin window size 10, IIT-CDIP (11M) and SynthDoG (ECJK, 0.5M x 4).
- [`donut-proto`](https://huggingface.co/naver-clova-ix/donut-proto): (preliminary model) trained with 8 V100 GPUs, number of layers (encoder: {2,2,18,2}, decoder: 4), input size 2048x1536, swin window size 8, and SynthDoG (EJK, 0.4M x 3).
- [`donut-base`](https://huggingface.co/naver-clova-ix/donut-base): trained with 64 A100 GPUs (~2.5 days), number of layers (encoder: {2,2,14,2}, decoder: 4), input size 2560x1920, swin window size 10, IIT-CDIP (11M) and SynthDoG (ECJK, 0.5M x 4).
- [`donut-proto`](https://huggingface.co/naver-clova-ix/donut-proto): (preliminary model) trained with 8 V100 GPUs (~5 days), number of layers (encoder: {2,2,18,2}, decoder: 4), input size 2048x1536, swin window size 8, and SynthDoG (EJK, 0.4M x 3).
Please see [our paper](#how-to-cite) for more details.
......
Markdown is supported
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment