[model_cards] Migrate cards from this repo to model repos on huggingface.co (#9013)

* rm all model cards

* Update the .rst

@sgugger it is still not super crystal clear/streamlined so let me know if any ideas to make it simpler

* Add a rootlevel README.md with simple instructions/context

* Update docs/source/model_sharing.rst

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* make style

* rm all model cards

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

Commit 3552d0e0 · authored Dec 12, 2020 by Julien Chaumond · committed by GitHub Dec 11, 2020
20 changed files
--- a/model_cards/illuin/camembert-base-fquad/README.md
+++ b/model_cards/illuin/camembert-base-fquad/README.md
----
-language: fr
-tags:
-- question-answering
-- camembert
-license: gpl-3.0
-datasets:
-- fquad
----
-
-# camembert-base-fquad
-
-## Description
-
-A native French Question Answering model [CamemBERT-base](https://camembert-model.fr/) fine-tuned on [FQuAD](https://fquad.illuin.tech/).
-
-## Evaluation results
-
-On the development set.
-
-```shell
-{"f1": 88.1, "exact_match": 78.1}
-```
-
-On the test set.
-
-```shell
-{"f1": 88.3, "exact_match": 78.0}
-```
-
-## Usage
-
-```python
-from transformers import pipeline
-
-nlp = pipeline('question-answering', model='illuin/camembert-base-fquad', tokenizer='illuin/camembert-base-fquad')
-
-nlp({
-    'question': "Qui est Claude Monet?",
-    'context': "Claude Monet, né le 14 novembre 1840 à Paris et mort le 5 décembre 1926 à Giverny, est un peintre français et l’un des fondateurs de l'impressionnisme."
-})
-```
-
-## Citation
-
-If you use our work, please cite:
-
-```bibtex
-@article{dHoffschmidt2020FQuADFQ,
-  title={FQuAD: French Question Answering Dataset},
-  author={Martin d'Hoffschmidt and Maxime Vidal and Wacim Belblidia and Tom Brendl'e and Quentin Heinrich},
-  journal={ArXiv},
-  year={2020},
-  volume={abs/2002.06071}
-}
-```
--- a/model_cards/illuin/camembert-large-fquad/README.md
+++ b/model_cards/illuin/camembert-large-fquad/README.md
----
-language: fr
-tags:
-- question-answering
-- camembert
-license: gpl-3.0
-datasets:
-- fquad
----
-
-# camembert-large-fquad
-
-## Description
-
-A native French Question Answering model [CamemBERT-large](https://camembert-model.fr/) fine-tuned on [FQuAD](https://fquad.illuin.tech/).
-
-## FQuAD Leaderboard and evaluation scores
-
-The results of Camembert-large-fquad can be compared with other state-of-the-art models of the [FQuAD Leaderboard](https://illuin-tech.github.io/FQuAD-explorer/).
-
-On the test set the model scores,
-
-```shell
-{"f1": 91.5, "exact_match": 82.0}
-```
-
-On the development set the model scores,
-
-```shell
-{"f1": 91.0, "exact_match": 81.2}
-```
-
-Note : You can also explore the results of the model on [FQuAD-Explorer](https://illuin-tech.github.io/FQuAD-explorer/) !
-
-## Usage
-
-```python
-from transformers import pipeline
-
-nlp = pipeline('question-answering', model='illuin/camembert-large-fquad', tokenizer='illuin/camembert-large-fquad')
-
-nlp({
-    'question': "Qui est Claude Monet?",
-    'context': "Claude Monet, né le 14 novembre 1840 à Paris et mort le 5 décembre 1926 à Giverny, est un peintre français et l’un des fondateurs de l'impressionnisme."
-})
-```
-
-## Citation
-
-If you use our work, please cite:
-
-```bibtex
-@article{dHoffschmidt2020FQuADFQ,
-  title={FQuAD: French Question Answering Dataset},
-  author={Martin d'Hoffschmidt and Maxime Vidal and Wacim Belblidia and Tom Brendl'e and Quentin Heinrich},
-  journal={ArXiv},
-  year={2020},
-  volume={abs/2002.06071}
-}
-```
--- a/model_cards/illuin/lepetit/README.md
+++ b/model_cards/illuin/lepetit/README.md
----
-language: fr
-thumbnail: https://miro.medium.com/max/700/1*MoPnD6vA9wTHjdLfW7POyw.png
-widget:
-- text: "Le camembert LePetit c'est le <mask>."
-- text: "Salut les <mask> ça va ?"
-license: gpl-3.0
-tags:
-- masked-lm
----
-
-# LePetit: A pre-training efficient and lightning fast French Language Model
-
-See [blogpost](https://medium.com/illuin/lepetit-a-pre-training-efficient-and-lightning-fast-french-language-model-96495ad726b3)
-
--- a/model_cards/indobenchmark/indobert-base-p1/README.md
+++ b/model_cards/indobenchmark/indobert-base-p1/README.md
----
-language: id
-tags:
-- indobert
-- indobenchmark
-- indonlu
-license: mit
-inference: false
-datasets:
-- Indo4B
----
-
-# IndoBERT Base Model (phase1 - uncased)
-
-[IndoBERT](https://arxiv.org/abs/2009.05387) is a state-of-the-art language model for Indonesian based on the BERT model. The pretrained model is trained using a masked language modeling (MLM) objective and next sentence prediction (NSP) objective. 
-
-## All Pre-trained Models
-
-| Model                          | #params                        | Arch. | Training data                     |
-|--------------------------------|--------------------------------|-------|-----------------------------------|
-| `indobenchmark/indobert-base-p1` | 124.5M   | Base  | Indo4B (23.43 GB of text)            |
-| `indobenchmark/indobert-base-p2` | 124.5M   | Base  | Indo4B (23.43 GB of text)            |
-| `indobenchmark/indobert-large-p1` | 335.2M   | Large  | Indo4B (23.43 GB of text)            |
-| `indobenchmark/indobert-large-p2` | 335.2M   | Large  | Indo4B (23.43 GB of text)            |
-| `indobenchmark/indobert-lite-base-p1` | 11.7M   | Base  | Indo4B (23.43 GB of text)            |
-| `indobenchmark/indobert-lite-base-p2` | 11.7M   | Base  | Indo4B (23.43 GB of text)            |
-| `indobenchmark/indobert-lite-large-p1` | 17.7M   | Large  | Indo4B (23.43 GB of text)            |
-| `indobenchmark/indobert-lite-large-p2` | 17.7M   | Large  | Indo4B (23.43 GB of text)            |
-
-## How to use
-
-### Load model and tokenizer
-```python
-from transformers import BertTokenizer, AutoModel
-tokenizer = BertTokenizer.from_pretrained("indobenchmark/indobert-base-p1")
-model = AutoModel.from_pretrained("indobenchmark/indobert-base-p1")
-```
-
-### Extract contextual representation
-```python
-x = torch.LongTensor(tokenizer.encode('aku adalah anak [MASK]')).view(1,-1)
-print(x, model(x)[0].sum())
-```
-
-## Authors 
-
-<b>IndoBERT</b> was trained and evaluated by Bryan Wilie\*, Karissa Vincentio\*, Genta Indra Winata\*, Samuel Cahyawijaya\*, Xiaohong Li, Zhi Yuan Lim, Sidik Soleman, Rahmad Mahendra, Pascale Fung, Syafri Bahar, Ayu Purwarianti.
-
-
-## Citation
-If you use our work, please cite:
-
-```bibtex
-@inproceedings{wilie2020indonlu,
-  title={IndoNLU: Benchmark and Resources for Evaluating Indonesian Natural Language Understanding},
-  author={Bryan Wilie and Karissa Vincentio and Genta Indra Winata and Samuel Cahyawijaya and X. Li and Zhi Yuan Lim and S. Soleman and R. Mahendra and Pascale Fung and Syafri Bahar and A. Purwarianti},
-  booktitle={Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing},
-  year={2020}
-}
-```
--- a/model_cards/indobenchmark/indobert-base-p2/README.md
+++ b/model_cards/indobenchmark/indobert-base-p2/README.md
----
-language: id
-tags:
-- indobert
-- indobenchmark
-- indonlu
-license: mit
-inference: false
-datasets:
-- Indo4B
----
-
-# IndoBERT Base Model (phase2 - uncased)
-
-[IndoBERT](https://arxiv.org/abs/2009.05387) is a state-of-the-art language model for Indonesian based on the BERT model. The pretrained model is trained using a masked language modeling (MLM) objective and next sentence prediction (NSP) objective. 
-
-## All Pre-trained Models
-
-| Model                          | #params                        | Arch. | Training data                     |
-|--------------------------------|--------------------------------|-------|-----------------------------------|
-| `indobenchmark/indobert-base-p1` | 124.5M   | Base  | Indo4B (23.43 GB of text)            |
-| `indobenchmark/indobert-base-p2` | 124.5M   | Base  | Indo4B (23.43 GB of text)            |
-| `indobenchmark/indobert-large-p1` | 335.2M   | Large  | Indo4B (23.43 GB of text)            |
-| `indobenchmark/indobert-large-p2` | 335.2M   | Large  | Indo4B (23.43 GB of text)            |
-| `indobenchmark/indobert-lite-base-p1` | 11.7M   | Base  | Indo4B (23.43 GB of text)            |
-| `indobenchmark/indobert-lite-base-p2` | 11.7M   | Base  | Indo4B (23.43 GB of text)            |
-| `indobenchmark/indobert-lite-large-p1` | 17.7M   | Large  | Indo4B (23.43 GB of text)            |
-| `indobenchmark/indobert-lite-large-p2` | 17.7M   | Large  | Indo4B (23.43 GB of text)            |
-
-## How to use
-
-### Load model and tokenizer
-```python
-from transformers import BertTokenizer, AutoModel
-tokenizer = BertTokenizer.from_pretrained("indobenchmark/indobert-base-p2")
-model = AutoModel.from_pretrained("indobenchmark/indobert-base-p2")
-```
-
-### Extract contextual representation
-```python
-x = torch.LongTensor(tokenizer.encode('aku adalah anak [MASK]')).view(1,-1)
-print(x, model(x)[0].sum())
-```
-
-## Authors 
-
-<b>IndoBERT</b> was trained and evaluated by Bryan Wilie\*, Karissa Vincentio\*, Genta Indra Winata\*, Samuel Cahyawijaya\*, Xiaohong Li, Zhi Yuan Lim, Sidik Soleman, Rahmad Mahendra, Pascale Fung, Syafri Bahar, Ayu Purwarianti.
-
-
-## Citation
-If you use our work, please cite:
-
-```bibtex
-@inproceedings{wilie2020indonlu,
-  title={IndoNLU: Benchmark and Resources for Evaluating Indonesian Natural Language Understanding},
-  author={Bryan Wilie and Karissa Vincentio and Genta Indra Winata and Samuel Cahyawijaya and X. Li and Zhi Yuan Lim and S. Soleman and R. Mahendra and Pascale Fung and Syafri Bahar and A. Purwarianti},
-  booktitle={Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing},
-  year={2020}
-}
-```
--- a/model_cards/indobenchmark/indobert-large-p1/README.md
+++ b/model_cards/indobenchmark/indobert-large-p1/README.md
----
-language: id
-tags:
-- indobert
-- indobenchmark
-- indonlu
-license: mit
-inference: false
-datasets:
-- Indo4B
----
-
-# IndoBERT Large Model (phase1 - uncased)
-
-[IndoBERT](https://arxiv.org/abs/2009.05387) is a state-of-the-art language model for Indonesian based on the BERT model. The pretrained model is trained using a masked language modeling (MLM) objective and next sentence prediction (NSP) objective. 
-
-## All Pre-trained Models
-
-| Model                          | #params                        | Arch. | Training data                     |
-|--------------------------------|--------------------------------|-------|-----------------------------------|
-| `indobenchmark/indobert-base-p1` | 124.5M   | Base  | Indo4B (23.43 GB of text)            |
-| `indobenchmark/indobert-base-p2` | 124.5M   | Base  | Indo4B (23.43 GB of text)            |
-| `indobenchmark/indobert-large-p1` | 335.2M   | Large  | Indo4B (23.43 GB of text)            |
-| `indobenchmark/indobert-large-p2` | 335.2M   | Large  | Indo4B (23.43 GB of text)            |
-| `indobenchmark/indobert-lite-base-p1` | 11.7M   | Base  | Indo4B (23.43 GB of text)            |
-| `indobenchmark/indobert-lite-base-p2` | 11.7M   | Base  | Indo4B (23.43 GB of text)            |
-| `indobenchmark/indobert-lite-large-p1` | 17.7M   | Large  | Indo4B (23.43 GB of text)            |
-| `indobenchmark/indobert-lite-large-p2` | 17.7M   | Large  | Indo4B (23.43 GB of text)            |
-
-## How to use
-
-### Load model and tokenizer
-```python
-from transformers import BertTokenizer, AutoModel
-tokenizer = BertTokenizer.from_pretrained("indobenchmark/indobert-large-p1")
-model = AutoModel.from_pretrained("indobenchmark/indobert-large-p1")
-```
-
-### Extract contextual representation
-```python
-x = torch.LongTensor(tokenizer.encode('aku adalah anak [MASK]')).view(1,-1)
-print(x, model(x)[0].sum())
-```
-
-## Authors 
-
-<b>IndoBERT</b> was trained and evaluated by Bryan Wilie\*, Karissa Vincentio\*, Genta Indra Winata\*, Samuel Cahyawijaya\*, Xiaohong Li, Zhi Yuan Lim, Sidik Soleman, Rahmad Mahendra, Pascale Fung, Syafri Bahar, Ayu Purwarianti.
-
-
-## Citation
-If you use our work, please cite:
-
-```bibtex
-@inproceedings{wilie2020indonlu,
-  title={IndoNLU: Benchmark and Resources for Evaluating Indonesian Natural Language Understanding},
-  author={Bryan Wilie and Karissa Vincentio and Genta Indra Winata and Samuel Cahyawijaya and X. Li and Zhi Yuan Lim and S. Soleman and R. Mahendra and Pascale Fung and Syafri Bahar and A. Purwarianti},
-  booktitle={Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing},
-  year={2020}
-}
-```
--- a/model_cards/indobenchmark/indobert-large-p2/README.md
+++ b/model_cards/indobenchmark/indobert-large-p2/README.md
----
-language: id
-tags:
-- indobert
-- indobenchmark
-- indonlu
-license: mit
-inference: false
-datasets:
-- Indo4B
----
-
-# IndoBERT Large Model (phase2 - uncased)
-
-[IndoBERT](https://arxiv.org/abs/2009.05387) is a state-of-the-art language model for Indonesian based on the BERT model. The pretrained model is trained using a masked language modeling (MLM) objective and next sentence prediction (NSP) objective. 
-
-## All Pre-trained Models
-
-| Model                          | #params                        | Arch. | Training data                     |
-|--------------------------------|--------------------------------|-------|-----------------------------------|
-| `indobenchmark/indobert-base-p1` | 124.5M   | Base  | Indo4B (23.43 GB of text)            |
-| `indobenchmark/indobert-base-p2` | 124.5M   | Base  | Indo4B (23.43 GB of text)            |
-| `indobenchmark/indobert-large-p1` | 335.2M   | Large  | Indo4B (23.43 GB of text)            |
-| `indobenchmark/indobert-large-p2` | 335.2M   | Large  | Indo4B (23.43 GB of text)            |
-| `indobenchmark/indobert-lite-base-p1` | 11.7M   | Base  | Indo4B (23.43 GB of text)            |
-| `indobenchmark/indobert-lite-base-p2` | 11.7M   | Base  | Indo4B (23.43 GB of text)            |
-| `indobenchmark/indobert-lite-large-p1` | 17.7M   | Large  | Indo4B (23.43 GB of text)            |
-| `indobenchmark/indobert-lite-large-p2` | 17.7M   | Large  | Indo4B (23.43 GB of text)            |
-
-## How to use
-
-### Load model and tokenizer
-```python
-from transformers import BertTokenizer, AutoModel
-tokenizer = BertTokenizer.from_pretrained("indobenchmark/indobert-large-p2")
-model = AutoModel.from_pretrained("indobenchmark/indobert-large-p2")
-```
-
-### Extract contextual representation
-```python
-x = torch.LongTensor(tokenizer.encode('aku adalah anak [MASK]')).view(1,-1)
-print(x, model(x)[0].sum())
-```
-
-## Authors 
-
-<b>IndoBERT</b> was trained and evaluated by Bryan Wilie\*, Karissa Vincentio\*, Genta Indra Winata\*, Samuel Cahyawijaya\*, Xiaohong Li, Zhi Yuan Lim, Sidik Soleman, Rahmad Mahendra, Pascale Fung, Syafri Bahar, Ayu Purwarianti.
-
-
-## Citation
-If you use our work, please cite:
-
-```bibtex
-@inproceedings{wilie2020indonlu,
-  title={IndoNLU: Benchmark and Resources for Evaluating Indonesian Natural Language Understanding},
-  author={Bryan Wilie and Karissa Vincentio and Genta Indra Winata and Samuel Cahyawijaya and X. Li and Zhi Yuan Lim and S. Soleman and R. Mahendra and Pascale Fung and Syafri Bahar and A. Purwarianti},
-  booktitle={Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing},
-  year={2020}
-}
-```
--- a/model_cards/indobenchmark/indobert-lite-base-p1/README.md
+++ b/model_cards/indobenchmark/indobert-lite-base-p1/README.md
----
-language: id
-tags:
-- indobert
-- indobenchmark
-- indonlu
-license: mit
-inference: false
-datasets:
-- Indo4B
----
-
-# IndoBERT-Lite Base Model (phase1 - uncased)
-
-[IndoBERT](https://arxiv.org/abs/2009.05387) is a state-of-the-art language model for Indonesian based on the BERT model. The pretrained model is trained using a masked language modeling (MLM) objective and next sentence prediction (NSP) objective. 
-
-## All Pre-trained Models
-
-| Model                          | #params                        | Arch. | Training data                     |
-|--------------------------------|--------------------------------|-------|-----------------------------------|
-| `indobenchmark/indobert-base-p1` | 124.5M   | Base  | Indo4B (23.43 GB of text)            |
-| `indobenchmark/indobert-base-p2` | 124.5M   | Base  | Indo4B (23.43 GB of text)            |
-| `indobenchmark/indobert-large-p1` | 335.2M   | Large  | Indo4B (23.43 GB of text)            |
-| `indobenchmark/indobert-large-p2` | 335.2M   | Large  | Indo4B (23.43 GB of text)            |
-| `indobenchmark/indobert-lite-base-p1` | 11.7M   | Base  | Indo4B (23.43 GB of text)            |
-| `indobenchmark/indobert-lite-base-p2` | 11.7M   | Base  | Indo4B (23.43 GB of text)            |
-| `indobenchmark/indobert-lite-large-p1` | 17.7M   | Large  | Indo4B (23.43 GB of text)            |
-| `indobenchmark/indobert-lite-large-p2` | 17.7M   | Large  | Indo4B (23.43 GB of text)            |
-
-## How to use
-
-### Load model and tokenizer
-```python
-from transformers import BertTokenizer, AutoModel
-tokenizer = BertTokenizer.from_pretrained("indobenchmark/indobert-lite-base-p1")
-model = AutoModel.from_pretrained("indobenchmark/indobert-lite-base-p1")
-```
-
-### Extract contextual representation
-```python
-x = torch.LongTensor(tokenizer.encode('aku adalah anak [MASK]')).view(1,-1)
-print(x, model(x)[0].sum())
-```
-
-## Authors 
-
-<b>IndoBERT</b> was trained and evaluated by Bryan Wilie\*, Karissa Vincentio\*, Genta Indra Winata\*, Samuel Cahyawijaya\*, Xiaohong Li, Zhi Yuan Lim, Sidik Soleman, Rahmad Mahendra, Pascale Fung, Syafri Bahar, Ayu Purwarianti.
-
-
-## Citation
-If you use our work, please cite:
-
-```bibtex
-@inproceedings{wilie2020indonlu,
-  title={IndoNLU: Benchmark and Resources for Evaluating Indonesian Natural Language Understanding},
-  author={Bryan Wilie and Karissa Vincentio and Genta Indra Winata and Samuel Cahyawijaya and X. Li and Zhi Yuan Lim and S. Soleman and R. Mahendra and Pascale Fung and Syafri Bahar and A. Purwarianti},
-  booktitle={Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing},
-  year={2020}
-}
-```
--- a/model_cards/indobenchmark/indobert-lite-base-p2/README.md
+++ b/model_cards/indobenchmark/indobert-lite-base-p2/README.md
----
-language: id
-tags:
-- indobert
-- indobenchmark
-- indonlu
-license: mit
-inference: false
-datasets:
-- Indo4B
----
-
-# IndoBERT-Lite Base Model (phase2 - uncased)
-
-[IndoBERT](https://arxiv.org/abs/2009.05387) is a state-of-the-art language model for Indonesian based on the BERT model. The pretrained model is trained using a masked language modeling (MLM) objective and next sentence prediction (NSP) objective. 
-
-## All Pre-trained Models
-
-| Model                          | #params                        | Arch. | Training data                     |
-|--------------------------------|--------------------------------|-------|-----------------------------------|
-| `indobenchmark/indobert-base-p1` | 124.5M   | Base  | Indo4B (23.43 GB of text)            |
-| `indobenchmark/indobert-base-p2` | 124.5M   | Base  | Indo4B (23.43 GB of text)            |
-| `indobenchmark/indobert-large-p1` | 335.2M   | Large  | Indo4B (23.43 GB of text)            |
-| `indobenchmark/indobert-large-p2` | 335.2M   | Large  | Indo4B (23.43 GB of text)            |
-| `indobenchmark/indobert-lite-base-p1` | 11.7M   | Base  | Indo4B (23.43 GB of text)            |
-| `indobenchmark/indobert-lite-base-p2` | 11.7M   | Base  | Indo4B (23.43 GB of text)            |
-| `indobenchmark/indobert-lite-large-p1` | 17.7M   | Large  | Indo4B (23.43 GB of text)            |
-| `indobenchmark/indobert-lite-large-p2` | 17.7M   | Large  | Indo4B (23.43 GB of text)            |
-
-## How to use
-
-### Load model and tokenizer
-```python
-from transformers import BertTokenizer, AutoModel
-tokenizer = BertTokenizer.from_pretrained("indobenchmark/indobert-lite-base-p2")
-model = AutoModel.from_pretrained("indobenchmark/indobert-lite-base-p2")
-```
-
-### Extract contextual representation
-```python
-x = torch.LongTensor(tokenizer.encode('aku adalah anak [MASK]')).view(1,-1)
-print(x, model(x)[0].sum())
-```
-
-## Authors 
-
-<b>IndoBERT</b> was trained and evaluated by Bryan Wilie\*, Karissa Vincentio\*, Genta Indra Winata\*, Samuel Cahyawijaya\*, Xiaohong Li, Zhi Yuan Lim, Sidik Soleman, Rahmad Mahendra, Pascale Fung, Syafri Bahar, Ayu Purwarianti.
-
-
-## Citation
-If you use our work, please cite:
-
-```bibtex
-@inproceedings{wilie2020indonlu,
-  title={IndoNLU: Benchmark and Resources for Evaluating Indonesian Natural Language Understanding},
-  author={Bryan Wilie and Karissa Vincentio and Genta Indra Winata and Samuel Cahyawijaya and X. Li and Zhi Yuan Lim and S. Soleman and R. Mahendra and Pascale Fung and Syafri Bahar and A. Purwarianti},
-  booktitle={Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing},
-  year={2020}
-}
-```
--- a/model_cards/indobenchmark/indobert-lite-large-p1/README.md
+++ b/model_cards/indobenchmark/indobert-lite-large-p1/README.md
----
-language: id
-tags:
-- indobert
-- indobenchmark
-- indonlu
-license: mit
-inference: false
-datasets:
-- Indo4B
----
-
-# IndoBERT-Lite Large Model (phase1 - uncased)
-
-[IndoBERT](https://arxiv.org/abs/2009.05387) is a state-of-the-art language model for Indonesian based on the BERT model. The pretrained model is trained using a masked language modeling (MLM) objective and next sentence prediction (NSP) objective. 
-
-## All Pre-trained Models
-
-| Model                          | #params                        | Arch. | Training data                     |
-|--------------------------------|--------------------------------|-------|-----------------------------------|
-| `indobenchmark/indobert-base-p1` | 124.5M   | Base  | Indo4B (23.43 GB of text)            |
-| `indobenchmark/indobert-base-p2` | 124.5M   | Base  | Indo4B (23.43 GB of text)            |
-| `indobenchmark/indobert-large-p1` | 335.2M   | Large  | Indo4B (23.43 GB of text)            |
-| `indobenchmark/indobert-large-p2` | 335.2M   | Large  | Indo4B (23.43 GB of text)            |
-| `indobenchmark/indobert-lite-base-p1` | 11.7M   | Base  | Indo4B (23.43 GB of text)            |
-| `indobenchmark/indobert-lite-base-p2` | 11.7M   | Base  | Indo4B (23.43 GB of text)            |
-| `indobenchmark/indobert-lite-large-p1` | 17.7M   | Large  | Indo4B (23.43 GB of text)            |
-| `indobenchmark/indobert-lite-large-p2` | 17.7M   | Large  | Indo4B (23.43 GB of text)            |
-
-## How to use
-
-### Load model and tokenizer
-```python
-from transformers import BertTokenizer, AutoModel
-tokenizer = BertTokenizer.from_pretrained("indobenchmark/indobert-lite-large-p1")
-model = AutoModel.from_pretrained("indobenchmark/indobert-lite-large-p1")
-```
-
-### Extract contextual representation
-```python
-x = torch.LongTensor(tokenizer.encode('aku adalah anak [MASK]')).view(1,-1)
-print(x, model(x)[0].sum())
-```
-
-## Authors 
-
-<b>IndoBERT</b> was trained and evaluated by Bryan Wilie\*, Karissa Vincentio\*, Genta Indra Winata\*, Samuel Cahyawijaya\*, Xiaohong Li, Zhi Yuan Lim, Sidik Soleman, Rahmad Mahendra, Pascale Fung, Syafri Bahar, Ayu Purwarianti.
-
-
-## Citation
-If you use our work, please cite:
-
-```bibtex
-@inproceedings{wilie2020indonlu,
-  title={IndoNLU: Benchmark and Resources for Evaluating Indonesian Natural Language Understanding},
-  author={Bryan Wilie and Karissa Vincentio and Genta Indra Winata and Samuel Cahyawijaya and X. Li and Zhi Yuan Lim and S. Soleman and R. Mahendra and Pascale Fung and Syafri Bahar and A. Purwarianti},
-  booktitle={Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing},
-  year={2020}
-}
-```
--- a/model_cards/indobenchmark/indobert-lite-large-p2/README.md
+++ b/model_cards/indobenchmark/indobert-lite-large-p2/README.md
----
-language: id
-tags:
-- indobert
-- indobenchmark
-- indonlu
-license: mit
-inference: false
-datasets:
-- Indo4B
----
-
-# IndoBERT-Lite Large Model (phase2 - uncased)
-
-[IndoBERT](https://arxiv.org/abs/2009.05387) is a state-of-the-art language model for Indonesian based on the BERT model. The pretrained model is trained using a masked language modeling (MLM) objective and next sentence prediction (NSP) objective. 
-
-## All Pre-trained Models
-
-| Model                          | #params                        | Arch. | Training data                     |
-|--------------------------------|--------------------------------|-------|-----------------------------------|
-| `indobenchmark/indobert-base-p1` | 124.5M   | Base  | Indo4B (23.43 GB of text)            |
-| `indobenchmark/indobert-base-p2` | 124.5M   | Base  | Indo4B (23.43 GB of text)            |
-| `indobenchmark/indobert-large-p1` | 335.2M   | Large  | Indo4B (23.43 GB of text)            |
-| `indobenchmark/indobert-large-p2` | 335.2M   | Large  | Indo4B (23.43 GB of text)            |
-| `indobenchmark/indobert-lite-base-p1` | 11.7M   | Base  | Indo4B (23.43 GB of text)            |
-| `indobenchmark/indobert-lite-base-p2` | 11.7M   | Base  | Indo4B (23.43 GB of text)            |
-| `indobenchmark/indobert-lite-large-p1` | 17.7M   | Large  | Indo4B (23.43 GB of text)            |
-| `indobenchmark/indobert-lite-large-p2` | 17.7M   | Large  | Indo4B (23.43 GB of text)            |
-
-## How to use
-
-### Load model and tokenizer
-```python
-from transformers import BertTokenizer, AutoModel
-tokenizer = BertTokenizer.from_pretrained("indobenchmark/indobert-lite-large-p2")
-model = AutoModel.from_pretrained("indobenchmark/indobert-lite-large-p2")
-```
-
-### Extract contextual representation
-```python
-x = torch.LongTensor(tokenizer.encode('aku adalah anak [MASK]')).view(1,-1)
-print(x, model(x)[0].sum())
-```
-
-## Authors 
-
-<b>IndoBERT</b> was trained and evaluated by Bryan Wilie\*, Karissa Vincentio\*, Genta Indra Winata\*, Samuel Cahyawijaya\*, Xiaohong Li, Zhi Yuan Lim, Sidik Soleman, Rahmad Mahendra, Pascale Fung, Syafri Bahar, Ayu Purwarianti.
-
-
-## Citation
-If you use our work, please cite:
-
-```bibtex
-@inproceedings{wilie2020indonlu,
-  title={IndoNLU: Benchmark and Resources for Evaluating Indonesian Natural Language Understanding},
-  author={Bryan Wilie and Karissa Vincentio and Genta Indra Winata and Samuel Cahyawijaya and X. Li and Zhi Yuan Lim and S. Soleman and R. Mahendra and Pascale Fung and Syafri Bahar and A. Purwarianti},
-  booktitle={Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing},
-  year={2020}
-}
-```
--- a/model_cards/indolem/indobert-base-uncased/README.md
+++ b/model_cards/indolem/indobert-base-uncased/README.md
----
-language: id
-tags:
-- indobert
-- indolem
-license: mit
-inference: false
-datasets:
-- 220M words (IndoWiki, IndoWC, News)
----
-
-## About
-
-[IndoBERT](https://arxiv.org/pdf/2011.00677.pdf) is the Indonesian version of BERT model. We train the model using over 220M words, aggregated from three main sources: 
-* Indonesian Wikipedia (74M words)
-* news articles from Kompas, Tempo (Tala et al., 2003), and Liputan6 (55M words in total)
-* an Indonesian Web Corpus (Medved and Suchomel, 2017) (90M words).
-
-We trained the model for 2.4M steps (180 epochs) with the final perplexity over the development set being <b>3.97</b> (similar to English BERT-base).
-
-This <b>IndoBERT</b> was used to examine IndoLEM - an Indonesian benchmark that comprises of seven tasks for the Indonesian language, spanning morpho-syntax, semantics, and discourse. 
-
-| Task | Metric | Bi-LSTM | mBERT | MalayBERT | IndoBERT |
-| ---- | ---- | ---- | ---- | ---- | ---- |
-| POS Tagging | Acc | 95.4 | <b>96.8</b> | <b>96.8</b> | <b>96.8</b> |
-| NER UGM | F1| 70.9 | 71.6 | 73.2 | <b>74.9</b> |
-| NER UI | F1 | 82.2 | 82.2 | 87.4 | <b>90.1</b> |
-| Dep. Parsing (UD-Indo-GSD) | UAS/LAS | 85.25/80.35 | 86.85/81.78 | 86.99/81.87 | <b>87.12<b/>/<b>82.32</b> |
-| Dep. Parsing (UD-Indo-PUD) | UAS/LAS | 84.04/79.01 | <b>90.58</b>/<b>85.44</b> | 88.91/83.56 | 89.23/83.95 |
-| Sentiment Analysis | F1 | 71.62 | 76.58 | 82.02 | <b>84.13</b> |
-| Summarization | R1/R2/RL | 67.96/61.65/67.24 | 68.40/61.66/67.67 | 68.44/61.38/67.71 | <b>69.93</b>/<b>62.86</b>/<b>69.21</b> |
-| Next Tweet Prediction | Acc | 73.6 | 92.4 | 93.1 | <b>93.7</b> |
-| Tweet Ordering | Spearman corr. | 0.45 | 0.53 | 0.51 | <b>0.59</b> |
-
-The paper is published at the 28th COLING 2020. Please refer to https://indolem.github.io for more details about the benchmarks.
-
-## How to use
-
-### Load model and tokenizer (tested with transformers==3.5.1)
-```python
-from transformers import AutoTokenizer, AutoModel
-tokenizer = AutoTokenizer.from_pretrained("indolem/indobert-base-uncased")
-model = AutoModel.from_pretrained("indolem/indobert-base-uncased")
-```
-
-## Citation
-If you use our work, please cite:
-
-```bibtex
-@inproceedings{koto2020indolem,
-  title={IndoLEM and IndoBERT: A Benchmark Dataset and Pre-trained Language Model for Indonesian NLP},
-  author={Fajri Koto and Afshin Rahimi and Jey Han Lau and Timothy Baldwin},
-  booktitle={Proceedings of the 28th COLING},
-  year={2020}
-}
-```
--- a/model_cards/ipuneetrathore/bert-base-cased-finetuned-finBERT/README.md
+++ b/model_cards/ipuneetrathore/bert-base-cased-finetuned-finBERT/README.md
-## FinBERT
-
-Code for importing and using this model is available [here](https://github.com/ipuneetrathore/BERT_models)
--- a/model_cards/iuliaturc/bert_uncased_L-2_H-128_A-2/README.md
+++ b/model_cards/iuliaturc/bert_uncased_L-2_H-128_A-2/README.md
----
-thumbnail: https://huggingface.co/front/thumbnails/google.png
-
-license: apache-2.0
----
-
-BERT Miniatures
-===
-
-This is the set of 24 BERT models referenced in [Well-Read Students Learn Better: On the Importance of Pre-training Compact Models](https://arxiv.org/abs/1908.08962) (English only, uncased, trained with WordPiece masking).
-
-We have shown that the standard BERT recipe (including model architecture and training objective) is effective on a wide range of model sizes, beyond BERT-Base and BERT-Large. The smaller BERT models are intended for environments with restricted computational resources. They can be fine-tuned in the same manner as the original BERT models. However, they are most effective in the context of knowledge distillation, where the fine-tuning labels are produced by a larger and more accurate teacher.
-
-Our goal is to enable research in institutions with fewer computational resources and encourage the community to seek directions of innovation alternative to increasing model capacity.
-
-You can download the 24 BERT miniatures either from the [official BERT Github page](https://github.com/google-research/bert/), or via HuggingFace from the links below:
-
-|   |H=128|H=256|H=512|H=768|
-|---|:---:|:---:|:---:|:---:|
-| **L=2**  |[**2/128 (BERT-Tiny)**][2_128]|[2/256][2_256]|[2/512][2_512]|[2/768][2_768]|
-| **L=4**  |[4/128][4_128]|[**4/256 (BERT-Mini)**][4_256]|[**4/512 (BERT-Small)**][4_512]|[4/768][4_768]|
-| **L=6**  |[6/128][6_128]|[6/256][6_256]|[6/512][6_512]|[6/768][6_768]|
-| **L=8**  |[8/128][8_128]|[8/256][8_256]|[**8/512 (BERT-Medium)**][8_512]|[8/768][8_768]|
-| **L=10** |[10/128][10_128]|[10/256][10_256]|[10/512][10_512]|[10/768][10_768]|
-| **L=12** |[12/128][12_128]|[12/256][12_256]|[12/512][12_512]|[**12/768 (BERT-Base)**][12_768]|
-
-Note that the BERT-Base model in this release is included for completeness only; it was re-trained under the same regime as the original model.
-
-Here are the corresponding GLUE scores on the test set:
-
-|Model|Score|CoLA|SST-2|MRPC|STS-B|QQP|MNLI-m|MNLI-mm|QNLI(v2)|RTE|WNLI|AX|
-|---|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
-|BERT-Tiny|64.2|0.0|83.2|81.1/71.1|74.3/73.6|62.2/83.4|70.2|70.3|81.5|57.2|62.3|21.0|
-|BERT-Mini|65.8|0.0|85.9|81.1/71.8|75.4/73.3|66.4/86.2|74.8|74.3|84.1|57.9|62.3|26.1|
-|BERT-Small|71.2|27.8|89.7|83.4/76.2|78.8/77.0|68.1/87.0|77.6|77.0|86.4|61.8|62.3|28.6|
-|BERT-Medium|73.5|38.0|89.6|86.6/81.6|80.4/78.4|69.6/87.9|80.0|79.1|87.7|62.2|62.3|30.5|
-
-For each task, we selected the best fine-tuning hyperparameters from the lists below, and trained for 4 epochs:
- batch sizes: 8, 16, 32, 64, 128
- learning rates: 3e-4, 1e-4, 5e-5, 3e-5
-
-If you use these models, please cite the following paper:
-
-```
-@article{turc2019,
-  title={Well-Read Students Learn Better: On the Importance of Pre-training Compact Models},
-  author={Turc, Iulia and Chang, Ming-Wei and Lee, Kenton and Toutanova, Kristina},
-  journal={arXiv preprint arXiv:1908.08962v2},
-  year={2019}
-}
-```
-
-[2_128]: https://huggingface.co/google/bert_uncased_L-2_H-128_A-2
-[2_256]: https://huggingface.co/google/bert_uncased_L-2_H-256_A-4
-[2_512]: https://huggingface.co/google/bert_uncased_L-2_H-512_A-8
-[2_768]: https://huggingface.co/google/bert_uncased_L-2_H-768_A-12
-[4_128]: https://huggingface.co/google/bert_uncased_L-4_H-128_A-2
-[4_256]: https://huggingface.co/google/bert_uncased_L-4_H-256_A-4
-[4_512]: https://huggingface.co/google/bert_uncased_L-4_H-512_A-8
-[4_768]: https://huggingface.co/google/bert_uncased_L-4_H-768_A-12
-[6_128]: https://huggingface.co/google/bert_uncased_L-6_H-128_A-2
-[6_256]: https://huggingface.co/google/bert_uncased_L-6_H-256_A-4
-[6_512]: https://huggingface.co/google/bert_uncased_L-6_H-512_A-8
-[6_768]: https://huggingface.co/google/bert_uncased_L-6_H-768_A-12
-[8_128]: https://huggingface.co/google/bert_uncased_L-8_H-128_A-2
-[8_256]: https://huggingface.co/google/bert_uncased_L-8_H-256_A-4
-[8_512]: https://huggingface.co/google/bert_uncased_L-8_H-512_A-8
-[8_768]: https://huggingface.co/google/bert_uncased_L-8_H-768_A-12
-[10_128]: https://huggingface.co/google/bert_uncased_L-10_H-128_A-2
-[10_256]: https://huggingface.co/google/bert_uncased_L-10_H-256_A-4
-[10_512]: https://huggingface.co/google/bert_uncased_L-10_H-512_A-8
-[10_768]: https://huggingface.co/google/bert_uncased_L-10_H-768_A-12
-[12_128]: https://huggingface.co/google/bert_uncased_L-12_H-128_A-2
-[12_256]: https://huggingface.co/google/bert_uncased_L-12_H-256_A-4
-[12_512]: https://huggingface.co/google/bert_uncased_L-12_H-512_A-8
-[12_768]: https://huggingface.co/google/bert_uncased_L-12_H-768_A-12
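The 24 link targets above follow a regular naming pattern, which can be sketched in a few lines of Python. This is only an illustration: the helper name `miniature_id` and the constants are hypothetical, and the rule that the attention-head count equals `H / 64` is inferred from the listed names (128→2, 256→4, 512→8, 768→12) rather than stated in the card.

```python
# Hypothetical helper: rebuilds the model ids that appear in the link list above.
LAYERS = (2, 4, 6, 8, 10, 12)        # the L values in the table rows
HIDDEN_SIZES = (128, 256, 512, 768)  # the H values in the table columns


def miniature_id(layers: int, hidden: int) -> str:
    """Return the Hub id for a given layer count and hidden size."""
    # Assumption inferred from the listed names: one attention head per 64 hidden units.
    heads = hidden // 64
    return f"google/bert_uncased_L-{layers}_H-{hidden}_A-{heads}"


# One id per table cell, in row-major order (6 rows x 4 columns).
ALL_IDS = [miniature_id(l, h) for l in LAYERS for h in HIDDEN_SIZES]
```

Any of these ids could then be passed to `AutoModel.from_pretrained`, in the same way as the loading snippets in the other cards in this diff.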
--- a/model_cards/ixa-ehu/berteus-base-cased/README.md
+++ b/model_cards/ixa-ehu/berteus-base-cased/README.md
---
-language: eu
---
-
-# BERTeus base cased
-
-This is the Basque language pretrained model presented in [Give your Text Representation Models some Love: the Case for Basque](https://arxiv.org/pdf/2004.00033.pdf). This model has been trained on a Basque corpus comprising Basque crawled news articles from online newspapers and the Basque Wikipedia. The training corpus contains 224.6 million tokens, of which 35 million come from the Wikipedia.
-
-BERTeus has been tested on four different downstream tasks for Basque: part-of-speech (POS) tagging, named entity recognition (NER), sentiment analysis and topic classification, improving the state of the art on all four. See the summary of results below:
-
-
-| Downstream task      | BERTeus   | mBERT | Previous SOTA |
-| -------------------- | --------- | ----- | ------------- |
-| Topic Classification | **76.77** | 68.42 | 63.00         |
-| Sentiment            | **78.10** | 71.02 | 74.02         |
-| POS                  | **97.76** | 96.37 | 96.10         |
-| NER                  | **87.06** | 81.52 | 76.72         |
-
-
-If using this model, please cite the following paper:
-```
-@inproceedings{agerri2020give,
-  title={Give your Text Representation Models some Love: the Case for Basque},
-  author={Rodrigo Agerri and I{\~n}aki San Vicente and Jon Ander Campos and Ander Barrena and Xabier Saralegi and Aitor Soroa and Eneko Agirre},
-  booktitle={Proceedings of the 12th International Conference on Language Resources and Evaluation},
-  year={2020}
-}
-```
--- a/model_cards/ixa-ehu/ixambert-base-cased/README.md
+++ b/model_cards/ixa-ehu/ixambert-base-cased/README.md
---
-language: 
- en
- es
- eu
---
-
-# IXAmBERT base cased
-
-This is a multilingual language model pretrained for English, Spanish and Basque. The training corpus is composed of the English, Spanish and Basque Wikipedias, together with Basque crawled news articles from online newspapers. The model has been successfully used to transfer knowledge from English to Basque in a conversational QA system, as reported in the paper [Conversational Question Answering in Low Resource Scenarios: A Dataset and Case Study for Basque](http://www.lrec-conf.org/proceedings/lrec2020/pdf/2020.lrec-1.55.pdf). In the paper, IXAmBERT performed better than mBERT when transferring knowledge from English to Basque, as shown in the following table:
-
-| Model              | Zero-shot | Transfer learning |
-|--------------------|-----------|-------------------|
-| Baseline           |      28.7 |              28.7 |
-| mBERT              |      31.5 |              37.4 |
-| IXAmBERT           |      38.9 |          **41.2** |
-| mBERT + history    |      33.3 |              28.7 |
-| IXAmBERT + history |  **40.7** |              40.0 |
-
-This table shows the results on a Basque CQA dataset. *Zero-shot* means that the model is fine-tuned using QuaC, an English CQA dataset. In the *Transfer Learning* setting the model is first fine-tuned on QuaC, and then on a Basque CQA dataset.
-
-If using this model, please cite the following paper:
-```
-@inproceedings{otegi2020conversational,
-  title={Conversational Question Answering in Low Resource Scenarios: A Dataset and Case Study for Basque},
-  author={Otegi, Arantxa and Agirre, Aitor and Campos, Jon Ander and Soroa, Aitor and Agirre, Eneko},
-  booktitle={Proceedings of The 12th Language Resources and Evaluation Conference},
-  pages={436--442},
-  year={2020}
-}
-```
--- a/model_cards/jannesg/bertsson/README.md
+++ b/model_cards/jannesg/bertsson/README.md
---
-language: sv
---
-
-# BERTSSON Models
-
-The models are trained on:
- Government Text
- Swedish Literature
- Swedish News
-
-Corpus size: Roughly 6B tokens.
-
-The following models are currently available:
-
- **bertsson** - A BERT base model trained with the same hyperparameters as first published by Google.
-
-All models are cased and trained with whole word masking.
-
-Stay tuned for evaluations. 
--- a/model_cards/jannesg/takalane_afr_roberta/README.md
+++ b/model_cards/jannesg/takalane_afr_roberta/README.md
---
-language: 
- af
-thumbnail: https://pbs.twimg.com/media/EVjR6BsWoAAFaq5.jpg
-tags:
- af
- fill-mask
- pytorch
- roberta
- masked-lm
-license: MIT
---
-
-# Takalani Sesame - Salie - Afrikaans 🇿🇦
-
-<img src="https://pbs.twimg.com/media/EVjR6BsWoAAFaq5.jpg" width="600"/> 
-
-## Model description
-
-Takalani Sesame (named after the South African version of Sesame Street) is a project that aims to promote the use of South African languages in NLP, and in particular look at techniques for low-resource languages to equalise performance with larger languages around the world.
-
-## Intended uses & limitations
-
-#### How to use
-
-```python
-from transformers import AutoTokenizer, AutoModelWithLMHead
-
-tokenizer = AutoTokenizer.from_pretrained("jannesg/takalane_afr_roberta")
-
-model = AutoModelWithLMHead.from_pretrained("jannesg/takalane_afr_roberta")
-```
-
-#### Limitations and bias
-
-Updates will be added continuously to improve performance. 
-
-## Training data
-
-Data collected from [https://wortschatz.uni-leipzig.de/en](https://wortschatz.uni-leipzig.de/en) <br/>
-**Sentences:** 2.8M
-
-## Training procedure
-
-No preprocessing. Standard Huggingface hyperparameters. 
-
-## Author
-
-Jannes Germishuys [website](http://jannesgg.github.io)
--- a/model_cards/jannesg/takalane_nbl_roberta/README.md
+++ b/model_cards/jannesg/takalane_nbl_roberta/README.md
---
-language: 
- nr
-thumbnail: https://pbs.twimg.com/media/EVjR6BsWoAAFaq5.jpg
-tags:
- nr
- fill-mask
- pytorch
- roberta
- masked-lm
-license: MIT
---
-
-# Takalani Sesame - Ndebele 🇿🇦
-
-<img src="https://pbs.twimg.com/media/EVjR6BsWoAAFaq5.jpg" width="600"/> 
-
-## Model description
-
-Takalani Sesame (named after the South African version of Sesame Street) is a project that aims to promote the use of South African languages in NLP, and in particular look at techniques for low-resource languages to equalise performance with larger languages around the world.
-
-## Intended uses & limitations
-
-#### How to use
-
-```python
-from transformers import AutoTokenizer, AutoModelWithLMHead
-
-tokenizer = AutoTokenizer.from_pretrained("jannesg/takalane_nbl_roberta")
-
-model = AutoModelWithLMHead.from_pretrained("jannesg/takalane_nbl_roberta")
-```
-
-#### Limitations and bias
-
-Updates will be added continuously to improve performance. This is a very low-resource language, so results may be poor at first.
-
-## Training data
-
-Data collected from [https://wortschatz.uni-leipzig.de/en](https://wortschatz.uni-leipzig.de/en) <br/>
-**Sentences:** 318M
-
-## Training procedure
-
-No preprocessing. Standard Huggingface hyperparameters. 
-
-## Author
-
-Jannes Germishuys [website](http://jannesgg.github.io)
--- a/model_cards/jannesg/takalane_nso_roberta/README.md
+++ b/model_cards/jannesg/takalane_nso_roberta/README.md
---
-language: 
- nso
-thumbnail: https://pbs.twimg.com/media/EVjR6BsWoAAFaq5.jpg
-tags:
- nso
- fill-mask
- pytorch
- roberta
- masked-lm
-license: MIT
---
-
-# Takalani Sesame - Northern Sotho 🇿🇦
-
-<img src="https://pbs.twimg.com/media/EVjR6BsWoAAFaq5.jpg" width="600"/> 
-
-## Model description
-
-Takalani Sesame (named after the South African version of Sesame Street) is a project that aims to promote the use of South African languages in NLP, and in particular look at techniques for low-resource languages to equalise performance with larger languages around the world.
-
-## Intended uses & limitations
-
-#### How to use
-
-```python
-from transformers import AutoTokenizer, AutoModelWithLMHead
-
-tokenizer = AutoTokenizer.from_pretrained("jannesg/takalane_nso_roberta")
-
-model = AutoModelWithLMHead.from_pretrained("jannesg/takalane_nso_roberta")
-```
-
-#### Limitations and bias
-
-Updates will be added continuously to improve performance.
-
-## Training data
-
-Data collected from [https://wortschatz.uni-leipzig.de/en](https://wortschatz.uni-leipzig.de/en) <br/>
-**Sentences:** 4746
-
-## Training procedure
-
-No preprocessing. Standard Huggingface hyperparameters. 
-
-## Author
-
-Jannes Germishuys [website](http://jannesgg.github.io)