[model_cards] Migrate cards from this repo to model repos on huggingface.co (#9013)

* rm all model cards
* Update the .rst
  @sgugger it is still not super crystal clear/streamlined so let me know if any ideas to make it simpler
* Add a rootlevel README.md with simple instructions/context
* Update docs/source/model_sharing.rst
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Apply suggestions from code review
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
  Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* make style
* rm all model cards

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

Commit 3552d0e0 · authored Dec 12, 2020 by Julien Chaumond · committed by GitHub Dec 11, 2020
20 changed files
--- a/model_cards/mrm8488/RoBasquERTa/README.md
+++ b/model_cards/mrm8488/RoBasquERTa/README.md
---
-language: eu
-widget:
- text: "Euskara da Euskal Herriko <mask> ofiziala"
- text: "Gaur egun, Euskadik Espainia osoko ekonomia <mask> du"
---
-
-# RoBasquERTa: RoBERTa-like Language model trained on OSCAR Basque corpus
--- a/model_cards/mrm8488/RuPERTa-base-finetuned-ner/README.md
+++ b/model_cards/mrm8488/RuPERTa-base-finetuned-ner/README.md
---
-language: es
-thumbnail:
---
-
-# RuPERTa-base  (Spanish RoBERTa) + NER 🎃🏷
-
-This model is a version of [RuPERTa-base](https://huggingface.co/mrm8488/RuPERTa-base) fine-tuned on [NER-C](https://www.kaggle.com/nltkdata/conll-corpora) for the **NER** downstream task.
-
-## Details of the downstream task (NER) - Dataset
-
- [Dataset:  CONLL Corpora ES](https://www.kaggle.com/nltkdata/conll-corpora) 📚
-
-| Dataset                | # Examples |
-| ---------------------- | ----- |
-| Train                  |  329 K |
-| Dev                    | 40 K |
-
-
- [Fine-tune on NER script provided by Huggingface](https://github.com/huggingface/transformers/blob/master/examples/token-classification/run_ner_old.py)
-
- Labels covered:
-
-```
-B-LOC
-B-MISC
-B-ORG
-B-PER
-I-LOC
-I-MISC
-I-ORG
-I-PER
-O
-```
-
-## Metrics on evaluation set 🧾
-
-|                                                      Metric                                                       |  # score  |
-| :------------------------------------------------------------------------------------: | :-------: |
-| F1                                       | **77.55** |
-| Precision                                | **75.53** | 
-| Recall                                   | **79.68** |    
-
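The three numbers in the metrics table above can be cross-checked against each other. The sketch below assumes the reported F1 is the harmonic mean of the reported precision and recall, which is the usual convention for micro-averaged NER scores; the card itself does not say how F1 was computed, and `f1_score` is a hypothetical helper name used only for illustration.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both on the same scale)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall exactly as reported in the table above (percent).
ner_f1 = f1_score(75.53, 79.68)
print(round(ner_f1, 2))  # 77.55
```

Under that assumption the result agrees with the F1 value reported in the table.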
-## Model in action 🔨
-
-
-Example of usage:
-
-```python
-import torch
-from transformers import AutoModelForTokenClassification, AutoTokenizer
-
-id2label = {
-    "0": "B-LOC",
-    "1": "B-MISC",
-    "2": "B-ORG",
-    "3": "B-PER",
-    "4": "I-LOC",
-    "5": "I-MISC",
-    "6": "I-ORG",
-    "7": "I-PER",
-    "8": "O"
-}
-
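-tokenizer = AutoTokenizer.from_pretrained('mrm8488/RuPERTa-base-finetuned-ner')
-model = AutoModelForTokenClassification.from_pretrained('mrm8488/RuPERTa-base-finetuned-ner')
-
-text = "Julien, CEO de HF, nació en Francia."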
-input_ids = torch.tensor(tokenizer.encode(text)).unsqueeze(0)
-
-outputs = model(input_ids)
-last_hidden_states = outputs[0]
-
-for m in last_hidden_states:
-  for index, n in enumerate(m):
-    if(index > 0 and index <= len(text.split(" "))):
-      print(text.split(" ")[index-1] + ": " + id2label[str(torch.argmax(n).item())])
-      
-'''
-Output:
--------
-Julien,: I-PER
-CEO: O
-de: O
-HF,: B-ORG
-nació: I-PER
-en: I-PER
-Francia.: I-LOC
-'''
-```
-Yeah! Not too bad 🎉
-
-> Created by [Manuel Romero/@mrm8488](https://twitter.com/mrm8488)
-
-> Made with <span style="color: #e25555;">&hearts;</span> in Spain
--- a/model_cards/mrm8488/RuPERTa-base-finetuned-pawsx-es/README.md
+++ b/model_cards/mrm8488/RuPERTa-base-finetuned-pawsx-es/README.md
---
-language: es
-datasets:
- xtreme
-widget:
- text: "En 2009 se mudó a Filadelfia y en la actualidad vive en Nueva York. Se mudó nuevamente a Filadelfia en 2009 y ahora vive en la ciudad de Nueva York."
---
-
-# RuPERTa-base fine-tuned on PAWS-X-es for Paraphrase Identification
--- a/model_cards/mrm8488/RuPERTa-base-finetuned-pos/README.md
+++ b/model_cards/mrm8488/RuPERTa-base-finetuned-pos/README.md
---
-language: es
-thumbnail:
---
-
-# RuPERTa-base  (Spanish RoBERTa) + POS 🎃🏷
-
-This model is a version of [RuPERTa-base](https://huggingface.co/mrm8488/RuPERTa-base) fine-tuned on [CONLL CORPORA](https://www.kaggle.com/nltkdata/conll-corpora) for the **POS** downstream task.
-
-## Details of the downstream task (POS) - Dataset
-
- [Dataset:  CONLL Corpora ES](https://www.kaggle.com/nltkdata/conll-corpora) 📚
-
-| Dataset                | # Examples |
-| ---------------------- | ----- |
-| Train                  | 445 K |
-| Dev                    | 55 K |
-
- [Fine-tune on NER script provided by Huggingface](https://github.com/huggingface/transformers/blob/master/examples/token-classification/run_ner_old.py)
-
- Labels covered:
-
-```
-ADJ
-ADP
-ADV
-AUX
-CCONJ
-DET
-INTJ
-NOUN
-NUM
-PART
-PRON
-PROPN
-PUNCT
-SCONJ
-SYM
-VERB
-```
-
-## Metrics on evaluation set 🧾
-
-|                                                      Metric                                                       |  # score  |
-| :------------------------------------------------------------------------------------: | :-------: |
-| F1                                       | **97.39** |
-| Precision                                | **97.47** |
-| Recall                                   | **97.32** |
-
-## Model in action 🔨
-
-
-Example of usage
-
-```python
-import torch
-from transformers import AutoModelForTokenClassification, AutoTokenizer
-
-tokenizer = AutoTokenizer.from_pretrained('mrm8488/RuPERTa-base-finetuned-pos')
-model = AutoModelForTokenClassification.from_pretrained('mrm8488/RuPERTa-base-finetuned-pos')
-
-id2label = {
-    "0": "O",
-    "1": "ADJ",
-    "2": "ADP",
-    "3": "ADV",
-    "4": "AUX",
-    "5": "CCONJ",
-    "6": "DET",
-    "7": "INTJ",
-    "8": "NOUN",
-    "9": "NUM",
-    "10": "PART",
-    "11": "PRON",
-    "12": "PROPN",
-    "13": "PUNCT",
-    "14": "SCONJ",
-    "15": "SYM",
-    "16": "VERB"
-}
-
-text ="Mis amigos están pensando viajar a Londres este verano."
-input_ids = torch.tensor(tokenizer.encode(text)).unsqueeze(0)
-
-outputs = model(input_ids)
-last_hidden_states = outputs[0]
-
-for m in last_hidden_states:
-  for index, n in enumerate(m):
-    if(index > 0 and index <= len(text.split(" "))):
-      print(text.split(" ")[index-1] + ": " + id2label[str(torch.argmax(n).item())])
-      
-'''
-Output:
--------
-Mis: NUM
-amigos: PRON
-están: AUX
-pensando: ADV
-viajar: VERB
-a: ADP
-Londres: PROPN
-este: DET
-verano.: NOUN
-'''
-```
-Yeah! Not too bad 🎉
-
-> Created by [Manuel Romero/@mrm8488](https://twitter.com/mrm8488) | [LinkedIn](https://www.linkedin.com/in/manuel-romero-cs/)
-
-> Made with <span style="color: #e25555;">&hearts;</span> in Spain
--- a/model_cards/mrm8488/RuPERTa-base-finetuned-squadv1/README.md
+++ b/model_cards/mrm8488/RuPERTa-base-finetuned-squadv1/README.md
---
-language: es
-datasets:
- squad
---
--- a/model_cards/mrm8488/RuPERTa-base-finetuned-squadv2/README.md
+++ b/model_cards/mrm8488/RuPERTa-base-finetuned-squadv2/README.md
---
-language: es
-datasets:
- squad_v2
---
--- a/model_cards/mrm8488/RuPERTa-base/README.md
+++ b/model_cards/mrm8488/RuPERTa-base/README.md
---
-language: es
-thumbnail: https://i.imgur.com/DUlT077.jpg
-widget:
- text: "España es un país muy <mask> en la UE"
---
-
-# RuPERTa: the Spanish RoBERTa 🎃<img src="https://abs-0.twimg.com/emoji/v2/svg/1f1ea-1f1f8.svg" alt="spain flag" width="25"/>
-
-RuPERTa-base (uncased) is a [RoBERTa model](https://github.com/pytorch/fairseq/tree/master/examples/roberta) trained on an *uncased* version of a [big Spanish corpus](https://github.com/josecannete/spanish-corpora).
-RoBERTa iterates on BERT's pretraining procedure, including training the model longer, with bigger batches over more data; removing the next sentence prediction objective; training on longer sequences; and dynamically changing the masking pattern applied to the training data.
-The architecture is the same as `roberta-base`:
-
-`roberta.base:` **RoBERTa** using the **BERT-base architecture 125M** params
-
-## Benchmarks 🧾 
-WIP (I continue working on it) 🚧
-
-| Task/Dataset     |    F1 | Precision | Recall |                                                                        Fine-tuned model |                                                                                                                                                                                                                                                                                               Reproduce it |
-| -------- | ----: | --------: | -----: | --------------------------------------------------------------------------------------: | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
-| POS      | 97.39 |     97.47 |  97.32 | [RuPERTa-base-finetuned-pos](https://huggingface.co/mrm8488/RuPERTa-base-finetuned-pos) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/mrm8488/shared_colab_notebooks/blob/master/RuPERTa_base_finetuned_POS.ipynb) |
-| NER      | 77.55 |     75.53 |  79.68 | [RuPERTa-base-finetuned-ner](https://huggingface.co/mrm8488/RuPERTa-base-finetuned-ner) | |
-| SQUAD-es v1 | to-do | | | [RuPERTa-base-finetuned-squadv1](https://huggingface.co/mrm8488/RuPERTa-base-finetuned-squadv1) | |
-| SQUAD-es v2 | to-do | | | [RuPERTa-base-finetuned-squadv2](https://huggingface.co/mrm8488/RuPERTa-base-finetuned-squadv2) | |
-
-## Model in action 🔨
-
-### Usage for POS and NER 🏷
-
-```python
-import torch
-from transformers import AutoModelForTokenClassification, AutoTokenizer
-
-id2label = {
-    "0": "B-LOC",
-    "1": "B-MISC",
-    "2": "B-ORG",
-    "3": "B-PER",
-    "4": "I-LOC",
-    "5": "I-MISC",
-    "6": "I-ORG",
-    "7": "I-PER",
-    "8": "O"
-}
-
-tokenizer = AutoTokenizer.from_pretrained('mrm8488/RuPERTa-base-finetuned-ner')
-model = AutoModelForTokenClassification.from_pretrained('mrm8488/RuPERTa-base-finetuned-ner')
-
-text ="Julien, CEO de HF, nació en Francia."
-
-input_ids = torch.tensor(tokenizer.encode(text)).unsqueeze(0)
-
-outputs = model(input_ids)
-last_hidden_states = outputs[0]
-
-for m in last_hidden_states:
-  for index, n in enumerate(m):
-    if(index > 0 and index <= len(text.split(" "))):
-      print(text.split(" ")[index-1] + ": " + id2label[str(torch.argmax(n).item())])
-
-# Output:
-'''
-Julien,: I-PER
-CEO: O
-de: O
-HF,: B-ORG
-nació: I-PER
-en: I-PER
-Francia.: I-LOC
-'''
-```
-
-For **POS** just change the `id2label` dictionary and the model path to [mrm8488/RuPERTa-base-finetuned-pos](https://huggingface.co/mrm8488/RuPERTa-base-finetuned-pos)
-
-### Fast usage for LM with `pipelines` 🧪
-
-```python
-from transformers import AutoModelWithLMHead, AutoTokenizer
-model = AutoModelWithLMHead.from_pretrained('mrm8488/RuPERTa-base')
-tokenizer = AutoTokenizer.from_pretrained("mrm8488/RuPERTa-base", do_lower_case=True)
-
-from transformers import pipeline
-
-pipeline_fill_mask = pipeline("fill-mask", model=model, tokenizer=tokenizer)
-
-pipeline_fill_mask("España es un país muy <mask> en la UE")
-```
-
-```json
-[
-  {
-    "score": 0.1814306527376175,
-    "sequence": "<s> españa es un país muy importante en la ue</s>",
-    "token": 1560
-  },
-  {
-    "score": 0.024842597544193268,
-    "sequence": "<s> españa es un país muy fuerte en la ue</s>",
-    "token": 2854
-  },
-  {
-    "score": 0.02473250962793827,
-    "sequence": "<s> españa es un país muy pequeño en la ue</s>",
-    "token": 2948
-  },
-  {
-    "score": 0.023991240188479424,
-    "sequence": "<s> españa es un país muy antiguo en la ue</s>",
-    "token": 5240
-  },
-  {
-    "score": 0.0215945765376091,
-    "sequence": "<s> españa es un país muy popular en la ue</s>",
-    "token": 5782
-  }
-]
-```
-
-## Acknowledgments
-
-I thank [🤗/transformers team](https://github.com/huggingface/transformers) for answering my doubts and Google for helping me with the [TensorFlow Research Cloud](https://www.tensorflow.org/tfrc) program.
-
-> Created by [Manuel Romero/@mrm8488](https://twitter.com/mrm8488)
-
-> Made with <span style="color: #e25555;">&hearts;</span> in Spain
--- a/model_cards/mrm8488/TinyBERT-spanish-uncased-finetuned-ner/README.md
+++ b/model_cards/mrm8488/TinyBERT-spanish-uncased-finetuned-ner/README.md
---
-language: es
-thumbnail:
---
-
-# Spanish TinyBERT + NER
-
-This model is a [Spanish Tiny Bert](https://huggingface.co/mrm8488/es-tinybert-v1-1) model, which I created using *distillation*, fine-tuned on [NER-C](https://www.kaggle.com/nltkdata/conll-corpora) for the **NER** downstream task. The **size** of the model is **55MB**.
-
-## Details of the downstream task (NER) - Dataset
-
- [Dataset:  CONLL Corpora ES](https://www.kaggle.com/nltkdata/conll-corpora) 
-
-I preprocessed the dataset and split it as train / dev (80/20)
-
-| Dataset                | # Examples |
-| ---------------------- | ----- |
-| Train                  | 8.7 K |
-| Dev                    | 2.2 K |
-
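The card states only that the data was split 80/20 into train and dev; the actual preprocessing code is not shown. As a minimal sketch of one way such a split could be done, assuming a simple shuffled split at the example level (`train_dev_split`, its parameters, and the toy `sentences` list are all hypothetical):

```python
import random


def train_dev_split(examples, dev_fraction=0.2, seed=42):
    """Shuffle a copy of the examples and cut off a dev slice."""
    shuffled = list(examples)
    # A seeded generator keeps the split reproducible across runs.
    random.Random(seed).shuffle(shuffled)
    n_dev = int(len(shuffled) * dev_fraction)
    return shuffled[n_dev:], shuffled[:n_dev]


# Toy stand-in for the real annotated sentences.
sentences = [f"sentence-{i}" for i in range(10)]
train, dev = train_dev_split(sentences)
print(len(train), len(dev))  # 8 2
```

Splitting at the sentence level, rather than the token level, keeps every sentence whole in exactly one of the two sets.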
-
- [Fine-tune on NER script provided by Huggingface](https://github.com/huggingface/transformers/blob/master/examples/token-classification/run_ner_old.py)
-
- Labels covered:
-
-```
-B-LOC
-B-MISC
-B-ORG
-B-PER
-I-LOC
-I-MISC
-I-ORG
-I-PER
-O
-```
-
-## Metrics on evaluation set:
-
-|                                                      Metric                                                       |  # score  |
-| :------------------------------------------------------------------------------------: | :-------: |
-| F1                                       | **70.00** |
-| Precision                                | **67.83** | 
-| Recall                                   | **71.46** |    
-
-## Comparison:
-
-|                                                      Model                                                       |  # F1 score  |Size(MB)|
-| :--------------------------------------------------------------------------------------------------------------: | :-------: |:------|
-|                                        bert-base-spanish-wwm-cased (BETO)                                        |   88.43   | 421 |
-| [bert-spanish-cased-finetuned-ner](https://huggingface.co/mrm8488/bert-spanish-cased-finetuned-ner) | **90.17** | 420 |
-|                                              Best Multilingual BERT                                              |   87.38   | 681 |
-|TinyBERT-spanish-uncased-finetuned-ner (this one)                                                                  | 70.00 | **55** |
-
-## Model in action
-
-
-Example of usage:
-
-```python
-import torch
-from transformers import AutoModelForTokenClassification, AutoTokenizer
-
-id2label = {
-    "0": "B-LOC",
-    "1": "B-MISC",
-    "2": "B-ORG",
-    "3": "B-PER",
-    "4": "I-LOC",
-    "5": "I-MISC",
-    "6": "I-ORG",
-    "7": "I-PER",
-    "8": "O"
-}
-
-tokenizer = AutoTokenizer.from_pretrained('mrm8488/TinyBERT-spanish-uncased-finetuned-ner')
-model = AutoModelForTokenClassification.from_pretrained('mrm8488/TinyBERT-spanish-uncased-finetuned-ner')
-text ="Mis amigos están pensando viajar a Londres este verano."
-input_ids = torch.tensor(tokenizer.encode(text)).unsqueeze(0)
-
-outputs = model(input_ids)
-last_hidden_states = outputs[0]
-
-for m in last_hidden_states:
-  for index, n in enumerate(m):
-    if(index > 0 and index <= len(text.split(" "))):
-      print(text.split(" ")[index-1] + ": " + id2label[str(torch.argmax(n).item())])
-      
-'''
-Output:
--------
-Mis: O
-amigos: O
-están: O
-pensando: O
-viajar: O
-a: O
-Londres: B-LOC
-este: O
-verano.: O
-'''
-```
-
-> Created by [Manuel Romero/@mrm8488](https://twitter.com/mrm8488)
-
-> Made with <span style="color: #e25555;">&hearts;</span> in Spain
--- a/model_cards/mrm8488/bert-base-german-dbmdz-cased-finetuned-pawsx-de/README.md
+++ b/model_cards/mrm8488/bert-base-german-dbmdz-cased-finetuned-pawsx-de/README.md
---
-language: de
-datasets:
- xtreme
-widget:
- text: "Winarsky ist Mitglied des IEEE, Phi Beta Kappa, des ACM und des Sigma Xi. Winarsky ist Mitglied des ACM, des IEEE, der Phi Beta Kappa und der Sigma Xi."
---
-
-# bert-base-german-dbmdz-cased fine-tuned on PAWS-X-de for Paraphrase Identification
--- a/model_cards/mrm8488/bert-base-german-finetuned-ler/README.md
+++ b/model_cards/mrm8488/bert-base-german-finetuned-ler/README.md
---
-language: de
---
-
-# German BERT + LER (Legal Entity Recognition) ⚖️
-
-German BERT ([BERT-base-german-cased](https://huggingface.co/bert-base-german-cased)) fine-tuned on [Legal-Entity-Recognition](https://github.com/elenanereiss/Legal-Entity-Recognition) dataset for **LER** (NER) downstream task.
-
-## Details of the downstream task (NER) - Dataset
-
-[Legal-Entity-Recognition](https://github.com/elenanereiss/Legal-Entity-Recognition): Fine-grained Named Entity Recognition in Legal Documents.
-
-Court decisions from 2017 and 2018 were selected for the dataset, published online by the [Federal Ministry of Justice and Consumer Protection](http://www.rechtsprechung-im-internet.de). The documents originate from seven federal courts: Federal Labour Court (BAG), Federal Fiscal Court (BFH), Federal Court of Justice (BGH), Federal Patent Court (BPatG), Federal Social Court (BSG), Federal Constitutional Court (BVerfG) and Federal Administrative Court (BVerwG). 
-
-
-|  Split             | # Samples |
-| ---------------------- | ----- |
-| Train                  | 1657048 |
-| Eval                    | 500000 |
-
- Training script: [Fine-tuning script for NER provided by Huggingface](https://github.com/huggingface/transformers/blob/master/examples/token-classification/run_ner_old.py)
-Colab: [How to fine-tune a model for NER using HF scripts](https://colab.research.google.com/drive/156Qrd7NsUHwA3nmQ6gXdZY0NzOvqk9AT?usp=sharing)
-
- Labels covered (and their distribution):
-
-```
-    107 B-AN
-    918 B-EUN
-   2238 B-GRT
-  13282 B-GS
-   1113 B-INN
-    704 B-LD
-    151 B-LDS
-   2490 B-LIT
-    282 B-MRK
-    890 B-ORG
-   1374 B-PER
-   1480 B-RR
-  10046 B-RS
-    401 B-ST
-     68 B-STR
-   1011 B-UN
-    282 B-VO
-    391 B-VS
-   2648 B-VT
-     46 I-AN
-   6925 I-EUN
-   1957 I-GRT
-  70257 I-GS
-   2931 I-INN
-    153 I-LD
-     26 I-LDS
-  28881 I-LIT
-    383 I-MRK
-   1185 I-ORG
-    330 I-PER
-    106 I-RR
- 138938 I-RS
-     34 I-ST
-     55 I-STR
-   1259 I-UN
-   1572 I-VO
-   2488 I-VS
-  11121 I-VT
-1348525 O
-```
- [Annotation Guidelines (German)](https://github.com/elenanereiss/Legal-Entity-Recognition/blob/master/docs/Annotationsrichtlinien.pdf)
-
-
-## Metrics on evaluation set
-
-|                                                      Metric                                                       |  # score  |
-| :------------------------------------------------------------------------------------: | :-------: |
-| F1                                       | **85.67** |
-| Precision                                | **84.35** | 
-| Recall                                   | **87.04** | 
-| Accuracy                                 | **98.46** |
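The label distribution listed earlier in this card can be checked against the split sizes. The sketch below parses the counts exactly as printed, confirms that they add up to the 1657048 training samples, and computes the share of the `O` label. That this imbalance helps explain why accuracy sits well above F1 is an interpretation, not something the card states.

```python
# Label counts copied verbatim from the distribution listed in the card.
LABEL_COUNTS = """
107 B-AN
918 B-EUN
2238 B-GRT
13282 B-GS
1113 B-INN
704 B-LD
151 B-LDS
2490 B-LIT
282 B-MRK
890 B-ORG
1374 B-PER
1480 B-RR
10046 B-RS
401 B-ST
68 B-STR
1011 B-UN
282 B-VO
391 B-VS
2648 B-VT
46 I-AN
6925 I-EUN
1957 I-GRT
70257 I-GS
2931 I-INN
153 I-LD
26 I-LDS
28881 I-LIT
383 I-MRK
1185 I-ORG
330 I-PER
106 I-RR
138938 I-RS
34 I-ST
55 I-STR
1259 I-UN
1572 I-VO
2488 I-VS
11121 I-VT
1348525 O
"""

# Each non-empty line is "<count> <label>".
counts = {}
for line in LABEL_COUNTS.splitlines():
    if line.strip():
        number, label = line.split()
        counts[label] = int(number)

total = sum(counts.values())
o_share = 100 * counts["O"] / total
print(total)              # 1657048
print(round(o_share, 1))  # 81.4
```

Since roughly four out of five tokens are `O`, a tagger can reach high token accuracy while still missing entities, which is why F1 is usually the more informative number here.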
-
-## Model in action
-
-Fast usage with **pipelines**:
-
-```python
-from transformers import pipeline
-
-nlp_ler = pipeline(
-    "ner",
-    model="mrm8488/bert-base-german-finetuned-ler",
-    tokenizer="mrm8488/bert-base-german-finetuned-ler"
-)
-
-text = "Your German legal text here"
-
-nlp_ler(text)
-
-```
-
-> Created by [Manuel Romero/@mrm8488](https://twitter.com/mrm8488)
-
-> Made with <span style="color: #e25555;">&hearts;</span> in Spain
--- a/model_cards/mrm8488/bert-base-spanish-wwm-cased-finetuned-spa-squad2-es/README.md
+++ b/model_cards/mrm8488/bert-base-spanish-wwm-cased-finetuned-spa-squad2-es/README.md
---
-language: es
-thumbnail: https://i.imgur.com/jgBdimh.png
---
-
-# BETO (Spanish BERT) + Spanish SQuAD2.0
-
-This model is provided by [BETO team](https://github.com/dccuchile/beto) and fine-tuned on [SQuAD-es-v2.0](https://github.com/ccasimiro88/TranslateAlignRetrieve) for **Q&A** downstream task.
-
-## Details of the language model('dccuchile/bert-base-spanish-wwm-cased')
-
-Language model ([**'dccuchile/bert-base-spanish-wwm-cased'**](https://github.com/dccuchile/beto/blob/master/README.md)):
-
-BETO is a [BERT model](https://github.com/google-research/bert) trained on a [big Spanish corpus](https://github.com/josecannete/spanish-corpora). BETO is of size similar to a BERT-Base and was trained with the Whole Word Masking technique. Below you find Tensorflow and Pytorch checkpoints for the uncased and cased versions, as well as some results for Spanish benchmarks comparing BETO with [Multilingual BERT](https://github.com/google-research/bert/blob/master/multilingual.md) as well as other (not BERT-based) models.
-
-## Details of the downstream task (Q&A) - Dataset
-[SQuAD-es-v2.0](https://github.com/ccasimiro88/TranslateAlignRetrieve)
-
-| Dataset                | # Q&A |
-| ---------------------- | ----- |
-| SQuAD2.0 Train         | 130 K |
-| SQuAD2.0-es-v2.0       | 111 K |
-| SQuAD2.0 Dev           | 12  K |
-| SQuAD-es-v2.0-small Dev| 69  K |
-
-## Model training
-
-The model was trained on a Tesla P100 GPU and 25GB of RAM with the following command:
-
-```bash
-export SQUAD_DIR=path/to/nl_squad
-python transformers/examples/question-answering/run_squad.py \
-  --model_type bert \
-  --model_name_or_path dccuchile/bert-base-spanish-wwm-cased \
-  --do_train \
-  --do_eval \
-  --do_lower_case \
-  --train_file $SQUAD_DIR/train_nl-v2.0.json \
-  --predict_file $SQUAD_DIR/dev_nl-v2.0.json \
-  --per_gpu_train_batch_size 12 \
-  --learning_rate 3e-5 \
-  --num_train_epochs 2.0 \
-  --max_seq_length 384 \
-  --doc_stride 128 \
-  --output_dir /content/model_output \
-  --save_steps 5000 \
-  --threads 4 \
-  --version_2_with_negative 
-```
-
-## Results:
-
-
-| Metric                 | # Value |
-| ---------------------- | ----- |
-| **Exact**              | **76.5050** |
-| **F1**                 | **86.0781** |
-
-```json
-{
-  "exact": 76.50501430594491,
-  "f1": 86.07818773108252,
-  "total": 69202,
-  "HasAns_exact": 67.93020719738277,
-  "HasAns_f1": 82.37912207996466,
-  "HasAns_total": 45850,
-  "NoAns_exact": 93.34104145255225,
-  "NoAns_f1": 93.34104145255225,
-  "NoAns_total": 23352,
-  "best_exact": 76.51223953064941,
-  "best_exact_thresh": 0.0,
-  "best_f1": 86.08541295578848,
-  "best_f1_thresh": 0.0
-}
-```
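The raw metrics above are internally consistent: the overall `exact` and `f1` values are the averages of the `HasAns_*` and `NoAns_*` values, weighted by the number of questions in each group. The sketch below reproduces that arithmetic from the reported numbers; `weighted_score` is a hypothetical helper, not part of the evaluation script.

```python
def weighted_score(has_ans, has_total, no_ans, no_total):
    """Average two per-group scores, weighted by group size."""
    return (has_ans * has_total + no_ans * no_total) / (has_total + no_total)


# Values copied from the JSON block above.
HAS_TOTAL, NO_TOTAL = 45850, 23352

exact = weighted_score(67.93020719738277, HAS_TOTAL, 93.34104145255225, NO_TOTAL)
f1 = weighted_score(82.37912207996466, HAS_TOTAL, 93.34104145255225, NO_TOTAL)

print(HAS_TOTAL + NO_TOTAL)  # 69202, the reported "total"
print(exact, f1)
```

For unanswerable questions, exact match and F1 coincide, which is why `NoAns_exact` and `NoAns_f1` carry the same value.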
-
-### Model in action (in a Colab Notebook)
-<details>
-
-1.  Set the context and ask some questions:
-
-![Set context and questions](https://media.giphy.com/media/mCIaBpfN0LQcuzkA2F/giphy.gif)
-
-2.  Run predictions:
-
-![Run the model](https://media.giphy.com/media/WT453aptcbCP7hxWTZ/giphy.gif)
-</details>
-
-> Created by [Manuel Romero/@mrm8488](https://twitter.com/mrm8488)
-
-> Made with <span style="color: #e25555;">&hearts;</span> in Spain
--- a/model_cards/mrm8488/bert-italian-finedtuned-squadv1-it-alfa/README.md
+++ b/model_cards/mrm8488/bert-italian-finedtuned-squadv1-it-alfa/README.md
---
-language: it
-thumbnail:
---
-
-# Italian BERT fine-tuned on SQuAD_it v1
-
-[Italian BERT base cased](https://huggingface.co/dbmdz/bert-base-italian-cased) fine-tuned on [italian SQuAD](https://github.com/crux82/squad-it) for **Q&A** downstream task.
-
-## Details of Italian BERT
-
-The source data for the Italian BERT model consists of a recent Wikipedia dump and various texts from the OPUS corpora collection. The final training corpus has a size of 13GB and 2,050,057,573 tokens.
-
-For sentence splitting, we use NLTK (faster compared to spacy). Our cased and uncased models are trained with an initial sequence length of 512 subwords for ~2-3M steps.
-
-For the XXL Italian models, we use the same training data from OPUS and extend it with data from the Italian part of the OSCAR corpus. Thus, the final training corpus has a size of 81GB and 13,138,379,147 tokens.
-More in its official [model card](https://huggingface.co/dbmdz/bert-base-italian-cased)
-
-Created by [Stefan](https://huggingface.co/stefan-it) at [MDZ](https://huggingface.co/dbmdz)
-
-## Details of the downstream task (Q&A) - Dataset 📚 🧐 ❓
-
-[Italian SQuAD v1.1](https://rajpurkar.github.io/SQuAD-explorer/) is derived from the SQuAD dataset and it is obtained through semi-automatic translation of the SQuAD dataset
-into Italian. It represents a large-scale dataset for open question answering processes on factoid questions in Italian.
-**The dataset contains more than 60,000 question/answer pairs derived from the original English dataset.** The dataset is split into training and test sets to support the replicability of the benchmarking of QA systems:
-
- `SQuAD_it-train.json`: it contains training examples derived from the original SQuAD 1.1 training material.
- `SQuAD_it-test.json`: it contains test/benchmarking examples derived from the original SQuAD 1.1 development material.
-
-More details about SQuAD-it can be found in [Croce et al. 2018]. The original paper can be found at this [link](https://link.springer.com/chapter/10.1007/978-3-030-03840-3_29).
-
-## Model training 🏋️‍
-
-The model was trained on a Tesla P100 GPU and 25GB of RAM.
-The script for fine tuning can be found [here](https://github.com/huggingface/transformers/blob/master/examples/question-answering/run_squad.py)
-
-## Results 📝
-
-| Metric | # Value   |
-| ------ | --------- |
-| **EM** | **62.51** |
-| **F1** | **74.16** |
-
-### Raw metrics
-
-```json
-{
-  "exact": 62.5180707057432,
-  "f1": 74.16038329042492,
-  "total": 7609,
-  "HasAns_exact": 62.5180707057432,
-  "HasAns_f1": 74.16038329042492,
-  "HasAns_total": 7609,
-  "best_exact": 62.5180707057432,
-  "best_exact_thresh": 0.0,
-  "best_f1": 74.16038329042492,
-  "best_f1_thresh": 0.0
-}
-```
-
-## Comparison ⚖️
-
-| Model                                                                                                                            | EM        | F1 score  |
-| -------------------------------------------------------------------------------------------------------------------------------- | --------- | --------- |
-| [DrQA-it trained on SQuAD-it ](https://github.com/crux82/squad-it/blob/master/README.md#evaluating-a-neural-model-over-squad-it) | 56.1      | 65.9      |
-| This one                                                                                                                         | **62.51** | **74.16** |
-
-## Model in action 🚀
-
-Fast usage with **pipelines** 🧪
-
-```python
-from transformers import pipeline
-
-nlp_qa = pipeline(
-    'question-answering',
-    model='mrm8488/bert-italian-finedtuned-squadv1-it-alfa',
-    tokenizer='mrm8488/bert-italian-finedtuned-squadv1-it-alfa'
-)
-
-nlp_qa(
-    {
-        'question': 'Per quale lingua stai lavorando?',
-        'context': 'Manuel Romero è colaborando attivamente con HF / trasformatori per il trader del poder de las últimas ' +
-       'técnicas di procesamiento de lenguaje natural al idioma español'
-    }
-)
-
-# Output: {'answer': 'español', 'end': 174, 'score': 0.9925341537498156, 'start': 168}
-```
-
-> Created by [Manuel Romero/@mrm8488](https://twitter.com/mrm8488) | [LinkedIn](https://www.linkedin.com/in/manuel-romero-cs/)
-
-> Made with <span style="color: #e25555;">&hearts;</span> in Spain
-
-Dataset citation
-
-<details>
-@InProceedings{10.1007/978-3-030-03840-3_29,
-	author="Croce, Danilo and Zelenanska, Alexandra and Basili, Roberto",
-	editor="Ghidini, Chiara and Magnini, Bernardo and Passerini, Andrea and Traverso, Paolo",
-	title="Neural Learning for Question Answering in Italian",
-	booktitle="AI*IA 2018 -- Advances in Artificial Intelligence",
-	year="2018",
-	publisher="Springer International Publishing",
-	address="Cham",
-	pages="389--402",
-	isbn="978-3-030-03840-3"
-}
-</details>
--- a/model_cards/mrm8488/bert-medium-finetuned-squadv2/README.md
+++ b/model_cards/mrm8488/bert-medium-finetuned-squadv2/README.md
---
-language: en
-thumbnail:
---
-
-# BERT-Medium fine-tuned on SQuAD v2
-
-[BERT-Medium](https://github.com/google-research/bert/) created by [Google Research](https://github.com/google-research) and fine-tuned on [SQuAD 2.0](https://rajpurkar.github.io/SQuAD-explorer/) for **Q&A** downstream task.
-
-**Model size** (after training): **157.46 MB**
-
-## Details of BERT-Small and its 'family' (from their documentation)
-
-Released on March 11th, 2020
-
-This model is part of a set of 24 smaller BERT models (English only, uncased, trained with WordPiece masking) referenced in [Well-Read Students Learn Better: On the Importance of Pre-training Compact Models](https://arxiv.org/abs/1908.08962).
-
-The smaller BERT models are intended for environments with restricted computational resources. They can be fine-tuned in the same manner as the original BERT models. However, they are most effective in the context of knowledge distillation, where the fine-tuning labels are produced by a larger and more accurate teacher.
-
-## Details of the downstream task (Q&A) - Dataset
-
-[SQuAD2.0](https://rajpurkar.github.io/SQuAD-explorer/) combines the 100,000 questions in SQuAD1.1 with over 50,000 unanswerable questions written adversarially by crowdworkers to look similar to answerable ones. To do well on SQuAD2.0, systems must not only answer questions when possible, but also determine when no answer is supported by the paragraph and abstain from answering.
-
-| Dataset  | Split | # samples |
-| -------- | ----- | --------- |
-| SQuAD2.0 | train | 130k      |
-| SQuAD2.0 | eval  | 12.3k     |
-
-## Model training
-
-The model was trained on a Tesla P100 GPU and 25GB of RAM.
-The script for fine tuning can be found [here](https://github.com/huggingface/transformers/blob/master/examples/question-answering/run_squad.py)
-
-## Results:
-
-| Metric | # Value   |
-| ------ | --------- |
-| **EM** | **65.95** |
-| **F1** | **70.11** |
-
-### Raw metrics from benchmark included in training script:
-
-```json
-{
-  "exact": 65.95637159942727,
-  "f1": 70.11632254245896,
-  "total": 11873,
-  "HasAns_exact": 67.79689608636977,
-  "HasAns_f1": 76.12872765631123,
-  "HasAns_total": 5928,
-  "NoAns_exact": 64.12111017661901,
-  "NoAns_f1": 64.12111017661901,
-  "NoAns_total": 5945,
-  "best_exact": 65.96479407058031,
-  "best_exact_thresh": 0.0,
-  "best_f1": 70.12474501361196,
-  "best_f1_thresh": 0.0
-}
-```
-
-## Comparison:
-
-| Model                                                                                         | EM        | F1 score  | SIZE (MB) |
-| --------------------------------------------------------------------------------------------- | --------- | --------- | --------- |
-| [bert-tiny-finetuned-squadv2](https://huggingface.co/mrm8488/bert-tiny-finetuned-squadv2)     | 48.60     | 49.73     | **16.74** |
-| [bert-tiny-5-finetuned-squadv2](https://huggingface.co/mrm8488/bert-tiny-5-finetuned-squadv2) | 57.12     | 60.86     | 24.34     |
-| [bert-mini-finetuned-squadv2](https://huggingface.co/mrm8488/bert-mini-finetuned-squadv2)     | 56.31     | 59.65     | 42.63     |
-| [bert-mini-5-finetuned-squadv2](https://huggingface.co/mrm8488/bert-mini-5-finetuned-squadv2) | 63.51     | 66.78     | 66.76     |
-| [bert-small-finetuned-squadv2](https://huggingface.co/mrm8488/bert-small-finetuned-squadv2)   | 60.49     | 64.21     | 109.74    |
-| [bert-medium-finetuned-squadv2](https://huggingface.co/mrm8488/bert-medium-finetuned-squadv2) | **65.95** | **70.11** | 157.46    |
-
-## Model in action
-
-Fast usage with **pipelines**:
-
-```python
-from transformers import pipeline
-
-qa_pipeline = pipeline(
-    "question-answering",
-    model="mrm8488/bert-small-finetuned-squadv2",
-    tokenizer="mrm8488/bert-small-finetuned-squadv2"
-)
-
-qa_pipeline({
-    'context': "Manuel Romero has been working hardly in the repository hugginface/transformers lately",
-    'question': "Who has been working hard for hugginface/transformers lately?"
-
-})
-
-# Output:
-```
-
-```json
-{
-  "answer": "Manuel Romero",
-  "end": 13,
-  "score": 0.9939319924374637,
-  "start": 0
-}
-```
-
-### Yes! That was easy 🎉 Let's try with another example
-
-```python
-qa_pipeline({
-    'context': "Manuel Romero has been working remotely in the repository hugginface/transformers lately",
-    'question': "How has been working Manuel Romero?"
-})
-
-# Output:
-```
-
-```json
-{ "answer": "remotely", "end": 39, "score": 0.3612058272768017, "start": 31 }
-```
-
-### It works!! 🎉 🎉 🎉
-
-> Created by [Manuel Romero/@mrm8488](https://twitter.com/mrm8488) | [LinkedIn](https://www.linkedin.com/in/manuel-romero-cs/)
-
-> Made with <span style="color: #e25555;">&hearts;</span> in Spain
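The EM and F1 figures reported in the card above are SQuAD-style scores. As a rough illustration, here is a minimal sketch of how the two metrics can be computed for a single prediction. It assumes the usual SQuAD answer normalization (lowercasing, removing punctuation and English articles); the function names are hypothetical and this is not the official evaluation script.

```python
import re
import string
from collections import Counter

PUNCTUATION = set(string.punctuation)


def normalize_answer(text):
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in PUNCTUATION)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, truth):
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(truth))


def f1_score(prediction, truth):
    """Token-level F1 between the normalized prediction and ground truth."""
    pred_tokens = normalize_answer(prediction).split()
    true_tokens = normalize_answer(truth).split()
    # Unanswerable case: an empty side scores 1.0 only if both sides are empty.
    if not pred_tokens or not true_tokens:
        return float(pred_tokens == true_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(true_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(true_tokens)
    return 2 * precision * recall / (precision + recall)


print(exact_match("Manuel Romero", "manuel romero."))
print(f1_score("Manuel Romero", "Romero"))
```

Under this definition an unanswerable question can only score 0 or 1 on both metrics, which is consistent with `NoAns_exact` and `NoAns_f1` being identical in the raw metrics above.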
--- a/model_cards/mrm8488/bert-mini-finetuned-squadv2/README.md
+++ b/model_cards/mrm8488/bert-mini-finetuned-squadv2/README.md
---
-language: en
-thumbnail:
---
-
-# BERT-Mini fine-tuned on SQuAD v2
-
-[BERT-Mini](https://github.com/google-research/bert/), created by [Google Research](https://github.com/google-research) and fine-tuned on [SQuAD 2.0](https://rajpurkar.github.io/SQuAD-explorer/) for the **Q&A** downstream task.
-
-**Model size** (after training): **42.63 MB**
-
-## Details of BERT-Mini and its 'family' (from their documentation)
-
-Released on March 11th, 2020
-
-This model is one of 24 smaller BERT models (English only, uncased, trained with WordPiece masking) referenced in [Well-Read Students Learn Better: On the Importance of Pre-training Compact Models](https://arxiv.org/abs/1908.08962).
-
-The smaller BERT models are intended for environments with restricted computational resources. They can be fine-tuned in the same manner as the original BERT models. However, they are most effective in the context of knowledge distillation, where the fine-tuning labels are produced by a larger and more accurate teacher.
-
-## Details of the downstream task (Q&A) - Dataset
-
-[SQuAD2.0](https://rajpurkar.github.io/SQuAD-explorer/) combines the 100,000 questions in SQuAD1.1 with over 50,000 unanswerable questions written adversarially by crowdworkers to look similar to answerable ones. To do well on SQuAD2.0, systems must not only answer questions when possible, but also determine when no answer is supported by the paragraph and abstain from answering.
-
-| Dataset  | Split | # samples |
-| -------- | ----- | --------- |
-| SQuAD2.0 | train | 130k      |
-| SQuAD2.0 | eval  | 12.3k     |
-
-## Model training
-
-The model was trained on a Tesla P100 GPU with 25 GB of RAM.
-The fine-tuning script can be found [here](https://github.com/huggingface/transformers/blob/master/examples/question-answering/run_squad.py).
-
-## Results:
-
-| Metric | # Value   |
-| ------ | --------- |
-| **EM** | **56.31** |
-| **F1** | **59.65** |
-
-## Comparison:
-
-| Model                                                                                         | EM        | F1 score  | SIZE (MB) |
-| --------------------------------------------------------------------------------------------- | --------- | --------- | --------- |
-| [bert-tiny-finetuned-squadv2](https://huggingface.co/mrm8488/bert-tiny-finetuned-squadv2)     | 48.60     | 49.73     | **16.74** |
-| [bert-tiny-5-finetuned-squadv2](https://huggingface.co/mrm8488/bert-tiny-5-finetuned-squadv2) | 57.12     | 60.86     | 24.34     |
-| [bert-mini-finetuned-squadv2](https://huggingface.co/mrm8488/bert-mini-finetuned-squadv2)     | 56.31     | 59.65     | 42.63     |
-| [bert-mini-5-finetuned-squadv2](https://huggingface.co/mrm8488/bert-mini-5-finetuned-squadv2) | **63.51** | **66.78** | 66.76     |
-
-## Model in action
-
-Fast usage with **pipelines**:
-
-```python
-from transformers import pipeline
-
-qa_pipeline = pipeline(
-    "question-answering",
-    model="mrm8488/bert-mini-finetuned-squadv2",
-    tokenizer="mrm8488/bert-mini-finetuned-squadv2"
-)
-
-qa_pipeline({
-    'context': "Manuel Romero has been working hardly in the repository hugginface/transformers lately",
-    'question': "Who has been working hard for hugginface/transformers lately?"
-
-})
-
-# Output:
-```
-
-```json
-{
-  "answer": "Manuel Romero",
-  "end": 13,
-  "score": 0.9676484207783673,
-  "start": 0
-}
-```
-
-### Yes! That was easy 🎉 Let's try with another example
-
-```python
-qa_pipeline({
-    'context': "Manuel Romero has been working hardly in the repository hugginface/transformers lately",
-    'question': "For which company has worked Manuel Romero?"
-})
-
-# Output:
-```
-
-```json
-{
-  "answer": "hugginface/transformers",
-  "end": 79,
-  "score": 0.5301655914731853,
-  "start": 56
-}
-```
-
-### It works!! 🎉 🎉 🎉
-
-> Created by [Manuel Romero/@mrm8488](https://twitter.com/mrm8488) | [LinkedIn](https://www.linkedin.com/in/manuel-romero-cs/)
-
-> Made with <span style="color: #e25555;">&hearts;</span> in Spain
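The `start` and `end` fields in the pipeline outputs above look like character offsets into the context, with `end` exclusive. A minimal sketch to check that reading against the two outputs shown in the card (the helper name `span_matches` is hypothetical, and the offset convention is an assumption inferred from these outputs):

```python
context = (
    "Manuel Romero has been working hardly in the repository "
    "hugginface/transformers lately"
)

# Answers and offsets copied from the outputs shown in the card (scores omitted).
results = [
    {"answer": "Manuel Romero", "start": 0, "end": 13},
    {"answer": "hugginface/transformers", "start": 56, "end": 79},
]


def span_matches(context, result):
    """Return True if slicing the context by the offsets reproduces the answer."""
    return context[result["start"]:result["end"]] == result["answer"]


checks = [span_matches(context, r) for r in results]
print(checks)  # [True, True]
```

This is also why the context strings are best left exactly as written, typos included: editing them would shift the offsets the card reports.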
--- a/model_cards/mrm8488/bert-mini2bert-mini-finetuned-cnn_daily_mail-summarization/README.md
+++ b/model_cards/mrm8488/bert-mini2bert-mini-finetuned-cnn_daily_mail-summarization/README.md
---
-language: en
-license: apache-2.0
-datasets:
- cnn_dailymail
-tags:
- summarization
---
-
-# Bert-mini2Bert-mini Summarization with 🤗EncoderDecoder Framework
-
-This model is a warm-started *BERT2BERT* ([mini](https://huggingface.co/google/bert_uncased_L-4_H-256_A-4)) model fine-tuned on the *CNN/Dailymail* summarization dataset.
-
-The model achieves a **16.51** ROUGE-2 score on *CNN/Dailymail*'s test dataset.
-
-For more details on how the model was fine-tuned, see [this notebook](https://colab.research.google.com/drive/1Ekd5pUeCX7VOrMx94_czTkwNtLN32Uyu?usp=sharing).
-
-## Results on test set 📝
-
-| Metric | # Value   |
-| ------ | --------- |
-| **ROUGE-2** | **16.51** |
-
-
-
-## Model in Action 🚀
-
-```python
-from transformers import BertTokenizerFast, EncoderDecoderModel
-import torch
-device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
-tokenizer = BertTokenizerFast.from_pretrained('mrm8488/bert-mini2bert-mini-finetuned-cnn_daily_mail-summarization')
-model = EncoderDecoderModel.from_pretrained('mrm8488/bert-mini2bert-mini-finetuned-cnn_daily_mail-summarization').to(device)
-
-def generate_summary(text):
-    # cut off at BERT max length 512
-    inputs = tokenizer([text], padding="max_length", truncation=True, max_length=512, return_tensors="pt")
-    input_ids = inputs.input_ids.to(device)
-    attention_mask = inputs.attention_mask.to(device)
-
-    output = model.generate(input_ids, attention_mask=attention_mask)
-
-    return tokenizer.decode(output[0], skip_special_tokens=True)
-  
-text = "your text to be summarized here..."
-generate_summary(text)
-```
-
-> Created by [Manuel Romero/@mrm8488](https://twitter.com/mrm8488) | [LinkedIn](https://www.linkedin.com/in/manuel-romero-cs/)
-
-> Made with <span style="color: #e25555;">&hearts;</span> in Spain
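The card above reports a ROUGE-2 score without saying which variant (precision, recall or F-measure) or which implementation was used. For intuition only, here is a simplified sketch of a ROUGE-2 F-measure based on bigram overlap. It assumes lowercased whitespace tokenization with no stemming, so its numbers will not match a full ROUGE implementation; the function names are hypothetical.

```python
from collections import Counter


def bigrams(text):
    """Count the bigrams of a lowercased, whitespace-tokenized text."""
    tokens = text.lower().split()
    return Counter(zip(tokens, tokens[1:]))


def rouge2_f(candidate, reference):
    """Simplified ROUGE-2 F-measure: harmonic mean of bigram precision and recall."""
    cand, ref = bigrams(candidate), bigrams(reference)
    overlap = sum((cand & ref).values())  # clipped bigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


print(rouge2_f("the cat sat", "the cat ran"))
```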
--- a/model_cards/mrm8488/bert-multi-cased-finedtuned-xquad-tydiqa-goldp/README.md
+++ b/model_cards/mrm8488/bert-multi-cased-finedtuned-xquad-tydiqa-goldp/README.md
---
-language: multilingual
-thumbnail:
---
-
-# A model fine-tuned on the GoldP task of the TyDi QA dataset
-
-This model starts from [bert-multi-cased-finetuned-xquadv1](https://huggingface.co/mrm8488/bert-multi-cased-finetuned-xquadv1) and is fine-tuned on the [TyDi QA](https://github.com/google-research-datasets/tydiqa) dataset for the Gold Passage task [(GoldP)](https://github.com/google-research-datasets/tydiqa#the-tasks).
-
-## Details of the language model
-The base language model [(bert-multi-cased-finetuned-xquadv1)](https://huggingface.co/mrm8488/bert-multi-cased-finetuned-xquadv1) is a fine-tuned version of [bert-base-multilingual-cased](https://huggingface.co/bert-base-multilingual-cased) for the **Q&A** downstream task.
-
-
-## Details of the TyDi QA dataset
-
-TyDi QA contains 200k human-annotated question-answer pairs in 11 Typologically Diverse languages, written without seeing the answer and without the use of translation, and is designed for the **training and evaluation** of automatic question answering systems. The [TyDi QA repository](https://github.com/google-research-datasets/tydiqa) provides evaluation code and a baseline system for the dataset. See also <https://ai.google.com/research/tydiqa>.
-
-
-## Details of the downstream task (Gold Passage or GoldP aka the secondary task)
-
-Given a passage that is guaranteed to contain the answer, predict the single contiguous span of characters that answers the question. The gold passage task differs from the [primary task](https://github.com/google-research-datasets/tydiqa/blob/master/README.md#the-tasks) in several ways:
-*   only the gold answer passage is provided rather than the entire Wikipedia article;
-*   unanswerable questions have been discarded, similar to MLQA and XQuAD;
-*   we evaluate with the SQuAD 1.1 metrics like XQuAD; and
-*   Thai and Japanese are removed since the lack of whitespace breaks some tools.
-
-
-## Model training
-
-The model was fine-tuned on a Tesla P100 GPU with 25 GB of RAM.
-The fine-tuning command was:
-
-```bash
-python run_squad.py \
-  --model_type bert \
-  --model_name_or_path mrm8488/bert-multi-cased-finetuned-xquadv1 \
-  --do_train \
-  --do_eval \
-  --train_file /content/dataset/train.json \
-  --predict_file /content/dataset/dev.json \
-  --per_gpu_train_batch_size 24 \
-  --per_gpu_eval_batch_size 24 \
-  --learning_rate 3e-5 \
-  --num_train_epochs 2.5 \
-  --max_seq_length 384 \
-  --doc_stride 128 \
-  --output_dir /content/model_output \
-  --overwrite_output_dir \
-  --save_steps 5000 \
-  --threads 40
-  ```
-
-## Global Results (dev set):
-
-| Metric    | # Value     |
-| --------- | ----------- |
-| **Exact** | **71.06** |
-| **F1**    | **82.16** |
-
-## Specific Results (per language):
-
-| Language   | # Samples | Exact | F1    |
-| ---------- | --------- | ----- | ----- |
-| Arabic     | 1314      | 73.29 | 84.72 |
-| Bengali    | 180       | 64.60 | 77.84 |
-| English    | 654       | 72.12 | 82.24 |
-| Finnish    | 1031      | 70.14 | 80.36 |
-| Indonesian | 773       | 77.25 | 86.36 |
-| Korean     | 414       | 68.92 | 70.95 |
-| Russian    | 1079      | 62.65 | 78.55 |
-| Swahili    | 596       | 80.11 | 86.18 |
-| Telugu     | 874       | 71.00 | 84.24 |
-
-
-> Created by [Manuel Romero/@mrm8488](https://twitter.com/mrm8488)
-
-> Made with <span style="color: #e25555;">&hearts;</span> in Spain
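The per-language table above can be aggregated in more than one way. The card does not say how the global Exact score was computed, so the sketch below simply shows two common options, a plain (macro) average over languages and a sample-weighted average, using the numbers from the table. Neither is claimed to reproduce the reported global figure exactly.

```python
# (language, number of samples, Exact score) copied from the per-language table.
per_language = [
    ("Arabic", 1314, 73.29),
    ("Bengali", 180, 64.60),
    ("English", 654, 72.12),
    ("Finnish", 1031, 70.14),
    ("Indonesian", 773, 77.25),
    ("Korean", 414, 68.92),
    ("Russian", 1079, 62.65),
    ("Swahili", 596, 80.11),
    ("Telugu", 874, 71.00),
]

# Macro average: every language counts equally.
macro_exact = sum(score for _, _, score in per_language) / len(per_language)

# Weighted average: every sample counts equally.
total_samples = sum(n for _, n, _ in per_language)
weighted_exact = sum(n * score for _, n, score in per_language) / total_samples

print(f"macro: {macro_exact:.2f}, weighted: {weighted_exact:.2f}")
```

The two aggregates differ because the languages contribute very different numbers of samples, which is worth keeping in mind when comparing a global score across models.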
--- a/model_cards/mrm8488/bert-multi-cased-finetuned-xquadv1/README.md
+++ b/model_cards/mrm8488/bert-multi-cased-finetuned-xquadv1/README.md
---
-language: multilingual
-thumbnail:
---
-
-# BERT (base-multilingual-cased) fine-tuned for multilingual Q&A
-
-This model was created by [Google](https://github.com/google-research/bert/blob/master/multilingual.md) and fine-tuned on [XQuAD](https://github.com/deepmind/xquad)-like data for the multilingual (11 languages) **Q&A** downstream task.
-
-## Details of the language model (`bert-base-multilingual-cased`)
-
-[Language model](https://github.com/google-research/bert/blob/master/multilingual.md)
-
-| Languages | Heads | Layers | Hidden | Params |
-| --------- | ----- | ------ | ------ | ------ |
-| 104       | 12    | 12     | 768    | 110 M  |
-
-## Details of the downstream task (multilingual Q&A) - Dataset
-
-Deepmind [XQuAD](https://github.com/deepmind/xquad)
-
-Languages covered:
-
-*   Arabic: `ar`
-*   German: `de`
-*   Greek: `el`
-*   English: `en`
-*   Spanish: `es`
-*   Hindi: `hi`
-*   Russian: `ru`
-*   Thai: `th`
-*   Turkish: `tr`
-*   Vietnamese: `vi`
-*   Chinese: `zh`
-
-As the dataset is based on SQuAD v1.1, there are no unanswerable questions in the data. We chose this
-setting so that models can focus on cross-lingual transfer.
-
-We show the average number of tokens per paragraph, question, and answer for each language in the
-table below. The statistics were obtained using [Jieba](https://github.com/fxsjy/jieba) for Chinese
-and the [Moses tokenizer](https://github.com/moses-smt/mosesdecoder/blob/master/scripts/tokenizer/tokenizer.perl)
-for the other languages.
-
-|           |  en   |  es   |  de   |  el   |  ru   |  tr   |  ar   |  vi   |  th   |  zh   |  hi   |
-| --------- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
-| Paragraph | 142.4 | 160.7 | 139.5 | 149.6 | 133.9 | 126.5 | 128.2 | 191.2 | 158.7 | 147.6 | 232.4 |
-| Question  | 11.5  | 13.4  | 11.0  | 11.7  | 10.0  |  9.8  | 10.7  | 14.8  | 11.5  | 10.5  | 18.7  |
-| Answer    |  3.1  |  3.6  |  3.0  |  3.3  |  3.1  |  3.1  |  3.1  |  4.5  |  4.1  |  3.5  |  5.6  |
-
-Citation:
-
-<details>
-
-```bibtex
-@article{Artetxe:etal:2019,
-      author    = {Mikel Artetxe and Sebastian Ruder and Dani Yogatama},
-      title     = {On the cross-lingual transferability of monolingual representations},
-      journal   = {CoRR},
-      volume    = {abs/1910.11856},
-      year      = {2019},
-      archivePrefix = {arXiv},
-      eprint    = {1910.11856}
-}
-```
-
-</details>
-
-**XQuAD** is an evaluation-only dataset, so I used data augmentation techniques (scraping, neural machine translation, etc.) to obtain more samples, then split the result into a train set and a test set. The test set contains the same number of samples for each language. The resulting sizes are:
-
-| Dataset     | # samples |
-| ----------- | --------- |
-| XQuAD train | 50 K      |
-| XQuAD test  | 8 K       |
-
-## Model training
-
-The model was trained on a Tesla P100 GPU with 25 GB of RAM.
-The fine-tuning script can be found [here](https://github.com/huggingface/transformers/blob/master/examples/distillation/run_squad_w_distillation.py).
-
-
-## Model in action
-
-Fast usage with **pipelines**:
-
-```python
-from transformers import pipeline
-
-qa_pipeline = pipeline(
-    "question-answering",
-    model="mrm8488/bert-multi-cased-finetuned-xquadv1",
-    tokenizer="mrm8488/bert-multi-cased-finetuned-xquadv1"
-)
-
-
-# context (English translation): Coronavirus is seeding panic in the West because it expands so fast.
-
-# question (English translation): Where is Coronavirus seeding panic?
-qa_pipeline({
-    'context': "कोरोनावायरस पश्चिम में आतंक बो रहा है क्योंकि यह इतनी तेजी से फैलता है।",
-    'question': "कोरोनावायरस घबराहट कहां है?"
-    
-})
-# output: {'answer': 'पश्चिम', 'end': 18, 'score': 0.7037217439689059, 'start': 12}
-
-qa_pipeline({
-    'context': "Manuel Romero has been working hardly in the repository hugginface/transformers lately",
-    'question': "Who has been working hard for hugginface/transformers lately?"
-    
-})
-# output: {'answer': 'Manuel Romero', 'end': 13, 'score': 0.7254485993702389, 'start': 0}
-
-qa_pipeline({
-    'context': "Manuel Romero a travaillé à peine dans le référentiel hugginface / transformers ces derniers temps",
-    'question': "Pour quel référentiel a travaillé Manuel Romero récemment?"
-    
-})
-#output: {'answer': 'hugginface / transformers', 'end': 79, 'score': 0.6482061613915384, 'start': 54}
-```
-![model in action](https://media.giphy.com/media/MBlire8Wj7ng73VBQ5/giphy.gif)
-
-Try it on a Colab:
-
-<a href="https://colab.research.google.com/github/mrm8488/shared_colab_notebooks/blob/master/Try_mrm8488_xquad_finetuned_model.ipynb" target="_parent"><img src="https://camo.githubusercontent.com/52feade06f2fecbf006889a904d221e6a730c194/68747470733a2f2f636f6c61622e72657365617263682e676f6f676c652e636f6d2f6173736574732f636f6c61622d62616467652e737667" alt="Open In Colab" data-canonical-src="https://colab.research.google.com/assets/colab-badge.svg"></a>
-
-
-
-> Created by [Manuel Romero/@mrm8488](https://twitter.com/mrm8488)
-
-> Made with <span style="color: #e25555;">&hearts;</span> in Spain
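The card above says the test set holds the same number of samples for each language but does not show how the split was made. One way to get that property is sketched below. This is an illustration under stated assumptions, not the author's procedure: the `lang` key, the function name `balanced_split` and the fixed seed are all hypothetical.

```python
import random
from collections import defaultdict


def balanced_split(examples, per_language, seed=0):
    """Split examples so the test set has exactly `per_language` items per language.

    `examples` is a list of dicts carrying a (hypothetical) "lang" key.
    Everything not drawn for the test set goes to the train set.
    """
    rng = random.Random(seed)
    by_lang = defaultdict(list)
    for example in examples:
        by_lang[example["lang"]].append(example)

    train, test = [], []
    for lang, items in by_lang.items():
        if len(items) < per_language:
            raise ValueError(f"not enough samples for language {lang!r}")
        shuffled = items[:]  # copy so the caller's list is left untouched
        rng.shuffle(shuffled)
        test.extend(shuffled[:per_language])
        train.extend(shuffled[per_language:])
    return train, test


# Toy data with an uneven number of samples per language.
toy = (
    [{"lang": "en", "id": i} for i in range(10)]
    + [{"lang": "hi", "id": i} for i in range(7)]
    + [{"lang": "th", "id": i} for i in range(5)]
)
train_set, test_set = balanced_split(toy, per_language=2)
print(len(train_set), len(test_set))
```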
--- a/model_cards/mrm8488/bert-multi-uncased-finetuned-xquadv1/README.md
+++ b/model_cards/mrm8488/bert-multi-uncased-finetuned-xquadv1/README.md
---
-language: multilingual
-thumbnail:
---
-
-# BERT (base-multilingual-uncased) fine-tuned for multilingual Q&A
-
-This model was created by [Google](https://github.com/google-research/bert/blob/master/multilingual.md) and fine-tuned on [XQuAD](https://github.com/deepmind/xquad)-like data for the multilingual (11 languages) **Q&A** downstream task.
-
-## Details of the language model (`bert-base-multilingual-uncased`)
-
-[Language model](https://github.com/google-research/bert/blob/master/multilingual.md)
-
-| Languages | Heads | Layers | Hidden | Params |
-| --------- | ----- | ------ | ------ | ------ |
-| 102       | 12    | 12     | 768    | 110 M  |
-
-## Details of the downstream task (multilingual Q&A) - Dataset
-
-Deepmind [XQuAD](https://github.com/deepmind/xquad)
-
-Languages covered:
-
-*   Arabic: `ar`
-*   German: `de`
-*   Greek: `el`
-*   English: `en`
-*   Spanish: `es`
-*   Hindi: `hi`
-*   Russian: `ru`
-*   Thai: `th`
-*   Turkish: `tr`
-*   Vietnamese: `vi`
-*   Chinese: `zh`
-
-As the dataset is based on SQuAD v1.1, there are no unanswerable questions in the data. We chose this
-setting so that models can focus on cross-lingual transfer.
-
-We show the average number of tokens per paragraph, question, and answer for each language in the
-table below. The statistics were obtained using [Jieba](https://github.com/fxsjy/jieba) for Chinese
-and the [Moses tokenizer](https://github.com/moses-smt/mosesdecoder/blob/master/scripts/tokenizer/tokenizer.perl)
-for the other languages.
-
-|           |  en   |  es   |  de   |  el   |  ru   |  tr   |  ar   |  vi   |  th   |  zh   |  hi   |
-| --------- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
-| Paragraph | 142.4 | 160.7 | 139.5 | 149.6 | 133.9 | 126.5 | 128.2 | 191.2 | 158.7 | 147.6 | 232.4 |
-| Question  | 11.5  | 13.4  | 11.0  | 11.7  | 10.0  |  9.8  | 10.7  | 14.8  | 11.5  | 10.5  | 18.7  |
-| Answer    |  3.1  |  3.6  |  3.0  |  3.3  |  3.1  |  3.1  |  3.1  |  4.5  |  4.1  |  3.5  |  5.6  |
-
-Citation:
-
-<details>
-
-```bibtex
-@article{Artetxe:etal:2019,
-      author    = {Mikel Artetxe and Sebastian Ruder and Dani Yogatama},
-      title     = {On the cross-lingual transferability of monolingual representations},
-      journal   = {CoRR},
-      volume    = {abs/1910.11856},
-      year      = {2019},
-      archivePrefix = {arXiv},
-      eprint    = {1910.11856}
-}
-```
-
-</details>
-
-**XQuAD** is an evaluation-only dataset, so I used data augmentation techniques (scraping, neural machine translation, etc.) to obtain more samples, then split the result into a train set and a test set. The test set contains the same number of samples for each language. The resulting sizes are:
-
-| Dataset     | # samples |
-| ----------- | --------- |
-| XQuAD train | 50 K      |
-| XQuAD test  | 8 K       |
-
-## Model training
-
-The model was trained on a Tesla P100 GPU with 25 GB of RAM.
-The fine-tuning script can be found [here](https://github.com/huggingface/transformers/blob/master/examples/distillation/run_squad_w_distillation.py).
-
-
-## Model in action
-
-Fast usage with **pipelines**:
-
-```python
-from transformers import pipeline
-
-qa_pipeline = pipeline(
-    "question-answering",
-    model="mrm8488/bert-multi-uncased-finetuned-xquadv1",
-    tokenizer="mrm8488/bert-multi-uncased-finetuned-xquadv1"
-)
-
-
-# context (English translation): Coronavirus is seeding panic in the West because it expands so fast.
-
-# question (English translation): Where is Coronavirus seeding panic?
-qa_pipeline({
-    'context': "कोरोनावायरस पश्चिम में आतंक बो रहा है क्योंकि यह इतनी तेजी से फैलता है।",
-    'question': "कोरोनावायरस घबराहट कहां है?"
-    
-})
-# output: {'answer': 'पश्चिम', 'end': 18, 'score': 0.7037217439689059, 'start': 12}
-
-qa_pipeline({
-    'context': "Manuel Romero has been working hardly in the repository hugginface/transformers lately",
-    'question': "Who has been working hard for hugginface/transformers lately?"
-    
-})
-# output: {'answer': 'Manuel Romero', 'end': 13, 'score': 0.7254485993702389, 'start': 0}
-
-qa_pipeline({
-    'context': "Manuel Romero a travaillé à peine dans le référentiel hugginface / transformers ces derniers temps",
-    'question': "Pour quel référentiel a travaillé Manuel Romero récemment?"
-    
-})
-#output: {'answer': 'hugginface / transformers', 'end': 79, 'score': 0.6482061613915384, 'start': 54}
-```
-![model in action](https://media.giphy.com/media/MBlire8Wj7ng73VBQ5/giphy.gif)
-
-Try it on a Colab:
-
-<a href="https://colab.research.google.com/github/mrm8488/shared_colab_notebooks/blob/master/Try_mrm8488_xquad_finetuned_uncased_model.ipynb" target="_parent"><img src="https://camo.githubusercontent.com/52feade06f2fecbf006889a904d221e6a730c194/68747470733a2f2f636f6c61622e72657365617263682e676f6f676c652e636f6d2f6173736574732f636f6c61622d62616467652e737667" alt="Open In Colab" data-canonical-src="https://colab.research.google.com/assets/colab-badge.svg"></a>
-
-
-
-> Created by [Manuel Romero/@mrm8488](https://twitter.com/mrm8488)
-
-> Made with <span style="color: #e25555;">&hearts;</span> in Spain
--- a/model_cards/mrm8488/bert-small-finetuned-squadv2/README.md
+++ b/model_cards/mrm8488/bert-small-finetuned-squadv2/README.md
---
-language: en
-thumbnail:
---
-
-# BERT-Small fine-tuned on SQuAD v2
-
-[BERT-Small](https://github.com/google-research/bert/), created by [Google Research](https://github.com/google-research) and fine-tuned on [SQuAD 2.0](https://rajpurkar.github.io/SQuAD-explorer/) for the **Q&A** downstream task.
-
-**Model size** (after training): **109.74 MB**
-
-## Details of BERT-Small and its 'family' (from their documentation)
-
-Released on March 11th, 2020
-
-This model is one of 24 smaller BERT models (English only, uncased, trained with WordPiece masking) referenced in [Well-Read Students Learn Better: On the Importance of Pre-training Compact Models](https://arxiv.org/abs/1908.08962).
-
-The smaller BERT models are intended for environments with restricted computational resources. They can be fine-tuned in the same manner as the original BERT models. However, they are most effective in the context of knowledge distillation, where the fine-tuning labels are produced by a larger and more accurate teacher.
-
-## Details of the downstream task (Q&A) - Dataset
-
-[SQuAD2.0](https://rajpurkar.github.io/SQuAD-explorer/) combines the 100,000 questions in SQuAD1.1 with over 50,000 unanswerable questions written adversarially by crowdworkers to look similar to answerable ones. To do well on SQuAD2.0, systems must not only answer questions when possible, but also determine when no answer is supported by the paragraph and abstain from answering.
-
-| Dataset  | Split | # samples |
-| -------- | ----- | --------- |
-| SQuAD2.0 | train | 130k      |
-| SQuAD2.0 | eval  | 12.3k     |
-
-## Model training
-
-The model was trained on a Tesla P100 GPU with 25 GB of RAM.
-The fine-tuning script can be found [here](https://github.com/huggingface/transformers/blob/master/examples/question-answering/run_squad.py).
-
-## Results:
-
-| Metric | # Value   |
-| ------ | --------- |
-| **EM** | **60.49** |
-| **F1** | **64.21** |
-
-## Comparison:
-
-| Model                                                                                       | EM        | F1 score  | SIZE (MB) |
-| ------------------------------------------------------------------------------------------- | --------- | --------- | --------- |
-| [bert-tiny-finetuned-squadv2](https://huggingface.co/mrm8488/bert-tiny-finetuned-squadv2)   | 48.60     | 49.73     | **16.74** |
-| [bert-mini-finetuned-squadv2](https://huggingface.co/mrm8488/bert-mini-finetuned-squadv2)   | 56.31     | 59.65     | 42.63     |
-| [bert-small-finetuned-squadv2](https://huggingface.co/mrm8488/bert-small-finetuned-squadv2) | **60.49** | **64.21** | 109.74    |
-
-## Model in action
-
-Fast usage with **pipelines**:
-
-```python
-from transformers import pipeline
-
-qa_pipeline = pipeline(
-    "question-answering",
-    model="mrm8488/bert-small-finetuned-squadv2",
-    tokenizer="mrm8488/bert-small-finetuned-squadv2"
-)
-
-qa_pipeline({
-    'context': "Manuel Romero has been working hardly in the repository hugginface/transformers lately",
-    'question': "Who has been working hard for hugginface/transformers lately?"
-
-})
-
-# Output:
-```
-
-```json
-{
-  "answer": "Manuel Romero",
-  "end": 13,
-  "score": 0.9939319924374637,
-  "start": 0
-}
-```
-
-### Yes! That was easy 🎉 Let's try with another example
-
-```python
-qa_pipeline({
-    'context': "Manuel Romero has been working hardly in the repository hugginface/transformers lately",
-    'question': "For which company has worked Manuel Romero?"
-})
-
-# Output:
-```
-
-```json
-{
-  "answer": "hugginface/transformers",
-  "end": 79,
-  "score": 0.6024888734447131,
-  "start": 56
-}
-```
-
-### It works!! 🎉 🎉 🎉
-
-> Created by [Manuel Romero/@mrm8488](https://twitter.com/mrm8488) | [LinkedIn](https://www.linkedin.com/in/manuel-romero-cs/)
-
-> Made with <span style="color: #e25555;">&hearts;</span> in Spain
--- a/model_cards/mrm8488/bert-small-finetuned-typo-detection/README.md
+++ b/model_cards/mrm8488/bert-small-finetuned-typo-detection/README.md
---
-language: en
-thumbnail:
---
-
-# BERT SMALL + Typo Detection ✍❌✍✔
-
-[BERT SMALL](https://huggingface.co/google/bert_uncased_L-4_H-512_A-8) fine-tuned on [GitHub Typo Corpus](https://github.com/mhagiwara/github-typo-corpus) for **typo detection** (using *NER* style)
-
-## Details of the downstream task (Typo detection as NER)
-
-*   Dataset: [GitHub Typo Corpus](https://github.com/mhagiwara/github-typo-corpus) 📚
-
-*   [Fine-tuning script for NER datasets provided by Hugging Face](https://github.com/huggingface/transformers/blob/master/examples/token-classification/run_ner_old.py) 🏋️‍♂️
-
-## Metrics on test set 📋
-
-|  Metric   |  # score  |
-| :-------: | :-------: |
-|    F1     | **89.12** |
-| Precision | **93.82** |
-|  Recall   | **84.87** |
-
-## Model in action 🔨
-
-Fast usage with **pipelines** 🧪
-
-```python
-from transformers import pipeline
-
-typo_checker = pipeline(
-    "ner",
-    model="mrm8488/bert-small-finetuned-typo-detection",
-    tokenizer="mrm8488/bert-small-finetuned-typo-detection"
-)
-
-result = typo_checker("here there is an error in coment")
-result[1:-1]
-
-# Output:
-[{'entity': 'ok', 'score': 0.9021041989326477, 'word': 'here'},
- {'entity': 'ok', 'score': 0.7975626587867737, 'word': 'there'},
- {'entity': 'ok', 'score': 0.8596242070198059, 'word': 'is'},
- {'entity': 'ok', 'score': 0.7071516513824463, 'word': 'an'},
- {'entity': 'ok', 'score': 0.943381130695343, 'word': 'error'},
- {'entity': 'ok', 'score': 0.8047608733177185, 'word': 'in'},
- {'entity': 'ok', 'score': 0.8240702152252197, 'word': 'come'},
- {'entity': 'typo', 'score': 0.5004884004592896, 'word': '##nt'}]
-```
-
-It works 🎉! We typed `coment` instead of `comment`.
-
-Let's try with another example
-
-```python
-result = typo_checker("Adddd validation midelware")
-result[1:-1]
-
-# Output:
-[{'entity': 'ok', 'score': 0.7128152847290039, 'word': 'add'},
- {'entity': 'typo', 'score': 0.5388424396514893, 'word': '##dd'},
- {'entity': 'ok', 'score': 0.94792640209198, 'word': 'validation'},
- {'entity': 'typo', 'score': 0.5839331746101379, 'word': 'mid'},
- {'entity': 'ok', 'score': 0.5195121765136719, 'word': '##el'},
- {'entity': 'ok', 'score': 0.7222476601600647, 'word': '##ware'}]
-```
-Yeah! We mistyped `Add` and `middleware`.
-
-
-> Created by [Manuel Romero/@mrm8488](https://twitter.com/mrm8488)
-
-> Made with <span style="color: #e25555;">&hearts;</span> in Spain
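The pipeline outputs above are per WordPiece token, so a misspelled word can be split into several pieces (`mid`, `##el`, `##ware`) with different labels. A minimal post-processing sketch that glues the pieces back together and flags a word when any of its pieces is labeled `typo`. That aggregation rule and the helper name `merge_wordpieces` are assumptions made for illustration, not part of the model card.

```python
def merge_wordpieces(tokens):
    """Merge '##' continuation pieces into words; a word is a typo if any piece is."""
    words = []
    for token in tokens:
        piece = token["word"]
        is_typo = token["entity"] == "typo"
        if piece.startswith("##") and words:
            # Continuation piece: append to the previous word and combine labels.
            words[-1]["word"] += piece[2:]
            words[-1]["typo"] = words[-1]["typo"] or is_typo
        else:
            words.append({"word": piece, "typo": is_typo})
    return words


# Labels and pieces copied from the second example output above (scores omitted).
tokens = [
    {"entity": "ok", "word": "add"},
    {"entity": "typo", "word": "##dd"},
    {"entity": "ok", "word": "validation"},
    {"entity": "typo", "word": "mid"},
    {"entity": "ok", "word": "##el"},
    {"entity": "ok", "word": "##ware"},
]

flagged = [w["word"] for w in merge_wordpieces(tokens) if w["typo"]]
print(flagged)  # ['adddd', 'midelware']
```

Note that the merged words are lowercased because the underlying model is uncased.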