"tests/gpt2/test_modeling_gpt2.py" did not exist on "afc4ece462ad83a090af620ff4da099a0272e171"
Commit 3552d0e0 authored by Julien Chaumond, committed via GitHub
[model_cards] Migrate cards from this repo to model repos on huggingface.co (#9013)



* rm all model cards

* Update the .rst

@sgugger it is still not super crystal clear/streamlined, so let me know if you have any ideas to make it simpler

* Add a rootlevel README.md with simple instructions/context

* Update docs/source/model_sharing.rst
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* make style

* rm all model cards
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
---
language: de
license: mit
tags:
- sentence_embedding
- search
- pytorch
- xlm-roberta
- roberta
- xlm-r-distilroberta-base-paraphrase-v1
- paraphrase
datasets:
- STSbenchmark
metrics:
- Spearman’s rank correlation
- cosine similarity
---
# German RoBERTa for Sentence Embeddings V2
**The new [T-Systems-onsite/cross-en-de-roberta-sentence-transformer](https://huggingface.co/T-Systems-onsite/cross-en-de-roberta-sentence-transformer) model is slightly better for German. It is also the current best model for English and works cross-lingually. Please consider using that model instead.**
This model is intended to [compute sentence (text) embeddings](https://www.sbert.net/docs/usage/computing_sentence_embeddings.html) for German text. These embeddings can then be compared with [cosine similarity](https://en.wikipedia.org/wiki/Cosine_similarity) to find sentences with a similar semantic meaning. For example, this can be useful for [semantic textual similarity](https://www.sbert.net/docs/usage/semantic_textual_similarity.html), [semantic search](https://www.sbert.net/docs/usage/semantic_search.html), or [paraphrase mining](https://www.sbert.net/docs/usage/paraphrase_mining.html). To do this, you have to use the [Sentence Transformers Python framework](https://github.com/UKPLab/sentence-transformers).
> Sentence-BERT (SBERT) is a modification of the pretrained BERT network that use siamese and triplet network structures to derive semantically meaningful sentence embeddings that can be compared using cosine-similarity. This reduces the effort for finding the most similar pair from 65 hours with BERT / RoBERTa to about 5 seconds with SBERT, while maintaining the accuracy from BERT.

Source: [Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks](https://arxiv.org/abs/1908.10084)
This model was fine-tuned by [Philip May](https://eniak.de/) and open-sourced by [T-Systems-onsite](https://www.t-systems-onsite.de/). Special thanks to [Nils Reimers](https://www.nils-reimers.de/) for his awesome open-source work, the Sentence Transformers library, the models, and his help on GitHub.
## How to use
**The usage description above, provided by Hugging Face, is wrong for sentence embeddings! Please use the following instead:**
To use this model, install the `sentence-transformers` package (see here: <https://github.com/UKPLab/sentence-transformers>).
```python
from sentence_transformers import SentenceTransformer
model = SentenceTransformer('T-Systems-onsite/german-roberta-sentence-transformer-v2')
```
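As a quick illustration (a hedged sketch, not part of the original card; the example sentences are made up), you can compute embeddings for a few German sentences and compare them with cosine similarity:
```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer('T-Systems-onsite/german-roberta-sentence-transformer-v2')

# Example sentences (made up for illustration)
sentences = [
    "Das ist ein Beispielsatz.",
    "Dies ist ein sehr ähnlicher Beispielsatz.",
]

# One embedding vector per sentence
embeddings = model.encode(sentences, convert_to_tensor=True)

# Cosine similarity between the two sentences
score = util.pytorch_cos_sim(embeddings[0], embeddings[1])
print(score.item())
```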
For details of usage and examples see here:
- [Computing Sentence Embeddings](https://www.sbert.net/docs/usage/computing_sentence_embeddings.html)
- [Semantic Textual Similarity](https://www.sbert.net/docs/usage/semantic_textual_similarity.html)
- [Paraphrase Mining](https://www.sbert.net/docs/usage/paraphrase_mining.html)
- [Semantic Search](https://www.sbert.net/docs/usage/semantic_search.html)
- [Cross-Encoders](https://www.sbert.net/docs/usage/cross-encoder.html)
- [Examples on GitHub](https://github.com/UKPLab/sentence-transformers/tree/master/examples)
## Training
The base model is [xlm-roberta-base](https://huggingface.co/xlm-roberta-base). This model has been further trained by [Nils Reimers](https://www.nils-reimers.de/) on a large-scale paraphrase dataset for 50+ languages. [Nils Reimers](https://www.nils-reimers.de/) wrote about this [on GitHub](https://github.com/UKPLab/sentence-transformers/issues/509#issuecomment-712243280):
>A paper is upcoming for the paraphrase models.
>
>These models were trained on various datasets with Millions of examples for paraphrases, mainly derived from Wikipedia edit logs, paraphrases mined from Wikipedia and SimpleWiki, paraphrases from news reports, AllNLI-entailment pairs with in-batch-negative loss etc.
>
>In internal tests, they perform much better than the NLI+STSb models as they have see more and broader type of training data. NLI+STSb has the issue that they are rather narrow in their domain and do not contain any domain specific words / sentences (like from chemistry, computer science, math etc.). The paraphrase models has seen plenty of sentences from various domains.
>
>More details with the setup, all the datasets, and a wider evaluation will follow soon.
The resulting model, called `xlm-r-distilroberta-base-paraphrase-v1`, has been released here: <https://github.com/UKPLab/sentence-transformers/releases/tag/v0.3.8>
Building on this cross-lingual model, we fine-tuned it for German on the [deepl.com](https://www.deepl.com/translator) dataset of our [German STSbenchmark dataset](https://github.com/t-systems-on-site-services-gmbh/german-STSbenchmark).
We ran an automatic hyperparameter search with 102 trials using [Optuna](https://github.com/optuna/optuna). Using 10-fold cross-validation on the deepl.com test and dev datasets, we found the following best hyperparameters (an illustrative search sketch is shown below the list):
- batch_size = 15
- num_epochs = 4
- lr = 2.2995320905210864e-05
- eps = 1.8979875906303792e-06
- weight_decay = 0.003314045812507563
- warmup_steps_proportion = 0.46141685205829014
The final model was trained with these hyperparameters on the combination of `sts_de_train.csv` and `sts_de_dev.csv`. The `sts_de_test.csv` was left for testing.
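For illustration only, here is a minimal sketch of how such an Optuna search could be wired up with the Sentence Transformers framework. This is not the authors' actual training code: the CSV loader, its column names, the search space, and the single-split evaluation (instead of 10-fold cross-validation) are all assumptions.
```python
import csv

import optuna
from scipy.stats import spearmanr
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses, util


def load_sts_de(path):
    """Hypothetical loader for the German STS CSVs (column names are assumptions)."""
    rows = []
    with open(path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            rows.append((row["sentence1"], row["sentence2"], float(row["score"]) / 5.0))
    return rows


train_samples = [InputExample(texts=[s1, s2], label=score)
                 for s1, s2, score in load_sts_de("sts_de_train.csv")]
dev_rows = load_sts_de("sts_de_dev.csv")
dev_s1 = [r[0] for r in dev_rows]
dev_s2 = [r[1] for r in dev_rows]
dev_scores = [r[2] for r in dev_rows]


def objective(trial):
    # Assumed search space; the card only reports the best values found
    batch_size = trial.suggest_int("batch_size", 8, 32)
    num_epochs = trial.suggest_int("num_epochs", 1, 6)
    lr = trial.suggest_float("lr", 1e-6, 1e-4, log=True)
    eps = trial.suggest_float("eps", 1e-7, 1e-5, log=True)
    weight_decay = trial.suggest_float("weight_decay", 1e-4, 1e-2, log=True)
    warmup_proportion = trial.suggest_float("warmup_steps_proportion", 0.0, 0.5)

    model = SentenceTransformer("xlm-r-distilroberta-base-paraphrase-v1")
    train_dataloader = DataLoader(train_samples, shuffle=True, batch_size=batch_size)
    train_loss = losses.CosineSimilarityLoss(model)
    warmup_steps = int(len(train_dataloader) * num_epochs * warmup_proportion)

    model.fit(
        train_objectives=[(train_dataloader, train_loss)],
        epochs=num_epochs,
        warmup_steps=warmup_steps,
        optimizer_params={"lr": lr, "eps": eps},
        weight_decay=weight_decay,
    )

    # Spearman rank correlation between cosine similarity and gold scores on the dev split
    emb1 = model.encode(dev_s1, convert_to_tensor=True)
    emb2 = model.encode(dev_s2, convert_to_tensor=True)
    cos_scores = util.pytorch_cos_sim(emb1, emb2).diagonal().cpu().numpy()
    return spearmanr(cos_scores, dev_scores).correlation


study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=102)
print(study.best_params)
```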
## Evaluation
The evaluation has been done on the test set of our [German STSbenchmark dataset](https://github.com/t-systems-on-site-services-gmbh/german-STSbenchmark). The code is available on [Colab](https://colab.research.google.com/drive/1aCWOqDQx953kEnQ5k4Qn7uiixokocOHv?usp=sharing). As the evaluation metric, we use Spearman's rank correlation between the cosine similarity of the sentence embeddings and the STSbenchmark labels.
| Model Name | Spearman rank correlation<br/>(German) |
|--------------------------------------|-------------------------------------|
| xlm-r-distilroberta-base-paraphrase-v1 | 0.8079 |
| xlm-r-100langs-bert-base-nli-stsb-mean-tokens | 0.8194 |
| xlm-r-bert-base-nli-stsb-mean-tokens | 0.8194 |
| **T-Systems-onsite/<br/>german-roberta-sentence-transformer-v2** | **0.8529** |
| **[T-Systems-onsite/<br/>cross-en-de-roberta-sentence-transformer](https://huggingface.co/T-Systems-onsite/cross-en-de-roberta-sentence-transformer)** | **0.8550** |
---
language: uk
---
Note: **the default code snippet above won't work** because we use `AlbertTokenizer` with `GPT2LMHeadModel`; see this [issue](https://github.com/huggingface/transformers/issues/4285).
## GPT2 124M Trained on Ukrainian Fiction
### Training details
The model was trained on a corpus of 4,040 fiction books, 2.77 GiB in total.
Evaluation on [brown-uk](https://github.com/brown-uk/corpus) gives a perplexity of 50.16.
### Example usage:
```python
from transformers import AlbertTokenizer, GPT2LMHeadModel

tokenizer = AlbertTokenizer.from_pretrained("Tereveni-AI/gpt2-124M-uk-fiction")
model = GPT2LMHeadModel.from_pretrained("Tereveni-AI/gpt2-124M-uk-fiction")

input_ids = tokenizer.encode("Но зла Юнона, суча дочка,", add_special_tokens=False, return_tensors='pt')

outputs = model.generate(
    input_ids,
    do_sample=True,
    num_return_sequences=3,
    max_length=50
)

for i, out in enumerate(outputs):
    print("{}: {}".format(i, tokenizer.decode(out)))
```
Prints something like this:
```bash
0: Но зла Юнона, суча дочка, яка затьмарила всі її таємниці: І хто з'їсть її душу, той помре». І, не дочекавшись гніву богів, посунула в пітьму, щоб не бачити перед собою. Але, за
1: Но зла Юнона, суча дочка, і довела мене до божевілля. Але він не знав нічого. Після того як я його побачив, мені стало зле. Я втратив рівновагу. Але в мене не було часу на роздуми. Я вже втратив надію
2: Но зла Юнона, суча дочка, не нарікала нам! — раптом вигукнула Юнона. — Це ти, старий йолопе! — мовила вона, не перестаючи сміятись. — Хіба ти не знаєш, що мені подобається ходити з тобою?
```
---
language: fi
---
## Quickstart
**Release 1.0** (November 25, 2019)
Download the models here:
* Cased Finnish BERT Base: [bert-base-finnish-cased-v1.zip](http://dl.turkunlp.org/finbert/bert-base-finnish-cased-v1.zip)
* Uncased Finnish BERT Base: [bert-base-finnish-uncased-v1.zip](http://dl.turkunlp.org/finbert/bert-base-finnish-uncased-v1.zip)
We generally recommend the use of the cased model.
Paper presenting Finnish BERT: [arXiv:1912.07076](https://arxiv.org/abs/1912.07076)
## What's this?
A version of Google's [BERT](https://github.com/google-research/bert) deep transfer learning model for Finnish. The model can be fine-tuned to achieve state-of-the-art results for various Finnish natural language processing tasks.
FinBERT features a custom 50,000 wordpiece vocabulary that has much better coverage of Finnish words than e.g. the previously released [multilingual BERT](https://github.com/google-research/bert/blob/master/multilingual.md) models from Google:
| Vocabulary | Example |
|------------|---------|
| FinBERT | Suomessa vaihtuu kesän aikana sekä pääministeri että valtiovarain ##ministeri . |
| Multilingual BERT | Suomessa vai ##htuu kes ##än aikana sekä p ##ää ##minister ##i että valt ##io ##vara ##in ##minister ##i . |
FinBERT has been pre-trained for 1 million steps on over 3 billion tokens (24B characters) of Finnish text drawn from news, online discussion, and internet crawls. By contrast, Multilingual BERT was trained on Wikipedia texts, where the Finnish Wikipedia text is approximately 3% of the amount used to train FinBERT.
These features allow FinBERT to outperform not only Multilingual BERT but also all previously proposed models when fine-tuned for Finnish natural language processing tasks.
## Results
### Document classification
![learning curves for Yle and Ylilauta document classification](https://raw.githubusercontent.com/TurkuNLP/FinBERT/master/img/yle-ylilauta-curves.png)
FinBERT outperforms multilingual BERT (M-BERT) on document classification over a range of training set sizes on the Yle news (left) and Ylilauta online discussion (right) corpora. (Baseline classification performance with [FastText](https://fasttext.cc/) included for reference.)
[[code](https://github.com/spyysalo/finbert-text-classification)][[Yle data](https://github.com/spyysalo/yle-corpus)] [[Ylilauta data](https://github.com/spyysalo/ylilauta-corpus)]
### Named Entity Recognition
Evaluation on the FiNER corpus ([Ruokolainen et al. 2019](https://arxiv.org/abs/1908.04212))
| Model | Accuracy |
|--------------------|----------|
| **FinBERT** | **92.40%** |
| Multilingual BERT | 90.29% |
| [FiNER-tagger](https://github.com/Traubert/FiNer-rules) (rule-based) | 86.82% |
(FiNER tagger results from [Ruokolainen et al. 2019](https://arxiv.org/pdf/1908.04212.pdf))
[[code](https://github.com/jouniluoma/keras-bert-ner)][[data](https://github.com/mpsilfve/finer-data)]
### Part of speech tagging
Evaluation on three Finnish corpora annotated with [Universal Dependencies](https://universaldependencies.org/) part-of-speech tags: the Turku Dependency Treebank (TDT), FinnTreeBank (FTB), and Parallel UD treebank (PUD)
| Model | TDT | FTB | PUD |
|-------------------|-------------|-------------|-------------|
| **FinBERT** | **98.23%** | **98.39%** | **98.08%** |
| Multilingual BERT | 96.97% | 95.87% | 97.58% |
[[code](https://github.com/spyysalo/bert-pos)][[data](http://hdl.handle.net/11234/1-2837)]
## Use with PyTorch
If you want to use the model with the huggingface/transformers library, follow the steps in [huggingface_transformers.md](https://github.com/TurkuNLP/FinBERT/blob/master/huggingface_transformers.md)
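As a hedged illustration (not part of the original instructions), loading the cased model through transformers might look like this, assuming the weights are published under the `TurkuNLP/bert-base-finnish-cased-v1` model id:
```python
from transformers import AutoTokenizer, AutoModel

# Assumed Hugging Face model id for the cased FinBERT release
tokenizer = AutoTokenizer.from_pretrained("TurkuNLP/bert-base-finnish-cased-v1")
model = AutoModel.from_pretrained("TurkuNLP/bert-base-finnish-cased-v1")

# The custom Finnish vocabulary keeps common words largely intact (compare the table above)
print(tokenizer.tokenize("Suomessa vaihtuu kesän aikana sekä pääministeri että valtiovarainministeri."))

# Encode a sentence and inspect the contextual token embeddings
inputs = tokenizer("Tämä on esimerkkilause.", return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (batch, tokens, hidden size)
```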
## Previous releases
### Release 0.2
**October 24, 2019** Beta version of the BERT base uncased model trained from scratch on a corpus of Finnish news, online discussions, and crawled data.
Download the model here: [bert-base-finnish-uncased.zip](http://dl.turkunlp.org/finbert/bert-base-finnish-uncased.zip)
### Release 0.1
**September 30, 2019** We release a beta version of the BERT base cased model trained from scratch on a corpus of Finnish news, online discussions, and crawled data.
Download the model here: [bert-base-finnish-cased.zip](http://dl.turkunlp.org/finbert/bert-base-finnish-cased.zip)
---
language: fr
widget:
- text: "Je m'appelle Hicham et je vis a Fès"
---
# MagBERT-NER: a state-of-the-art NER model for Moroccan French language (Maghreb)
## Introduction
MagBERT-NER is a state-of-the-art NER model for Moroccan French (Maghreb). It was fine-tuned for the NER task on top of CamemBERT, a French language model based on the RoBERTa architecture.
For further information or requests, please visit our website at [typica.ai Website](https://typica.ai/) or send us an email at contactus@typica.ai
## How to use MagBERT-NER with HuggingFace
##### Load MagBERT-NER and its sub-word tokenizer:
```python
from transformers import AutoTokenizer, AutoModelForTokenClassification
tokenizer = AutoTokenizer.from_pretrained("TypicaAI/magbert-ner")
model = AutoModelForTokenClassification.from_pretrained("TypicaAI/magbert-ner")
# Process a text sample (from Wikipedia about the current Prime Minister of Morocco) using the NER pipeline
from transformers import pipeline
nlp = pipeline('ner', model=model, tokenizer=tokenizer, grouped_entities=True)
nlp("Saad Dine El Otmani, né le 16 janvier 1956 à Inezgane, est un homme d'État marocain, chef du gouvernement du Maroc depuis le 5 avril 2017")
#[{'entity_group': 'I-PERSON',
# 'score': 0.8941445276141167,
# 'word': 'Saad Dine El Otmani'},
# {'entity_group': 'B-DATE',
# 'score': 0.5967703461647034,
# 'word': '16 janvier 1956'},
# {'entity_group': 'B-GPE', 'score': 0.7160899192094803, 'word': 'Inezgane'},
# {'entity_group': 'B-NORP', 'score': 0.7971733212471008, 'word': 'marocain'},
# {'entity_group': 'B-GPE', 'score': 0.8921478390693665, 'word': 'Maroc'},
# {'entity_group': 'B-DATE',
# 'score': 0.5760444005330404,
# 'word': '5 avril 2017'}]
```
## Authors
The MagBERT-NER model was trained by Hicham Assoudi, Ph.D.
For any questions or comments, you can contact me at assoudi@typica.ai
## Citation
If you use our work, please cite:
Hicham Assoudi, Ph.D., MagBERT-NER: a state-of-the-art NER model for Moroccan French language (Maghreb), (2020)
---
language: "en"
tags:
- paraphrase-generation
- text-generation
- Conditional Generation
inference: false
---
# Paraphrase-Generation
## Model description
A T5 model for generating paraphrases of English sentences. Trained on the [Google PAWS](https://github.com/google-research-datasets/paws) dataset.
## How to use
PyTorch and TensorFlow models are available.
```python
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained("Vamsi/T5_Paraphrase_Paws")
model = AutoModelForSeq2SeqLM.from_pretrained("Vamsi/T5_Paraphrase_Paws").to(device)

sentence = "This is something which i cannot understand at all"
text = "paraphrase: " + sentence + " </s>"

encoding = tokenizer.encode_plus(text, pad_to_max_length=True, return_tensors="pt")
input_ids = encoding["input_ids"].to(device)
attention_masks = encoding["attention_mask"].to(device)

# Generate several candidate paraphrases with sampling
outputs = model.generate(
    input_ids=input_ids, attention_mask=attention_masks,
    max_length=256,
    do_sample=True,
    top_k=120,
    top_p=0.95,
    early_stopping=True,
    num_return_sequences=5
)

for output in outputs:
    line = tokenizer.decode(output, skip_special_tokens=True, clean_up_tokenization_spaces=True)
    print(line)
```
For more reference on training your own T5 model or using this model, do check out [Paraphrase Generation](https://github.com/Vamsi995/Paraphrase-Generator).
---
language: en
datasets:
- yelp_polarity
---
# RoBERTa-base-finetuned-yelp-polarity
This is a [RoBERTa-base](https://huggingface.co/roberta-base) checkpoint fine-tuned on binary sentiment classification from [Yelp polarity](https://huggingface.co/nlp/viewer/?dataset=yelp_polarity).
It gets **98.08%** accuracy on the test set.
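For illustration, here is a minimal hedged sketch of querying the checkpoint through the text-classification pipeline; the `VictorSanh/roberta-base-finetuned-yelp-polarity` model id and the label names are assumptions, not stated in this card:
```python
from transformers import pipeline

# Assumed repository id; replace with the actual model id if it differs
classifier = pipeline(
    "sentiment-analysis",
    model="VictorSanh/roberta-base-finetuned-yelp-polarity",
)

print(classifier("The food was great and the staff were super friendly!"))
# e.g. [{'label': 'LABEL_1', 'score': 0.99...}] where LABEL_1 would be the positive class
```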
## Hyper-parameters
We used the following hyper-parameters to train the model on one GPU:
```python
num_train_epochs = 2.0
learning_rate = 1e-05
weight_decay = 0.0
adam_epsilon = 1e-08
max_grad_norm = 1.0
per_device_train_batch_size = 32
gradient_accumulation_steps = 1
warmup_steps = 3500
seed = 42
```
---
language: no
thumbnail: https://i.imgur.com/QqSEC5I.png
---
# Norwegian Electra
![Image of norwegian electra](https://i.imgur.com/QqSEC5I.png)
Trained on OSCAR + Wikipedia + OpenSubtitles + some other data I had, with the awesome power of TPUs (v3-8).
Use with caution. I have no downstream tasks in Norwegian to test on, so I have no idea of its performance yet.
# Model
## Electra: Pre-training Text Encoders as Discriminators Rather Than Generators
Kevin Clark and Minh-Thang Luong and Quoc V. Le and Christopher D. Manning
- https://openreview.net/pdf?id=r1xMH1BtvB
- https://github.com/google-research/electra
# Acknowledgments
### TensorFlow Research Cloud
Research supported with Cloud TPUs from Google's TensorFlow Research Cloud (TFRC). Thanks for providing access to the TFRC ❤️
- https://www.tensorflow.org/tfrc
#### OSCAR corpus
- https://oscar-corpus.com/
#### OPUS
- http://opus.nlpl.eu/
- http://www.opensubtitles.org/
---
datasets:
- squad_v2
---
# BART-LARGE finetuned on SQuADv2
This is the bart-large model fine-tuned on the SQuADv2 dataset for the question answering task.
## Model details
BART was proposed in the [paper](https://arxiv.org/abs/1910.13461) **BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension**.
BART is a seq2seq model intended for both NLG and NLU tasks.
To use BART for question answering tasks, we feed the complete document into the encoder and decoder, and use the top hidden state of the decoder as a representation for each word. This representation is used to classify the token. As reported in the paper, bart-large achieves results comparable to RoBERTa on SQuAD.
Another notable thing about BART is that it can handle sequences of up to 1024 tokens.
| Param | #Value |
|---------------------|--------|
| encoder layers | 12 |
| decoder layers | 12 |
| hidden size | 4096 |
| num attention heads | 16 |
| on disk size | 1.63GB |
## Model training
This model was trained with the following parameters using the simpletransformers wrapper:
```
train_args = {
'learning_rate': 1e-5,
'max_seq_length': 512,
'doc_stride': 512,
'overwrite_output_dir': True,
'reprocess_input_data': False,
'train_batch_size': 8,
'num_train_epochs': 2,
'gradient_accumulation_steps': 2,
'no_cache': True,
'use_cached_eval_features': False,
'save_model_every_epoch': False,
'output_dir': "bart-squadv2",
'eval_batch_size': 32,
'fp16_opt_level': 'O2',
}
```
[You can even train your own model using this colab notebook](https://colab.research.google.com/drive/1I5cK1M_0dLaf5xoewh6swcm5nAInfwHy?usp=sharing)
## Results
```{"correct": 6832, "similar": 4409, "incorrect": 632, "eval_loss": -14.950117511952177}```
## Model in Action 🚀
```python3
from transformers import BartTokenizer, BartForQuestionAnswering
import torch
tokenizer = BartTokenizer.from_pretrained('a-ware/bart-squadv2')
model = BartForQuestionAnswering.from_pretrained('a-ware/bart-squadv2')
question, text = "Who was Jim Henson?", "Jim Henson was a nice puppet"
encoding = tokenizer(question, text, return_tensors='pt')
input_ids = encoding['input_ids']
attention_mask = encoding['attention_mask']
start_scores, end_scores = model(input_ids, attention_mask=attention_mask, output_attentions=False)[:2]
all_tokens = tokenizer.convert_ids_to_tokens(input_ids[0])
answer = ' '.join(all_tokens[torch.argmax(start_scores) : torch.argmax(end_scores)+1])
answer = tokenizer.convert_tokens_to_ids(answer.split())
answer = tokenizer.decode(answer)
#answer => 'a nice puppet'
```
> Created with ❤️ by A-ware UG [![Github icon](https://cdn0.iconfinder.com/data/icons/octicons/1024/mark-github-32.png)](https://github.com/aware-ai)
---
datasets:
- squad_v2
---
# Roberta-LARGE finetuned on SQuADv2
This is the roberta-large model fine-tuned on the SQuADv2 dataset for question-answering answerability classification.
## Model details
This model is simply a sequence classification model with two inputs (context and question) given as a list.
The result is either [1] if the question is answerable or [0] if it is not.
It was trained for 4 epochs on the SQuADv2 dataset and can be used to filter out which contexts are good to feed into the QA model, in order to avoid bad answers.
## Model training
This model was trained with the following parameters using the simpletransformers wrapper:
```
train_args = {
'learning_rate': 1e-5,
'max_seq_length': 512,
'overwrite_output_dir': True,
'reprocess_input_data': False,
'train_batch_size': 4,
'num_train_epochs': 4,
'gradient_accumulation_steps': 2,
'no_cache': True,
'use_cached_eval_features': False,
'save_model_every_epoch': False,
'output_dir': "bart-squadv2",
'eval_batch_size': 8,
'fp16_opt_level': 'O2',
}
```
## Results
```{"accuracy": 90.48%}```
## Model in Action 🚀
```python3
from simpletransformers.classification import ClassificationModel
model = ClassificationModel('roberta', 'a-ware/roberta-large-squadv2', num_labels=2, args=train_args)
predictions, raw_outputs = model.predict([["my dog is an year old. he loves to go into the rain", "how old is my dog ?"]])
print(predictions)
# ==> [1]
```
> Created with ❤️ by A-ware UG [![Github icon](https://cdn0.iconfinder.com/data/icons/octicons/1024/mark-github-32.png)](https://github.com/aware-ai)
---
datasets:
- squad_v2
---
# XLM-ROBERTA-LARGE finetuned on SQuADv2
This is the xlm-roberta-large model fine-tuned on the SQuADv2 dataset for the question answering task.
## Model details
XLM-RoBERTa was proposed in the [paper](https://arxiv.org/pdf/1911.02116.pdf) **XLM-R: State-of-the-art cross-lingual understanding through self-supervision**.
## Model training
This model was trained with the following parameters using the simpletransformers wrapper:
```
train_args = {
'learning_rate': 1e-5,
'max_seq_length': 512,
'doc_stride': 512,
'overwrite_output_dir': True,
'reprocess_input_data': False,
'train_batch_size': 8,
'num_train_epochs': 2,
'gradient_accumulation_steps': 2,
'no_cache': True,
'use_cached_eval_features': False,
'save_model_every_epoch': False,
'output_dir': "bart-squadv2",
'eval_batch_size': 32,
'fp16_opt_level': 'O2',
}
```
## Results
```{"correct": 6961, "similar": 4359, "incorrect": 553, "eval_loss": -12.177856394381962}```
## Model in Action 🚀
```python3
from transformers import XLMRobertaTokenizer, XLMRobertaForQuestionAnswering
import torch
tokenizer = XLMRobertaTokenizer.from_pretrained('a-ware/xlmroberta-squadv2')
model = XLMRobertaForQuestionAnswering.from_pretrained('a-ware/xlmroberta-squadv2')
question, text = "Who was Jim Henson?", "Jim Henson was a nice puppet"
encoding = tokenizer(question, text, return_tensors='pt')
input_ids = encoding['input_ids']
attention_mask = encoding['attention_mask']
start_scores, end_scores = model(input_ids, attention_mask=attention_mask, output_attentions=False)[:2]
all_tokens = tokenizer.convert_ids_to_tokens(input_ids[0])
answer = ' '.join(all_tokens[torch.argmax(start_scores) : torch.argmax(end_scores)+1])
answer = tokenizer.convert_tokens_to_ids(answer.split())
answer = tokenizer.decode(answer)
#answer => 'a nice puppet'
```
> Created with ❤️ by A-ware UG [![Github icon](https://cdn0.iconfinder.com/data/icons/octicons/1024/mark-github-32.png)](https://github.com/aware-ai)
---
tags:
- finance
---
# Roberta Masked Language Model Trained On Financial Phrasebank Corpus
This is a Masked Language Model trained with [Roberta](https://huggingface.co/transformers/model_doc/roberta.html) on a Financial Phrasebank Corpus.
The model is built using Huggingface transformers.
The model can be found at :[Financial_Roberta](https://huggingface.co/abhilash1910/financial_roberta)
## Specifications
The corpus for training is taken from the [Financial Phrasebank (Malo et al.)](https://www.researchgate.net/publication/251231107_Good_Debt_or_Bad_Debt_Detecting_Semantic_Orientations_in_Economic_Texts).
## Model Specification
The model chosen for training is [Roberta](https://arxiv.org/abs/1907.11692) with the following specifications:
1. vocab_size=56000
2. max_position_embeddings=514
3. num_attention_heads=12
4. num_hidden_layers=6
5. type_vocab_size=1
This is trained using `RobertaConfig` from the transformers package.
The model is trained for 10 epochs with a GPU batch size of 64.
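For illustration only, here is a minimal sketch of how such a configuration could be instantiated with the transformers package; this is an assumption about the setup, not the author's actual training script:
```python
from transformers import RobertaConfig, RobertaForMaskedLM

# Configuration values as listed above
config = RobertaConfig(
    vocab_size=56000,
    max_position_embeddings=514,
    num_attention_heads=12,
    num_hidden_layers=6,
    type_vocab_size=1,
)

# A randomly initialized masked-language model with this configuration
model = RobertaForMaskedLM(config=config)
print(model.num_parameters())
```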
## Usage Specifications
To use this model, first import the `AutoTokenizer` and `AutoModelWithLMHead` modules from transformers.
Then specify the pre-trained model, which in this case is 'abhilash1910/financial_roberta', for both the tokenizer and the model.
```python
from transformers import AutoTokenizer, AutoModelWithLMHead
tokenizer = AutoTokenizer.from_pretrained("abhilash1910/financial_roberta")
model = AutoModelWithLMHead.from_pretrained("abhilash1910/financial_roberta")
```
After this the model will be downloaded; it will take some time to download all the model files.
To test the model, import the `pipeline` module from transformers and create a fill-mask pipeline for inference as follows:
```python
from transformers import pipeline
model_mask = pipeline('fill-mask', model='abhilash1910/financial_roberta')
model_mask("The company had a <mask> of 20% in 2020.")
```
Some examples with generic financial statements are also provided below:
Example 1:
```python
model_mask("The company had a <mask> of 20% in 2020.")
```
Output:
```bash
[{'sequence': '<s>The company had a profit of 20% in 2020.</s>',
'score': 0.023112965747714043,
'token': 421,
'token_str': 'Ġprofit'},
{'sequence': '<s>The company had a loss of 20% in 2020.</s>',
'score': 0.021379893645644188,
'token': 616,
'token_str': 'Ġloss'},
{'sequence': '<s>The company had a year of 20% in 2020.</s>',
'score': 0.0185744296759367,
'token': 443,
'token_str': 'Ġyear'},
{'sequence': '<s>The company had a sales of 20% in 2020.</s>',
'score': 0.018143286928534508,
'token': 428,
'token_str': 'Ġsales'},
{'sequence': '<s>The company had a value of 20% in 2020.</s>',
'score': 0.015319528989493847,
'token': 776,
'token_str': 'Ġvalue'}]
```
Example 2:
```python
model_mask("The <mask> is listed under NYSE")
```
Output:
```bash
[{'sequence': '<s>The company is listed under NYSE</s>',
'score': 0.1566661298274994,
'token': 359,
'token_str': 'Ġcompany'},
{'sequence': '<s>The total is listed under NYSE</s>',
'score': 0.05542507395148277,
'token': 522,
'token_str': 'Ġtotal'},
{'sequence': '<s>The value is listed under NYSE</s>',
'score': 0.04729423299431801,
'token': 776,
'token_str': 'Ġvalue'},
{'sequence': '<s>The order is listed under NYSE</s>',
'score': 0.02533523552119732,
'token': 798,
'token_str': 'Ġorder'},
{'sequence': '<s>The contract is listed under NYSE</s>',
'score': 0.02087237872183323,
'token': 635,
'token_str': 'Ġcontract'}]
```
## Resources
For all resources, please look at the [HuggingFace](https://huggingface.co/) site and the [repositories](https://github.com/huggingface).
# Roberta Trained Model For Masked Language Model On French Corpus :robot:
This is a Masked Language Model trained with [Roberta](https://huggingface.co/transformers/model_doc/roberta.html) on a small French news corpus (Leipzig Corpora).
The model is built using Huggingface transformers.
The model can be found at :[French-Roberta](https://huggingface.co/abhilash1910/french-roberta)
## Specifications
The corpus for training is taken from the Leipzig Corpora (French news), and the model is trained on a small subset of the corpus (300K).
## Model Specification
The model chosen for training is [Roberta](https://arxiv.org/abs/1907.11692) with the following specifications:
1. vocab_size=32000
2. max_position_embeddings=514
3. num_attention_heads=12
4. num_hidden_layers=6
5. type_vocab_size=1
This is trained using `RobertaConfig` from the transformers package. The total number of training parameters is 68,124,416.
The model is trained for 100 epochs with a GPU batch size of 64.
More details for building custom models can be found at the [HuggingFace Blog](https://huggingface.co/blog/how-to-train)
## Usage Specifications
To use this model, first import the `AutoTokenizer` and `AutoModelWithLMHead` modules from transformers.
Then specify the pre-trained model, which in this case is 'abhilash1910/french-roberta', for both the tokenizer and the model.
```python
from transformers import AutoTokenizer, AutoModelWithLMHead
tokenizer = AutoTokenizer.from_pretrained("abhilash1910/french-roberta")
model = AutoModelWithLMHead.from_pretrained("abhilash1910/french-roberta")
```
After this the model will be downloaded; it will take some time to download all the model files.
To test the model, import the `pipeline` module from transformers and create a fill-mask pipeline for inference as follows:
```python
from transformers import pipeline
model_mask = pipeline('fill-mask', model='abhilash1910/french-roberta')
model_mask("Le tweet <mask>.")
```
Some examples with generic French sentences are also provided below:
Example 1:
```python
model_mask("À ce jour, <mask> projet a entraîné")
```
Output:
```bash
[{'sequence': '<s>À ce jour, belles projet a entraîné</s>',
'score': 0.18685665726661682,
'token': 6504,
'token_str': 'Ġbelles'},
{'sequence': '<s>À ce jour,- projet a entraîné</s>',
'score': 0.0005200508167035878,
'token': 17,
'token_str': '-'},
{'sequence': '<s>À ce jour, de projet a entraîné</s>',
'score': 0.00045729897101409733,
'token': 268,
'token_str': 'Ġde'},
{'sequence': '<s>À ce jour, du projet a entraîné</s>',
'score': 0.0004307595663703978,
'token': 326,
'token_str': 'Ġdu'},
{'sequence': '<s>À ce jour," projet a entraîné</s>',
'score': 0.0004219160182401538,
'token': 6,
'token_str': '"'}]
```
Example 2:
```python
model_mask("C'est un <mask>")
```
Output:
```bash
[{'sequence': "<s>C'est un belles</s>",
'score': 0.16440927982330322,
'token': 6504,
'token_str': 'Ġbelles'},
{'sequence': "<s>C'est un de</s>",
'score': 0.0005495127406902611,
'token': 268,
'token_str': 'Ġde'},
{'sequence': "<s>C'est un du</s>",
'score': 0.00044988933950662613,
'token': 326,
'token_str': 'Ġdu'},
{'sequence': "<s>C'est un-</s>",
'score': 0.00044542422983795404,
'token': 17,
'token_str': '-'},
{'sequence': "<s>C'est un\t</s>",
'score': 0.00037563967634923756,
'token': 202,
'token_str': 'ĉ'}]
```
## Resources
For all resources, please look at the [HuggingFace](https://huggingface.co/) site and the [repositories](https://github.com/huggingface).
# ReviewBERT
BERT (post-)trained on a review corpus to understand sentiment, opinions and various e-commerce aspects.
`BERT-DK_laptop` is trained on a 100MB laptop corpus under `Electronics/Computers & Accessories/Laptops`.
## Model Description
The original model is from `BERT-base-uncased` trained from Wikipedia+BookCorpus.
Models are post-trained from [Amazon Dataset](http://jmcauley.ucsd.edu/data/amazon/) and [Yelp Dataset](https://www.yelp.com/dataset/challenge/).
`BERT-DK_laptop` is trained on a 100MB laptop corpus under `Electronics/Computers & Accessories/Laptops`.
## Instructions
Loading the post-trained weights is as simple as, e.g.,
```python
import torch
from transformers import AutoModel, AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("activebus/BERT-DK_laptop")
model = AutoModel.from_pretrained("activebus/BERT-DK_laptop")
```
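As a hedged example (not from the original card), the loaded encoder can then be used to produce contextual representations of a review sentence, e.g. as features for downstream fine-tuning:
```python
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("activebus/BERT-DK_laptop")
model = AutoModel.from_pretrained("activebus/BERT-DK_laptop")

# Encode a laptop review (made-up example) and inspect the contextual token embeddings
inputs = tokenizer("The battery life of this laptop is amazing.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (batch, tokens, hidden size)
```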
## Evaluation Results
Check our [NAACL paper](https://www.aclweb.org/anthology/N19-1242.pdf)
## Citation
If you find this work useful, please cite as follows.
```
@inproceedings{xu_bert2019,
title = "BERT Post-Training for Review Reading Comprehension and Aspect-based Sentiment Analysis",
author = "Xu, Hu and Liu, Bing and Shu, Lei and Yu, Philip S.",
booktitle = "Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics",
month = "jun",
year = "2019",
}
```
# ReviewBERT
BERT (post-)trained on a review corpus to understand sentiment, opinions and various e-commerce aspects.
`BERT-DK_rest` is trained on 1GB of Yelp reviews covering 19 types of restaurants.
## Model Description
The original model is from `BERT-base-uncased` trained from Wikipedia+BookCorpus.
Models are post-trained from [Amazon Dataset](http://jmcauley.ucsd.edu/data/amazon/) and [Yelp Dataset](https://www.yelp.com/dataset/challenge/).
## Instructions
Loading the post-trained weights is as simple as, e.g.,
```python
import torch
from transformers import AutoModel, AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("activebus/BERT-DK_rest")
model = AutoModel.from_pretrained("activebus/BERT-DK_rest")
```
## Evaluation Results
Check our [NAACL paper](https://www.aclweb.org/anthology/N19-1242.pdf)
## Citation
If you find this work useful, please cite as follows.
```
@inproceedings{xu_bert2019,
title = "BERT Post-Training for Review Reading Comprehension and Aspect-based Sentiment Analysis",
author = "Xu, Hu and Liu, Bing and Shu, Lei and Yu, Philip S.",
booktitle = "Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics",
month = "jun",
year = "2019",
}
```
# ReviewBERT
BERT (post-)trained on a review corpus to understand sentiment, opinions and various e-commerce aspects.
`BERT-DK_laptop` is trained on a 100MB laptop corpus under `Electronics/Computers & Accessories/Laptops`.
`BERT-PT_*` additionally uses SQuAD 1.1.
## Model Description
The original model is from `BERT-base-uncased` trained from Wikipedia+BookCorpus.
Models are post-trained from [Amazon Dataset](http://jmcauley.ucsd.edu/data/amazon/) and [Yelp Dataset](https://www.yelp.com/dataset/challenge/).
## Instructions
Loading the post-trained weights is as simple as, e.g.,
```python
import torch
from transformers import AutoModel, AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("activebus/BERT-PT_laptop")
model = AutoModel.from_pretrained("activebus/BERT-PT_laptop")
```
## Evaluation Results
Check our [NAACL paper](https://www.aclweb.org/anthology/N19-1242.pdf)
## Citation
If you find this work useful, please cite as follows.
```
@inproceedings{xu_bert2019,
title = "BERT Post-Training for Review Reading Comprehension and Aspect-based Sentiment Analysis",
author = "Xu, Hu and Liu, Bing and Shu, Lei and Yu, Philip S.",
booktitle = "Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics",
month = "jun",
year = "2019",
}
```
# ReviewBERT
BERT (post-)trained on a review corpus to understand sentiment, opinions and various e-commerce aspects.
`BERT-DK_rest` is trained on 1GB of Yelp reviews covering 19 types of restaurants.
`BERT-PT_*` additionally uses SQuAD 1.1.
## Model Description
The original model is from `BERT-base-uncased` trained from Wikipedia+BookCorpus.
Models are post-trained from [Amazon Dataset](http://jmcauley.ucsd.edu/data/amazon/) and [Yelp Dataset](https://www.yelp.com/dataset/challenge/).
## Instructions
Loading the post-trained weights is as simple as, e.g.,
```python
import torch
from transformers import AutoModel, AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("activebus/BERT-PT_rest")
model = AutoModel.from_pretrained("activebus/BERT-PT_rest")
```
## Evaluation Results
Check our [NAACL paper](https://www.aclweb.org/anthology/N19-1242.pdf)
## Citation
If you find this work useful, please cite as follows.
```
@inproceedings{xu_bert2019,
title = "BERT Post-Training for Review Reading Comprehension and Aspect-based Sentiment Analysis",
author = "Xu, Hu and Liu, Bing and Shu, Lei and Yu, Philip S.",
booktitle = "Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics",
month = "jun",
year = "2019",
}
```
# ReviewBERT
BERT (post-)trained on a review corpus to understand sentiment, opinions and various e-commerce aspects.
Please visit https://github.com/howardhsu/BERT-for-RRC-ABSA for details.
`BERT-XD_Review` is a cross-domain (beyond just `laptop` and `restaurant`) language model, where each example is from a single product / restaurant with the same rating. It is post-trained (fine-tuned) on a combination of 5-core Amazon reviews and all Yelp data, expected to be 22 GB in total, and trained for 4 epochs on `bert-base-uncased`.
The preprocessing code is [here](https://github.com/howardhsu/BERT-for-RRC-ABSA/transformers).
## Model Description
The original model is from `BERT-base-uncased`.
Models are post-trained from [Amazon Dataset](http://jmcauley.ucsd.edu/data/amazon/) and [Yelp Dataset](https://www.yelp.com/dataset/challenge/).
## Instructions
Loading the post-trained weights is as simple as, e.g.,
```python
import torch
from transformers import AutoModel, AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("activebus/BERT-XD_Review")
model = AutoModel.from_pretrained("activebus/BERT-XD_Review")
```
## Evaluation Results
Check our [NAACL paper](https://www.aclweb.org/anthology/N19-1242.pdf)
`BERT_Review` is expected to have similar performance on domain-specific tasks (such as aspect extraction) as `BERT-DK`, but much better on general tasks such as aspect sentiment classification (different domains mostly share similar sentiment words).
## Citation
If you find this work useful, please cite as follows.
```
@inproceedings{xu_bert2019,
title = "BERT Post-Training for Review Reading Comprehension and Aspect-based Sentiment Analysis",
author = "Xu, Hu and Liu, Bing and Shu, Lei and Yu, Philip S.",
booktitle = "Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics",
month = "jun",
year = "2019",
}
```
# ReviewBERT
BERT (post-)trained on a review corpus to understand sentiment, opinions and various e-commerce aspects.
`BERT_Review` is a cross-domain (beyond just `laptop` and `restaurant`) language model where each example is drawn from randomly mixed domains. It is post-trained (fine-tuned) on a combination of 5-core Amazon reviews and all Yelp data, expected to be 22 GB in total, and trained for 4 epochs on `bert-base-uncased`.
The preprocessing code is [here](https://github.com/howardhsu/BERT-for-RRC-ABSA/transformers).
## Model Description
The original model is from `BERT-base-uncased` trained from Wikipedia+BookCorpus.
Models are post-trained from [Amazon Dataset](http://jmcauley.ucsd.edu/data/amazon/) and [Yelp Dataset](https://www.yelp.com/dataset/challenge/).
## Instructions
Loading the post-trained weights is as simple as, e.g.,
```python
import torch
from transformers import AutoModel, AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("activebus/BERT_Review")
model = AutoModel.from_pretrained("activebus/BERT_Review")
```
## Evaluation Results
Check our [NAACL paper](https://www.aclweb.org/anthology/N19-1242.pdf)
`BERT_Review` is expected to have similar performance on domain-specific tasks (such as aspect extraction) as `BERT-DK`, but much better on general tasks such as aspect sentiment classification (different domains mostly share similar sentiment words).
## Citation
If you find this work useful, please cite as follows.
```
@inproceedings{xu_bert2019,
title = "BERT Post-Training for Review Reading Comprehension and Aspect-based Sentiment Analysis",
author = "Xu, Hu and Liu, Bing and Shu, Lei and Yu, Philip S.",
booktitle = "Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics",
month = "jun",
year = "2019",
}
```
---
language: pt
---
# PTT5-SMALL-SUM
## Model description
This model was trained to summarize texts in Portuguese, based on `unicamp-dl/ptt5-small-portuguese-vocab`.
#### How to use
```python
from transformers import T5Tokenizer, T5ForConditionalGeneration

tokenizer = T5Tokenizer.from_pretrained('adalbertojunior/PTT5-SMALL-SUM')
t5 = T5ForConditionalGeneration.from_pretrained('adalbertojunior/PTT5-SMALL-SUM')

text = "Esse é um exemplo de sumarização."

input_ids = tokenizer.encode(text, return_tensors="pt", add_special_tokens=True)

generated_ids = t5.generate(
    input_ids=input_ids,
    num_beams=1,
    max_length=40,
    # repetition_penalty=2.5
).squeeze()

predicted_span = tokenizer.decode(generated_ids, skip_special_tokens=True, clean_up_tokenization_spaces=True)
```