Unverified Commit 3552d0e0 authored by Julien Chaumond, committed by GitHub

[model_cards] Migrate cards from this repo to model repos on huggingface.co (#9013)



* rm all model cards

* Update the .rst

@sgugger it is still not super crystal clear/streamlined, so let me know if you have any ideas to make it simpler

* Add a rootlevel README.md with simple instructions/context

* Update docs/source/model_sharing.rst
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* make style

* rm all model cards
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
parent 29e45979
---
language: en
datasets:
- c4
tags:
- summarization
- translation
license: apache-2.0
---
# [Google's T5](https://ai.googleblog.com/2020/02/exploring-transfer-learning-with-t5.html)
Pretraining Dataset: [C4](https://huggingface.co/datasets/c4)
Other Community Checkpoints: [here](https://huggingface.co/models?search=t5)
Paper: [Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer](https://arxiv.org/pdf/1910.10683.pdf)
Authors: *Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, Peter J. Liu*
## Abstract
Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP). The effectiveness of transfer learning has given rise to a diversity of approaches, methodology, and practice. In this paper, we explore the landscape of transfer learning techniques for NLP by introducing a unified framework that converts every language problem into a text-to-text format. Our systematic study compares pre-training objectives, architectures, unlabeled datasets, transfer approaches, and other factors on dozens of language understanding tasks. By combining the insights from our exploration with scale and our new “Colossal Clean Crawled Corpus”, we achieve state-of-the-art results on many benchmarks covering summarization, question answering, text classification, and more. To facilitate future work on transfer learning for NLP, we release our dataset, pre-trained models, and code.
![model image](https://camo.githubusercontent.com/623b4dea0b653f2ad3f36c71ebfe749a677ac0a1/68747470733a2f2f6d69726f2e6d656469756d2e636f6d2f6d61782f343030362f312a44304a31674e51663876727255704b657944387750412e706e67)
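This card ships without a usage snippet, so here is a minimal sketch of the text-to-text interface, assuming the `t5-base` checkpoint (the other community checkpoints linked above follow the same pattern):

```python
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base")

# T5 casts every task as text-to-text by prepending a task prefix.
inputs = tokenizer("translate English to German: The house is wonderful.", return_tensors="pt")
outputs = model.generate(**inputs, max_length=40)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```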
---
language: et
---
# EstBERT
### What's this?
The EstBERT model is a pretrained BERT<sub>Base</sub> model trained exclusively on cased Estonian text, with variants for both 128 and 512 sequence lengths.
### How to use?
You can use the model with the Transformers library in both TensorFlow and PyTorch:
```python
from transformers import AutoTokenizer, AutoModelForMaskedLM
tokenizer = AutoTokenizer.from_pretrained("tartuNLP/EstBERT")
model = AutoModelForMaskedLM.from_pretrained("tartuNLP/EstBERT")
```
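For a quick smoke test, a fill-mask pipeline sketch; the Estonian prompt is only an illustration, and the standard `[MASK]` token is assumed:

```python
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="tartuNLP/EstBERT", tokenizer="tartuNLP/EstBERT")
# Illustrative prompt: "Tartu is Estonia's second largest [MASK]."
print(fill_mask("Tartu on Eesti suuruselt teine [MASK]."))
```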
You can also download the pretrained checkpoints directly: [EstBERT_128](), [EstBERT_512]()
#### Dataset used to train the model
The EstBERT model is trained both on 128 and 512 sequence length of data. For training the EstBERT we used the [Estonian National Corpus 2017](https://metashare.ut.ee/repository/browse/estonian-national-corpus-2017/b616ceda30ce11e8a6e4005056b40024880158b577154c01bd3d3fcfc9b762b3/), which was the largest Estonian language corpus available at the time. It consists of four sub-corpora: Estonian Reference Corpus 1990-2008, Estonian Web Corpus 2013, Estonian Web Corpus 2017 and Estonian Wikipedia Corpus 2017.
### Why would I use it?
Overall, EstBERT performs better on part-of-speech (POS) tagging, named entity recognition (NER), rubric classification, and sentiment classification tasks compared to mBERT and XLM-RoBERTa. The comparative results can be found below:
|Model |UPOS<sub>128</sub> |XPOS<sub>128</sub> |Morph<sub>128</sub> |UPOS<sub>512</sub> |XPOS<sub>512</sub> |Morph<sub>512</sub> |
|--------------|----------------------------|-------------|-------------|-------------|----------------------------|----------------------------|
| EstBERT | **_97.89_** | **98.40** | **96.93** | **97.84** | **_98.43_** | **_96.80_** |
| mBERT | 97.42 | 98.06 | 96.24 | 97.43 | 98.13 | 96.13 |
| XLM-RoBERTa | 97.78 | 98.36 | 96.53 | 97.80 | 98.40 | 96.69 |
|Model|Rubric<sub>128</sub> |Sentiment<sub>128</sub> | Rubric<sub>512</sub> |Sentiment<sub>512</sub> |
|-------------------|----------------------------|--------------------|-----------------------------------------------|----------------------------|
| EstBERT | **_81.70_** | 74.36 | **80.96** | 74.50 |
| mBERT | 75.67 | 70.23 | 74.94 | 69.52 |
| XLM\-RoBERTa | 80.34 | **74.50** | 78.62 | **_76.07_**|
|Model |Precision<sub>128</sub> |Recall<sub>128</sub> |F1-Score<sub>128</sub> |Precision<sub>512</sub> |Recall<sub>512</sub> |F1-Score<sub>512</sub> |
|--------------|----------------|----------------------------|----------------------------|----------------------------|-------------|----------------|
| EstBERT | **88.42** | 90.38 |**_89.39_** | 88.35 | 89.74 | 89.04 |
| mBERT | 85.88 | 87.09 | 86.51 |**_88.47_** | 88.28 | 88.37 |
| XLM\-RoBERTa | 87.55 |**_91.19_** | 89.34 | 87.50 | **90.76** | **89.10** |
---
language: fr
---
# tf-allociné
A French sentiment analysis model, based on [CamemBERT](https://camembert-model.fr/), and fine-tuned on a large-scale dataset scraped from [Allociné.fr](http://www.allocine.fr/) user reviews.
## Results
| Validation Accuracy | Validation F1-Score | Test Accuracy | Test F1-Score |
|--------------------:| -------------------:| -------------:|--------------:|
| 97.39 | 97.36 | 97.44 | 97.34 |
The dataset and the evaluation code are available on [this repo](https://github.com/TheophileBlard/french-sentiment-analysis-with-bert).
## Usage
```python
from transformers import AutoTokenizer, TFAutoModelForSequenceClassification, pipeline
tokenizer = AutoTokenizer.from_pretrained("tblard/tf-allocine")
model = TFAutoModelForSequenceClassification.from_pretrained("tblard/tf-allocine")
nlp = pipeline('sentiment-analysis', model=model, tokenizer=tokenizer)
print(nlp("Alad'2 est clairement le meilleur film de l'année 2018.")) # POSITIVE
print(nlp("Juste whoaaahouuu !")) # POSITIVE
print(nlp("NUL...A...CHIER ! FIN DE TRANSMISSION.")) # NEGATIVE
print(nlp("Je m'attendais à mieux de la part de Franck Dubosc !")) # NEGATIVE
```
## Author
Théophile Blard – :email: theophile.blard@gmail.com
If you use this work (code, model or dataset), please cite as:
> Théophile Blard, French sentiment analysis with BERT, (2020), GitHub repository, <https://github.com/TheophileBlard/french-sentiment-analysis-with-bert>
# Pegasus for Paraphrasing
Pegasus model fine-tuned for paraphrasing
## Model in Action 🚀
```python
import torch
from transformers import PegasusForConditionalGeneration, PegasusTokenizer
model_name = 'tuner007/pegasus_paraphrase'
torch_device = 'cuda' if torch.cuda.is_available() else 'cpu'
tokenizer = PegasusTokenizer.from_pretrained(model_name)
model = PegasusForConditionalGeneration.from_pretrained(model_name).to(torch_device)
def get_response(input_text, num_return_sequences):
    batch = tokenizer.prepare_seq2seq_batch([input_text], truncation=True, padding='longest', max_length=60, return_tensors="pt").to(torch_device)
    translated = model.generate(**batch, max_length=60, num_beams=10, num_return_sequences=num_return_sequences, temperature=1.5)
    tgt_text = tokenizer.batch_decode(translated, skip_special_tokens=True)
    return tgt_text
```
#### Example 1:
```python
context = "The ultimate test of your knowledge is your capacity to convey it to another."
get_response(context,10)
# output:
['The test of your knowledge is your ability to convey it.',
'The ability to convey your knowledge is the ultimate test of your knowledge.',
'The ability to convey your knowledge is the most important test of your knowledge.',
'Your capacity to convey your knowledge is the ultimate test of it.',
'The test of your knowledge is your ability to communicate it.',
'Your capacity to convey your knowledge is the ultimate test of your knowledge.',
'Your capacity to convey your knowledge to another is the ultimate test of your knowledge.',
'Your capacity to convey your knowledge is the most important test of your knowledge.',
'The test of your knowledge is how well you can convey it.',
'Your capacity to convey your knowledge is the ultimate test.']
```
#### Example 2: Question paraphrasing (was not trained on quora dataset)
```python
context = "Which course should I take to get started in data science?"
get_response(context,10)
# output:
['Which data science course should I take?',
'Which data science course should I take first?',
'Should I take a data science course?',
'Which data science class should I take?',
'Which data science course should I attend?',
'I want to get started in data science.',
'Which data science course should I enroll in?',
'Which data science course is right for me?',
'Which data science course is best for me?',
'Which course should I take to get started?']
```
> Created by Arpit Rajauria
[![Twitter icon](https://cdn0.iconfinder.com/data/icons/shift-logotypes/32/Twitter-32.png)](https://twitter.com/arpit_rajauria)
# Pegasus for question-answering
Pegasus model fine-tuned for QA using text-to-text approach
## Model in Action 🚀
```python
import torch
from transformers import PegasusForConditionalGeneration, PegasusTokenizer
model_name = 'tuner007/pegasus_qa'
torch_device = 'cuda' if torch.cuda.is_available() else 'cpu'
tokenizer = PegasusTokenizer.from_pretrained(model_name)
model = PegasusForConditionalGeneration.from_pretrained(model_name).to(torch_device)
def get_answer(question, context):
    input_text = "question: %s text: %s" % (question, context)
    batch = tokenizer.prepare_seq2seq_batch([input_text], truncation=True, padding='longest', return_tensors="pt").to(torch_device)
    translated = model.generate(**batch)
    tgt_text = tokenizer.batch_decode(translated, skip_special_tokens=True)
    return tgt_text[0]
```
#### Example:
```python
context = "PG&E stated it scheduled the blackouts in response to forecasts for high winds amid dry conditions. The aim is to reduce the risk of wildfires. Nearly 800 thousand customers were scheduled to be affected by the shutoffs which were expected to last through at least midday tomorrow."
question = "How many customers were affected by the shutoffs?"
get_answer(question, context)
# output: '800 thousand'
```
> Created by Arpit Rajauria
[![Twitter icon](https://cdn0.iconfinder.com/data/icons/shift-logotypes/32/Twitter-32.png)](https://twitter.com/arpit_rajauria)
# T5 for abstractive question-answering
This is a T5-base model fine-tuned for abstractive QA using a text-to-text approach.
## Model training
This model was trained on a Colab TPU with 35GB RAM for 2 epochs.
## Model in Action 🚀
```python
import torch
from transformers import AutoModelWithLMHead, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("tuner007/t5_abs_qa")
model = AutoModelWithLMHead.from_pretrained("tuner007/t5_abs_qa")
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model.to(device)

def get_answer(question, context):
    input_text = "context: %s <question for context: %s </s>" % (context, question)
    features = tokenizer([input_text], return_tensors='pt')
    out = model.generate(input_ids=features['input_ids'].to(device), attention_mask=features['attention_mask'].to(device))
    return tokenizer.decode(out[0])
```
#### Example 1: Answer available
```python
context = "In Norse mythology, Valhalla is a majestic, enormous hall located in Asgard, ruled over by the god Odin."
question = "What is Valhalla?"
get_answer(question, context)
# output: 'It is a hall of worship ruled by Odin.'
```
#### Example 2: Answer not available
```python
context = "In Norse mythology, Valhalla is a majestic, enormous hall located in Asgard, ruled over by the god Odin."
question = "What is Asgard?"
get_answer(question, context)
# output: 'No answer available in context.'
```
> Created by Arpit Rajauria
[![Twitter icon](https://cdn0.iconfinder.com/data/icons/shift-logotypes/32/Twitter-32.png)](https://twitter.com/arpit_rajauria)
This model is [ALBERT base v2](https://huggingface.co/albert-base-v2) trained on SQuAD v2 as follows:
```bash
export SQUAD_DIR=../../squad2
python3 run_squad.py \
    --model_type albert \
    --model_name_or_path albert-base-v2 \
    --do_train \
    --do_eval \
    --overwrite_cache \
    --do_lower_case \
    --version_2_with_negative \
    --save_steps 100000 \
    --train_file $SQUAD_DIR/train-v2.0.json \
    --predict_file $SQUAD_DIR/dev-v2.0.json \
    --per_gpu_train_batch_size 8 \
    --num_train_epochs 3 \
    --learning_rate 3e-5 \
    --max_seq_length 384 \
    --doc_stride 128 \
    --output_dir ./tmp/albert_fine/
```
Performance on a dev subset is close to the original paper:
```
Results:
{
'exact': 78.71010200723923,
'f1': 81.89228117126069,
'total': 6078,
'HasAns_exact': 75.39518900343643,
'HasAns_f1': 82.04167868004215,
'HasAns_total': 2910,
'NoAns_exact': 81.7550505050505,
'NoAns_f1': 81.7550505050505,
'NoAns_total': 3168,
'best_exact': 78.72655478775913,
'best_exact_thresh': 0.0,
'best_f1': 81.90873395178066,
'best_f1_thresh': 0.0
}
```
We are hopeful this might save you time, energy, and compute. Cheers!
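As a quick sanity check after training, here is a sketch of loading the fine-tuned checkpoint from the `--output_dir` above into the question-answering pipeline (this assumes the tokenizer was saved alongside the weights; the question and context are illustrative):

```python
from transformers import pipeline

# Path is the --output_dir used by run_squad.py above.
qa = pipeline("question-answering", model="./tmp/albert_fine/", tokenizer="./tmp/albert_fine/")
result = qa(question="What does SQuAD v2 add over v1?",
            context="SQuAD v2 combines the SQuAD v1 questions with unanswerable questions written adversarially.")
print(result["answer"], result["score"])
```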
This model is [BERT base uncased](https://huggingface.co/bert-base-uncased) trained on SQuAD v2 as follows:
```bash
export SQUAD_DIR=../../squad2
python3 run_squad.py \
    --model_type bert \
    --model_name_or_path bert-base-uncased \
    --do_train \
    --do_eval \
    --overwrite_cache \
    --do_lower_case \
    --version_2_with_negative \
    --save_steps 100000 \
    --train_file $SQUAD_DIR/train-v2.0.json \
    --predict_file $SQUAD_DIR/dev-v2.0.json \
    --per_gpu_train_batch_size 8 \
    --num_train_epochs 3 \
    --learning_rate 3e-5 \
    --max_seq_length 384 \
    --doc_stride 128 \
    --output_dir ./tmp/bert_fine_tuned/
```
Performance on a dev subset is close to the original paper:
```
Results:
{
'exact': 72.35932872655479,
'f1': 75.75355132564763,
'total': 6078,
'HasAns_exact': 74.29553264604812,
'HasAns_f1': 81.38490892002987,
'HasAns_total': 2910,
'NoAns_exact': 70.58080808080808,
'NoAns_f1': 70.58080808080808,
'NoAns_total': 3168,
'best_exact': 72.35932872655479,
'best_exact_thresh': 0.0,
'best_f1': 75.75355132564766,
'best_f1_thresh': 0.0
}
```
We are hopeful this might save you time, energy, and compute. Cheers!
This model is [DistilBERT base uncased](https://huggingface.co/distilbert-base-uncased) trained on SQuAD v2 as follows:
```bash
export SQUAD_DIR=../../squad2
python3 run_squad.py \
    --model_type distilbert \
    --model_name_or_path distilbert-base-uncased \
    --do_train \
    --do_eval \
    --overwrite_cache \
    --do_lower_case \
    --version_2_with_negative \
    --save_steps 100000 \
    --train_file $SQUAD_DIR/train-v2.0.json \
    --predict_file $SQUAD_DIR/dev-v2.0.json \
    --per_gpu_train_batch_size 8 \
    --num_train_epochs 3 \
    --learning_rate 3e-5 \
    --max_seq_length 384 \
    --doc_stride 128 \
    --output_dir ./tmp/distilbert_fine_tuned/
```
Performance on a dev subset is close to the original paper:
```
Results:
{
'exact': 64.88976637051661,
'f1': 68.1776176526635,
'total': 6078,
'HasAns_exact': 69.7594501718213,
'HasAns_f1': 76.62665295288285,
'HasAns_total': 2910,
'NoAns_exact': 60.416666666666664,
'NoAns_f1': 60.416666666666664,
'NoAns_total': 3168,
'best_exact': 64.88976637051661,
'best_exact_thresh': 0.0,
'best_f1': 68.17761765266337,
'best_f1_thresh': 0.0
}
```
We are hopeful this might save you time, energy, and compute. Cheers!
This model is [DistilRoBERTa base](https://huggingface.co/distilroberta-base) trained on SQuAD v2 as follows:
```bash
export SQUAD_DIR=../../squad2
python3 run_squad.py \
    --model_type roberta \
    --model_name_or_path distilroberta-base \
    --do_train \
    --do_eval \
    --overwrite_cache \
    --do_lower_case \
    --version_2_with_negative \
    --save_steps 100000 \
    --train_file $SQUAD_DIR/train-v2.0.json \
    --predict_file $SQUAD_DIR/dev-v2.0.json \
    --per_gpu_train_batch_size 8 \
    --num_train_epochs 3 \
    --learning_rate 3e-5 \
    --max_seq_length 384 \
    --doc_stride 128 \
    --output_dir ./tmp/distilroberta_fine_tuned/
```
Performance on a dev subset is close to the original paper:
```
Results:
{
'exact': 70.9279368213228,
'f1': 74.60439802429168,
'total': 6078,
'HasAns_exact': 67.62886597938144,
'HasAns_f1': 75.30774267754136,
'HasAns_total': 2910,
'NoAns_exact': 73.95833333333333,
'NoAns_f1': 73.95833333333333,
'NoAns_total': 3168,
'best_exact': 70.94438960184272,
'best_exact_thresh': 0.0,
'best_f1': 74.62085080481161,
'best_f1_thresh': 0.0
}
```
We are hopeful this might save you time, energy, and compute. Cheers!
---
language: zh
datasets:
- CLUECorpus
---
# Chinese RoBERTa Miniatures
## Model description
This is the set of 24 Chinese RoBERTa models pre-trained by [UER-py](https://www.aclweb.org/anthology/D19-3041.pdf).
You can download the 24 Chinese RoBERTa miniatures either from the [UER-py Github page](https://github.com/dbiir/UER-py/), or via HuggingFace from the links below:
| |H=128|H=256|H=512|H=768|
|---|:---:|:---:|:---:|:---:|
| **L=2** |[**2/128 (BERT-Tiny)**][2_128]|[2/256][2_256]|[2/512][2_512]|[2/768][2_768]|
| **L=4** |[4/128][4_128]|[**4/256 (BERT-Mini)**][4_256]|[**4/512 (BERT-Small)**][4_512]|[4/768][4_768]|
| **L=6** |[6/128][6_128]|[6/256][6_256]|[6/512][6_512]|[6/768][6_768]|
| **L=8** |[8/128][8_128]|[8/256][8_256]|[**8/512 (BERT-Medium)**][8_512]|[8/768][8_768]|
| **L=10** |[10/128][10_128]|[10/256][10_256]|[10/512][10_512]|[10/768][10_768]|
| **L=12** |[12/128][12_128]|[12/256][12_256]|[12/512][12_512]|[**12/768 (BERT-Base)**][12_768]|
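A minimal fill-mask sketch (not from the original card), using the smallest checkpoint from the table above; any of the 24 miniatures can be substituted:

```python
from transformers import pipeline

# The miniatures share the BERT vocabulary, so the standard [MASK] token applies.
unmasker = pipeline("fill-mask", model="uer/chinese_roberta_L-2_H-128")
# "Beijing is the capital of [MASK] country."
print(unmasker("北京是[MASK]国的首都。"))
```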
## Training data
CLUECorpus2020 and CLUECorpusSmall are used as the training corpora.
## Training procedure
Training details can be found in [UER-py](https://github.com/dbiir/UER-py/).
### BibTeX entry and citation info
```
@article{zhao2019uer,
title={UER: An Open-Source Toolkit for Pre-training Models},
author={Zhao, Zhe and Chen, Hui and Zhang, Jinbin and Zhao, Xin and Liu, Tao and Lu, Wei and Chen, Xi and Deng, Haotang and Ju, Qi and Du, Xiaoyong},
journal={EMNLP-IJCNLP 2019},
pages={241},
year={2019}
}
```
[2_128]: https://huggingface.co/uer/chinese_roberta_L-2_H-128
[2_256]: https://huggingface.co/uer/chinese_roberta_L-2_H-256
[2_512]: https://huggingface.co/uer/chinese_roberta_L-2_H-512
[2_768]: https://huggingface.co/uer/chinese_roberta_L-2_H-768
[4_128]: https://huggingface.co/uer/chinese_roberta_L-4_H-128
[4_256]: https://huggingface.co/uer/chinese_roberta_L-4_H-256
[4_512]: https://huggingface.co/uer/chinese_roberta_L-4_H-512
[4_768]: https://huggingface.co/uer/chinese_roberta_L-4_H-768
[6_128]: https://huggingface.co/uer/chinese_roberta_L-6_H-128
[6_256]: https://huggingface.co/uer/chinese_roberta_L-6_H-256
[6_512]: https://huggingface.co/uer/chinese_roberta_L-6_H-512
[6_768]: https://huggingface.co/uer/chinese_roberta_L-6_H-768
[8_128]: https://huggingface.co/uer/chinese_roberta_L-8_H-128
[8_256]: https://huggingface.co/uer/chinese_roberta_L-8_H-256
[8_512]: https://huggingface.co/uer/chinese_roberta_L-8_H-512
[8_768]: https://huggingface.co/uer/chinese_roberta_L-8_H-768
[10_128]: https://huggingface.co/uer/chinese_roberta_L-10_H-128
[10_256]: https://huggingface.co/uer/chinese_roberta_L-10_H-256
[10_512]: https://huggingface.co/uer/chinese_roberta_L-10_H-512
[10_768]: https://huggingface.co/uer/chinese_roberta_L-10_H-768
[12_128]: https://huggingface.co/uer/chinese_roberta_L-12_H-128
[12_256]: https://huggingface.co/uer/chinese_roberta_L-12_H-256
[12_512]: https://huggingface.co/uer/chinese_roberta_L-12_H-512
[12_768]: https://huggingface.co/uer/chinese_roberta_L-12_H-768
---
language: zh
widget:
- text: "[CLS]国 -"
---
# Chinese Couplet GPT2 Model
## Model description
The model is used to generate Chinese couplets. You can download the model either from the [GPT2-Chinese Github page](https://github.com/Morizeyao/GPT2-Chinese), or via HuggingFace from the link [gpt2-chinese-couplet][couplet].
Since the parameter skip_special_tokens is used in pipelines.py, special tokens such as [SEP] and [UNK] are deleted from the output, so the generated results may not be neat.
## How to use
You can use the model directly with a pipeline for text generation:
When the parameter skip_special_tokens is True:
```python
>>> from transformers import BertTokenizer, GPT2LMHeadModel, TextGenerationPipeline
>>> tokenizer = BertTokenizer.from_pretrained("uer/gpt2-chinese-couplet")
>>> model = GPT2LMHeadModel.from_pretrained("uer/gpt2-chinese-couplet")
>>> text_generator = TextGenerationPipeline(model, tokenizer)
>>> text_generator("[CLS]丹 枫 江 冷 人 初 去 -", max_length=25, do_sample=True)
[{'generated_text': '[CLS]丹 枫 江 冷 人 初 去 - 黄 叶 声 从 天 外 来 阅 旗'}]
```
When the parameter skip_special_tokens is False:
```python
>>> from transformers import BertTokenizer, GPT2LMHeadModel, TextGenerationPipeline
>>> tokenizer = BertTokenizer.from_pretrained("uer/gpt2-chinese-couplet")
>>> model = GPT2LMHeadModel.from_pretrained("uer/gpt2-chinese-couplet")
>>> text_generator = TextGenerationPipeline(model, tokenizer)
>>> text_generator("[CLS]丹 枫 江 冷 人 初 去 -", max_length=25, do_sample=True)
[{'generated_text': '[CLS]丹 枫 江 冷 人 初 去 - 黄 叶 声 我 酒 不 辞 [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP]'}]
```
## Training data
The training data contains 700,000 Chinese couplets collected by [couplet-clean-dataset](https://github.com/v-zich/couplet-clean-dataset).
## Training procedure
The model is pre-trained by [UER-py](https://github.com/dbiir/UER-py/) on [Tencent Cloud TI-ONE](https://cloud.tencent.com/product/tione/), for 25,000 steps with a sequence length of 64.
```
python3 preprocess.py --corpus_path corpora/couplet.txt \
--vocab_path models/google_zh_vocab.txt \
--dataset_path couplet.pt --processes_num 16 \
--seq_length 64 --target lm
```
```
python3 pretrain.py --dataset_path couplet.pt \
--vocab_path models/google_zh_vocab.txt \
--output_model_path models/couplet_gpt_base_model.bin \
--config_path models/bert_base_config.json --learning_rate 5e-4 \
--tie_weight --world_size 8 --gpu_ranks 0 1 2 3 4 5 6 7 \
--batch_size 64 --report_steps 1000 \
--save_checkpoint_steps 5000 --total_steps 25000 \
--embedding gpt --encoder gpt2 --target lm
```
### BibTeX entry and citation info
```
@article{zhao2019uer,
title={UER: An Open-Source Toolkit for Pre-training Models},
author={Zhao, Zhe and Chen, Hui and Zhang, Jinbin and Zhao, Xin and Liu, Tao and Lu, Wei and Chen, Xi and Deng, Haotang and Ju, Qi and Du, Xiaoyong},
journal={EMNLP-IJCNLP 2019},
pages={241},
year={2019}
}
```
[couplet]: https://huggingface.co/uer/gpt2-chinese-couplet
---
language: zh
widget:
- text: "[CLS] ,"
- text: "[CLS] ,"
---
# Chinese Poem GPT2 Model
## Model description
The model is used to generate Chinese ancient poems. You can download the model either from the [GPT2-Chinese Github page](https://github.com/Morizeyao/GPT2-Chinese), or via HuggingFace from the link [gpt2-chinese-poem][poem].
Since the parameter skip_special_tokens is used in pipelines.py, special tokens such as [SEP] and [UNK] are deleted from the output, so the generated results may not be neat.
## How to use
You can use the model directly with a pipeline for text generation:
When the parameter skip_special_tokens is True:
```python
>>> from transformers import BertTokenizer, GPT2LMHeadModel, TextGenerationPipeline
>>> tokenizer = BertTokenizer.from_pretrained("uer/gpt2-chinese-poem")
>>> model = GPT2LMHeadModel.from_pretrained("uer/gpt2-chinese-poem")
>>> text_generator = TextGenerationPipeline(model, tokenizer)
>>> text_generator("[CLS]梅 山 如 积 翠 ,", max_length=50, do_sample=True)
[{'generated_text': '[CLS]梅 山 如 积 翠 , 的 手 堪 捧 。 遥 遥 仙 人 尉 , 盘 盘 故 时 陇 。 丹 泉 清 可 鉴 , 石 乳 甘 于 。 行 将 解 尘 缨 , 于 焉 蹈 高 踵 。 我'}]
```
When the parameter skip_special_tokens is False:
```python
>>> from transformers import BertTokenizer, GPT2LMHeadModel, TextGenerationPipeline
>>> tokenizer = BertTokenizer.from_pretrained("uer/gpt2-chinese-poem")
>>> model = GPT2LMHeadModel.from_pretrained("uer/gpt2-chinese-poem")
>>> text_generator = TextGenerationPipeline(model, tokenizer)
>>> text_generator("[CLS]梅 山 如 积 翠 ,", max_length=50, do_sample=True)
[{'generated_text': '[CLS]梅 山 如 积 翠 , 的 [UNK] 手 堪 捧 。 遥 遥 仙 人 尉 , 盘 盘 故 时 陇 。 丹 泉 清 可 鉴 , 石 乳 甘 可 捧 。 银 汉 迟 不 来 , 槎 头 欲 谁 揽 。 何'}]
```
## Training data
The training data contains 800,000 Chinese ancient poems collected by the [chinese-poetry](https://github.com/chinese-poetry/chinese-poetry) and [Poetry](https://github.com/Werneror/Poetry) projects.
## Training procedure
The model is pre-trained by [UER-py](https://github.com/dbiir/UER-py/) on [Tencent Cloud TI-ONE](https://cloud.tencent.com/product/tione/), for 200,000 steps with a sequence length of 128.
```
python3 preprocess.py --corpus_path corpora/poem.txt \
--vocab_path models/google_zh_vocab.txt \
--dataset_path poem.pt --processes_num 16 \
--seq_length 128 --target lm
```
```
python3 pretrain.py --dataset_path poem.pt \
--vocab_path models/google_zh_vocab.txt \
--output_model_path models/poem_gpt_base_model.bin \
--config_path models/bert_base_config.json --learning_rate 5e-4 \
--tie_weight --world_size 8 --gpu_ranks 0 1 2 3 4 5 6 7 \
--batch_size 64 --report_steps 1000 \
--save_checkpoint_steps 50000 --total_steps 200000 \
--embedding gpt --encoder gpt2 --target lm
```
### BibTeX entry and citation info
```
@article{zhao2019uer,
title={UER: An Open-Source Toolkit for Pre-training Models},
author={Zhao, Zhe and Chen, Hui and Zhang, Jinbin and Zhao, Xin and Liu, Tao and Lu, Wei and Chen, Xi and Deng, Haotang and Ju, Qi and Du, Xiaoyong},
journal={EMNLP-IJCNLP 2019},
pages={241},
year={2019}
}
```
[poem]: https://huggingface.co/uer/gpt2-chinese-poem
MIT License
Copyright (c) 2019 Hao Tan
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
# LXMERT
## Model Description
[LXMERT](https://arxiv.org/abs/1908.07490) is a pre-trained multimodal transformer. The model takes an image and a sentence as input and computes cross-modal representations. The model was converted from the [LXMERT github](https://github.com/airsplay/lxmert) by [Antonio Mendoza](https://avmendoza.info/) and is authored by [Hao Tan](https://www.cs.unc.edu/~airsplay/).
![](./lxmert_model-1.jpg?raw=True)
## Usage
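The snippet below is a minimal sketch rather than the full pipeline: LXMERT expects region features and normalized bounding boxes from an object detector (the original repo uses a Faster R-CNN), which are stubbed out here with random tensors of the right shape. It assumes the `unc-nlp/lxmert-base-uncased` checkpoint.

```python
import torch
from transformers import LxmertModel, LxmertTokenizer

tokenizer = LxmertTokenizer.from_pretrained("unc-nlp/lxmert-base-uncased")
model = LxmertModel.from_pretrained("unc-nlp/lxmert-base-uncased")

inputs = tokenizer("A cat sitting on a couch.", return_tensors="pt")
num_boxes = 36                                   # typical number of detected regions
visual_feats = torch.randn(1, num_boxes, 2048)   # stand-in for Faster R-CNN region features
visual_pos = torch.rand(1, num_boxes, 4)         # stand-in for normalized box coordinates

outputs = model(**inputs, visual_feats=visual_feats, visual_pos=visual_pos)
print(outputs.pooled_output.shape)               # cross-modal [CLS] representation
```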
## Training Data and Procedure
The model is jointly trained on multiple vision-and-language datasets.
We included two image captioning datasets (i.e., [MS COCO](http://cocodataset.org/#home), [Visual Genome](https://visualgenome.org/)) and three image-question answering datasets (i.e., [VQA](https://visualqa.org/), [GQA](https://cs.stanford.edu/people/dorarad/gqa/), [VG QA](https://github.com/yukezhu/visual7w-toolkit)). The model is pre-trained on the above datasets for 20 epochs (roughly 670K iterations with batch size 256), which takes around 8 days on 4 Titan V cards. The details of training can be found in the [LXMERT paper](https://arxiv.org/pdf/1908.07490.pdf).
## Eval Results
| Split | [VQA](https://visualqa.org/) | [GQA](https://cs.stanford.edu/people/dorarad/gqa/) | [NLVR2](http://lil.nlp.cornell.edu/nlvr/) |
|----------- |:----: |:---: |:------:|
| Local Validation | 69.90% | 59.80% | 74.95% |
| Test-Dev | 72.42% | 60.00% | 74.45% (Test-P) |
| Test-Standard | 72.54% | 60.33% | 76.18% (Test-U) |
## Reference
```bibtex
@inproceedings{tan2019lxmert,
title={LXMERT: Learning Cross-Modality Encoder Representations from Transformers},
author={Tan, Hao and Bansal, Mohit},
booktitle={Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing},
year={2019}
}
```
---
language: it
tags:
- sentiment
- Italian
license: mit
widget:
- text: 'Giuseppe Rossi è un ottimo politico'
---
# 🤗 + polibert_SA - POLItic BERT based Sentiment Analysis
## Model description
This model performs sentiment analysis on Italian political Twitter sentences. It was trained starting from an instance of "bert-base-italian-uncased-xxl" and fine-tuned on an Italian dataset of tweets. You can try it out at https://www.unideeplearning.com/twitter_sa/ (in Italian!).
#### Hands-on
```python
import torch
from torch import nn
from transformers import AutoTokenizer, AutoModelForSequenceClassification
tokenizer = AutoTokenizer.from_pretrained("unideeplearning/polibert_sa")
model = AutoModelForSequenceClassification.from_pretrained("unideeplearning/polibert_sa")
text = "Giuseppe Rossi è un pessimo politico"
input_ids = tokenizer.encode(text, add_special_tokens=True, return_tensors= 'pt')
logits, = model(input_ids)
logits = logits.squeeze(0)
prob = nn.functional.softmax(logits, dim=0)
# 0 Negative, 1 Neutral, 2 Positive
print(prob.argmax().tolist())
```
#### Hyperparameters
- Optimizer: **AdamW** with learning rate of **2e-5**, epsilon of **1e-8**
- Max epochs: **2**
- Batch size: **16**
## Acknowledgments
Thanks for the support from [Hugging Face](https://huggingface.co/), https://www.unioneprofessionisti.com, and https://www.unideeplearning.com/.
---
language: ur
thumbnail: https://raw.githubusercontent.com/urduhack/urduhack/master/docs/_static/urduhack.png
tags:
- roberta-urdu-small
- urdu
- transformers
license: mit
---
## roberta-urdu-small
[![License: MIT](https://img.shields.io/badge/license-MIT-blue.svg)](https://github.com/urduhack/urduhack/blob/master/LICENSE)
### Overview
**Language model:** roberta-urdu-small
**Model size:** 125M
**Language:** Urdu
**Training data:** News data from Urdu news resources in Pakistan
### About roberta-urdu-small
roberta-urdu-small is a language model for the Urdu language.
```python
from transformers import pipeline
fill_mask = pipeline("fill-mask", model="urduhack/roberta-urdu-small", tokenizer="urduhack/roberta-urdu-small")
```
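A hypothetical call, continuing from the pipeline above (the Urdu prompt is only an illustration; the model uses the standard RoBERTa `<mask>` token):

```python
# Illustrative prompt: "This is a <mask> day."
print(fill_mask("یہ ایک <mask> دن ہے۔"))
```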
## Training procedure
roberta-urdu-small was trained on an Urdu news corpus. The training data was normalized using the normalization module from Urduhack to eliminate characters from other languages, such as Arabic.
### About Urduhack
Urduhack is a Natural Language Processing (NLP) library for the Urdu language.
Github: https://github.com/urduhack/urduhack
---
datasets:
- squad
---
# BART-LARGE finetuned on SQuADv1
This is a bart-large model fine-tuned on the SQuADv1 dataset for the question answering task.
## Model details
BART was proposed in the [paper](https://arxiv.org/abs/1910.13461) **BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension**.
BART is a seq2seq model intended for both NLG and NLU tasks.
To use BART for question answering tasks, we feed the complete document into the encoder and decoder, and use the top
hidden state of the decoder as a representation for each
word. This representation is used to classify the token. As reported in the paper, bart-large achieves results comparable to RoBERTa on SQuAD.
Another notable thing about BART is that it can handle sequences of up to 1024 tokens.
| Param | #Value |
|---------------------|--------|
| encoder layers | 12 |
| decoder layers | 12 |
| ffn hidden size     | 4096   |
| num attention heads | 16     |
| on disk size | 1.63GB |
## Model training
This model was trained on google colab v100 GPU.
You can find the fine-tuning colab here
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1I5cK1M_0dLaf5xoewh6swcm5nAInfwHy?usp=sharing).
## Results
The results are actually slightly worse than given in the paper.
In the paper, the authors report that bart-large achieves 88.8 EM and 94.6 F1.
| Metric | #Value |
|--------|--------|
| EM | 86.8022|
| F1 | 92.7342|
## Model in Action 🚀
```python
from transformers import BartTokenizer, BartForQuestionAnswering
import torch
tokenizer = BartTokenizer.from_pretrained('valhalla/bart-large-finetuned-squadv1')
model = BartForQuestionAnswering.from_pretrained('valhalla/bart-large-finetuned-squadv1')
question, text = "Who was Jim Henson?", "Jim Henson was a nice puppet"
encoding = tokenizer(question, text, return_tensors='pt')
input_ids = encoding['input_ids']
attention_mask = encoding['attention_mask']
start_scores, end_scores = model(input_ids, attention_mask=attention_mask, output_attentions=False)[:2]
all_tokens = tokenizer.convert_ids_to_tokens(input_ids[0])
answer = ' '.join(all_tokens[torch.argmax(start_scores) : torch.argmax(end_scores)+1])
answer = tokenizer.convert_tokens_to_ids(answer.split())
answer = tokenizer.decode(answer)
#answer => 'a nice puppet'
```
> Created with ❤️ by Suraj Patil [![Github icon](https://cdn0.iconfinder.com/data/icons/octicons/1024/mark-github-32.png)](https://github.com/patil-suraj/)
[![Twitter icon](https://cdn0.iconfinder.com/data/icons/shift-logotypes/32/Twitter-32.png)](https://twitter.com/psuraj28)