[model_cards] Migrate cards from this repo to model repos on huggingface.co (#9013)

* rm all model cards
* Update the .rst
  @sgugger it is still not super crystal clear/streamlined so let me know if any ideas to make it simpler
* Add a rootlevel README.md with simple instructions/context
* Update docs/source/model_sharing.rst
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Apply suggestions from code review
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
  Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* make style
* rm all model cards

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

Unverified commit 3552d0e0, authored Dec 12, 2020 by Julien Chaumond, committed by GitHub Dec 11, 2020
20 changed files
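
As a rough illustration of the migration this commit describes, the sketch below maps a legacy `model_cards/` path from this diff to the model id it belongs to, and to a card URL on huggingface.co. The helper names and the `/blob/main/README.md` URL layout are assumptions made for illustration; they are not taken from the commit itself.

```python
from pathlib import PurePosixPath

HUB_URL = "https://huggingface.co"  # hub host named in the commit title
ROOT_SUFFIX = "-README.md"          # naming used by root-level cards in this diff


def model_id_from_card_path(path: str) -> str:
    """Map a legacy model_cards/ path to a model id (hypothetical helper)."""
    p = PurePosixPath(path)
    parts = p.parts
    if not parts or parts[0] != "model_cards":
        raise ValueError(f"not a model card path: {path}")
    if p.name == "README.md":
        # model_cards/<user>/<model>/README.md -> <user>/<model>
        model_id = "/".join(parts[1:-1])
    elif p.name.endswith(ROOT_SUFFIX):
        # model_cards/<model>-README.md -> <model>
        model_id = p.name[: -len(ROOT_SUFFIX)]
    else:
        model_id = ""
    if not model_id:
        raise ValueError(f"cannot derive a model id from: {path}")
    return model_id


def hub_card_url(path: str) -> str:
    """Build the assumed location of the migrated card on the hub."""
    return f"{HUB_URL}/{model_id_from_card_path(path)}/blob/main/README.md"


print(hub_card_url("model_cards/ahotrod/roberta_large_squad2/README.md"))
print(hub_card_url("model_cards/albert-base-v1-README.md"))
```

Both path shapes handled here (a `README.md` inside a user/model directory, and a root-level `<model>-README.md`) appear in the file list of this diff.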
--- a/model_cards/ahotrod/albert_xxlargev1_squad2_512/README.md
+++ b/model_cards/ahotrod/albert_xxlargev1_squad2_512/README.md
-## Albert xxlarge version 1 language model fine-tuned on SQuAD2.0
-###  (updated 30Sept2020) with the following results:
-```
-exact: 86.11134506864315
-f1: 89.35371214945009
-total': 11873
-HasAns_exact': 83.56950067476383
-HasAns_f1': 90.06353312254078
-HasAns_total': 5928
-NoAns_exact': 88.64592094196804
-NoAns_f1': 88.64592094196804
-NoAns_total': 5945
-best_exact': 86.11134506864315
-best_exact_thresh': 0.0
-best_f1': 89.35371214944985
-best_f1_thresh': 0.0
-```
-### from script:
-```
-python ${EXAMPLES}/run_squad.py \
-  --model_type albert \
-  --model_name_or_path albert-xxlarge-v1 \
-  --do_train \
-  --do_eval \
-  --train_file ${SQUAD}/train-v2.0.json \
-  --predict_file ${SQUAD}/dev-v2.0.json \
-  --version_2_with_negative \
-  --do_lower_case \
-  --num_train_epochs 3 \
-  --max_steps 8144 \
-  --warmup_steps 814 \
-  --learning_rate 3e-5 \
-  --max_seq_length 512 \
-  --doc_stride 128 \
-  --per_gpu_train_batch_size 6 \
-  --gradient_accumulation_steps 8 \
-  --per_gpu_eval_batch_size 48 \
-  --fp16 \
-  --fp16_opt_level O1 \
-  --threads 12 \
-  --logging_steps 50 \
-  --save_steps 3000 \
-  --overwrite_output_dir \
-  --output_dir ${MODEL_PATH}
-```
-### using the following software & system:
-```
-Transformers: 3.1.0
-PyTorch: 1.6.0
-TensorFlow: 2.3.1
-Python: 3.8.1
-OS: Linux-5.4.0-48-generic-x86_64-with-glibc2.10
-CPU/GPU: Intel i9-9900K / NVIDIA Titan RTX 24GB
-```
--- a/model_cards/ahotrod/electra_large_discriminator_squad2_512/README.md
+++ b/model_cards/ahotrod/electra_large_discriminator_squad2_512/README.md
-## ELECTRA_large_discriminator language model fine-tuned on SQuAD2.0
-### with the following results:
-```
-  "exact": 87.09677419354838,
-  "f1": 89.98343832723452,
-  "total": 11873,
-  "HasAns_exact": 84.66599190283401,
-  "HasAns_f1": 90.44759839056285,
-  "HasAns_total": 5928,
-  "NoAns_exact": 89.52060555088309,
-  "NoAns_f1": 89.52060555088309,
-  "NoAns_total": 5945,
-  "best_exact": 87.09677419354838,
-  "best_exact_thresh": 0.0,
-  "best_f1": 89.98343832723432,
-  "best_f1_thresh": 0.0
-```
-### from script:
-```
-python ${EXAMPLES}/run_squad.py \
-  --model_type electra \
-  --model_name_or_path google/electra-large-discriminator \
-  --do_train \
-  --do_eval \
-  --train_file ${SQUAD}/train-v2.0.json \
-  --predict_file ${SQUAD}/dev-v2.0.json \
-  --version_2_with_negative \
-  --do_lower_case \
-  --num_train_epochs 3 \
-  --warmup_steps 306 \
-  --weight_decay 0.01 \
-  --learning_rate 3e-5 \
-  --max_grad_norm 0.5 \
-  --adam_epsilon 1e-6 \
-  --max_seq_length 512 \
-  --doc_stride 128 \
-  --per_gpu_train_batch_size 8 \
-  --gradient_accumulation_steps 16 \
-  --per_gpu_eval_batch_size 128 \
-  --fp16 \
-  --fp16_opt_level O1 \
-  --threads 12 \
-  --logging_steps 50 \
-  --save_steps 1000 \
-  --overwrite_output_dir \
-  --output_dir ${MODEL_PATH}
-```
-### using the following system & software:
-```
-Transformers: 2.11.0
-PyTorch: 1.5.0
-TensorFlow: 2.2.0
-Python: 3.8.1
-OS/Platform: Linux-5.3.0-59-generic-x86_64-with-glibc2.10
-CPU/GPU: Intel i9-9900K / NVIDIA Titan RTX 24GB
-```
--- a/model_cards/ahotrod/roberta_large_squad2/README.md
+++ b/model_cards/ahotrod/roberta_large_squad2/README.md
-## RoBERTa-large language model fine-tuned on SQuAD2.0
-### with the following results:
-```
-  "exact": 84.46896319380106,
-  "f1": 87.85388093408943,
-  "total": 11873,
-  "HasAns_exact": 81.37651821862349,
-  "HasAns_f1": 88.1560607844881,
-  "HasAns_total": 5928,
-  "NoAns_exact": 87.55256518082422,
-  "NoAns_f1": 87.55256518082422,
-  "NoAns_total": 5945,
-  "best_exact": 84.46896319380106,
-  "best_exact_thresh": 0.0,
-  "best_f1": 87.85388093408929,
-  "best_f1_thresh": 0.0
-```
-### from script:
-```
-python ${EXAMPLES}/run_squad.py \
-  --model_type roberta \
-  --model_name_or_path roberta-large \
-  --do_train \
-  --do_eval \
-  --train_file ${SQUAD}/train-v2.0.json \
-  --predict_file ${SQUAD}/dev-v2.0.json \
-  --version_2_with_negative \
-  --do_lower_case \
-  --num_train_epochs 3 \
-  --warmup_steps 1642 \
-  --weight_decay 0.01 \
-  --learning_rate 3e-5 \
-  --adam_epsilon 1e-6 \
-  --max_seq_length 512 \
-  --doc_stride 128 \
-  --per_gpu_train_batch_size 8 \
-  --gradient_accumulation_steps 6 \
-  --per_gpu_eval_batch_size 48 \
-  --threads 12 \
-  --logging_steps 50 \
-  --save_steps 2000 \
-  --overwrite_output_dir \
-  --output_dir ${MODEL_PATH}
-$@
-```
-### using the following system & software:
-```
-Transformers: 2.7.0
-PyTorch: 1.4.0
-TensorFlow: 2.1.0
-Python: 3.7.7
-OS/Platform: Linux-5.3.0-46-generic-x86_64-with-debian-buster-sid
-CPU/GPU: Intel i9-9900K / NVIDIA Titan RTX 24GB
-```
--- a/model_cards/ahotrod/xlnet_large_squad2_512/README.md
+++ b/model_cards/ahotrod/xlnet_large_squad2_512/README.md
-## XLNet large language model fine-tuned on SQuAD2.0
-### with the following results:
-```
-  "exact": 82.07698138633876,
-  "f1": 85.898874470488,
-  "total": 11873,
-  "HasAns_exact": 79.60526315789474,
-  "HasAns_f1": 87.26000954590184,
-  "HasAns_total": 5928,
-  "NoAns_exact": 84.54163162321278,
-  "NoAns_f1": 84.54163162321278,
-  "NoAns_total": 5945,
-  "best_exact": 83.22243746315169,
-  "best_exact_thresh": -11.112004280090332,
-  "best_f1": 86.88541353813282,
-  "best_f1_thresh": -11.112004280090332
-```
-### from script:
-```
-python -m torch.distributed.launch --nproc_per_node=2 ${RUN_SQUAD_DIR}/run_squad.py \
-  --model_type xlnet \
-  --model_name_or_path xlnet-large-cased \
-  --do_train \
-  --train_file ${SQUAD_DIR}/train-v2.0.json \
-  --predict_file ${SQUAD_DIR}/dev-v2.0.json \
-  --version_2_with_negative \
-  --num_train_epochs 3 \
-  --learning_rate 3e-5 \
-  --adam_epsilon 1e-6 \
-  --max_seq_length 512 \
-  --doc_stride 128 \
-  --save_steps 2000 \
-  --per_gpu_train_batch_size 1 \
-  --gradient_accumulation_steps 24 \
-  --output_dir ${MODEL_PATH}
-CUDA_VISIBLE_DEVICES=0 python ${RUN_SQUAD_DIR}/run_squad_II.py \
-  --model_type xlnet \
-  --model_name_or_path ${MODEL_PATH} \
-  --do_eval \
-  --train_file ${SQUAD_DIR}/train-v2.0.json \
-  --predict_file ${SQUAD_DIR}/dev-v2.0.json \
-  --version_2_with_negative \
-  --max_seq_length 512 \
-  --per_gpu_eval_batch_size 48 \
-  --output_dir ${MODEL_PATH}
-```
-### using the following system & software:
-```
-OS/Platform: Linux-4.15.0-76-generic-x86_64-with-debian-buster-sid
-GPU/CPU: 2 x NVIDIA 1080Ti / Intel i7-8700
-Transformers: 2.1.1
-PyTorch: 1.4.0
-TensorFlow: 2.1.0
-Python: 3.7.6
-```
-### Utilize this xlnet_large_squad2_512 fine-tuned model with:
-```python
-tokenizer = AutoTokenizer.from_pretrained("ahotrod/xlnet_large_squad2_512")
-model = AutoModelForQuestionAnswering.from_pretrained("ahotrod/xlnet_large_squad2_512")
-```
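
The four ahotrod training commands above combine a small per-GPU batch with gradient accumulation. A minimal sketch of the resulting effective batch size follows, assuming it is simply the product of per-GPU batch size, accumulation steps, and process count. The GPU count of 1 for the first three runs is an assumption based on their plain `python` launch; the XLNet run states `--nproc_per_node=2`.

```python
def effective_batch_size(per_gpu_batch_size: int,
                         grad_accum_steps: int,
                         num_gpus: int = 1) -> int:
    """Examples consumed per optimizer step (illustrative helper)."""
    return per_gpu_batch_size * grad_accum_steps * num_gpus


# Values copied from the --per_gpu_train_batch_size and
# --gradient_accumulation_steps flags in the commands above.
runs = {
    "albert_xxlargev1_squad2_512": effective_batch_size(6, 8),
    "electra_large_discriminator_squad2_512": effective_batch_size(8, 16),
    "roberta_large_squad2": effective_batch_size(8, 6),
    "xlnet_large_squad2_512": effective_batch_size(1, 24, num_gpus=2),
}

for name, size in runs.items():
    print(f"{name}: {size}")
```

Under these assumptions three of the four runs land on the same effective batch size, with the ELECTRA run using a larger one.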
--- a/model_cards/ai4bharat/indic-bert/README.md
+++ b/model_cards/ai4bharat/indic-bert/README.md
---
-language: en
-license: mit
-datasets:
- AI4Bharat IndicNLP Corpora
---
-# IndicBERT
-IndicBERT is a multilingual ALBERT model pretrained exclusively on 12 major Indian languages. It is pre-trained on our novel monolingual corpus of around 9 billion tokens and subsequently evaluated on a set of diverse tasks. IndicBERT has much fewer parameters than other multilingual models (mBERT, XLM-R etc.) while it also achieves a performance on-par or better than these models.
-The 12 languages covered by IndicBERT are: Assamese, Bengali, English, Gujarati, Hindi, Kannada, Malayalam, Marathi, Oriya, Punjabi, Tamil, Telugu.
-The code can be found [here](https://github.com/divkakwani/indic-bert). For more information, checkout our [project page](https://indicnlp.ai4bharat.org/) or our [paper](https://indicnlp.ai4bharat.org/papers/arxiv2020_indicnlp_corpus.pdf).
-## Pretraining Corpus
-We pre-trained indic-bert on AI4Bharat's monolingual corpus. The corpus has the following distribution of languages:
-| Language          | as     | bn     | en     | gu     | hi     | kn     |         |
-| ----------------- | ------ | ------ | ------ | ------ | ------ | ------ | ------- |
-| **No. of Tokens** | 36.9M  | 815M   | 1.34B  | 724M   | 1.84B  | 712M   |         |
-| **Language**      | **ml** | **mr** | **or** | **pa** | **ta** | **te** | **all** |
-| **No. of Tokens** | 767M   | 560M   | 104M   | 814M   | 549M   | 671M   | 8.9B    |
-## Evaluation Results
-IndicBERT is evaluated on IndicGLUE and some additional tasks. The results are summarized below. For more details about the tasks, refer our [official repo](https://github.com/divkakwani/indic-bert)
-#### IndicGLUE
-Task | mBERT | XLM-R | IndicBERT
-----| ----- | ----- | ------ 
-News Article Headline Prediction | 89.58 | 95.52 | **95.87** 
-Wikipedia Section Title Prediction| **73.66** | 66.33 | 73.31 
-Cloze-style multiple-choice QA | 39.16 | 27.98 | **41.87** 
-Article Genre Classification | 90.63 | 97.03 | **97.34** 
-Named Entity Recognition (F1-score) | **73.24** | 65.93 | 64.47 
-Cross-Lingual Sentence Retrieval Task | 21.46 | 13.74 | **27.12** 
-Average | 64.62 | 61.09 | **66.66** 
-#### Additional Tasks
-Task | Task Type | mBERT | XLM-R | IndicBERT 
-----| ----- | ----- | ------ | ----- 
-BBC News Classification | Genre Classification | 60.55 | **75.52** | 74.60 
-IIT Product Reviews | Sentiment Analysis | 74.57 | **78.97** | 71.32 
-IITP Movie Reviews | Sentiment Analaysis | 56.77 | **61.61** | 59.03 
-Soham News Article | Genre Classification | 80.23 | **87.6** | 78.45 
-Midas Discourse | Discourse Analysis | 71.20 | **79.94** | 78.44 
-iNLTK Headlines Classification | Genre Classification | 87.95 | 93.38 | **94.52** 
-ACTSA Sentiment Analysis | Sentiment Analysis | 48.53 | 59.33 | **61.18** 
-Winograd NLI | Natural Language Inference | 56.34 | 55.87 | **56.34** 
-Choice of Plausible Alternative (COPA) | Natural Language Inference | 54.92 | 51.13 | **58.33** 
-Amrita Exact Paraphrase | Paraphrase Detection | **93.81** | 93.02 | 93.75 
-Amrita Rough Paraphrase | Paraphrase Detection | 83.38 | 82.20 | **84.33** 
-Average |  |  69.84 | **74.42** | 73.66 
-\* Note: all models have been restricted to a max_seq_length of 128.
-## Downloads
-The model can be downloaded [here](https://storage.googleapis.com/ai4bharat-public-indic-nlp-corpora/models/indic-bert-v1.tar.gz). Both tf checkpoints and pytorch binaries are included in the archive. Alternatively, you can also download it from [Huggingface](https://huggingface.co/ai4bharat/indic-bert).
-## Citing
-If you are using any of the resources, please cite the following article:
-```
-@inproceedings{kakwani2020indicnlpsuite,
-    title={{IndicNLPSuite: Monolingual Corpora, Evaluation Benchmarks and Pre-trained Multilingual Language Models for Indian Languages}},
-    author={Divyanshu Kakwani and Anoop Kunchukuttan and Satish Golla and Gokul N.C. and Avik Bhattacharyya and Mitesh M. Khapra and Pratyush Kumar},
-    year={2020},
-    booktitle={Findings of EMNLP},
-}
-```
-We would like to hear from you if:
- You are using our resources. Please let us know how you are putting these resources to use.
- You have any feedback on these resources.
-## License
-The IndicBERT code (and models) are released under the MIT License.
-## Contributors
- Divyanshu Kakwani
- Anoop Kunchukuttan
- Gokul NC
- Satish Golla
- Avik Bhattacharyya
- Mitesh Khapra
- Pratyush Kumar
-This work is the outcome of a volunteer effort as part of [AI4Bharat initiative](https://ai4bharat.org).
-## Contact
- Anoop Kunchukuttan ([anoop.kunchukuttan@gmail.com](mailto:anoop.kunchukuttan@gmail.com))
- Mitesh Khapra ([miteshk@cse.iitm.ac.in](mailto:miteshk@cse.iitm.ac.in))
- Pratyush Kumar ([pratyush@cse.iitm.ac.in](mailto:pratyush@cse.iitm.ac.in))
--- a/model_cards/akhooli/gpt2-small-arabic-poetry/README.md
+++ b/model_cards/akhooli/gpt2-small-arabic-poetry/README.md
---
-language: "ar"
-tags:
- text-generation
-license: ""
-datasets:
- Arabic poetry from several eras
---
-# GPT2-Small-Arabic-Poetry
-## Model description
-Fine-tuned model of Arabic poetry dataset based on gpt2-small-arabic.
-## Intended uses & limitations
-#### How to use
-An example is provided in this [colab notebook](https://colab.research.google.com/drive/1mRl7c-5v-Klx27EEAEOAbrfkustL4g7a?usp=sharing).
-#### Limitations and bias
-Both the GPT2-small-arabic (trained on Arabic Wikipedia) and this model have several limitations in terms of coverage and training performance. 
-Use them as demonstrations or proof of concepts but not as production code.
-## Training data
-This pretrained model used the [Arabic Poetry dataset](https://www.kaggle.com/ahmedabelal/arabic-poetry) from 9 different eras with a total of around 40k poems. 
-The dataset was trained (fine-tuned) based on the [gpt2-small-arabic](https://huggingface.co/akhooli/gpt2-small-arabic) transformer model.
-## Training procedure
-Training was done using [Simple Transformers](https://github.com/ThilinaRajapakse/simpletransformers) library on Kaggle, using free GPU.
-## Eval results 
-Final perplexity reached ws 76.3, loss: 4.33
-### BibTeX entry and citation info
-```bibtex
-@inproceedings{Abed Khooli,
-  year={2020}
-}
-```
--- a/model_cards/akhooli/gpt2-small-arabic/README.md
+++ b/model_cards/akhooli/gpt2-small-arabic/README.md
---
-language: "ar"
-datasets:
- Arabic Wikipedia
-metrics:
- none
---
-# GPT2-Small-Arabic
-## Model description
-GPT2 model from Arabic Wikipedia dataset based on gpt2-small (using Fastai2).
-## Intended uses & limitations
-#### How to use
-An example is provided in this [colab notebook](https://colab.research.google.com/drive/1mRl7c-5v-Klx27EEAEOAbrfkustL4g7a?usp=sharing). 
-Both text and poetry (fine-tuned model) generation are included.
-#### Limitations and bias
-GPT2-small-arabic (trained on Arabic Wikipedia) has several limitations in terms of coverage (Arabic Wikipeedia quality, no diacritics) and training performance. 
-Use as demonstration or proof of concepts but not as production code.
-## Training data
-This pretrained model used the Arabic Wikipedia dump (around 900 MB). 
-## Training procedure
-Training was done using [Fastai2](https://github.com/fastai/fastai2/) library on Kaggle, using free GPU.
-## Eval results 
-Final perplexity reached was 72.19,  loss: 4.28, accuracy: 0.307
-### BibTeX entry and citation info
-```bibtex
-@inproceedings{Abed Khooli,
-  year={2020}
-}
-```
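
The two GPT2 Arabic cards above each report a perplexity alongside a loss. As a hedged sanity check, the sketch below assumes the reported loss is a mean cross-entropy in nats, in which case perplexity is its exponential. The cards do not state how either number was computed, so small differences are expected.

```python
import math


def perplexity_from_loss(loss: float) -> float:
    """Perplexity implied by a mean cross-entropy loss in nats (assumed)."""
    return math.exp(loss)


# (reported loss, reported perplexity) as given in the two cards above.
reported = {
    "gpt2-small-arabic": (4.28, 72.19),
    "gpt2-small-arabic-poetry": (4.33, 76.3),
}

for name, (loss, ppl) in reported.items():
    implied = perplexity_from_loss(loss)
    print(f"{name}: reported {ppl}, implied by loss {implied:.2f}")
```

The implied values sit close to the reported ones, which is consistent with the losses having been rounded to two decimals.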
--- a/model_cards/akhooli/mbart-large-cc25-ar-en/README.md
+++ b/model_cards/akhooli/mbart-large-cc25-ar-en/README.md
---
-tags:
- translation
-language:
- ar
- en
-license: mit
---
-### mbart-large-ar-en
-This is mbart-large-cc25, finetuned on a subset of the OPUS corpus for ar_en.   
-Usage: see [example notebook](https://colab.research.google.com/drive/1I6RFOWMaTpPBX7saJYjnSTddW0TD6H1t?usp=sharing)  
-Note: model has limited training set, not fully trained (do not use for production).   
-Other models by me: [Abed Khooli](https://huggingface.co/akhooli)  
--- a/model_cards/akhooli/mbart-large-cc25-en-ar/README.md
+++ b/model_cards/akhooli/mbart-large-cc25-en-ar/README.md
---
-tags:
- translation
-language:
- en
- ar
-license: mit
---
-### mbart-large-en-ar
-This is mbart-large-cc25, finetuned on a subset of the UN corpus for en_ar.  
-Usage: see [example notebook](https://colab.research.google.com/drive/1I6RFOWMaTpPBX7saJYjnSTddW0TD6H1t?usp=sharing) 
-Note: model has limited training set, not fully trained (do not use for production). 
--- a/model_cards/akhooli/personachat-arabic/README.md
+++ b/model_cards/akhooli/personachat-arabic/README.md
---
-tags:
- conversational
-language:
- ar
-license: mit
---
-## personachat-arabic (conversational AI)
-This is personachat-arabic, using a subset from the persona-chat validation dataset, machine translated to Arabic (from English) 
-and fine-tuned from [akhooli/gpt2-small-arabic](https://huggingface.co/akhooli/gpt2-small-arabic) which is a limited text generation model.  
-Usage: see the last section of this [example notebook](https://colab.research.google.com/drive/1I6RFOWMaTpPBX7saJYjnSTddW0TD6H1t?usp=sharing) 
-Note: model has limited training set which was machine translated (do not use for production). 
--- a/model_cards/akhooli/xlm-r-large-arabic-sent/README.md
+++ b/model_cards/akhooli/xlm-r-large-arabic-sent/README.md
---
-language:
- ar
- en
-license: mit
---
-### xlm-r-large-arabic-sent 
-Multilingual sentiment classification (Label_0: mixed, Label_1: negative, Label_2: positive) of Arabic reviews by fine-tuning XLM-Roberta-Large. 
-Zero shot classification of other languages (also works in mixed languages - ex. Arabic & English). Mixed category is not accurate and may confuse other 
-classes (was based on a rate of 3 out of 5 in reviews).  
-Usage: see last section in this [Colab notebook](https://lnkd.in/d3bCFyZ)
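
The label scheme in the sentiment card above can be sketched as a small lookup. The function name and the plain list-of-scores input are illustrative assumptions; only the index-to-label mapping comes from the card.

```python
from typing import Sequence

# Index-to-name mapping as described in the card (Label_0, Label_1, Label_2).
ID2LABEL = {0: "mixed", 1: "negative", 2: "positive"}


def predicted_label(scores: Sequence[float]) -> str:
    """Return the label name for the highest-scoring class (hypothetical helper)."""
    if len(scores) != len(ID2LABEL):
        raise ValueError(f"expected {len(ID2LABEL)} scores, got {len(scores)}")
    best = max(range(len(scores)), key=lambda i: scores[i])
    return ID2LABEL[best]


print(predicted_label([0.1, 0.2, 0.7]))
```

Since the card notes that the mixed category is unreliable, a caller may want to treat a `mixed` prediction with extra caution.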
--- a/model_cards/akhooli/xlm-r-large-arabic-toxic/README.md
+++ b/model_cards/akhooli/xlm-r-large-arabic-toxic/README.md
---
-language:
- ar
- en
-license: mit
---
-### xlm-r-large-arabic-toxic (toxic/hate speech classifier) 
-Toxic (hate speech) classification (Label_0: non-toxic, Label_1: toxic) of Arabic comments by fine-tuning XLM-Roberta-Large. 
-Zero shot classification of other languages (also works in mixed languages - ex. Arabic & English).  
-Usage and further info: see last section in this [Colab notebook](https://lnkd.in/d3bCFyZ)
--- a/model_cards/albert-base-v1-README.md
+++ b/model_cards/albert-base-v1-README.md
---
-tags:
- exbert
-license: apache-2.0
---
-<a href="https://huggingface.co/exbert/?model=albert-base-v1">
-	<img width="300px" src="https://cdn-media.huggingface.co/exbert/button.png">
-</a>
--- a/model_cards/albert-xxlarge-v2-README.md
+++ b/model_cards/albert-xxlarge-v2-README.md
---
-tags:
- exbert
-license: apache-2.0
---
-<a href="https://huggingface.co/exbert/?model=albert-xxlarge-v2">
-	<img width="300px" src="https://cdn-media.huggingface.co/exbert/button.png">
-</a>
\ No newline at end of file
--- a/model_cards/aliosm/ComVE-distilgpt2/README.md
+++ b/model_cards/aliosm/ComVE-distilgpt2/README.md
---
-language: "en"
-tags:
- exbert
- commonsense
- semeval2020
- comve
-license: "mit"
-datasets:
- ComVE
-metrics:
- bleu
-widget:
- text: "Chicken can swim in water. <|continue|>"
---
-# ComVE-distilgpt2
-## Model description
-Finetuned model on Commonsense Validation and Explanation (ComVE) dataset introduced in [SemEval2020 Task4](https://competitions.codalab.org/competitions/21080) using a causal language modeling (CLM) objective.
-The model is able to generate a reason why a given natural language statement is against commonsense.
-## Intended uses & limitations
-You can use the raw model for text generation to generate reasons why natural language statements are against commonsense.
-#### How to use
-You can use this model directly to generate reasons why the given statement is against commonsense using [`generate.sh`](https://github.com/AliOsm/SemEval2020-Task4-ComVE/tree/master/TaskC-Generation) script.
-*Note:* make sure that you are using version `2.4.1` of `transformers` package. Newer versions has some issue in text generation and the model repeats the last token generated again and again.
-#### Limitations and bias
-The model biased to negate the entered sentence usually instead of producing a factual reason.
-## Training data
-The model is initialized from the [distilgpt2](https://github.com/huggingface/transformers/blob/master/model_cards/distilgpt2-README.md) model and finetuned using [ComVE](https://github.com/wangcunxiang/SemEval2020-Task4-Commonsense-Validation-and-Explanation) dataset which contains 10K against commonsense sentences, each of them is paired with three reference reasons.
-## Training procedure
-Each natural language statement that against commonsense is concatenated with its reference reason with `<|continue|>` as a separator, then the model finetuned using CLM objective.
-The model trained on Nvidia Tesla P100 GPU from Google Colab platform with 5e-5 learning rate, 15 epochs, 128 maximum sequence length and 64 batch size.
-<center>
-  <img src="https://i.imgur.com/xKbrwBC.png">
-</center>
-## Eval results
-The model achieved 13.7582/13.8026 BLEU scores on SemEval2020 Task4: Commonsense Validation and Explanation development and testing dataset.
-### BibTeX entry and citation info
-```bibtex
-@article{fadel2020justers,
-  title={JUSTers at SemEval-2020 Task 4: Evaluating Transformer Models Against Commonsense Validation and Explanation},
-  author={Fadel, Ali and Al-Ayyoub, Mahmoud and Cambria, Erik},
-  year={2020}
-}
-```
-<a href="https://huggingface.co/exbert/?model=aliosm/ComVE-distilgpt2">
-	<img width="300px" src="https://cdn-media.huggingface.co/exbert/button.png">
-</a>
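
The training procedure in the ComVE card above joins each statement and its reference reason with a `<|continue|>` separator. A minimal sketch of that formatting follows. The single spaces around the separator are inferred from the widget text in the card's front matter, the helper names are hypothetical, and the example reason is made up for illustration. The actual preprocessing lives in the author's repository and may differ.

```python
SEPARATOR = "<|continue|>"  # separator token named in the card


def build_training_text(statement: str, reason: str) -> str:
    """Join a statement and one reference reason for CLM fine-tuning (assumed layout)."""
    return f"{statement} {SEPARATOR} {reason}"


def build_prompt(statement: str) -> str:
    """Format a statement so the model continues with a reason."""
    return f"{statement} {SEPARATOR}"


print(build_training_text("Chicken can swim in water.", "Chickens cannot swim."))
print(build_prompt("Chicken can swim in water."))
```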
--- a/model_cards/aliosm/ComVE-gpt2-large/README.md
+++ b/model_cards/aliosm/ComVE-gpt2-large/README.md
---
-language: "en"
-tags:
- gpt2
- exbert
- commonsense
- semeval2020
- comve
-license: "mit"
-datasets:
- https://github.com/wangcunxiang/SemEval2020-Task4-Commonsense-Validation-and-Explanation
-metrics:
- bleu
-widget:
- text: "Chicken can swim in water. <|continue|>"
---
-# ComVE-gpt2-large
-## Model description
-Finetuned model on Commonsense Validation and Explanation (ComVE) dataset introduced in [SemEval2020 Task4](https://competitions.codalab.org/competitions/21080) using a causal language modeling (CLM) objective.
-The model is able to generate a reason why a given natural language statement is against commonsense.
-## Intended uses & limitations
-You can use the raw model for text generation to generate reasons why natural language statements are against commonsense.
-#### How to use
-You can use this model directly to generate reasons why the given statement is against commonsense using [`generate.sh`](https://github.com/AliOsm/SemEval2020-Task4-ComVE/tree/master/TaskC-Generation) script.
-*Note:* make sure that you are using version `2.4.1` of `transformers` package. Newer versions has some issue in text generation and the model repeats the last token generated again and again.
-#### Limitations and bias
-The model biased to negate the entered sentence usually instead of producing a factual reason.
-## Training data
-The model is initialized from the [gpt2-large](https://github.com/huggingface/transformers/blob/master/model_cards/gpt2-README.md) model and finetuned using [ComVE](https://github.com/wangcunxiang/SemEval2020-Task4-Commonsense-Validation-and-Explanation) dataset which contains 10K against commonsense sentences, each of them is paired with three reference reasons.
-## Training procedure
-Each natural language statement that against commonsense is concatenated with its reference reason with `<|conteniue|>` as a separator, then the model finetuned using CLM objective.
-The model trained on Nvidia Tesla P100 GPU from Google Colab platform with 5e-5 learning rate, 5 epochs, 128 maximum sequence length and 64 batch size.
-<center>
-  <img src="https://i.imgur.com/xKbrwBC.png">
-</center>
-## Eval results
-The model achieved 16.5110/15.9299 BLEU scores on SemEval2020 Task4: Commonsense Validation and Explanation development and testing dataset.
-### BibTeX entry and citation info
-```bibtex
-@article{fadel2020justers,
-  title={JUSTers at SemEval-2020 Task 4: Evaluating Transformer Models Against Commonsense Validation and Explanation},
-  author={Fadel, Ali and Al-Ayyoub, Mahmoud and Cambria, Erik},
-  year={2020}
-}
-```
-<a href="https://huggingface.co/exbert/?model=aliosm/ComVE-gpt2-large">
-	<img width="300px" src="https://cdn-media.huggingface.co/exbert/button.png">
-</a>
--- a/model_cards/aliosm/ComVE-gpt2-medium/README.md
+++ b/model_cards/aliosm/ComVE-gpt2-medium/README.md
---
-language: "en"
-tags:
- gpt2
- exbert
- commonsense
- semeval2020
- comve
-license: "mit"
-datasets:
- ComVE
-metrics:
- bleu
-widget:
- text: "Chicken can swim in water. <|continue|>"
---
-# ComVE-gpt2-medium
-## Model description
-Finetuned model on Commonsense Validation and Explanation (ComVE) dataset introduced in [SemEval2020 Task4](https://competitions.codalab.org/competitions/21080) using a causal language modeling (CLM) objective.
-The model is able to generate a reason why a given natural language statement is against commonsense.
-## Intended uses & limitations
-You can use the raw model for text generation to generate reasons why natural language statements are against commonsense.
-#### How to use
-You can use this model directly to generate reasons why the given statement is against commonsense using [`generate.sh`](https://github.com/AliOsm/SemEval2020-Task4-ComVE/tree/master/TaskC-Generation) script.
-*Note:* make sure that you are using version `2.4.1` of `transformers` package. Newer versions has some issue in text generation and the model repeats the last token generated again and again.
-#### Limitations and bias
-The model biased to negate the entered sentence usually instead of producing a factual reason.
-## Training data
-The model is initialized from the [gpt2-medium](https://github.com/huggingface/transformers/blob/master/model_cards/gpt2-README.md) model and finetuned using [ComVE](https://github.com/wangcunxiang/SemEval2020-Task4-Commonsense-Validation-and-Explanation) dataset which contains 10K against commonsense sentences, each of them is paired with three reference reasons.
-## Training procedure
-Each natural language statement that against commonsense is concatenated with its reference reason with `<|continue|>` as a separator, then the model finetuned using CLM objective.
-The model trained on Nvidia Tesla P100 GPU from Google Colab platform with 5e-5 learning rate, 5 epochs, 128 maximum sequence length and 64 batch size.
-<center>
-  <img src="https://i.imgur.com/xKbrwBC.png">
-</center>
-## Eval results
-The model achieved fifth place with 16.7153/16.1187 BLEU scores and third place with 1.94 Human Evaluation score on SemEval2020 Task4: Commonsense Validation and Explanation development and testing dataset.
-These are some examples generated by the model:
-|             Against Commonsense Statement             |               Generated Reason               |
-|:-----------------------------------------------------:|:--------------------------------------------:|
-| Chicken can swim in water.                            | Chicken can't swim.                          |
-| shoes can fly                                         | Shoes are not able to fly.                   |
-| Chocolate can be used to make a coffee pot            | Chocolate is not used to make coffee pots.   |
-| you can also buy tickets online with an identity card | You can't buy tickets with an identity card. |
-| a ball is square and can roll                         | A ball is round and cannot roll.             |
-| You can use detergent to dye your hair.               | Detergent is used to wash clothes.           |
-| you can eat mercury                                   | mercury is poisonous                         |
-| A gardener can follow a suspect                       | gardener is not a police officer             |
-| cars can float in the ocean just like a boat          | Cars are too heavy to float in the ocean.    |
-| I am going to work so I can lose money.               | Working is not a way to lose money.          |
-### BibTeX entry and citation info
-```bibtex
-@article{fadel2020justers,
-  title={JUSTers at SemEval-2020 Task 4: Evaluating Transformer Models Against Commonsense Validation and Explanation},
-  author={Fadel, Ali and Al-Ayyoub, Mahmoud and Cambria, Erik},
-  year={2020}
-}
-```
-<a href="https://huggingface.co/exbert/?model=aliosm/ComVE-gpt2-medium">
-	<img width="300px" src="https://cdn-media.huggingface.co/exbert/button.png">
-</a>
--- a/model_cards/aliosm/ComVE-gpt2/README.md
+++ b/model_cards/aliosm/ComVE-gpt2/README.md
---
-language: "en"
-tags:
- exbert
- commonsense
- semeval2020
- comve
-license: "mit"
-datasets:
- ComVE
-metrics:
- bleu
-widget:
- text: "Chicken can swim in water. <|continue|>"
---
-# ComVE-gpt2
-## Model description
-Finetuned model on Commonsense Validation and Explanation (ComVE) dataset introduced in [SemEval2020 Task4](https://competitions.codalab.org/competitions/21080) using a causal language modeling (CLM) objective.
-The model is able to generate a reason why a given natural language statement is against commonsense.
-## Intended uses & limitations
-You can use the raw model for text generation to generate reasons why natural language statements are against commonsense.
-#### How to use
-You can use this model directly to generate reasons why the given statement is against commonsense using [`generate.sh`](https://github.com/AliOsm/SemEval2020-Task4-ComVE/tree/master/TaskC-Generation) script.
-*Note:* make sure that you are using version `2.4.1` of `transformers` package. Newer versions has some issue in text generation and the model repeats the last token generated again and again.
-#### Limitations and bias
-The model biased to negate the entered sentence usually instead of producing a factual reason.
-## Training data
-The model is initialized from the [gpt2](https://github.com/huggingface/transformers/blob/master/model_cards/gpt2-README.md) model and finetuned using [ComVE](https://github.com/wangcunxiang/SemEval2020-Task4-Commonsense-Validation-and-Explanation) dataset which contains 10K against commonsense sentences, each of them is paired with three reference reasons.
-## Training procedure
-Each natural language statement that against commonsense is concatenated with its reference reason with `<|continue|>` as a separator, then the model finetuned using CLM objective.
-The model trained on Nvidia Tesla P100 GPU from Google Colab platform with 5e-5 learning rate, 5 epochs, 128 maximum sequence length and 64 batch size.
-<center>
-  <img src="https://i.imgur.com/xKbrwBC.png">
-</center>
-## Eval results
-The model achieved 14.0547/13.6534 BLEU scores on SemEval2020 Task4: Commonsense Validation and Explanation development and testing dataset.
-### BibTeX entry and citation info
-```bibtex
-@article{fadel2020justers,
-  title={JUSTers at SemEval-2020 Task 4: Evaluating Transformer Models Against Commonsense Validation and Explanation},
-  author={Fadel, Ali and Al-Ayyoub, Mahmoud and Cambria, Erik},
-  year={2020}
-}
-```
-<a href="https://huggingface.co/exbert/?model=aliosm/ComVE-gpt2">
-	<img width="300px" src="https://cdn-media.huggingface.co/exbert/button.png">
-</a>
--- a/model_cards/aliosm/ai-soco-cpp-roberta-small-clas/README.md
+++ b/model_cards/aliosm/ai-soco-cpp-roberta-small-clas/README.md
---
-language: "c++"
-tags:
- exbert
- authorship-identification
- fire2020
- pan2020
- ai-soco
- classification
-license: "mit"
-datasets:
- ai-soco
-metrics:
- accuracy
---
-# ai-soco-c++-roberta-small-clas
-## Model description
-The `ai-soco-c++-roberta-small` model fine-tuned on the [AI-SOCO](https://sites.google.com/view/ai-soco-2020) task.
-#### How to use
-You can use the model directly after tokenizing the text with the tokenizer provided with the model files.
-#### Limitations and bias
-The model is limited to the C++ programming language.
-## Training data
-The model was initialized from the [`ai-soco-c++-roberta-small`](https://github.com/huggingface/transformers/blob/master/model_cards/aliosm/ai-soco-c++-roberta-small) model and trained on the [AI-SOCO](https://sites.google.com/view/ai-soco-2020) dataset for text classification.
-## Training procedure
-The model was trained on the Google Colab platform using a V100 GPU for 10 epochs, with a batch size of 32 and a maximum sequence length of 512 (sequences longer than 512 were truncated). Each run of 4 consecutive spaces was converted to a single tab character (`\t`) before tokenization.
-## Eval results
-The model achieved 93.19%/92.88% accuracy on the AI-SOCO task and ranked 4th.
-### BibTeX entry and citation info
-```bibtex
-@inproceedings{ai-soco-2020-fire,
-    title = "Overview of the {PAN@FIRE} 2020 Task on {Authorship Identification of SOurce COde (AI-SOCO)}",
-    author = "Fadel, Ali and Musleh, Husam and Tuffaha, Ibraheem and Al-Ayyoub, Mahmoud and Jararweh, Yaser and Benkhelifa, Elhadj and Rosso, Paolo",
-    booktitle = "Proceedings of The 12th meeting of the Forum for Information Retrieval Evaluation (FIRE 2020)",
-    year = "2020"
-}
-```
-<a href="https://huggingface.co/exbert/?model=aliosm/ai-soco-c++-roberta-small-clas">
-	<img width="300px" src="https://cdn-media.huggingface.co/exbert/button.png">
-</a>
--- a/model_cards/aliosm/ai-soco-cpp-roberta-small/README.md
+++ b/model_cards/aliosm/ai-soco-cpp-roberta-small/README.md
---
-language: "c++"
-tags:
- exbert
- authorship-identification
- fire2020
- pan2020
- ai-soco
-license: "mit"
-datasets:
- ai-soco
-metrics:
- perplexity
---
-# ai-soco-c++-roberta-small
-## Model description
-A RoBERTa model with 6 layers and 12 attention heads, pre-trained from scratch on the [AI-SOCO](https://sites.google.com/view/ai-soco-2020) dataset, which consists of C++ source code crawled from the CodeForces website.
-## Intended uses & limitations
-The model can be used for code classification, authorship identification, and other downstream tasks on the C++ programming language.
-#### How to use
-You can use the model directly after tokenizing the text with the tokenizer provided with the model files.
-#### Limitations and bias
-The model is limited to the C++ programming language.
-## Training data
-The model was initialized randomly and trained on the [AI-SOCO](https://sites.google.com/view/ai-soco-2020) dataset, which contains 100K C++ source code files.
-## Training procedure
-The model was trained on the Google Colab platform with 8 TPU cores for 200 epochs, with a batch size of 16\*8, a maximum sequence length of 512, and the MLM objective. Other parameters were left at the default values in the [`run_language_modeling.py`](https://github.com/huggingface/transformers/blob/master/examples/language-modeling/run_language_modeling.py) script. Each run of 4 consecutive spaces was converted to a single tab character (`\t`) before tokenization.
-### BibTeX entry and citation info
-```bibtex
-@inproceedings{ai-soco-2020-fire,
-    title = "Overview of the {PAN@FIRE} 2020 Task on {Authorship Identification of SOurce COde (AI-SOCO)}",
-    author = "Fadel, Ali and Musleh, Husam and Tuffaha, Ibraheem and Al-Ayyoub, Mahmoud and Jararweh, Yaser and Benkhelifa, Elhadj and Rosso, Paolo",
-    booktitle = "Proceedings of The 12th meeting of the Forum for Information Retrieval Evaluation (FIRE 2020)",
-    year = "2020"
-}
-```
-<a href="https://huggingface.co/exbert/?model=aliosm/ai-soco-c++-roberta-small">
-	<img width="300px" src="https://cdn-media.huggingface.co/exbert/button.png">
-</a>