Commit 3552d0e0 authored by Julien Chaumond, committed by GitHub

[model_cards] Migrate cards from this repo to model repos on huggingface.co (#9013)



* rm all model cards

* Update the .rst

@sgugger it is still not super crystal clear/streamlined so let me know if any ideas to make it simpler

* Add a rootlevel README.md with simple instructions/context

* Update docs/source/model_sharing.rst
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* make style

* rm all model cards
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
parent 29e45979
## Albert xxlarge version 1 language model fine-tuned on SQuAD2.0
### (updated 30Sept2020) with the following results:
```
exact: 86.11134506864315
f1: 89.35371214945009
total: 11873
HasAns_exact: 83.56950067476383
HasAns_f1: 90.06353312254078
HasAns_total: 5928
NoAns_exact: 88.64592094196804
NoAns_f1: 88.64592094196804
NoAns_total: 5945
best_exact: 86.11134506864315
best_exact_thresh: 0.0
best_f1: 89.35371214944985
best_f1_thresh: 0.0
```
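As a sanity check on the figures above, the overall `exact` score should be the example-weighted average of `HasAns_exact` and `NoAns_exact`. The sketch below recomputes it from the reported values; the variable names are illustrative and not part of the evaluation script.

```python
# Values copied from the results block above.
has_ans_exact, has_ans_total = 83.56950067476383, 5928
no_ans_exact, no_ans_total = 88.64592094196804, 5945
total = has_ans_total + no_ans_total

# Overall exact match is the example-weighted mean of the two subsets.
overall_exact = (has_ans_exact * has_ans_total + no_ans_exact * no_ans_total) / total
print(total, overall_exact)
```

The recomputed value should agree with the reported `exact` up to floating-point noise.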
### from script:
```
python ${EXAMPLES}/run_squad.py \
--model_type albert \
--model_name_or_path albert-xxlarge-v1 \
--do_train \
--do_eval \
--train_file ${SQUAD}/train-v2.0.json \
--predict_file ${SQUAD}/dev-v2.0.json \
--version_2_with_negative \
--do_lower_case \
--num_train_epochs 3 \
--max_steps 8144 \
--warmup_steps 814 \
--learning_rate 3e-5 \
--max_seq_length 512 \
--doc_stride 128 \
--per_gpu_train_batch_size 6 \
--gradient_accumulation_steps 8 \
--per_gpu_eval_batch_size 48 \
--fp16 \
--fp16_opt_level O1 \
--threads 12 \
--logging_steps 50 \
--save_steps 3000 \
--overwrite_output_dir \
--output_dir ${MODEL_PATH}
```
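The flags above imply an effective batch size and a warmup fraction that are not stated directly. A minimal sketch of that arithmetic, assuming a single-GPU run and that each optimizer step consumes `per_gpu_train_batch_size * gradient_accumulation_steps` examples per GPU (`n_gpu` is an assumption, not a flag in the command):

```python
# Flag values copied from the run_squad.py command above.
per_gpu_train_batch_size = 6
gradient_accumulation_steps = 8
max_steps = 8144
warmup_steps = 814
n_gpu = 1  # assumption: single-GPU run

# Examples consumed per optimizer update.
effective_batch_size = per_gpu_train_batch_size * gradient_accumulation_steps * n_gpu
# Share of the schedule spent on warmup.
warmup_fraction = warmup_steps / max_steps

print(effective_batch_size)       # 48
print(round(warmup_fraction, 3))  # 0.1
```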
### using the following software & system:
```
Transformers: 3.1.0
PyTorch: 1.6.0
TensorFlow: 2.3.1
Python: 3.8.1
OS: Linux-5.4.0-48-generic-x86_64-with-glibc2.10
CPU/GPU: Intel i9-9900K / NVIDIA Titan RTX 24GB
```
## ELECTRA_large_discriminator language model fine-tuned on SQuAD2.0
### with the following results:
```
"exact": 87.09677419354838,
"f1": 89.98343832723452,
"total": 11873,
"HasAns_exact": 84.66599190283401,
"HasAns_f1": 90.44759839056285,
"HasAns_total": 5928,
"NoAns_exact": 89.52060555088309,
"NoAns_f1": 89.52060555088309,
"NoAns_total": 5945,
"best_exact": 87.09677419354838,
"best_exact_thresh": 0.0,
"best_f1": 89.98343832723432,
"best_f1_thresh": 0.0
```
### from script:
```
python ${EXAMPLES}/run_squad.py \
--model_type electra \
--model_name_or_path google/electra-large-discriminator \
--do_train \
--do_eval \
--train_file ${SQUAD}/train-v2.0.json \
--predict_file ${SQUAD}/dev-v2.0.json \
--version_2_with_negative \
--do_lower_case \
--num_train_epochs 3 \
--warmup_steps 306 \
--weight_decay 0.01 \
--learning_rate 3e-5 \
--max_grad_norm 0.5 \
--adam_epsilon 1e-6 \
--max_seq_length 512 \
--doc_stride 128 \
--per_gpu_train_batch_size 8 \
--gradient_accumulation_steps 16 \
--per_gpu_eval_batch_size 128 \
--fp16 \
--fp16_opt_level O1 \
--threads 12 \
--logging_steps 50 \
--save_steps 1000 \
--overwrite_output_dir \
--output_dir ${MODEL_PATH}
```
### using the following system & software:
```
Transformers: 2.11.0
PyTorch: 1.5.0
TensorFlow: 2.2.0
Python: 3.8.1
OS/Platform: Linux-5.3.0-59-generic-x86_64-with-glibc2.10
CPU/GPU: Intel i9-9900K / NVIDIA Titan RTX 24GB
```
## RoBERTa-large language model fine-tuned on SQuAD2.0
### with the following results:
```
"exact": 84.46896319380106,
"f1": 87.85388093408943,
"total": 11873,
"HasAns_exact": 81.37651821862349,
"HasAns_f1": 88.1560607844881,
"HasAns_total": 5928,
"NoAns_exact": 87.55256518082422,
"NoAns_f1": 87.55256518082422,
"NoAns_total": 5945,
"best_exact": 84.46896319380106,
"best_exact_thresh": 0.0,
"best_f1": 87.85388093408929,
"best_f1_thresh": 0.0
```
### from script:
```
python ${EXAMPLES}/run_squad.py \
--model_type roberta \
--model_name_or_path roberta-large \
--do_train \
--do_eval \
--train_file ${SQUAD}/train-v2.0.json \
--predict_file ${SQUAD}/dev-v2.0.json \
--version_2_with_negative \
--do_lower_case \
--num_train_epochs 3 \
--warmup_steps 1642 \
--weight_decay 0.01 \
--learning_rate 3e-5 \
--adam_epsilon 1e-6 \
--max_seq_length 512 \
--doc_stride 128 \
--per_gpu_train_batch_size 8 \
--gradient_accumulation_steps 6 \
--per_gpu_eval_batch_size 48 \
--threads 12 \
--logging_steps 50 \
--save_steps 2000 \
--overwrite_output_dir \
--output_dir ${MODEL_PATH} \
"$@"
```
### using the following system & software:
```
Transformers: 2.7.0
PyTorch: 1.4.0
TensorFlow: 2.1.0
Python: 3.7.7
OS/Platform: Linux-5.3.0-46-generic-x86_64-with-debian-buster-sid
CPU/GPU: Intel i9-9900K / NVIDIA Titan RTX 24GB
```
## XLNet large language model fine-tuned on SQuAD2.0
### with the following results:
```
"exact": 82.07698138633876,
"f1": 85.898874470488,
"total": 11873,
"HasAns_exact": 79.60526315789474,
"HasAns_f1": 87.26000954590184,
"HasAns_total": 5928,
"NoAns_exact": 84.54163162321278,
"NoAns_f1": 84.54163162321278,
"NoAns_total": 5945,
"best_exact": 83.22243746315169,
"best_exact_thresh": -11.112004280090332,
"best_f1": 86.88541353813282,
"best_f1_thresh": -11.112004280090332
```
### from script:
```
python -m torch.distributed.launch --nproc_per_node=2 ${RUN_SQUAD_DIR}/run_squad.py \
--model_type xlnet \
--model_name_or_path xlnet-large-cased \
--do_train \
--train_file ${SQUAD_DIR}/train-v2.0.json \
--predict_file ${SQUAD_DIR}/dev-v2.0.json \
--version_2_with_negative \
--num_train_epochs 3 \
--learning_rate 3e-5 \
--adam_epsilon 1e-6 \
--max_seq_length 512 \
--doc_stride 128 \
--save_steps 2000 \
--per_gpu_train_batch_size 1 \
--gradient_accumulation_steps 24 \
--output_dir ${MODEL_PATH}
CUDA_VISIBLE_DEVICES=0 python ${RUN_SQUAD_DIR}/run_squad_II.py \
--model_type xlnet \
--model_name_or_path ${MODEL_PATH} \
--do_eval \
--train_file ${SQUAD_DIR}/train-v2.0.json \
--predict_file ${SQUAD_DIR}/dev-v2.0.json \
--version_2_with_negative \
--max_seq_length 512 \
--per_gpu_eval_batch_size 48 \
--output_dir ${MODEL_PATH}
```
### using the following system & software:
```
OS/Platform: Linux-4.15.0-76-generic-x86_64-with-debian-buster-sid
GPU/CPU: 2 x NVIDIA 1080Ti / Intel i7-8700
Transformers: 2.1.1
PyTorch: 1.4.0
TensorFlow: 2.1.0
Python: 3.7.6
```
### Utilize this xlnet_large_squad2_512 fine-tuned model with:
```python
from transformers import AutoModelForQuestionAnswering, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("ahotrod/xlnet_large_squad2_512")
model = AutoModelForQuestionAnswering.from_pretrained("ahotrod/xlnet_large_squad2_512")
```
---
language: en
license: mit
datasets:
- AI4Bharat IndicNLP Corpora
---
# IndicBERT
IndicBERT is a multilingual ALBERT model pretrained exclusively on 12 major Indian languages. It is pre-trained on our novel monolingual corpus of around 9 billion tokens and subsequently evaluated on a set of diverse tasks. IndicBERT has far fewer parameters than other multilingual models (mBERT, XLM-R, etc.) while achieving performance on par with or better than these models.
The 12 languages covered by IndicBERT are: Assamese, Bengali, English, Gujarati, Hindi, Kannada, Malayalam, Marathi, Oriya, Punjabi, Tamil, Telugu.
The code can be found [here](https://github.com/divkakwani/indic-bert). For more information, check out our [project page](https://indicnlp.ai4bharat.org/) or our [paper](https://indicnlp.ai4bharat.org/papers/arxiv2020_indicnlp_corpus.pdf).
## Pretraining Corpus
We pre-trained indic-bert on AI4Bharat's monolingual corpus. The corpus has the following distribution of languages:
| Language | as | bn | en | gu | hi | kn | |
| ----------------- | ------ | ------ | ------ | ------ | ------ | ------ | ------- |
| **No. of Tokens** | 36.9M | 815M | 1.34B | 724M | 1.84B | 712M | |
| **Language** | **ml** | **mr** | **or** | **pa** | **ta** | **te** | **all** |
| **No. of Tokens** | 767M | 560M | 104M | 814M | 549M | 671M | 8.9B |
## Evaluation Results
IndicBERT is evaluated on IndicGLUE and some additional tasks. The results are summarized below. For more details about the tasks, refer to our [official repo](https://github.com/divkakwani/indic-bert).
#### IndicGLUE
Task | mBERT | XLM-R | IndicBERT
-----| ----- | ----- | ------
News Article Headline Prediction | 89.58 | 95.52 | **95.87**
Wikipedia Section Title Prediction| **73.66** | 66.33 | 73.31
Cloze-style multiple-choice QA | 39.16 | 27.98 | **41.87**
Article Genre Classification | 90.63 | 97.03 | **97.34**
Named Entity Recognition (F1-score) | **73.24** | 65.93 | 64.47
Cross-Lingual Sentence Retrieval Task | 21.46 | 13.74 | **27.12**
Average | 64.62 | 61.09 | **66.66**
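
The `Average` row of the IndicGLUE table can be reproduced as the unweighted mean of the six task scores. A short check, with the scores copied from the rows above (the dictionary layout is only for illustration):

```python
# Per-task scores copied from the IndicGLUE table above, in row order.
scores = {
    "mBERT": [89.58, 73.66, 39.16, 90.63, 73.24, 21.46],
    "XLM-R": [95.52, 66.33, 27.98, 97.03, 65.93, 13.74],
    "IndicBERT": [95.87, 73.31, 41.87, 97.34, 64.47, 27.12],
}

# Unweighted mean over the six tasks, rounded to two decimals as in the table.
averages = {model: round(sum(vals) / len(vals), 2) for model, vals in scores.items()}
print(averages)
```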
#### Additional Tasks
Task | Task Type | mBERT | XLM-R | IndicBERT
-----| ----- | ----- | ------ | -----
BBC News Classification | Genre Classification | 60.55 | **75.52** | 74.60
IIT Product Reviews | Sentiment Analysis | 74.57 | **78.97** | 71.32
IITP Movie Reviews | Sentiment Analysis | 56.77 | **61.61** | 59.03
Soham News Article | Genre Classification | 80.23 | **87.6** | 78.45
Midas Discourse | Discourse Analysis | 71.20 | **79.94** | 78.44
iNLTK Headlines Classification | Genre Classification | 87.95 | 93.38 | **94.52**
ACTSA Sentiment Analysis | Sentiment Analysis | 48.53 | 59.33 | **61.18**
Winograd NLI | Natural Language Inference | 56.34 | 55.87 | **56.34**
Choice of Plausible Alternative (COPA) | Natural Language Inference | 54.92 | 51.13 | **58.33**
Amrita Exact Paraphrase | Paraphrase Detection | **93.81** | 93.02 | 93.75
Amrita Rough Paraphrase | Paraphrase Detection | 83.38 | 82.20 | **84.33**
Average | | 69.84 | **74.42** | 73.66
\* Note: all models have been restricted to a max_seq_length of 128.
## Downloads
The model can be downloaded [here](https://storage.googleapis.com/ai4bharat-public-indic-nlp-corpora/models/indic-bert-v1.tar.gz). Both TensorFlow checkpoints and PyTorch binaries are included in the archive. Alternatively, you can download it from [Huggingface](https://huggingface.co/ai4bharat/indic-bert).
## Citing
If you are using any of the resources, please cite the following article:
```
@inproceedings{kakwani2020indicnlpsuite,
title={{IndicNLPSuite: Monolingual Corpora, Evaluation Benchmarks and Pre-trained Multilingual Language Models for Indian Languages}},
author={Divyanshu Kakwani and Anoop Kunchukuttan and Satish Golla and Gokul N.C. and Avik Bhattacharyya and Mitesh M. Khapra and Pratyush Kumar},
year={2020},
booktitle={Findings of EMNLP},
}
```
We would like to hear from you if:
- You are using our resources. Please let us know how you are putting these resources to use.
- You have any feedback on these resources.
## License
The IndicBERT code (and models) are released under the MIT License.
## Contributors
- Divyanshu Kakwani
- Anoop Kunchukuttan
- Gokul NC
- Satish Golla
- Avik Bhattacharyya
- Mitesh Khapra
- Pratyush Kumar
This work is the outcome of a volunteer effort as part of the [AI4Bharat initiative](https://ai4bharat.org).
## Contact
- Anoop Kunchukuttan ([anoop.kunchukuttan@gmail.com](mailto:anoop.kunchukuttan@gmail.com))
- Mitesh Khapra ([miteshk@cse.iitm.ac.in](mailto:miteshk@cse.iitm.ac.in))
- Pratyush Kumar ([pratyush@cse.iitm.ac.in](mailto:pratyush@cse.iitm.ac.in))
---
language: "ar"
tags:
- text-generation
license: ""
datasets:
- Arabic poetry from several eras
---
# GPT2-Small-Arabic-Poetry
## Model description
A gpt2-small-arabic model fine-tuned on an Arabic poetry dataset.
## Intended uses & limitations
#### How to use
An example is provided in this [colab notebook](https://colab.research.google.com/drive/1mRl7c-5v-Klx27EEAEOAbrfkustL4g7a?usp=sharing).
#### Limitations and bias
Both the GPT2-small-arabic (trained on Arabic Wikipedia) and this model have several limitations in terms of coverage and training performance.
Use them as demonstrations or proofs of concept, not as production code.
## Training data
This model was fine-tuned on the [Arabic Poetry dataset](https://www.kaggle.com/ahmedabelal/arabic-poetry), which covers 9 different eras with a total of around 40k poems.
Fine-tuning started from the [gpt2-small-arabic](https://huggingface.co/akhooli/gpt2-small-arabic) transformer model.
## Training procedure
Training was done using the [Simple Transformers](https://github.com/ThilinaRajapakse/simpletransformers) library on Kaggle, using a free GPU.
## Eval results
Final perplexity reached was 76.3, loss: 4.33
### BibTeX entry and citation info
```bibtex
@inproceedings{Abed Khooli,
year={2020}
}
```
---
language: "ar"
datasets:
- Arabic Wikipedia
metrics:
- none
---
# GPT2-Small-Arabic
## Model description
GPT2 model trained on the Arabic Wikipedia dataset, based on gpt2-small (using Fastai2).
## Intended uses & limitations
#### How to use
An example is provided in this [colab notebook](https://colab.research.google.com/drive/1mRl7c-5v-Klx27EEAEOAbrfkustL4g7a?usp=sharing).
Both text and poetry (fine-tuned model) generation are included.
#### Limitations and bias
GPT2-small-arabic (trained on Arabic Wikipedia) has several limitations in terms of coverage (Arabic Wikipedia quality, no diacritics) and training performance.
Use it as a demonstration or proof of concept, not as production code.
## Training data
This pretrained model used the Arabic Wikipedia dump (around 900 MB).
## Training procedure
Training was done using the [Fastai2](https://github.com/fastai/fastai2/) library on Kaggle, using a free GPU.
## Eval results
Final perplexity reached was 72.19, loss: 4.28, accuracy: 0.307
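The reported perplexity and loss are consistent with each other if the loss is a mean cross-entropy in nats, in which case perplexity is `exp(loss)`. That interpretation is an assumption, since the card does not state how the loss was computed. A quick check:

```python
import math

# Values reported in the eval results above.
reported_loss = 4.28
reported_perplexity = 72.19

# Assumption: loss is mean cross-entropy in nats, so perplexity = exp(loss).
perplexity_from_loss = math.exp(reported_loss)
loss_from_perplexity = math.log(reported_perplexity)

print(round(perplexity_from_loss, 2))
print(round(loss_from_perplexity, 2))
```

The small gap between `exp(4.28)` and 72.19 is what one would expect from the loss being rounded to two decimals.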
### BibTeX entry and citation info
```bibtex
@inproceedings{Abed Khooli,
year={2020}
}
```
---
tags:
- translation
language:
- ar
- en
license: mit
---
### mbart-large-ar-en
This is mbart-large-cc25, finetuned on a subset of the OPUS corpus for ar_en.
Usage: see [example notebook](https://colab.research.google.com/drive/1I6RFOWMaTpPBX7saJYjnSTddW0TD6H1t?usp=sharing)
Note: model has limited training set, not fully trained (do not use for production).
Other models by me: [Abed Khooli](https://huggingface.co/akhooli)
---
tags:
- translation
language:
- en
- ar
license: mit
---
### mbart-large-en-ar
This is mbart-large-cc25, finetuned on a subset of the UN corpus for en_ar.
Usage: see [example notebook](https://colab.research.google.com/drive/1I6RFOWMaTpPBX7saJYjnSTddW0TD6H1t?usp=sharing)
Note: model has limited training set, not fully trained (do not use for production).
---
tags:
- conversational
language:
- ar
license: mit
---
## personachat-arabic (conversational AI)
This is personachat-arabic, using a subset from the persona-chat validation dataset, machine translated to Arabic (from English)
and fine-tuned from [akhooli/gpt2-small-arabic](https://huggingface.co/akhooli/gpt2-small-arabic) which is a limited text generation model.
Usage: see the last section of this [example notebook](https://colab.research.google.com/drive/1I6RFOWMaTpPBX7saJYjnSTddW0TD6H1t?usp=sharing)
Note: model has limited training set which was machine translated (do not use for production).
---
language:
- ar
- en
license: mit
---
### xlm-r-large-arabic-sent
Multilingual sentiment classification (Label_0: mixed, Label_1: negative, Label_2: positive) of Arabic reviews by fine-tuning XLM-Roberta-Large.
Zero-shot classification of other languages is possible (it also works on mixed-language text, e.g. Arabic & English). The mixed category is not accurate and may be confused with the other
classes (it was based on a rating of 3 out of 5 in reviews).
Usage: see last section in this [Colab notebook](https://lnkd.in/d3bCFyZ)
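The label indices listed above can be mapped back to class names after inference. A minimal sketch, assuming the classifier returns one score per class in label-index order; `ID2LABEL` and `decode_prediction` are illustrative names, not part of the model repository:

```python
# Label mapping stated in the card: Label_0 mixed, Label_1 negative, Label_2 positive.
ID2LABEL = {0: "mixed", 1: "negative", 2: "positive"}


def decode_prediction(scores):
    """Return the class name with the highest score (scores are in label-index order)."""
    best_index = max(range(len(scores)), key=lambda i: scores[i])
    return ID2LABEL[best_index]


print(decode_prediction([0.1, 0.7, 0.2]))  # negative
```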
---
language:
- ar
- en
license: mit
---
### xlm-r-large-arabic-toxic (toxic/hate speech classifier)
Toxic (hate speech) classification (Label_0: non-toxic, Label_1: toxic) of Arabic comments by fine-tuning XLM-Roberta-Large.
Zero-shot classification of other languages is possible (it also works on mixed-language text, e.g. Arabic & English).
Usage and further info: see last section in this [Colab notebook](https://lnkd.in/d3bCFyZ)
---
tags:
- exbert
license: apache-2.0
---
<a href="https://huggingface.co/exbert/?model=albert-base-v1">
<img width="300px" src="https://cdn-media.huggingface.co/exbert/button.png">
</a>
---
tags:
- exbert
license: apache-2.0
---
<a href="https://huggingface.co/exbert/?model=albert-xxlarge-v2">
<img width="300px" src="https://cdn-media.huggingface.co/exbert/button.png">
</a>
---
language: "en"
tags:
- exbert
- commonsense
- semeval2020
- comve
license: "mit"
datasets:
- ComVE
metrics:
- bleu
widget:
- text: "Chicken can swim in water. <|continue|>"
---
# ComVE-distilgpt2
## Model description
Finetuned model on Commonsense Validation and Explanation (ComVE) dataset introduced in [SemEval2020 Task4](https://competitions.codalab.org/competitions/21080) using a causal language modeling (CLM) objective.
The model is able to generate a reason why a given natural language statement is against commonsense.
## Intended uses & limitations
You can use the raw model for text generation to generate reasons why natural language statements are against commonsense.
#### How to use
You can use this model directly to generate reasons why the given statement is against commonsense using the [`generate.sh`](https://github.com/AliOsm/SemEval2020-Task4-ComVE/tree/master/TaskC-Generation) script.
*Note:* make sure that you are using version `2.4.1` of the `transformers` package. Newer versions have an issue in text generation that makes the model repeat the last generated token again and again.
#### Limitations and bias
The model is usually biased toward negating the entered sentence instead of producing a factual reason.
## Training data
The model is initialized from the [distilgpt2](https://github.com/huggingface/transformers/blob/master/model_cards/distilgpt2-README.md) model and fine-tuned using the [ComVE](https://github.com/wangcunxiang/SemEval2020-Task4-Commonsense-Validation-and-Explanation) dataset, which contains 10K against-commonsense sentences, each paired with three reference reasons.
## Training procedure
Each natural language statement that is against commonsense is concatenated with its reference reason, with `<|continue|>` as a separator, then the model is fine-tuned using the CLM objective.
The model was trained on an Nvidia Tesla P100 GPU from the Google Colab platform with a 5e-5 learning rate, 15 epochs, a maximum sequence length of 128 and a batch size of 64.
<center>
<img src="https://i.imgur.com/xKbrwBC.png">
</center>
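The concatenation described in the training procedure can be sketched as follows. The helper names and the exact whitespace around the separator are assumptions (the spacing follows the widget example `Chicken can swim in water. <|continue|>`), and the reason string is only an illustrative input; the author's preprocessing lives in the linked repository and may differ.

```python
SEPARATOR = "<|continue|>"


def build_training_text(statement: str, reason: str) -> str:
    """Join an against-commonsense statement and a reference reason for CLM fine-tuning."""
    return f"{statement} {SEPARATOR} {reason}"


def build_prompt(statement: str) -> str:
    """Prompt at generation time: the statement followed by the separator."""
    return f"{statement} {SEPARATOR}"


print(build_training_text("Chicken can swim in water.", "Chicken can't swim."))
print(build_prompt("Chicken can swim in water."))
```

At generation time the model is expected to continue the prompt with a reason, which is why the prompt stops at the separator.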
## Eval results
The model achieved BLEU scores of 13.7582/13.8026 on the SemEval2020 Task4: Commonsense Validation and Explanation development and testing datasets.
### BibTeX entry and citation info
```bibtex
@article{fadel2020justers,
title={JUSTers at SemEval-2020 Task 4: Evaluating Transformer Models Against Commonsense Validation and Explanation},
author={Fadel, Ali and Al-Ayyoub, Mahmoud and Cambria, Erik},
year={2020}
}
```
<a href="https://huggingface.co/exbert/?model=aliosm/ComVE-distilgpt2">
<img width="300px" src="https://cdn-media.huggingface.co/exbert/button.png">
</a>
---
language: "en"
tags:
- gpt2
- exbert
- commonsense
- semeval2020
- comve
license: "mit"
datasets:
- https://github.com/wangcunxiang/SemEval2020-Task4-Commonsense-Validation-and-Explanation
metrics:
- bleu
widget:
- text: "Chicken can swim in water. <|continue|>"
---
# ComVE-gpt2-large
## Model description
Finetuned model on Commonsense Validation and Explanation (ComVE) dataset introduced in [SemEval2020 Task4](https://competitions.codalab.org/competitions/21080) using a causal language modeling (CLM) objective.
The model is able to generate a reason why a given natural language statement is against commonsense.
## Intended uses & limitations
You can use the raw model for text generation to generate reasons why natural language statements are against commonsense.
#### How to use
You can use this model directly to generate reasons why the given statement is against commonsense using the [`generate.sh`](https://github.com/AliOsm/SemEval2020-Task4-ComVE/tree/master/TaskC-Generation) script.
*Note:* make sure that you are using version `2.4.1` of the `transformers` package. Newer versions have an issue in text generation that makes the model repeat the last generated token again and again.
#### Limitations and bias
The model is usually biased toward negating the entered sentence instead of producing a factual reason.
## Training data
The model is initialized from the [gpt2-large](https://github.com/huggingface/transformers/blob/master/model_cards/gpt2-README.md) model and fine-tuned using the [ComVE](https://github.com/wangcunxiang/SemEval2020-Task4-Commonsense-Validation-and-Explanation) dataset, which contains 10K against-commonsense sentences, each paired with three reference reasons.
## Training procedure
Each natural language statement that is against commonsense is concatenated with its reference reason, with `<|continue|>` as a separator, then the model is fine-tuned using the CLM objective.
The model was trained on an Nvidia Tesla P100 GPU from the Google Colab platform with a 5e-5 learning rate, 5 epochs, a maximum sequence length of 128 and a batch size of 64.
<center>
<img src="https://i.imgur.com/xKbrwBC.png">
</center>
## Eval results
The model achieved BLEU scores of 16.5110/15.9299 on the SemEval2020 Task4: Commonsense Validation and Explanation development and testing datasets.
### BibTeX entry and citation info
```bibtex
@article{fadel2020justers,
title={JUSTers at SemEval-2020 Task 4: Evaluating Transformer Models Against Commonsense Validation and Explanation},
author={Fadel, Ali and Al-Ayyoub, Mahmoud and Cambria, Erik},
year={2020}
}
```
<a href="https://huggingface.co/exbert/?model=aliosm/ComVE-gpt2-large">
<img width="300px" src="https://cdn-media.huggingface.co/exbert/button.png">
</a>
---
language: "en"
tags:
- gpt2
- exbert
- commonsense
- semeval2020
- comve
license: "mit"
datasets:
- ComVE
metrics:
- bleu
widget:
- text: "Chicken can swim in water. <|continue|>"
---
# ComVE-gpt2-medium
## Model description
Finetuned model on Commonsense Validation and Explanation (ComVE) dataset introduced in [SemEval2020 Task4](https://competitions.codalab.org/competitions/21080) using a causal language modeling (CLM) objective.
The model is able to generate a reason why a given natural language statement is against commonsense.
## Intended uses & limitations
You can use the raw model for text generation to generate reasons why natural language statements are against commonsense.
#### How to use
You can use this model directly to generate reasons why the given statement is against commonsense using the [`generate.sh`](https://github.com/AliOsm/SemEval2020-Task4-ComVE/tree/master/TaskC-Generation) script.
*Note:* make sure that you are using version `2.4.1` of the `transformers` package. Newer versions have an issue in text generation that makes the model repeat the last generated token again and again.
#### Limitations and bias
The model is usually biased toward negating the entered sentence instead of producing a factual reason.
## Training data
The model is initialized from the [gpt2-medium](https://github.com/huggingface/transformers/blob/master/model_cards/gpt2-README.md) model and fine-tuned using the [ComVE](https://github.com/wangcunxiang/SemEval2020-Task4-Commonsense-Validation-and-Explanation) dataset, which contains 10K against-commonsense sentences, each paired with three reference reasons.
## Training procedure
Each natural language statement that is against commonsense is concatenated with its reference reason, with `<|continue|>` as a separator, then the model is fine-tuned using the CLM objective.
The model was trained on an Nvidia Tesla P100 GPU from the Google Colab platform with a 5e-5 learning rate, 5 epochs, a maximum sequence length of 128 and a batch size of 64.
<center>
<img src="https://i.imgur.com/xKbrwBC.png">
</center>
## Eval results
The model achieved fifth place with BLEU scores of 16.7153/16.1187 and third place with a Human Evaluation score of 1.94 on the SemEval2020 Task4: Commonsense Validation and Explanation development and testing datasets.
These are some examples generated by the model:
| Against Commonsense Statement | Generated Reason |
|:-----------------------------------------------------:|:--------------------------------------------:|
| Chicken can swim in water. | Chicken can't swim. |
| shoes can fly | Shoes are not able to fly. |
| Chocolate can be used to make a coffee pot | Chocolate is not used to make coffee pots. |
| you can also buy tickets online with an identity card | You can't buy tickets with an identity card. |
| a ball is square and can roll | A ball is round and cannot roll. |
| You can use detergent to dye your hair. | Detergent is used to wash clothes. |
| you can eat mercury | mercury is poisonous |
| A gardener can follow a suspect | gardener is not a police officer |
| cars can float in the ocean just like a boat | Cars are too heavy to float in the ocean. |
| I am going to work so I can lose money. | Working is not a way to lose money. |
### BibTeX entry and citation info
```bibtex
@article{fadel2020justers,
title={JUSTers at SemEval-2020 Task 4: Evaluating Transformer Models Against Commonsense Validation and Explanation},
author={Fadel, Ali and Al-Ayyoub, Mahmoud and Cambria, Erik},
year={2020}
}
```
<a href="https://huggingface.co/exbert/?model=aliosm/ComVE-gpt2-medium">
<img width="300px" src="https://cdn-media.huggingface.co/exbert/button.png">
</a>
---
language: "en"
tags:
- exbert
- commonsense
- semeval2020
- comve
license: "mit"
datasets:
- ComVE
metrics:
- bleu
widget:
- text: "Chicken can swim in water. <|continue|>"
---
# ComVE-gpt2
## Model description
Finetuned model on Commonsense Validation and Explanation (ComVE) dataset introduced in [SemEval2020 Task4](https://competitions.codalab.org/competitions/21080) using a causal language modeling (CLM) objective.
The model is able to generate a reason why a given natural language statement is against commonsense.
## Intended uses & limitations
You can use the raw model for text generation to generate reasons why natural language statements are against commonsense.
#### How to use
You can use this model directly to generate reasons why the given statement is against commonsense using the [`generate.sh`](https://github.com/AliOsm/SemEval2020-Task4-ComVE/tree/master/TaskC-Generation) script.
*Note:* make sure that you are using version `2.4.1` of the `transformers` package. Newer versions have an issue in text generation that makes the model repeat the last generated token again and again.
#### Limitations and bias
The model is usually biased toward negating the entered sentence instead of producing a factual reason.
## Training data
The model is initialized from the [gpt2](https://github.com/huggingface/transformers/blob/master/model_cards/gpt2-README.md) model and fine-tuned using the [ComVE](https://github.com/wangcunxiang/SemEval2020-Task4-Commonsense-Validation-and-Explanation) dataset, which contains 10K against-commonsense sentences, each paired with three reference reasons.
## Training procedure
Each natural language statement that is against commonsense is concatenated with its reference reason, with `<|continue|>` as a separator, then the model is fine-tuned using the CLM objective.
The model was trained on an Nvidia Tesla P100 GPU from the Google Colab platform with a 5e-5 learning rate, 5 epochs, a maximum sequence length of 128 and a batch size of 64.
<center>
<img src="https://i.imgur.com/xKbrwBC.png">
</center>
## Eval results
The model achieved BLEU scores of 14.0547/13.6534 on the SemEval2020 Task4: Commonsense Validation and Explanation development and testing datasets.
### BibTeX entry and citation info
```bibtex
@article{fadel2020justers,
title={JUSTers at SemEval-2020 Task 4: Evaluating Transformer Models Against Commonsense Validation and Explanation},
author={Fadel, Ali and Al-Ayyoub, Mahmoud and Cambria, Erik},
year={2020}
}
```
<a href="https://huggingface.co/exbert/?model=aliosm/ComVE-gpt2">
<img width="300px" src="https://cdn-media.huggingface.co/exbert/button.png">
</a>
---
language: "c++"
tags:
- exbert
- authorship-identification
- fire2020
- pan2020
- ai-soco
- classification
license: "mit"
datasets:
- ai-soco
metrics:
- accuracy
---
# ai-soco-c++-roberta-small-clas
## Model description
`ai-soco-c++-roberta-small` model fine-tuned on the [AI-SOCO](https://sites.google.com/view/ai-soco-2020) task.
#### How to use
You can use the model directly after tokenizing the text using the provided tokenizer with the model files.
#### Limitations and bias
The model is limited to the C++ programming language.
## Training data
The model is initialized from the [`ai-soco-c++-roberta-small`](https://github.com/huggingface/transformers/blob/master/model_cards/aliosm/ai-soco-c++-roberta-small) model and trained on the [AI-SOCO](https://sites.google.com/view/ai-soco-2020) dataset for text classification.
## Training procedure
The model was trained on the Google Colab platform using a V100 GPU for 10 epochs, with a batch size of 32 and a max sequence length of 512 (longer sequences were truncated). Each run of 4 consecutive spaces was converted to a single tab character (`\t`) before tokenization.
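The whitespace normalization mentioned above can be sketched as a plain string replacement. This is one plausible reading, not the author's confirmed preprocessing: the function name is hypothetical, and whether the original step applied only to leading indentation or to every run of spaces is not stated.

```python
def spaces_to_tabs(source: str) -> str:
    """Replace each non-overlapping run of 4 consecutive spaces with one tab."""
    return source.replace(" " * 4, "\t")


snippet = "int main() {\n        return 0;\n}"
print(repr(spaces_to_tabs(snippet)))
```

Applying the same normalization at inference time keeps the input consistent with what the tokenizer saw during training.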
## Eval results
The model achieved 93.19%/92.88% accuracy on the AI-SOCO task and ranked 4th.
### BibTeX entry and citation info
```bibtex
@inproceedings{ai-soco-2020-fire,
title = "Overview of the {PAN@FIRE} 2020 Task on {Authorship Identification of SOurce COde (AI-SOCO)}",
author = "Fadel, Ali and Musleh, Husam and Tuffaha, Ibraheem and Al-Ayyoub, Mahmoud and Jararweh, Yaser and Benkhelifa, Elhadj and Rosso, Paolo",
booktitle = "Proceedings of The 12th meeting of the Forum for Information Retrieval Evaluation (FIRE 2020)",
year = "2020"
}
```
<a href="https://huggingface.co/exbert/?model=aliosm/ai-soco-c++-roberta-small-clas">
<img width="300px" src="https://cdn-media.huggingface.co/exbert/button.png">
</a>
---
language: "c++"
tags:
- exbert
- authorship-identification
- fire2020
- pan2020
- ai-soco
license: "mit"
datasets:
- ai-soco
metrics:
- perplexity
---
# ai-soco-c++-roberta-small
## Model description
RoBERTa model with 6 layers and 12 attention heads, pre-trained from scratch on the [AI-SOCO](https://sites.google.com/view/ai-soco-2020) dataset, which consists of C++ code crawled from the CodeForces website.
## Intended uses & limitations
The model can be used for code classification, authorship identification and other downstream tasks on the C++ programming language.
#### How to use
You can use the model directly after tokenizing the text using the provided tokenizer with the model files.
#### Limitations and bias
The model is limited to the C++ programming language.
## Training data
The model was initialized randomly and trained on the [AI-SOCO](https://sites.google.com/view/ai-soco-2020) dataset, which contains 100K C++ source codes.
## Training procedure
The model was trained on the Google Colab platform with 8 TPU cores for 200 epochs, with a batch size of 16\*8, a max sequence length of 512 and the MLM objective. Other parameters were left at the defaults from the [`run_language_modeling.py`](https://github.com/huggingface/transformers/blob/master/examples/language-modeling/run_language_modeling.py) script. Each run of 4 consecutive spaces was converted to a single tab character (`\t`) before tokenization.
### BibTeX entry and citation info
```bibtex
@inproceedings{ai-soco-2020-fire,
title = "Overview of the {PAN@FIRE} 2020 Task on {Authorship Identification of SOurce COde (AI-SOCO)}",
author = "Fadel, Ali and Musleh, Husam and Tuffaha, Ibraheem and Al-Ayyoub, Mahmoud and Jararweh, Yaser and Benkhelifa, Elhadj and Rosso, Paolo",
booktitle = "Proceedings of The 12th meeting of the Forum for Information Retrieval Evaluation (FIRE 2020)",
year = "2020"
}
```
<a href="https://huggingface.co/exbert/?model=aliosm/ai-soco-c++-roberta-small">
<img width="300px" src="https://cdn-media.huggingface.co/exbert/button.png">
</a>