Commit 0c55a384 authored by Manuel Romero, committed by GitHub

Add reference to NLP dataset (#5028)



* Add reference to NLP dataset

* Update README.md
Co-authored-by: Julien Chaumond <chaumond@gmail.com>
parent 0946d120
---
language: english
thumbnail:
datasets:
- squad_v2
---
# T5-base fine-tuned on SQuAD v2
@@ -16,13 +17,19 @@ Transfer learning, where a model is first pre-trained on a data-rich task before
## Details of the downstream task (Q&A) - Dataset 📚 🧐 ❓
[SQuAD v2](https://rajpurkar.github.io/SQuAD-explorer/) combines the 100,000 questions in SQuAD1.1 with over 50,000 unanswerable questions written adversarially by crowdworkers to look similar to answerable ones. To do well on SQuAD2.0, systems must not only answer questions when possible, but also determine when no answer is supported by the paragraph and abstain from answering.
Dataset ID: ```squad_v2``` from [HuggingFace/NLP](https://github.com/huggingface/nlp)
| Dataset | Split | # samples |
| -------- | ----- | --------- |
| SQuAD2.0 | train | 130k |
| SQuAD2.0 | eval | 12.3k |
| squad_v2 | train | 130319 |
| squad_v2 | valid | 11873 |
How to load it from [nlp](https://github.com/huggingface/nlp):
```python
import nlp  # Hugging Face's nlp library

train_dataset = nlp.load_dataset('squad_v2', split=nlp.Split.TRAIN)
valid_dataset = nlp.load_dataset('squad_v2', split=nlp.Split.VALIDATION)
```
Explore this dataset and others in the [NLP Viewer](https://huggingface.co/nlp/viewer/).
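Since SQuAD v2 mixes answerable and unanswerable questions, it can help to check how many of each a split contains. The sketch below is illustrative only: it assumes each record carries an `answers` field with parallel `text` and `answer_start` lists, and that unanswerable questions have empty lists. The helper names and the toy records are hypothetical and not part of this model card.

```python
from typing import Dict, Iterable


def is_answerable(example: dict) -> bool:
    """Return True when the record has at least one gold answer span.

    Assumption: unanswerable questions have an empty `answers["text"]` list.
    """
    return len(example["answers"]["text"]) > 0


def count_by_answerability(examples: Iterable[dict]) -> Dict[str, int]:
    """Count answerable and unanswerable records in an iterable of examples."""
    counts = {"answerable": 0, "unanswerable": 0}
    for example in examples:
        key = "answerable" if is_answerable(example) else "unanswerable"
        counts[key] += 1
    return counts


# Hand-written toy records (not real SQuAD data) in the assumed layout.
toy_examples = [
    {
        "question": "What is the capital of France?",
        "context": "Paris is the capital of France.",
        "answers": {"text": ["Paris"], "answer_start": [0]},
    },
    {
        "question": "Who founded Paris?",
        "context": "Paris is the capital of France.",
        "answers": {"text": [], "answer_start": []},
    },
]

print(count_by_answerability(toy_examples))
```

If the loaded split yields records in this layout, the same function can be applied to it directly, for example `count_by_answerability(valid_dataset)`.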
## Model fine-tuning 🏋️‍
......