Commit dbf03c05 authored by Lintang Sutawika, committed by GitHub

Update README.md

parent 375346e6
# ARC

### Paper

Title: Think you have Solved Question Answering? Try ARC, the AI2 Reasoning Challenge

Abstract: https://arxiv.org/abs/1803.05457

The ARC dataset consists of 7,787 science exam questions drawn from a variety
of sources, including science questions provided under license by a research
partner affiliated with AI2. These are text-only, English language exam questions
that span several grade levels as indicated in the files. Each question has a
multiple choice structure (typically 4 answer options). The questions are sorted
into a Challenge Set of 2,590 “hard” questions (those that both a retrieval and
a co-occurrence method fail to answer correctly) and an Easy Set of 5,197 questions.

Homepage: https://allenai.org/data/arc
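
For a quick look at the question format described above, the snippet below loads the Challenge Set and prints one example. It is a hedged sketch, not part of this task's implementation: it assumes the `ai2_arc` dataset card on the Hugging Face Hub, with `ARC-Easy` / `ARC-Challenge` configs and `question` / `choices` / `answerKey` fields.

```python
# Sketch: inspect one ARC example (assumes the `ai2_arc` dataset on the Hugging Face Hub).
from datasets import load_dataset

# "ARC-Challenge" holds the 2,590 "hard" questions; "ARC-Easy" holds the 5,197 easier ones.
arc_challenge = load_dataset("ai2_arc", "ARC-Challenge", split="validation")

example = arc_challenge[0]
print(example["question"])                    # the question text
for label, text in zip(example["choices"]["label"], example["choices"]["text"]):
    print(f"  ({label}) {text}")              # typically 4 answer options
print("gold answer:", example["answerKey"])   # e.g. "A", "B", "C", or "D"
```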
### Citation
```
@article{Clark2018ThinkYH,
  title={Think you have Solved Question Answering? Try ARC, the AI2 Reasoning Challenge},
  author={Peter Clark and Isaac Cowhey and Oren Etzioni and Tushar Khot and Ashish Sabharwal and Carissa Schoenick and Oyvind Tafjord},
  journal={ArXiv},
  year={2018},
  volume={abs/1803.05457}
}
```
### Subtasks

* `arc_easy`: multiple-choice question answering over the Easy Set of 5,197 science exam questions.
* `arc_challenge`: multiple-choice question answering over the Challenge Set of 2,590 “hard” questions that both a retrieval and a co-occurrence method fail to answer correctly.
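
If these subtasks are run through the harness's Python API rather than the CLI, a minimal call might look like the sketch below. It assumes a version of `lm-evaluation-harness` that exposes `lm_eval.simple_evaluate` and the Hugging Face (`hf`) model backend; the model name is only an example, and the exact entry point can differ between harness versions.

```python
# Hedged sketch: evaluate both ARC subtasks via lm-evaluation-harness's Python API.
# Assumes a harness version exposing `lm_eval.simple_evaluate` and the "hf" model type.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=EleutherAI/pythia-160m",  # example model, not prescribed by this task
    tasks=["arc_easy", "arc_challenge"],
    num_fewshot=0,
)
print(results["results"])  # per-task accuracy metrics
```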
### Checklist

For adding novel benchmarks/datasets to the library:

* [ ] Is the task an existing benchmark in the literature?
* [ ] Have you referenced the original paper that introduced the task?
* [ ] If yes, does the original paper provide a reference implementation? If so, have you checked against the reference implementation and documented how to run such a test?

If other tasks on this dataset are already supported:

* [ ] Is the "Main" variant of this task clearly denoted?
* [ ] Have you provided a short sentence in a README on what each new variant adds / evaluates?
* [ ] Have you noted which, if any, published evaluation setups are matched by this variant?