Commit 68758761 authored by lintangsutawika

edit logiqa

parent 223a63c4
# LogiQA
### Paper
Title: `LogiQA: A Challenge Dataset for Machine Reading Comprehension with Logical Reasoning`
Abstract: https://arxiv.org/abs/2007.08124
LogiQA is a dataset for testing human logical reasoning. It consists of 8,678 QA
instances, covering multiple types of deductive reasoning. Results show that
state-of-the-art neural models perform far worse than the human ceiling. The
dataset can also serve as a benchmark for re-investigating logical AI under the
deep learning NLP setting.
Homepage: https://github.com/lgw863/LogiQA-dataset
### Citation
```
@misc{liu2020logiqa,
    title={LogiQA: A Challenge Dataset for Machine Reading Comprehension with Logical Reasoning},
    author={Jian Liu and Leyang Cui and Hanmeng Liu and Dandan Huang and Yile Wang and Yue Zhang},
    year={2020},
    eprint={2007.08124},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
```
### Groups and Tasks
#### Groups
* Not part of a group yet
#### Tasks
* `logiqa`
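For reference, a minimal sketch of running this task through the harness's Python API. The `simple_evaluate` call and its keyword arguments may differ between harness versions, and the model checkpoint below is only a placeholder.

```python
# Minimal sketch (not a canonical invocation): evaluate `logiqa` via the
# harness's Python entry point. The checkpoint is a placeholder and the exact
# `simple_evaluate` signature may vary across harness versions.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",                                      # Hugging Face backend
    model_args="pretrained=EleutherAI/pythia-160m",  # placeholder model
    tasks=["logiqa"],
    num_fewshot=0,
    batch_size=8,
)
print(results["results"]["logiqa"])  # per-task metrics, e.g. accuracy
```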
### Checklist
For adding novel benchmarks/datasets to the library:
* [ ] Is the task an existing benchmark in the literature?
* [ ] Have you referenced the original paper that introduced the task?
* [ ] If yes, does the original paper provide a reference implementation? If so, have you checked against the reference implementation and documented how to run such a test?
If other tasks on this dataset are already supported:
* [ ] Is the "Main" variant of this task clearly denoted?
* [ ] Have you provided a short sentence in a README on what each new variant adds / evaluates?
* [ ] Have you noted which, if any, published evaluation setups are matched by this variant?
group:
- multiple_choice
task: logiqa
dataset_path: EleutherAI/logiqa
dataset_name: logiqa
......
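The YAML above is the task configuration for `logiqa`: `dataset_path` and `dataset_name` identify the Hugging Face dataset the harness loads. A minimal sketch of the equivalent `datasets` call follows; the split and field names inspected below are assumptions about the dataset layout, not something the config guarantees.

```python
# Minimal sketch: load the dataset referenced by the config above.
# "EleutherAI/logiqa" and the "logiqa" configuration name come straight from
# the YAML; the split and field names below are assumptions.
from datasets import load_dataset

logiqa = load_dataset("EleutherAI/logiqa", "logiqa")
print(logiqa)                 # available splits and their sizes
example = logiqa["test"][0]   # assumed split name
print(example.keys())         # assumed fields: context, question, options, label
```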
......
@@ -25,15 +25,19 @@ Homepage: https://github.com/csitfun/LogiQA2.0
doi={10.1109/TASLP.2023.3293046}}
```
### Groups and Tasks
#### Groups
* Not part of a group yet
#### Tasks
* `logiqa2_zh`: The original dataset in Chinese.
* `logiqa2_NLI`: The NLI version of the dataset converted from the MRC version.
* `logieval`: Prompt-based; https://github.com/csitfun/LogiEval
NOTE: These subtasks have not been verified yet.
### Checklist
......
group:
- greedy_until
task: logieval
dataset_path: baber/logiqa2
dataset_name: logieval
......
group:
- multiple_choice
task: logiqa2
dataset_path: baber/logiqa2
dataset_name: logiqa2
......
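The two configs above read from the same Hugging Face repository (`baber/logiqa2`) under different configuration names, and their group tags suggest different scoring modes: generation-based for `logieval` (`greedy_until`) versus loglikelihood-ranked answer options for `logiqa2` (`multiple_choice`). Below is a minimal sketch of evaluating both variants together; the model and keyword arguments are placeholders and version-dependent, and the other variants listed earlier (`logiqa2_zh`, `logiqa2_NLI`) would be selected by their task names in the same way.

```python
# Minimal sketch: evaluate the LogiQA 2.0 variants defined above in one run.
# Task names come from the YAML configs; the checkpoint is a placeholder and
# `limit` is only used here to keep the smoke test small.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=EleutherAI/pythia-160m",  # placeholder model
    tasks=["logiqa2", "logieval"],
    num_fewshot=0,
    limit=20,  # small subset for a quick check; drop for a full evaluation
)
for task_name, metrics in results["results"].items():
    print(task_name, metrics)
```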