Unverified commit 15821d20, authored by Lintang Sutawika, committed by GitHub

Merge pull request #745 from EleutherAI/haileyschoelkopf-patch-1

Update README.md
parents 2f53b190 7483a7ea
@@ -18,9 +18,9 @@ Boxes should be checked iff tasks are implemented in the refactor and tested for
 - [x] SciQ
 - [ ] QASPER
 - [x] QA4MRE
-- [ ] TriviaQA
+- [ ] TriviaQA (Lintang)
 - [x] AI2 ARC
-- [ ] LogiQA [(WIP)](https://github.com/EleutherAI/lm-evaluation-harness/pull/711)
+- [x] LogiQA
 - [x] HellaSwag
 - [x] SWAG
 - [x] OpenBookQA
@@ -28,7 +28,7 @@ Boxes should be checked iff tasks are implemented in the refactor and tested for
 - [x] RACE
 - [x] HeadQA
 - [x] MathQA
-- [ ] WebQs
+- [x] WebQs
 - [ ] WSC273
 - [x] Winogrande
 - [x] ANLI
@@ -50,15 +50,15 @@ Boxes should be checked iff tasks are implemented in the refactor and tested for
 - [ ] StoryCloze
 - [ ] NaturalQs
 - [x] CrowS-Pairs
-- [ ] XCopa
-- [ ] BIG-Bench
-- [ ] XStoryCloze
+- [ ] XCopa (Lintang)
+- [ ] BIG-Bench (Hailey)
+- [ ] XStoryCloze (Lintang)
 - [x] XWinograd
-- [ ] PAWS-X
-- [ ] XNLI
+- [ ] PAWS-X (Lintang)
+- [ ] XNLI (Lintang)
 - [ ] MGSM
 - [ ] SCROLLS
-- [ ] Babi
+- [x] Babi
 # Novel Tasks
 Tasks added in the revamped harness that were not previously available. Again, a strikethrough denotes checking performed *against the original task's implementation or published results introducing the task*.