Commit 4907defd (unverified), authored by Hailey Schoelkopf, committed by GitHub

Merge pull request #764 from EleutherAI/lintangsutawika-patch-2

Update README.md
parents d1a44c85 f0cc1507
@@ -5,15 +5,15 @@ Boxes should be checked iff tasks are implemented in the refactor and tested for
- [ ] Glue (Lintang)
- [x] SuperGlue
-- [ ] CoQA
-- [ ] DROP
+- [ ] CoQA (Lintang)
+- [ ] DROP (Lintang)
- [x] ~~Lambada~~
- [x] Lambada (Cloze variants)
- [x] ~~Lambada (Multilingual)~~
- [x] Wikitext
- [x] PiQA
- [x] PROST
-- [ ] MCTACO
+- [ ] MCTACO (Lintang)
- [x] Pubmed QA
- [x] SciQ
- [ ] QASPER
@@ -24,20 +24,20 @@ Boxes should be checked iff tasks are implemented in the refactor and tested for
- [x] HellaSwag
- [x] SWAG
- [x] OpenBookQA
-- [ ] SQuADv2
+- [ ] SQuADv2 (Lintang)
- [x] RACE
- [x] HeadQA
- [x] MathQA
- [x] WebQs
-- [ ] WSC273
+- [ ] WSC273 (Lintang)
- [x] Winogrande
- [x] ANLI
- [x] Hendrycks Ethics (missing some tasks/metrics, see PR 660: <https://github.com/EleutherAI/lm-evaluation-harness/pull/660> for more info)
-- [x] TruthfulQA (mc1)
-- [ ] TruthfulQA (mc2)
-- [ ] TruthfulQA (gen)
+- [x] TruthfulQA (mc1) (Lintang)
+- [ ] TruthfulQA (mc2) (Lintang)
+- [ ] TruthfulQA (gen) (Lintang)
- [ ] MuTual
-- [ ] Hendrycks Math
+- [ ] Hendrycks Math (Hailey)
- [ ] Asdiv
- [ ] GSM8k
- [x] Arithmetic
@@ -47,8 +47,8 @@ Boxes should be checked iff tasks are implemented in the refactor and tested for
- [x] ~~Pile (perplexity)~~
- [ ] BLiMP (Lintang)
- [x] ToxiGen
-- [ ] StoryCloze
-- [ ] NaturalQs
+- [ ] StoryCloze (Lintang)
+- [ ] NaturalQs (Hailey)
- [x] CrowS-Pairs
- [x] XCopa
- [ ] BIG-Bench (Hailey)
@@ -56,7 +56,7 @@ Boxes should be checked iff tasks are implemented in the refactor and tested for
- [x] XWinograd
- [ ] PAWS-X (Lintang)
- [ ] XNLI (Lintang)
-- [ ] MGSM
+- [ ] MGSM (Lintang)
- [ ] SCROLLS
- [x] Babi