gaoqiong / lm-evaluation-harness
Commit 605964ad (unverified)
Authored Jun 30, 2023 by Hailey Schoelkopf
Committed by GitHub, Jun 30, 2023
Update README.md
Parent: 0fe7e8eb
Showing 1 changed file with 4 additions and 2 deletions
lm_eval/tasks/README.md (+4, -2)
@@ -24,7 +24,7 @@ Boxes should be checked iff tasks are implemented in the refactor and tested for
 - [x] HellaSwag
 - [ ] SWAG (WIP)
 - [x] OpenBookQA
-- [ ] SQuADv2
+- [ ] SQuADv2 (WIP)
 - [ ] RACE (WIP)
 - [ ] HeadQA
 - [ ] MathQA
@@ -35,7 +35,7 @@ Boxes should be checked iff tasks are implemented in the refactor and tested for
 - [ ] Hendrycks Ethics
 - [ ] TruthfulQA
 - [ ] MuTual
-- [ ] Hendrycks Math
+- [ ] Hendrycks Math (WIP)
 - [ ] Asdiv
 - [ ] GSM8k
 - [ ] Arithmetic (WIP)
@@ -45,6 +45,8 @@ Boxes should be checked iff tasks are implemented in the refactor and tested for
 - [x] ~~Pile (perplexity)~~
 - [ ] BLiMP
 - [ ] ToxiGen
+- [ ] StoryCloze
+- [ ] NaturalQs
 - [ ] CrowS-Pairs
 - [ ] XCopa
 - [ ] BIG-Bench