Commit d19e90de authored by lintangsutawika's avatar lintangsutawika
# The Pile

### Paper
Title: The Pile: An 800GB Dataset of Diverse Text for Language Modeling

Abstract: https://arxiv.org/abs/2101.00027

The Pile is an 825 GiB diverse, open-source language modelling dataset that consists of 22 smaller, high-quality datasets combined together. To score well on Pile ...

Homepage: https://pile.eleuther.ai/
### Citation
```
@article{gao2020pile,
  title={The {P}ile: An 800GB Dataset of Diverse Text for Language Modeling},
  author={Gao, Leo and Biderman, Stella and Black, Sid and Golding, Laurence and Hoppe, Travis and Foster, Charles and Phang, Jason and He, Horace and Thite, Anish and Nabeshima, Noa and Presser, Shawn and Leahy, Connor},
  journal={arXiv preprint arXiv:2101.00027},
  year={2020}
}
```
### Groups and Tasks
#### Groups
* `pile`
#### Tasks
* `pile_arxiv`
* `pile_bookcorpus2`
* `pile_books3`
* `pile_dm-mathematics`
* `pile_enron`
* `pile_europarl`
* `pile_freelaw`
* `pile_github`
* `pile_gutenberg`
* `pile_hackernews`
* `pile_nih-exporter`
* `pile_opensubtitles`
* `pile_openwebtext2`
* `pile_philpapers`
* `pile_pile-cc`
* `pile_pubmed-abstracts`
* `pile_pubmed-central`
* `pile_stackexchange`
* `pile_ubuntu-irc`
* `pile_uspto`
* `pile_wikipedia`
* `pile_youtubesubtitles`
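Each of these tasks scores a model by its rolling log-likelihood over whole documents: a document longer than the model's context window is split into windows, and the per-token log-likelihoods of all windows are summed. The sketch below illustrates that windowing idea; `rolling_windows`, `doc_loglikelihood`, and the `score_window` callback are hypothetical names for illustration, not the harness's actual implementation.

```python
def rolling_windows(tokens, max_len):
    """Split a token sequence into contiguous windows of at most max_len.

    Every token lands in exactly one window, so summing per-window
    log-likelihoods gives the log-likelihood of the full document.
    """
    return [tokens[i:i + max_len] for i in range(0, len(tokens), max_len)]


def doc_loglikelihood(tokens, max_len, score_window):
    """Sum window log-likelihoods; score_window stands in for a model call
    that returns the total log-likelihood of one window."""
    return sum(score_window(window) for window in rolling_windows(tokens, max_len))
```

In practice a rolling implementation also carries over some left context between windows so later predictions stay conditioned on preceding text; this sketch omits that detail for brevity.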
### Checklist
For adding novel benchmarks/datasets to the library:
* [ ] Is the task an existing benchmark in the literature?
* [ ] Have you referenced the original paper that introduced the task?
* [ ] If yes, does the original paper provide a reference implementation? If so, have you checked against the reference implementation and documented how to run such a test?
If other tasks on this dataset are already supported:
* [ ] Is the "Main" variant of this task clearly denoted?
* [ ] Have you provided a short sentence in a README on what each new variant adds / evaluates?
* [ ] Have you noted which, if any, published evaluation setups are matched by this variant?
group:
  - pile
  - perplexity
  - loglikelihood_rolling
task: pile_arxiv
dataset_path: EleutherAI/pile
dataset_name: pile_arxiv
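For perplexity tasks like this one, the reported metrics are all derived from a document's summed log-likelihood. A minimal sketch of the usual definitions of word perplexity, byte perplexity, and bits per byte follows; the function name and the whitespace-based word count are assumptions for illustration, not the harness's exact code.

```python
import math

def perplexity_metrics(total_loglikelihood, text):
    """Derive perplexity-style metrics from a document's total
    natural-log likelihood under the model."""
    n_words = len(text.split())           # crude whitespace word count (assumption)
    n_bytes = len(text.encode("utf-8"))
    return {
        "word_perplexity": math.exp(-total_loglikelihood / n_words),
        "byte_perplexity": math.exp(-total_loglikelihood / n_bytes),
        # convert nats to bits, normalised by document length in bytes
        "bits_per_byte": -total_loglikelihood / (n_bytes * math.log(2)),
    }
```

Note that lower is better for all three: a model that assigns probability 1/2 to every byte scores exactly 1.0 bits per byte.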