Pile 10k new task (#1758)

* Add Pile-10k readme
* Add Pile-10k task configuration file

Commit b898bdaa, authored May 01, 2024 by Gabriel Mukobi, committed by GitHub May 01, 2024

Showing with 64 additions and 0 deletions

lm_eval/tasks/pile_10k/README.md +45 -0

lm_eval/tasks/pile_10k/pile_10k.yaml +19 -0

--- a/lm_eval/tasks/pile_10k/README.md
+++ b/lm_eval/tasks/pile_10k/README.md
+# Pile-10k
+### Paper
+Title: `NeelNanda/pile-10k`
+Abstract: The first 10K elements of [The Pile](https://pile.eleuther.ai/), useful for debugging models trained on it. See the [HuggingFace page for the full Pile](https://huggingface.co/datasets/the_pile) for more info. Inspired by [stas' great resource](https://huggingface.co/datasets/stas/openwebtext-10k) doing the same for OpenWebText.
+Homepage: [https://huggingface.co/datasets/NeelNanda/pile-10k](https://huggingface.co/datasets/NeelNanda/pile-10k)
+### Citation
+```
+@misc{Nanda2022Pile10K,
+  author = {Nanda, Neel},
+  title = {{NeelNanda/pile-10k} \textendash\ Datasets at Hugging Face},
+  year = {2022},
+  howpublished = {\url{https://huggingface.co/datasets/NeelNanda/pile-10k}},
+}
+```
+### Groups and Tasks
+#### Groups
+* Not part of a group yet.
+#### Tasks
+* `pile_10k`: `The first 10K elements of The Pile, useful for debugging models trained on it.`
+### Checklist
+For adding novel benchmarks/datasets to the library:
+* [ ] Is the task an existing benchmark in the literature?
+  * [ ] Have you referenced the original paper that introduced the task?
+  * [ ] If yes, does the original paper provide a reference implementation? If so, have you checked against the reference implementation and documented how to run such a test?
+If other tasks on this dataset are already supported:
+* [ ] Is the "Main" variant of this task clearly denoted?
+* [ ] Have you provided a short sentence in a README on what each new variant adds / evaluates?
+* [ ] Have you noted which, if any, published evaluation setups are matched by this variant?
--- a/lm_eval/tasks/pile_10k/pile_10k.yaml
+++ b/lm_eval/tasks/pile_10k/pile_10k.yaml
+task: pile_10k
+dataset_path: NeelNanda/pile-10k
+dataset_name: null
+output_type: loglikelihood_rolling
+test_split: train
+doc_to_text: ""
+doc_to_target: "text"
+metric_list:
+  - metric: word_perplexity
+    aggregation: weighted_perplexity
+    higher_is_better: false
+  - metric: byte_perplexity
+    aggregation: weighted_perplexity
+    higher_is_better: false
+  - metric: bits_per_byte
+    aggregation: bits_per_byte
+    higher_is_better: false
+metadata:
+  version: 1.0
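
The three metrics in the config above all aggregate per-document rolling log-likelihoods, weighted by document length. The following is a minimal, self-contained sketch of that arithmetic, not the harness's own code. It assumes each document yields one `(log_likelihood, weight)` pair, that the weight is a word count for `word_perplexity` and a UTF-8 byte count for the two byte-level metrics, and that words are split on whitespace. The sample documents and their log-likelihood values are hypothetical.

```python
import math


def weighted_mean(items):
    """Sum of log-likelihoods divided by sum of weights (words or bytes)."""
    total_ll = sum(ll for ll, _ in items)
    total_weight = sum(weight for _, weight in items)
    return total_ll / total_weight


def weighted_perplexity(items):
    """exp of the negative length-weighted mean log-likelihood (natural log)."""
    return math.exp(-weighted_mean(items))


def bits_per_byte(items):
    """Negative length-weighted mean log-likelihood, converted from nats to bits."""
    return -weighted_mean(items) / math.log(2)


def count_words(text):
    # Assumption: words are whitespace-delimited.
    return len(text.split())


def count_bytes(text):
    return len(text.encode("utf-8"))


# Hypothetical documents paired with made-up total log-likelihoods (in nats).
docs = [("hello world", -8.0), ("the pile", -6.0)]

word_items = [(ll, count_words(text)) for text, ll in docs]
byte_items = [(ll, count_bytes(text)) for text, ll in docs]

word_ppl = weighted_perplexity(word_items)
byte_ppl = weighted_perplexity(byte_items)
bpb = bits_per_byte(byte_items)

print(f"word_perplexity={word_ppl:.4f}")
print(f"byte_perplexity={byte_ppl:.4f}")
print(f"bits_per_byte={bpb:.4f}")
```

Under these assumptions, `bits_per_byte` is the base-2 logarithm of `byte_perplexity`, so the two byte-level metrics carry the same information on different scales. Lower is better for all three, matching `higher_is_better: false` in the config.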