Commit ead2964e authored by Hailey Schoelkopf, committed by GitHub

Fix Paloma Template yaml (#1993)



* init paloma benchmark

* pre-process in utils function

* add `task_alias`

* updated task aliases

* Update paloma_dolma-v1_5.yaml

* Update paloma_twitterAAE_HELM_fixed.yaml

* Update paloma_dolma_100_programing_languages.yaml

* update on names

* fix paloma template issue

---------
Co-authored-by: Zafir Stojanovski <zaf.stojano@gmail.com>
Co-authored-by: Zafir Stojanovski <zafir.stojanovski@icloud.com>
Co-authored-by: Lintang Sutawika <lintang@eleuther.ai>
parent f257d38b
-include: paloma.yaml
+include: _paloma_template
 task: paloma_4chan_meta_sep
-task_alias: 4chan Corpus
+task_alias: 4chan
 dataset_name: 4chan_meta_sep

-include: paloma.yaml
+include: _paloma_template
 task: paloma_c4_100_domains
-task_alias: C4-100-domains
+task_alias: C4 100 Domains
 dataset_name: c4_100_domains

-include: paloma.yaml
+include: _paloma_template
 task: paloma_c4_en
 task_alias: C4
 dataset_name: c4_en

-include: paloma.yaml
+include: _paloma_template
 task: paloma_dolma-v1_5
 task_alias: Dolma V1.5
 dataset_name: dolma-v1_5

-include: paloma.yaml
+include: _paloma_template
 task: paloma_dolma_100_subreddits
-task_alias: Dolma-100-subreddits
+task_alias: 100 Subreddits
 dataset_name: dolma_100_subreddits

-include: paloma.yaml
+include: _paloma_template
 task: paloma_falcon-refinedweb
-task_alias: Falcon Refinedweb
+task_alias: Falcon
 dataset_name: falcon-refinedweb

-include: paloma.yaml
+include: _paloma_template
 task: paloma_gab
-task_alias: Gab Corpus
+task_alias: Gab
 dataset_name: gab

-include: paloma.yaml
+include: _paloma_template
 task: paloma_m2d2_s2orc_unsplit
 task_alias: M2D2 S2ORC
 dataset_name: m2d2_s2orc_unsplit

-include: paloma.yaml
+include: _paloma_template
 task: paloma_m2d2_wikipedia_unsplit
 task_alias: M2D2 Wikipedia
 dataset_name: m2d2_wikipedia_unsplit

-include: paloma.yaml
+include: _paloma_template
 task: paloma_manosphere_meta_sep
-task_alias: Manosphere Corpus
+task_alias: Manosphere
 dataset_name: manosphere_meta_sep

-include: paloma.yaml
+include: _paloma_template
 task: paloma_mc4
-task_alias: mC4-en
+task_alias: mC4
 dataset_name: mc4

-include: paloma.yaml
+include: _paloma_template
 task: paloma_ptb
-task_alias: Penn Treebank
+task_alias: PTB
 dataset_name: ptb

-include: paloma.yaml
+include: _paloma_template
 task: paloma_redpajama
 task_alias: RedPajama
 dataset_name: redpajama

-include: paloma.yaml
+include: _paloma_template
 task: paloma_twitterAAE_HELM_fixed
 task_alias: Twitter AAE
 dataset_name: twitterAAE_HELM_fixed

-include: paloma.yaml
+include: _paloma_template
 task: paloma_wikitext_103
 task_alias: Wikitext-103
 dataset_name: wikitext_103
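
For orientation, the following is a sketch of how a single task file would read once this commit is applied. It assumes that, wherever two conflicting `include:` or `task_alias:` lines appear for the same task above, the first is the removed line and the second is the added one. The file name is an assumption inferred from the task name, and only keys visible in this diff are shown; the contents of `_paloma_template` are not part of this page and are not reproduced.

```yaml
# Assumed file name: paloma_4chan_meta_sep.yaml (inferred from the task name)
include: _paloma_template        # shared settings now come from the template file
task: paloma_4chan_meta_sep      # task identifier, unchanged by this commit
task_alias: 4chan                # shortened alias (previously "4chan Corpus")
dataset_name: 4chan_meta_sep     # dataset subset name, unchanged by this commit
```

Each file keeps exactly one `include` key and one `task_alias` key. The paired lines in the diff are old and new values of the same key, not duplicate keys within one file.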