Commit 6fbebb4b authored by Angelika Romanou, committed by GitHub

Add INCLUDE tasks (#2769)



* Add INCLUDE tasks

* pacify pre-commit

---------
Co-authored-by: Baber <baber@hey.com>
parent bb4fa95e
group: include_base_44_lithuanian
task:
- include_base_44_lithuanian_few_shot_og_arts_humanities
- include_base_44_lithuanian_few_shot_og_stem
- include_base_44_lithuanian_few_shot_og_social_science
- include_base_44_lithuanian_few_shot_og_business_commerce
- include_base_44_lithuanian_few_shot_og_professional_certification
aggregate_metric_list:
- metric: acc
  weight_by_size: true
metadata:
  version: 0.0
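The group config above aggregates `acc` over its subtasks with `weight_by_size: true`. A minimal sketch of what size-weighted aggregation amounts to, assuming each subtask reports an accuracy and a document count. The function name and the sample numbers are hypothetical and are not taken from the harness:

```python
def weighted_mean_accuracy(results):
    """Combine per-subtask accuracies, weighting each by its document count.

    `results` is a list of (accuracy, n_docs) pairs, one per subtask.
    """
    total_docs = sum(n_docs for _, n_docs in results)
    if total_docs == 0:
        raise ValueError("cannot aggregate over zero documents")
    return sum(acc * n_docs for acc, n_docs in results) / total_docs


# Hypothetical subtask results: a small domain and a larger one.
subtask_results = [(0.50, 100), (0.80, 300)]
print(weighted_mean_accuracy(subtask_results))
```

With weighting by size, the larger subtask pulls the group score toward its own accuracy, whereas an unweighted mean would treat every domain equally regardless of how many questions it contains.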
dataset_path: CohereForAI/include-base-44
dataset_name: Lithuanian
test_split: test
output_type: multiple_choice
doc_to_text: "{{question.strip()}}\nA. {{option_a}}\nB. {{option_b}}\nC. {{option_c}}\nD. {{option_d}}\nAnswer:"
doc_to_choice:
- A
- B
- C
- D
doc_to_target: answer
metric_list:
- metric: acc
  aggregation: mean
  higher_is_better: true
metadata:
  version: 0.0
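The template above builds the prompt from the `question` and `option_a` to `option_d` fields and scores the choices `A` to `D`. The sketch below is a plain-Python stand-in for the Jinja template, not the harness's own rendering code. The sample document is invented, and treating `answer` as a zero-based index into the choice list is an assumption:

```python
CHOICES = ["A", "B", "C", "D"]


def render_prompt(doc):
    """Mimic the doc_to_text template using ordinary string formatting."""
    return (
        f"{doc['question'].strip()}\n"
        f"A. {doc['option_a']}\n"
        f"B. {doc['option_b']}\n"
        f"C. {doc['option_c']}\n"
        f"D. {doc['option_d']}\n"
        "Answer:"
    )


# Hypothetical document; real rows come from the dataset named in dataset_path.
sample_doc = {
    "question": "  What is 2 + 2?  ",
    "option_a": "3",
    "option_b": "4",
    "option_c": "5",
    "option_d": "6",
    "answer": 1,  # assumed to index into CHOICES
}

print(render_prompt(sample_doc))
print("Target:", CHOICES[sample_doc["answer"]])
```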
include: _lithuanian_few_shot_og_template_yaml
description: Toliau pateikiami klausimai su atsakymų variantais (su atsakymais) apie
  Arts & Humanities.
process_docs: !function 'utils.process_arts_humanities'
task: include_base_44_lithuanian_few_shot_og_arts_humanities
include: _lithuanian_few_shot_og_template_yaml
description: Toliau pateikiami klausimai su atsakymų variantais (su atsakymais) apie
  Business & Commerce.
process_docs: !function 'utils.process_business_commerce'
task: include_base_44_lithuanian_few_shot_og_business_commerce
include: _lithuanian_few_shot_og_template_yaml
description: Toliau pateikiami klausimai su atsakymų variantais (su atsakymais) apie
  Professional certification.
process_docs: !function 'utils.process_professional_certification'
task: include_base_44_lithuanian_few_shot_og_professional_certification
include: _lithuanian_few_shot_og_template_yaml
description: Toliau pateikiami klausimai su atsakymų variantais (su atsakymais) apie
  Social Science.
process_docs: !function 'utils.process_social_science'
task: include_base_44_lithuanian_few_shot_og_social_science
include: _lithuanian_few_shot_og_template_yaml
description: Toliau pateikiami klausimai su atsakymų variantais (su atsakymais) apie
  STEM.
process_docs: !function 'utils.process_stem'
task: include_base_44_lithuanian_few_shot_og_stem
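from functools import partial


# Domain labels matched against each row's "domain" field.
CATEGORIES = [
    "Applied Science",
    "Arts & Humanities",
    "Business & Commerce",
    "Driving License",
    "General knowledge",
    "Health oriented education",
    "Marine License",
    "Medical License",
    "Professional certification",
    "STEM",
    "Social Science",
]


def process_docs(dataset, category):
    # Keep only the rows whose domain matches the requested category.
    return dataset.filter(lambda x: x["domain"] == category)


# One filter per category, e.g. "Arts & Humanities" -> process_arts_humanities.
process_functions = {
    f"process_{category.lower().replace(' & ', '_').replace(' ', '_')}": partial(
        process_docs, category=category
    )
    for category in CATEGORIES
}

# Expose the generated functions at module level so the task YAMLs can
# reference them as utils.process_<name>.
globals().update(process_functions)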
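The helper module above derives one `process_<name>` function per category, and those names have to match the `!function 'utils.process_...'` references in the task YAMLs. The following self-contained sketch reproduces the naming rule and the filtering, using a plain list of dicts as a stand-in for the dataset object. `to_function_name`, `filter_rows`, and the sample rows are hypothetical names introduced for illustration:

```python
from functools import partial

CATEGORIES = ["Arts & Humanities", "Health oriented education", "STEM"]


def to_function_name(category):
    """Apply the same lower-casing and replacements as the helper module."""
    return "process_" + category.lower().replace(" & ", "_").replace(" ", "_")


def filter_rows(rows, category):
    """List-based stand-in for filtering a dataset by its 'domain' field."""
    return [row for row in rows if row["domain"] == category]


process_functions = {
    to_function_name(category): partial(filter_rows, category=category)
    for category in CATEGORIES
}

# Invented rows for demonstration only.
rows = [
    {"domain": "STEM", "question": "q1"},
    {"domain": "Arts & Humanities", "question": "q2"},
    {"domain": "STEM", "question": "q3"},
]

for name in sorted(process_functions):
    print(name, len(process_functions[name](rows)))
```

Binding the category with `partial` avoids the late-binding pitfall of defining lambdas inside the comprehension, where every generated function could otherwise end up closing over the same loop variable.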
group: include_base_44_malay
task:
- include_base_44_malay_few_shot_og_social_science
- include_base_44_malay_few_shot_og_business_commerce
- include_base_44_malay_few_shot_og_arts_humanities
aggregate_metric_list:
- metric: acc
  weight_by_size: true
metadata:
  version: 0.0
dataset_path: CohereForAI/include-base-44
dataset_name: Malay
test_split: test
output_type: multiple_choice
doc_to_text: "{{question.strip()}}\nA. {{option_a}}\nB. {{option_b}}\nC. {{option_c}}\nD. {{option_d}}\nAnswer:"
doc_to_choice:
- A
- B
- C
- D
doc_to_target: answer
metric_list:
- metric: acc
  aggregation: mean
  higher_is_better: true
metadata:
  version: 0.0
include: _malay_few_shot_og_template_yaml
description: Berikut ialah soalan aneka pilihan (dengan jawapan) tentang Arts & Humanities.
process_docs: !function 'utils.process_arts_humanities'
task: include_base_44_malay_few_shot_og_arts_humanities
include: _malay_few_shot_og_template_yaml
description: Berikut ialah soalan aneka pilihan (dengan jawapan) tentang Business
  & Commerce.
process_docs: !function 'utils.process_business_commerce'
task: include_base_44_malay_few_shot_og_business_commerce
include: _malay_few_shot_og_template_yaml
description: Berikut ialah soalan aneka pilihan (dengan jawapan) tentang Social Science.
process_docs: !function 'utils.process_social_science'
task: include_base_44_malay_few_shot_og_social_science
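from functools import partial


# Domain labels matched against each row's "domain" field.
CATEGORIES = [
    "Applied Science",
    "Arts & Humanities",
    "Business & Commerce",
    "Driving License",
    "General knowledge",
    "Health oriented education",
    "Marine License",
    "Medical License",
    "Professional certification",
    "STEM",
    "Social Science",
]


def process_docs(dataset, category):
    # Keep only the rows whose domain matches the requested category.
    return dataset.filter(lambda x: x["domain"] == category)


# One filter per category, e.g. "Arts & Humanities" -> process_arts_humanities.
process_functions = {
    f"process_{category.lower().replace(' & ', '_').replace(' ', '_')}": partial(
        process_docs, category=category
    )
    for category in CATEGORIES
}

# Expose the generated functions at module level so the task YAMLs can
# reference them as utils.process_<name>.
globals().update(process_functions)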
group: include_base_44_malayalam
task:
- include_base_44_malayalam_few_shot_og_stem
- include_base_44_malayalam_few_shot_og_marine_license
- include_base_44_malayalam_few_shot_og_health_oriented_education
- include_base_44_malayalam_few_shot_og_arts_humanities
- include_base_44_malayalam_few_shot_og_social_science
- include_base_44_malayalam_few_shot_og_general_knowledge
aggregate_metric_list:
- metric: acc
  weight_by_size: true
metadata:
  version: 0.0
dataset_path: CohereForAI/include-base-44
dataset_name: Malayalam
test_split: test
output_type: multiple_choice
doc_to_text: "{{question.strip()}}\nA. {{option_a}}\nB. {{option_b}}\nC. {{option_c}}\nD. {{option_d}}\nAnswer:"
doc_to_choice:
- A
- B
- C
- D
doc_to_target: answer
metric_list:
- metric: acc
  aggregation: mean
  higher_is_better: true
metadata:
  version: 0.0
include: _malayalam_few_shot_og_template_yaml
description: ഇനിപ്പറയുന്നവ Arts & Humanities നെക്കുറിച്ചുള്ള മൾട്ടിപ്പിൾ ചോയ്‌സ് ചോദ്യങ്ങളാണ്
  (ഉത്തരങ്ങളോടെ).
process_docs: !function 'utils.process_arts_humanities'
task: include_base_44_malayalam_few_shot_og_arts_humanities
include: _malayalam_few_shot_og_template_yaml
description: ഇനിപ്പറയുന്നവ General knowledge നെക്കുറിച്ചുള്ള മൾട്ടിപ്പിൾ ചോയ്‌സ് ചോദ്യങ്ങളാണ്
  (ഉത്തരങ്ങളോടെ).
process_docs: !function 'utils.process_general_knowledge'
task: include_base_44_malayalam_few_shot_og_general_knowledge
include: _malayalam_few_shot_og_template_yaml
description: ഇനിപ്പറയുന്നവ Health oriented education നെക്കുറിച്ചുള്ള മൾട്ടിപ്പിൾ ചോയ്‌സ്
  ചോദ്യങ്ങളാണ് (ഉത്തരങ്ങളോടെ).
process_docs: !function 'utils.process_health_oriented_education'
task: include_base_44_malayalam_few_shot_og_health_oriented_education
include: _malayalam_few_shot_og_template_yaml
description: ഇനിപ്പറയുന്നവ Marine License നെക്കുറിച്ചുള്ള മൾട്ടിപ്പിൾ ചോയ്‌സ് ചോദ്യങ്ങളാണ്
  (ഉത്തരങ്ങളോടെ).
process_docs: !function 'utils.process_marine_license'
task: include_base_44_malayalam_few_shot_og_marine_license