Commit 3041681f authored by silencealiang's avatar silencealiang
Browse files

init

parent 291fc518
Pipeline #2557 canceled with stages
"dataset_name": "tracking_shuffled_objects_five_objects"
"description": "A task requiring determining the final positions of a set of objects given their initial positions and a description of a sequence of swaps.\n\n"
"doc_to_text": "Q: Alice, Bob, and Claire are playing a game. At the start of the game, they are each holding a ball: Alice has a yellow ball, Bob has a blue ball, and Claire has a pink ball.\nAs the game progresses, pairs of players trade balls. First, Claire and Alice swap balls. Then, Alice and Bob swap balls. Finally, Claire and Bob swap balls. At the end of the game, Bob has the\nOptions:\n(A) yellow ball\n(B) blue ball\n(C) pink ball\nA: (A)\n\nQ: Alice, Bob, and Claire are playing a game. At the start of the game, they are each holding a ball: Alice has a white ball, Bob has a purple ball, and Claire has a pink ball.\nAs the game progresses, pairs of players trade balls. First, Bob and Alice swap balls. Then, Bob and Claire swap balls. Finally, Bob and Alice swap balls. At the end of the game, Alice has the\nOptions:\n(A) white ball\n(B) purple ball\n(C) pink ball\nA: (C)\n\nQ: Alice, Bob, and Claire are dancers at a square dance. At the start of a song, they each have a partner: Alice is dancing with Lola, Bob is dancing with Rodrigo, and Claire is dancing with Patrick.\nThroughout the song, the dancers often trade partners. First, Alice and Bob switch partners. Then, Claire and Bob switch partners. Finally, Bob and Alice switch partners. At the end of the dance, Alice is dancing with\nOptions:\n(A) Lola\n(B) Rodrigo\n(C) Patrick\nA: (C)\n\nQ: {{input}}\nA:"
"include": "_fewshot_template_yaml"
"task": "bbh_fewshot_tracking_shuffled_objects_five_objects"
"dataset_name": "tracking_shuffled_objects_seven_objects"
"description": "A task requiring determining the final positions of a set of objects given their initial positions and a description of a sequence of swaps.\n\n"
"doc_to_text": "Q: Alice, Bob, and Claire are playing a game. At the start of the game, they are each holding a ball: Alice has a yellow ball, Bob has a blue ball, and Claire has a pink ball.\nAs the game progresses, pairs of players trade balls. First, Claire and Alice swap balls. Then, Alice and Bob swap balls. Finally, Claire and Bob swap balls. At the end of the game, Bob has the\nOptions:\n(A) yellow ball\n(B) blue ball\n(C) pink ball\nA: (A)\n\nQ: Alice, Bob, and Claire are playing a game. At the start of the game, they are each holding a ball: Alice has a white ball, Bob has a purple ball, and Claire has a pink ball.\nAs the game progresses, pairs of players trade balls. First, Bob and Alice swap balls. Then, Bob and Claire swap balls. Finally, Bob and Alice swap balls. At the end of the game, Alice has the\nOptions:\n(A) white ball\n(B) purple ball\n(C) pink ball\nA: (C)\n\nQ: Alice, Bob, and Claire are dancers at a square dance. At the start of a song, they each have a partner: Alice is dancing with Lola, Bob is dancing with Rodrigo, and Claire is dancing with Patrick.\nThroughout the song, the dancers often trade partners. First, Alice and Bob switch partners. Then, Claire and Bob switch partners. Finally, Bob and Alice switch partners. At the end of the dance, Alice is dancing with\nOptions:\n(A) Lola\n(B) Rodrigo\n(C) Patrick\nA: (C)\n\nQ: {{input}}\nA:"
"include": "_fewshot_template_yaml"
"task": "bbh_fewshot_tracking_shuffled_objects_seven_objects"
"dataset_name": "tracking_shuffled_objects_three_objects"
"description": "A task requiring determining the final positions of a set of objects given their initial positions and a description of a sequence of swaps.\n\n"
"doc_to_text": "Q: Alice, Bob, and Claire are playing a game. At the start of the game, they are each holding a ball: Alice has a yellow ball, Bob has a blue ball, and Claire has a pink ball.\nAs the game progresses, pairs of players trade balls. First, Claire and Alice swap balls. Then, Alice and Bob swap balls. Finally, Claire and Bob swap balls. At the end of the game, Bob has the\nOptions:\n(A) yellow ball\n(B) blue ball\n(C) pink ball\nA: (A)\n\nQ: Alice, Bob, and Claire are playing a game. At the start of the game, they are each holding a ball: Alice has a white ball, Bob has a purple ball, and Claire has a pink ball.\nAs the game progresses, pairs of players trade balls. First, Bob and Alice swap balls. Then, Bob and Claire swap balls. Finally, Bob and Alice swap balls. At the end of the game, Alice has the\nOptions:\n(A) white ball\n(B) purple ball\n(C) pink ball\nA: (C)\n\nQ: Alice, Bob, and Claire are dancers at a square dance. At the start of a song, they each have a partner: Alice is dancing with Lola, Bob is dancing with Rodrigo, and Claire is dancing with Patrick.\nThroughout the song, the dancers often trade partners. First, Alice and Bob switch partners. Then, Claire and Bob switch partners. Finally, Bob and Alice switch partners. At the end of the dance, Alice is dancing with\nOptions:\n(A) Lola\n(B) Rodrigo\n(C) Patrick\nA: (C)\n\nQ: {{input}}\nA:"
"include": "_fewshot_template_yaml"
"task": "bbh_fewshot_tracking_shuffled_objects_three_objects"
"dataset_name": "web_of_lies"
"description": "Evaluate a random boolean function expressed as a word problem.\n\n"
"doc_to_text": "Q: Question: Fidel tells the truth. Jerry says Fidel tells the truth. Vina says Jerry tells the truth. Millicent says Vina lies. Raymond says Millicent lies. Does Raymond tell the truth?\nA: Yes\n\nQ: Question: Kristian lies. Millie says Kristian lies. Maybelle says Millie tells the truth. Fidel says Maybelle lies. Leda says Fidel lies. Does Leda tell the truth?\nA: Yes\n\nQ: Question: Kristian tells the truth. Michaela says Kristian lies. Raymond says Michaela tells the truth. Osvaldo says Raymond tells the truth. Jamey says Osvaldo tells the truth. Does Jamey tell the truth?\nA: No\n\nQ: {{input}}\nA:"
"include": "_fewshot_template_yaml"
"task": "bbh_fewshot_web_of_lies"
"dataset_name": "word_sorting"
"description": "Sort a list of words.\n\n"
"doc_to_text": "Q: Sort the following words alphabetically: List: oven costume counterpart\nA: costume counterpart oven\n\nQ: Sort the following words alphabetically: List: hypochlorite ponderosa phone credulity\nA: credulity hypochlorite phone ponderosa\n\nQ: Sort the following words alphabetically: List: newt arson parthia seismography mugho aspect census\nA: arson aspect census mugho newt parthia seismography\n\nQ: {{input}}\nA:"
"include": "_fewshot_template_yaml"
"task": "bbh_fewshot_word_sorting"
group: bbh_zeroshot
dataset_path: lukaemon/bbh
output_type: generate_until
test_split: test
doc_to_target: "{{target}}"
metric_list:
- metric: exact_match
aggregation: mean
higher_is_better: true
ignore_case: true
# ignore_punctuation: true
regexes_to_ignore:
- "\\.$"
- ","
- "\n"
- "\\\\"
- '"'
generation_kwargs:
until:
- "</s>"
- "Q:"
- "<|im_end|>"
do_sample: false
temperature: 0.0
num_fewshot: 0
metadata:
version: 2.0
"dataset_name": "boolean_expressions"
"description": "Evaluate the result of a random Boolean expression.\n\n"
"doc_to_text": "Q: {{input}}\nA:"
"include": "_zeroshot_template_yaml"
"task": "bbh_zeroshot_boolean_expressions"
filter_list:
- name: "strict-match"
filter:
- function: "take_first"
- name: "flexible-extract"
filter:
- function: "regex"
group_select: 0
regex_pattern: "\\b(True|False)\\b"
- function: "take_first"
"dataset_name": "causal_judgement"
"description": "Answer questions about causal attribution.\n\n"
"doc_to_text": "Q: {{input}}\nA:"
"include": "_zeroshot_template_yaml"
"task": "bbh_zeroshot_causal_judgement"
filter_list:
- name: "strict-match"
filter:
- function: "take_first"
- name: "flexible-extract"
filter:
- function: "regex"
group_select: 0
regex_pattern: "\\b(Yes|No|yes|no)\\b"
- function: "take_first"
"dataset_name": "date_understanding"
"description": "Infer the date from context.\n\n"
"doc_to_text": "Q: {{input}}\nA:"
"include": "_zeroshot_template_yaml"
"task": "bbh_zeroshot_date_understanding"
filter_list:
- name: "strict-match"
filter:
- function: "take_first"
- name: "flexible-extract"
filter:
- function: !function utils.MultiChoiceRegexFilter
group_select: 0
ignore_case: true
ignore_punctuation: true
regex_pattern: "(\\([A-Z]\\))"
- function: "take_first"
"dataset_name": "disambiguation_qa"
"description": "Clarify the meaning of sentences with ambiguous pronouns.\n\n"
"doc_to_text": "Q: {{input}}\nA:"
"include": "_zeroshot_template_yaml"
"task": "bbh_zeroshot_disambiguation_qa"
filter_list:
- name: "strict-match"
filter:
- function: "take_first"
- name: "flexible-extract"
filter:
- function: !function utils.MultiChoiceRegexFilter
group_select: 0
ignore_case: true
ignore_punctuation: true
regex_pattern: "(\\([A-Z]\\))"
- function: "take_first"
"dataset_name": "dyck_languages"
"description": "Correctly close a Dyck-n word.\n\n"
"doc_to_text": "Q: {{input}}\nA:"
"include": "_zeroshot_template_yaml"
"task": "bbh_zeroshot_dyck_languages"
filter_list:
- name: "strict-match"
filter:
- function: "take_first"
- name: "flexible-extract"
filter:
- function: "regex"
group_select: 0
regex_pattern: "(?<= )([\" \\[\\(<{}>\\)\\]]+)|([\" \\[\\(<{}>\\)\\]]+)"
- function: "take_first"
"dataset_name": "formal_fallacies"
"description": "Distinguish deductively valid arguments from formal fallacies.\n\n"
"doc_to_text": "Q: {{input}}\nA:"
"include": "_zeroshot_template_yaml"
"task": "bbh_zeroshot_formal_fallacies"
filter_list:
- name: "strict-match"
filter:
- function: "take_first"
- name: "flexible-extract"
filter:
- function: "regex"
group_select: 0
regex_pattern: "\\b(valid|invalid)\\b"
- function: "take_first"
"dataset_name": "geometric_shapes"
"description": "Name geometric shapes from their SVG paths.\n\n"
"doc_to_text": "Q: {{input}}\nA:"
"include": "_zeroshot_template_yaml"
"task": "bbh_zeroshot_geometric_shapes"
filter_list:
- name: "strict-match"
filter:
- function: "take_first"
- name: "flexible-extract"
filter:
- function: !function utils.MultiChoiceRegexFilter
group_select: 0
ignore_case: true
ignore_punctuation: true
regex_pattern: "(\\([A-Z]\\))"
- function: "take_first"
"dataset_name": "hyperbaton"
"description": "Order adjectives correctly in English sentences.\n\n"
"doc_to_text": "Q: {{input}}\nA:"
"include": "_zeroshot_template_yaml"
"task": "bbh_zeroshot_hyperbaton"
filter_list:
- name: "strict-match"
filter:
- function: "take_first"
- name: "flexible-extract"
filter:
- function: !function utils.MultiChoiceRegexFilter
group_select: 0
ignore_case: true
ignore_punctuation: true
regex_pattern: "(\\([A-Z]\\))"
- function: "take_first"
"dataset_name": "logical_deduction_five_objects"
"description": "A logical deduction task which requires deducing the order of a sequence of objects.\n\n"
"doc_to_text": "Q: {{input}}\nA:"
"include": "_zeroshot_template_yaml"
"task": "bbh_zeroshot_logical_deduction_five_objects"
filter_list:
- name: "strict-match"
filter:
- function: "take_first"
- name: "flexible-extract"
filter:
- function: !function utils.MultiChoiceRegexFilter
group_select: 0
ignore_case: true
ignore_punctuation: true
regex_pattern: "(\\([A-Z]\\))"
- function: "take_first"
"dataset_name": "logical_deduction_seven_objects"
"description": "A logical deduction task which requires deducing the order of a sequence of objects.\n\n"
"doc_to_text": "Q: {{input}}\nA:"
"include": "_zeroshot_template_yaml"
"task": "bbh_zeroshot_logical_deduction_seven_objects"
filter_list:
- name: "strict-match"
filter:
- function: "take_first"
- name: "flexible-extract"
filter:
- function: !function utils.MultiChoiceRegexFilter
group_select: 0
ignore_case: true
ignore_punctuation: true
regex_pattern: "(\\([A-Z]\\))"
- function: "take_first"
"dataset_name": "logical_deduction_three_objects"
"description": "A logical deduction task which requires deducing the order of a sequence of objects.\n\n"
"doc_to_text": "Q: {{input}}\nA:"
"include": "_zeroshot_template_yaml"
"task": "bbh_zeroshot_logical_deduction_three_objects"
filter_list:
- name: "strict-match"
filter:
- function: "take_first"
- name: "flexible-extract"
filter:
- function: !function utils.MultiChoiceRegexFilter
group_select: 0
ignore_case: true
ignore_punctuation: true
regex_pattern: "(\\([A-Z]\\))"
- function: "take_first"
"dataset_name": "movie_recommendation"
"description": "Recommend movies similar to the given list of movies.\n\n"
"doc_to_text": "Q: {{input}}\nA:"
"include": "_zeroshot_template_yaml"
"task": "bbh_zeroshot_movie_recommendation"
filter_list:
- name: "strict-match"
filter:
- function: "take_first"
- name: "flexible-extract"
filter:
- function: !function utils.MultiChoiceRegexFilter
group_select: 0
ignore_case: true
ignore_punctuation: true
regex_pattern: "(\\([A-Z]\\))"
- function: "take_first"
"dataset_name": "multistep_arithmetic_two"
"description": "Solve multi-step arithmetic problems.\n\n"
"doc_to_text": "Q: {{input}}\nA:"
"include": "_zeroshot_template_yaml"
"task": "bbh_zeroshot_multistep_arithmetic_two"
filter_list:
- name: "strict-match"
filter:
- function: "take_first"
- name: "flexible-extract"
filter:
- function: !function utils.NumberParseRegexFilter
group_select: 0
regex_pattern: "([-0-9]+)"
- function: "take_first"
"dataset_name": "navigate"
"description": "Given a series of navigation instructions, determine whether one would end up back at the starting point.\n\n"
"doc_to_text": "Q: {{input}}\nA:"
"include": "_zeroshot_template_yaml"
"task": "bbh_zeroshot_navigate"
filter_list:
- name: "strict-match"
filter:
- function: "take_first"
- name: "flexible-extract"
filter:
- function: "regex"
group_select: 0
regex_pattern: "\\b(Yes|No|yes|no)\\b"
- function: "take_first"
Markdown is supported
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment