Commit a2af2101 authored by Yen-Ting Lin, committed by GitHub

Merge branch 'EleutherAI:main' into main

parents 82cb25c1 d5f39bf8
dataset_name: boolean_expressions
description: 'Evaluate the result of a random Boolean expression.
'
doc_to_choice: ["False", "True"]
fewshot_config:
sampler: first_n
samples:
- input: not ( ( not not True ) ) is
target: 'False'
- input: True and False and not True and True is
target: 'False'
- input: not not ( not ( False ) ) is
target: 'True'
include: _fewshot_template_yaml
task: leaderboard_bbh_boolean_expressions
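The three few-shot targets in this config can be checked mechanically, because the prompts use the same `not`/`and`/`or` syntax as Python. The following is a minimal sketch, assuming every prompt ends with a trailing ` is`; the helper name `evaluate_boolean_prompt` is hypothetical and is not part of the evaluation harness.

```python
import ast

# Only these node types may appear in a prompt; anything else is rejected.
ALLOWED_NODES = (
    ast.Expression, ast.BoolOp, ast.UnaryOp,
    ast.And, ast.Or, ast.Not, ast.Constant,
)


def evaluate_boolean_prompt(prompt: str) -> str:
    """Evaluate a prompt such as 'not ( True ) is' and return 'True' or 'False'."""
    expr = prompt.strip()
    if expr.endswith(" is"):
        expr = expr[: -len(" is")]  # drop the trailing "is" of the prompt
    tree = ast.parse(expr, mode="eval")
    for node in ast.walk(tree):
        if not isinstance(node, ALLOWED_NODES):
            raise ValueError(f"unsupported syntax: {type(node).__name__}")
        if isinstance(node, ast.Constant) and not isinstance(node.value, bool):
            raise ValueError("only True/False literals are allowed")
    # The tree has been validated, so evaluating it is safe.
    result = eval(compile(tree, "<prompt>", "eval"), {"__builtins__": {}}, {})
    return str(result)


# The (input, target) pairs are copied from the samples in this config.
samples = [
    ("not ( ( not not True ) ) is", "False"),
    ("True and False and not True and True is", "False"),
    ("not not ( not ( False ) ) is", "True"),
]
for prompt, target in samples:
    assert evaluate_boolean_prompt(prompt) == target
```

The returned strings match the entries of `doc_to_choice`, so the same helper could be used to spot-check other expressions of this form.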
dataset_name: causal_judgement
description: 'Answer questions about causal attribution.
'
doc_to_choice: ["Yes", "No"]
fewshot_config:
sampler: first_n
samples:
- input: 'How would a typical person answer each of the following questions about
causation?
Frank T., had an ongoing dispute with his neighbor over a stretch of land and
one day decided to shoot his neighbor in the body. Frank T. had no experience
with guns, his hand slipped on the barrel of the gun, and the shot went wild.
Nonetheless, the bullet bounced off a large boulder several feet away and hit
the neighbor''s body, causing significant injury. Did Frank T. intentionally
shoot his neighbor in the body?
Options:
- Yes
- No'
target: 'No'
- input: 'How would a typical person answer each of the following questions about
causation?
Suzy and Billy are working on a project that is very important for our nation''s
security. The boss tells them both: "Be sure that you are here at exactly 9
am. It is absolutely essential that you arrive at that time." Both Billy and
Suzy arrive at 9 am. As it happens, there was a motion detector installed in
the room where they arrived. The motion detector was set up to be triggered
if at least one person appeared in the room at the same time. So the motion
detector went off. Did Billy cause the motion detector to go off?
Options:
- Yes
- No'
target: 'Yes'
- input: 'How would a typical person answer each of the following questions about
causation?
George and his sister Lena reunite at their parents'' house for Thanksgiving.
Whereas George just got into medical school, Lena is unhappy in her marriage
and recently lost her job. Over the course of the day, George and Lena get into
a number of heated arguments. Later in the afternoon they play a game of darts.
They split the first two games, and the third game is close until the end. Who
will win comes down to George''s last shot. If he hits a high point region,
he wins; if he hits a low point region, Lena wins. George thinks of the difficult
time Lena is having, and he really wants to let her win. He aims the dart at
the low point region. He sets up his shot and the dart lands in the low point
region. After his shot, Lena wins the game and is very happy. Did George hit
the low point region intentionally?
Options:
- Yes
- No'
target: 'Yes'
include: _fewshot_template_yaml
task: leaderboard_bbh_causal_judgement
dataset_name: date_understanding
description: 'Infer the date from context.
'
doc_to_choice: ["(A)", "(B)", "(C)", "(D)", "(E)", "(F)"]
fewshot_config:
sampler: first_n
samples:
- input: 'Today is Christmas Eve of 1937. What is the date 10 days ago in MM/DD/YYYY?
Options:
(A) 12/14/2026
(B) 12/14/1950
(C) 12/14/2007
(D) 12/14/1937
(E) 07/14/1938
(F) 12/14/1988'
target: (D)
- input: 'Tomorrow is 11/12/2019. What is the date one year ago from today in MM/DD/YYYY?
Options:
(A) 09/04/2018
(B) 11/11/2018
(C) 08/25/2018
(D) 11/02/2018
(E) 11/04/2018'
target: (B)
- input: 'Jane and John married on Jan 2, 1958. It is their 5-year anniversary today.
What is the date tomorrow in MM/DD/YYYY?
Options:
(A) 01/11/1961
(B) 01/03/1963
(C) 01/18/1961
(D) 10/14/1960
(E) 01/03/1982
(F) 12/03/1960'
target: (B)
include: _fewshot_template_yaml
task: leaderboard_bbh_date_understanding
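The date arithmetic behind the three few-shot targets can be reproduced with the standard library. This is only an illustrative sketch: the helper `shift_years` and the variable names are hypothetical, and the reading of each prompt (for example, "Christmas Eve" as December 24) is stated in the comments.

```python
from datetime import date, timedelta

FMT = "%m/%d/%Y"  # the MM/DD/YYYY format requested by the prompts


def shift_years(d: date, years: int) -> date:
    """Move a date by whole years (Feb 29 is not handled in this sketch)."""
    return d.replace(year=d.year + years)


# Sample 1: Christmas Eve of 1937 is 12/24/1937; go back 10 days.
sample_1 = (date(1937, 12, 24) - timedelta(days=10)).strftime(FMT)

# Sample 2: tomorrow is 11/12/2019, so today is 11/11/2019; go back one year.
today = date(2019, 11, 12) - timedelta(days=1)
sample_2 = shift_years(today, -1).strftime(FMT)

# Sample 3: the 5-year anniversary of Jan 2, 1958 is today; tomorrow is one day later.
anniversary = shift_years(date(1958, 1, 2), 5)
sample_3 = (anniversary + timedelta(days=1)).strftime(FMT)

print(sample_1, sample_2, sample_3)  # 12/14/1937 11/11/2018 01/03/1963
```

These values correspond to options (D), (B), and (B) in the samples.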
dataset_name: disambiguation_qa
description: 'Clarify the meaning of sentences with ambiguous pronouns.
'
doc_to_choice: ["(A)", "(B)", "(C)"]
fewshot_config:
sampler: first_n
samples:
- input: 'In the following sentences, explain the antecedent of the pronoun (which
thing the pronoun refers to), or state that it is ambiguous.
Sentence: The chief told the counselor that they took the day off.
Options:
(A) The chief took the day off
(B) The counselor took the day off
(C) Ambiguous'
target: (A)
- input: 'In the following sentences, explain the antecedent of the pronoun (which
thing the pronoun refers to), or state that it is ambiguous.
Sentence: The manager sent a message to the secretary, but he didn''t reply
yet.
Options:
(A) The secretary didn''t reply yet
(B) The manager didn''t reply yet
(C) Ambiguous'
target: (A)
- input: 'In the following sentences, explain the antecedent of the pronoun (which
thing the pronoun refers to), or state that it is ambiguous.
Sentence: Bailey will plan to meet the director at his office
Options:
(A) It will be Bailey''s office
(B) It will be the director''s office
(C) Ambiguous'
target: (C)
include: _fewshot_template_yaml
task: leaderboard_bbh_disambiguation_qa
dataset_name: formal_fallacies
description: 'Distinguish deductively valid arguments from formal fallacies.
'
doc_to_choice: ["valid", "invalid"]
fewshot_config:
sampler: first_n
samples:
- input: '"It is not always easy to see who is related to whom -- and in which ways.
The following argument pertains to this question: To begin with, Lesley is a
close friend of Fernando. Moreover, being a close friend of Fernando or a schoolmate
of Lowell is sufficient for being a great-grandfather of Leroy. It follows that
Lesley is a great-grandfather of Leroy."
Is the argument, given the explicitly stated premises, deductively valid or
invalid?
Options:
- valid
- invalid'
target: valid
- input: '"It is not always easy to see who is related to whom -- and in which ways.
The following argument pertains to this question: Whoever is not a great-grandfather
of Clyde is a stepbrother of Brian. Being an ancestor of Dana is sufficient
for not being a great-grandfather of Clyde. We may conclude: Everyone who is
an ancestor of Dana is a stepbrother of Brian, too."
Is the argument, given the explicitly stated premises, deductively valid or
invalid?
Options:
- valid
- invalid'
target: valid
- input: '"It is not always easy to grasp who is consuming which products. The following
argument pertains to this question: Every infrequent user of Paul Mitchell shampoo
is either a rare consumer of Nioxin shampoo or a loyal buyer of Caress soap,
or both. No regular consumer of Lush soap is a rare consumer of Nioxin shampoo
and, in the same time, a loyal buyer of Caress soap. It follows that whoever
is an infrequent user of Paul Mitchell shampoo is not a regular consumer of
Lush soap."
Is the argument, given the explicitly stated premises, deductively valid or
invalid?
Options:
- valid
- invalid'
target: invalid
include: _fewshot_template_yaml
task: leaderboard_bbh_formal_fallacies
dataset_name: geometric_shapes
description: 'Name geometric shapes from their SVG paths.
'
doc_to_choice: ["(A)","(B)","(C)","(D)","(E)","(F)","(G)","(H)","(I)","(J)","(K)"]
fewshot_config:
sampler: first_n
samples:
- input: 'This SVG path element <path d="M 31.00,73.00 L 32.00,59.00 L 44.00,50.00
L 49.00,41.00 L 64.00,37.00 L 71.00,55.00 L 64.00,76.00 L 52.00,61.00 L 31.00,73.00"/>
draws a
Options:
(A) circle
(B) heptagon
(C) hexagon
(D) kite
(E) line
(F) octagon
(G) pentagon
(H) rectangle
(I) sector
(J) triangle'
target: (F)
- input: 'This SVG path element <path d="M 14.19,26.04 L 51.43,39.21 L 58.44,36.69
L 56.63,30.17 L 48.53,26.66 L 14.19,26.04"/> draws a
Options:
(A) circle
(B) heptagon
(C) hexagon
(D) kite
(E) line
(F) octagon
(G) pentagon
(H) rectangle
(I) sector
(J) triangle'
target: (G)
- input: 'This SVG path element <path d="M 41.00,43.00 L 37.00,34.00 L 41.00,33.00
L 45.00,34.00 L 41.00,43.00"/> draws a
Options:
(A) circle
(B) heptagon
(C) hexagon
(D) kite
(E) line
(F) octagon
(G) pentagon
(H) rectangle
(I) sector
(J) triangle'
target: (D)
include: _fewshot_template_yaml
task: leaderboard_bbh_geometric_shapes
dataset_name: hyperbaton
description: 'Order adjectives correctly in English sentences.
'
doc_to_choice: ["(A)", "(B)"]
fewshot_config:
sampler: first_n
samples:
- input: 'Which sentence has the correct adjective order:
Options:
(A) rubber terrible ship
(B) terrible rubber ship'
target: (B)
- input: 'Which sentence has the correct adjective order:
Options:
(A) repulsive small Brazilian exercise ship
(B) Brazilian repulsive exercise small ship'
target: (A)
- input: 'Which sentence has the correct adjective order:
Options:
(A) blue gold wonderful square shoe
(B) wonderful square blue gold shoe'
target: (B)
include: _fewshot_template_yaml
task: leaderboard_bbh_hyperbaton
dataset_name: logical_deduction_five_objects
description: 'A logical deduction task which requires deducing the order of a sequence
of objects.
'
doc_to_choice: ["(A)","(B)","(C)","(D)","(E)"]
fewshot_config:
sampler: first_n
samples:
- input: 'The following paragraphs each describe a set of three objects arranged
in a fixed order. The statements are logically consistent within each paragraph.
In a golf tournament, there were three golfers: Amy, Eli, and Eve. Eve finished
above Amy. Eli finished below Amy.
Options:
(A) Amy finished last
(B) Eli finished last
(C) Eve finished last'
target: (B)
- input: 'The following paragraphs each describe a set of three objects arranged
in a fixed order. The statements are logically consistent within each paragraph.
On a shelf, there are three books: a white book, a green book, and an orange
book. The green book is to the right of the white book. The orange book is the
rightmost.
Options:
(A) The white book is the leftmost
(B) The green book is the leftmost
(C) The orange book is the leftmost'
target: (A)
- input: 'The following paragraphs each describe a set of three objects arranged
in a fixed order. The statements are logically consistent within each paragraph.
On a shelf, there are three books: a red book, a gray book, and a white book.
The white book is to the left of the gray book. The red book is the second from
the left.
Options:
(A) The red book is the leftmost
(B) The gray book is the leftmost
(C) The white book is the leftmost'
target: (C)
include: _fewshot_template_yaml
task: leaderboard_bbh_logical_deduction_five_objects
dataset_name: logical_deduction_seven_objects
description: 'A logical deduction task which requires deducing the order of a sequence
of objects.
'
doc_to_choice: ["(A)","(B)","(C)","(D)","(E)","(F)","(G)"]
fewshot_config:
sampler: first_n
samples:
- input: 'The following paragraphs each describe a set of three objects arranged
in a fixed order. The statements are logically consistent within each paragraph.
In a golf tournament, there were three golfers: Amy, Eli, and Eve. Eve finished
above Amy. Eli finished below Amy.
Options:
(A) Amy finished last
(B) Eli finished last
(C) Eve finished last'
target: (B)
- input: 'The following paragraphs each describe a set of three objects arranged
in a fixed order. The statements are logically consistent within each paragraph.
On a shelf, there are three books: a white book, a green book, and an orange
book. The green book is to the right of the white book. The orange book is the
rightmost.
Options:
(A) The white book is the leftmost
(B) The green book is the leftmost
(C) The orange book is the leftmost'
target: (A)
- input: 'The following paragraphs each describe a set of three objects arranged
in a fixed order. The statements are logically consistent within each paragraph.
On a shelf, there are three books: a red book, a gray book, and a white book.
The white book is to the left of the gray book. The red book is the second from
the left.
Options:
(A) The red book is the leftmost
(B) The gray book is the leftmost
(C) The white book is the leftmost'
target: (C)
include: _fewshot_template_yaml
task: leaderboard_bbh_logical_deduction_seven_objects
dataset_name: logical_deduction_three_objects
description: 'A logical deduction task which requires deducing the order of a sequence
of objects.
'
doc_to_choice: ["(A)","(B)","(C)"]
fewshot_config:
sampler: first_n
samples:
- input: 'The following paragraphs each describe a set of three objects arranged
in a fixed order. The statements are logically consistent within each paragraph.
In a golf tournament, there were three golfers: Amy, Eli, and Eve. Eve finished
above Amy. Eli finished below Amy.
Options:
(A) Amy finished last
(B) Eli finished last
(C) Eve finished last'
target: (B)
- input: 'The following paragraphs each describe a set of three objects arranged
in a fixed order. The statements are logically consistent within each paragraph.
On a shelf, there are three books: a white book, a green book, and an orange
book. The green book is to the right of the white book. The orange book is the
rightmost.
Options:
(A) The white book is the leftmost
(B) The green book is the leftmost
(C) The orange book is the leftmost'
target: (A)
- input: 'The following paragraphs each describe a set of three objects arranged
in a fixed order. The statements are logically consistent within each paragraph.
On a shelf, there are three books: a red book, a gray book, and a white book.
The white book is to the left of the gray book. The red book is the second from
the left.
Options:
(A) The red book is the leftmost
(B) The gray book is the leftmost
(C) The white book is the leftmost'
target: (C)
include: _fewshot_template_yaml
task: leaderboard_bbh_logical_deduction_three_objects
dataset_name: movie_recommendation
description: 'Recommend movies similar to the given list of movies.
'
doc_to_choice: ["(A)","(B)","(C)","(D)","(E)","(F)"]
fewshot_config:
sampler: first_n
samples:
- input: 'Find a movie similar to Star Wars Episode IV - A New Hope, Indiana Jones
and the Last Crusade, Star Wars Episode V - The Empire Strikes Back, The Big
Lebowski:
Options:
(A) Tetsuo
(B) the Ironman
(C) The Princess Bride
(D) The Barkley Marathons The Race That Eats Its Young
(E) Bug'
target: (C)
- input: 'Find a movie similar to Twister, The Silence of the Lambs, Independence
Day, Braveheart:
Options:
(A) They Shoot Horses
(B) Don''t They
(C) Forrest Gump
(D) The Salton Sea
(E) Extreme Days'
target: (C)
- input: 'Find a movie similar to Minority Report, Total Recall, Inside Out, Forrest
Gump:
Options:
(A) Phenomena
(B) Lilting
(C) Catwoman
(D) Edge of Tomorrow'
target: (D)
include: _fewshot_template_yaml
task: leaderboard_bbh_movie_recommendation
dataset_name: navigate
description: 'Given a series of navigation instructions, determine whether one would
end up back at the starting point.
'
doc_to_choice: ["Yes","No"]
fewshot_config:
sampler: first_n
samples:
- input: 'If you follow these instructions, do you return to the starting point?
Turn left. Turn around. Turn left. Take 7 steps. Take 2 steps. Take 4 steps.
Take 8 steps.
Options:
- Yes
- No'
target: 'No'
- input: 'If you follow these instructions, do you return to the starting point?
Turn around. Take 1 step. Take 6 steps. Turn around. Take 6 steps. Take 9 steps.
Take 1 step.
Options:
- Yes
- No'
target: 'No'
- input: 'If you follow these instructions, do you return to the starting point?
Always face forward. Take 2 steps right. Take 9 steps left. Take 7 steps right.
Options:
- Yes
- No'
target: 'Yes'
include: _fewshot_template_yaml
task: leaderboard_bbh_navigate
dataset_name: object_counting
description: 'Questions that involve enumerating objects and asking the model to count
them.
'
doc_to_choice: ["0","1","2","3","4","5","6","7","8","9","10", "11", "12", "13", "14", "15", "16", "17", "18"]
fewshot_config:
sampler: first_n
samples:
- input: I have a blackberry, a clarinet, a nectarine, a plum, a strawberry, a banana,
a flute, an orange, and a violin. How many fruits do I have?
target: '6'
- input: I have an orange, a raspberry, two peaches, a blackberry, an apple, a grape,
a nectarine, and three plums. How many fruits do I have?
target: '11'
- input: I have a lettuce head, a head of broccoli, an onion, a stalk of celery,
two carrots, a garlic, and a yam. How many vegetables do I have?
target: '8'
include: _fewshot_template_yaml
task: leaderboard_bbh_object_counting
dataset_name: penguins_in_a_table
description: 'Answer questions about a table of penguins and their attributes.
'
doc_to_choice: ["(A)","(B)","(C)","(D)","(E)"]
fewshot_config:
sampler: first_n
samples:
- input: 'Here is a table where the first line is a header and each subsequent line
is a penguin: name, age, height (cm), weight (kg) Louis, 7, 50, 11 Bernard,
5, 80, 13 Vincent, 9, 60, 11 Gwen, 8, 70, 15 For example: the age of Louis
is 7, the weight of Gwen is 15 kg, the height of Bernard is 80 cm. We now add
a penguin to the table:
James, 12, 90, 12
How many penguins are less than 8 years old?
Options:
(A) 1
(B) 2
(C) 3
(D) 4
(E) 5'
target: (B)
- input: 'Here is a table where the first line is a header and each subsequent line
is a penguin: name, age, height (cm), weight (kg) Louis, 7, 50, 11 Bernard,
5, 80, 13 Vincent, 9, 60, 11 Gwen, 8, 70, 15 For example: the age of Louis
is 7, the weight of Gwen is 15 kg, the height of Bernard is 80 cm. Which is
the youngest penguin?
Options:
(A) Louis
(B) Bernard
(C) Vincent
(D) Gwen
(E) James'
target: (B)
- input: 'Here is a table where the first line is a header and each subsequent line
is a penguin: name, age, height (cm), weight (kg) Louis, 7, 50, 11 Bernard,
5, 80, 13 Vincent, 9, 60, 11 Gwen, 8, 70, 15 For example: the age of Louis
is 7, the weight of Gwen is 15 kg, the height of Bernard is 80 cm. What is
the name of the second penguin sorted by alphabetic order?
Options:
(A) Louis
(B) Bernard
(C) Vincent
(D) Gwen
(E) James'
target: (D)
include: _fewshot_template_yaml
task: leaderboard_bbh_penguins_in_a_table
dataset_name: reasoning_about_colored_objects
description: 'Answer extremely simple questions about the colors of objects on a surface.
'
doc_to_choice: ["(A)","(B)","(C)","(D)","(E)","(F)","(G)","(H)","(I)","(J)","(K)","(L)","(M)","(N)","(O)","(P)","(Q)","(R)"]
fewshot_config:
sampler: first_n
samples:
- input: 'On the nightstand, there is a red pencil, a purple mug, a burgundy keychain,
a fuchsia teddy bear, a black plate, and a blue stress ball. What color is the
stress ball?
Options:
(A) red
(B) orange
(C) yellow
(D) green
(E) blue
(F) brown
(G) magenta
(H) fuchsia
(I) mauve
(J) teal
(K) turquoise
(L) burgundy
(M) silver
(N) gold
(O) black
(P) grey
(Q) purple
(R) pink'
target: (E)
- input: 'On the table, you see a bunch of objects arranged in a row: a purple paperclip,
a pink stress ball, a brown keychain, a green scrunchiephone charger, a mauve
fidget spinner, and a burgundy pen. What is the color of the object directly
to the right of the stress ball?
Options:
(A) red
(B) orange
(C) yellow
(D) green
(E) blue
(F) brown
(G) magenta
(H) fuchsia
(I) mauve
(J) teal
(K) turquoise
(L) burgundy
(M) silver
(N) gold
(O) black
(P) grey
(Q) purple
(R) pink'
target: (F)
- input: 'On the nightstand, you see the following items arranged in a row: a teal
plate, a burgundy keychain, a yellow scrunchiephone charger, an orange mug,
a pink notebook, and a grey cup. How many non-orange items do you see to the
left of the teal item?
Options:
(A) zero
(B) one
(C) two
(D) three
(E) four
(F) five
(G) six'
target: (A)
include: _fewshot_template_yaml
task: leaderboard_bbh_reasoning_about_colored_objects
dataset_name: ruin_names
description: 'Select the humorous edit that ''ruins'' the input movie or musical artist
name.
'
doc_to_choice: ["(A)","(B)","(C)","(D)","(E)","(F)"]
fewshot_config:
sampler: first_n
samples:
- input: 'Which of the following is a humorous edit of this artist or movie name:
''whitesnake''?
Options:
(A) whitesnape
(B) whitesnapke
(C) whitesnuake
(D) mwhitesnake'
target: (A)
- input: 'Which of the following is a humorous edit of this artist or movie name:
''one of our dinosaurs is missing''?
Options:
(A) ofne of our dinosaurs is missing
(B) one af our dinosaurs is missing
(C) one of our dinosaurs is pissing
(D) one of our dinosaur is missing'
target: (C)
- input: 'Which of the following is a humorous edit of this artist or movie name:
''counting crows''?
Options:
(A) countingy crows
(B) counting cows
(C) courting crows
(D) coutnting crows'
target: (B)
include: _fewshot_template_yaml
task: leaderboard_bbh_ruin_names
dataset_name: salient_translation_error_detection
description: 'Detect the type of error in an English translation of a German source
sentence.
'
doc_to_choice: ["(A)","(B)","(C)","(D)","(E)","(F)"]
fewshot_config:
sampler: first_n
samples:
- input: "The following translations from German to English contain a particular
error. That error will be one of the following types: Named Entities: An entity
(names, places, locations, etc.) is changed to a different entity. Numerical
Values: Numerical values (ordinals or cardinals), dates, and/or units are changed.
Modifiers or Adjectives: The modifiers and adjectives pertaining to a noun are
changed. Negation or Antonyms: Introduce or remove a negation or change comparatives
to their antonyms. Facts: Trivial factual errors not pertaining to the above
classes are introduced in the translations. Dropped Content: A significant clause
in the translation is removed. Please identify that error. Source: In der Liste
der Baudenkmale in Lenzen (Elbe) sind alle Baudenkmale der brandenburgischen
Stadt Lenzen (Elbe) und ihrer Ortsteile aufgelistet.
Translation: In the list of architectural monuments in Lenzen all architectural
monuments of the Brandenburg city of Lenzen and its districts are listed.
The translation contains an error pertaining to
Options:
(A) Modifiers or Adjectives
(B) Numerical Values
(C) Negation or Antonyms
(D) Named Entities
(E) Dropped Content
(F) Facts"
target: (D)
- input: "The following translations from German to English contain a particular
error. That error will be one of the following types: Named Entities: An entity
(names, places, locations, etc.) is changed to a different entity. Numerical
Values: Numerical values (ordinals or cardinals), dates, and/or units are changed.
Modifiers or Adjectives: The modifiers and adjectives pertaining to a noun are
changed. Negation or Antonyms: Introduce or remove a negation or change comparatives
to their antonyms. Facts: Trivial factual errors not pertaining to the above
classes are introduced in the translations. Dropped Content: A significant clause
in the translation is removed. Please identify that error. Source: Auf dieser
Seite sind die Baudenkm\xE4ler der oberbayerischen Gro\xDFen Kreisstadt Landsberg
am Lech zusammengestellt.
Translation: On this page are compiled the architectural monuments of the town of
Landsberg am Lech.
The translation contains an error pertaining to
Options:
(A) Modifiers or Adjectives
(B) Numerical Values
(C) Negation or Antonyms
(D) Named Entities
(E) Dropped Content
(F) Facts"
target: (E)
- input: "The following translations from German to English contain a particular
error. That error will be one of the following types: Named Entities: An entity
(names, places, locations, etc.) is changed to a different entity. Numerical
Values: Numerical values (ordinals or cardinals), dates, and/or units are changed.
Modifiers or Adjectives: The modifiers and adjectives pertaining to a noun are
changed. Negation or Antonyms: Introduce or remove a negation or change comparatives
to their antonyms. Facts: Trivial factual errors not pertaining to the above
classes are introduced in the translations. Dropped Content: A significant clause
in the translation is removed. Please identify that error. Source: \u0141eba ist
eine Kleinstadt und ein Badeort im Powiat L\u0119borski der polnischen Woiwodschaft
Pommern.
Translation: Eba is not a small town and seaside resort in the Powiat L\xE9borski
county of the Pomeranian Voivodeship of Poland.\nThe translation contains an error
pertaining to
Options:
(A) Modifiers or Adjectives
(B) Numerical Values
(C) Negation or Antonyms
(D) Named Entities
(E) Dropped Content
(F) Facts"
target: (C)
include: _fewshot_template_yaml
task: leaderboard_bbh_salient_translation_error_detection
dataset_name: snarks
description: 'Determine which of two sentences is sarcastic.
According to Cambridge University Dictionary, sarcasm is "the use of remarks that
clearly mean the opposite of what they say, made in order to hurt someone''s feelings
or to criticize something in a humorous way." Sarcastic sentences often contain
satirical or ironic utterances, hyperboles, ambivalent or witty remarks.
'
doc_to_choice: ["(A)","(B)"]
fewshot_config:
sampler: first_n
samples:
- input: 'Which statement is sarcastic?
Options:
(A) Yes, because having interests and actively researching them is a huge waste
(B) Yes, because having interests and actively researching them is a huge deal'
target: (A)
- input: 'Which statement is sarcastic?
Options:
(A) No one is going to disagree with you on this. Avoiding ad hominem attacks
really help your case
(B) No one is going to disagree with you on this. Ad hominem attacks really
help your case'
target: (B)
- input: 'Which statement is sarcastic?
Options:
(A) Consistency in the league''s punishments? What do you think this is supposed
to be, politics?
(B) Consistency in the league''s punishments? What do you think this is supposed
to be, moral?'
target: (A)
include: _fewshot_template_yaml
task: leaderboard_bbh_snarks
dataset_name: sports_understanding
description: 'Determine whether an artificially constructed sentence relating to sports
is plausible or not.
'
doc_to_choice: ["yes","no"]
fewshot_config:
sampler: first_n
samples:
- input: Is the following sentence plausible? "Bam Adebayo scored a reverse layup
in the Western Conference Finals."
target: 'yes'
- input: Is the following sentence plausible? "Santi Cazorla scored a touchdown."
target: 'no'
- input: Is the following sentence plausible? "DeMar DeRozan was called for the
goal tend."
target: 'yes'
include: _fewshot_template_yaml
task: leaderboard_bbh_sports_understanding
dataset_name: temporal_sequences
description: 'Task description: Answer questions about which times certain events
could have occurred.
'
doc_to_choice: ["(A)","(B)","(C)","(D)"]
fewshot_config:
sampler: first_n
samples:
- input: 'Today, Emily went to the museum. Between what times could they have gone?
We know that:
Emily woke up at 1pm.
Elizabeth saw Emily reading at the library from 2pm to 4pm.
Jessica saw Emily watching a movie at the theater from 4pm to 5pm.
Leslie saw Emily waiting at the airport from 5pm to 6pm.
William saw Emily buying clothes at the mall from 6pm to 7pm.
The museum was closed after 7pm.
Between what times could Emily have gone to the museum?
Options:
(A) 1pm to 2pm
(B) 6pm to 7pm
(C) 5pm to 6pm
(D) 2pm to 4pm'
target: (A)
- input: 'Today, Elizabeth went to the amusement park. Between what times could
they have gone?
We know that:
Elizabeth woke up at 7am.
David saw Elizabeth fixing their computer at the electronic store from 1pm to
2pm.
Sarah saw Elizabeth playing tennis at the tennis court from 2pm to 3pm.
Susan saw Elizabeth walking towards the Statue of Liberty from 3pm to 6pm.
Andrew saw Elizabeth taking photos near the Eiffel Tower from 6pm to 9pm.
Emily saw Elizabeth getting a coffee at the cafe from 9pm to 10pm.
The amusement park was closed after 10pm.
Between what times could Elizabeth have gone to the amusement park?
Options:
(A) 7am to 1pm
(B) 9pm to 10pm
(C) 1pm to 2pm
(D) 3pm to 6pm'
target: (A)
- input: 'Today, Tiffany went to the beach. Between what times could they have gone?
We know that:
Tiffany woke up at 5am.
Betty saw Tiffany getting a coffee at the cafe from 5am to 6am.
Jessica saw Tiffany working at the office from 6am to 9am.
John saw Tiffany stretching at a yoga studio from 9am to 12pm.
Sean saw Tiffany sitting on a rooftop from 12pm to 2pm.
Sarah saw Tiffany playing tennis at the tennis court from 2pm to 3pm.
The beach was closed after 4pm.
Between what times could Tiffany have gone to the beach?
Options:
(A) 9am to 12pm
(B) 12pm to 2pm
(C) 5am to 6am
(D) 3pm to 4pm'
target: (D)
include: _fewshot_template_yaml
task: leaderboard_bbh_temporal_sequences
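Each temporal_sequences sample reduces to finding the one interval between waking up and closing time that no sighting covers. The sketch below is a hypothetical helper, not harness code; it assumes times have already been converted by hand to 24-hour integers, and it does not parse the prompt text.

```python
def free_windows(wake, busy, closing):
    """Return (start, end) hour pairs between wake and closing not covered by busy."""
    windows = []
    cursor = wake  # earliest hour not yet accounted for
    for start, end in sorted(busy):
        if start > cursor and cursor < closing:
            windows.append((cursor, min(start, closing)))
        cursor = max(cursor, end)
    if cursor < closing:
        windows.append((cursor, closing))
    return windows


# Emily: woke at 1pm, seen 2-4pm, 4-5pm, 5-6pm, 6-7pm; museum closed after 7pm.
emily = free_windows(13, [(14, 16), (16, 17), (17, 18), (18, 19)], 19)

# Elizabeth: woke at 7am, seen 1-2pm, 2-3pm, 3-6pm, 6-9pm, 9-10pm; park closed after 10pm.
elizabeth = free_windows(7, [(13, 14), (14, 15), (15, 18), (18, 21), (21, 22)], 22)

# Tiffany: woke at 5am, seen 5-6am, 6-9am, 9am-12pm, 12-2pm, 2-3pm; beach closed after 4pm.
tiffany = free_windows(5, [(5, 6), (6, 9), (9, 12), (12, 14), (14, 15)], 16)

print(emily, elizabeth, tiffany)  # [(13, 14)] [(7, 13)] [(15, 16)]
```

The single free window in each case (1pm to 2pm, 7am to 1pm, 3pm to 4pm) matches targets (A), (A), and (D).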