gaoqiong / lm-evaluation-harness

Commit 0d1ef037, authored Jan 17, 2024 by lintangsutawika

    solved merge conflict

Parents: aa44be3f, ada4a31d

Showing 20 of the commit's 424 changed files, with 38 additions and 49 deletions (+38, -49) on this page.
Changed files on this page:

  lm_eval/tasks/glue/mnli/default.yaml                          +1   -1
  lm_eval/tasks/glue/mrpc/default.yaml                          +1   -1
  lm_eval/tasks/glue/qnli/default.yaml                          +1   -1
  lm_eval/tasks/glue/qqp/default.yaml                           +1   -1
  lm_eval/tasks/glue/rte/default.yaml                           +1   -1
  lm_eval/tasks/glue/sst2/default.yaml                          +1   -1
  lm_eval/tasks/glue/wnli/default.yaml                          +1   -1
  lm_eval/tasks/gsm8k/gsm8k-cot-self-consistency.yaml           +1   -1
  lm_eval/tasks/gsm8k/gsm8k-cot.yaml                            +11  -12
  lm_eval/tasks/gsm8k/gsm8k.yaml                                +1   -2
  lm_eval/tasks/headqa/headqa_en.yaml                           +1   -1
  lm_eval/tasks/hellaswag/hellaswag.yaml                        +1   -1
  lm_eval/tasks/hendrycks_ethics/commonsense.yaml               +1   -1
  lm_eval/tasks/hendrycks_ethics/deontology.yaml                +1   -1
  lm_eval/tasks/hendrycks_ethics/justice.yaml                   +1   -1
  lm_eval/tasks/hendrycks_ethics/utilitarianism.yaml            +1   -1
  lm_eval/tasks/hendrycks_ethics/utilitarianism_original_yaml   +1   -1
  lm_eval/tasks/hendrycks_ethics/virtue.yaml                    +1   -1
  lm_eval/tasks/ifeval/ifeval.yaml                              +1   -1
  lm_eval/tasks/ifeval/instructions_registry.py                 +9   -18
lm_eval/tasks/glue/mnli/default.yaml

@@ -11,4 +11,4 @@ doc_to_choice: ["True", "Neither", "False"]
 metric_list:
   - metric: acc
 metadata:
-  version: 1.0
+  version: 1.0
lm_eval/tasks/glue/mrpc/default.yaml

@@ -12,4 +12,4 @@ metric_list:
   - metric: acc
   - metric: f1
 metadata:
-  version: 1.0
+  version: 1.0
lm_eval/tasks/glue/qnli/default.yaml

@@ -11,4 +11,4 @@ doc_to_choice: ["yes", "no"]
 metric_list:
   - metric: acc
 metadata:
-  version: 1.0
+  version: 1.0
lm_eval/tasks/glue/qqp/default.yaml

@@ -12,4 +12,4 @@ metric_list:
   - metric: acc
   - metric: f1
 metadata:
-  version: 1.0
+  version: 1.0
lm_eval/tasks/glue/rte/default.yaml

@@ -11,4 +11,4 @@ doc_to_choice: ["True", "False"]
 metric_list:
   - metric: acc
 metadata:
-  version: 1.0
+  version: 1.0
lm_eval/tasks/glue/sst2/default.yaml

@@ -11,4 +11,4 @@ doc_to_choice: ["negative", "positive"]
 metric_list:
   - metric: acc
 metadata:
-  version: 1.0
+  version: 1.0
lm_eval/tasks/glue/wnli/default.yaml

@@ -11,4 +11,4 @@ doc_to_choice: ["False", "True"]
 metric_list:
   - metric: acc
 metadata:
-  version: 2.0
+  version: 2.0
lm_eval/tasks/gsm8k/gsm8k-cot-self-consistency.yaml

@@ -31,4 +31,4 @@ filter_list:
       - function: "majority_vote"
       - function: "take_first"
 metadata:
-  version: 0.0
+  version: 2.0
lm_eval/tasks/gsm8k/gsm8k-cot.yaml

@@ -5,16 +5,16 @@ dataset_path: gsm8k
 dataset_name: main
 output_type: generate_until
 test_split: test
-doc_to_text: "Q: There are 15 trees in the grove. Grove workers will plant trees in the grove today. After they are done, there will be 21 trees. How many trees did the grove workers plant today?\n\nA: There are 15 trees originally. Then there were 21 trees after some more were planted. So there must have been 21 - 15 = 6. The answer is 6.\n\n\
-Q: If there are 3 cars in the parking lot and 2 more cars arrive, how many cars are in the parking lot?\n\nA: There are originally 3 cars. 2 more cars arrive. 3 + 2 = 5. The answer is 5.\n\n\
-Q: Leah had 32 chocolates and her sister had 42. If they ate 35, how many pieces do they have left in total?\n\nA: Originally, Leah had 32 chocolates. Her sister had 42. So in total they had 32 + 42 = 74. After eating 35, they had 74 - 35 = 39. The answer is 39.\n\n\
-Q: Jason had 20 lollipops. He gave Denny some lollipops. Now Jason has 12 lollipops. How many lollipops did Jason give to Denny?\n\nA: Jason started with 20 lollipops. Then he had 12 after giving some to Denny. So he gave Denny 20 - 12 = 8. The answer is 8.\n\n\
-Q: Shawn has five toys. For Christmas, he got two toys each from his mom and dad. How many toys does he have now?\n\nA: Shawn started with 5 toys. If he got 2 toys each from his mom and dad, then that is 4 more toys. 5 + 4 = 9. The answer is 9.\n\n\
-Q: There were nine computers in the server room. Five more computers were installed each day, from monday to thursday. How many computers are now in the server room?\n\nA: There were originally 9 computers. For each of 4 days, 5 more computers were added. So 5 * 4 = 20 computers were added. 9 + 20 is 29. The answer is 29.\n\n\
-Q: Michael had 58 golf balls. On tuesday, he lost 23 golf balls. On wednesday, he lost 2 more. How many golf balls did he have at the end of wednesday?\n\nA: Michael started with 58 golf balls. After losing 23 on tuesday, he had 58 - 23 = 35. After losing 2 more, he had 35 - 2 = 33 golf balls. The answer is 33.\n\n\
-Q: Olivia has $23. She bought five bagels for $3 each. How much money does she have left?\n\nA: Olivia had 23 dollars. 5 bagels for 3 dollars each will be 5 x 3 = 15 dollars. So she has 23 - 15 dollars left. 23 - 15 is 8. The answer is 8.\n\n\
-Q: {{question}}\n\nA:"
-doc_to_target: "{{answer.split('### ')[-1].rstrip()}}"
+doc_to_text: "Q: There are 15 trees in the grove. Grove workers will plant trees in the grove today. After they are done, there will be 21 trees. How many trees did the grove workers plant today?\nA: There are 15 trees originally. Then there were 21 trees after some more were planted. So there must have been 21 - 15 = 6. The answer is 6.\n\n\
+Q: If there are 3 cars in the parking lot and 2 more cars arrive, how many cars are in the parking lot?\nA: There are originally 3 cars. 2 more cars arrive. 3 + 2 = 5. The answer is 5.\n\n\
+Q: Leah had 32 chocolates and her sister had 42. If they ate 35, how many pieces do they have left in total?\nA: Originally, Leah had 32 chocolates. Her sister had 42. So in total they had 32 + 42 = 74. After eating 35, they had 74 - 35 = 39. The answer is 39.\n\n\
+Q: Jason had 20 lollipops. He gave Denny some lollipops. Now Jason has 12 lollipops. How many lollipops did Jason give to Denny?\nA: Jason started with 20 lollipops. Then he had 12 after giving some to Denny. So he gave Denny 20 - 12 = 8. The answer is 8.\n\n\
+Q: Shawn has five toys. For Christmas, he got two toys each from his mom and dad. How many toys does he have now?\nA: Shawn started with 5 toys. If he got 2 toys each from his mom and dad, then that is 4 more toys. 5 + 4 = 9. The answer is 9.\n\n\
+Q: There were nine computers in the server room. Five more computers were installed each day, from monday to thursday. How many computers are now in the server room?\nA: There were originally 9 computers. For each of 4 days, 5 more computers were added. So 5 * 4 = 20 computers were added. 9 + 20 is 29. The answer is 29.\n\n\
+Q: Michael had 58 golf balls. On tuesday, he lost 23 golf balls. On wednesday, he lost 2 more. How many golf balls did he have at the end of wednesday?\nA: Michael started with 58 golf balls. After losing 23 on tuesday, he had 58 - 23 = 35. After losing 2 more, he had 35 - 2 = 33 golf balls. The answer is 33.\n\n\
+Q: Olivia has $23. She bought five bagels for $3 each. How much money does she have left?\nA: Olivia had 23 dollars. 5 bagels for 3 dollars each will be 5 x 3 = 15 dollars. So she has 23 - 15 dollars left. 23 - 15 is 8. The answer is 8.\n\n\
+Q: {{question}}\nA:"
+doc_to_target: "{{answer.split('#### ')[-1].strip()}}"
 metric_list:
   - metric: exact_match
     aggregation: mean

@@ -31,7 +31,6 @@ generation_kwargs:
     - "Q:"
     - "\n\n"
   do_sample: false
-  temperature: 0.0
 repeats: 1
 num_fewshot: 0
 filter_list:

@@ -41,4 +40,4 @@ filter_list:
       regex_pattern: "The answer is (\\-?[0-9\\.\\,]+)."
     - function: "take_first"
 metadata:
-  version: 0.0
+  version: 2.0
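The gsm8k-cot filter chain above first applies the regex to the model's chain-of-thought generation, then "take_first" keeps only the first match. A minimal sketch of that extraction; the sample generation string is hypothetical, and the pattern is the YAML's regex_pattern with its YAML double-backslash escaping removed:

```python
import re

# regex_pattern from the YAML above; YAML doubles the backslashes,
# so "\\-" in the config is the regex escape "\-".
PATTERN = re.compile(r"The answer is (\-?[0-9\.\,]+).")

# Hypothetical model generation in the few-shot CoT format.
generation = (
    "A: There are 15 trees originally. Then there were 21 trees after "
    "some more were planted. So there must have been 21 - 15 = 6. "
    "The answer is 6."
)

matches = PATTERN.findall(generation)
# The "take_first" filter step keeps only the first extracted match.
answer = matches[0] if matches else "[invalid]"
print(answer)  # -> 6
```

The extracted string is then compared against the gold target with the exact_match metric.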
lm_eval/tasks/gsm8k/gsm8k.yaml

@@ -24,7 +24,6 @@ generation_kwargs:
     - "\n\n"
     - "Question:"
   do_sample: false
-  temperature: 0.0
 repeats: 1
 num_fewshot: 5
 filter_list:

@@ -34,4 +33,4 @@ filter_list:
       regex_pattern: "#### (\\-?[0-9\\.\\,]+)"
     - function: "take_first"
 metadata:
-  version: 1.0
+  version: 2.0
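Both gsm8k variants anchor answer extraction on the dataset's "#### " delimiter: gsm8k-cot's doc_to_target splits the gold answer on it, while the gsm8k filter above regex-matches it in model output. A quick sketch of the two extractions side by side; the gold string is a made-up GSM8K-style answer:

```python
import re

# Made-up GSM8K-style gold answer; real ones end with "#### <number>".
gold = "There are 3 cars and 2 more arrive. 3 + 2 = <<3+2=5>>5. #### 5"

# doc_to_target-style extraction (as in gsm8k-cot.yaml).
target = gold.split("#### ")[-1].strip()

# regex_pattern from gsm8k.yaml above, with the YAML escaping removed.
PATTERN = re.compile(r"#### (\-?[0-9\.\,]+)")
match = PATTERN.search(gold)
extracted = match.group(1) if match else "[invalid]"

print(target, extracted)  # -> 5 5
```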
lm_eval/tasks/headqa/headqa_en.yaml

@@ -20,4 +20,4 @@ metric_list:
     aggregation: mean
     higher_is_better: true
 metadata:
-  version: 1.0
+  version: 1.0
lm_eval/tasks/hellaswag/hellaswag.yaml

@@ -19,4 +19,4 @@ metric_list:
     aggregation: mean
     higher_is_better: true
 metadata:
-  version: 1.0
+  version: 1.0
lm_eval/tasks/hendrycks_ethics/commonsense.yaml

@@ -12,4 +12,4 @@ doc_to_choice: ['no', 'yes']
 metric_list:
   - metric: acc
 metadata:
-  version: 1.0
+  version: 1.0
lm_eval/tasks/hendrycks_ethics/deontology.yaml

@@ -5,5 +5,5 @@ doc_to_text: "Question: Would most people believe this reasonable or unreasonabl
 doc_to_target: label
 doc_to_choice: ['unreasonable', 'reasonable']
 metadata:
-  version: 1.0
+  version: 1.0
 # TODO: implement exact-match metric for this subset
lm_eval/tasks/hendrycks_ethics/justice.yaml

@@ -6,4 +6,4 @@ dataset_name: justice
 doc_to_text: "Question: Would most people believe this reasonable or unreasonable to say? \"{{scenario}}\"\nAnswer:"
 # TODO: impl. exact match for this and deontology
 metadata:
-  version: 1.0
+  version: 1.0
lm_eval/tasks/hendrycks_ethics/utilitarianism.yaml

@@ -9,4 +9,4 @@ doc_to_choice: ['no', 'yes']
 metric_list:
   - metric: acc
 metadata:
-  version: 1.0
+  version: 1.0
lm_eval/tasks/hendrycks_ethics/utilitarianism_original_yaml

@@ -13,4 +13,4 @@
 # - metric: acc
 # TODO: we want this to be implemented as a winograd_schema task type, actually
 # metadata:
-#   version: 1.0
+#   version: 1.0
lm_eval/tasks/hendrycks_ethics/virtue.yaml

@@ -7,4 +7,4 @@ doc_to_text: "Sentence: {{scenario}}\nQuestion: Does the character in this sente
 doc_to_target: label
 doc_to_choice: ['no', 'yes']
 metadata:
-  version: 1.0
+  version: 1.0
lm_eval/tasks/ifeval/ifeval.yaml

@@ -26,4 +26,4 @@ metric_list:
     aggregation: !function utils.agg_inst_level_acc
     higher_is_better: true
 metadata:
-  version: 1.0
+  version: 2.0
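ifeval scores each prompt with a list of per-instruction pass/fail flags, and the `!function utils.agg_inst_level_acc` hook aggregates them. A plausible sketch of such an instruction-level aggregation (assumed behavior for illustration, not necessarily the harness's actual implementation): flatten the per-example lists and take the mean.

```python
def agg_inst_level_acc(items):
    """Flatten per-example lists of per-instruction pass/fail flags
    and return the fraction of instructions followed."""
    flat = [ok for example in items for ok in example]
    return sum(flat) / len(flat) if flat else 0.0

# Two prompts: one followed 1 of its 2 instructions, one followed both.
print(agg_inst_level_acc([[True, False], [True, True]]))  # -> 0.75
```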
lm_eval/tasks/ifeval/instructions_registry.py

@@ -78,8 +78,7 @@ INSTRUCTION_CONFLICTS = {
     # _KEYWORD + "key_sentences": instructions.KeySentenceChecker,
     _KEYWORD + "forbidden_words": {_KEYWORD + "forbidden_words"},
     _KEYWORD + "letter_frequency": {_KEYWORD + "letter_frequency"},
     _LANGUAGE + "response_language": {
         _LANGUAGE + "response_language",
         _FORMAT + "multiple_sections",
         _KEYWORD + "existence",

@@ -90,16 +89,14 @@ INSTRUCTION_CONFLICTS = {
         _CHANGE_CASES + "english_lowercase",
     },
     _LENGTH + "number_sentences": {_LENGTH + "number_sentences"},
     _LENGTH + "number_paragraphs": {
         _LENGTH + "number_paragraphs",
         _LENGTH + "nth_paragraph_first_word",
         _LENGTH + "number_sentences",
         _LENGTH + "nth_paragraph_first_word",
     },
     _LENGTH + "number_words": {_LENGTH + "number_words"},
     _LENGTH + "nth_paragraph_first_word": {
         _LENGTH + "nth_paragraph_first_word",
         _LENGTH + "number_paragraphs",
     },

@@ -110,23 +107,20 @@ INSTRUCTION_CONFLICTS = {
     # _CONTENT + "rephrase_paragraph": instructions.RephraseParagraph,
     _FORMAT + "constrained_response": set(INSTRUCTION_DICT.keys()),
     _FORMAT + "number_highlighted_sections": {_FORMAT + "number_highlighted_sections"},
     _FORMAT + "multiple_sections": {
         _FORMAT + "multiple_sections",
         _LANGUAGE + "response_language",
         _FORMAT + "number_highlighted_sections",
     },
     # TODO(tianjianlu): Re-enable rephrasing with preprocessing the message.
     # _FORMAT + "rephrase": instructions.RephraseChecker,
     _FORMAT + "json_format": set(INSTRUCTION_DICT.keys()).difference(
         {_KEYWORD + "forbidden_words", _KEYWORD + "existence"}
     ),
     _FORMAT + "title": {_FORMAT + "title"},
     # TODO(tianjianlu): Re-enable with specific prompts.
     # _MULTITURN + "constrained_start": instructions.ConstrainedStartChecker,
     _COMBINATION + "two_responses": set(INSTRUCTION_DICT.keys()).difference(
         {
             _KEYWORD + "forbidden_words",
             _KEYWORD + "existence",

@@ -135,20 +129,17 @@ INSTRUCTION_CONFLICTS = {
             _PUNCTUATION + "no_comma",
         }
     ),
     _COMBINATION + "repeat_prompt": set(INSTRUCTION_DICT.keys()).difference(
         {_KEYWORD + "existence", _FORMAT + "title", _PUNCTUATION + "no_comma"}
     ),
     _STARTEND + "end_checker": {_STARTEND + "end_checker"},
     _CHANGE_CASES + "capital_word_frequency": {
         _CHANGE_CASES + "capital_word_frequency",
         _CHANGE_CASES + "english_lowercase",
         _CHANGE_CASES + "english_capital",
     },
     _CHANGE_CASES + "english_capital": {_CHANGE_CASES + "english_capital"},
     _CHANGE_CASES + "english_lowercase": {
         _CHANGE_CASES + "english_lowercase",
         _CHANGE_CASES + "english_capital",
     },