gaoqiong / lm-evaluation-harness / Commits
"vscode:/vscode.git/clone" did not exist on "a562c8a35c93d70374e2d3b57c12f66718113f17"
Commit 57b86c47, authored Jul 22, 2025 by Baber
Parent: e0021a06

Remove unused `doc_to_choice` and fix superglue whitespaces

Showing 9 changed files with 6 additions and 9 deletions:
lm_eval/tasks/super_glue/README.md                +3  -0
lm_eval/tasks/super_glue/boolq/seq2seq.yaml       +0  -1
lm_eval/tasks/super_glue/boolq/t5-prompt.yaml     +0  -1
lm_eval/tasks/super_glue/cb/t5-prompt.yaml        +0  -1
lm_eval/tasks/super_glue/copa/t5-prompt.yaml      +0  -1
lm_eval/tasks/super_glue/multirc/t5-prompt.yaml   +1  -2
lm_eval/tasks/super_glue/record/default.yaml      +1  -0
lm_eval/tasks/super_glue/rte/t5-prompt.yaml       +1  -2
lm_eval/tasks/super_glue/wic/t5-prompt.yaml       +0  -1
lm_eval/tasks/super_glue/README.md

@@ -79,3 +79,6 @@ If other tasks on this dataset are already supported:
 * [ ] Is the "Main" variant of this task clearly denoted?
 * [ ] Have you provided a short sentence in a README on what each new variant adds / evaluates?
 * [ ] Have you noted which, if any, published evaluation setups are matched by this variant?
+
+### Changelog
+- 2025-07-22: `record` and `multirc`: set target_delimiter to "" and trim doc_to_text respectively.
lm_eval/tasks/super_glue/boolq/seq2seq.yaml

@@ -8,7 +8,6 @@ training_split: train
 validation_split: validation
 doc_to_text: "{{passage}}\nQuestion: {{question}}?\nAnswer:"
 doc_to_target: "{{ ['no', 'yes'][label|int] }}"
-doc_to_choice: ["no", "yes"]
 target_delimiter: " "
 generation_kwargs:
   until:
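For context, this is how the two templates above behave when rendered with plain `jinja2` (a standalone sketch with invented field values; the harness's actual rendering lives inside `lm_eval` and adds its own filters and plumbing):

```python
# Invented doc values; only the template semantics are being illustrated.
from jinja2 import Template

doc = {"passage": "The sky is blue.", "question": "Is the sky blue", "label": 1}

text = Template("{{passage}}\nQuestion: {{question}}?\nAnswer:").render(**doc)
target = Template("{{ ['no', 'yes'][label|int] }}").render(**doc)

# The scored string is doc_to_text + target_delimiter + doc_to_target,
# so with target_delimiter " " the target follows "Answer:" after one space.
print(text + " " + target)
# The sky is blue.
# Question: Is the sky blue?
# Answer: yes
```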
lm_eval/tasks/super_glue/boolq/t5-prompt.yaml

@@ -8,7 +8,6 @@ validation_split: validation
 output_type: generate_until
 doc_to_text: "boolq passage: {{passage}} question: {{question}}"
 doc_to_target: "{{['False', 'True'][label|int]}}"
-doc_to_choice: ["False", "True"]
 generation_kwargs:
   until:
     - "</s>"
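Because these are `output_type: generate_until` tasks, scoring compares the model's generation (truncated at the `until` stop sequences) against `doc_to_target`; `doc_to_choice` is only consulted for multiple-choice scoring, which is why the commit message calls it unused. A rough sketch of the truncation step, assuming a simple split-based implementation rather than the harness's actual code:

```python
def truncate_at_stops(generation: str, until: list[str]) -> str:
    # Keep only the text before the first occurrence of any stop sequence;
    # a simplified stand-in for generate_until post-processing.
    for stop in until:
        generation = generation.split(stop)[0]
    return generation

print(truncate_at_stops("True</s><pad>", ["</s>"]))  # -> "True"
```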
lm_eval/tasks/super_glue/cb/t5-prompt.yaml

@@ -8,7 +8,6 @@ validation_split: validation
 output_type: generate_until
 doc_to_text: "cb hypothesis: {{hypothesis}} premise: {{premise}}"
 doc_to_target: "{{ ['entailment', 'contradiction', 'neutral'][label|int] }}"
-doc_to_choice: ["entailment", "contradiction", "neutral"]
 generation_kwargs:
   until:
     - "</s>"
lm_eval/tasks/super_glue/copa/t5-prompt.yaml

@@ -8,7 +8,6 @@ validation_split: validation
 output_type: generate_until
 doc_to_text: "copa choice1: {{choice1}} choice2: {{choice2}} premise: {{premise}} question: {{question}}"
 doc_to_target: "{{ [choice1, choice2][label|int] }}"
-doc_to_choice: ["choice1", "choice2"]
 generation_kwargs:
   until:
     - "</s>"
lm_eval/tasks/super_glue/multirc/t5-prompt.yaml

@@ -6,9 +6,8 @@ dataset_name: multirc
 training_split: train
 validation_split: validation
 output_type: generate_until
-doc_to_text: "multirc question: {{question}} answer: {{answer}} paragraph: {{paragraph}}"
+doc_to_text: "multirc question: {{question}} answer: {{answer}} paragraph: {{paragraph|trim}}"
 doc_to_target: "{% set group_id = idx.question|string %}{{[group_id+'_False', group_id+'_True'][label]}}"
-doc_to_choice: "{% set group_id = idx.question|string %}{{[group_id+'_False', group_id+'_True']}}"
 generation_kwargs:
   until:
     - "</s>"
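Assuming the intent recorded in the README changelog ("trim doc_to_text"), the `|trim` filter strips stray whitespace that a MultiRC paragraph field may carry at its edges. A standalone illustration with an invented paragraph value:

```python
from jinja2 import Template

# The trailing "\n  " is invented to show what |trim removes.
doc = {"question": "Was it sunny?", "answer": "yes",
       "paragraph": "It was a bright, cloudless day.\n  "}

template = Template(
    "multirc question: {{question}} answer: {{answer}} paragraph: {{paragraph|trim}}"
)
print(repr(template.render(**doc)))
# 'multirc question: Was it sunny? answer: yes paragraph: It was a bright, cloudless day.'
```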
lm_eval/tasks/super_glue/record/default.yaml

@@ -11,6 +11,7 @@ doc_to_target: !function util.doc_to_target
 doc_to_choice: !function util.doc_to_choice
 process_docs: !function util.process_docs
 process_results: !function util.process_results
+target_delimiter: ""
 metric_list:
   - metric: f1
     aggregation: mean
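`target_delimiter` is the string the harness places between the rendered `doc_to_text` and `doc_to_target` when assembling the full sequence, and it defaults to a single space; the added line keeps ReCoRD's target from being pushed one extra space away from its query. A toy illustration (both strings are invented; the real values come from the `util.py` helpers referenced above):

```python
doc_to_text = "record query: The opening goal was scored by"  # invented
doc_to_target = " Smith"  # invented; here the target supplies its own leading space

target_delimiter = ""  # the line this commit adds
print(doc_to_text + target_delimiter + doc_to_target)
# record query: The opening goal was scored by Smith
```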
lm_eval/tasks/super_glue/rte/t5-prompt.yaml

@@ -7,8 +7,7 @@ training_split: train
 validation_split: validation
 output_type: generate_until
 doc_to_text: "rte hypothesis: {{hypothesis}} premise: {{premise}}"
-doc_to_target: "{{ [entailment, not_entailment][label|int] }}"
-doc_to_choice: ["entailment", "not_entailment"]
+doc_to_target: "{{ ['entailment', 'not_entailment'][label|int] }}"
 generation_kwargs:
   until:
     - "</s>"
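The quoting fix matters because bare `entailment` and `not_entailment` are undefined Jinja variables, so the old template rendered an empty target (contrast COPA above, where bare `choice1`/`choice2` are real document fields). A quick standalone check:

```python
from jinja2 import Template

old = Template("{{ [entailment, not_entailment][label|int] }}")
new = Template("{{ ['entailment', 'not_entailment'][label|int] }}")

print(repr(old.render(label=0)))  # '' -- undefined names render as empty strings
print(repr(new.render(label=0)))  # 'entailment'
```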
lm_eval/tasks/super_glue/wic/t5-prompt.yaml

@@ -8,7 +8,6 @@ validation_split: validation
 output_type: generate_until
 doc_to_text: "wic sentence1: {{sentence1}} sentence2: {{sentence2}} word: {{word}}"
 doc_to_target: "{{ ['False', 'True'][label|int] }}"
-doc_to_choice: ["False", "True"]
 generation_kwargs:
   until:
     - "</s>"