gaoqiong / lm-evaluation-harness · Commits

Commit 9c748204, authored Aug 07, 2023 by lintangsutawika

making t5 version of superglue prompt

parent 07f94446

Showing 8 changed files with 82 additions and 6 deletions
lm_eval/evaluator.py (+7, -2)
lm_eval/models/huggingface.py (+3, -0)
lm_eval/tasks/super_glue/boolq/t5-prompt.yaml (+18, -0)
lm_eval/tasks/super_glue/copa/t5-prompt.yaml (+2, -2)
lm_eval/tasks/super_glue/multirc/t5-prompt.yaml (+17, -0)
lm_eval/tasks/super_glue/rte/t5-prompt.yaml (+17, -0)
lm_eval/tasks/super_glue/wic/t5-prompt.yaml (+17, -0)
lm_eval/tasks/super_glue/wsc/t5-prompt.yaml (+1, -2)
lm_eval/evaluator.py

@@ -114,7 +114,12 @@ def simple_evaluate(
     task_dict = lm_eval.tasks.get_task_dict(tasks)

     for task_name in task_dict.keys():
-        config = task_dict[task_name]._config
+        task_obj = task_dict[task_name]
+        if type(task_obj) == tuple:
+            group, task_obj = task_obj
+        config = task_obj._config
+
         if num_fewshot is not None:
             if config["num_fewshot"] > 0:
                 default_num_fewshot = config["num_fewshot"]
@@ -122,7 +127,7 @@ def simple_evaluate(
                     f"Overwriting default num_fewshot of {task_name} from {default_num_fewshot} to {num_fewshot}"
                 )
-                task_dict[task_name]._config["num_fewshot"] = num_fewshot
+                task_obj._config["num_fewshot"] = num_fewshot

     if check_integrity:
         run_task_tests(task_list=tasks)
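The hunks above normalize `task_dict` entries that may be either a bare task object or a `(group, task)` tuple before overriding `num_fewshot`, so the CLI override touches the task object itself rather than the raw dict entry. A minimal sketch of that pattern, using a hypothetical `FakeTask` stand-in rather than the harness's real task class:

```python
class FakeTask:
    # Hypothetical stand-in: only mimics the _config dict the hunk touches.
    def __init__(self, num_fewshot):
        self._config = {"num_fewshot": num_fewshot}

def override_num_fewshot(task_dict, num_fewshot):
    for task_name in task_dict.keys():
        task_obj = task_dict[task_name]
        if type(task_obj) == tuple:
            # Grouped tasks are stored as (group_name, task_object); unwrap.
            group, task_obj = task_obj
        config = task_obj._config
        if num_fewshot is not None and config["num_fewshot"] > 0:
            print(f"Overwriting default num_fewshot of {task_name} "
                  f"from {config['num_fewshot']} to {num_fewshot}")
            task_obj._config["num_fewshot"] = num_fewshot

tasks = {
    "boolq": FakeTask(5),
    "copa": ("super-glue-t5-prompt", FakeTask(3)),
}
override_num_fewshot(tasks, 0)  # e.g. zero-shot requested on the CLI
```

Because the tuple is unwrapped first, both the plain and the grouped task end up with `num_fewshot` set to 0.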
lm_eval/models/huggingface.py
View file @
9c748204
import
os
import
torch
import
transformers
from
transformers.models.auto.modeling_auto
import
(
...
...
@@ -74,6 +76,7 @@ class HFLM(LM):
low_cpu_mem_usage
:
Optional
[
bool
]
=
True
,
trust_remote_code
:
Optional
[
bool
]
=
False
,
use_fast_tokenizer
:
Optional
[
bool
]
=
True
,
cache_dir
:
Optional
[
Union
[
str
,
os
.
PathLike
]]
=
None
,
# arguments used for splitting a model across GPUs naively.
# only used if `parallelize=True`.
parallelize
:
Optional
[
bool
]
=
False
,
...
...
lm_eval/tasks/super_glue/boolq/t5-prompt.yaml (new file, mode 100644)

group:
  - super-glue-t5-prompt
task: super_glue-boolq-t5-prompt
dataset_path: super_glue
dataset_name: boolq
training_split: train
validation_split: validation
output_type: greedy_until
doc_to_text: "boolq question: {{question}} passage: {{passage}}"
doc_to_target: label
doc_to_choice: ['False', 'True']
metric_list:
  - metric: exact_match
    aggregation: mean
    higher_is_better: true
    ignore_case: true
    ignore_punctuation: true
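In these task configs, `doc_to_text` is a template applied to each dataset row to build the prompt. The harness renders it with Jinja2; the sketch below uses a minimal `{{field}}` substitution stand-in (not real Jinja2) just to show how the BoolQ prompt above is assembled from a document:

```python
import re

def render(template, doc):
    # Minimal stand-in renderer: replace each {{field}} with doc[field].
    return re.sub(r"\{\{(\w+)\}\}", lambda m: str(doc[m.group(1)]), template)

doc = {
    "question": "is the sky blue",
    "passage": "The sky appears blue to the human eye.",
}
prompt = render("boolq question: {{question}} passage: {{passage}}", doc)
# -> "boolq question: is the sky blue passage: The sky appears blue to the human eye."
```

The leading `boolq` token mirrors T5's convention of prefixing each example with its task name.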
lm_eval/tasks/super_glue/copa/t5-prompt.yaml

@@ -6,9 +6,9 @@ dataset_name: copa
 training_split: train
 validation_split: validation
 output_type: greedy_until
-doc_to_text: "copa choice1: {{choice1}} choice2: {{choice2}} question: {{question}}"
+doc_to_text: "copa choice1: {{choice1}} choice2: {{choice2}} premise: {{premise}} question: {{question}}"
 doc_to_target: label
-doc_to_choice: ['False', 'True']
+doc_to_choice: ['choice1', 'choice2']
 metric_list:
   - metric: exact_match
     aggregation: mean
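With `output_type: greedy_until` and the `exact_match` metric, the model's generated string is compared against the choice that `label` selects, so switching `doc_to_choice` from `['False', 'True']` to `['choice1', 'choice2']` makes the gold targets match what a T5-style model is prompted to emit. A simplified stand-in for that scoring step (the real metric can also strip punctuation per the YAML flags):

```python
def exact_match(pred, target, ignore_case=True):
    # Simplified stand-in for the harness's exact_match metric.
    if ignore_case:
        pred, target = pred.lower(), target.lower()
    return float(pred == target)

doc_to_choice = ["choice1", "choice2"]
target = doc_to_choice[1]               # the doc's label picks the gold answer
score = exact_match("Choice2", target)  # 1.0, since ignore_case is true
```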
lm_eval/tasks/super_glue/multirc/t5-prompt.yaml (new file, mode 100644)

group:
  - super-glue-t5-prompt
task: super_glue-multirc-t5-prompt
dataset_path: super_glue
dataset_name: multirc
training_split: train
validation_split: validation
output_type: greedy_until
doc_to_text: "multirc question: {{question}} answer: {{answer}} paragraph: {{paragraph}}"
doc_to_target: label
doc_to_choice: ['False', 'True']
metric_list:
  - metric: exact_match
    aggregation: mean
    higher_is_better: true
    ignore_case: true
    ignore_punctuation: true
lm_eval/tasks/super_glue/rte/t5-prompt.yaml (new file, mode 100644)

group:
  - super-glue-t5-prompt
task: super_glue-rte-t5-prompt
dataset_path: super_glue
dataset_name: rte
training_split: train
validation_split: validation
output_type: greedy_until
doc_to_text: "rte premise: {{premise}} hypothesis: {{hypothesis}}"
doc_to_target: label
doc_to_choice: ['entailment', 'not_entailment']
metric_list:
  - metric: exact_match
    aggregation: mean
    higher_is_better: true
    ignore_case: true
    ignore_punctuation: true
lm_eval/tasks/super_glue/wic/t5-prompt.yaml (new file, mode 100644)

group:
  - super-glue-t5-prompt
task: super_glue-wic-t5-prompt
dataset_path: super_glue
dataset_name: wic
training_split: train
validation_split: validation
output_type: greedy_until
doc_to_text: "wic sentence1: {{sentence1}} sentence2: {{sentence2}}"
doc_to_target: label
doc_to_choice: ['False', 'True']
metric_list:
  - metric: exact_match
    aggregation: mean
    higher_is_better: true
    ignore_case: true
    ignore_punctuation: true
lm_eval/tasks/super_glue/wsc/t5-prompt.yaml

@@ -7,8 +7,7 @@ training_split: train
 validation_split: validation
 output_type: greedy_until
 doc_to_text: !function "preprocess_wsc.t5_prompt_doc_to_text"
-doc_to_target: label
-doc_to_choice: ['False', 'True']
+doc_to_target: "{{[span1_text, span2_text][label]}}"
 metric_list:
   - metric: exact_match
     aggregation: mean
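The new `doc_to_target` replaces the fixed `False`/`True` choices with a Jinja2 index expression: it builds a two-element list from the document's spans and picks the gold span by `label`, so the model is scored on generating the span text itself. The equivalent selection in plain Python, with an illustrative document:

```python
# Illustrative WSC-style document (fields as in the super_glue wsc split).
doc = {"span1_text": "Mark", "span2_text": "he", "label": 0}

# Python equivalent of the template "{{[span1_text, span2_text][label]}}":
target = [doc["span1_text"], doc["span2_text"]][doc["label"]]
# label == 0 selects span1_text, so the expected generation is "Mark"
```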