gaoqiong / lm-evaluation-harness
Commit 2b87299e, authored Jul 01, 2024 by lintangsutawika
add task yamls
parent 8bff2285
Showing 6 changed files with 24 additions and 6 deletions
lm_eval/tasks/mmmu/_mmmu_mc_yaml +2 -0
lm_eval/tasks/mmmu/_mmmu_open_yaml +3 -0
lm_eval/tasks/mmmu/_template_yaml +6 -6
lm_eval/tasks/mmmu/mmmu_electronics.yaml +3 -0
lm_eval/tasks/mmmu/mmmu_yaml +4 -0
lm_eval/tasks/mmmu/utils.py +6 -0
lm_eval/tasks/mmmu/_mmmu_mc_yaml
0 → 100644
include: _template_yaml
process_docs: !function utils.process_multiple_choice
lm_eval/tasks/mmmu/_mmmu_open_yaml
0 → 100644
include: _template_yaml
process_docs: !function utils.process_open_choice
dataset_name: Electronics
\ No newline at end of file
lm_eval/tasks/mmmu/mmmu.yaml → lm_eval/tasks/mmmu/_template_yaml
-dataset_path: lmms-lab/MMMU
+dataset_path: MMMU/MMMU
 task: "mmmu_val"
 validation_split: validation
 output_type: generate_until
-input_type: text_image
 doc_to_visual: !function utils.mmmu_doc_to_visual
 doc_to_text: !function utils.mmmu_doc_to_text
 doc_to_target: "answer"
 # The return value of process_results will be used by metrics
-process_results: !function utils.mmmu_process_results
+# process_results: !function utils.mmmu_process_results
 # Note that the metric name can be either a registed metric function (such as the case for GQA) or a key name returned by process_results
 generation_kwargs:
   until:
...
@@ -19,6 +18,7 @@ generation_kwargs:
   repetition_penalty: 1.0
   image_aspect_ratio: original
 metric_list:
-  - metric: mmmu_acc
-    aggregation: !function utils.mmmu_aggregate_results
-    higher_is_better: true
\ No newline at end of file
+  - metric: acc
+  # - metric: mmmu_acc
+  # aggregation: !function utils.mmmu_aggregate_results
+  # higher_is_better: true
lm_eval/tasks/mmmu/mmmu_electronics.yaml
0 → 100644
task: mmmu_electronics
include: _mmmu_mc_yaml
dataset_name: Electronics
\ No newline at end of file
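The task files in this commit chain through `include`: mmmu_electronics.yaml includes _mmmu_mc_yaml, which in turn includes _template_yaml. The sketch below illustrates how such a chain could resolve, assuming that included keys are loaded first and the including file's keys override them. The `CONFIGS` dict and the `resolve` helper are hypothetical stand-ins written for this illustration, not the harness's own loader, and only a few keys from each file are mirrored.

```python
# Hypothetical, simplified stand-in for include resolution.
# Each dict mirrors a subset of one YAML file from this commit.
CONFIGS = {
    "_template_yaml": {
        "dataset_path": "MMMU/MMMU",
        "task": "mmmu_val",
        "output_type": "generate_until",
    },
    "_mmmu_mc_yaml": {
        "include": "_template_yaml",
        "process_docs": "utils.process_multiple_choice",
    },
    "mmmu_electronics.yaml": {
        "task": "mmmu_electronics",
        "include": "_mmmu_mc_yaml",
        "dataset_name": "Electronics",
    },
}


def resolve(name, configs=CONFIGS):
    """Return the config for `name` with its `include` chain merged in."""
    config = dict(configs[name])
    base_name = config.pop("include", None)
    if base_name is None:
        return config
    merged = resolve(base_name, configs)  # load the included file first
    merged.update(config)  # assumption: the including file's keys win
    return merged


electronics = resolve("mmmu_electronics.yaml")
print(electronics["task"], electronics["dataset_path"], electronics["dataset_name"])
```

Under that assumption, the subject file only needs to state what differs from the template (`task` and `dataset_name`), and everything else is inherited.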
lm_eval/tasks/mmmu/mmmu_yaml
0 → 100644
group: mmmu
task:
- mmmu_mc
- mmmu_open
\ No newline at end of file
lm_eval/tasks/mmmu/utils.py
...
@@ -475,3 +475,9 @@ def get_multi_choice_info(options):
         all_choices.append(chr(ord(start_chr) + i))
 
     return index2ans, all_choices
+
+def process_multiple_choice(dataset):
+    return dataset.filter(lambda example: example["question_type"] == "multiple-choice")
+
+def process_open_choice(dataset):
+    return dataset.filter(lambda example: example["question_type"] == "open")
\ No newline at end of file
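The two helpers added to utils.py split the documents by their `question_type` field. The following is a minimal sketch of that behavior, assuming only that the dataset object exposes a `filter(predicate)` method that returns a new dataset. `TinyDataset` and the sample rows are hypothetical, made up for illustration so the example runs without any third-party library; the two `process_*` functions follow the commit.

```python
class TinyDataset:
    """Hypothetical stand-in exposing only a filter(predicate) method."""

    def __init__(self, rows):
        self.rows = list(rows)

    def filter(self, predicate):
        # Keep only the rows for which the predicate returns True.
        return TinyDataset(row for row in self.rows if predicate(row))


def process_multiple_choice(dataset):
    return dataset.filter(lambda example: example["question_type"] == "multiple-choice")


def process_open_choice(dataset):
    return dataset.filter(lambda example: example["question_type"] == "open")


# Made-up rows; only the "question_type" field matters to the helpers.
docs = TinyDataset([
    {"id": "q1", "question_type": "multiple-choice"},
    {"id": "q2", "question_type": "open"},
    {"id": "q3", "question_type": "multiple-choice"},
])

mc_ids = [row["id"] for row in process_multiple_choice(docs).rows]
open_ids = [row["id"] for row in process_open_choice(docs).rows]
print(mc_ids, open_ids)
```

Because each helper returns a filtered dataset rather than mutating its input, the same source split can feed both the multiple-choice and the open-ended task configs.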