gaoqiong / lm-evaluation-harness
Commit b1746639, authored Oct 19, 2024 by Baber
add mathvista_mcq
Parent: 25869601
Changes: 2 changed files with 22 additions and 0 deletions
lm_eval/tasks/mathvista/mathvista_mcq.yaml  +15 -0
lm_eval/tasks/mathvista/utils.py  +7 -0
lm_eval/tasks/mathvista/mathvista_mcq.yaml (new file, 0 → 100644)

    include: mathvista.yaml
    task: mathvista_mcq
    output_type: "multiple_choice"
    process_docs: !function utils.process_docs_mcq
    doc_to_choice: '{{ ["A", "B", "C", "D", "E", "F"][:choices.length] }}'
    doc_to_target: "{{choices.index(answer)}}"
    metric_list:
      - metric: acc
        aggregation: mean
        higher_is_better: true
      - metric: acc_norm
        aggregation: mean
        higher_is_better: true
    metadata:
      version: 1.0
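
As a rough illustration of what the two templates in this config are meant to produce, the sketch below mirrors them in plain Python. It assumes each document carries a `choices` list and an `answer` string that appears in that list; the helper names and the example document are hypothetical and are not part of the commit. The sketch uses Python's `len` where the template writes `choices.length` (in Jinja2 a list's size is normally obtained with the `length` filter, e.g. `choices|length`).

```python
# Letter labels offered to the model, matching the list in doc_to_choice.
LETTERS = ["A", "B", "C", "D", "E", "F"]


def doc_to_choice(doc: dict) -> list[str]:
    """Return one letter label per answer option (at most six)."""
    return LETTERS[: len(doc["choices"])]


def doc_to_target(doc: dict) -> int:
    """Return the zero-based position of the gold answer in the options."""
    return doc["choices"].index(doc["answer"])


# Hypothetical document, invented for illustration only.
example_doc = {
    "question_type": "multi_choice",
    "choices": ["3", "4", "5", "6"],
    "answer": "5",
}

labels = doc_to_choice(example_doc)
target = doc_to_target(example_doc)
print(labels, target, labels[target])
```

With this example the gold answer "5" is the third option, so the target index points at the label "C".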
lm_eval/tasks/mathvista/utils.py

    @@ -143,3 +143,10 @@ def process_results(doc: dict, results: list[str]):
        )
        res = safe_equal(normalized_extraction, answer)
        return {"acc": 1.0} if res else {"acc": 0.0}


    ### MathVista MCQ ###
    def process_docs_mcq(dataset):
        return dataset.filter(lambda x: x["question_type"] == "multi_choice")
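
To make the effect of `process_docs_mcq` concrete, here is a minimal sketch that applies the same predicate to a plain list of dicts. It assumes that `dataset` in the commit is an object with a `filter` method (such as a Hugging Face `datasets.Dataset`); the list-based helper, the `pid` field, the sample rows, and the `"free_form"` label for non-multiple-choice questions are assumptions made for illustration.

```python
def process_docs_mcq_rows(rows: list[dict]) -> list[dict]:
    """Plain-Python stand-in for process_docs_mcq: keep multiple-choice rows only."""
    return [row for row in rows if row["question_type"] == "multi_choice"]


# Hypothetical rows; "free_form" is an assumed label for non-MCQ questions.
rows = [
    {"pid": "1", "question_type": "multi_choice", "choices": ["yes", "no"], "answer": "no"},
    {"pid": "2", "question_type": "free_form", "choices": None, "answer": "12"},
    {"pid": "3", "question_type": "multi_choice", "choices": ["1", "2", "3"], "answer": "3"},
]

kept = process_docs_mcq_rows(rows)
print([row["pid"] for row in kept])
```

Filtering before evaluation means every remaining document has a usable `choices` list, which the `doc_to_choice` and `doc_to_target` templates rely on.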