gaoqiong / lm-evaluation-harness
Commit 6a72f627
Authored Dec 10, 2024 by Baber
use `allenai/ai2_arc`
Parent: 6d0c60d7
Changes: 2
Showing 2 changed files with 12 additions and 32 deletions
lm_eval/tasks/llama3/base/arc_challenge.yaml  +12 -8
lm_eval/tasks/llama3/base/arc_easy.yaml  +0 -24
lm_eval/tasks/llama3/base/arc_challenge.yaml @ 6a72f627
 tag:
-  - llama3
+  - llama
 task: llama_arc_challenge
-dataset_path: meta-llama/Llama-3.1-8B-evals
-dataset_name: Llama-3.1-8B-evals__arc_challenge__details
+dataset_path: allenai/ai2_arc
+dataset_name: ARC-Challenge
 output_type: multiple_choice
-test_split: latest
-process_docs: !function utils.process_arc_c_docs
-doc_to_text: "{{doc_to_text}}"
-doc_to_target: "{{doc_to_target}}"
-doc_to_choice: "{{doc_to_choice}}"
+training_split: train
+validation_split: validation
+test_split: test
+fewshot_split: train
+doc_to_text: "Question: {{question.strip()}}\nA. {{choices.text[0]}}\nB. {{choices.text[1]}}\nC. {{choices.text[2]}}{% if choices.text|length > 3 %}\nD. {{choices.text[3]}}{% endif %}\nAnswer:"
+fewshot_delimiter: "\n\n"
+doc_to_target: "{{ 'ABCD'[answerKey|int - 1] if answerKey|string in '1234' else answerKey }}"
+doc_to_choice: "{{ choices.label|map('replace', '1', 'A')|map('replace', '2', 'B')|map('replace', '3', 'C')|map('replace', '4', 'D')|list if choices.label[0] in '1234' else choices.label }}"
+num_fewshot: 25
 metric_list:
   - metric: acc
     aggregation: mean
...
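To make the new inline `doc_to_text` template easier to review, here is a minimal plain-Python sketch of the prompt it appears to produce. It does not call the harness or Jinja. `render_prompt` and `example_doc` are hypothetical names introduced for illustration, and the field names (`question`, `choices.text`) are taken from the template itself.

```python
def render_prompt(doc):
    """Plain-Python analogue of the doc_to_text template (illustrative only)."""
    texts = doc["choices"]["text"]
    lines = [
        "Question: " + doc["question"].strip(),
        "A. " + texts[0],
        "B. " + texts[1],
        "C. " + texts[2],
    ]
    # The template only emits option D when a fourth choice exists.
    if len(texts) > 3:
        lines.append("D. " + texts[3])
    lines.append("Answer:")
    return "\n".join(lines)


# Hypothetical three-choice document, not taken from the dataset.
example_doc = {
    "question": "  Which gas do plants absorb from the air? ",
    "choices": {"text": ["Oxygen", "Carbon dioxide", "Nitrogen"]},
}
prompt = render_prompt(example_doc)
```

As in the template, a fifth choice would be silently dropped, so any item with more than four options would need separate handling.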
lm_eval/tasks/llama3/base/arc_easy.yaml (deleted, 100644 → 0) @ 6d0c60d7
-tag:
-  - llama
-task: arc_challenge_chat
-dataset_path: allenai/ai2_arc
-dataset_name: ARC-Challenge
-output_type: multiple_choice
-training_split: train
-validation_split: validation
-test_split: test
-#doc_to_text: "Question: {{question}}\nAnswer:"
-doc_to_text: "Question: {{question.strip()}}\nA. {{choices.text[0]}}\nB. {{choices.text[1]}}\nC. {{choices.text[2]}}{% if choices.text|length > 3 %}\nD. {{choices.text[3]}}{% endif %}\nAnswer:"
-fewshot_delimiter: "\n\n"
-doc_to_target: "{{answerKey}}"
-doc_to_choice: "{{choices.label}}"
-num_fewshot: 25
-metric_list:
-  - metric: acc
-    aggregation: mean
-    higher_is_better: true
-  - metric: acc_norm
-    aggregation: mean
-    higher_is_better: true
-metadata:
-  version: 1.0
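The deleted config passed `answerKey` and `choices.label` through unchanged, whereas `arc_challenge.yaml` in this commit maps numeric labels to letters. The sketch below mirrors that mapping in plain Python, on the assumption that some ARC items label their choices `1` to `4` instead of `A` to `D`. `normalize_target` and `normalize_choices` are hypothetical helper names, not part of the harness. They also use an exact membership test, which is slightly stricter than the template's substring check `in '1234'`.

```python
NUM_TO_LETTER = {"1": "A", "2": "B", "3": "C", "4": "D"}


def normalize_target(answer_key):
    """Analogue of doc_to_target: map '1'..'4' to 'A'..'D', else pass through."""
    key = str(answer_key)
    # Exact match here; the Jinja template uses a substring test instead.
    return NUM_TO_LETTER.get(key, key)


def normalize_choices(labels):
    """Analogue of doc_to_choice: relabel only when the first label is numeric."""
    if labels and labels[0] in NUM_TO_LETTER:
        return [NUM_TO_LETTER.get(label, label) for label in labels]
    return list(labels)
```

With both fields normalized the same way, the target letter stays a member of the choice list whichever labeling scheme the source item uses.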