gaoqiong / lm-evaluation-harness
Commit f38c7469
authored Dec 27, 2023 by lintangsutawika
split to easy and challenge
parent 56abc3a1
Changes: 64
Showing 20 changed files with 98 additions and 0 deletions
lm_eval/tasks/arc/alternative_worlds/arc_challenge/output_variation/style_06/c.yaml  +6 -0
lm_eval/tasks/arc/alternative_worlds/arc_challenge/output_variation/style_07/a.yaml  +6 -0
lm_eval/tasks/arc/alternative_worlds/arc_challenge/output_variation/style_07/b.yaml  +6 -0
lm_eval/tasks/arc/alternative_worlds/arc_challenge/output_variation/style_07/c.yaml  +6 -0
lm_eval/tasks/arc/alternative_worlds/arc_challenge/output_variation/style_08/a.yaml  +6 -0
lm_eval/tasks/arc/alternative_worlds/arc_challenge/output_variation/style_08/b.yaml  +6 -0
lm_eval/tasks/arc/alternative_worlds/arc_challenge/output_variation/style_08/c.yaml  +6 -0
lm_eval/tasks/arc/alternative_worlds/arc_challenge/output_variation/styles.py  +0 -0
lm_eval/tasks/arc/alternative_worlds/arc_challenge/prompt_variation/_arc_challenge_alt_yaml  +21 -0
lm_eval/tasks/arc/alternative_worlds/arc_challenge/prompt_variation/style_01.yaml  +5 -0
lm_eval/tasks/arc/alternative_worlds/arc_challenge/prompt_variation/style_02.yaml  +5 -0
lm_eval/tasks/arc/alternative_worlds/arc_challenge/prompt_variation/style_03.yaml  +5 -0
lm_eval/tasks/arc/alternative_worlds/arc_easy/README.md  +20 -0
lm_eval/tasks/arc/alternative_worlds/arc_easy/output_variation/_arc_easy_alt_yaml  +0 -0
lm_eval/tasks/arc/alternative_worlds/arc_easy/output_variation/arc_easy_alt.yaml  +0 -0
lm_eval/tasks/arc/alternative_worlds/arc_easy/output_variation/style_01/a.yaml  +0 -0
lm_eval/tasks/arc/alternative_worlds/arc_easy/output_variation/style_01/b.yaml  +0 -0
lm_eval/tasks/arc/alternative_worlds/arc_easy/output_variation/style_01/c.yaml  +0 -0
lm_eval/tasks/arc/alternative_worlds/arc_easy/output_variation/style_02/a.yaml  +0 -0
lm_eval/tasks/arc/alternative_worlds/arc_easy/output_variation/style_02/b.yaml  +0 -0
lm_eval/tasks/arc/alternative_worlds/arc_challenge/output_variation/style_06/c.yaml
0 → 100644
include: ../_arc_challenge_alt_yaml
group: arc_challenge_alt_ov_06
task: arc_challenge_alt_ov_06c
doc_to_text: !function ../styles.template_06
doc_to_choice: !function ../styles.choice_06c
doc_to_decontamination_query: !function ../styles.template_06
lm_eval/tasks/arc/alternative_worlds/arc_challenge/output_variation/style_07/a.yaml
0 → 100644
include: ../_arc_challenge_alt_yaml
group: arc_challenge_alt_ov_07
task: arc_challenge_alt_ov_07a
doc_to_text: !function ../styles.template_07
doc_to_choice: !function ../styles.choice_07a
doc_to_decontamination_query: !function ../styles.template_07
lm_eval/tasks/arc/alternative_worlds/arc_challenge/output_variation/style_07/b.yaml
0 → 100644
include: ../_arc_challenge_alt_yaml
group: arc_challenge_alt_ov_07
task: arc_challenge_alt_ov_07b
doc_to_text: !function ../styles.template_07
doc_to_choice: !function ../styles.choice_07b
doc_to_decontamination_query: !function ../styles.template_07
lm_eval/tasks/arc/alternative_worlds/arc_challenge/output_variation/style_07/c.yaml
0 → 100644
include: ../_arc_challenge_alt_yaml
group: arc_challenge_alt_ov_07
task: arc_challenge_alt_ov_07c
doc_to_text: !function ../styles.template_07
doc_to_choice: !function ../styles.choice_07c
doc_to_decontamination_query: !function ../styles.template_07
lm_eval/tasks/arc/alternative_worlds/arc_challenge/output_variation/style_08/a.yaml
0 → 100644
include: ../_arc_challenge_alt_yaml
group: arc_challenge_alt_ov_08
task: arc_challenge_alt_ov_08a
doc_to_text: !function ../styles.template_08
doc_to_choice: !function ../styles.choice_08a
doc_to_decontamination_query: !function ../styles.template_08
lm_eval/tasks/arc/alternative_worlds/arc_challenge/output_variation/style_08/b.yaml
0 → 100644
include: ../_arc_challenge_alt_yaml
group: arc_challenge_alt_ov_08
task: arc_challenge_alt_ov_08b
doc_to_text: !function ../styles.template_08
doc_to_choice: !function ../styles.choice_08b
doc_to_decontamination_query: !function ../styles.template_08
lm_eval/tasks/arc/alternative_worlds/arc_challenge/output_variation/style_08/c.yaml
0 → 100644
include: ../_arc_challenge_alt_yaml
group: arc_challenge_alt_ov_08
task: arc_challenge_alt_ov_08c
doc_to_text: !function ../styles.template_08
doc_to_choice: !function ../styles.choice_08c
doc_to_decontamination_query: !function ../styles.template_08
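Each output-variation file above points `doc_to_text` and `doc_to_choice` at callables in `styles.py`, which this commit only moves and does not display. As a rough sketch of the shape such callables could take, assuming each one receives a single ARC document as a dict with the `question`, `choices.text` and `choices.label` fields used by the prompt-variation config in this commit: the function names and the prompt layout below are hypothetical and are not the contents of `styles.py`.

```python
# Hypothetical stand-ins for functions a styles.py module might expose.
# None of these names or layouts are taken from the real file.
LETTERS = "ABCDEFGH"


def template_example(doc):
    """Hypothetical doc_to_text: the question followed by lettered options."""
    lines = [f"Question: {doc['question']}"]
    for letter, text in zip(LETTERS, doc["choices"]["text"]):
        lines.append(f"({letter}) {text}")
    lines.append("Answer:")
    return "\n".join(lines)


def choice_example_a(doc):
    """Hypothetical doc_to_choice variant: score the original option text."""
    return list(doc["choices"]["text"])


def choice_example_b(doc):
    """Hypothetical doc_to_choice variant: score only the option letter."""
    return [f"({letter})" for letter, _ in zip(LETTERS, doc["choices"]["text"])]


# A made-up document in the assumed shape.
sample_doc = {
    "question": "Which object is attracted to a magnet?",
    "choices": {
        "text": ["a glass cup", "an iron nail", "a rubber band"],
        "label": ["A", "B", "C"],
    },
    "answerKey": "B",
}

prompt = template_example(sample_doc)
```

Under this assumption, the `a`, `b`, `c` files of one style would share a prompt template and differ only in which list of candidate continuations is scored.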
lm_eval/tasks/arc/alternative_worlds/output_variation/styles.py → lm_eval/tasks/arc/alternative_worlds/arc_challenge/output_variation/styles.py
File moved
lm_eval/tasks/arc/alternative_worlds/arc_challenge/prompt_variation/_arc_challenge_alt_yaml
0 → 100644
dataset_path: ai2_arc
dataset_name: ARC-Challenge
output_type: multiple_choice
training_split: train
validation_split: validation
test_split: test
doc_to_text: "Question: {{question}}\nAnswer:"
doc_to_target: "{{choices.label.index(answerKey)}}"
doc_to_choice: "{{choices.text}}"
should_decontaminate: true
doc_to_decontamination_query: "Question: {{question}}\nAnswer:"
metric_list:
  - metric: acc
    aggregation: mean
    higher_is_better: true
  - metric: acc_norm
    aggregation: mean
    higher_is_better: true
  - metric: brier_score
    aggregation: brier_score
    higher_is_better: false
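To make the config above concrete, here is a minimal plain-Python sketch of what `doc_to_target` evaluates to and of how the three listed metrics are commonly defined for a multiple-choice task. It assumes `acc_norm` normalises each log-likelihood by the character length of its choice and that the Brier score is the squared distance between the softmaxed scores and a one-hot gold vector; the harness's own implementation may differ in detail, and the helper names are illustrative.

```python
import math


def target_index(doc):
    """Mirror of doc_to_target: position of answerKey among the choice labels."""
    return doc["choices"]["label"].index(doc["answerKey"])


def softmax(scores):
    """Turn per-choice log-likelihoods into probabilities."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def argmax(values):
    """Index of the largest value."""
    return max(range(len(values)), key=lambda i: values[i])


def acc(loglikelihoods, gold):
    """1.0 if the highest-scoring choice is the gold one."""
    return 1.0 if argmax(loglikelihoods) == gold else 0.0


def acc_norm(loglikelihoods, choices, gold):
    """Like acc, but each score is divided by its choice length (assumed)."""
    normalized = [ll / len(c) for ll, c in zip(loglikelihoods, choices)]
    return 1.0 if argmax(normalized) == gold else 0.0


def brier_score(loglikelihoods, gold):
    """Squared error between predicted probabilities and the one-hot target."""
    probs = softmax(loglikelihoods)
    return sum((p - (1.0 if i == gold else 0.0)) ** 2 for i, p in enumerate(probs))


# Made-up document and scores, for illustration only.
doc = {
    "question": "Which gas do plants absorb?",
    "choices": {"text": ["oxygen", "carbon dioxide"], "label": ["A", "B"]},
    "answerKey": "B",
}
gold = target_index(doc)
scores = [-2.0, -3.0]  # hypothetical log-likelihoods, one per choice
```

With these made-up scores, `acc` favours the shorter first choice while `acc_norm` favours the longer gold choice, which is the kind of difference length normalisation is meant to expose.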
lm_eval/tasks/arc/alternative_worlds/arc_challenge/prompt_variation/style_01.yaml
0 → 100644
include: _arc_challenge_alt_yaml
group: arc_challenge_alt_pv
task: arc_challenge_alt_pv_01
doc_to_text: "{{question}}"
doc_to_decontamination_query: "{{question}}"
lm_eval/tasks/arc/alternative_worlds/arc_challenge/prompt_variation/style_02.yaml
0 → 100644
include: _arc_challenge_alt_yaml
group: arc_challenge_alt_pv
task: arc_challenge_alt_pv_02
doc_to_text: "Q: {{question}}\nA:"
doc_to_decontamination_query: "Q: {{question}}\nA:"
lm_eval/tasks/arc/alternative_worlds/arc_challenge/prompt_variation/style_03.yaml
0 → 100644
include: _arc_challenge_alt_yaml
group: arc_challenge_alt_pv
task: arc_challenge_alt_pv_03
doc_to_text: "Question: {{question}}\nAnswer:"
doc_to_decontamination_query: "Question: {{question}}\nAnswer:"
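The three prompt-variation tasks differ only in the `doc_to_text` template. The sketch below renders each template for one made-up question so the variants can be compared side by side. It uses `str.replace` as a minimal stand-in for the template engine, and the exact spacing inside the templates is an assumption reconstructed from the flattened diff.

```python
# Templates keyed by task name; spacing is reconstructed, not verified.
PROMPT_STYLES = {
    "arc_challenge_alt_pv_01": "{{question}}",
    "arc_challenge_alt_pv_02": "Q: {{question}}\nA:",
    "arc_challenge_alt_pv_03": "Question: {{question}}\nAnswer:",
}


def render(template, doc):
    """Substitute the single {{question}} placeholder (stand-in for real templating)."""
    return template.replace("{{question}}", doc["question"])


# Made-up document, for illustration only.
doc = {"question": "What causes tides?"}
rendered = {task: render(tpl, doc) for task, tpl in PROMPT_STYLES.items()}

for task, text in rendered.items():
    print(f"--- {task} ---")
    print(text)
```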
lm_eval/tasks/arc/alternative_worlds/arc_easy/README.md
0 → 100644
Investigate the effect of letter options
- (A)
- A)
- A.
- A\t
- (a)
- a)
- a.
- a\t

Answer types:
- letters only
  - original option
  - just letter
- letters + continuation
  - original option
  - just letter
- continuation
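The eight letter formats listed in the README can be generated mechanically. The following is a small sketch under that reading of the list; the helper name and the dictionary keys are illustrative and do not come from the repository.

```python
import string


def label_variants(n_choices):
    """Return the eight letter-option formats from the README for n_choices options."""
    upper = string.ascii_uppercase[:n_choices]
    lower = string.ascii_lowercase[:n_choices]
    return {
        "(A)": [f"({c})" for c in upper],
        "A)": [f"{c})" for c in upper],
        "A.": [f"{c}." for c in upper],
        "A\\t": [f"{c}\t" for c in upper],  # letter followed by a tab character
        "(a)": [f"({c})" for c in lower],
        "a)": [f"{c})" for c in lower],
        "a.": [f"{c}." for c in lower],
        "a\\t": [f"{c}\t" for c in lower],
    }


variants = label_variants(4)
```

Each list can then be zipped with a document's option texts to build either the prompt or the candidate answers for one style.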
lm_eval/tasks/arc/alternative_worlds/output_variation/_arc_easy_alt_yaml → lm_eval/tasks/arc/alternative_worlds/arc_easy/output_variation/_arc_easy_alt_yaml
File moved
lm_eval/tasks/arc/alternative_worlds/output_variation/arc_easy_alt.yaml → lm_eval/tasks/arc/alternative_worlds/arc_easy/output_variation/arc_easy_alt.yaml
File moved
lm_eval/tasks/arc/alternative_worlds/output_variation/style_01/a.yaml → lm_eval/tasks/arc/alternative_worlds/arc_easy/output_variation/style_01/a.yaml
File moved
lm_eval/tasks/arc/alternative_worlds/output_variation/style_01/b.yaml → lm_eval/tasks/arc/alternative_worlds/arc_easy/output_variation/style_01/b.yaml
File moved
lm_eval/tasks/arc/alternative_worlds/output_variation/style_01/c.yaml → lm_eval/tasks/arc/alternative_worlds/arc_easy/output_variation/style_01/c.yaml
File moved
lm_eval/tasks/arc/alternative_worlds/output_variation/style_02/a.yaml → lm_eval/tasks/arc/alternative_worlds/arc_easy/output_variation/style_02/a.yaml
File moved
lm_eval/tasks/arc/alternative_worlds/output_variation/style_02/b.yaml → lm_eval/tasks/arc/alternative_worlds/arc_easy/output_variation/style_02/b.yaml
File moved