gaoqiong / lm-evaluation-harness
Commit 2106fbeb, authored Jan 15, 2025 by Baber
Merge branch 'main' into mathvista
# Conflicts:
#	lm_eval/models/openai_completions.py
Parents: 4354fe46, 703fbffd
Changes: 574
Showing 20 changed files with 91 additions and 0 deletions (+91 -0)
lm_eval/tasks/kbl/bar_exam/public/kbl_bar_exam_em_public_2020.yaml (+3 -0)
lm_eval/tasks/kbl/bar_exam/public/kbl_bar_exam_em_public_2021.yaml (+3 -0)
lm_eval/tasks/kbl/bar_exam/public/kbl_bar_exam_em_public_2022.yaml (+3 -0)
lm_eval/tasks/kbl/bar_exam/public/kbl_bar_exam_em_public_2023.yaml (+3 -0)
lm_eval/tasks/kbl/bar_exam/public/kbl_bar_exam_em_public_2024.yaml (+3 -0)
lm_eval/tasks/kbl/bar_exam/responsibility/_base_em_yaml (+34 -0)
lm_eval/tasks/kbl/bar_exam/responsibility/kbl_bar_exam_em_responsibility_2010.yaml (+3 -0)
lm_eval/tasks/kbl/bar_exam/responsibility/kbl_bar_exam_em_responsibility_2011.yaml (+3 -0)
lm_eval/tasks/kbl/bar_exam/responsibility/kbl_bar_exam_em_responsibility_2012.yaml (+3 -0)
lm_eval/tasks/kbl/bar_exam/responsibility/kbl_bar_exam_em_responsibility_2013.yaml (+3 -0)
lm_eval/tasks/kbl/bar_exam/responsibility/kbl_bar_exam_em_responsibility_2014.yaml (+3 -0)
lm_eval/tasks/kbl/bar_exam/responsibility/kbl_bar_exam_em_responsibility_2015.yaml (+3 -0)
lm_eval/tasks/kbl/bar_exam/responsibility/kbl_bar_exam_em_responsibility_2016.yaml (+3 -0)
lm_eval/tasks/kbl/bar_exam/responsibility/kbl_bar_exam_em_responsibility_2017.yaml (+3 -0)
lm_eval/tasks/kbl/bar_exam/responsibility/kbl_bar_exam_em_responsibility_2018.yaml (+3 -0)
lm_eval/tasks/kbl/bar_exam/responsibility/kbl_bar_exam_em_responsibility_2019.yaml (+3 -0)
lm_eval/tasks/kbl/bar_exam/responsibility/kbl_bar_exam_em_responsibility_2020.yaml (+3 -0)
lm_eval/tasks/kbl/bar_exam/responsibility/kbl_bar_exam_em_responsibility_2021.yaml (+3 -0)
lm_eval/tasks/kbl/bar_exam/responsibility/kbl_bar_exam_em_responsibility_2022.yaml (+3 -0)
lm_eval/tasks/kbl/bar_exam/responsibility/kbl_bar_exam_em_responsibility_2023.yaml (+3 -0)
lm_eval/tasks/kbl/bar_exam/public/kbl_bar_exam_em_public_2020.yaml
0 → 100644
task: kbl_bar_exam_em_public_2020
dataset_name: bar_exam_public_2020
include: _base_em_yaml
lm_eval/tasks/kbl/bar_exam/public/kbl_bar_exam_em_public_2021.yaml
0 → 100644
task: kbl_bar_exam_em_public_2021
dataset_name: bar_exam_public_2021
include: _base_em_yaml
lm_eval/tasks/kbl/bar_exam/public/kbl_bar_exam_em_public_2022.yaml
0 → 100644
task: kbl_bar_exam_em_public_2022
dataset_name: bar_exam_public_2022
include: _base_em_yaml
lm_eval/tasks/kbl/bar_exam/public/kbl_bar_exam_em_public_2023.yaml
0 → 100644
task: kbl_bar_exam_em_public_2023
dataset_name: bar_exam_public_2023
include: _base_em_yaml
lm_eval/tasks/kbl/bar_exam/public/kbl_bar_exam_em_public_2024.yaml
0 → 100644
task: kbl_bar_exam_em_public_2024
dataset_name: bar_exam_public_2024
include: _base_em_yaml
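Each per-year file above sets only `task` and `dataset_name` and pulls everything else from `_base_em_yaml` through `include`. The following is a minimal sketch of that layering, assuming the included file is loaded first and the including file's keys are overlaid on top. `resolve_include` and `base_configs` are hypothetical names for illustration, not the harness's loader, and the base values are a stand-in borrowed from the responsibility base file in this commit, since the public base file is not part of this page.

```python
def resolve_include(child: dict, base_configs: dict) -> dict:
    """Overlay a child task config on the base config named by its `include` key."""
    child = dict(child)  # copy so the caller's dict is not mutated
    base_name = child.pop("include", None)
    merged = dict(base_configs.get(base_name, {})) if base_name else {}
    merged.update(child)  # keys set in the child take precedence over the base
    return merged


# Hypothetical stand-in for the parsed contents of a _base_em_yaml file.
base_configs = {
    "_base_em_yaml": {
        "dataset_path": "lbox/kbl",
        "test_split": "test",
        "output_type": "generate_until",
    }
}

child = {
    "task": "kbl_bar_exam_em_public_2024",
    "dataset_name": "bar_exam_public_2024",
    "include": "_base_em_yaml",
}

config = resolve_include(child, base_configs)
print(config)
```

Under this assumption, the merged config carries the shared settings plus the per-year `task` and `dataset_name`, which is why each year needs only three lines.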
lm_eval/tasks/kbl/bar_exam/responsibility/_base_em_yaml
0 → 100644
tag:
- kbl
- kbl_bar_exam_em
- kbl_bar_exam_em_responsibility
# English: "You are a legal expert chatbot that answers users' questions kindly and logically."
description: '당신은 사용자의 질문에 친절하고 논리적으로 답변해 주는 법률 전문가 챗봇 입니다.\n'
dataset_path: lbox/kbl
test_split: test
output_type: generate_until
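# English: "### Question: {{question}}" / "Read each option below, choose one of A, B, C, D, and give a short answer in the form 'Answer: A'." / "### Answer:"
doc_to_text: '### 질문: {{question}}
  다음 각 선택지를 읽고 A, B, C, D 중 하나를 선택하여 ''답변: A'' 와 같이 단답식으로 답해 주세요.
  A. {{A}}
  B. {{B}}
  C. {{C}}
  D. {{D}}
  ### 답변:'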
doc_to_target: gt
metric_list:
- metric: exact_match
  aggregation: mean
  higher_is_better: true
  ignore_case: true
  ignore_punctuation: true
filter_list:
- name: get-answer
  filter:
  - function: regex
    regex_pattern: ([A-D]).*
  - function: take_first
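The `filter_list` and `metric_list` entries above describe a two-step scoring path: pull an answer letter out of the generated text, then compare it to the `gt` field by exact match. Below is a rough sketch of that path using only the standard library. It assumes search-style matching that keeps the first capture group, an `"[invalid]"` fallback when nothing matches, and simple lowercase and punctuation stripping for `ignore_case` and `ignore_punctuation`. The function names and sample responses are made up for illustration and are not the harness's own implementation.

```python
import re
import string

# Same pattern as regex_pattern in the config: capture the first capital A-D.
ANSWER_PATTERN = re.compile(r"([A-D]).*")
_PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def extract_answer(response: str) -> str:
    """Return the first captured letter, or a sentinel if no letter is found."""
    match = ANSWER_PATTERN.search(response)
    return match.group(1) if match else "[invalid]"


def normalize(text: str) -> str:
    """Approximate ignore_case and ignore_punctuation."""
    return text.lower().translate(_PUNCT_TABLE).strip()


def exact_match(prediction: str, target: str) -> float:
    return float(normalize(prediction) == normalize(target))


# Made-up model outputs and gold labels.
responses = ["A", "답변: B", "C. because ...", "no letter here"]
targets = ["A", "B", "D", "A"]

scores = [exact_match(extract_answer(r), t) for r, t in zip(responses, targets)]
accuracy = sum(scores) / len(scores)  # mirrors aggregation: mean
print(scores, accuracy)
```

One consequence of this reading of the pattern: any capital A, B, C, or D that appears before the intended answer (for example, at the start of an English sentence) would be taken as the answer, so the prompt's request for a short answer matters for scoring.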
lm_eval/tasks/kbl/bar_exam/responsibility/kbl_bar_exam_em_responsibility_2010.yaml
0 → 100644
task: kbl_bar_exam_em_responsibility_2010
dataset_name: bar_exam_responsibility_2010
include: _base_em_yaml
lm_eval/tasks/kbl/bar_exam/responsibility/kbl_bar_exam_em_responsibility_2011.yaml
0 → 100644
task: kbl_bar_exam_em_responsibility_2011
dataset_name: bar_exam_responsibility_2011
include: _base_em_yaml
lm_eval/tasks/kbl/bar_exam/responsibility/kbl_bar_exam_em_responsibility_2012.yaml
0 → 100644
task: kbl_bar_exam_em_responsibility_2012
dataset_name: bar_exam_responsibility_2012
include: _base_em_yaml
lm_eval/tasks/kbl/bar_exam/responsibility/kbl_bar_exam_em_responsibility_2013.yaml
0 → 100644
task: kbl_bar_exam_em_responsibility_2013
dataset_name: bar_exam_responsibility_2013
include: _base_em_yaml
lm_eval/tasks/kbl/bar_exam/responsibility/kbl_bar_exam_em_responsibility_2014.yaml
0 → 100644
task: kbl_bar_exam_em_responsibility_2014
dataset_name: bar_exam_responsibility_2014
include: _base_em_yaml
lm_eval/tasks/kbl/bar_exam/responsibility/kbl_bar_exam_em_responsibility_2015.yaml
0 → 100644
task: kbl_bar_exam_em_responsibility_2015
dataset_name: bar_exam_responsibility_2015
include: _base_em_yaml
lm_eval/tasks/kbl/bar_exam/responsibility/kbl_bar_exam_em_responsibility_2016.yaml
0 → 100644
task: kbl_bar_exam_em_responsibility_2016
dataset_name: bar_exam_responsibility_2016
include: _base_em_yaml
lm_eval/tasks/kbl/bar_exam/responsibility/kbl_bar_exam_em_responsibility_2017.yaml
0 → 100644
task: kbl_bar_exam_em_responsibility_2017
dataset_name: bar_exam_responsibility_2017
include: _base_em_yaml
lm_eval/tasks/kbl/bar_exam/responsibility/kbl_bar_exam_em_responsibility_2018.yaml
0 → 100644
task: kbl_bar_exam_em_responsibility_2018
dataset_name: bar_exam_responsibility_2018
include: _base_em_yaml
lm_eval/tasks/kbl/bar_exam/responsibility/kbl_bar_exam_em_responsibility_2019.yaml
0 → 100644
task: kbl_bar_exam_em_responsibility_2019
dataset_name: bar_exam_responsibility_2019
include: _base_em_yaml
lm_eval/tasks/kbl/bar_exam/responsibility/kbl_bar_exam_em_responsibility_2020.yaml
0 → 100644
task: kbl_bar_exam_em_responsibility_2020
dataset_name: bar_exam_responsibility_2020
include: _base_em_yaml
lm_eval/tasks/kbl/bar_exam/responsibility/kbl_bar_exam_em_responsibility_2021.yaml
0 → 100644
task: kbl_bar_exam_em_responsibility_2021
dataset_name: bar_exam_responsibility_2021
include: _base_em_yaml
lm_eval/tasks/kbl/bar_exam/responsibility/kbl_bar_exam_em_responsibility_2022.yaml
0 → 100644
task: kbl_bar_exam_em_responsibility_2022
dataset_name: bar_exam_responsibility_2022
include: _base_em_yaml
lm_eval/tasks/kbl/bar_exam/responsibility/kbl_bar_exam_em_responsibility_2023.yaml
0 → 100644
task: kbl_bar_exam_em_responsibility_2023
dataset_name: bar_exam_responsibility_2023
include: _base_em_yaml
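The per-year files in this commit follow one naming pattern: `kbl_bar_exam_em_<subset>_<year>` for `task` and `bar_exam_<subset>_<year>` for `dataset_name`. As an illustration of that pattern only (the commit does not say how the files were produced), a small sketch that renders the same three-line bodies in memory. `render_task_yaml` and `task_files` are hypothetical helper names.

```python
def render_task_yaml(subset: str, year: int) -> str:
    """Render the three-line task file body for one subset and year."""
    return (
        f"task: kbl_bar_exam_em_{subset}_{year}\n"
        f"dataset_name: bar_exam_{subset}_{year}\n"
        "include: _base_em_yaml\n"
    )


def task_files(subset: str, years) -> dict:
    """Map each file name to its rendered contents, without touching disk."""
    return {
        f"kbl_bar_exam_em_{subset}_{year}.yaml": render_task_yaml(subset, year)
        for year in years
    }


# Year ranges taken from the file list in this commit.
files = {}
files.update(task_files("public", range(2020, 2025)))
files.update(task_files("responsibility", range(2010, 2024)))

for name in sorted(files):
    print(name)
```

Together with the single `_base_em_yaml` shown on this page, these 19 names account for the 20 changed files listed at the top.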