gaoqiong / lm-evaluation-harness
Commit b58e5556, authored Jul 27, 2025 by Baber
Merge branch 'main' into tasklist
# Conflicts:
#   pyproject.toml
Parents: 6e1866f5, 4f8195f1
Changes 340
Showing 20 changed files with 48 additions and 9 deletions
lm_eval/tasks/multiblimp/multiblimp_tpn.yaml +3 -0
lm_eval/tasks/multiblimp/multiblimp_ttc.yaml +3 -0
lm_eval/tasks/multiblimp/multiblimp_tur.yaml +3 -0
lm_eval/tasks/multiblimp/multiblimp_uig.yaml +3 -0
lm_eval/tasks/multiblimp/multiblimp_ukr.yaml +3 -0
lm_eval/tasks/multiblimp/multiblimp_urb.yaml +3 -0
lm_eval/tasks/multiblimp/multiblimp_urd.yaml +3 -0
lm_eval/tasks/multiblimp/multiblimp_uzb.yaml +3 -0
lm_eval/tasks/multiblimp/multiblimp_vep.yaml +3 -0
lm_eval/tasks/multiblimp/multiblimp_wbp.yaml +3 -0
lm_eval/tasks/multiblimp/multiblimp_wol.yaml +3 -0
lm_eval/tasks/multiblimp/multiblimp_xcl.yaml +3 -0
lm_eval/tasks/multiblimp/multiblimp_xnr.yaml +3 -0
lm_eval/tasks/multiblimp/multiblimp_xpg.yaml +3 -0
lm_eval/tasks/multiblimp/multiblimp_yrl.yaml +3 -0
lm_eval/tasks/mutual/mutual.yaml +0 -2
lm_eval/tasks/noreval/tatoeba/_tatoeba_yaml +0 -2
lm_eval/tasks/olaph/utils.py +3 -1
lm_eval/tasks/piqa/piqa.yaml +0 -2
lm_eval/tasks/portuguese_bench/flores_pt/_flores_common_yaml +0 -2
lm_eval/tasks/multiblimp/multiblimp_tpn.yaml
0 → 100644
+include: _template_yaml
+dataset_name: tpn
+task: multiblimp_tpn
lm_eval/tasks/multiblimp/multiblimp_ttc.yaml
0 → 100644
+include: _template_yaml
+dataset_name: ttc
+task: multiblimp_ttc
lm_eval/tasks/multiblimp/multiblimp_tur.yaml
0 → 100644
+include: _template_yaml
+dataset_name: tur
+task: multiblimp_tur
lm_eval/tasks/multiblimp/multiblimp_uig.yaml
0 → 100644
+include: _template_yaml
+dataset_name: uig
+task: multiblimp_uig
lm_eval/tasks/multiblimp/multiblimp_ukr.yaml
0 → 100644
+include: _template_yaml
+dataset_name: ukr
+task: multiblimp_ukr
lm_eval/tasks/multiblimp/multiblimp_urb.yaml
0 → 100644
+include: _template_yaml
+dataset_name: urb
+task: multiblimp_urb
lm_eval/tasks/multiblimp/multiblimp_urd.yaml
0 → 100644
+include: _template_yaml
+dataset_name: urd
+task: multiblimp_urd
lm_eval/tasks/multiblimp/multiblimp_uzb.yaml
0 → 100644
+include: _template_yaml
+dataset_name: uzb
+task: multiblimp_uzb
lm_eval/tasks/multiblimp/multiblimp_vep.yaml
0 → 100644
+include: _template_yaml
+dataset_name: vep
+task: multiblimp_vep
lm_eval/tasks/multiblimp/multiblimp_wbp.yaml
0 → 100644
+include: _template_yaml
+dataset_name: wbp
+task: multiblimp_wbp
lm_eval/tasks/multiblimp/multiblimp_wol.yaml
0 → 100644
+include: _template_yaml
+dataset_name: wol
+task: multiblimp_wol
lm_eval/tasks/multiblimp/multiblimp_xcl.yaml
0 → 100644
+include: _template_yaml
+dataset_name: xcl
+task: multiblimp_xcl
lm_eval/tasks/multiblimp/multiblimp_xnr.yaml
0 → 100644
+include: _template_yaml
+dataset_name: xnr
+task: multiblimp_xnr
lm_eval/tasks/multiblimp/multiblimp_xpg.yaml
0 → 100644
+include: _template_yaml
+dataset_name: xpg
+task: multiblimp_xpg
lm_eval/tasks/multiblimp/multiblimp_yrl.yaml
0 → 100644
+include: _template_yaml
+dataset_name: yrl
+task: multiblimp_yrl
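The fifteen multiblimp files added in this part of the commit differ only in the language code. As a rough illustration, stubs of this shape could be produced by a small script. The helper names `render_task_config` and `write_task_configs` are hypothetical and do not appear in the commit, and nothing in the diff says the files were actually generated this way.

```python
from pathlib import Path
import tempfile

# Language codes taken from the multiblimp files listed on this page of the diff.
LANG_CODES = [
    "tpn", "ttc", "tur", "uig", "ukr", "urb", "urd", "uzb",
    "vep", "wbp", "wol", "xcl", "xnr", "xpg", "yrl",
]


def render_task_config(code: str) -> str:
    """Return the three-line YAML body shared by each multiblimp_<code>.yaml file."""
    return (
        "include: _template_yaml\n"
        f"dataset_name: {code}\n"
        f"task: multiblimp_{code}\n"
    )


def write_task_configs(out_dir: Path, codes: list[str]) -> list[Path]:
    """Write one config per language code and return the created paths (hypothetical helper)."""
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for code in codes:
        path = out_dir / f"multiblimp_{code}.yaml"
        path.write_text(render_task_config(code), encoding="utf-8")
        paths.append(path)
    return paths


if __name__ == "__main__":
    # Write into a throwaway directory so the sketch never touches a real checkout.
    with tempfile.TemporaryDirectory() as tmp:
        created = write_task_configs(Path(tmp), LANG_CODES)
        print(f"wrote {len(created)} files")
```

Keeping the shared settings in `_template_yaml` and only the language code in each stub means a change to the template applies to every language at once.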
lm_eval/tasks/mutual/mutual.yaml
...
@@ -23,5 +23,3 @@ metric_list:
     higher_is_better: true
 metadata:
   version: 2.0
-dataset_kwargs:
-  trust_remote_code: true
lm_eval/tasks/noreval/tatoeba/_tatoeba_yaml
...
@@ -2,8 +2,6 @@ dataset_path: Helsinki-NLP/tatoeba_mt
 training_split: validation
 test_split: test
 output_type: generate_until
-dataset_kwargs:
-  trust_remote_code: true
 metric_list:
   - metric: bleu
     higher_is_better: true
...
lm_eval/tasks/olaph/utils.py
...
@@ -12,7 +12,9 @@ try:
 except (ModuleNotFoundError, ImportError):
     raise ModuleNotFoundError(
-        "Please install evaluation metrics via pip install evaluate and pip install bert-score",
+        "Please install evaluation metrics via pip install evaluate bert-score "
+        "rouge_score>=0.1.2 nltk absl-py "
+        "git+https://github.com/google-research/bleurt.git"
     )
 except Exception as e:
     raise RuntimeError(
...
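The hunk above rewrites the install hint raised when the optional metric dependencies are missing. A minimal sketch of the same guard pattern, written as a reusable function: `require_modules` and `INSTALL_HINT` are hypothetical names for illustration, not part of `lm_eval/tasks/olaph/utils.py`. Note that adjacent string literals are concatenated by Python, which is why each fragment of the message ends with a space.

```python
import importlib

# Install hint copied from the updated message in the olaph hunk.
INSTALL_HINT = (
    "Please install evaluation metrics via pip install evaluate bert-score "
    "rouge_score>=0.1.2 nltk absl-py "
    "git+https://github.com/google-research/bleurt.git"
)


def require_modules(names, hint=INSTALL_HINT):
    """Import each module in `names`; raise ModuleNotFoundError with the hint if one is missing.

    Hypothetical helper: the real file uses a plain try/except around its imports.
    """
    loaded = {}
    for name in names:
        try:
            loaded[name] = importlib.import_module(name)
        except ImportError as exc:  # ModuleNotFoundError is a subclass of ImportError
            raise ModuleNotFoundError(f"{hint} (missing: {name})") from exc
    return loaded


if __name__ == "__main__":
    # `json` ships with Python, so this call succeeds without extra installs.
    modules = require_modules(["json"])
    print(sorted(modules))
```

Chaining with `from exc` keeps the original import failure in the traceback, which helps when the module is installed but fails to import for another reason.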
lm_eval/tasks/piqa/piqa.yaml
...
@@ -19,5 +19,3 @@ metric_list:
     higher_is_better: true
 metadata:
   version: 1.0
-dataset_kwargs:
-  trust_remote_code: true
lm_eval/tasks/portuguese_bench/flores_pt/_flores_common_yaml
...
@@ -23,5 +23,3 @@ metric_list:
     higher_is_better: true
 metadata:
   version: 1.0
-dataset_kwargs:
-  trust_remote_code: true
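Four task configs on this page (mutual, tatoeba, piqa, flores_pt) drop the same two lines. A minimal sketch of that edit as a text transform, assuming the block sits at top level and `trust_remote_code: true` is its only entry: the function name `strip_trust_remote_code` is hypothetical, and the commit does not indicate that the change was scripted.

```python
import re

# Matches a top-level `dataset_kwargs:` block whose only entry is
# `trust_remote_code: true`. The negative lookahead refuses to match when
# another indented key follows, so blocks with extra kwargs are left alone.
_BLOCK = re.compile(
    r"^dataset_kwargs:[ \t]*\n"
    r"[ \t]+trust_remote_code:[ \t]*true[ \t]*(?:\n|\Z)"
    r"(?![ \t]+\S)",
    re.MULTILINE,
)


def strip_trust_remote_code(yaml_text: str) -> str:
    """Remove the two-line block when it is the only kwarg (hypothetical helper)."""
    return _BLOCK.sub("", yaml_text)


if __name__ == "__main__":
    before = (
        "metadata:\n"
        "  version: 1.0\n"
        "dataset_kwargs:\n"
        "  trust_remote_code: true\n"
    )
    print(strip_trust_remote_code(before), end="")
```

A regular expression is used here only to keep the sketch free of third-party dependencies; a YAML parser would be the safer choice for files with comments or unusual indentation.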