gaoqiong / lm-evaluation-harness
Commit 121b7096
authored May 02, 2022 by Fabrizio Milo
add pre-commit
parent 7a038118
Changes 732
Showing 20 changed files with 20 additions and 20 deletions
tests/testdata/hendrycksTest-public_relations-v0-loglikelihood +1 -1
tests/testdata/hendrycksTest-public_relations-v0-res.json +1 -1
tests/testdata/hendrycksTest-security_studies-v0-loglikelihood +1 -1
tests/testdata/hendrycksTest-security_studies-v0-res.json +1 -1
tests/testdata/hendrycksTest-sociology-v0-loglikelihood +1 -1
tests/testdata/hendrycksTest-sociology-v0-res.json +1 -1
tests/testdata/hendrycksTest-us_foreign_policy-v0-loglikelihood +1 -1
tests/testdata/hendrycksTest-us_foreign_policy-v0-res.json +1 -1
tests/testdata/hendrycksTest-virology-v0-loglikelihood +1 -1
tests/testdata/hendrycksTest-virology-v0-res.json +1 -1
tests/testdata/hendrycksTest-world_religions-v0-loglikelihood +1 -1
tests/testdata/hendrycksTest-world_religions-v0-res.json +1 -1
tests/testdata/iwslt17-ar-en-v0-greedy_until +1 -1
tests/testdata/iwslt17-ar-en-v0-res.json +1 -1
tests/testdata/iwslt17-en-ar-v0-greedy_until +1 -1
tests/testdata/iwslt17-en-ar-v0-res.json +1 -1
tests/testdata/lambada-v0-loglikelihood +1 -1
tests/testdata/lambada-v0-res.json +1 -1
tests/testdata/lambada_cloze-v0-loglikelihood +1 -1
tests/testdata/lambada_cloze-v0-res.json +1 -1
tests/testdata/hendrycksTest-public_relations-v0-loglikelihood
-ab70f500cf24e876f6ae6bdc27525a1d6074fa9b6ea97770255d9fc2559b36ff
+ab70f500cf24e876f6ae6bdc27525a1d6074fa9b6ea97770255d9fc2559b36ff
\ No newline at end of file
tests/testdata/hendrycksTest-public_relations-v0-res.json
-{"results": {"hendrycksTest-public_relations": {"acc": 0.3090909090909091, "acc_norm": 0.2636363636363636, "acc_norm_stderr": 0.04220224692971987, "acc_stderr": 0.044262946482000985}}, "versions": {"hendrycksTest-public_relations": 0}}
+{"results": {"hendrycksTest-public_relations": {"acc": 0.3090909090909091, "acc_norm": 0.2636363636363636, "acc_norm_stderr": 0.04220224692971987, "acc_stderr": 0.044262946482000985}}, "versions": {"hendrycksTest-public_relations": 0}}
\ No newline at end of file
tests/testdata/hendrycksTest-security_studies-v0-loglikelihood
-92dfffe2acf3278256486d3e1cf1edb5a739ad0a54c0f9c67695f7a411ed5f76
+92dfffe2acf3278256486d3e1cf1edb5a739ad0a54c0f9c67695f7a411ed5f76
\ No newline at end of file
tests/testdata/hendrycksTest-security_studies-v0-res.json
-{"results": {"hendrycksTest-security_studies": {"acc": 0.2979591836734694, "acc_norm": 0.2693877551020408, "acc_norm_stderr": 0.02840125202902294, "acc_stderr": 0.029279567411065674}}, "versions": {"hendrycksTest-security_studies": 0}}
+{"results": {"hendrycksTest-security_studies": {"acc": 0.2979591836734694, "acc_norm": 0.2693877551020408, "acc_norm_stderr": 0.02840125202902294, "acc_stderr": 0.029279567411065674}}, "versions": {"hendrycksTest-security_studies": 0}}
\ No newline at end of file
tests/testdata/hendrycksTest-sociology-v0-loglikelihood
-f99a3caece11169f2a5cc951001f92027104afd25d29b2a399883bd4bf118605
+f99a3caece11169f2a5cc951001f92027104afd25d29b2a399883bd4bf118605
\ No newline at end of file
tests/testdata/hendrycksTest-sociology-v0-res.json
-{"results": {"hendrycksTest-sociology": {"acc": 0.23383084577114427, "acc_norm": 0.24875621890547264, "acc_norm_stderr": 0.030567675938916707, "acc_stderr": 0.02992941540834838}}, "versions": {"hendrycksTest-sociology": 0}}
+{"results": {"hendrycksTest-sociology": {"acc": 0.23383084577114427, "acc_norm": 0.24875621890547264, "acc_norm_stderr": 0.030567675938916707, "acc_stderr": 0.02992941540834838}}, "versions": {"hendrycksTest-sociology": 0}}
\ No newline at end of file
tests/testdata/hendrycksTest-us_foreign_policy-v0-loglikelihood
-a1a338d0083a21054f74d36a296d6bd8e2e457327c0fd630bebcc61ed758044d
+a1a338d0083a21054f74d36a296d6bd8e2e457327c0fd630bebcc61ed758044d
\ No newline at end of file
tests/testdata/hendrycksTest-us_foreign_policy-v0-res.json
-{"results": {"hendrycksTest-us_foreign_policy": {"acc": 0.2, "acc_norm": 0.24, "acc_norm_stderr": 0.04292346959909283, "acc_stderr": 0.040201512610368445}}, "versions": {"hendrycksTest-us_foreign_policy": 0}}
+{"results": {"hendrycksTest-us_foreign_policy": {"acc": 0.2, "acc_norm": 0.24, "acc_norm_stderr": 0.04292346959909283, "acc_stderr": 0.040201512610368445}}, "versions": {"hendrycksTest-us_foreign_policy": 0}}
\ No newline at end of file
tests/testdata/hendrycksTest-virology-v0-loglikelihood
-0ffa491f7bad2abbb64ecd752a295729167599b3815238cab0ecf4cb08bba9b6
+0ffa491f7bad2abbb64ecd752a295729167599b3815238cab0ecf4cb08bba9b6
\ No newline at end of file
tests/testdata/hendrycksTest-virology-v0-res.json
-{"results": {"hendrycksTest-virology": {"acc": 0.27710843373493976, "acc_norm": 0.2710843373493976, "acc_norm_stderr": 0.03460579907553027, "acc_stderr": 0.034843315926805875}}, "versions": {"hendrycksTest-virology": 0}}
+{"results": {"hendrycksTest-virology": {"acc": 0.27710843373493976, "acc_norm": 0.2710843373493976, "acc_norm_stderr": 0.03460579907553027, "acc_stderr": 0.034843315926805875}}, "versions": {"hendrycksTest-virology": 0}}
\ No newline at end of file
tests/testdata/hendrycksTest-world_religions-v0-loglikelihood
-97a0f68ba30ea3a6ef1db1a2925c964b09ecc54455a0a930da083e52677815bd
+97a0f68ba30ea3a6ef1db1a2925c964b09ecc54455a0a930da083e52677815bd
\ No newline at end of file
tests/testdata/hendrycksTest-world_religions-v0-res.json
-{"results": {"hendrycksTest-world_religions": {"acc": 0.21637426900584794, "acc_norm": 0.22807017543859648, "acc_norm_stderr": 0.03218093795602357, "acc_stderr": 0.03158149539338734}}, "versions": {"hendrycksTest-world_religions": 0}}
+{"results": {"hendrycksTest-world_religions": {"acc": 0.21637426900584794, "acc_norm": 0.22807017543859648, "acc_norm_stderr": 0.03218093795602357, "acc_stderr": 0.03158149539338734}}, "versions": {"hendrycksTest-world_religions": 0}}
\ No newline at end of file
tests/testdata/iwslt17-ar-en-v0-greedy_until
-e94d310de91fad7ce36f4cf3305552020221482c5588f2efcefaa019893504f1
+e94d310de91fad7ce36f4cf3305552020221482c5588f2efcefaa019893504f1
\ No newline at end of file
tests/testdata/iwslt17-ar-en-v0-res.json
-{"results": {"iwslt17-ar-en": {"bleu": 0.0, "bleu_stderr": 0.0, "chrf": 0.015049895477752772, "chrf_stderr": 0.0002940315671893584, "ter": 1.0, "ter_stderr": 0.0}}, "versions": {"iwslt17-ar-en": 0}}
+{"results": {"iwslt17-ar-en": {"bleu": 0.0, "bleu_stderr": 0.0, "chrf": 0.015049895477752772, "chrf_stderr": 0.0002940315671893584, "ter": 1.0, "ter_stderr": 0.0}}, "versions": {"iwslt17-ar-en": 0}}
\ No newline at end of file
tests/testdata/iwslt17-en-ar-v0-greedy_until
-b20adbcd2c6d135e28600b427113532c5df624cb3a90e8c5e48715c09a3a38fa
+b20adbcd2c6d135e28600b427113532c5df624cb3a90e8c5e48715c09a3a38fa
\ No newline at end of file
tests/testdata/iwslt17-en-ar-v0-res.json
-{"results": {"iwslt17-en-ar": {"bleu": 0.0, "bleu_stderr": 0.0, "chrf": 0.0, "chrf_stderr": 0.0, "ter": 1.0, "ter_stderr": 0.0}}, "versions": {"iwslt17-en-ar": 0}}
+{"results": {"iwslt17-en-ar": {"bleu": 0.0, "bleu_stderr": 0.0, "chrf": 0.0, "chrf_stderr": 0.0, "ter": 1.0, "ter_stderr": 0.0}}, "versions": {"iwslt17-en-ar": 0}}
\ No newline at end of file
tests/testdata/lambada-v0-loglikelihood
-6829e6a8aa5922e6c92dd31403cc060f242dc0ede4a775e085a70da095ab2e20
+6829e6a8aa5922e6c92dd31403cc060f242dc0ede4a775e085a70da095ab2e20
\ No newline at end of file
tests/testdata/lambada-v0-res.json
-{"results": {"lambada": {"acc": 0.0, "acc_stderr": 0.0, "ppl": 1.6479047769869253, "ppl_stderr": 0.006497321146240192}}, "versions": {"lambada": 0}}
+{"results": {"lambada": {"acc": 0.0, "acc_stderr": 0.0, "ppl": 1.6479047769869253, "ppl_stderr": 0.006497321146240192}}, "versions": {"lambada": 0}}
\ No newline at end of file
tests/testdata/lambada_cloze-v0-loglikelihood
-7655e748b63ae7e9911411d2d2a2577221d6c861ca4448509992541294d689f3
+7655e748b63ae7e9911411d2d2a2577221d6c861ca4448509992541294d689f3
\ No newline at end of file
tests/testdata/lambada_cloze-v0-res.json
-{"results": {"lambada_cloze": {"acc": 0.0, "acc_stderr": 0.0, "ppl": 1.6479047769869253, "ppl_stderr": 0.006497321146240192}}, "versions": {"lambada_cloze": 0}}
+{"results": {"lambada_cloze": {"acc": 0.0, "acc_stderr": 0.0, "ppl": 1.6479047769869253, "ppl_stderr": 0.006497321146240192}}, "versions": {"lambada_cloze": 0}}
\ No newline at end of file
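Every changed line in the `-res.json` files above shows the same visible content on both sides, which suggests the difference is whitespace only (most likely the end-of-file newline, given the `\ No newline at end of file` markers). A minimal Python sketch of how one might confirm that such a change leaves the parsed results untouched; the helper name `same_payload` is hypothetical and not part of this repository, and the sketch assumes the only difference is a trailing newline:

```python
import json

# Result line copied from tests/testdata/lambada-v0-res.json in this diff.
line = (
    '{"results": {"lambada": {"acc": 0.0, "acc_stderr": 0.0, '
    '"ppl": 1.6479047769869253, "ppl_stderr": 0.006497321146240192}}, '
    '"versions": {"lambada": 0}}'
)


def same_payload(old_text: str, new_text: str) -> bool:
    """Return True when two file bodies decode to equal JSON values.

    Hypothetical helper for illustration; json.loads ignores leading and
    trailing whitespace, so a newline-only change compares equal.
    """
    return json.loads(old_text) == json.loads(new_text)


# Assumption: one side of the diff ends without a newline, the other with one.
without_newline = line
with_newline = line + "\n"

whitespace_only = same_payload(without_newline, with_newline)
ppl = json.loads(with_newline)["results"]["lambada"]["ppl"]
```

This check applies only to the JSON fixtures. The `-loglikelihood` and `-greedy_until` files hold a bare hex digest, so a plain string comparison after stripping trailing whitespace would be the analogous check there.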