Commit 4d147bdd authored by Jonathan Tow

Merge branch 'master' of https://github.com/EleutherAI/lm-evaluation-harness into task-guide

parents 011cc891 dc937d4b
{"results": {"math_precalc": {"acc": 0.0, "acc_stderr": 0.0}}, "versions": {"math_precalc": 0}}
\ No newline at end of file
a45260e49f02c7cb8886b3746db4d388890860b202dd8a9f0267e3c324e0af13
\ No newline at end of file
{"results": {"mathqa": {"acc": 0.20770519262981574, "acc_norm": 0.2050251256281407, "acc_norm_stderr": 0.007390619359738901, "acc_stderr": 0.007426217631188539}}, "versions": {"mathqa": 0}}
\ No newline at end of file
1811808ef05afd5f30ffc3471622a3dd7a1b681b17a2f7616695ad6b2a45943c
\ No newline at end of file
{"results": {"mc_taco": {"em": 0.07732732732732733, "f1": 0.41600515965511614}}, "versions": {"mc_taco": 0}}
\ No newline at end of file
4fc7b56b8f1e37e38f4a052b227baec2df914c898c3405d3e994726ba4fba976
\ No newline at end of file
{"results": {"mnli": {"acc": 0.32868059093224655, "acc_stderr": 0.004741640290753859}}, "versions": {"mnli": 0}}
\ No newline at end of file
3784acf322e79f31702a7a0612030e4ba5c4fc466ad976a34ee3f3d7278c01f0
\ No newline at end of file
{"results": {"mnli_mismatched": {"acc": 0.3360455655004068, "acc_stderr": 0.004763973908606819}}, "versions": {"mnli_mismatched": 0}}
\ No newline at end of file
9f54cbff8d6accba99cfa2c4c4b359563313941018173d7dcf9e32dc28c06583
\ No newline at end of file
{"results": {"mrpc": {"acc": 0.5392156862745098, "acc_stderr": 0.024707732873723128, "f1": 0.5982905982905982, "f1_stderr": 0.028928325246283727}}, "versions": {"mrpc": 0}}
\ No newline at end of file
cdb026c027437a8b4653212d0944d36fc16f49921dcb8e4bef899d15a55e9f80
\ No newline at end of file
{"results": {"multirc": {"acc": 0.07450157397691501, "acc_stderr": 0.008510441526175931}}, "versions": {"multirc": 0}}
\ No newline at end of file
f759213a28f0412510bf1a24c9cab0dae64bdee902d42a26225295445e7779db
\ No newline at end of file
{"results": {"mutual": {"mrr": 0.5023513920240772, "mrr_stderr": 0.009501864812936679, "r@1": 0.22573363431151242, "r@1_stderr": 0.014053085820407457, "r@2": 0.4221218961625282, "r@2_stderr": 0.016602191705517556}}, "versions": {"mutual": 0}}
\ No newline at end of file
f759213a28f0412510bf1a24c9cab0dae64bdee902d42a26225295445e7779db
\ No newline at end of file
{"results": {"mutual": {"mrr": 0.5023513920240772, "mrr_stderr": 0.009501864812936679, "r@1": 0.22460496613995484, "r@1_stderr": 0.014028122493992806, "r@2": 0.4706546275395034, "r@2_stderr": 0.016778343895001414}}, "versions": {"mutual": 1}}
\ No newline at end of file
b846bb9db109535f59a93d1ce340cf09f68bdf4fed5b8decd168784220fe07fa
\ No newline at end of file
{"results": {"mutual_plus": {"mrr": 0.5275583145221953, "mrr_stderr": 0.009940894824430708, "r@1": 0.2595936794582393, "r@1_stderr": 0.014737047402750955, "r@2": 0.45372460496614, "r@2_stderr": 0.01673517854461967}}, "versions": {"mutual_plus": 0}}
\ No newline at end of file
b846bb9db109535f59a93d1ce340cf09f68bdf4fed5b8decd168784220fe07fa
\ No newline at end of file