gaoqiong
lm-evaluation-harness
Commits
cb43ad4e
Unverified
Commit cb43ad4e, authored Jul 08, 2024 by Hailey Schoelkopf; committed by GitHub on Jul 08, 2024
we run with bootstrap_iters=0 for printing tests (#2080)
parent
517aadc4
Changes
3
Showing 3 changed files with 30 additions and 30 deletions
tests/testdata/ai2_arc_10_hf_pretrained-EleutherAI-pythia-14m-dtype-float32-device-cpu.txt (+5 -5)
tests/testdata/lambada_openai_10_hf_pretrained-EleutherAI-pythia-14m-dtype-float32-device-cpu.txt (+4 -4)
tests/testdata/mmlu_stem_10_hf_pretrained-EleutherAI-pythia-14m-dtype-float32-device-cpu.txt (+21 -21)
tests/testdata/ai2_arc_10_hf_pretrained-EleutherAI-pythia-14m-dtype-float32-device-cpu.txt
-| Tasks |Version|Filter|n-shot| Metric | |Value| |Stderr|
+| Tasks |Version|Filter|n-shot| Metric | |Value| |Stderr|
-|-------------|------:|------|-----:|--------|---|----:|---|-----:|
+|-------------|------:|------|-----:|--------|---|----:|---|------|
-|arc_challenge| 1|none | 0|acc |↑ | 0.0|± |0.0000|
+|arc_challenge| 1|none | 0|acc |↑ | 0.0|± | N/A|
-| | |none | 0|acc_norm|↑ | 0.0|± |0.0000|
+| | |none | 0|acc_norm|↑ | 0.0|± | N/A|
-|arc_easy | 1|none | 0|acc |↑ | 0.3|± |0.1528|
+|arc_easy | 1|none | 0|acc |↑ | 0.3|± | N/A|
-| | |none | 0|acc_norm|↑ | 0.1|± |0.1000|
+| | |none | 0|acc_norm|↑ | 0.1|± | N/A|
\ No newline at end of file
\ No newline at end of file
tests/testdata/lambada_openai_10_hf_pretrained-EleutherAI-pythia-14m-dtype-float32-device-cpu.txt
-| Tasks |Version|Filter|n-shot| Metric | | Value | | Stderr |
+| Tasks |Version|Filter|n-shot| Metric | | Value | |Stderr|
-|--------------|------:|------|-----:|----------|---|-------:|---|--------:|
+|--------------|------:|------|-----:|----------|---|-------:|---|------|
-|lambada_openai| 1|none | 0|acc |↑ | 0.1000|± | 0.1000|
+|lambada_openai| 1|none | 0|acc |↑ | 0.1000|± | N/A|
-| | |none | 0|perplexity|↓ |605.3866|± |1636.6987|
+| | |none | 0|perplexity|↓ |605.3866|± | N/A|
\ No newline at end of file
\ No newline at end of file
tests/testdata/mmlu_stem_10_hf_pretrained-EleutherAI-pythia-14m-dtype-float32-device-cpu.txt
-| Tasks |Version|Filter|n-shot|Metric| |Value | |Stderr|
+| Tasks |Version|Filter|n-shot|Metric| |Value | |Stderr|
-|-------------------------------|------:|------|-----:|------|---|-----:|---|-----:|
+|-------------------------------|------:|------|-----:|------|---|-----:|---|------|
-|stem | 1|none | |acc |↑ |0.2474|± |0.0315|
+|stem | 1|none | |acc |↑ |0.2474|± | N/A|
-| - abstract_algebra | 0|none | 0|acc |↑ |0.2000|± |0.1333|
+| - abstract_algebra | 0|none | 0|acc |↑ |0.2000|± | N/A|
-| - anatomy | 0|none | 0|acc |↑ |0.3000|± |0.1528|
+| - anatomy | 0|none | 0|acc |↑ |0.3000|± | N/A|
-| - astronomy | 0|none | 0|acc |↑ |0.1000|± |0.1000|
+| - astronomy | 0|none | 0|acc |↑ |0.1000|± | N/A|
-| - college_biology | 0|none | 0|acc |↑ |0.3000|± |0.1528|
+| - college_biology | 0|none | 0|acc |↑ |0.3000|± | N/A|
-| - college_chemistry | 0|none | 0|acc |↑ |0.1000|± |0.1000|
+| - college_chemistry | 0|none | 0|acc |↑ |0.1000|± | N/A|
-| - college_computer_science | 0|none | 0|acc |↑ |0.2000|± |0.1333|
+| - college_computer_science | 0|none | 0|acc |↑ |0.2000|± | N/A|
-| - college_mathematics | 0|none | 0|acc |↑ |0.2000|± |0.1333|
+| - college_mathematics | 0|none | 0|acc |↑ |0.2000|± | N/A|
-| - college_physics | 0|none | 0|acc |↑ |0.3000|± |0.1528|
+| - college_physics | 0|none | 0|acc |↑ |0.3000|± | N/A|
-| - computer_security | 0|none | 0|acc |↑ |0.5000|± |0.1667|
+| - computer_security | 0|none | 0|acc |↑ |0.5000|± | N/A|
-| - conceptual_physics | 0|none | 0|acc |↑ |0.3000|± |0.1528|
+| - conceptual_physics | 0|none | 0|acc |↑ |0.3000|± | N/A|
-| - electrical_engineering | 0|none | 0|acc |↑ |0.4000|± |0.1633|
+| - electrical_engineering | 0|none | 0|acc |↑ |0.4000|± | N/A|
-| - elementary_mathematics | 0|none | 0|acc |↑ |0.0000|± |0.0000|
+| - elementary_mathematics | 0|none | 0|acc |↑ |0.0000|± | N/A|
-| - high_school_biology | 0|none | 0|acc |↑ |0.3000|± |0.1528|
+| - high_school_biology | 0|none | 0|acc |↑ |0.3000|± | N/A|
-| - high_school_chemistry | 0|none | 0|acc |↑ |0.4000|± |0.1633|
+| - high_school_chemistry | 0|none | 0|acc |↑ |0.4000|± | N/A|
-| - high_school_computer_science| 0|none | 0|acc |↑ |0.3000|± |0.1528|
+| - high_school_computer_science| 0|none | 0|acc |↑ |0.3000|± | N/A|
-| - high_school_mathematics | 0|none | 0|acc |↑ |0.2000|± |0.1333|
+| - high_school_mathematics | 0|none | 0|acc |↑ |0.2000|± | N/A|
-| - high_school_physics | 0|none | 0|acc |↑ |0.3000|± |0.1528|
+| - high_school_physics | 0|none | 0|acc |↑ |0.3000|± | N/A|
-| - high_school_statistics | 0|none | 0|acc |↑ |0.0000|± |0.0000|
+| - high_school_statistics | 0|none | 0|acc |↑ |0.0000|± | N/A|
-| - machine_learning | 0|none | 0|acc |↑ |0.3000|± |0.1528|
+| - machine_learning | 0|none | 0|acc |↑ |0.3000|± | N/A|
\ No newline at end of file
\ No newline at end of file
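A note on the Stderr values this commit replaces with N/A: the removed per-task `acc` and `acc_norm` entries (0.0000, 0.1000, 0.1333, 0.1528, 0.1633, 0.1667) are numerically consistent with the sample standard error of the mean over 10 binary outcomes. The sketch below only checks that arithmetic. The sample size of 10 is an assumption read off the `_10_` in the test-data file names, and the helper `binary_mean_stderr` is hypothetical, not a function from the harness. The sketch does not cover the aggregated `stem` row (0.0315) or the `perplexity` row.

```python
import math


def binary_mean_stderr(acc: float, n: int) -> float:
    """Sample standard error of the mean for n 0/1 outcomes whose mean is acc.

    Hypothetical helper for illustration; not part of lm-evaluation-harness.
    """
    # Bessel-corrected sample variance of 0/1 data with mean p is p*(1-p)*n/(n-1).
    sample_var = acc * (1.0 - acc) * n / (n - 1)
    # Standard error of the mean is sqrt(variance / n).
    return math.sqrt(sample_var / n)


# n = 10 is an assumption taken from the "_10_" in the test-data file names.
for acc in (0.0, 0.1, 0.2, 0.3, 0.4, 0.5):
    print(f"acc={acc:.1f}  stderr={binary_mean_stderr(acc, 10):.4f}")
```

Under these assumptions the printed values line up with the Stderr column on the removed side of the diff for the matching accuracy values.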