Commits · 1f6d3cd6dfd1c8473018fdf496bface746f497d4 · gaoqiong / lm-evaluation-harness

17 Jul, 2023 2 commits
- print two tables · 1f6d3cd6
  lintangsutawika authored Jul 17, 2023
  
- printing all the configs is made optional with --show_config · 5e2f0ffe
  lintangsutawika authored Jul 17, 2023
  
14 Jul, 2023 2 commits
- can include paths in main · 068f8ab2
  lintangsutawika authored Jun 21, 2023
  
- fix random seed issue, log_samples optional · 5fbc3f86
  haileyschoelkopf authored Jul 14, 2023
  
03 Jul, 2023 1 commit
- remove output_base_path · 6a2620ad
  haileyschoelkopf authored Jul 03, 2023
  
23 Jun, 2023 1 commit
- add use_cache arg · 6449ab1a
  haileyschoelkopf authored Jun 23, 2023
  
21 Jun, 2023 1 commit
- Handle non-serializable fields in task config and result object · 3968fd0a
  nikuya3 authored Jun 20, 2023
  
20 Jun, 2023 1 commit
- Handle non-serializable fields in task config and result object · 193b0a47
  nikuya3 authored Jun 20, 2023
  
19 Jun, 2023 1 commit
- add description to config, remove from cmdline args · 194a806d
  haileyschoelkopf authored Jun 19, 2023
  
16 Jun, 2023 1 commit
- reformatted · b7c3580a
  lintangsutawika authored Jun 16, 2023
  
15 Jun, 2023 1 commit
- minor fixes to satisfy pre-commit · 400c0199
  lintangsutawika authored Jun 15, 2023
  
13 Jun, 2023 3 commits
- Update output path for the example logger · f1d251ca
  FarzanehNakhaee authored Jun 13, 2023
  
- set() works but not list???? · 483f86d9
  haileyschoelkopf authored Jun 13, 2023
  
- fixes some minor issues on tasks. For yaml, the task name should be... · 0a1ced22
  lintangsutawika authored Jun 13, 2023
```
fixes some minor issues on tasks. For yaml, the task name should be <dataset_path>_<dataset_name>:<task>
```
12 Jun, 2023 1 commit

[Refactor] [WIP] New YAML advanced docs (#567) · 79b972d6

Hailey Schoelkopf authored Jun 12, 2023



* add wip gsm8k yaml

* cleanup tasks dir

* push gsm8k yaml changes

* rename gpt2.py

* add updated gsm8k, triviaqa baseline

* add new cot yaml

* allow for multiple filter pipelines, new filter types

* updated gsm8k + sampling gen configs

* cleanup self-consistency yaml

* push outline for advanced docs

* push docs checklist

* switch to inheritance for many tasks

* acc_norm and acc_mutual_info fixed

* fix missing newline in error msg

* remove many .py tasks

* updated GSM8k

* added more doc

* Update advanced_task_guide.md

Added list of parameters

* Update advanced_task_guide.md

* Added details on listing metrics

* Update advanced_task_guide.md

* Added more explanation

* modify current default filter name

* add new tags to tasks

* remove a lingering print()

* add rest of param docs, cleanup deprecated fields

* push docs update

* move ALL_TASKS definition location

* confirm write_out.py works if no description dict passed

---------
Co-authored-by: lintangsutawika <lintang@sutawika.com>


11 Jun, 2023 1 commit
- Add --max_batch_size and --batch_size auto:N · 8cec82b2
  gk authored Jun 11, 2023
  
07 Jun, 2023 1 commit
- add example logger · b7cfed19
  FarzanehNakhaee authored Jun 07, 2023
  
06 Jun, 2023 1 commit
- show list of all datasets · 5693abc5
  lintangsutawika authored Jun 06, 2023
  
01 Jun, 2023 1 commit

Fix LLaMA tokenization issue (#531) · 23f30926

gakada authored Jun 02, 2023



* Fix tokenization issue in BaseLM.loglikelihood

* Add a regression script

* Use entire non-continuation length as context

---------
Co-authored-by: Hailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com>


19 May, 2023 1 commit
- pre-commit stuff · c4c20ff5
  lintangsutawika authored May 19, 2023
  
18 May, 2023 1 commit
- minor edit · 64912d81
  lintangsutawika authored May 18, 2023
  
16 May, 2023 2 commits
- updated filter process · c3dabb32
  lintangsutawika authored May 16, 2023
  
- added logging process · 8d608117
  lintangsutawika authored May 16, 2023
  
13 May, 2023 1 commit
- changes to import and loading yaml · 275857a1
  lintangsutawika authored May 13, 2023
  
11 May, 2023 2 commits
- update parameter names and add docs · 99b0a42d
  Julen Etxaniz authored May 11, 2023
  
- process yaml and registered tasks through args.task · 8299ab3b
  lintangsutawika authored May 11, 2023
  
10 May, 2023 3 commits
- add --write_detailed_eval_info to dump JSON with prompts and completions · 2e046ce3
  Julen Etxaniz authored May 10, 2023
  
- simplify · 484fa090
  lintangsutawika authored May 10, 2023
  
- fix warning issue if single device · 4d3ea67a
  Benjamin Fattori authored May 10, 2023
  
08 May, 2023 3 commits
- add yaml registering decorator · 38244e15
  haileyschoelkopf authored May 08, 2023
  
- Create output path directory if necessary · c473d7e0
  janEbert authored May 08, 2023
  
- Add perplexity task on arbitrary JSON data · 3226ed64
  janEbert authored May 08, 2023
  
07 May, 2023 1 commit

allow float limit to represent data portion · 3fda1195

Ken Tsui authored May 07, 2023

When `limit` is <1, it represents the fraction of the total number of examples to use.
If it is >=1, it is the number of examples per task (only use this for testing).
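
A minimal sketch of the rule described in this commit message, for illustration only. The helper name `resolve_limit`, the rounding-down behavior, and the clamping to the dataset size are assumptions, not taken from the harness code.

```python
from typing import Optional, Union


def resolve_limit(limit: Optional[Union[int, float]], num_examples: int) -> int:
    """Hypothetical helper: turn a `limit` value into an example count.

    Assumptions (not confirmed by the commit): fractional results are
    rounded down, and counts larger than the dataset are clamped.
    """
    if limit is None:
        # No limit given: evaluate on every example.
        return num_examples
    if limit < 1:
        # Values below 1 are read as a fraction of the dataset.
        return int(num_examples * limit)
    # Values of 1 or more are read as an absolute number of examples.
    return min(int(limit), num_examples)


print(resolve_limit(0.25, 200))  # fraction of the data
print(resolve_limit(10, 200))    # fixed number of examples
```

Under these assumptions, `0.25` on a 200-example task selects 50 examples, while `10` selects exactly 10.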


05 May, 2023 3 commits
- Sort task names to keep the same order always (#474) · 0542d35d
  Julen Etxaniz authored May 05, 2023
```
This makes comparing the results of different models easier because tasks are ordered in the same way.
```
- bugfixes missed from local branch · 629bcfba
  Benjamin Fattori authored May 05, 2023
  
- sync working changes with upstream · d4c5315a
  Benjamin Fattori authored May 05, 2023
  
24 Apr, 2023 1 commit
- make tasks and models registered by decorators · f275301a
  haileyschoelkopf authored Apr 23, 2023
  
19 Apr, 2023 1 commit
- in-place replace main with lm-eval2, keeping old git history · d2a9b759
  haileyschoelkopf authored Apr 19, 2023
  
09 Mar, 2023 1 commit
- single GPU automatic batching logic · d5720d5f
  Benjamin Fattori authored Mar 09, 2023
  
03 May, 2022 1 commit
- add pre-commit · 121b7096
  Fabrizio Milo authored May 02, 2022
  