- 13 Feb, 2024 1 commit
-
-
Hailey Schoelkopf authored
* fix weight_by_size condition
* add tests, update stderr formula slightly
* apply pre-commit
-
- 06 Feb, 2024 1 commit
-
-
Hailey Schoelkopf authored
* update formula for stderr aggregation
* hack: see what happens when using stderr_for_metric bootstrapping on a group
* undo bootstrap_for_stderr test
* factor out variance-aggregation formulas into api.metrics
* fix failing tests
* remove stray print
* update comment
* further detail in comment
* add back initialize_tasks() call
* fix format
-
- 31 Jan, 2024 1 commit
-
-
Baber Abbasi authored
* add bypass metric
* fixed `bypass` metric.
* add task attributes if predict_only
* add `predict_only` checks
* add docs
* added `overide_metric`, `override_config` to `Task`
* nits
* nit
* changed --predict_only to generations; nits
* nits
* nits
* change gen_kwargs warning
* add note about `--predict_only` in README.md
* added `predict_only`
* move table to bottom
* nit
* change null aggregation to bypass (conflict)
* bugfix; default `temp=0.0`
* typo
-
- 20 Dec, 2023 1 commit
-
-
Baber Abbasi authored
* add ruff and isort. remove black and flake8
* remove unnecessary dependencies
* remove dependency from table
* change order
* ran ruff
* check 3.9
* exclude evaluator
* update CI workflow
* use ruff config in pyproject.toml
* test
* add isort rules to ruff
* sort imports
* import `make_table`
* try stages for no-commit-to-branch
* turn on mypy for pre-commit
* test
* test
* test
* change no-commit-to-branch to default
* nits
* fixed dependency
-
- 02 Nov, 2023 2 commits
-
-
lintangsutawika authored
-
lintangsutawika authored
-
- 19 Oct, 2023 3 commits
-
-
haileyschoelkopf authored
-
lintangsutawika authored
-
lintangsutawika authored
-
- 18 Oct, 2023 1 commit
-
-
haileyschoelkopf authored
-
- 25 Aug, 2023 1 commit
-
-
Ethan Smith authored
This adds a bunch of simple annotations suggested by https://github.com/JelleZijlstra/autotyping.
-
- 14 Aug, 2023 5 commits
- 12 Aug, 2023 1 commit
-
-
haileyschoelkopf authored
-
- 11 Aug, 2023 1 commit
-
-
haileyschoelkopf authored
-
- 03 Aug, 2023 1 commit
-
-
Aflah authored
-
- 02 Aug, 2023 4 commits
- 06 Jul, 2023 1 commit
-
-
haileyschoelkopf authored
-
- 15 Jun, 2023 1 commit
-
-
lintangsutawika authored
-
- 13 Jun, 2023 1 commit
-
-
haileyschoelkopf authored
-
- 12 Jun, 2023 1 commit
-
-
Hailey Schoelkopf authored
* add wip gsm8k yaml
* cleanup tasks dir
* push gsm8k yaml changes
* rename gpt2.py
* add updated gsm8k , triviaqa baseline
* add new cot yaml
* allow for multiple filter pipelines, new filter types
* updated gsm8k + sampling gen configs
* cleanup self-consistency yaml
* push outline for advanced docs
* push docs checklist
* switch to inheritance for many tasks
* acc_norm and acc_mutual_info fixed
* fix missing newline in error msg
* remove many .py tasks
* updated GSM8k
* added more doc
* Update advanced_task_guide.md Added list of parameters
* Update advanced_task_guide.md
* Added details on listing metrics
* Update advanced_task_guide.md
* Added more explanation
* modify current default filter name
* add new tags to tasks
* remove a lingering print()
* add rest of param docs, cleanup deprecated fields
* push docs update
* move ALL_TASKS definition location
* confirm write_out.py works if no description dict passed

---------

Co-authored-by: lintangsutawika <lintang@sutawika.com>
-
- 07 Jun, 2023 2 commits
-
-
lintangsutawika authored
-
lintangsutawika authored
-
- 06 Jun, 2023 1 commit
-
-
lintangsutawika authored
-
- 19 May, 2023 1 commit
-
-
lintangsutawika authored
-
- 10 May, 2023 1 commit
-
-
lintangsutawika authored
-
- 08 May, 2023 1 commit
-
-
haileyschoelkopf authored
-
- 02 May, 2023 2 commits
-
-
haileyschoelkopf authored
-
haileyschoelkopf authored
-
- 24 Apr, 2023 1 commit
-
-
haileyschoelkopf authored
-
- 19 Apr, 2023 1 commit
-
-
haileyschoelkopf authored
-
- 03 May, 2022 1 commit
-
-
Fabrizio Milo authored
-
- 29 Apr, 2022 1 commit
-
- 27 Apr, 2022 1 commit
-
-
jon-tow authored
-