- 11 Mar, 2025 1 commit
PabloAgustin authored
* New healthcare benchmark: careqa
* LAUNCH_MN5_ACC <python main.py --config config/mn5.yml --models Llama-3.2-1B-Instruct --tasks careqa_open --num_fewshot 0>
* Add fixes, READMES, and remove task_list.txt
* pre-commit passed, add formatting updates; add nanmean agg_metric
* Fix import error.
* Wrapped imports in try excepts
* Wrapped imports in try excepts; also metrics to catch bert_score import error
* Try except to catch ImportErrors as well
* use np.nan
* pre-commit

---------

Co-authored-by: PabloAgustin <pablo.martin@bsc.es>
Co-authored-by: Baber <baber@hey.com>
- 01 May, 2024 1 commit
Gabriel Mukobi authored
* Add Pile-10k readme
* Add Pile-10k task configuration file
- 21 Dec, 2023 1 commit
Hailey Schoelkopf authored
* change version field formatting in metadata
* mention versioning in new task guide
* add instructions for changelog
* run linters
- 28 Nov, 2023 2 commits
lintangsutawika authored
lintangsutawika authored
- 14 Aug, 2023 1 commit
lintangsutawika authored
- 01 Aug, 2023 1 commit
haileyschoelkopf authored
- 18 Jul, 2023 1 commit
haileyschoelkopf authored
- 12 Jun, 2023 1 commit
Hailey Schoelkopf authored
* add wip gsm8k yaml
* cleanup tasks dir
* push gsm8k yaml changes
* rename gpt2.py
* add updated gsm8k , triviaqa baseline
* add new cot yaml
* allow for multiple filter pipelines, new filter types
* updated gsm8k + sampling gen configs
* cleanup self-consistency yaml
* push outline for advanced docs
* push docs checklist
* switch to inheritance for many tasks
* acc_norm and acc_mutual_info fixed
* fix missing newline in error msg
* remove many .py tasks
* updated GSM8k
* added more doc
* Update advanced_task_guide.md Added list of parameters
* Update advanced_task_guide.md
* Added details on listing metrics
* Update advanced_task_guide.md
* Added more explanation
* modify current default filter name
* add new tags to tasks
* remove a lingering print()
* add rest of param docs, cleanup deprecated fields
* push docs update
* move ALL_TASKS definition location
* confirm write_out.py works if no description dict passed

---------

Co-authored-by: lintangsutawika <lintang@sutawika.com>
- 08 Jun, 2023 1 commit
lintangsutawika authored
- 02 Jun, 2023 1 commit
haileyschoelkopf authored