Making hardcoded few shots compatible with the chat template mechanism (#1895)
* init test 1 * fix * this format seems to be working - need to update all other tasks with the new format * bbh with few shot format * fix fewshot bbh * add mmlu flan cot * samples of cot * kmmlu * fix gsm8k * update keys for mmlu * minerva math * bbh * fix * fix samples * small fixes to templates * last prompt format change * fixing prompt * fixed minerva math format * rm accidental commited file * added doc for few shot samples * Update lm_eval/loggers/evaluation_tracker.py * Update lm_eval/loggers/evaluation_tracker.py * Update docs/new_task_guide.md Co-authored-by:Hailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com> * added check in sampler per code review * added the system from a function, plus an example in minerva math * style * Apply suggestions from code review Co-authored-by:
Hailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com> * fix unit tests 1 * forcing use of test split --------- Co-authored-by:
Hailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com>
Showing
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
Please register or sign in to comment