- 05 Feb, 2024 1 commit
-
-
lintangsutawika authored
-
- 01 Feb, 2024 2 commits
-
-
Lintang Sutawika authored
* add trust_remote_code as default * task for testing recursive * changed source of ALL_TASKS * tasks should only accept TaskObjects * initialize_tasks returns list of tasks and groups * remove trust_remote_code for now * moved constructor process to inside load_yaml_config * more comprehensive way to index tasks and groups * pre-commit format * add exit after error * adjust how task objects are called * no need to use get_task_dict * load_task_or_group works but only for tasks * pre-commit format * half working for nested groups * changed variable names * allow groups and tasks to work * temp save * indexing and loading are part of a task_manager object * adapted initialize_tasks * iron out bugs * fixed typo * fixed typo * simplified code * further tidy up * remove lines for testing * removed test lines * removed unused code * remove unused import * fixed bug * removed comments * group in a list of group can accept parameter changes like `num_fewshot` * check if config is task update * add GroupConfig object * edit
test yaml * remove args * testing returning to python task list * add weight_by_size config * describe weight_by_size in docs * fix weight by size potential error * can load individual custom python class task * moved import_function into the config loading file * remove print lines * add squadv2 yaml * temporary scroll implementation * revert back to use load_yaml_config but with modes * fix group being loaded with a None * reformat * can load unregistered tasks from a group * update scrolls * edit scrolls multiplechoice task * adjust class initialization * fix initialization * changed how to identify group and python tasks, fix logger * allow loading "include" that is nested in a group config * reworked flan benchmark * allow duplicate task in the same group to co-exist * process group_alias * removed group_alias * allow parameters set in group_config to apply to all tasks in tasklist * add function, but comment for now * reworked processing dict-base config * fixed how configs in group are processed * update to allow root group to have its alias used * remove unused classes * remove unused classes * revert some parts to original * forgot to change one variable * adapt the new process to use get_task_dict * fix for singular group call * fix variable names * add TaskManager into the evaluator * format * changed how dict tasks are loaded * add docs * Update docs/new_task_guide.md Co-authored-by:
Hailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com> * Update evaluator.py * Update evaluator.py * remove groupconfig for now * changed _config to config * update interface.md to explain TaskManager * added property functions * adjusted logger * update write_out.py * updated tests * added documentation and some modifications * added docstring documentation * precommit format * updated task loading for tests * updates tests * changed arg order for load_yaml_config * update to handle scrolls and edit log message * remove unused lines * return a list of task classes and not a dict * Update __init__.py * Delete lm_eval/tasks/benchmarks/test.yaml * Update task.py * Update lm_eval/utils.py Co-authored-by:
Hailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com> * Update lm_eval/utils.py Co-authored-by:
Hailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com> * Update utils.py * re-added old functions with new log message * Update docs/new_task_guide.md Co-authored-by:
Hailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com> * Update new_task_guide.md * added info regarding `get_task_dict` and documentation * add get_config for Task * pre-commit formatting --------- Co-authored-by:
Hailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com>
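The commit above adds a `weight_by_size` config for groups but does not say how the weighting is computed. As a rough illustration only, the sketch below assumes the option switches a group's aggregate score between a plain mean over subtasks and a mean weighted by each subtask's document count. The function name `aggregate_group` and the `(score, size)` input shape are hypothetical and are not taken from the repository.

```python
from typing import Dict, Tuple


def aggregate_group(
    scores: Dict[str, Tuple[float, int]], weight_by_size: bool
) -> float:
    """Combine subtask scores into one group score.

    `scores` maps a subtask name to (metric value, number of documents).
    This is a hypothetical helper, not the harness's own implementation.
    """
    if not scores:
        raise ValueError("scores must not be empty")
    if weight_by_size:
        # Larger subtasks contribute proportionally more to the group score.
        total = sum(size for _, size in scores.values())
        if total == 0:
            raise ValueError("total size must be positive")
        return sum(value * size for value, size in scores.values()) / total
    # Unweighted: every subtask counts equally, regardless of its size.
    return sum(value for value, _ in scores.values()) / len(scores)


if __name__ == "__main__":
    demo = {"small_task": (1.0, 1), "large_task": (0.0, 3)}
    print(aggregate_group(demo, weight_by_size=True))
    print(aggregate_group(demo, weight_by_size=False))
```

With one small and one large subtask, the two modes can differ noticeably, which is presumably why the option is exposed per group.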
-
Hailey Schoelkopf authored
* allow tasks to specify printed fewshot val * fix to belebele * update metadata field's documentation
-
- 31 Jan, 2024 1 commit
-
-
Baber Abbasi authored
* add bypass metric * fixed `bypass` metric. * add task attributes if predict_only * add `predict_only` checks * add docs * added `override_metric`, `override_config` to `Task` * nits * nit * changed --predict_only to generations; nits * nits * nits * change gen_kwargs warning * add note about `--predict_only` in README.md * added `predict_only` * move table to bottom * nit * change null aggregation to bypass (conflict) * bugfix; default `temp=0.0` * typo
-
- 30 Jan, 2024 1 commit
-
-
Baber Abbasi authored
* delay filter init; remove `*args` * bugfix * optimize * type hint
-
- 29 Jan, 2024 1 commit
-
-
Baber Abbasi authored
-
- 28 Jan, 2024 1 commit
-
-
LSinev authored
* raise Exception, not a string
  Additional info https://peps.python.org/pep-0352/#exception-hierarchy-changes https://docs.python.org/3.8/tutorial/errors.html#raising-exceptions
* Apply PEP8 recommendation to prefer isinstance
  "Object type comparisons should always use isinstance() instead of comparing types directly" https://peps.python.org/pep-0008/
* Remove dangerous default mutable values in arguments
  https://pylint.readthedocs.io/en/stable/user_guide/messages/warning/dangerous-default-value.html
* Format logging messages with fstring (not with format)
  Additional info https://pylint.readthedocs.io/en/stable/user_guide/messages/warning/logging-format-interpolation.html
  There are also discussions about the speed of formatting while logging or some unintended code executions https://github.com/pylint-dev/pylint/issues/2395 https://stackoverflow.com/a/54368109 but at least one format (fstring one) will be used throughout the project
* Specify utf-8 encoding for `open` explicitly
  If not specified, it may be supposed differently in different environments, OSes, and Python versions. See https://peps.python.org/pep-0597/ https://docs.python.org/3.11/library/locale.html#locale.getencoding https://docs.python.org/3.10/library/os.html#utf8-mode https://pylint.readthedocs.io/en/stable/user_guide/messages/warning/unspecified-encoding.html
  Helps also if some code from English language tasks is taken as inspiration for tasks in non-English languages.
* Use inline-ignoring comments to pass pre-commit instead of identity process
  https://flake8.pycqa.org/en/3.0.1/user/ignoring-errors.html#in-line-ignoring-errors https://www.flake8rules.com/rules/F841.html
  flake8 comments are supported by ruff: https://docs.astral.sh/ruff/linter/#error-suppression
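The conventions listed in this commit message can be shown together in a short, generic sketch. The function names `collect_names` and `write_and_read` are made up for illustration and do not come from the repository; the code only demonstrates the patterns named above (raising an exception instance, `isinstance`, a `None` default instead of a mutable one, f-string log messages, and an explicit `encoding="utf-8"`).

```python
import logging
import tempfile
from pathlib import Path
from typing import List, Optional

logger = logging.getLogger(__name__)


def collect_names(item, names: Optional[List[str]] = None) -> List[str]:
    # Default to None and create the list inside the call, so separate
    # calls never share one mutable default object.
    if names is None:
        names = []
    # isinstance() also accepts subclasses, unlike `type(item) == str`.
    if not isinstance(item, str):
        # Raise an exception instance; raising a plain string is an error.
        raise TypeError(f"expected str, got {type(item).__name__}")
    names.append(item)
    # f-string formatting, the single style the commit settles on.
    logger.info(f"collected {len(names)} name(s)")
    return names


def write_and_read(path: Path, text: str) -> str:
    # An explicit encoding avoids depending on the platform's locale.
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)
    with open(path, encoding="utf-8") as f:
        return f.read()


if __name__ == "__main__":
    print(collect_names("first"))
    print(collect_names("second"))  # a fresh list, not shared with the call above
    with tempfile.TemporaryDirectory() as tmp:
        print(write_and_read(Path(tmp) / "sample.txt", "héllo"))
```

The second `collect_names` call starts from an empty list because the default is `None`; with `names=[]` as the default, both calls would append to the same list.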
-
- 25 Jan, 2024 2 commits
-
-
Baber Abbasi authored
* get `doc` from instance * accelerate bugfix: get ground doc from instance * convert filter to `process_result` * get docs from instances in `FilterEnsemble` * rename * nit * better looping * fix typehint
-
lintangsutawika authored
-
- 23 Jan, 2024 4 commits
-
-
lintangsutawika authored
-
lintangsutawika authored
-
lintangsutawika authored
-
lintangsutawika authored
-
- 19 Jan, 2024 2 commits
-
-
lintangsutawika authored
-
lintangsutawika authored
-
- 18 Jan, 2024 1 commit
-
-
Lintang Sutawika authored
* tuple should be considered as well * set option to keep callable as callable
-
- 12 Jan, 2024 2 commits
-
-
Hailey Schoelkopf authored
-
Hailey Schoelkopf authored
-
- 10 Jan, 2024 1 commit
-
-
Baber Abbasi authored
* Refine scoring logic for multiple_target "exact_match" metric * skip old tests from master * skip old tests from master * delete tests from master
-
- 08 Jan, 2024 1 commit
-
-
Lintang Sutawika authored
-
- 05 Jan, 2024 1 commit
-
-
JorgeDeCorte authored
* add hellaswag_nl * add other languages and update readme to hellaswag * refactor as new task * update readme * add endline to yaml files and readme.md * add group, change folder location and update yaml file * rename default hellaswag yaml file * fix whitespace error in some labels * downgrade log level of whitespace checking --------- Co-authored-by:
JorgeDeCorte <jorge.decorte@ravago.be> Co-authored-by:
Hailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com>
-
- 04 Jan, 2024 1 commit
-
-
Lintang Sutawika authored
* Remove self.dataset_path post_init process * Update task.py * Update task.py
-
- 20 Dec, 2023 1 commit
-
-
Baber Abbasi authored
* add ruff and isort. remove black and flake8 * remove unnecessary dependencies * remove dependency from table * change order * ran ruff * check 3.9 * exclude evaluator * update CI workflow * use ruff config in pyproject.toml * test * add isort rules to ruff * sort imports * import `make_table` * try stages for no-commit-to-branch * turn on mypy for pre-commit * test * test * test * change no-commit-to-branch to default * nits * fixed dependency
-
- 15 Dec, 2023 2 commits
-
-
lintangsutawika authored
-
Lintang Sutawika authored
-
- 14 Dec, 2023 2 commits
-
-
Lintang Sutawika authored
* doc_to_decontamination_query can use function * add option for doc_to_decontamination_query to follow doc_to_text * added documentation for doc_to_decontamination_query * adjust description * format
-
Lintang Sutawika authored
* Additional process for doc_to_choice * doc_to_choice can also parse a string
-
- 13 Dec, 2023 1 commit
-
-
lintangsutawika authored
-
- 06 Dec, 2023 2 commits
-
-
lintangsutawika authored
-
lintangsutawika authored
-
- 04 Dec, 2023 1 commit
-
-
Hailey Schoelkopf authored
-
- 29 Nov, 2023 2 commits
- 28 Nov, 2023 3 commits
-
-
lintangsutawika authored
-
lintangsutawika authored
-
lintangsutawika authored
-
- 17 Nov, 2023 1 commit
-
-
lintangsutawika authored
-
- 16 Nov, 2023 1 commit
-
-
haileyschoelkopf authored
-
- 14 Nov, 2023 1 commit
-
-
lintangsutawika authored
-
- 09 Nov, 2023 1 commit
-
-
lintangsutawika authored
-