- 06 May, 2025 1 commit
Vladislav Mikhailov authored
* added noreval
* added a checklist for noreval
* run pre-commit
* changed imports and added short noreval description
* fixed norsumm path
* refactored multi-folder tasks
* refactored multi-folder tasks
- 14 Oct, 2024 1 commit
Elron Bandel authored
* Add Unitxt Multimodality Support
  Signed-off-by: elronbandel <elronbandel@gmail.com>
* Update
  Signed-off-by: elronbandel <elronbandel@gmail.com>
* Fix formatting
  Signed-off-by: elronbandel <elronbandel@gmail.com>
---------
Signed-off-by: elronbandel <elronbandel@gmail.com>
- 15 Jul, 2024 1 commit
Nathan Weinberg authored
Also add 'test_logs/' to .gitignore
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
- 26 Feb, 2024 1 commit
Aaron V authored
* Create a means for caching task registration and request building. Add the ability to specify an args dict for simple_evaluate().
* Remove extra S in cache path in caching module
  Co-authored-by: Hailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com>
* Rename requests cache args, make model_args polymorphic so that a dict can also be accepted.
* Update docs to reflect new caching behavior, add CLI args for requests caching. Create a function for deleting items in the cache.
* Update documentation, fix minor bug with arg parsing for requests caching where an undefined variable was used.
* Remove line from gitignore, add to cli for caching datasets.
* Add hashing suffix to .pickles. Update test script typo.
* Favor isinstance() over type() in evaluator.py
* Add tests for caching, gets tests working, remove unneeded arg from build_all_requests().
* Update arg description to simple_evaluate.
* Update pyproject.toml
* Fix typehint
* Remove the use of random() for creating default cache pickle hash.
* Check that cache dir exists before clearing it in request cache tests.
* Fix linting problems.
* Fix additional formatting errors.
* Remove trailing whitespace.
* Add new line to the end of .gitignore.
---------
Co-authored-by: Hailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com>
- 22 Feb, 2024 1 commit
Ayush Thakur authored
* add wandb as extra dependency
* wandb metrics logging
* refactor
* log samples as tables
* fix linter
* refactor: put in a class
* change dir
* add panels
* log eval as table
* improve tables logging
* improve reports logging
* precommit run
* ruff check
* handle importing reports api gracefully
* ruff
* compare results
* minor pre-commit fixes
* build comparison report
* ruff check
* log results as artifacts
* remove comparison script
* update dependency
* type annotate and docstring
* add example
* update readme
* fix typo
* teardown
* handle outside wandb run
* gracefully fail reports creation
* precommit checks
* add report url to summary
* use wandb printer for better url stdout
* fix ruff
* handle N/A and groups
* fix eval table
* remove unused var
* update wandb version req + disable reports stdout
* remove reports feature to TODO
* add label to multi-choice question data
* log model predictions
* lints
* loglikelihood_rolling
* log eval result for groups
* log tables by group for better handling
* precommit
* choices column for multi-choice
* graciously fail wandb
* remove reports feature
* track system metrics + total eval time + stdout
---------
Co-authored-by: Lintang Sutawika <lintang@eleuther.ai>
- 21 Jul, 2023 1 commit
baberabb authored
- 14 Jul, 2023 1 commit
lintangsutawika authored
- 01 Jul, 2023 2 commits
FarzanehNakhaee authored
-
FarzanehNakhaee authored
- 29 Jun, 2023 1 commit
FarzanehNakhaee authored
- 16 Jun, 2023 1 commit
Lintang Sutawika authored
- 07 Jun, 2023 1 commit
FarzanehNakhaee authored
- 11 May, 2023 1 commit
lintangsutawika authored
- 03 May, 2022 1 commit
Fabrizio Milo authored
- 29 Mar, 2021 1 commit
& authored
- 12 Feb, 2021 3 commits
- 07 Sep, 2020 1 commit
Anish Thite authored
- 28 Aug, 2020 1 commit
Leo Gao authored