Don't silence errors when loading tasks (#1148)
* Add example failing task
This task includes an invalid import. The import raises an exception,
so the task is not loaded. But this only produces a DEBUG level log
message, so in normal usage you'll see no error and will just be told
the task doesn't exist.
Here's an example command line to run the task:
python -m lm_eval --model hf --model_args pretrained=rinna/japanese-gpt-1b --tasks fail
This task is based on a Japanese Winograd task, but that choice isn't
important; it was just used due to familiarity.
* Do not ignore errors when loading tasks
* Change how task errors are logged
This makes the changes proposed in the PR discussion.
1. Exceptions not related to missing modules/imports are logged as
warnings.
2. Module/import related exceptions are still logged at debug level, but
if any of them happen, a single warning is emitted with instructions
on how to show the debug logs.
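The two logging rules above can be sketched roughly as follows. This is
an illustration only, not the actual lm_eval code: the function name
`load_task_modules`, the logger name, and the message wording are all
assumptions made for the example.

```python
import importlib
import logging

# Hypothetical logger name, chosen for this sketch only.
logger = logging.getLogger("task_loader_sketch")


def load_task_modules(module_names):
    """Try to import each module and report failures by category.

    Returns (loaded, import_failures, other_failures) as lists of names.
    """
    loaded, import_failures, other_failures = [], [], []
    for name in module_names:
        try:
            importlib.import_module(name)
            loaded.append(name)
        except ImportError as exc:
            # ModuleNotFoundError is a subclass of ImportError, so both
            # land here. These stay at debug level (rule 2).
            logger.debug("Could not import %s: %s", name, exc)
            import_failures.append(name)
        except Exception as exc:
            # Anything else is surfaced as a warning (rule 1).
            logger.warning("Error while loading %s: %r", name, exc)
            other_failures.append(name)

    if import_failures:
        # One summary warning so the problem is visible in normal usage.
        logger.warning(
            "%d module(s) failed to import; enable debug logging for details.",
            len(import_failures),
        )
    return loaded, import_failures, other_failures
```

With this shape, a missing dependency no longer disappears silently:
the user sees at least one warning and knows to turn on debug logging.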
* Remove intentionally failing task
---------
Co-authored-by: Paul O'Leary McCann <polm@dampfkraft.com>