- 29 Apr, 2025 1 commit
Baber Abbasi authored
- 14 Mar, 2025 1 commit
Oskar van der Wal authored
* Implementation of Winogender
* Minor fixes README.md
* Add winogender
* Clean winogender utils.py
* Change dataset to one containing All subsets
* Flesh out README for BBQ task
* Add missing tasks for BBQ
* Add simple cooccurrence bias task
* Fix wrong mask for ambiguated context+rename metrics
* Made generate_until evaluation (following PALM paper) default
  Also moved separate config files per category to separate metrics using custom function.
  Created config file for multiple_choice way of evaluating BBQ.
* Add missing version metadata
* Add missing versionmetadata for bbq multiple choice
* Fix metrics and address edge cases
* Made BBQ multiple choice the default version
* Added settings following winogrande
* Add num_fewshot to simple_cooccurrence_bias
* Fixes for bbq (multiple choice)
* Fix wrong dataset
* CrowS-Pairs: make it easier to use another dataset by removing dataset_name from the subsets.
* Use simplest prompt possible without description
* Merge
* BBQ: Fix np.NaN related bug
* BBQ: Fix wrong aggregation method for disamb accuracy
* BBQ: Make it possible to only evaluate on (dis)ambiguous subset (needed for few shot eval)
* BBQ: fix showing one target in case of few-shot evals
* BBQ: Fix few-shot example for bbq_generate
* BBQ: simplify subtasks
* BBQ: Minimize number of UNK variations to reduce inference time
* BBQ: Add extra UNK keywords for the generate task
* Add a generate_until version of simple_cooccurrence_bias
* Change system/description prompt to include few-shot examples
* Group agg rework
* Run pre-commit
* add tasks to readme table
* remove trailing space from simple_cooccurrence_bias_gen.yaml `doc_to_text`
* fix
* fix
* fix version

---------

Co-authored-by: Baber <baber@hey.com>