    New healthcare benchmark: careqa (#2714) · 7c9fbcf8
    PabloAgustin authored
    
    
    * New healthcare benchmark: careqa
    
    * LAUNCH_MN5_ACC <python main.py --config config/mn5.yml --models Llama-3.2-1B-Instruct --tasks careqa_open --num_fewshot 0>
    
    * Add fixes and READMEs; remove task_list.txt
    
    * pre-commit passed, add formatting updates; add nanmean agg_metric
    
    * Fix import error.
    
    * Wrapped imports in try/except blocks

    * Wrapped imports in try/except blocks; also wrapped metrics to catch the bert_score import error

    * Use try/except to catch ImportErrors as well
    
    * use np.nan
    
    * pre-commit
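
    The notes above mention guarding imports with try/except, returning NaN, and aggregating with a nanmean. A minimal sketch of that general pattern follows; it is not the code from this commit. The helper names (`optional_import`, `score_or_nan`, `nanmean`) are hypothetical, and the sketch uses the standard library's `math.nan` instead of `np.nan` so that it runs without extra dependencies.

```python
import importlib
import math


def optional_import(module_name):
    """Return the imported module, or None if it is not installed (hypothetical helper)."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        # ModuleNotFoundError is a subclass of ImportError, so it is caught here too.
        return None


def score_or_nan(scorer, prediction, reference):
    """Apply the scorer if one is available; otherwise return NaN for this sample."""
    if scorer is None:
        return math.nan
    return scorer(prediction, reference)


def nanmean(values):
    """Mean over the non-NaN values; NaN when no valid value remains."""
    valid = [v for v in values if not math.isnan(v)]
    return sum(valid) / len(valid) if valid else math.nan
```

    With this shape, a missing optional package yields NaN for the affected samples instead of an import-time crash, and the aggregate skips those samples rather than becoming NaN itself.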
    
    ---------
    Co-authored-by: PabloAgustin <pablo.martin@bsc.es>
    Co-authored-by: Baber <baber@hey.com>
utils_perplexity.py 414 Bytes