Commit 4bb92ebc (unverified), authored by Jess, committed by GitHub

Merge pull request #18 from JessicaOjo/africamgsm

fix exact match bug and restructure mmlu folder
parents 348e304a 5ba791e2
 dataset_name: zul
 include: afrimmlu_common_yaml
-task: afrimmlu_zul
+task: afrimmlu_direct_zul
\ No newline at end of file
 from sklearn.metrics import f1_score
 def doc_to_choice(doc):
     choices = eval(doc["choices"])
     return choices
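Review note: `eval` executes whatever expression the dataset field contains. A minimal sketch of a stricter variant follows, assuming `doc["choices"]` holds a Python list literal serialized as a string (which is what the `eval` call suggests). The name `doc_to_choice_safe` is hypothetical and not part of this commit.

```python
import ast


def doc_to_choice_safe(doc):
    """Parse the serialized choices without executing arbitrary code.

    Assumes doc["choices"] is a string such as "['A', 'B', 'C', 'D']".
    ast.literal_eval accepts only Python literals and raises ValueError
    (or SyntaxError) for anything else, e.g. a function call.
    """
    return ast.literal_eval(doc["choices"])


if __name__ == "__main__":
    example = {"choices": "['Paris', 'Lagos', 'Nairobi', 'Accra']"}
    print(doc_to_choice_safe(example))
```

For well-formed list literals this returns the same list as `eval`; the difference only shows up on malformed or hostile input, which is rejected instead of executed.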
 def doc_to_text(doc):
     output = """You are a highly knowledgeable and intelligent artificial intelligence
 model answers multiple-choice questions about '{subject}'
@@ -27,6 +29,7 @@ def doc_to_text(doc):
         choice4=choices[3])
     return text
 def weighted_f1_score(items):
     unzipped_list = list(zip(*items))
     golds = unzipped_list[0]
......
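The remainder of `weighted_f1_score` is elided in this diff view. Purely as an illustration of what a support-weighted F1 over `(gold, pred)` pairs computes, here is a dependency-free sketch. The name `weighted_f1_sketch` and the assumption that each item is a `(gold, pred)` pair are hypothetical; the committed function imports `f1_score` from scikit-learn and may differ in detail.

```python
from collections import Counter


def weighted_f1_sketch(items):
    """Support-weighted F1 over (gold, pred) pairs (illustrative only).

    Each class's F1 is weighted by how often that class appears in the
    gold labels. Classes that appear only in predictions get weight 0.
    """
    items = list(items)
    if not items:
        return 0.0
    golds, preds = zip(*items)
    support = Counter(golds)      # gold count per class
    predicted = Counter(preds)    # predicted count per class
    total = 0.0
    for label, n_gold in support.items():
        tp = sum(1 for g, p in items if g == label and p == label)
        n_pred = predicted.get(label, 0)
        precision = tp / n_pred if n_pred else 0.0
        recall = tp / n_gold
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        total += f1 * n_gold
    return total / len(golds)


if __name__ == "__main__":
    pairs = [(0, 0), (1, 1), (1, 0), (2, 2)]
    print(weighted_f1_sketch(pairs))
```

Weighting by gold support keeps frequent answer classes from being under-counted, which matters when the correct option is not uniformly distributed across A to D.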
 lm_eval --model hf \
     --model_args pretrained=masakhane/African-ultrachat-alpaca \
-    --tasks afrimmlu_amh,afrimmlu_eng,afrimmlu_ewe,afrimmlu_fra,afrimmlu_hau,afrimmlu_ibo,afrimmlu_kin,afrimmlu_lin,afrimmlu_lug,afrimmlu_orm,afrimmlu_sna,afrimmlu_sot,afrimmlu_twi,afrimmlu_wol,afrimmlu_xho,afrimmlu_yor,afrimmlu_zul \
+    --tasks afrimmlu_direct_amh,afrimmlu_direct_eng,afrimmlu_direct_ewe,afrimmlu_direct_fra,afrimmlu_direct_hau,afrimmlu_direct_ibo,afrimmlu_direct_kin,afrimmlu_direct_lin,afrimmlu_direct_lug,afrimmlu_direct_orm,afrimmlu_direct_sna,afrimmlu_direct_sot,afrimmlu_direct_twi,afrimmlu_direct_wol,afrimmlu_direct_xho,afrimmlu_direct_yor,afrimmlu_direct_zul \
     --device cuda:0 \
     --batch_size 1 \
     --num_fewshot 0 \
......
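Since the rename touches all 17 task names in the `--tasks` argument, the list can be generated rather than typed by hand. The sketch below is an assumption-laden helper, not part of this commit: `LANGS`, `build_tasks`, and `build_command` are hypothetical names, and only the flags visible in the command above are included (the rest of the command is elided in the diff).

```python
import shlex

# Language codes copied from the --tasks list in the command above.
LANGS = [
    "amh", "eng", "ewe", "fra", "hau", "ibo", "kin", "lin", "lug",
    "orm", "sna", "sot", "twi", "wol", "xho", "yor", "zul",
]


def build_tasks(prefix="afrimmlu_direct"):
    """Return the comma-separated task list, e.g. afrimmlu_direct_amh,..."""
    return ",".join(f"{prefix}_{lang}" for lang in LANGS)


def build_command(prefix="afrimmlu_direct"):
    """Return the visible part of the lm_eval invocation as an argv list."""
    return [
        "lm_eval",
        "--model", "hf",
        "--model_args", "pretrained=masakhane/African-ultrachat-alpaca",
        "--tasks", build_tasks(prefix),
        "--device", "cuda:0",
        "--batch_size", "1",
        "--num_fewshot", "0",
    ]


if __name__ == "__main__":
    # Print a shell-quoted command line; nothing is executed here.
    print(shlex.join(build_command()))
```

Passing `prefix="afrimmlu"` reproduces the pre-rename task names, which can help when comparing results across the two naming schemes.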