Unverified Commit 3a4e4674 authored by Minho Ryu's avatar Minho Ryu Committed by GitHub
Browse files

apply precommit (#2636)

parent 6dac8c69
...@@ -6,9 +6,26 @@ Title: `Global MMLU: Understanding and Addressing Cultural and Linguistic Biases ...@@ -6,9 +6,26 @@ Title: `Global MMLU: Understanding and Addressing Cultural and Linguistic Biases
Abstract: [https://arxiv.org/abs/2412.03304](https://arxiv.org/abs/2412.03304) Abstract: [https://arxiv.org/abs/2412.03304](https://arxiv.org/abs/2412.03304)
Global-MMLU 🌍 is a multilingual evaluation set spanning 42 languages, including English. This dataset combines machine translations for MMLU questions along with professional translations and crowd-sourced post-edits. It also includes cultural sensitivity annotations for a subset of the questions (2850 questions per language) and classifies them as Culturally Sensitive (CS) 🗽 or Culturally Agnostic (CA) ⚖️. These annotations were collected as part of an open science initiative led by Cohere For AI in collaboration with many external collaborators from both industry and academia.
Global-MMLU-Lite is a balanced collection of culturally sensitive and culturally agnostic MMLU tasks. It is designed for efficient evaluation of multilingual models in 15 languages (including English). Only languages with human translations and post-edits in the original [Global-MMLU](https://huggingface.co/datasets/CohereForAI/Global-MMLU) 🌍 dataset have been included in the lite version. Global-MMLU-Lite is a balanced collection of culturally sensitive and culturally agnostic MMLU tasks. It is designed for efficient evaluation of multilingual models in 15 languages (including English). Only languages with human translations and post-edits in the original [Global-MMLU](https://huggingface.co/datasets/CohereForAI/Global-MMLU) 🌍 dataset have been included in the lite version.
Homepage: [https://huggingface.co/datasets/CohereForAI/Global-MMLU-Lite](https://huggingface.co/datasets/CohereForAI/Global-MMLU-Lite) Homepage: \
[https://huggingface.co/datasets/CohereForAI/Global-MMLU](https://huggingface.co/datasets/CohereForAI/Global-MMLU) \
[https://huggingface.co/datasets/CohereForAI/Global-MMLU-Lite](https://huggingface.co/datasets/CohereForAI/Global-MMLU-Lite)
#### Groups
* `global_mmlu_{lang}`: This group uses `Global-MMLU-Lite` benchmark which supports 14 languages.
* `global_mmlu_full_{lang}`: This group uses `Global-MMLU` benchmark which supports 42 languages.
#### Subgroups (support only for `full` version)
* `global_mmlu_full_stem`
* `global_mmlu_full_humanities`
* `global_mmlu_full_social_sciences`
* `global_mmlu_full_other`
### Citation ### Citation
......
dataset_path: CohereForAI/Global-MMLU
dataset_name: am
test_split: test
fewshot_split: dev
fewshot_config:
sampler: first_n
output_type: multiple_choice
doc_to_text: "{{question.strip()}}\nA. {{option_a}}\nB. {{option_b}}\nC. {{option_c}}\nD. {{option_d}}\nAnswer:"
doc_to_choice: ["A", "B", "C", "D"]
doc_to_target: answer
metric_list:
- metric: acc
aggregation: mean
higher_is_better: true
metadata:
version: 0.0
group: global_mmlu_full_am
task:
- global_mmlu_full_am_stem
- global_mmlu_full_am_other
- global_mmlu_full_am_social_sciences
- global_mmlu_full_am_humanities
aggregate_metric_list:
- metric: acc
weight_by_size: True
metadata:
version: 1.0
Markdown is supported
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment