Commit 2411c9d3 authored by Benjamin Fattori

update race preprocess + rename dataset

parent 7feab5e3
+import ast
+
+def process_ast(string):
+    return ast.literal_eval(string)
+
 def last_problem(doc):
-    return doc["problems"][-1]
+    return process_ast(doc["problems"])[-1]

 def get_answer_option(problem):
     letter_to_num = {"A": 0, "B": 1, "C": 2, "D": 3}
......
@@ -13,7 +18,7 @@ def create_choices(doc):
 def doc_to_text(doc):
     text = "Article: " + doc["article"] + "\n\n"
-    for problem in doc["problems"][:-1]:
+    for problem in process_ast(doc["problems"])[:-1]:
         if problem["question"][-6:] == " _ .":
             text += (
                 problem["question"][-5:] + get_answer_option(problem) + "\n"
......
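
Note on the change above: wrapping `doc["problems"]` in `process_ast` suggests that the renamed dataset stores the `problems` field as a string containing a Python literal, not as a native list. The following is a minimal sketch of that behaviour under that assumption. The sample record and the `answer` and `options` field names are hypothetical and are not taken from the dataset.

```python
import ast


def process_ast(string):
    # Safely parse a string holding a Python literal (list, dict, str, ...)
    # back into the object it describes; no arbitrary code is evaluated.
    return ast.literal_eval(string)


def last_problem(doc):
    # Mirrors the updated helper in the diff: parse first, then index.
    return process_ast(doc["problems"])[-1]


# Hypothetical record: "problems" is serialized as a string, which is the
# situation the commit appears to handle. Field values are made up.
doc = {
    "article": "Some passage text.",
    "problems": str(
        [
            {"question": "What is the passage about?", "answer": "B",
             "options": ["w", "x", "y", "z"]},
            {"question": "Who is the author?", "answer": "A",
             "options": ["p", "q", "r", "s"]},
        ]
    ),
}

problems = process_ast(doc["problems"])
final = last_problem(doc)
```

Without the parsing step, `doc["problems"][-1]` on such a record would return the last character of the string instead of the last problem, which is presumably why both `last_problem` and `doc_to_text` were updated together.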
 group:
   - multiple_choice
 task: race
-dataset_path: bfattori/race_grouped
+dataset_path: bfattori/race
 dataset_name: high
 output_type: multiple_choice
 training_split: train
 validation_split: validation
 test_split: test
 create_choices: !function preprocess_race.create_choices
 doc_to_text: !function preprocess_race.doc_to_text
......