Commit 0e26c0bd authored by lintangsutawika

update to match t5 prompt template

parent d46d792d
@@ -6,7 +6,7 @@ dataset_name: boolq
 training_split: train
 validation_split: validation
 output_type: greedy_until
-doc_to_text: "boolq question: {{question}} passage: {{passage}}"
+doc_to_text: "boolq passage: {{passage}} question: {{question}}"
 doc_to_target: label
 doc_to_choice: ['False', 'True']
 metric_list:
......
@@ -6,7 +6,7 @@ dataset_name: record
 training_split: train
 validation_split: validation
 output_type: greedy_until
-doc_to_text: "record query: {{query}} entities: {{entities}} passage: {{passage}}"
+doc_to_text: "record query: {{query}} entities: {{entities|join(\", \")}} passage: {{passage}}"
 doc_to_target: "{{answers}}"
 metric_list:
   - metric: exact_match
......
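To make the effect of the `join(", ")` filter in the record template concrete, here is a minimal plain-Python stand-in. It is not the harness's rendering code: it assumes the template engine prints a list field using its Python string form when no filter is applied, and the document below is a hypothetical example rather than a real dataset row.

```python
# Hypothetical record-style document; the field values are illustrative only.
doc = {
    "query": "@placeholder is the capital.",
    "entities": ["Paris", "France"],
    "passage": "Paris is in France.",
}


def render_before(doc):
    # Assumption: without a filter, the list is rendered via str(list).
    return "record query: {} entities: {} passage: {}".format(
        doc["query"], str(doc["entities"]), doc["passage"]
    )


def render_after(doc):
    # Plain-Python equivalent of the template's `entities|join(", ")`.
    return "record query: {} entities: {} passage: {}".format(
        doc["query"], ", ".join(doc["entities"]), doc["passage"]
    )


print(render_before(doc))
print(render_after(doc))
```

Under that assumption, the first line shows the entities as `['Paris', 'France']` and the second as `Paris, France`.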
@@ -6,7 +6,7 @@ dataset_name: rte
 training_split: train
 validation_split: validation
 output_type: greedy_until
-doc_to_text: "rte premise: {{premise}} hypothesis: {{hypothesis}}"
+doc_to_text: "rte hypothesis: {{hypothesis}} premise: {{premise}}"
 doc_to_target: label
 doc_to_choice: ['entailment', 'not_entailment']
 metric_list:
......
@@ -7,15 +7,12 @@ def t5_prompt_doc_to_text(x):
         pattern_tmpl = r"^((?:\S+\s){N})(W)"
         pattern = re.sub("N", str(span_idx), pattern_tmpl)
         pattern = re.sub("W", span_str, pattern)
-        return re.sub(pattern, r"\1{0} \2 {0}".format(mark), text)
+        return re.sub(pattern, r"\1{0}\2{0}".format(mark), text)
     text = x["text"]
-    text = _mark_span(text, x["span1_text"], x["span1_index"], "*")
-    # Compensate for 2 added "words" added in previous step.
-    span2_index = x["span2_index"] + 2 * (x["span1_index"] < x["span2_index"])
-    text = _mark_span(text, x["span2_text"], span2_index, "#")
-    return text
+    text = _mark_span(text, x["span2_text"], x["span2_index"], "*")
+    return "wsc: "+text
 def default_doc_to_text(x):
......
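The post-commit behaviour of `t5_prompt_doc_to_text` can be sketched as a standalone function. The `_mark_span` signature is not visible in the hunk; it is inferred here from the call site and the names used in the body, so treat it as an assumption. The sample document is hypothetical.

```python
import re


def t5_prompt_doc_to_text(x):
    # Assumed signature: the hunk shows only the body of this helper.
    def _mark_span(text, span_str, span_idx, mark):
        # Skip `span_idx` whitespace-delimited words, then match the span.
        pattern_tmpl = r"^((?:\S+\s){N})(W)"
        pattern = re.sub("N", str(span_idx), pattern_tmpl)
        pattern = re.sub("W", span_str, pattern)
        # Wrap the span in `mark` with no surrounding spaces.
        return re.sub(pattern, r"\1{0}\2{0}".format(mark), text)

    text = x["text"]
    # Only the second span (the pronoun) is marked after this commit.
    text = _mark_span(text, x["span2_text"], x["span2_index"], "*")
    return "wsc: " + text


# Hypothetical WSC-style document, for illustration only.
doc = {
    "text": "The trophy does not fit in the suitcase because it is too big.",
    "span2_text": "it",
    "span2_index": 9,
}
print(t5_prompt_doc_to_text(doc))
# wsc: The trophy does not fit in the suitcase because *it* is too big.
```

Note that `span_str` is substituted into the pattern without escaping, so a span containing regex metacharacters may not match as intended; a caller could pass it through `re.escape` if that matters for their data.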
@@ -7,7 +7,7 @@ training_split: train
 validation_split: validation
 output_type: greedy_until
 doc_to_text: !function "preprocess_wsc.t5_prompt_doc_to_text"
-doc_to_target: "{{[span1_text, span2_text][label]}}"
+doc_to_target: span1_text
 metric_list:
   - metric: exact_match
     aggregation: mean
......