The 2020 Bilingual, Bi-Directional WebNLG+ Shared Task:
Overview and Evaluation Results (WebNLG+ 2020)
https://aclanthology.org/2020.webnlg-1.7/
WebNLG+ offers two challenges: (i) mapping sets of RDF triples
to English or Russian text (generation) and (ii) converting
English or Russian text to sets of RDF triples (semantic parsing).
Compared to the eponymous WebNLG challenge, WebNLG+ provides an
extended dataset that enable the training, evaluation, and
comparison of microplanners and semantic parsers. In this paper,
we present the results of the generation and semantic parsing
task for both English and Russian and provide a brief
description of the participating systems.
"""
from lm_eval.base import PromptSourceTask
_CITATION = """
@inproceedings{castro-ferreira-etal-2020-2020,
title = "The 2020 Bilingual, Bi-Directional {W}eb{NLG}+ Shared Task: Overview and Evaluation Results ({W}eb{NLG}+ 2020)",
author = "Castro Ferreira, Thiago and
Gardent, Claire and
Ilinykh, Nikolai and
van der Lee, Chris and
Mille, Simon and
Moussallem, Diego and
Shimorina, Anastasia",
booktitle = "Proceedings of the 3rd International Workshop on Natural Language Generation from the Semantic Web (WebNLG+)",
month = "12",
year = "2020",
address = "Dublin, Ireland (Virtual)",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2020.webnlg-1.7",
pages = "55--76",
abstract = "WebNLG+ offers two challenges: (i) mapping sets of RDF triples to English or Russian text (generation) and (ii) converting English or Russian text to sets of RDF triples (semantic parsing). Compared to the eponymous WebNLG challenge, WebNLG+ provides an extended dataset that enable the training, evaluation, and comparison of microplanners and semantic parsers. In this paper, we present the results of the generation and semantic parsing task for both English and Russian and provide a brief description of the participating systems.",
}
"""


class WebNLG(PromptSourceTask):
    VERSION = 0
    DATASET_PATH = "GEM/web_nlg"
    DATASET_NAME = "en"
    SPLIT = None

    def has_training_docs(self):
        return False
...
@@ -27,11 +65,71 @@ class WebNLG(PromptSourceTask):
title = "Gender Bias in Coreference Resolution: Evaluation and Debiasing Methods",
author = "Zhao, Jieyu and
Wang, Tianlu and
Yatskar, Mark and
Ordonez, Vicente and
Chang, Kai-Wei",
booktitle = "Proceedings of the 2018 Conference of the North {A}merican Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers)",
month = jun,
year = "2018",
address = "New Orleans, Louisiana",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/N18-2003",
doi = "10.18653/v1/N18-2003",
pages = "15--20",
abstract = "In this paper, we introduce a new benchmark for co-reference resolution focused on gender bias, WinoBias. Our corpus contains Winograd-schema style sentences with entities corresponding to people referred by their occupation (e.g. the nurse, the doctor, the carpenter). We demonstrate that a rule-based, a feature-rich, and a neural coreference system all link gendered pronouns to pro-stereotypical entities with higher accuracy than anti-stereotypical entities, by an average difference of 21.1 in F1 score. Finally, we demonstrate a data-augmentation approach that, in combination with existing word-embedding debiasing techniques, removes the bias demonstrated by these systems in WinoBias without significantly affecting their performance on existing datasets.",
}
"""


class WinoBias(PromptSourceTask):
    VERSION = 0
    DATASET_PATH = "wino_bias"

    def has_training_docs(self):
        return False

    def has_validation_docs(self):
        return True

    def has_test_docs(self):
        return True

    def training_docs(self):
        # has_training_docs() returns False, so no training documents are provided.
        pass

    def validation_docs(self):
        return self.dataset["validation"]

    def test_docs(self):
        return self.dataset["test"]

    def stopping_criteria(self):
        return "\n"

    def process_results(self, doc, results):
        """Take a single document and the LM results and evaluate them, returning a
        dict where keys are the names of submetrics and values are the values of
        the metric for that one document.

        :param doc:
            The document as returned from training_docs, validation_docs, or test_docs.
        :param results:
            The results of the requests created in construct_requests.