Commit fcb39a5a authored by Lintang Sutawika, committed by GitHub

doc_to_decontamination_query can use function (#1082)

* doc_to_decontamination_query can use function

* add option for doc_to_decontamination_query to follow doc_to_text

* added documentation for doc_to_decontamination_query

* adjust description

* format
parent a2ed953f
@@ -46,8 +46,8 @@ Scoring details:
 - **generation_kwargs** (`dict`, *optional*) — Auxiliary arguments for the `generate` function from HF transformers library. Advanced keyword arguments may not be supported for non-HF LM classes.
 - **repeats** (`int`, *optional*, defaults to 1) — Number of repeated runs through model for each sample. can be used for cases such as self-consistency.
 - **filter_list** (`Union[str, list]`, *optional*) — List of filters to postprocess model outputs. See below for further detail on the filter API.
-- **should_decontaminate** (`bool`, *optional*, defaults to False) -
-- **doc_to_decontamination_query** (`str`, *optional*) —
+- **should_decontaminate** (`bool`, *optional*, defaults to False) - Whether to decontaminate or not.
+- **doc_to_decontamination_query** (`str`, *optional*) — Query for decontamination if `should_decontaminate` is True. If `should_decontaminate` is True but `doc_to_decontamination_query` is `None`, `doc_to_decontamination_query` will follow `doc_to_text`.
 Other:
 - **metadata** (`Union[str, list]`, *optional*) — An optional field where arbitrary metadata can be passed. A good example would be `version` that is used to denote the version of the yaml config.
......
@@ -831,12 +831,20 @@ class ConfigurableTask(Task):
     def doc_to_decontamination_query(self, doc):
         if self.config.should_decontaminate:
-            if self.config.doc_to_decontamination_query in self.features:
-                return doc[self.config.doc_to_decontamination_query]
-            else:
-                return ast.literal_eval(
-                    utils.apply_template(self.config.doc_to_decontamination_query, doc)
-                )
+            if self.config.doc_to_decontamination_query is None:
+                return self.doc_to_text(doc)
+            else:
+                doc_to_decontamination_query = self.config.doc_to_decontamination_query
+                if doc_to_decontamination_query in self.features:
+                    return doc[doc_to_decontamination_query]
+                elif callable(doc_to_decontamination_query):
+                    return doc_to_decontamination_query(doc)
+                else:
+                    return ast.literal_eval(
+                        utils.apply_template(
+                            self.config.doc_to_decontamination_query, doc
+                        )
+                    )
     def _process_doc(self, doc):
         """
......
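The lookup order introduced by this commit can be sketched outside the class as a plain function. This is only an illustration under stated assumptions: `resolve_query`, `render_template`, and the sample `doc` are hypothetical names that do not appear in the repository, and `render_template` uses `str.format` as a stand-in for `utils.apply_template`, whose real templating behavior is not reproduced here. The branch order mirrors the new hunk: fall back to `doc_to_text` when no query is configured, then try a column name, then a callable, then a template whose rendered text is parsed with `ast.literal_eval`.

```python
import ast
from typing import Any, Callable, Dict, List, Optional, Union

Doc = Dict[str, Any]
Query = Optional[Union[str, Callable[[Doc], Any]]]


def render_template(template: str, doc: Doc) -> str:
    # Hypothetical stand-in for utils.apply_template (assumption: str.format
    # is used here purely for illustration).
    return template.format(**doc)


def resolve_query(
    doc: Doc,
    query: Query,
    features: List[str],
    doc_to_text: Callable[[Doc], str],
    should_decontaminate: bool = True,
) -> Any:
    """Hypothetical standalone mirror of the branch order in the new hunk."""
    if not should_decontaminate:
        # The method in the diff falls through and returns None in this case.
        return None
    if query is None:
        # No query configured: follow doc_to_text.
        return doc_to_text(doc)
    if query in features:
        # Query names a dataset column: return that field.
        return doc[query]
    if callable(query):
        # Query is a function: call it on the document.
        return query(doc)
    # Otherwise treat the query as a template that renders to a Python literal.
    return ast.literal_eval(render_template(query, doc))


doc = {"question": "What is 2 + 2?", "answer": "4"}
features = list(doc)


def to_text(d: Doc) -> str:
    return "Q: " + d["question"]
```

With these sample values, passing `None` as the query returns the `doc_to_text` output, passing `"question"` returns the column value, and passing a function returns whatever that function computes from the document.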