Commit 9492b6e7 authored by lintangsutawika

merged with weight_by_size

parents 86ec5a53 c4594d27
@@ -232,6 +232,10 @@ If you would like to run evaluation on all prompt templates, you can simply call
use_prompt: "promptsource:*"
```
### Weighting evaluation based on task size
By default, the scores of the tasks in a group are aggregated by a simple average (for example, in a group of 2 tasks that share a metric, the two scores are summed and divided by 2 to give the group metric). You may instead want each task's score to count in proportion to its size. To do this, set `weight_by_size` to `True` in the task config, and that task's scores will be weighted by the number of samples it has.
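The difference between the two aggregation modes can be illustrated with a minimal sketch. The helper names `simple_average` and `size_weighted_average` and the example numbers are hypothetical and chosen only for illustration; this is not the harness's own aggregation code.

```python
# Illustrative sketch only: these helpers are hypothetical and are not
# part of the evaluation harness.

def simple_average(scores):
    """Default behaviour: every task counts equally."""
    return sum(scores) / len(scores)


def size_weighted_average(scores, sizes):
    """Weight each task's score by its number of samples."""
    total = sum(sizes)
    return sum(score * size for score, size in zip(scores, sizes)) / total


# Two tasks sharing one metric: task A has 300 samples, task B has 100.
scores = [0.5, 1.0]
sizes = [300, 100]

print(simple_average(scores))                # 0.75
print(size_weighted_average(scores, sizes))  # 0.625
```

In this example the larger task pulls the group score toward its own value (0.5), which is the effect `weight_by_size` is meant to provide.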
### Setting metrics
You're almost done! Now we need to choose how to score our task.
......
@@ -95,7 +95,7 @@ class TaskConfig(dict):
    filter_list: Union[str, list] = None
    should_decontaminate: bool = False
    doc_to_decontamination_query: str = None
    weight_by_size: bool = False
    metadata: Union[
        str, list
    ] = None  # by default, not used in the code. allows for users to pass arbitrary info to tasks
......
@@ -485,7 +485,7 @@ def evaluate(
 if "alias" in metrics:
     metrics.pop("alias")
-if weight_by_size:
+if configs[task]["weight_by_size"]:
     current_size = metrics.pop("samples")
 else:
     metrics.pop("samples")
......