Unitxt is a library for customizable textual data preparation and evaluation tailored to generative language models. Unitxt natively integrates with common libraries like HuggingFace and LM-eval-harness and deconstructs processing flows into modular components, enabling easy customization and sharing between practitioners. These components encompass model-specific formats, task prompts, and many other comprehensive dataset processing definitions. These components are centralized in the Unitxt-Catalog, thus fostering collaboration and exploration in modern textual data workflows.
The full Unitxt catalog can be viewed in an [online explorer](https://unitxt.readthedocs.io/en/latest/docs/demo.html).
Read more about Unitxt at [www.unitxt.ai](https://www.unitxt.ai/).
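As a quick illustration of the HuggingFace integration, a Unitxt recipe can be loaded directly through the `datasets` library. This is a minimal sketch; the card and template names below are example entries from the Unitxt catalog and can be swapped for any others:

```python
from datasets import load_dataset

# Build a dataset from a Unitxt recipe string. The card/template names are
# illustrative examples from the Unitxt catalog, not the only valid choices.
dataset = load_dataset(
    "unitxt/data",
    "card=cards.wnli,template=templates.classification.multi_class.relation.default",
    split="test",
    trust_remote_code=True,  # may be required depending on your datasets version
)
```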
### Paper
Title: `Unitxt: Flexible, Shareable and Reusable Data Preparation and Evaluation for Generative AI`
Abstract: `https://arxiv.org/abs/2401.14019`
The full list of Unitxt tasks currently supported can be seen under `tasks/unitxt`.
### Adding tasks
You can add additional tasks from the Unitxt catalog by generating new LM-Eval yaml files for these datasets.
The Unitxt task yaml files are generated via the `generate_yamls.py` script in the `tasks/unitxt` directory.
To add a yaml file for an existing Unitxt dataset that is not yet in LM-Eval:
1. Add the card name to the `unitxt_datasets` file in the `tasks/unitxt` directory.
2. `generate_yamls.py` contains the default Unitxt [template](https://unitxt.readthedocs.io/en/latest/docs/adding_template.html) used for each kind of NLP task in its `default_template_per_task` dictionary. If the dataset belongs to a Unitxt task type not previously used in LM-Eval, you will need to add a default template for it to the dictionary (see the sketch after this list).
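For instance, the mapping pairs a Unitxt task type with a catalog template along these lines. This is a hypothetical sketch; consult `generate_yamls.py` in `tasks/unitxt` for the actual entries:

```python
# Hypothetical sketch of the default_template_per_task mapping; the exact
# task-type keys and template names in generate_yamls.py may differ.
default_template_per_task = {
    "tasks.classification.multi_class": "templates.classification.multi_class.title",
    "tasks.qa.with_context.extractive": "templates.qa.with_context.simple",
    "tasks.summarization.abstractive": "templates.summarization.abstractive.full",
}
```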
For adding novel benchmarks/datasets to the library:
* [ ] Is the task an existing benchmark in the literature?
* [ ] Have you referenced the original paper that introduced the task?
* [ ] If yes, does the original paper provide a reference implementation? If so, have you checked against the reference implementation and documented how to run such a test?
If other tasks on this dataset are already supported:
* [ ] Is the "Main" variant of this task clearly denoted?
* [ ] Have you provided a short sentence in a README on what each new variant adds / evaluates?
* [ ] Have you noted which, if any, published evaluation setups are matched by this variant?
See the [adding tasks guide](https://www.unitxt.ai/en/latest/docs/lm_eval.html).
In the dynamic landscape of generative NLP, traditional text processing pipelines limit research flexibility and reproducibility, as they are tailored to specific dataset, task, and model combinations. The escalating complexity, involving system prompts, model-specific formats, instructions, and more, calls for a shift to a structured, modular, and customizable solution.
Addressing this need, we present Unitxt, an innovative library for customizable textual data preparation and evaluation tailored to generative language models. Unitxt natively integrates with common libraries like HuggingFace and LM-eval-harness and deconstructs processing flows into modular components, enabling easy customization and sharing between practitioners. These components encompass model-specific formats, task prompts, and many other comprehensive dataset processing definitions. The Unitxt-Catalog centralizes these components, fostering collaboration and exploration in modern textual data workflows. Beyond being a tool, Unitxt is a community-driven platform, empowering users to build, share, and advance their pipelines collaboratively.
"""
import importlib.util
import re
from functools import partial
from typing import Any, Dict, Optional

import datasets
import evaluate

from lm_eval.api.instance import Instance
...
...
_CITATION = """
"""
def is_unitxt_installed() -> bool:
    return importlib.util.find_spec("unitxt") is not None
def score(items, metric):
    # Each item is a (prediction, reference) pair collected across instances.
    predictions, references = zip(*items)
    # Unitxt's metric implementation, loaded from the HuggingFace Hub.
    evaluator = evaluate.load("unitxt/metric")
...
...
class Unitxt(ConfigurableTask):
    def __init__(
        self,
        config: Optional[dict] = None,
    ) -> None:
        if config is None:
            config = {}
        assert "recipe" in config, "Unitxt task must have a 'recipe' string."