This project provides a unified framework to test generative language models on a large number of different evaluation tasks.
**Features:**
- 200+ tasks implemented. See the [task-table](./docs/task_table.md) for a complete list.
- Support for the Hugging Face `transformers` library, GPT-NeoX, Megatron-DeepSpeed, and the OpenAI API, with a flexible, tokenization-agnostic interface (see the usage sketch after this list).
- Support for evaluation on adapters (e.g. LoRA) supported in [HuggingFace's PEFT library](https://github.com/huggingface/peft).
- Task versioning to ensure reproducibility.
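
For orientation, the sketch below shows what a programmatic evaluation run can look like with the Hugging Face backend. The `simple_evaluate` entry point, backend name, and argument names are assumptions about the harness's Python API rather than confirmed usage; check the repository's documentation for the exact interface.

```python
# Hedged sketch: the entry point, backend name, and argument names below are
# assumptions about the harness's Python API; verify them against the
# repository docs before relying on them.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",                                      # Hugging Face `transformers` backend
    model_args="pretrained=EleutherAI/pythia-160m",  # any causal LM on the Hub
    tasks=["hellaswag", "arc_easy"],                 # task names from docs/task_table.md
    num_fewshot=0,
)
print(results["results"])  # per-task metric values
```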
## Evaluation Overview
`Task` and `Prompt` classes contain information that, when combined, produces the input to the language model. The language model is then queried to obtain an output. One or more `Filter`s can then be applied to perform arbitrary operations on the model's raw output, such as extracting the final answer (e.g. for chain-of-thought prompting) or calling an external API. This filtered output is then evaluated using a `Metric` to obtain the final result.
```mermaid
graph LR;
classDef empty width:0px,height:0px;
T[Task]
I[Input]
F[Filter]
M[Model]
O[Output]:::empty
P[Prompt]
Me[Metric]
R[Result]
T --- I:::empty
P --- I
I --> M
M --> O
O --> F
Me --> R:::empty
F --> R
```
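
To make the flow above concrete, here is a minimal, self-contained sketch of the same pipeline. All names in it (`ToyTask`, `take_last_sentence`, `exact_match`, etc.) are hypothetical stand-ins for illustration; they are not the harness's actual classes.

```python
# Hypothetical, self-contained illustration of the Task/Prompt -> Model ->
# Filter -> Metric flow shown in the diagram above. These are stand-in names,
# not the harness's real classes.
from dataclasses import dataclass


@dataclass
class ToyTask:
    question: str
    answer: str


def build_input(task: ToyTask, prompt: str) -> str:
    # Task + Prompt -> Input
    return f"{prompt}\nQ: {task.question}\nA:"


def query_model(model_input: str) -> str:
    # Input -> Model -> raw Output (a real run would call a language model here)
    return "Let me think step by step... The answer is 4."


def take_last_sentence(raw_output: str) -> str:
    # Filter: extract the final answer from a chain-of-thought style output
    return raw_output.split("answer is")[-1].strip(" .")


def exact_match(prediction: str, reference: str) -> float:
    # Metric: filtered Output + reference -> Result
    return float(prediction == reference)


task = ToyTask(question="What is 2 + 2?", answer="4")
model_input = build_input(task, prompt="Answer the question.")
raw_output = query_model(model_input)
prediction = take_last_sentence(raw_output)
print(exact_match(prediction, task.answer))  # 1.0
```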
## Install
To install `lm-eval` from the main branch of the GitHub repository, run: