Update README.md

44a4c374 · Stella Biderman · GitHub · 082d6db3 · 44a4c374
Unverified Commit 44a4c374 authored May 09, 2023 by Stella Biderman Committed by GitHub May 09, 2023

Showing with 12 additions and 11 deletions

README.md README.md +12 -11

--- a/README.md
+++ b/README.md
@@ -34,14 +34,14 @@ pip install -e ".[multilingual]"

 > **Note**: When reporting results from eval harness, please include the task versions (shown in `results["versions"]`) for reproducibility. This allows bug fixes to tasks while also ensuring that previously reported scores are reproducible. See the [Task Versioning](#task-versioning) section for more info.

-To evaluate a model hosted on the [HuggingFace Hub](https://huggingface.co/models) (e.g. GPT-J-6B) on tasks with names matching the pattern `lambada_*` and `hellaswag` you can use the following command:
+To evaluate a model hosted on the [HuggingFace Hub](https://huggingface.co/models) (e.g. GPT-J-6B) on `hellaswag` you can use the following command:


 ```bash
 python main.py \
    --model hf-causal \
    --model_args pretrained=EleutherAI/gpt-j-6B \
-    --tasks lambada_*,hellaswag \
+    --tasks hellaswag \
    --device cuda:0
 ```

@@ -59,15 +59,6 @@ To evaluate models that are loaded via `AutoSeq2SeqLM` in Huggingface, you inste

 > **Warning**: Choosing the wrong model may result in erroneous outputs despite not erroring.

-To use with [PEFT](https://github.com/huggingface/peft), take the call you would run to evaluate the base model and add `,peft=PATH` to the `model_args` argument as shown below:
-```bash
-python main.py \
-    --model hf-causal-experimental \
-    --model_args pretrained=EleutherAI/gpt-j-6b,peft=nomic-ai/gpt4all-j-lora \
-    --tasks openbookqa,arc_easy,winogrande,hellaswag,arc_challenge,piqa,boolq \
-    --device cuda:0
-```
-
 Our library also supports the OpenAI API:

 ```bash
@@ -106,6 +97,16 @@ This will write out one text file for each task.

 ## Advanced Usage

+For models loaded with the HuggingFace `transformers` library, any arguments provided via `--model_args` get passed to the relevant constructor directly. This means that anything you can do with `AutoModel` can be done with our library. For example, you can pass a local path via `pretrained=` or use models finetuned with [PEFT](https://github.com/huggingface/peft) by taking the call you would run to evaluate the base model and adding `,peft=PATH` to the `model_args` argument:
+```bash
+python main.py \
+    --model hf-causal-experimental \
+    --model_args pretrained=EleutherAI/gpt-j-6b,peft=nomic-ai/gpt4all-j-lora \
+    --tasks openbookqa,arc_easy,winogrande,hellaswag,arc_challenge,piqa,boolq \
+    --device cuda:0
+```
+
+
 We support wildcards in task names; for example, you can run all of the machine-translated lambada tasks via `--tasks lambada_openai_mt_*`.

 We currently only support one prompt per task, which we strive to make the "standard" as defined by the benchmark's authors. If you would like to study how varying prompts causes changes in the evaluation score, check out the [BigScience fork](https://github.com/bigscience-workshop/lm-evaluation-harness) of this repo. We are currently working on upstreaming this capability to `main`.
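
Reviewer note on the wildcard sentence in the last hunk: the following is a minimal sketch of how a comma-separated `--tasks` value containing glob patterns could be expanded against a list of task names. It uses Python's standard `fnmatch` module; the `expand_task_patterns` helper and the short `ALL_TASKS` list are hypothetical and only illustrate the idea, not the harness's actual implementation.

```python
import fnmatch

# Hypothetical subset of registered task names, for illustration only.
ALL_TASKS = [
    "hellaswag",
    "lambada_openai",
    "lambada_openai_mt_de",
    "lambada_openai_mt_fr",
    "piqa",
]


def expand_task_patterns(patterns, all_tasks):
    """Return the sorted task names matching any comma-separated glob pattern."""
    selected = set()
    for pattern in patterns.split(","):
        # fnmatch.filter keeps only the names that match the shell-style pattern.
        selected.update(fnmatch.filter(all_tasks, pattern))
    return sorted(selected)


print(expand_task_patterns("lambada_openai_mt_*,hellaswag", ALL_TASKS))
# → ['hellaswag', 'lambada_openai_mt_de', 'lambada_openai_mt_fr']
```

Note that `lambada_openai` itself is not selected, because the pattern requires the `_mt_` infix. On the command line, quoting the pattern keeps the shell from expanding `*` against local file names before the program sees it.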