Commit 592b2a23 authored by Leo Gao, committed by GitHub

Merge pull request #210 from EleutherAI/StellaAthena-patch-1

Updated readme for clarity
parents f6e7ae25 adec7faa
......@@ -13,7 +13,7 @@ The goal of this project is to build a set of tools for evaluating LMs on typica
2. Removing task val/test data from LM training set
3. Adding task training data to LM training set
### Overview of Tasks
### Full Task List
| Task Name |Train|Val|Test|Val/Test Docs| Metrics |
|-------------------------------------------------|-----|---|----|------------:|------------------------------------------------------------------------------|
......@@ -211,9 +211,8 @@ To evaluate a model, (e.g. GPT-2) on NLU tasks (e.g. RTE, Winograd Scheme Challe
```bash
python main.py \
--model gpt2 \
--model_args device=cuda:0 \
--tasks rte,wsc \
--provide_description \
--device cuda:0 \
--tasks lambada,hellaswag \
--num_fewshot 2
```
......@@ -223,11 +222,23 @@ If you have access to an OpenAI API key, you can also evaluate GPT-3 on various
export OPENAI_API_SECRET_KEY=YOUR_KEY_HERE
python main.py \
--model gpt3 \
--tasks rte,wsc \
--tasks lambada,hellaswag \
--provide_description \
--num_fewshot 2
```
Additional arguments can be provided to the model constructor using the `--model_args` flag. Most importantly, the `gpt2` model can be used to load an arbitrary HuggingFace model as follows:
```bash
python main.py \
--model gpt2 \
--model_args pretrained=EleutherAI/gpt-neo-1.3B \
--device cuda:0 \
--tasks lambada,hellaswag \
--num_fewshot 2
```
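As a rough illustration of how a `--model_args` string might be turned into keyword arguments for a model constructor, here is a minimal sketch. The helper name `parse_model_args` is hypothetical, and the comma-separated `key=value` format for passing several arguments is an assumption; the harness's actual parsing may differ.

```python
def parse_model_args(arg_string):
    """Split a string like 'k1=v1,k2=v2' into a dict of string values.

    Hypothetical helper for illustration only; not taken from the harness.
    """
    args = {}
    for pair in arg_string.split(","):
        pair = pair.strip()
        if not pair:
            continue  # ignore empty segments, e.g. from a trailing comma
        # partition splits on the first '=' only, so values may contain '='
        key, _, value = pair.partition("=")
        args[key] = value
    return args


# The value from the command above, as it would reach the constructor
kwargs = parse_model_args("pretrained=EleutherAI/gpt-neo-1.3B")
print(kwargs)
```

All values stay strings in this sketch; a real implementation would likely need to convert numeric or boolean arguments before handing them to the model.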
To inspect what the LM inputs look like, you can run the following command:
```bash
......