# Arithmetic

### Paper

Title: `Language Models are Few-Shot Learners`
Abstract: https://arxiv.org/abs/2005.14165

A small battery of 10 tests, each of which asks a language model to solve a simple
arithmetic problem posed in natural language.

Homepage: https://github.com/openai/gpt-3/tree/master/data

### Citation

```
@inproceedings{NEURIPS2020_1457c0d6,
    author = {Brown, Tom and Mann, Benjamin and Ryder, Nick and Subbiah, Melanie and Kaplan, Jared D and Dhariwal, Prafulla and Neelakantan, Arvind and Shyam, Pranav and Sastry, Girish and Askell, Amanda and Agarwal, Sandhini and Herbert-Voss, Ariel and Krueger, Gretchen and Henighan, Tom and Child, Rewon and Ramesh, Aditya and Ziegler, Daniel and Wu, Jeffrey and Winter, Clemens and Hesse, Chris and Chen, Mark and Sigler, Eric and Litwin, Mateusz and Gray, Scott and Chess, Benjamin and Clark, Jack and Berner, Christopher and McCandlish, Sam and Radford, Alec and Sutskever, Ilya and Amodei, Dario},
    booktitle = {Advances in Neural Information Processing Systems},
    editor = {H. Larochelle and M. Ranzato and R. Hadsell and M. F. Balcan and H. Lin},
    pages = {1877--1901},
    publisher = {Curran Associates, Inc.},
    title = {Language Models are Few-Shot Learners},
    url = {https://proceedings.neurips.cc/paper/2020/file/1457c0d6bfcb4967418bfb8ac142f64a-Paper.pdf},
    volume = {33},
    year = {2020}
}
```

### Groups, Tags, and Tasks

#### Tags

* `arithmetic`: Evaluates all ten tasks, `arithmetic_1dc` through `arithmetic_5ds`

#### Tasks

* `arithmetic_1dc`
* `arithmetic_2da`
* `arithmetic_2dm`
* `arithmetic_2ds`
* `arithmetic_3da`
* `arithmetic_3ds`
* `arithmetic_4da`
* `arithmetic_4ds`
* `arithmetic_5da`
* `arithmetic_5ds`
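
The task names follow a compact pattern: a digit count, the letter `d`, and a one-letter operation code. The sketch below decodes that pattern. It is an illustration only: `describe_task` and the `OPERATIONS` table are hypothetical helpers that are not part of this library, and the suffix meanings are assumed from the arithmetic section of the paper cited above.

```python
import re

# Assumed suffix meanings, taken from the GPT-3 paper's arithmetic tasks.
# "c" stands for the one-digit composite task, which combines three
# single-digit numbers, e.g. "What is 6+(4*8)?".
OPERATIONS = {
    "a": "addition",
    "s": "subtraction",
    "m": "multiplication",
    "c": "composite",
}

# The ten task names listed in this README.
TASKS = [
    "arithmetic_1dc", "arithmetic_2da", "arithmetic_2dm", "arithmetic_2ds",
    "arithmetic_3da", "arithmetic_3ds", "arithmetic_4da", "arithmetic_4ds",
    "arithmetic_5da", "arithmetic_5ds",
]


def describe_task(name: str) -> str:
    """Turn a task name such as 'arithmetic_2da' into a short description."""
    match = re.fullmatch(r"arithmetic_(\d)d([asmc])", name)
    if match is None:
        raise ValueError(f"not an arithmetic task name: {name!r}")
    digits, op = match.groups()
    return f"{digits}-digit {OPERATIONS[op]}"


if __name__ == "__main__":
    for task in TASKS:
        print(f"{task}: {describe_task(task)}")
```

Under these assumptions, `arithmetic_2da` reads as 2-digit addition and `arithmetic_1dc` as the 1-digit composite task.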

### Checklist

For adding novel benchmarks/datasets to the library:
* [ ] Is the task an existing benchmark in the literature?
  * [ ] Have you referenced the original paper that introduced the task?
  * [ ] If yes, does the original paper provide a reference implementation? If so, have you checked against the reference implementation and documented how to run such a test?


If other tasks on this dataset are already supported:
* [ ] Is the "Main" variant of this task clearly denoted?
* [ ] Have you provided a short sentence in a README on what each new variant adds / evaluates?
* [ ] Have you noted which, if any, published evaluation setups are matched by this variant?