Commits · 4dbde45aa93b290ae57775648d5e7d77cf075e4a · gaoqiong / lm-evaluation-harness

25 Dec, 2020 1 commit
- First pass at StoryCloze evaluation script · 4dbde45a
  uyhcire authored Dec 24, 2020
  
24 Dec, 2020 1 commit
- Basic setup · 90e02bca
  uyhcire authored Dec 24, 2020
  
01 Dec, 2020 1 commit
- Merge pull request #63 from EleutherAI/refactor_tokenization · 61ff104e
  Stella Biderman authored Dec 01, 2020
```
Refactor to remove generate and fix some bad tokenization.
```
30 Nov, 2020 11 commits
- Update docstring · 75db3899
  Leo Gao authored Nov 30, 2020
  
- Remove num_tokens · e3031e84
  Leo Gao authored Nov 30, 2020
  
- Refactor to remove generate and fix some bad tokenization · 90e50b4c
  Leo Gao authored Nov 30, 2020
```
In particular, the following assumptions are FALSE in general:
tokenize(context + continuation) = tokenize(context) + tokenize(continuation)
len(tokenize(context + continuation)) = len(tokenize(context)) + len(tokenize(continuation))
tokenize(context + continuation)[:len(tokenize(context))] = tokenize(context)

So we need to tip-toe around the problem by being careful with how we do it.

In particular, using GPT2TokenizerFast is not just for performance: the behavior of GPT2Tokenizer differs between Transformers 2 and 3, while that of GPT2TokenizerFast does not.
```
- Make fewshot_examples fast · 6de520af
  Leo Gao authored Nov 30, 2020
  
- Undo MNLI changes · ff3adfe2
  Leo Gao authored Nov 29, 2020
  
- Merge branch 'master' of github.com:EleutherAI/lm_evaluation_harness · 70e30a52
  Leo Gao authored Nov 29, 2020
```
# Conflicts:
#	write_out.py
```
- Fix MNLIMismatched · 26d59f34
  Leo Gao authored Nov 29, 2020
  
- Allow specifying sets for write_out · 49cc6f5d
  Leo Gao authored Nov 29, 2020
  
- Allow specifying sets for write_out · d4ae0c00
  Leo Gao authored Nov 29, 2020
  
- Fix MNLI train set · 1c9432de
  Leo Gao authored Nov 29, 2020
  
- Remove old coqa file · 5076d212
  Leo Gao authored Nov 29, 2020
  
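The tokenization caveat in the message of 90e50b4c above can be illustrated with a small sketch. It assumes a toy greedy longest-match tokenizer with a four-entry vocabulary; `VOCAB`, `MAX_LEN`, and `tokenize` are hypothetical names chosen for illustration, not the GPT-2 tokenizer and not code from this repository. The same kind of mismatch can occur with any tokenizer that merges characters across the context/continuation boundary.

```python
# Toy greedy longest-match tokenizer (hypothetical; NOT the GPT-2 BPE tokenizer).
VOCAB = {"hello", " world", " wor", "ld"}
MAX_LEN = max(len(piece) for piece in VOCAB)


def tokenize(text):
    """Split text into the longest vocabulary pieces, scanning left to right."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink until a piece matches.
        for size in range(min(MAX_LEN, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if piece in VOCAB:
                tokens.append(piece)
                i += size
                break
        else:
            raise ValueError(f"cannot tokenize {text[i:]!r}")
    return tokens


context = "hello wor"
continuation = "ld"

whole = tokenize(context + continuation)                # ['hello', ' world']
separate = tokenize(context) + tokenize(continuation)   # ['hello', ' wor', 'ld']

# Each of the three assumptions listed in the commit message fails here.
print(whole == separate)                                    # False
print(len(whole) == len(separate))                          # False
print(whole[:len(tokenize(context))] == tokenize(context))  # False
```

In this toy case the boundary between context and continuation falls inside the piece `" world"`, so slicing the joint tokenization at `len(tokenize(context))` does not recover the context tokens.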
29 Nov, 2020 1 commit
- Allow num_examples to fetch all if num_examples < 0 · a1ef56aa
  Leo Gao authored Nov 29, 2020
  
23 Nov, 2020 6 commits
- Merge pull request #59 from EleutherAI/dev · 15c87e9b
  Stella Biderman authored Nov 23, 2020
```
Changing implementation of the --limit flag
```
- Merge branch 'master' into dev · 6c93185c
  Stella Biderman authored Nov 23, 2020
  
- Changed arg name to match master · 37fab200
  Stella Biderman authored Nov 23, 2020
  
- added import statement · 760788d5
  Stella Biderman authored Nov 23, 2020
  
- Added truncation flag · 3f5d4beb
  Stella Biderman authored Nov 23, 2020
```
This should allow the user to only do the first few eval tasks.
```
- Add --limit flag · fb5aaf51
  Leo Gao authored Nov 23, 2020
  
31 Oct, 2020 1 commit
- Put gpt2 in eval mode · 8d7d2132
  Leo Gao authored Oct 31, 2020
  
26 Oct, 2020 1 commit
- Merge pull request #57 from cfoster0/sat_analogies · 0d291df6
  Stella Biderman authored Oct 26, 2020
```
Add SAT Analogy dataset
```
25 Oct, 2020 2 commits
- Add SAT analogies dataset. Manual download needed. Checksums currently not... · 8b83f341
  Charles Foster authored Oct 25, 2020
```
Add SAT analogies dataset. Manual download needed. Checksums currently not verified, but hash is included as a comment.
```
- Merge pull request #3 from EleutherAI/master · 4a294d8a
  Charles Foster authored Oct 25, 2020
```
Sync up to EAI
```
24 Oct, 2020 15 commits
- Merge pull request #56 from anishthite/master · 946cb2bc
  Stella Biderman authored Oct 24, 2020
```
add rte text
```
- update lambada for new dataset class · 8e4ff678
  Anish Thite authored Oct 24, 2020
  
- add rte text · d4daf44c
  Anish Thite authored Oct 24, 2020
  
- Merge pull request #52 from cfoster0/natural_questions · dcadfa7c
  Stella Biderman authored Oct 24, 2020
```
Add Natural Questions
```
- Merge pull request #55 from anishthite/master · 9eec7ce8
  Stella Biderman authored Oct 24, 2020
```
Add piqa and implement include_target for triviaqa, storycloze, coqa
```
- add no target for triviaqa · 171f2924
  Anish Thite authored Oct 24, 2020
  
- add no target for storycloze · c9f3b113
  Anish Thite authored Oct 24, 2020
  
- add include target for coqa · e3c2e692
  Anish Thite authored Oct 24, 2020
  
- add piqa text · 7c55e46b
  Anish Thite authored Oct 24, 2020
  
- Merge pull request #54 from zphang/mnli_mismatched · 3c480b61
  Stella Biderman authored Oct 24, 2020
```
Adding mnli_mismatched
```
- Merge pull request #53 from anishthite/master · b278e42d
  Stella Biderman authored Oct 24, 2020
```
Update drop to be consistent with gpt3 paper
```
- Fixes to natural questions. · c013679d
  Charles Foster authored Oct 24, 2020
  
- Updates to natural questions answer handling. Now always uses long answers. · dc3560d0
  Charles Foster authored Oct 24, 2020
  
- Small typos · d51f3e7f
  Charles Foster authored Oct 24, 2020
  
- Updates to natural questions to deal with memory issues. · 9291edaa
  Charles Foster authored Oct 24, 2020
  