- 28 Dec, 2020 2 commits
- 30 Nov, 2020 10 commits
  - Leo Gao authored
  - Leo Gao authored
  - Leo Gao authored
  - Leo Gao authored
    In particular, the following assumptions are FALSE in general:
    - tokenize(context + continuation) == tokenize(context) + tokenize(continuation)
    - len(tokenize(context + continuation)) == len(tokenize(context)) + len(tokenize(continuation))
    - tokenize(context + continuation)[:len(tokenize(context))] == tokenize(context)
    So we need to tip-toe around the problem by being careful with how we tokenize. In particular, using the Fast tokenizer is not just for performance: the behaviour of GPT2Tokenizer differs between Transformers 2 and 3, while GPT2TokenizerFast's does not.
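The failure mode described in this commit message can be illustrated with a toy greedy longest-match tokenizer (a simplified stand-in for BPE; the vocab and function below are hypothetical, not part of the harness). Because a merge can span the context/continuation boundary, tokenizing the concatenation produces different tokens than concatenating the tokenizations:

```python
# Toy greedy longest-match tokenizer (a simplified stand-in for BPE).
# VOCAB and tokenize() are hypothetical, for illustration only.
VOCAB = ("hello", "hel", "lo")

def tokenize(text, vocab=VOCAB):
    tokens, i = [], 0
    pieces = sorted(vocab, key=len, reverse=True)  # prefer the longest match
    while i < len(text):
        for piece in pieces:
            if text.startswith(piece, i):
                tokens.append(piece)
                i += len(piece)
                break
        else:
            tokens.append(text[i])  # fall back to single characters
            i += 1
    return tokens

context, continuation = "hel", "lo"
whole = tokenize(context + continuation)            # ["hello"]
parts = tokenize(context) + tokenize(continuation)  # ["hel", "lo"]

# All three assumptions from the commit message fail here:
assert whole != parts
assert len(whole) != len(parts)
assert whole[:len(tokenize(context))] != tokenize(context)
```

Real BPE tokenizers merge across boundaries the same way, which is why context and continuation token counts cannot simply be added.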
  - Leo Gao authored
  - Leo Gao authored
  - Leo Gao authored
  - Leo Gao authored
  - Leo Gao authored
  - Leo Gao authored
- 31 Oct, 2020 1 commit
  - Leo Gao authored
- 25 Oct, 2020 1 commit
  - Charles Foster authored
    Add SAT analogies dataset. Manual download required; checksums are not currently verified, but the hash is included as a comment.
- 24 Oct, 2020 21 commits
  - Anish Thite authored
  - Anish Thite authored
  - Anish Thite authored
  - Anish Thite authored
  - Anish Thite authored
  - Anish Thite authored
  - Charles Foster authored
  - Charles Foster authored
  - Charles Foster authored
  - Charles Foster authored
  - Charles Foster authored
  - Anish Thite authored
  - Jason Phang authored
  - Anish Thite authored
  - Anish Thite authored
  - Anish Thite authored
  - Anish Thite authored
  - Anish Thite authored
  - Anish Thite authored
  - Charles Foster authored
  - Anish Thite authored
- 23 Oct, 2020 1 commit
  - Charles Foster authored
    Renamed WSC to make the distinction between the SuperGLUE Winograd Schema Challenge (SGWinogradSchemaChallenge) and WSC273 (WinogradSchemaChallenge273) clearer. Also added WSC273.
- 22 Oct, 2020 4 commits
  - Charles Foster authored
  - Charles Foster authored
  - Charles Foster authored
  - Charles Foster authored