1. 18 May, 2023 1 commit
  2. 13 Feb, 2022 1 commit
  3. 05 Dec, 2021 1 commit
  4. 24 Nov, 2021 2 commits
  5. 05 Nov, 2021 1 commit
  6. 11 Oct, 2021 1 commit
  7. 10 Jun, 2021 1 commit
  8. 22 May, 2021 1 commit
  9. 11 May, 2021 1 commit
    • Overhaul command flags a bit · 5f42f976
      Leo Gao authored
      model_args should only contain settings that affect the model's output; therefore, runtime options like batch size, device, etc. shouldn't be in there.
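      A minimal sketch of the split this commit describes, assuming a hypothetical argparse-based CLI (the flag names and helper below are illustrative, not taken from the repository): model_args carries only output-affecting settings, while batch size and device stay as separate runtime flags.

      ```python
      # Illustrative only: separate output-affecting model_args from runtime-only flags.
      import argparse

      def parse_model_args(s):
          # "pretrained=gpt2,revision=main" -> {"pretrained": "gpt2", "revision": "main"}
          return dict(kv.split("=", 1) for kv in s.split(",")) if s else {}

      parser = argparse.ArgumentParser()
      parser.add_argument("--model", required=True)
      parser.add_argument("--model_args", default="")            # only settings that change model output
      parser.add_argument("--batch_size", type=int, default=1)   # runtime concern, kept out of model_args
      parser.add_argument("--device", default="cpu")             # runtime concern, kept out of model_args

      args = parser.parse_args()
      model_kwargs = parse_model_args(args.model_args)
      ```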
  10. 06 May, 2021 1 commit
  11. 05 May, 2021 1 commit
  12. 03 May, 2021 2 commits
  13. 15 Apr, 2021 1 commit
  14. 11 Apr, 2021 2 commits
  15. 05 Apr, 2021 1 commit
  16. 27 Mar, 2021 1 commit
  17. 26 Mar, 2021 1 commit
  18. 21 Feb, 2021 2 commits
  19. 19 Feb, 2021 1 commit
  20. 11 Feb, 2021 1 commit
  21. 08 Feb, 2021 1 commit
  22. 05 Feb, 2021 2 commits
  23. 04 Feb, 2021 4 commits
  24. 28 Jan, 2021 1 commit
  25. 05 Jan, 2021 1 commit
  26. 30 Nov, 2020 2 commits
    • Remove num_tokens · e3031e84
      Leo Gao authored
    • Refactor to remove generate and fix some bad tokenization · 90e50b4c
      Leo Gao authored
      In particular, the following assumptions are FALSE in general:
      tokenize(context + continuation) = tokenize(context) + tokenize(continuation)
      len(tokenize(context + continuation)) = len(tokenize(context)) + len(tokenize(continuation))
      tokenize(context + continuation)[:len(tokenize(context))] = tokenize(context)
      
      So we need to work around the problem by being careful about how and where the tokenization is done.
      
      In particular, using the Fast tokenizer is not just for performance; the behaviour of GPT2Tokenizer differs between Transformers 2 and 3, while GPT2TokenizerFast's does not.
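      A short sketch of why the three assumptions above fail, and one possible way to be careful about the boundary. The strings and the offset-mapping workaround are illustrative assumptions, not necessarily what this commit implements; whether a given pair of strings triggers the failure depends on the BPE merges.

      ```python
      # Check the three assumptions for a given context/continuation pair.
      from transformers import GPT2TokenizerFast

      tok = GPT2TokenizerFast.from_pretrained("gpt2")

      context, continuation = "hello", "world"
      whole = tok.encode(context + continuation)
      parts = tok.encode(context) + tok.encode(continuation)

      # Each assumption can fail because BPE merges may cross the boundary.
      print(whole == parts)
      print(len(whole) == len(parts))
      print(whole[: len(tok.encode(context))] == tok.encode(context))

      # One possible workaround (an assumption, not the commit's stated method):
      # tokenize the concatenation once, then split token-wise at the character
      # boundary using the fast tokenizer's offset mapping. Tokens that straddle
      # the boundary are assigned to the continuation here.
      enc = tok(context + continuation, return_offsets_mapping=True)
      boundary = len(context)
      ctx_ids = [i for i, (s, e) in zip(enc["input_ids"], enc["offset_mapping"]) if e <= boundary]
      cont_ids = [i for i, (s, e) in zip(enc["input_ids"], enc["offset_mapping"]) if e > boundary]
      ```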
  27. 04 Oct, 2020 1 commit
  28. 14 Sep, 2020 1 commit
  29. 07 Sep, 2020 3 commits