Commits · add-chat-templating · gaoqiong / lm-evaluation-harness

27 Feb, 2024 2 commits
- Update lm_eval/models/huggingface.py · 495d50bf
  Hailey Schoelkopf authored Feb 27, 2024
```
Co-authored-by: Daniel Furman <dryanfurman@gmail.com>
```
- Update lm_eval/models/huggingface.py · 37db34cb
  Hailey Schoelkopf authored Feb 27, 2024
```
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
```
16 Jan, 2024 1 commit
- push most recent code · 68c30aa7
  haileyschoelkopf authored Jan 16, 2024
  
15 Jan, 2024 11 commits
- Merge branch 'main' into add-chat-templating · b8bda478
  haileyschoelkopf authored Jan 15, 2024
  
- clean up wrap_chat_template + add TODOs · 6ca8ab15
  haileyschoelkopf authored Jan 15, 2024
  
- update Instance.args setter · c47de8be
  haileyschoelkopf authored Jan 15, 2024
  
- Merge branch 'main' into add-chat-templating · 2b40017b
  haileyschoelkopf authored Jan 15, 2024
  
- Update CITATION.bib (#1285) · 588a493c
  Hailey Schoelkopf authored Jan 15, 2024
```
Bumping CITATION.bib to match re-adding the citation in readme. 

cc @StellaAthena
```
- Re-add citation · 39a465ca
  Stella Biderman authored Jan 15, 2024
```
It looks like Google Scholar has [already noticed](https://scholar.google.com/scholar?hl=en&as_sdt=0%2C9&authuser=2&q=%22A+framework+for+few-shot+language+model+evaluation%2C+12+2023%22&btnG=) the updated citation block so let's add it back in.
```
- Rework documentation for explaining local dataset (#1284) · b074ccb6
  Lintang Sutawika authored Jan 15, 2024
```
* rework documentation for explaining local dataset

* fix typo

* Update new_task_guide.md
```
- Fix data-parallel evaluation with quantized models (#1270) · ef665088
  Hailey Schoelkopf authored Jan 15, 2024
```
* add WIP device_map overrides

* update handling outside of accelerate launcher

* change .to(device) log to debug level

* run linter
```
- Allow parameter edits for registered tasks when listed in a benchmark (#1273) · 03e7df51
  Lintang Sutawika authored Jan 15, 2024
```
* benchmark yamls allow minor edits of already registered tasks

* add documentation

* removed print
```
- Make `parallelize=True` vs. `accelerate launch` distinction clearer in docs (#1261) · 39e7b264
  Hailey Schoelkopf authored Jan 15, 2024
```
* Make parallelize=True distinction clearer in documentation.

* run linter
```
- fix whitespace in target + prompt for CoT gsm8k (#1275) · ace4393e
  Hailey Schoelkopf authored Jan 15, 2024
  
13 Jan, 2024 3 commits
- remove system · bbcdffb8
  daniel-furman authored Jan 12, 2024
  
- llama test · 39a11d02
  daniel-furman authored Jan 12, 2024
  
- llama test · 43dee065
  daniel-furman authored Jan 12, 2024
  
12 Jan, 2024 3 commits
- apply process_docs() to fewshot_split too (#1276) · 89618bf8
  Hailey Schoelkopf authored Jan 12, 2024
  
- add Kobest (#1263) · 653217a7
  jp authored Jan 12, 2024
```
* Add: kobest config file

* Add: kobest utils

* Add: README

* Update utils.py
```
- update versioning logging (#1271) · 75dc2b87
  Hailey Schoelkopf authored Jan 11, 2024
  
11 Jan, 2024 9 commits
- Update README.md · eed2d3a6
  Stella Biderman authored Jan 11, 2024
  
- Fix bug in multi-token Stop Sequences (#1268) · ff739414
  Hailey Schoelkopf authored Jan 11, 2024
```
* fix incorrect lookback protections

* bump generate_until task versions
```
- llama test · 2e27053d
  daniel-furman authored Jan 10, 2024
  
- llama test · 1ea8470c
  daniel-furman authored Jan 10, 2024
  
- llama test · c38b9d21
  daniel-furman authored Jan 10, 2024
  
- llama test · 047dde8c
  daniel-furman authored Jan 10, 2024
  
- llama test · b6c75ed1
  daniel-furman authored Jan 10, 2024
  
- llama test · 021232be
  daniel-furman authored Jan 10, 2024
  
- MultiMedQA (#1198) · 818c056b
  Tanishq Abraham authored Jan 10, 2024
```
* multimedqa

* Update medqa.yaml

* move to benchmarks folder

* add README.md

---------
Co-authored-by: Lintang Sutawika <lintang@sutawika.com>
```
10 Jan, 2024 11 commits
- Call "exact_match" once for each multiple-target sample (#1266) · 692e0f83
  Baber Abbasi authored Jan 10, 2024
```
* Refine scoring logic for multiple_target "exact_match" metric

* skip old tests from master

* skip old tests from master

* delete tests from master
```
- fixed belebele (#1267) · 9b0b15b1
  James A. Michaelov authored Jan 10, 2024
  
- specify utf-8 encoding to save samples to file. (#1265) · 7264a2e0
  Baber Abbasi authored Jan 10, 2024
  
- first stab at wrap_chat_template, various · 49f43f9f
  daniel-furman authored Jan 09, 2024
  
- first stab at wrap_chat_template, remove arc experiment · 2d3c835c
  daniel-furman authored Jan 09, 2024
  
- first stab at wrap_chat_template, arc conversation test · 9949e4fb
  daniel-furman authored Jan 09, 2024
  
- first stab at wrap_chat_template, arc conversation test · 7191904f
  daniel-furman authored Jan 09, 2024
  
- first stab at wrap_chat_template, various · 59e3b17c
  daniel-furman authored Jan 09, 2024
  
- first stab at wrap_chat_template, various · 34b32f77
  daniel-furman authored Jan 09, 2024
  
- Merge branch 'EleutherAI:main' into main · 6c68fd16
  Daniel Furman authored Jan 09, 2024
  
- first stab at wrap_chat_template, remove special chars tab indenting style fix · 337c084b
  daniel-furman authored Jan 09, 2024
  