Commits · 2ebef4705f12e6f894a925d73c35673fecd896ac · gaoqiong / lm-evaluation-harness

03 Jul, 2025 1 commit
- Humaneval - fix regression (#3102) · 8c1016cb
  Baber Abbasi authored Jul 03, 2025
```
* use double quotes
```
30 Jun, 2025 1 commit

- FixBug: Align the Humaneval with official results for Llama-3.1-70B-Instruct (#3092) · a7ca0435
  jinze authored Jul 01, 2025

* Fix: Align the Humaneval dataset with official results

Details: (1) Modified "doc_to_text" and "gen_prefix" in "humaneval_instruct.yaml" to match the prompt used in "meta-llama/Llama-3.1-70B-Instruct-evals".

(2) Change r.rfind("```") to r.find("```"), so it can locate the first "```", not the last one.
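To illustrate why the search direction in (2) matters, here is a minimal sketch in Python. It is not the harness's actual code: `truncate_at_fence` and the sample response are hypothetical, and it assumes the model's completion is cut at a fence marker to drop trailing explanation. `str.find` returns the index of the first match and `str.rfind` the index of the last, so the two only disagree when the response contains more than one fence.

```python
# Hypothetical helper, for illustration only (not the harness implementation).
FENCE = "`" * 3  # three backticks, built this way so the example nests cleanly


def truncate_at_fence(response: str, use_last: bool = False) -> str:
    """Cut a model response at a code fence, or return it unchanged if none exists."""
    # str.find gives the first occurrence, str.rfind the last; both return -1 on a miss.
    idx = response.rfind(FENCE) if use_last else response.find(FENCE)
    return response if idx == -1 else response[:idx]


# A made-up completion: the function body, a closing fence, then an extra fenced example.
response = (
    "    return a + b\n"
    + FENCE + "\n"
    + "Example:\n"
    + FENCE + "python\n"
    + "print(add(1, 2))\n"
    + FENCE + "\n"
)

first = truncate_at_fence(response)                 # cut at the first fence
last = truncate_at_fence(response, use_last=True)   # cut at the last fence

print(repr(first))
print(repr(last))
```

With the first-match search, only the function body survives. With the last-match search, the explanatory text and an unclosed fence are kept, which would no longer be valid Python when appended to the prompt.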

Results: The official results are partially reproduced. LLaMA3.1-8B-Instruct scores 66.5 (official: 72.6), and LLaMA3.1-70B-Instruct scores 80.5 (official: 80.5).

Ref: PR#2650

* add changelog and version

* add changelog


20 Mar, 2025 1 commit
- fix typo (#2820) · 110e65da
  Baber Abbasi authored Mar 20, 2025
  
11 Mar, 2025 1 commit
- humaneval instruct (#2650) · c8489857
  Baber Abbasi authored Mar 11, 2025
```
* add instruct humaneval

* nit

* add to readme

* nit
```