Commits · edf3aa7a4bc991d49c802eaa01dc9a1f0ee56171 · gaoqiong / lm-evaluation-harness

04 Aug, 2025 1 commit

Fix humaneval_instruct (#3201) · edf3aa7a

Idan Tene authored Aug 04, 2025

* Update humaneval_64_instruct.yaml

Sync doc_to_text with humaneval_instruct.yaml

* Update humaneval_instruct.yaml

Remove redundant (flawed) spaces

* Update README.md

* Bump task version

03 Jul, 2025 1 commit

Humaneval - fix regression (#3102) · 8c1016cb

Baber Abbasi authored Jul 03, 2025

* use double quotes

30 Jun, 2025 1 commit

FixBug: Align the Humaneval with official results for Llama-3.1-70B-Instruct (#3092) · a7ca0435

jinze authored Jul 01, 2025

* Fix: Align the Humaneval dataset with official results

Details: (1) Modified "doc_to_text" and "gen_prefix" in "humaneval_instruct.yaml" to match the prompt in "meta-llama/Llama-3.1-70B-Instruct-evals".

(2) Changed r.rfind("```") to r.find("```") so that it locates the first "```" rather than the last one.
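
To illustrate why change (2) matters, here is a minimal sketch, not the harness's actual code. The helper `truncate_at_fence` and the sample completion are hypothetical, and the sketch assumes the completion continues a prompt that already opened a code fence, so the first fence it contains is the closing one.

```python
FENCE = "`" * 3  # a Markdown code fence, built indirectly so this block embeds cleanly


def truncate_at_fence(response: str, use_first: bool = True) -> str:
    """Hypothetical helper: keep only the text before a code fence."""
    idx = response.find(FENCE) if use_first else response.rfind(FENCE)
    # str.find / str.rfind return -1 when there is no fence: keep everything.
    return response if idx == -1 else response[:idx]


# A made-up completion: the function body, a closing fence, then extra chatter
# that contains a second fenced block.
response = (
    "    return a + b\n"
    + FENCE + "\n"
    + "Example usage:\n"
    + FENCE + "python\n"
    + "print(add(1, 2))\n"
    + FENCE + "\n"
)

first_cut = truncate_at_fence(response, use_first=True)   # find: stops at the first fence
last_cut = truncate_at_fence(response, use_first=False)   # rfind: stops at the last fence

print(repr(first_cut))
print("Example usage" in last_cut)
```

With `find`, only the function body survives. With `rfind`, the trailing explanation and the second code block would be kept as part of the extracted code, which would likely break execution of the test cases.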

Results: Partially reproduced the official results. Llama-3.1-8B-Instruct scores 66.5 (official: 72.6), and Llama-3.1-70B-Instruct scores 80.5 (official: 80.5).

Ref: PR#2650

* add changelog and version

* add changelog

20 Mar, 2025 1 commit

fix typo (#2820) · 110e65da

Baber Abbasi authored Mar 20, 2025

11 Mar, 2025 1 commit

humaneval instruct (#2650) · c8489857

Baber Abbasi authored Mar 11, 2025

* add instruct humaneval

* nit

* add to readme

* nit

25 Feb, 2025 1 commit

add humaneval+ and mbpp+ (#2734) · 86bbf6ac

Minho Ryu authored Feb 25, 2025

* add humaneval+ and mbpp+

* add newline at end of file

15 Jan, 2025 1 commit

Add HumanEval (#1992) · 4c11206b

Hojin Lee authored Jan 16, 2025



* add custom filter

* fix type casting of references

* add humaneval

* fix a bug in humaneval

* add greedy version of humaneval

* update tasks README

* test humaneval

* return multiple metrics

* nit

* add confirmation to run code tasks

* nit

* nit

---------
Co-authored-by: Hojin Lee <19949034+hjlee1371@users.noreply.github.com>
Co-authored-by: Baber <baber@hey.com>
