Fix humaneval_instruct (#3201)

* Update humaneval_64_instruct.yaml Sync doc_to_text with humaneval_instruct.yaml * Update humaneval_instruct.yaml Remove redundant (flawed) spaces * Update README.md * Bump task version

Fix humaneval_instruct (#3201)
* Update humaneval_64_instruct.yaml Sync doc_to_text with humaneval_instruct.yaml * Update humaneval_instruct.yaml Remove redundant (flawed) spaces * Update README.md * Bump task version
edf3aa7a · Idan Tene · GitHub · 06ba1d28 · edf3aa7a · edf3aa7a
Unverified Commit edf3aa7a authored Aug 04, 2025 by Idan Tene Committed by GitHub Aug 04, 2025
3 changed files
--- a/lm_eval/tasks/humaneval/README.md
+++ b/lm_eval/tasks/humaneval/README.md
@@ -52,3 +52,5 @@ If other tasks on this dataset are already supported:
 v2 20-MAR-2025: `humaneval_instruct`, `humaneval_instruct_64`: fixed typo in gen_prefix
 v3 30-JUN-2025: Updated prompt generation and output parsing to align with the official `Llama-3.1-70B-Instruct-evals`. This corrects the prompt format and fixes a bug in locating the code block. See PR [#3092](https://github.com/EleutherAI/lm-evaluation-harness/pull/3092).
+v4 01-AUG-2025: Synchronized definitions between `humaneval_instruct` and `humaneval_instruct_64`. The former had a trailing space in `gen_prefix`, and the latter's `doc_to_text` was outdated.
--- a/lm_eval/tasks/humaneval/humaneval_64_instruct.yaml
+++ b/lm_eval/tasks/humaneval/humaneval_64_instruct.yaml
 include: humaneval_64.yaml
 task: humaneval_64_instruct
-doc_to_text: "Write a solution to the following problem and make sure that it passes the tests:\n```{{prompt}}"
+doc_to_text: "Write a solution to the following problem and make sure that it passes the tests:\n```python\n{{ prompt }}\n```\n"
 gen_prefix: "Here is the completed function:\n```python\n{{prompt}}\n"
 filter_list:
  - name: "create_test"
@@ -8,4 +8,4 @@ filter_list:
      - function: "custom"
        filter_fn: !function utils.build_predictions_instruct
 metadata:
-  version: 2.0
+  version: 3.0
--- a/lm_eval/tasks/humaneval/humaneval_instruct.yaml
+++ b/lm_eval/tasks/humaneval/humaneval_instruct.yaml
 include: humaneval.yaml
 task: humaneval_instruct
-doc_to_text: "Write a solution to the following problem and make sure that it passes the tests:\n```python\n{{ prompt }}\n```\n "
+doc_to_text: "Write a solution to the following problem and make sure that it passes the tests:\n```python\n{{ prompt }}\n```\n"
-gen_prefix: "Here is the completed function:\n```python\n{{ prompt }}\n "
+gen_prefix: "Here is the completed function:\n```python\n{{ prompt }}\n"
 filter_list:
  - name: "create_test"
    filter:
      - function: "custom"
        filter_fn: !function utils.build_predictions_instruct
 metadata:
-  version: 3.0
+  version: 4.0