FixBug: Align HumanEval with official results for Llama-3.1-70B-Instruct (#3092)
* Fix: Align the HumanEval dataset with official results
Details: (1) Modified "doc_to_text" and "gen_prefix" in "humaneval_instruct.yaml" so that they match the prompt used in "meta-llama/Llama-3.1-70B-Instruct-evals".
(2) Changed r.rfind("```") to r.find("```") so that it locates the first "```" rather than the last one.
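Illustration of (2): a minimal sketch, not the harness code itself, of why the choice between find and rfind matters when a response contains more than one code fence. The helper name truncate_at_fence and the sample response are hypothetical, and the sketch assumes the response is cut at the located fence. The fence string is built programmatically only so that the example does not close its own code block.

```python
FENCE = "`" * 3  # three backticks, built at runtime to keep this block well-formed


def truncate_at_fence(response: str, use_first: bool = True) -> str:
    """Cut a model response at a code fence (hypothetical helper).

    use_first=True mimics str.find (first fence);
    use_first=False mimics str.rfind (last fence).
    """
    idx = response.find(FENCE) if use_first else response.rfind(FENCE)
    # Both find and rfind return -1 when no fence is present.
    return response if idx == -1 else response[:idx]


# Hypothetical response: the completion closes its code block, then adds
# an explanation and a second fenced example.
response = "\n".join([
    "    return a + b",
    FENCE,
    "Here is a usage example:",
    FENCE + "python",
    "print(add(1, 2))",
    FENCE,
])

with_find = truncate_at_fence(response, use_first=True)
with_rfind = truncate_at_fence(response, use_first=False)

print(repr(with_find))   # only the function body survives
print(repr(with_rfind))  # trailing prose and a stray fence are kept
```

With rfind, the kept text still contains the prose and the extra fence, which would not be valid Python if it were appended to the HumanEval prompt and executed.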
Results: The official results are partially reproduced. Llama-3.1-8B-Instruct scores 66.5 (official result: 72.6), and Llama-3.1-70B-Instruct scores 80.5 (official result: 80.5).
Ref: PR#2650
* add changelog and version
* add changelog