1. 14 Jul, 2025 1 commit
  2. 13 Jul, 2025 1 commit
  3. 12 Jul, 2025 4 commits
  4. 11 Jul, 2025 8 commits
  5. 10 Jul, 2025 4 commits
  6. 06 Jul, 2025 3 commits
  7. 05 Jul, 2025 4 commits
  8. 04 Jul, 2025 1 commit
  9. 03 Jul, 2025 4 commits
  10. 30 Jun, 2025 2 commits
    • jinze's avatar
      FixBug: Align the Humaneval with official results for Llama-3.1-70B-Instruct (#3092) · a7ca0435
      jinze authored
      * Fix: Align the Humaneval dataset with official results
      
      Details:(1) modified the "doc_to_text" and "gen_prefix" in the "humaneval_instruct.yaml" file to make them the same as the Prompt in "meta-llama/Llama-3.1-70B-Instruct-evals".
      
      (2) Change r.rfind("```") to r.find("```"), so it can locate the first "```", not the last one.
      
      Results: Partially reproduced the official results: The result of LLaMA3.1-8B-Instruct is 66.5 (the official result is 72.6), and the result of LLaMA3.1-70B-Instruct is 80.5 (the official result is 80.5).
      
      Ref: PR#2650
      
      * add changelog and version
      
      * add changelog
      a7ca0435
    • Baber Abbasi's avatar
      [HF] fix quantization config (#3039) · fea4d11d
      Baber Abbasi authored
      * Try fixing issue 3026 which is caused by the quantization_config argument introduced in Commit 758c5ed8
      
      .
      The argument is in Dict type, but for a GPTQ quantized model, it has a conflict with the huggingface interface which expects QuantizationConfigMixin type.
      Current solution is removing quantization_config argument in HFLM._create_model() of lm_eval/models/huggingface.py.
      Require further modification to restore the functionality provided by the previous commit.
      
      * wrap quantization_config in AutoQuantizationConfig
      
      * handle quantization config not dict
      
      * wrap quantization_config in AutoQuantizationConfig if dict
      
      ---------
      Co-authored-by: default avatarshanhx2000 <hs359@duke.edu>
      fea4d11d
  11. 25 Jun, 2025 3 commits
  12. 23 Jun, 2025 1 commit
    • NourFahmy's avatar
      Fix Anthropic API compatibility issues in chat completions (#3054) · 8bc46207
      NourFahmy authored
      
      
      * Fix Anthropic API compatibility issues in chat completions
      
      solves two important compatibility issues between the LM Eval Harness and Anthropic's API:
      
      1) The type field issue - Anthropic's Messages API doesn't accept the type field that other APIs might expect, that was previously included
      2) The stop sequences issue - Anthropic requires stop sequences to contain non-whitespace characters
      
      tested with most recent models from anthopic; claude-sonnet-4-0, claude-opus-4-0, resolved my local api errors
      
      * pacufy pre-commit
      
      * add type
      
      ---------
      Co-authored-by: default avatarBaber <baber@hey.com>
      8bc46207
  13. 20 Jun, 2025 1 commit
  14. 19 Jun, 2025 3 commits