1. 25 Sep, 2025 6 commits
  2. 25 Aug, 2025 1 commit
  3. 04 Aug, 2025 1 commit
  4. 26 Jul, 2025 1 commit
  5. 23 Jul, 2025 2 commits
  6. 22 Jul, 2025 1 commit
  7. 06 Jul, 2025 2 commits
  8. 05 Jul, 2025 1 commit
  9. 19 Jun, 2025 1 commit
  10. 19 May, 2025 1 commit
    • Adding ACPBench Hard tasks (#2980) · 0daf28fd
      Harsha authored
      * adding ACPBench_hard
      
      * adding Clingo
      
      * changing tarski to tarski[clingo]
      
      * denoting the main variants in each paper
  11. 13 May, 2025 1 commit
  12. 16 Apr, 2025 1 commit
  13. 03 Apr, 2025 1 commit
  14. 20 Mar, 2025 1 commit
  15. 19 Mar, 2025 1 commit
  16. 18 Mar, 2025 1 commit
    • Add loncxt tasks (#2629) · 80a10075
      Baber Abbasi authored
      support for longcontext (and other synthetic tasks)
      * add ruler
      * add longbench
      * pass `metadata` to TaskConfig
  17. 17 Mar, 2025 1 commit
  18. 14 Mar, 2025 1 commit
  19. 05 Mar, 2025 1 commit
  20. 04 Mar, 2025 2 commits
  21. 21 Feb, 2025 1 commit
  22. 17 Dec, 2024 2 commits
  23. 15 Nov, 2024 1 commit
    • IBM watsonx_llm fixes & refactor (#2464) · 4259a6d4
      Nikodem Szwast authored
      * refactor code, fix config path bug
      
      * update types to be from typing lib
      
      * add pre-commit formatting
      
      * specify version of ibm_watsonx_ai package
      
      * adjust get_watsonx_credentials() function, add minor refactor to address PR review comments
      
      * change missing installation hint from ibm_watsonx_ai to lm_eval[ibm_watsonx_ai]
  24. 05 Nov, 2024 1 commit
    • Add Japanese Leaderboard (#2439) · 26f607f5
      mtkachenko authored
      * add jaqket_v2 and jcommonsenseqa
      
      * remove comments
      
      * remove num_beams as it is incompatible with vllm
      
      * add jnli + refactor
      
      * rename jnla -> jnli
      
      * add jsquad + replace colon chars with the Japanese unicode
      
      * ignore whitespaces in generation tasks
      
      * add marc_ja
      
      * add xwinograd + simplify other yamls
      
      * add mgsm and xlsum
      
      * refactor xlsum
      
      * add ja_leaderboard tag
      
      * edit README.md
      
      * update README.md
      
      * add credit + minor changes
      
      * run ruff format
      
      * address review comments + add group
      
      * remove aggregate_metric_list
      
      * remove tags
      
      * update tasks/README.md
  25. 31 Oct, 2024 1 commit
    • Add GPTQModel support for evaluating GPTQ models (#2217) · 4f8e479e
      Qubitium-ModelCloud authored
      * support gptqmodel
      
      * code opt
      
      * add gptqmodel option
      
      * Update huggingface.py
      
      * Update pyproject.toml
      
      * gptqmodel version upgraded to 1.0.6
      
      * GPTQModel version upgraded to 1.0.8
      
      * Update pyproject.toml
      
      * fix ruff-format error
      
      * add gptqmodel test
      
      * Update gptqmodel test model
      
      * skip cuda
      
      * python3.8 compatible
      
      * Update README.md
      
      * Update README.md
      
      ---------
      Co-authored-by: CL-ModelCloud <cl@modelcloud.ai>
  26. 25 Oct, 2024 1 commit
  27. 23 Oct, 2024 1 commit
    • Support for IBM watsonx_llm (#2397) · 1185e89a
      Nikodem Szwast authored
      * add support for IBM watsonx_llm
      
      * add ibm_watsonx_ai package to optional-dependencies
      
      * move global scope imports to inner scope
      
      * change cache to lru_cache
      
      * fix circular import
      
      * use 3.8 typing
      
      * use 3.8 typing
      
      ---------
      Co-authored-by: Baber <baber@hey.com>
  28. 08 Oct, 2024 1 commit
  29. 05 Sep, 2024 1 commit
  30. 28 Aug, 2024 1 commit
  31. 01 Aug, 2024 1 commit