  1. 22 Nov, 2024 1 commit
  2. 20 Nov, 2024 1 commit
      Nits (#2500) · 867413f8
      Baber Abbasi authored
      * fix test task
      
      * don't call lm.chat_template each time
  3. 18 Nov, 2024 3 commits
      Add metabench task to LM Evaluation Harness (#2357) · 62b4364d
      Kozzy Voudouris authored
      * Add metabench (Kipnis et al. 2024)
      
      * Update metabench tasks for full replication of original benchmarks, using publicly available datasets
      
      * Remove unnecessary import
      
      * Add permute versions of each task, where the answer orders are randomly shuffled.
      
      * Add metabench group for easier evaluations
      
      * Fix mmlu counts after removing duplicate
      
      * Add secondary datasets
      
      * Fix f-string error
      
      * Fix f-string error for permute processing
      
      * Add original hash to outputs for easy matching to original results
      
      * Add line break at end of utils files
      
      * Remove extra line from winogrande
      
      * Reformat for linters
      
      * fix multiple input test
      
      * appease pre-commit
      
      * Add metabench to tasks README
      
      * fix multiple input `test_doc_to_text`
      
      ---------
      Co-authored-by: Baber <baber@hey.com>
      remove duplicate `arc_ca` (#2499) · 8222ad0a
      Baber Abbasi authored
      Add mamba hf to `mamba_ssm` (#2496) · 0f5dc265
      Baber Abbasi authored
      * add hf mamba to mamba_lm
      
      * fix _model_generate for hf
  4. 16 Nov, 2024 2 commits
      kbl-v0.1.1 (#2493) · cbc31eb8
      Wonseok Hwang authored
      * release kbl-v0.1
      
      * fix linting
      
      * remove rag tasks as doc_to_text functions cause trouble
      
      * remove remaining rag tasks
      
      * remove unnecessary repeat in yaml files and rag dataset in hf-hub
      
      * remove unnecessary newline; introduce cfg files in lbox/kbl in hf
      
      * Make task yaml files consistent to hf-datasets-config
      
      * Make task yaml files consistent to hf-datasets-config
      
      * Remove trailing empty space in doc-to-text
      
      * Remove unnecessary yaml file
      
      * Fix task naming error
      
      * trailing space removed
      update pre-commit hooks and git actions (#2497) · badf273a
      Baber Abbasi authored
      * pre-commit update
      
      * update github actions
      
      * make logging less verbose
      
      * fix artifacts
  5. 15 Nov, 2024 2 commits
  6. 12 Nov, 2024 1 commit
  7. 11 Nov, 2024 2 commits
  8. 09 Nov, 2024 2 commits
  9. 07 Nov, 2024 3 commits
  10. 06 Nov, 2024 1 commit
  11. 05 Nov, 2024 3 commits
      Add Japanese Leaderboard (#2439) · 26f607f5
      mtkachenko authored
      * add jaqket_v2 and jcommonsenseqa
      
      * remove comments
      
      * remove num_beams as it is incompatible with vllm
      
      * add jnli + refactor
      
      * rename jnla -> jnli
      
      * add jsquad + replace colon chars with the Japanese unicode
      
      * ignore whitespaces in generation tasks
      
      * add marc_ja
      
      * add xwinograd + simplify other yamls
      
      * add mgsm and xlsum
      
      * refactor xlsum
      
      * add ja_leaderboard tag
      
      * edit README.md
      
      * update README.md
      
      * add credit + minor changes
      
      * run ruff format
      
      * address review comments + add group
      
      * remove aggregate_metric_list
      
      * remove tags
      
      * update tasks/README.md
      Modify label errors in catcola and paws-x (#2434) · fb2e4b59
      zxcvuser authored
      * Modify label errors in catcola and paws
      
      * Update version to 1.0 in pawsx_template_yaml
      
      * add changelog
      
      ---------
      Co-authored-by: Baber <baber@hey.com>
      Add real process_docs example (#2456) · 0b8358ec
      Sypherd authored
  12. 04 Nov, 2024 1 commit
  13. 01 Nov, 2024 1 commit
  14. 31 Oct, 2024 1 commit
      Add GPTQModel support for evaluating GPTQ models (#2217) · 4f8e479e
      Qubitium-ModelCloud authored
      * support gptqmodel
      
      * code opt
      
      * add gptqmodel option
      
      * Update huggingface.py
      
      * Update pyproject.toml
      
      * gptqmodel version upgraded to 1.0.6
      
      * GPTQModel version upgraded to 1.0.8
      
      * Update pyproject.toml
      
      * fix ruff-format error
      
      * add gptqmodel test
      
      * Update gptqmodel test model
      
      * skip cuda
      
      * python3.8 compatible
      
      * Update README.md
      
      * Update README.md
      
      ---------
      Co-authored-by: CL-ModelCloud <cl@modelcloud.ai>
  15. 30 Oct, 2024 3 commits
  16. 25 Oct, 2024 1 commit
  17. 23 Oct, 2024 1 commit
      Support for IBM watsonx_llm (#2397) · 1185e89a
      Nikodem Szwast authored
      * add support for IBM watsonx_llm
      
      * add ibm_watsonx_ai package to optional-dependencies
      
      * move global scope imports to inner scope
      
      * change cache to lru_cache
      
      * fix circular import
      
      * use 3.8 typing
      
      * use 3.8 typing
      
      ---------
      Co-authored-by: Baber <baber@hey.com>
  18. 22 Oct, 2024 2 commits
  19. 20 Oct, 2024 1 commit
  20. 17 Oct, 2024 2 commits
  21. 16 Oct, 2024 1 commit
  22. 14 Oct, 2024 1 commit
  23. 08 Oct, 2024 4 commits