- Support for models loaded via [transformers](https://github.com/huggingface/transformers/) (including quantization via [AutoGPTQ](https://github.com/PanQiWei/AutoGPTQ)), [GPT-NeoX](https://github.com/EleutherAI/gpt-neox), and [Megatron-DeepSpeed](https://github.com/microsoft/Megatron-DeepSpeed/), with a flexible tokenization-agnostic interface.
- Support for commercial APIs including [OpenAI](https://openai.com), [Anthropic](https://www.anthropic.com), [Cohere](https://cohere.com), and [goose.ai](https://goose.ai).
...
...
@@ -13,6 +10,8 @@ Features:
- Support for local models and benchmark datasets.
- Evaluating with publicly available prompts ensures reproducibility and comparability between papers.
EleutherAI is thrilled that the Language Model Evaluation Harness is the backend for 🤗 Hugging Face's popular [Open LLM Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard).
## Install
To install the `lm-eval` refactor branch from the GitHub repository, run: