server_args: "--tensor-parallel-size 2"  # Server arguments
startup_max_wait_seconds: 1800  # Max wait for server startup (default: 1800)
env:  # Environment variables (optional)
  SOME_VAR: "value"
```
The `server_args` field accepts any arguments that can be passed to `vllm serve`.
The `env` field accepts a dictionary of environment variables to set for the server process.
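
As a rough illustration of how these two fields could be consumed, the sketch below turns a parsed config into a `vllm serve` command line and a process environment. The helper `build_server_command`, its `model` parameter, and the model name in the usage lines are hypothetical and not part of this repository; only `server_args`, `startup_max_wait_seconds`, and `env` come from the config format above.

```python
import os
import shlex


def build_server_command(model, config):
    """Return (argv, env) for launching `vllm serve` from a parsed config dict.

    Hypothetical helper for illustration; the real launcher may differ.
    """
    # server_args is one string, so split it the way a shell would.
    extra_args = shlex.split(config.get("server_args", ""))
    argv = ["vllm", "serve", model, *extra_args]

    # Overlay the optional env mapping on top of the current environment.
    env = dict(os.environ)
    env.update({key: str(value) for key, value in (config.get("env") or {}).items()})
    return argv, env


config = {
    "server_args": "--tensor-parallel-size 2",
    "startup_max_wait_seconds": 1800,
    "env": {"SOME_VAR": "value"},
}
# "some-org/some-model" is a placeholder model name.
argv, env = build_server_command("some-org/some-model", config)
print(argv)
```

The resulting `argv` and `env` could then be handed to `subprocess.Popen(argv, env=env)`, with `startup_max_wait_seconds` used as the timeout while waiting for the server to come up.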
## Adding New Models
1. Create a new YAML config file in the `configs/` directory
2. Add the filename to the appropriate `models-*.txt` file
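
The two steps above can also be scripted. The following is a minimal sketch, assuming the `models-*.txt` files list one config filename per line (the source does not state the format). The helper `register_model` and the names `my-model.yaml` and `models-example.txt` are placeholders for illustration only.

```python
import tempfile
from pathlib import Path


def register_model(config_name, config_text, configs_dir, models_list):
    """Write a config file and list its filename once in the models file.

    Hypothetical helper; assumes one filename per line in the list file.
    """
    configs_dir = Path(configs_dir)
    configs_dir.mkdir(parents=True, exist_ok=True)
    (configs_dir / config_name).write_text(config_text)

    models_list = Path(models_list)
    names = models_list.read_text().splitlines() if models_list.exists() else []
    if config_name not in names:
        # Append only if missing, so re-running does not create duplicates.
        names.append(config_name)
        models_list.write_text("\n".join(names) + "\n")
    return names


# Demonstrate against a throwaway directory rather than the real repository.
with tempfile.TemporaryDirectory() as root:
    listed = register_model(
        "my-model.yaml",
        'server_args: "--tensor-parallel-size 2"\n',
        Path(root) / "configs",
        Path(root) / "models-example.txt",
    )
    print(listed)
```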
## Tiktoken Encoding Files
The tiktoken encoding files required by the vLLM server are automatically downloaded from OpenAI's public blob storage on first run:
- `cl100k_base.tiktoken`
- `o200k_base.tiktoken`
Files are cached in the `data/` directory. The `TIKTOKEN_ENCODINGS_BASE` environment variable is automatically set to point to this directory when running evaluations.
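
The download-and-cache behavior described above might look roughly like the sketch below. It is not the project's actual implementation: the function `ensure_encodings`, the `fetch` parameter, and the `BASE_URL` value are assumptions made for illustration (the URL is the public location the `tiktoken` library itself uses for these files, but the source does not spell it out).

```python
import os
import urllib.request
from pathlib import Path

# Assumed download location; not stated in this README.
BASE_URL = "https://openaipublic.blob.core.windows.net/encodings"
ENCODING_FILES = ("cl100k_base.tiktoken", "o200k_base.tiktoken")


def _download(url):
    """Fetch a URL and return its body as bytes."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def ensure_encodings(data_dir, fetch=_download):
    """Download any missing encoding files, then export the cache location.

    Hypothetical helper; `fetch` is injectable so the logic can be tested offline.
    """
    data_dir = Path(data_dir)
    data_dir.mkdir(parents=True, exist_ok=True)
    for name in ENCODING_FILES:
        target = data_dir / name
        # Files already on disk are reused, so only the first run downloads.
        if not target.exists():
            target.write_bytes(fetch(f"{BASE_URL}/{name}"))
    os.environ["TIKTOKEN_ENCODINGS_BASE"] = str(data_dir)
    return data_dir
```

Calling `ensure_encodings("data")` before starting the server would populate the cache on first use and leave it untouched afterwards.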