feat: Add request template support for default inference parameters (#841)
Adds support for specifying default request parameters through a JSON template file that is applied to all inference requests. This enables consistent parameter settings while still allowing per-request overrides.
Changes:
- Add --request-template CLI flag to specify template file path
- Integrate template support into the HTTP, batch, and text-input modes
- Template values can be overridden by individual request parameters
- Example template.json:
```json
{
  "model": "Qwen2.5-3B-Instruct",
  "temperature": 0.7,
  "max_completion_tokens": 4096
}
```
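The override semantics can be sketched as a simple shallow merge: template values serve as defaults, and any parameter set on the individual request wins. This is an illustrative sketch, not the actual implementation; the function name `apply_template` is hypothetical.

```python
import json

def apply_template(template: dict, request: dict) -> dict:
    """Merge a request onto template defaults (request keys take precedence)."""
    merged = dict(template)   # start from the template's default parameters
    merged.update(request)    # per-request parameters override the defaults
    return merged

# Defaults loaded from the template file above
template = json.loads("""
{
  "model": "Qwen2.5-3B-Instruct",
  "temperature": 0.7,
  "max_completion_tokens": 4096
}
""")

# An individual request that overrides only the temperature
request = {"temperature": 0.2, "prompt": "Hello"}

merged = apply_template(template, request)
# merged["temperature"] comes from the request; model and
# max_completion_tokens fall through from the template
```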