# Using vLLM

vLLM supports the following usage patterns:

- [Inference and Serving](../serving/offline_inference.md): Run a single instance of a model.
- [Deployment](../deployment/docker.md): Scale up model instances for production.
- [Training](../training/rlhf.md): Train or fine-tune a model.