"vllm/vscode:/vscode.git/clone" did not exist on "e1eefa4c40fc5b28bd7e83b6596bb5d2f420fd92"
---
title: NVIDIA Triton
---
[](){ #deployment-triton }

The [Triton Inference Server](https://github.com/triton-inference-server) project hosts a tutorial that shows how to quickly deploy a small [facebook/opt-125m](https://huggingface.co/facebook/opt-125m) model using vLLM. See [Deploying a vLLM model in Triton](https://github.com/triton-inference-server/tutorials/blob/main/Quick_Deploy/vLLM/README.md#deploying-a-vllm-model-in-triton) for details.
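
Once the server from that tutorial is running, a client can query it over HTTP. The following is a minimal sketch, not an official client. It assumes the defaults the tutorial appears to use: Triton's HTTP endpoint on `localhost:8000`, a model named `vllm_model`, and the `text_input` / `text_output` request and response fields. The helper names `build_generate_request` and `generate` are hypothetical. Adjust all of these to match your model repository.

```python
import json
import urllib.request

# Assumptions (tutorial defaults, not guaranteed for your deployment):
# Triton serves HTTP on port 8000 and the model is registered as "vllm_model".
TRITON_URL = "http://localhost:8000"
MODEL_NAME = "vllm_model"


def build_generate_request(prompt, temperature=0.0, stream=False):
    """Return (url, payload) for Triton's HTTP generate endpoint."""
    url = f"{TRITON_URL}/v2/models/{MODEL_NAME}/generate"
    payload = {
        "text_input": prompt,
        "parameters": {"stream": stream, "temperature": temperature},
    }
    return url, payload


def generate(prompt, timeout=30.0):
    """POST the prompt to a running Triton server and return the generated text."""
    url, payload = build_generate_request(prompt)
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.loads(response.read().decode("utf-8"))
    # "text_output" is the response field assumed from the tutorial's example.
    return body["text_output"]
```

With the server up, calling `generate("What is Triton Inference Server?")` should return the model's completion as a string. If the server is not reachable, `urllib` raises `urllib.error.URLError`.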