Commit f509a208 authored by Elieser Pereira, committed by GitHub

[DOC] Update production-stack.md (#26177)


Signed-off-by: Elieser Pereira <elieser.pereiraa@gmail.com>
parent 60bc25e7
@@ -55,7 +55,7 @@ sudo kubectl port-forward svc/vllm-router-service 30080:80
And then you can send out a query to the OpenAI-compatible API to check the available models:
```bash
-curl -o- http://localhost:30080/models
+curl -o- http://localhost:30080/v1/models
```
??? console "Output"
@@ -78,7 +78,7 @@ curl -o- http://localhost:30080/models
To send an actual chatting request, you can issue a curl request to the OpenAI `/completion` endpoint:
```bash
-curl -X POST http://localhost:30080/completions \
+curl -X POST http://localhost:30080/v1/completions \
-H "Content-Type: application/json" \
-d '{
"model": "facebook/opt-125m",
......