Unverified commit f509a208, authored by Elieser Pereira, committed by GitHub

[DOC] Update production-stack.md (#26177)


Signed-off-by: Elieser Pereira <elieser.pereiraa@gmail.com>
parent 60bc25e7
@@ -55,7 +55,7 @@ sudo kubectl port-forward svc/vllm-router-service 30080:80
 And then you can send out a query to the OpenAI-compatible API to check the available models:
 ```bash
-curl -o- http://localhost:30080/models
+curl -o- http://localhost:30080/v1/models
 ```
 ??? console "Output"
@@ -78,7 +78,7 @@ curl -o- http://localhost:30080/models
 To send an actual chatting request, you can issue a curl request to the OpenAI `/completion` endpoint:
 ```bash
-curl -X POST http://localhost:30080/completions \
+curl -X POST http://localhost:30080/v1/completions \
 -H "Content-Type: application/json" \
 -d '{
 "model": "facebook/opt-125m",
......