Update deploying_with_k8s.rst (#10922)

Commit da6f4092 (unverified), authored Dec 16, 2024 by AlexHe99, committed by GitHub Dec 15, 2024

Showing with 2 additions and 2 deletions

docs/source/serving/deploying_with_k8s.rst +2 -2

--- a/docs/source/serving/deploying_with_k8s.rst
+++ b/docs/source/serving/deploying_with_k8s.rst
@@ -162,7 +162,7 @@ To test the deployment, run the following ``curl`` command:
    curl http://mistral-7b.default.svc.cluster.local/v1/completions \
      -H "Content-Type: application/json" \
      -d '{
-            "model": "facebook/opt-125m",
+            "model": "mistralai/Mistral-7B-Instruct-v0.3",
            "prompt": "San Francisco is a",
            "max_tokens": 7,
            "temperature": 0
@@ -172,4 +172,4 @@ If the service is correctly deployed, you should receive a response from the vLL
 Conclusion
 ----------
 Deploying vLLM with Kubernetes allows for efficient scaling and management of ML models leveraging GPU resources. By following the steps outlined above, you should be able to set up and test a vLLM deployment within your Kubernetes cluster. If you encounter any issues or have suggestions, please feel free to contribute to the documentation.
\ No newline at end of file
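
The first hunk changes the ``model`` field so that the test request names the Mistral model, which appears to be the model this guide deploys, rather than ``facebook/opt-125m``. As a rough illustration, the same request can be built in Python with only the standard library. This is a sketch, not part of the commit: the URL and JSON fields are copied from the ``curl`` example, while the helper names ``build_completion_request`` and ``send`` are hypothetical.

```python
import json
import urllib.request

# Assumption: same in-cluster service URL as the curl example in the diff.
SERVICE_URL = "http://mistral-7b.default.svc.cluster.local/v1/completions"


def build_completion_request(model, prompt, max_tokens=7, temperature=0,
                             url=SERVICE_URL):
    """Build (but do not send) the POST request from the curl example.

    Hypothetical helper: the payload fields mirror the JSON body in the diff.
    """
    payload = {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=30):
    """Send the request and return the decoded JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

From a pod that can resolve the service name, ``send(build_completion_request("mistralai/Mistral-7B-Instruct-v0.3", "San Francisco is a"))`` would issue the same request as the updated ``curl`` command. Building the request separately from sending it makes it possible to inspect the payload without cluster access.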