Commit c2a8b08f authored by Michael Yao, committed by GitHub

[Doc] Fix issues in integrations/llamastack.md (#24428)


Signed-off-by: windsonsea <haifeng.yao@daocloud.io>
parent f4962a6d
 # Llama Stack
-vLLM is also available via [Llama Stack](https://github.com/meta-llama/llama-stack) .
+vLLM is also available via [Llama Stack](https://github.com/llamastack/llama-stack).
 To install Llama Stack, run
@@ -8,9 +8,9 @@ To install Llama Stack, run
 pip install llama-stack -q
 ```
-## Inference using OpenAI Compatible API
+## Inference using OpenAI-Compatible API
-Then start Llama Stack server pointing to your vLLM server with the following configuration:
+Then start the Llama Stack server and configure it to point to your vLLM server with the following settings:
 ```yaml
 inference:
@@ -20,15 +20,15 @@ inference:
       url: http://127.0.0.1:8000
 ```
-Please refer to [this guide](https://llama-stack.readthedocs.io/en/latest/distributions/self_hosted_distro/remote-vllm.html) for more details on this remote vLLM provider.
+Please refer to [this guide](https://llama-stack.readthedocs.io/en/latest/providers/inference/remote_vllm.html) for more details on this remote vLLM provider.
-## Inference via Embedded vLLM
+## Inference using Embedded vLLM
-An [inline vLLM provider](https://github.com/meta-llama/llama-stack/tree/main/llama_stack/providers/inline/inference/vllm)
+An [inline provider](https://github.com/llamastack/llama-stack/tree/main/llama_stack/providers/inline/inference)
 is also available. This is a sample of configuration using that method:
 ```yaml
-inference
+inference:
   - provider_type: vllm
     config:
       model: Llama3.1-8B-Instruct
......
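
The remote provider configuration in the diff above points Llama Stack at `http://127.0.0.1:8000`. Before starting Llama Stack, it can help to confirm that a vLLM server is actually answering at that address. The following is a minimal sketch, not part of the commit: it assumes the vLLM server was started with its OpenAI-compatible API, which serves a model list at `/v1/models`. The helper names `models_url` and `list_model_ids` are hypothetical and chosen only for illustration.

```python
import json
import urllib.request


def models_url(base_url: str) -> str:
    """Build the model-listing URL for an OpenAI-compatible server."""
    # Strip a trailing slash so the path is not doubled.
    return base_url.rstrip("/") + "/v1/models"


def list_model_ids(base_url: str, timeout: float = 5.0) -> list[str]:
    """Return the model IDs reported by the server at base_url.

    Assumes the response follows the OpenAI list format:
    {"object": "list", "data": [{"id": "..."}, ...]}
    """
    with urllib.request.urlopen(models_url(base_url), timeout=timeout) as resp:
        payload = json.load(resp)
    return [entry["id"] for entry in payload.get("data", [])]


# Example call (requires a running vLLM server at the configured URL):
# print(list_model_ids("http://127.0.0.1:8000"))
```

If the call raises a connection error, the `url` value in the Llama Stack configuration most likely does not match the host or port the vLLM server is listening on.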