Unverified Commit 55be93ba authored by Didier Durand, committed by GitHub

[Doc]: fix 2 hyperlinks leading to Ray site after they changed Ray's doc structure (#24438)


Signed-off-by: Didier Durand <durand.didier@gmail.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
parent 717fc00e
@@ -66,7 +66,7 @@ Ray is a distributed computing framework for scaling Python programs. Multi-node
 vLLM uses Ray to manage the distributed execution of tasks across multiple nodes and control where execution happens.
-Ray also offers high-level APIs for large-scale [offline batch inference](https://docs.ray.io/en/latest/data/working-with-llms.html) and [online serving](https://docs.ray.io/en/latest/serve/llm/serving-llms.html) that can leverage vLLM as the engine. These APIs add production-grade fault tolerance, scaling, and distributed observability to vLLM workloads.
+Ray also offers high-level APIs for large-scale [offline batch inference](https://docs.ray.io/en/latest/data/working-with-llms.html) and [online serving](https://docs.ray.io/en/latest/serve/llm) that can leverage vLLM as the engine. These APIs add production-grade fault tolerance, scaling, and distributed observability to vLLM workloads.
 For details, see the [Ray documentation](https://docs.ray.io/en/latest/index.html).
@@ -104,7 +104,7 @@ Note that `VLLM_HOST_IP` is unique for each worker. Keep the shells running thes
 From any node, enter a container and run `ray status` and `ray list nodes` to verify that Ray finds the expected number of nodes and GPUs.
 !!! tip
-Alternatively, set up the Ray cluster using KubeRay. For more information, see [KubeRay vLLM documentation](https://docs.ray.io/en/latest/cluster/kubernetes/examples/vllm-rayservice.html).
+Alternatively, set up the Ray cluster using KubeRay. For more information, see [KubeRay vLLM documentation](https://docs.ray.io/en/latest/cluster/kubernetes/examples/rayserve-llm-example.html).
 ### Running vLLM on a Ray cluster
......