OpenDAS / vllm_cscc
Commit c4e3b125 (Unverified)
Authored Jul 17, 2025 by Ricardo Decal; committed by GitHub on Jul 17, 2025

[Docs] Add minimal demo of Ray Data API usage (#21080)

Signed-off-by: Ricardo Decal <rdecal@anyscale.com>

Parent: 8dfb45ca
Showing 1 changed file with 26 additions and 3 deletions:
docs/serving/offline_inference.md (+26, -3)
@@ -30,8 +30,31 @@ This API adds several batteries-included capabilities that simplify large-scale,
- Automatic sharding, load balancing, and autoscaling distribute work across a Ray cluster with built-in fault tolerance.
- Continuous batching keeps vLLM replicas saturated and maximizes GPU utilization.
- Transparent support for tensor and pipeline parallelism enables efficient multi-GPU inference.

The following example shows how to run batched inference with Ray Data and vLLM: <gh-file:examples/offline_inference/batch_llm_inference.py>

- Reading and writing to most popular file formats and cloud object storage.
- Scaling up the workload without code changes.
??? code

    ```python
    import ray  # Requires ray>=2.44.1
    from ray.data.llm import vLLMEngineProcessorConfig, build_llm_processor

    config = vLLMEngineProcessorConfig(model_source="unsloth/Llama-3.2-1B-Instruct")
    processor = build_llm_processor(
        config,
        preprocess=lambda row: {
            "messages": [
                {"role": "system", "content": "You are a bot that completes unfinished haikus."},
                {"role": "user", "content": row["item"]},
            ],
            "sampling_params": {"temperature": 0.3, "max_tokens": 250},
        },
        postprocess=lambda row: {"answer": row["generated_text"]},
    )

    ds = ray.data.from_items(["An old silent pond..."])
    ds = processor(ds)
    ds.write_parquet("local:///tmp/data/")
    ```
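
The `preprocess` and `postprocess` arguments in the example above are plain row-to-row functions, so their shape can be checked without a GPU or a Ray cluster. The following is a minimal sketch, assuming rows are ordinary dictionaries. The names `SYSTEM_PROMPT`, `build_request`, and `extract_answer`, and the stand-in engine output, are hypothetical and used only for illustration.

```python
# Hypothetical named equivalents of the preprocess/postprocess lambdas above.
SYSTEM_PROMPT = "You are a bot that completes unfinished haikus."


def build_request(row: dict) -> dict:
    """Turn an input row into a chat request with sampling parameters."""
    return {
        "messages": [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": row["item"]},
        ],
        "sampling_params": {"temperature": 0.3, "max_tokens": 250},
    }


def extract_answer(row: dict) -> dict:
    """Keep only the generated text, renamed to `answer`."""
    return {"answer": row["generated_text"]}


# The example above reads row["item"], so the input row is modeled the same way.
request = build_request({"item": "An old silent pond..."})

# Stand-in for an engine output row (made up; no model is called here).
fake_output = {"generated_text": "<model output would appear here>"}
result = extract_answer(fake_output)
```

Named functions like these could be passed as `preprocess=build_request` and `postprocess=extract_answer` in place of the lambdas, which may make the row transformations easier to unit test.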
For more information about the Ray Data LLM API, see the [Ray Data LLM documentation](https://docs.ray.io/en/latest/data/working-with-llms.html).