Commit 9df36b38 authored by Ryan McCormick, committed by GitHub

fix(docs): Fix mermaid chart in MM doc and remove extra table (#7111)

parent f5b1cb47
@@ -11,7 +11,6 @@ Dynamo supports multimodal inference across multiple LLM backends, enabling mode
 **Security Requirement**: Multimodal processing must be explicitly enabled at startup. See the relevant backend documentation ([vLLM](multimodal-vllm.md), [SGLang](multimodal-sglang.md), [TRT-LLM](multimodal-trtllm.md)) for the necessary flags. This prevents unintended processing of multimodal data from untrusted sources.
 </Warning>
-## Key Features
 ```mermaid
 ---
 title: Sample flow for an aggregated VLM serving scenario
@@ -28,6 +27,9 @@ flowchart TD
 C --> I[DECODE]
 H --> I
 I --> J[Response]
+```
+## Key Features
 Dynamo provides support for improving latency and throughput for vision-and-language workloads through the following features, that can be used together or separately, depending on your workload characteristics:
 | Feature | Description |
@@ -46,14 +48,6 @@ Dynamo provides support for improving latency and throughput for vision-and-lang
 **Status:** ✅ Supported | 🧪 Experimental | ❌ Not supported
-### Input Format Support
-| Format | SGLang | TRT-LLM | vLLM |
-|--------|--------|---------|------|
-| HTTP/HTTPS URL | ✅ | ✅ | ✅ |
-| Data URL (Base64) | ❌ | ❌ | ✅ |
-| Pre-computed Embeddings (.pt) | ❌ | ✅ | ❌ |
 ## Example Workflows
 Reference implementations for deploying multimodal models:
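
Review note: judging by the second hunk, the mermaid chart appears to have been broken because its code fence was never closed before the following heading. A minimal sketch of a check that could catch this class of error is below. The function name `find_unclosed_fence` and the sample strings are hypothetical and not part of the Dynamo repository. The closing-fence rule follows CommonMark only loosely: same fence character, at least as long as the opener, and no info string.

```python
import re

# A fence line: up to 3 spaces of indent, then 3+ backticks or tildes, then an optional info string.
FENCE = re.compile(r"^ {0,3}(`{3,}|~{3,})(.*)$")


def find_unclosed_fence(markdown: str):
    """Return the 1-based line number of an opening fence that is never closed, or None."""
    open_line = None
    open_marker = ""
    for number, line in enumerate(markdown.splitlines(), start=1):
        match = FENCE.match(line)
        if not match:
            continue
        marker, rest = match.group(1), match.group(2)
        if open_line is None:
            # Not inside a block: this line opens one.
            open_line, open_marker = number, marker
        elif (
            marker[0] == open_marker[0]
            and len(marker) >= len(open_marker)
            and not rest.strip()
        ):
            # Inside a block and this line is a valid closer.
            open_line = None
    return open_line


# Hypothetical before/after snippets modeled on the hunk above.
FENCE_MARK = "`" * 3
broken = f"{FENCE_MARK}mermaid\nflowchart TD\n  I --> J[Response]\n## Key Features\n"
fixed = f"{FENCE_MARK}mermaid\nflowchart TD\n  I --> J[Response]\n{FENCE_MARK}\n## Key Features\n"

print(find_unclosed_fence(broken))
print(find_unclosed_fence(fixed))
```

A check like this could run as a docs lint step. It only tracks fence pairing and does not validate the mermaid syntax inside the block.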