[Doc] Add Realtime Transcription section to supported_models.md (#39845)

Signed-off-by: Ziying Tao <tzzying@outlook.com>

Commit d0697cc7 (unverified), authored Apr 17, 2026 by z1ying, committed by GitHub Apr 18, 2026

Showing with 19 additions and 1 deletion

docs/models/supported_models.md +18 -0

docs/serving/openai_compatible_server.md +1 -1

--- a/docs/models/supported_models.md
+++ b/docs/models/supported_models.md
@@ -682,6 +682,24 @@ Speech2Text models trained specifically for Automatic Speech Recognition.
 !!! note
    `VoxtralForConditionalGeneration` requires `mistral-common[audio]` to be installed.
+#### Realtime Transcription
+Speech models that support streaming transcription via the
+[`/v1/realtime`](../serving/openai_compatible_server.md#realtime-api)
+WebSocket endpoint.
+| Architecture | Models | Example HF Models | [LoRA](../features/lora.md) | [PP](../serving/parallelism_scaling.md) |
+| ------------ | ------ | ----------------- | -------------------- | ------------------------- |
+| `VoxtralRealtimeGeneration` | Voxtral Realtime | `mistralai/Voxtral-Mini-4B-Realtime-2602` | | |
+| `Qwen3ASRRealtimeGeneration` | Qwen3-ASR Realtime | `Qwen/Qwen3-ASR-0.6B` | | |
+!!! note
+    `VoxtralRealtimeGeneration` requires `mistral-common[audio]` to be installed, and must be served with `--tokenizer-mode mistral`.
+    `Qwen3ASRRealtimeGeneration` is not auto-detected from `config.json`.
+    You must pass `--hf-overrides '{"architectures":["Qwen3ASRRealtimeGeneration"]}'`
+    when serving.
 ## Pooling Models
 See [this page](pooling_models/README.md) for more information on how to use pooling models.
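
Reviewer sketch (not part of the commit): the note added in the hunk above implies a different `vllm serve` invocation for each of the two realtime architectures. The snippet below assembles those commands from the flags quoted in the note. The helper name `build_serve_command` and the `REALTIME_MODELS` table are hypothetical, introduced only for illustration; the model IDs and flags are copied from the diff, and nothing here has been run against a live server.

```python
import json
import shlex

# Per-architecture serving requirements, copied from the note in the diff.
# REALTIME_MODELS and build_serve_command are illustrative names, not vLLM APIs.
REALTIME_MODELS = {
    # Also requires `mistral-common[audio]` to be installed.
    "VoxtralRealtimeGeneration": {
        "model": "mistralai/Voxtral-Mini-4B-Realtime-2602",
        "extra_args": ["--tokenizer-mode", "mistral"],
    },
    # Not auto-detected from config.json, so the architecture is overridden.
    "Qwen3ASRRealtimeGeneration": {
        "model": "Qwen/Qwen3-ASR-0.6B",
        "extra_args": [
            "--hf-overrides",
            json.dumps(
                {"architectures": ["Qwen3ASRRealtimeGeneration"]},
                separators=(",", ":"),
            ),
        ],
    },
}


def build_serve_command(architecture: str) -> list[str]:
    """Return the argv list for serving the example model of an architecture."""
    entry = REALTIME_MODELS[architecture]
    return ["vllm", "serve", entry["model"], *entry["extra_args"]]


if __name__ == "__main__":
    for arch in REALTIME_MODELS:
        # shlex.join quotes the JSON override so it can be pasted into a shell.
        print(shlex.join(build_serve_command(arch)))
```

Building the override with `json.dumps` rather than a hand-written string avoids quoting mistakes when the command is later passed to a shell or to `subprocess.run`.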

--- a/docs/serving/openai_compatible_server.md
+++ b/docs/serving/openai_compatible_server.md
@@ -60,7 +60,7 @@ We currently support the following OpenAI APIs:
 - [Translation API](#translations-api) (`/v1/audio/translations`)
    - Only applicable to [Automatic Speech Recognition (ASR) models](../models/supported_models.md#transcription).
 - [Realtime API](#realtime-api) (`/v1/realtime`)
-    - Only applicable to [Automatic Speech Recognition (ASR) models](../models/supported_models.md#transcription).
+    - Only applicable to [Automatic Speech Recognition (ASR) models](../models/supported_models.md#realtime-transcription).
 In addition, we have the following custom APIs: