OpenDAS / vllm_cscc
Commit d0697cc7 (Unverified)
Authored Apr 17, 2026 by z1ying; committed by GitHub on Apr 18, 2026

[Doc] Add Realtime Transcription section to supported_models.md (#39845)

Signed-off-by: Ziying Tao <tzzying@outlook.com>

Parent: b0755523
Showing 2 changed files with 19 additions and 1 deletion (+19 -1)

docs/models/supported_models.md +18 -0
docs/serving/openai_compatible_server.md +1 -1
docs/models/supported_models.md
@@ -682,6 +682,24 @@ Speech2Text models trained specifically for Automatic Speech Recognition.
 !!! note
     `VoxtralForConditionalGeneration` requires `mistral-common[audio]` to be installed.

+#### Realtime Transcription
+
+Speech models that support streaming transcription via the [`/v1/realtime`](../serving/openai_compatible_server.md#realtime-api) WebSocket endpoint.
+
+| Architecture | Models | Example HF Models | [LoRA](../features/lora.md) | [PP](../serving/parallelism_scaling.md) |
+| ------------ | ------ | ----------------- | -------------------- | ------------------------- |
+| `VoxtralRealtimeGeneration` | Voxtral Realtime | `mistralai/Voxtral-Mini-4B-Realtime-2602` | | |
+| `Qwen3ASRRealtimeGeneration` | Qwen3-ASR Realtime | `Qwen/Qwen3-ASR-0.6B` | | |
+
+!!! note
+    `VoxtralRealtimeGeneration` requires `mistral-common[audio]` to be installed, and must be served with `--tokenizer-mode mistral`.
+
+    `Qwen3ASRRealtimeGeneration` is not auto-detected from `config.json`.
+    You must pass `--hf-overrides '{"architectures":["Qwen3ASRRealtimeGeneration"]}'`
+    when serving.
+
 ## Pooling Models

 See [this page](pooling_models/README.md) for more information on how to use pooling models.
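The serving requirements in the Realtime Transcription note can be summarized as a small command builder. This is a minimal sketch, not part of the commit: the helper `build_serve_command` and the two lookup sets are hypothetical names introduced for illustration, while the flags and model IDs are the ones quoted in the added section. It assumes the `vllm serve <model>` entry point.

```python
import json
import shlex

# Architectures and their extra serving flags, as stated in the note.
# These set names are illustrative, not vLLM identifiers.
NEEDS_MISTRAL_TOKENIZER = {"VoxtralRealtimeGeneration"}
NEEDS_ARCH_OVERRIDE = {"Qwen3ASRRealtimeGeneration"}


def build_serve_command(model: str, architecture: str) -> list[str]:
    """Return an argv list for serving `model` with the flags its architecture needs."""
    cmd = ["vllm", "serve", model]
    if architecture in NEEDS_MISTRAL_TOKENIZER:
        # Also requires `mistral-common[audio]` to be installed beforehand.
        cmd += ["--tokenizer-mode", "mistral"]
    if architecture in NEEDS_ARCH_OVERRIDE:
        # The architecture is not auto-detected from config.json, so force it.
        overrides = json.dumps({"architectures": [architecture]}, separators=(",", ":"))
        cmd += ["--hf-overrides", overrides]
    return cmd


if __name__ == "__main__":
    examples = [
        ("mistralai/Voxtral-Mini-4B-Realtime-2602", "VoxtralRealtimeGeneration"),
        ("Qwen/Qwen3-ASR-0.6B", "Qwen3ASRRealtimeGeneration"),
    ]
    for model, arch in examples:
        # shlex.join quotes the JSON override so it can be pasted into a shell.
        print(shlex.join(build_serve_command(model, arch)))
```

Building the override with `json.dumps` rather than a hand-written string avoids quoting mistakes when the command is passed to a shell.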
docs/serving/openai_compatible_server.md
@@ -60,7 +60,7 @@ We currently support the following OpenAI APIs:
 - [Translation API](#translations-api) (`/v1/audio/translations`)
     - Only applicable to [Automatic Speech Recognition (ASR) models](../models/supported_models.md#transcription).
 - [Realtime API](#realtime-api) (`/v1/realtime`)
-    - Only applicable to [Automatic Speech Recognition (ASR) models](../models/supported_models.md#transcription).
+    - Only applicable to [Automatic Speech Recognition (ASR) models](../models/supported_models.md#realtime-transcription).

 In addition, we have the following custom APIs:
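The one-line change in this file repoints the Realtime API note from `#transcription` to `#realtime-transcription`, the anchor that corresponds to the new `#### Realtime Transcription` heading. As a rough illustration of how such an anchor follows from a heading, the sketch below approximates a common Markdown slug rule (lowercase, drop punctuation, join words with hyphens). The function `heading_to_anchor` is a hypothetical name and the rule is a simplification; the docs toolchain's actual slugifier may differ in edge cases.

```python
import re


def heading_to_anchor(heading: str) -> str:
    """Approximate the anchor a Markdown renderer derives from a heading line."""
    # Drop leading '#' markers and surrounding whitespace, then lowercase.
    text = heading.lstrip("#").strip().lower()
    # Remove anything that is not a word character, whitespace, or hyphen.
    text = re.sub(r"[^\w\s-]", "", text)
    # Collapse runs of whitespace or hyphens into a single hyphen.
    return re.sub(r"[\s-]+", "-", text).strip("-")


def doc_link(path: str, heading: str) -> str:
    """Build a relative link to a heading inside another docs page."""
    return f"{path}#{heading_to_anchor(heading)}"


if __name__ == "__main__":
    print(doc_link("../models/supported_models.md", "#### Realtime Transcription"))
```

Under this rule the old target `#transcription` would resolve to a heading titled "Transcription", which is a different section from the one added in this commit.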