OpenDAS / vllm_cscc
Commit 2be07a0d (Unverified)
Authored Aug 09, 2025 by Thomas Parnell; committed by GitHub, Aug 09, 2025

Update docs for Minimax-Text support (#22562)

Signed-off-by: Thomas Parnell <tpa@zurich.ibm.com>

Parent: 0edc0cd5
Showing 2 changed files with 6 additions and 2 deletions:
docs/models/supported_models.md (+2 -2)
docs/usage/v1_guide.md (+4 -0)
docs/models/supported_models.md
...
...
@@ -404,8 +404,8 @@ th {
 | `TeleChat2ForCausalLM` | TeleChat2 | `Tele-AI/TeleChat2-3B`, `Tele-AI/TeleChat2-7B`, `Tele-AI/TeleChat2-35B`, etc. | ✅︎ | ✅︎ | ✅︎ |
 | `TeleFLMForCausalLM` | TeleFLM | `CofeAI/FLM-2-52B-Instruct-2407`, `CofeAI/Tele-FLM`, etc. | ✅︎ | ✅︎ | ✅︎ |
 | `XverseForCausalLM` | XVERSE | `xverse/XVERSE-7B-Chat`, `xverse/XVERSE-13B-Chat`, `xverse/XVERSE-65B-Chat`, etc. | ✅︎ | ✅︎ | ✅︎ |
-| `MiniMaxM1ForCausalLM` | MiniMax-Text | `MiniMaxAI/MiniMax-M1-40k`, `MiniMaxAI/MiniMax-M1-80k`, etc. | | | |
-| `MiniMaxText01ForCausalLM` | MiniMax-Text | `MiniMaxAI/MiniMax-Text-01`, etc. | | | |
+| `MiniMaxM1ForCausalLM` | MiniMax-Text | `MiniMaxAI/MiniMax-M1-40k`, `MiniMaxAI/MiniMax-M1-80k`, etc. | | | ✅︎ |
+| `MiniMaxText01ForCausalLM` | MiniMax-Text | `MiniMaxAI/MiniMax-Text-01`, etc. | | | ✅︎ |
 | `Zamba2ForCausalLM` | Zamba2 | `Zyphra/Zamba2-7B-instruct`, `Zyphra/Zamba2-2.7B-instruct`, `Zyphra/Zamba2-1.2B-instruct`, etc. | | | ✅︎ |

 !!! note
...
...
docs/usage/v1_guide.md
...
...
@@ -111,6 +111,10 @@ Models that combine Mamba-2 and Mamba-1 layers with standard attention layers ar
 `Zamba2ForCausalLM`, `NemotronHForCausalLM`, `FalconH1ForCausalLM` and `GraniteMoeHybridForCausalLM`, `JambaForCausalLM`). Please note that
 these models currently require disabling prefix caching and using the FlashInfer attention backend in V1.

+Hybrid models with mechanisms different to Mamba are also supported (e.g, `MiniMaxText01ForCausalLM`, `MiniMaxM1ForCausalLM`).
+Please note that these models currently require disabling prefix caching, enforcing eager mode, and using the FlashInfer
+attention backend in V1.
+
 #### Encoder-Decoder Models

 Models requiring cross-attention between separate encoder and decoder (e.g., `BartForConditionalGeneration`, `MllamaForConditionalGeneration`)
...
...
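The requirements stated in the `docs/usage/v1_guide.md` hunk (prefix caching disabled, FlashInfer backend, and eager mode for the MiniMax architectures) could be applied roughly as in the sketch below. This is not part of the commit. The helper names `hybrid_engine_settings` and `build_llm` are hypothetical, and the sketch assumes that vLLM's `LLM` constructor accepts `enable_prefix_caching` and `enforce_eager`, and that the attention backend is selected through the `VLLM_ATTENTION_BACKEND` environment variable; check these against the vLLM version in use.

```python
import os

# Architectures named in the v1_guide.md hunk of this commit.
MAMBA_HYBRIDS = {
    "Zamba2ForCausalLM",
    "NemotronHForCausalLM",
    "FalconH1ForCausalLM",
    "GraniteMoeHybridForCausalLM",
    "JambaForCausalLM",
}
NON_MAMBA_HYBRIDS = {"MiniMaxText01ForCausalLM", "MiniMaxM1ForCausalLM"}


def hybrid_engine_settings(architecture: str) -> tuple[dict, dict]:
    """Return (engine_kwargs, env_vars) for a hybrid architecture (hypothetical helper)."""
    if architecture not in MAMBA_HYBRIDS | NON_MAMBA_HYBRIDS:
        raise ValueError(f"not a hybrid architecture listed in the guide: {architecture}")
    # Both groups require prefix caching to be disabled.
    kwargs = {"enable_prefix_caching": False}
    # Only the non-Mamba hybrids (MiniMax) additionally require eager mode.
    if architecture in NON_MAMBA_HYBRIDS:
        kwargs["enforce_eager"] = True
    # Assumption: the backend is chosen through this environment variable.
    env = {"VLLM_ATTENTION_BACKEND": "FLASHINFER"}
    return kwargs, env


def build_llm(model: str, architecture: str):
    """Construct a vLLM engine with the settings above (needs vLLM and FlashInfer installed)."""
    kwargs, env = hybrid_engine_settings(architecture)
    os.environ.update(env)  # set before vLLM is imported so the backend choice is seen
    from vllm import LLM  # imported lazily so the helper above works without vLLM
    return LLM(model=model, **kwargs)
```

For example, `build_llm("MiniMaxAI/MiniMax-Text-01", "MiniMaxText01ForCausalLM")` would request eager mode, while the Mamba-based hybrids would only have prefix caching disabled.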