Commit 63ced7b4 authored by Cyrus Leung, committed by GitHub

[Doc] Update notes for H2O-VL and Gemma3 (#17219)


Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
parent dc47ba32
@@ -1118,11 +1118,6 @@ See [this page](#generative-models) for more information on how to use generativ
<sup>E</sup> Pre-computed embeddings can be inputted for this modality.
<sup>+</sup> Multiple items can be inputted per text prompt for this modality.
-:::{important}
-Pan-and-scan image pre-processing is currently supported on V0 (but not V1).
-You can enable it by passing `--mm-processor-kwargs '{"do_pan_and_scan": true}'`.
-:::
:::{warning}
Both V0 and V1 support `Gemma3ForConditionalGeneration` for text-only inputs.
However, there are differences in how they handle text + image inputs:
@@ -1142,7 +1137,7 @@ This limitation exists because the model's mixed attention pattern (bidirectiona
:::
:::{note}
-`h2oai/h2ovl-mississippi-2b` will be available in V1 once we support backends other than FlashAttention.
+`h2oai/h2ovl-mississippi-2b` will be available in V1 once we support head size 80.
:::
:::{note}
......