OpenDAS / vllm_cscc
Commit 7cd74091 (Unverified)
Authored Dec 13, 2024 by Jani Monoses
Committed by GitHub, Dec 13, 2024

PaliGemma 2 support (#11142)

Parent: be39e3cd
Showing 3 changed files with 25 additions and 3 deletions (+25, -3)

docs/source/models/supported_models.rst  +2 -2
examples/offline_inference_vision_language.py  +13 -0
vllm/model_executor/models/paligemma.py  +10 -1
docs/source/models/supported_models.rst
@@ -664,9 +664,9 @@ Text Generation (``--task generate``)
       - ✅︎
       - ✅︎
     * - :code:`PaliGemmaForConditionalGeneration`
-      - PaliGemma
+      - PaliGemma, PaliGemma 2
       - T + I\ :sup:`E`
-      - :code:`google/paligemma-3b-pt-224`, :code:`google/paligemma-3b-mix-224`, etc.
+      - :code:`google/paligemma-3b-pt-224`, :code:`google/paligemma-3b-mix-224`, :code:`google/paligemma2-3b-ft-docci-448`, etc.
       -
       - ✅︎
       -
examples/offline_inference_vision_language.py
@@ -137,6 +137,18 @@ def run_paligemma(question: str, modality: str):
     return llm, prompt, stop_token_ids
 
 
+# PaliGemma 2
+def run_paligemma2(question: str, modality: str):
+    assert modality == "image"
+
+    # PaliGemma 2 has special prompt format for VQA
+    prompt = "caption en"
+    llm = LLM(model="google/paligemma2-3b-ft-docci-448",
+              mm_cache_preprocessor=args.mm_cache_preprocessor)
+    stop_token_ids = None
+    return llm, prompt, stop_token_ids
+
+
 # Chameleon
 def run_chameleon(question: str, modality: str):
     assert modality == "image"
@@ -473,6 +485,7 @@ model_example_map = {
     "fuyu": run_fuyu,
     "phi3_v": run_phi3v,
     "paligemma": run_paligemma,
+    "paligemma2": run_paligemma2,
     "chameleon": run_chameleon,
     "minicpmv": run_minicpmv,
     "blip-2": run_blip2,
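The `model_example_map` hunk registers the new runner under the key `"paligemma2"`. The following is a minimal sketch of how such a name-to-function map can be dispatched. It assumes stub runners that return placeholder values instead of constructing a real `LLM`. The stub bodies, the `RunnerResult` alias, and the `select_runner` helper are hypothetical and not part of the commit.

```python
from typing import Callable, Dict, Optional, Tuple

# Hypothetical result type: (llm, prompt, stop_token_ids).
# The real runners return a vllm LLM object as the first element.
RunnerResult = Tuple[str, str, Optional[list]]


def run_paligemma(question: str, modality: str) -> RunnerResult:
    # Stub: placeholder string instead of a real model.
    assert modality == "image"
    return "paligemma-stub", "caption en", None


def run_paligemma2(question: str, modality: str) -> RunnerResult:
    # Stub: placeholder string instead of a real model.
    assert modality == "image"
    return "paligemma2-stub", "caption en", None


model_example_map: Dict[str, Callable[[str, str], RunnerResult]] = {
    "paligemma": run_paligemma,
    "paligemma2": run_paligemma2,
}


def select_runner(model_type: str) -> Callable[[str, str], RunnerResult]:
    """Look up a runner by name and fail with a readable error if unknown."""
    if model_type not in model_example_map:
        raise ValueError(f"Model type {model_type} is not supported.")
    return model_example_map[model_type]


llm, prompt, stop_token_ids = select_runner("paligemma2")("What is shown?", "image")
```

Keeping the map as a plain dictionary means adding a model is a one-line change, which is what this hunk does.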
vllm/model_executor/models/paligemma.py
@@ -105,6 +105,11 @@ def input_processor_for_paligemma(ctx: InputContext,
     orig_prompt_ids.remove(hf_config.image_token_index)
 
     new_prompt = f"{image_token_str_pad}{bos_token}{orig_prompt}\n"
+
+    # The PaliGemma 2 tokenizer does not include a starting BOS token
+    if orig_prompt_ids[0] != hf_config.bos_token_id:
+        orig_prompt_ids = [hf_config.bos_token_id] + orig_prompt_ids
+
     new_token_ids = image_token_ids_pad + orig_prompt_ids + [108]  #newline
 
     # NOTE: Create a defensive copy of the original inputs
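The BOS handling in this hunk can be illustrated in isolation. The sketch below assumes plain Python lists of token ids. The helper name `ensure_leading_bos` and the BOS id `2` are illustrative assumptions, and only the newline id `108` comes from the diff. Unlike the check in the diff, this version also tolerates an empty id list.

```python
from typing import List


def ensure_leading_bos(token_ids: List[int], bos_token_id: int) -> List[int]:
    """Return a new list whose first element is bos_token_id.

    If the list already starts with BOS it is copied unchanged;
    otherwise BOS is prepended. An empty list yields [bos_token_id].
    """
    if not token_ids or token_ids[0] != bos_token_id:
        return [bos_token_id] + token_ids
    return list(token_ids)


BOS = 2        # assumed BOS id, for illustration only
NEWLINE = 108  # newline token id appended in the diff

# A tokenizer that omits BOS and one that includes it end up identical.
without_bos = ensure_leading_bos([1234, 5678], BOS) + [NEWLINE]
with_bos = ensure_leading_bos([BOS, 1234, 5678], BOS) + [NEWLINE]
```

Normalizing the ids this way lets the same input processor serve tokenizers that do and do not emit a leading BOS token.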
@@ -149,7 +154,11 @@ class PaliGemmaForConditionalGeneration(nn.Module, SupportsMultiModal,
             projection_dim=config.vision_config.projection_dim)
 
         self.quant_config = quant_config
+
+        if config.text_config.model_type == "gemma":
+            config.text_config.architectures = ["GemmaForCausalLM"]
+        else:
+            config.text_config.architectures = ["Gemma2ForCausalLM"]
         self.language_model = init_vllm_registered_model(
             vllm_config=vllm_config,
             hf_config=config.text_config,
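The architecture selection above reduces to a small pure function. The sketch below assumes a config object that exposes only a `model_type` attribute. The helper `pick_text_architecture` and the `SimpleNamespace` configs are hypothetical stand-ins, and `"gemma2"` is an assumed value for the PaliGemma 2 text config rather than something shown in the diff.

```python
from types import SimpleNamespace


def pick_text_architecture(text_config) -> str:
    """Mirror the if/else in the diff: "gemma" selects Gemma, anything else Gemma 2."""
    if text_config.model_type == "gemma":
        return "GemmaForCausalLM"
    return "Gemma2ForCausalLM"


# Hypothetical configs; real ones are loaded from the model checkpoint.
paligemma_cfg = SimpleNamespace(model_type="gemma")
paligemma2_cfg = SimpleNamespace(model_type="gemma2")  # assumed value

for cfg in (paligemma_cfg, paligemma2_cfg):
    cfg.architectures = [pick_text_architecture(cfg)]
```

Because the fallback branch catches every `model_type` other than `"gemma"`, an unexpected value would also be routed to `Gemma2ForCausalLM`.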