OpenDAS / vllm_cscc
Commit e66b629c (Unverified)
Authored Mar 26, 2024 by Woosuk Kwon
Committed by GitHub, Mar 26, 2024

[Misc] Minor fix in KVCache type (#3652)

parent 76879342

Changes: 3 changed files with 4 additions and 8 deletions (+4 -8)

docs/source/models/adding_model.rst    +2 -2
vllm/model_executor/models/llava.py    +2 -4
vllm/worker/neuron_model_runner.py     +0 -2

docs/source/models/adding_model.rst

@@ -56,8 +56,8 @@ Next, you need to rewrite the :code:`forward` methods of your model by following
   -    return_dict: Optional[bool] = None,
   -) -> Union[Tuple, CausalLMOutputWithPast]:
   +    positions: torch.Tensor,
-  +    kv_caches: List[KVCache],
-  +    input_metadata: InputMetadata,
+  +    kv_caches: List[torch.Tensor],
+  +    attn_metadata: AttentionMetadata,
   +) -> Optional[SamplerOutput]:
 
 1. Update the code by considering that :code:`input_ids` and :code:`positions` are now flattened tensors.
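
The context line above says that `input_ids` and `positions` are now flattened tensors. A minimal sketch of what that might look like follows. It uses plain Python lists instead of tensors, and the helper name `flatten_batch` is hypothetical and not part of vLLM. It assumes "flattened" means that the sequences in a batch are concatenated along a single token dimension without padding, and that each token keeps its position within its own sequence.

```python
from typing import List, Tuple


def flatten_batch(prompts: List[List[int]]) -> Tuple[List[int], List[int]]:
    """Concatenate per-sequence token ids into one flat list.

    Hypothetical helper for illustration only; not a vLLM API.
    Returns (input_ids, positions), each of length sum(len(p) for p in prompts).
    """
    input_ids: List[int] = []
    positions: List[int] = []
    for tokens in prompts:
        input_ids.extend(tokens)
        # Assumption: positions restart at 0 for every sequence in the batch.
        positions.extend(range(len(tokens)))
    return input_ids, positions


if __name__ == "__main__":
    ids, pos = flatten_batch([[11, 12, 13], [21, 22]])
    print(ids)  # [11, 12, 13, 21, 22]
    print(pos)  # [0, 1, 2, 0, 1]
```

Under this assumption, a model's `forward` sees one token dimension rather than separate batch and sequence dimensions, so any code that indexes by batch row needs to be revisited.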

vllm/model_executor/models/llava.py

-from typing import List, Optional, Tuple
+from typing import List, Optional
 
 import torch
 from torch import nn

@@ -19,8 +19,6 @@ from vllm.model_executor.weight_utils import (default_weight_loader,
                                               hf_model_weights_iterator)
 from vllm.sequence import SamplerOutput
 
-KVCache = Tuple[torch.Tensor, torch.Tensor]
-
 _KEYS_TO_MODIFY_MAPPING = {
     "language_model.lm_head": "lm_head",
     "language_model.model": "language_model",

@@ -102,7 +100,7 @@ class LlavaForConditionalGeneration(nn.Module):
         self,
         input_ids: torch.Tensor,
         positions: torch.Tensor,
-        kv_caches: List[KVCache],
+        kv_caches: List[torch.Tensor],
         attn_metadata: AttentionMetadata,
         image_input: Optional[torch.Tensor] = None
     ) -> SamplerOutput:  # noqa: E501

vllm/worker/neuron_model_runner.py

@@ -14,8 +14,6 @@ from vllm.utils import (async_tensor_h2d, is_pin_memory_available,
 logger = init_logger(__name__)
 
-KVCache = Tuple[torch.Tensor, torch.Tensor]
-
 
 class NeuronModelRunner: