OpenDAS / vllm_cscc
Commit 128bf752 (Unverified)
Authored Mar 13, 2025 by TY-AMD; committed by GitHub Mar 12, 2025
[BugFix][TritonMLA] Process weights after model loading for GGUF (#14555)
Signed-off-by: TianyuanWu <Tianyuan.Wu@amd.com>
Parent: a94a699c
Showing 1 changed file with 4 additions and 1 deletion
vllm/model_executor/model_loader/loader.py (+4 -1)
@@ -1330,11 +1330,14 @@ class GGUFModelLoader(BaseModelLoader):
                 local_model_path, gguf_weights_map):
             model_config.hf_config.update({"tie_word_embeddings": True})
 
+        target_device = torch.device(device_config.device)
         with set_default_torch_dtype(model_config.dtype):
-            with torch.device(device_config.device):
+            with target_device:
                 model = _initialize_model(vllm_config=vllm_config)
             model.load_weights(
                 self._get_weights_iterator(local_model_path, gguf_weights_map))
+
+            _process_weights_after_loading(model, model_config, target_device)
         return model