OpenDAS / vllm_cscc
Commit e2afb03c (Unverified)
Authored Jun 14, 2024 by Thomas Parnell; committed by GitHub, Jun 14, 2024
[Bugfix] Enable loading FP8 checkpoints for gpt_bigcode models (#5460)
Signed-off-by: Thomas Parnell <tpa@zurich.ibm.com>
parent 6e2527a7
Showing 1 changed file with 7 additions and 1 deletion
vllm/model_executor/models/gpt_bigcode.py
@@ -299,4 +299,10 @@ class GPTBigCodeForCausalLM(nn.Module):
             param = params_dict[name]
             weight_loader = getattr(param, "weight_loader",
                                     default_weight_loader)
-            weight_loader(param, loaded_weight)
+            # TODO (@robertgshaw2-neuralmagic): move to fp8 linear method
+            if "c_attn.input_scale" in name or "c_attn.weight_scale" in name:
+                weight_loader(param, loaded_weight, 'q')
+                weight_loader(param, loaded_weight, 'k')
+                weight_loader(param, loaded_weight, 'v')
+            else:
+                weight_loader(param, loaded_weight)
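The dispatch in this hunk can be sketched outside of vLLM. One plausible reading of the change is that an FP8 checkpoint stores a single scale for the fused `c_attn` projection, while the parameter's loader expects to be told which logical shard (`q`, `k`, or `v`) a scale belongs to, so the same value is loaded once per shard. The sketch below only illustrates that control flow: `Param`, `qkv_scale_loader`, `load_one`, and the example parameter names are hypothetical stand-ins, not vLLM APIs.

```python
# Hypothetical mapping from shard id to a slot in a per-shard scale list.
SHARD_INDEX = {"q": 0, "k": 1, "v": 2}


class Param:
    """Toy parameter; `weight_loader` is attached only when one is given."""

    def __init__(self, data, weight_loader=None):
        self.data = data
        if weight_loader is not None:
            self.weight_loader = weight_loader


def default_weight_loader(param, loaded_weight):
    # Fallback used when a parameter carries no custom loader.
    param.data = loaded_weight


def qkv_scale_loader(param, loaded_weight, shard_id=None):
    # Assumed behavior: a fused QKV scale must be written per shard.
    if shard_id is None:
        raise ValueError("a fused QKV scale needs a shard id")
    param.data[SHARD_INDEX[shard_id]] = loaded_weight


def load_one(params_dict, name, loaded_weight):
    """Mirror of the branch in the diff, for a single checkpoint entry."""
    param = params_dict[name]
    weight_loader = getattr(param, "weight_loader", default_weight_loader)
    if "c_attn.input_scale" in name or "c_attn.weight_scale" in name:
        # Same scale value is loaded into each of the q, k, and v shards.
        for shard_id in ("q", "k", "v"):
            weight_loader(param, loaded_weight, shard_id)
    else:
        weight_loader(param, loaded_weight)


params = {
    "h.0.attn.c_attn.weight_scale": Param([None, None, None], qkv_scale_loader),
    "h.0.mlp.c_fc.bias": Param(None),
}
load_one(params, "h.0.attn.c_attn.weight_scale", 0.5)
load_one(params, "h.0.mlp.c_fc.bias", [1.0, 2.0])
```

In this toy setup the scale parameter ends up holding the same value in all three slots, while the bias goes through the default loader unchanged. Calling `qkv_scale_loader` without a shard id raises, which is the kind of failure the single-argument call removed by the diff would hit if the real loader behaves similarly.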