norm / vllm · Commits · 621980bd

Commit 621980bd (unverified)
Authored Aug 05, 2023 by Wen Sun · Committed by GitHub Aug 04, 2023
fix: incorrect bigcode attention heads num (#676)
parent aa84c92e
Changes: 1 changed file, with 5 additions and 3 deletions.

vllm/config.py (+5 -3)
@@ -98,9 +98,11 @@ class ModelConfig:
         # Note: for falcon, when new_decoder_architecture is True, the
         # multi_query flag is ignored and we use n_head_kv for the number of
         # KV heads.
-        if (getattr(self.hf_config, "multi_query", False) and
-                (self.hf_config.model_type == "falcon" and
-                 not getattr(self.hf_config, "new_decoder_architecture", False))):
+        new_decoder_arch_falcon = (
+            self.hf_config.model_type == "falcon"
+            and getattr(self.hf_config, "new_decoder_architecture", False))
+        if not new_decoder_arch_falcon and getattr(self.hf_config,
+                                                   "multi_query", False):
             # Multi-query attention, only one KV head.
             return 1
         # For Falcon:
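Why this fixes bigcode: the old condition returned one KV head only for falcon checkpoints without the new decoder architecture, so gpt_bigcode models with multi_query set fell through and were treated as having one KV head per attention head. The new code first decides whether the model is a new-decoder-architecture falcon, and otherwise honors multi_query for any model type. Below is a minimal sketch of the two predicates, not vLLM code: the SimpleNamespace stand-in for hf_config, the total_heads fallback, and the 48-head StarCoder-like numbers are illustrative assumptions (the real method goes on to consult n_head_kv and other config fields).

from types import SimpleNamespace

def num_kv_heads_old(cfg, total_heads):
    # Pre-fix logic: returns 1 only for falcon checkpoints without the
    # new decoder architecture, so bigcode's multi_query flag is ignored.
    if (getattr(cfg, "multi_query", False) and
            (cfg.model_type == "falcon" and
             not getattr(cfg, "new_decoder_architecture", False))):
        return 1
    return total_heads  # simplified fallback for this sketch

def num_kv_heads_new(cfg, total_heads):
    # Post-fix logic: any multi-query model gets 1 KV head, unless it is
    # a new-decoder-architecture falcon (which uses n_head_kv instead).
    new_decoder_arch_falcon = (
        cfg.model_type == "falcon"
        and getattr(cfg, "new_decoder_architecture", False))
    if not new_decoder_arch_falcon and getattr(cfg, "multi_query", False):
        return 1
    return total_heads  # simplified fallback for this sketch

# Illustrative StarCoder-like config: 48 attention heads, multi-query on.
bigcode = SimpleNamespace(model_type="gpt_bigcode", multi_query=True)
print(num_kv_heads_old(bigcode, 48))  # 48 -- the bug reported in #676
print(num_kv_heads_new(bigcode, 48))  # 1  -- multi-query: one shared KV head

Falcon behavior is unchanged by the commit: an old-architecture falcon with multi_query still gets 1 KV head, and a new-decoder-architecture falcon skips this branch so the method can use n_head_kv, as the in-code comment describes.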