OpenDAS / AutoAWQ
Commit 80222c63
Authored Jun 25, 2023 by Jiaming Tang
[Minor] skip qk bmm (Bloom, MPT, Falcon)
Parent: 71d8e68d
Showing 1 changed file with 1 addition and 1 deletion

awq/quantize/auto_clip.py
@@ -73,7 +73,7 @@ def auto_clip_block(module,
     clip_list = []
     for name in named_linears:
         # due to qk bmm, it is hard to clip precisely
-        if any([_ in name for _ in ["q_", "k_"]]):
+        if any([_ in name for _ in ["q_", "k_", "query", "key", "Wqkv"]]):
             continue
         max_val = auto_clip_layer(
             named_linears[name].weight, input_feat[name], n_bit=w_bit, q_config=q_config)
...
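The effect of the widened substring list can be sketched in isolation. The snippet below is a minimal illustration, not code from the repository: the helper `skips_clipping` and the layer names are hypothetical, chosen to resemble the fused query/key/value projections that the commit title associates with Bloom, MPT, and Falcon. Only the two pattern lists are taken from the diff.

```python
# Pattern lists copied from the old and new conditions in the diff.
OLD_PATTERNS = ["q_", "k_"]
NEW_PATTERNS = ["q_", "k_", "query", "key", "Wqkv"]


def skips_clipping(name, patterns=NEW_PATTERNS):
    """Return True if the layer name contains any skip pattern.

    Hypothetical helper: it mirrors the `any(...)` test in the diff.
    """
    return any(p in name for p in patterns)


# Hypothetical layer names, assumed for illustration only.
example_names = [
    "self_attn.q_proj",                # separate query projection
    "self_attn.k_proj",                # separate key projection
    "self_attn.v_proj",                # separate value projection
    "self_attention.query_key_value",  # fused projection (assumed naming)
    "attn.Wqkv",                       # fused projection (assumed naming)
    "mlp.dense_h_to_4h",               # feed-forward layer
]

skipped = [n for n in example_names if skips_clipping(n)]
clipped = [n for n in example_names if not skips_clipping(n)]
missed_before = [
    n for n in skipped if not skips_clipping(n, OLD_PATTERNS)
]

print("skipped:", skipped)
print("clipped:", clipped)
print("skipped only after this change:", missed_before)
```

Because the match is a plain substring test on the layer name, a fused projection is skipped as a whole, so under these assumed names its value part would also be left unclipped.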