OpenDAS / AutoAWQ · Commits

Commit 5bd6fbc7, authored Sep 09, 2023 by Casper Hansen
Parent: adc5304b

    Update module name
Showing 1 changed file with 2 additions and 2 deletions

awq/models/llama.py (+2 -2)
@@ -70,7 +70,7 @@ from typing import List, Tuple, Union
 from awq.utils.utils import set_module_name
 from awq.modules.fused.mlp import QuantLlamaMLP
 from awq.modules.fused.norm import FTLlamaRMSNorm
-from awq.modules.fused.attn import QuantLlamaAttentionFused
+from awq.modules.fused.attn import QuantAttentionFused
 from awq.modules.linear import WQLinear_GEMM, WQLinear_GEMV
 from transformers.models.llama.modeling_llama import LlamaAttention, LlamaRMSNorm, LlamaMLP
@@ -97,7 +97,7 @@ class LlamaFuser:
     def fuse_attention(self):
         for name, module in self.attention_modules:
             qkv_layer: Union[WQLinear_GEMM, WQLinear_GEMV] = self._fuse_qkv(module)
-            attn = QuantLlamaAttentionFused(
+            attn = QuantAttentionFused(
                 module.hidden_size,
                 module.num_heads,
                 qkv_layer,
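The commit itself only renames the fused attention class (QuantLlamaAttentionFused to QuantAttentionFused); the surrounding fuse_attention code constructs a fused replacement for each attention module and installs it via set_module_name. A minimal sketch of that replace-by-dotted-name pattern follows; note the Module class and the set_module_name body below are simplified illustrative stand-ins, not AutoAWQ's actual implementation (which lives in awq.utils.utils and operates on torch.nn.Module trees).

```python
# Illustrative sketch of the module-swap pattern used by LlamaFuser.fuse_attention:
# locate a sub-module by its dotted name and replace it with a fused equivalent.
# All names here are stand-ins for this example.

class Module:
    """Minimal container mimicking a module with named sub-modules."""
    pass

def set_module_name(model, name, new_module):
    # Resolve the dotted path (e.g. "layer0.attn") down to the parent
    # object, then rebind the leaf attribute to the new module.
    *parents, leaf = name.split(".")
    target = model
    for part in parents:
        target = getattr(target, part)
    setattr(target, leaf, new_module)

# Build a toy model: model.layer0.attn holds the unfused attention.
model = Module()
model.layer0 = Module()
model.layer0.attn = "LlamaAttention"

# Swap it out, as fuse_attention does for each attention module it finds.
set_module_name(model, "layer0.attn", "QuantAttentionFused")
print(model.layer0.attn)  # QuantAttentionFused
```

Replacing modules in place like this keeps the rest of the model graph untouched, which is why only the class name had to change in this commit.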