OpenDAS / AutoAWQ
"...text-generation-inference.git" did not exist on "0d9917f74497b81927001bde9928b7c1ae2aedcd"
Commit f3a71d1d
Authored Sep 19, 2023 by Casper Hansen
Use GEMM v2 kernel for context processing
Parent: 2fa3a5d1
Showing 1 changed file with 7 additions and 1 deletion: awq/modules/linear.py (+7 -1)
awq/modules/linear.py

@@ -194,7 +194,13 @@ class WQLinear_GEMV(nn.Module):
     @torch.no_grad()
     def forward(self, x):
         out_shape = x.shape[:-1] + (self.out_features, )
-        out = awq_inference_engine.gemv_forward_cuda(x.reshape(-1, x.shape[-1]), self.qweight, self.scales, self.qzeros, self.group_size)
+        inputs = x.reshape(-1, x.shape[-1])
+
+        if inputs.shape[0] > 8:
+            out = awq_inference_engine.gemmv2_forward_cuda(inputs, self.qweight, self.scales, self.qzeros, self.group_size, self.split_k_iters)
+        else:
+            out = awq_inference_engine.gemv_forward_cuda(inputs, self.qweight, self.scales, self.qzeros, self.group_size)
+
         out = out + self.bias if self.bias is not None else out
         return out.reshape(out_shape)
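
The diff routes context (prefill) processing onto the GEMM v2 kernel: once the flattened input has more than 8 rows, gemmv2_forward_cuda with split_k_iters is used instead of the single-token gemv_forward_cuda path. Below is a minimal sketch of that dispatch pattern, assuming plain float matmuls as stand-ins for the compiled awq_inference_engine kernels (the real kernels consume the packed qweight, scales, qzeros, group_size and split_k_iters arguments shown above); the helper names and stand-in math here are illustrative only.

import torch

# Stand-ins for the compiled awq_inference_engine kernels (assumption: plain
# float matmuls; the real kernels operate on packed 4-bit weights).
def gemmv2_stand_in(inputs, weight):
    # Batched path: many rows at once (context / prefill processing).
    return inputs @ weight

def gemv_stand_in(inputs, weight):
    # Few-row path: matrix-vector style work during token-by-token decoding.
    return inputs @ weight

def dispatch_forward(x, weight, out_features, threshold=8):
    # Mirror the forward() above: flatten leading dims into rows, pick a
    # kernel by row count, then restore the original shape.
    out_shape = x.shape[:-1] + (out_features,)
    inputs = x.reshape(-1, x.shape[-1])
    if inputs.shape[0] > threshold:
        out = gemmv2_stand_in(inputs, weight)  # context processing -> GEMM v2
    else:
        out = gemv_stand_in(inputs, weight)    # decoding -> GEMV
    return out.reshape(out_shape)

# A 32-token prefill batch takes the GEMM v2 path; a single decode token keeps GEMV.
weight = torch.randn(256, 512)
prefill = torch.randn(1, 32, 256)
decode = torch.randn(1, 1, 256)
print(dispatch_forward(prefill, weight, 512).shape)  # torch.Size([1, 32, 512])
print(dispatch_forward(decode, weight, 512).shape)   # torch.Size([1, 1, 512])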