OpenDAS / AutoAWQ · Commits

Commit f8273a0c, authored Sep 06, 2023 by Casper Hansen
Parent: 7cf0d987

Multi-GPU support for quantized models
Showing 1 changed file with 15 additions and 7 deletions:

awq/models/base.py (+15, -7)
```diff
@@ -297,21 +297,29 @@ class BaseAWQForCausalLM(nn.Module):
         model.tie_weights()
 
+        device_map = infer_auto_device_map(
+            model,
+            no_split_module_classes=[self.layer_type],
+            dtype=torch_dtype
+        )
+
         # Load model weights
         if is_quantized:
-            model = load_checkpoint_and_dispatch(
-                model,
-                model_filename,
-                device_map=device,
-                no_split_module_classes=[self.layer_type]
-            )
+            model = load_checkpoint_and_dispatch(
+                model,
+                model_filename,
+                device_map=device_map,
+                no_split_module_classes=[self.layer_type]
+            )
 
             if fuse_layers:
                 self.fuse_layers(model)
+
+            from awq.utils.utils import simple_dispatch_model
+            model = simple_dispatch_model(model, device_map)
         else:
             # If not quantized, must load with AutoModelForCausalLM
-            device_map = infer_auto_device_map(model, no_split_module_classes=[self.layer_type], dtype=torch_dtype)
             del model
 
         # Load model weights
```
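The change moves the `infer_auto_device_map` call ahead of weight loading and passes the resulting `device_map` (rather than a single `device`) to `load_checkpoint_and_dispatch`, so a quantized checkpoint can be sharded across all available GPUs. Below is a minimal, runnable sketch of the same accelerate pattern; `TinyModel` and `Block` are hypothetical stand-ins for the real model and for whatever class name `self.layer_type` holds, and are not part of AutoAWQ:

```python
import torch
import torch.nn as nn
from accelerate import (
    infer_auto_device_map,
    init_empty_weights,
    load_checkpoint_and_dispatch,
)

class Block(nn.Module):
    """Stand-in for a transformer layer (the role self.layer_type names in AutoAWQ)."""
    def __init__(self, dim: int = 64):
        super().__init__()
        self.fc = nn.Linear(dim, dim)

    def forward(self, x):
        return torch.relu(self.fc(x))

class TinyModel(nn.Module):
    """Hypothetical model, used only to make the sketch self-contained."""
    def __init__(self, dim: int = 64, n_layers: int = 4):
        super().__init__()
        self.layers = nn.ModuleList([Block(dim) for _ in range(n_layers)])

    def forward(self, x):
        for layer in self.layers:
            x = layer(x)
        return x

# Save a checkpoint once so load_checkpoint_and_dispatch has something to read.
torch.save(TinyModel().state_dict(), "tiny_checkpoint.pt")

# Build the model skeleton on the meta device, allocating no real memory,
# mirroring how AutoAWQ instantiates the model before loading weights.
with init_empty_weights():
    model = TinyModel()

# Split the model across available devices (GPUs first, then CPU) without ever
# splitting a Block across two devices; this mirrors
# no_split_module_classes=[self.layer_type] in the commit.
device_map = infer_auto_device_map(
    model,
    no_split_module_classes=["Block"],
    dtype=torch.float16,  # dtype only affects the memory-size estimate
)

# Load the checkpoint and place each submodule on its assigned device.
model = load_checkpoint_and_dispatch(
    model,
    "tiny_checkpoint.pt",
    device_map=device_map,
    no_split_module_classes=["Block"],
)
print(device_map)  # e.g. {'layers.0': 0, 'layers.1': 0, 'layers.2': 1, ...}
```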
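The final addition re-dispatches the model after `fuse_layers` via AutoAWQ's internal `simple_dispatch_model` helper, which this diff only imports, so its body isn't shown here. A rough public-API analogue, continuing the sketch above, is accelerate's `dispatch_model`, which moves each submodule to its mapped device and installs hooks that carry activations across device boundaries at runtime:

```python
# Continuing the sketch above (model and device_map already defined).
# dispatch_model is accelerate's public placement API; AutoAWQ's
# simple_dispatch_model is assumed here to play a comparable role after
# fused layers replace the modules that were dispatched during loading.
from accelerate import dispatch_model

model = dispatch_model(model, device_map=device_map)

# Inputs can start on CPU; the hooks move tensors between devices as needed.
x = torch.randn(2, 64)
print(model(x).shape)  # torch.Size([2, 64])
```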