OpenDAS / vllm_cscc
Commit 73030b7d (Unverified)
Authored Jul 14, 2024 by Robert Shaw
Committed by GitHub, Jul 14, 2024

[ Misc ] Enable Quantizing All Layers of DeekSeekv2 (#6423)

Parent: ccd3c045
Changes: 2 changed files with 6 additions and 1 deletion (+6 -1)
.buildkite/lm-eval-harness/run-lm-eval-gsm-vllm-baseline.sh  +1 -1
vllm/model_executor/model_loader/weight_utils.py  +5 -0
.buildkite/lm-eval-harness/run-lm-eval-gsm-vllm-baseline.sh
@@ -46,6 +46,6 @@ while getopts "m:b:l:f:t:" OPT; do
 done

 lm_eval --model vllm \
-  --model_args pretrained=$MODEL,tensor_parallel_size=$TP_SIZE,add_bos_token=true,distributed_executor_backend="ray",trust_remote_code=true \
+  --model_args pretrained=$MODEL,tensor_parallel_size=$TP_SIZE,add_bos_token=true,distributed_executor_backend="ray",trust_remote_code=true,max_model_len=4096 \
   --tasks gsm8k --num_fewshot $FEWSHOT --limit $LIMIT \
   --batch_size $BATCH_SIZE
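
The only change in this script is one extra `max_model_len=4096` entry in the comma-separated `--model_args` value. As a rough illustration of how that string is composed, here is a minimal Python sketch. The helper `build_model_args` and the placeholder values for the model name and tensor-parallel size are hypothetical and do not come from the commit. The script itself builds the string directly in shell.

```python
def build_model_args(options: dict) -> str:
    """Join options into a comma-separated key=value string.

    Hypothetical helper for illustration only; the commit's shell script
    writes this string inline rather than generating it.
    """
    return ",".join(f"{key}={value}" for key, value in options.items())


# Placeholder values stand in for the script's $MODEL and $TP_SIZE variables.
old_args = {
    "pretrained": "MODEL_PLACEHOLDER",
    "tensor_parallel_size": 1,
    "add_bos_token": "true",
    "distributed_executor_backend": "ray",
    "trust_remote_code": "true",
}

# The commit appends exactly one option to the existing list.
new_args = {**old_args, "max_model_len": 4096}

print(build_model_args(new_args))
```

Because the new key is appended at the end, the new string is the old one plus the `,max_model_len=4096` suffix, which matches the one-line change in the diff.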
vllm/model_executor/model_loader/weight_utils.py
@@ -431,6 +431,11 @@ def convert_pyslice_to_tensor(x: Any) -> torch.Tensor:
 def default_weight_loader(param: torch.Tensor,
                           loaded_weight: torch.Tensor) -> None:
     """Default weight loader."""
+    # If the weight on disk does not have a shape, give it one
+    # (such scales for AutoFp8).
+    if len(loaded_weight.shape) == 0:
+        loaded_weight = loaded_weight.reshape(1)
+
     assert param.size() == loaded_weight.size()
     param.data.copy_(loaded_weight)
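
The branch added to `default_weight_loader` can be tried in isolation. Below is a minimal sketch, assuming PyTorch is installed. `load_scalar_aware` is a hypothetical stand-in that mirrors the patched logic, not vLLM's own function, and the parameter and checkpoint values are made up for illustration.

```python
import torch


def load_scalar_aware(param: torch.Tensor, loaded_weight: torch.Tensor) -> None:
    """Hypothetical stand-in mirroring the patched default_weight_loader."""
    # A 0-dim tensor (for example, a per-tensor scale saved as a bare scalar)
    # has shape torch.Size([]); give it shape [1] before the size check.
    if len(loaded_weight.shape) == 0:
        loaded_weight = loaded_weight.reshape(1)
    assert param.size() == loaded_weight.size()
    param.data.copy_(loaded_weight)


# Illustrative values: the parameter has shape [1], while the value read
# from the checkpoint is a shapeless scalar.
scale_param = torch.nn.Parameter(torch.ones(1), requires_grad=False)
scalar_on_disk = torch.tensor(0.5)

load_scalar_aware(scale_param, scalar_on_disk)
```

Without the reshape, the size assertion would fail in this scenario, because `torch.Size([1])` and `torch.Size([])` are not equal even though both tensors hold a single element.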