Commit 30fb0956
Authored Dec 17, 2023 by Woosuk Kwon; committed by GitHub on Dec 17, 2023
[Minor] Add more detailed explanation on `quantization` argument (#2145)
parent 3a765bd5
Showing 2 changed files with 10 additions and 4 deletions
vllm/engine/arg_utils.py +6 -1
vllm/entrypoints/llm.py +4 -3
vllm/engine/arg_utils.py
@@ -183,7 +183,12 @@ class EngineArgs:
                             type=str,
                             choices=['awq', 'gptq', 'squeezellm', None],
                             default=None,
-                            help='Method used to quantize the weights')
+                            help='Method used to quantize the weights. If '
+                            'None, we first check the `quantization_config` '
+                            'attribute in the model config file. If that is '
+                            'None, we assume the model weights are not '
+                            'quantized and use `dtype` to determine the data '
+                            'type of the weights.')
         parser.add_argument('--enforce-eager',
                             action='store_true',
                             help='Always use eager-mode PyTorch. If False, '
vllm/entrypoints/llm.py
@@ -38,9 +38,10 @@ class LLM:
             However, if the `torch_dtype` in the config is `float32`, we will
             use `float16` instead.
         quantization: The method used to quantize the model weights. Currently,
-            we support "awq", "gptq" and "squeezellm". If None, we assume the
-            model weights are not quantized and use `dtype` to determine the
-            data type of the weights.
+            we support "awq", "gptq" and "squeezellm". If None, we first check
+            the `quantization_config` attribute in the model config file. If
+            that is None, we assume the model weights are not quantized and use
+            `dtype` to determine the data type of the weights.
         revision: The specific model version to use. It can be a branch name,
             a tag name, or a commit id.
         tokenizer_revision: The specific tokenizer version to use. It can be a
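
Note on the documented fallback order: the updated help text and docstring describe a three-step resolution for `quantization` (explicit argument, then `quantization_config` in the model config, then unquantized with `dtype` deciding the weight type). The sketch below only illustrates that order; it is not the code from this commit. The helper name `resolve_quantization`, the plain-dict stand-in for the model config, and the `quant_method` key are assumptions made for the example.

```python
from typing import Optional

# Methods listed in the `choices` of the argument touched by this commit.
SUPPORTED_METHODS = ("awq", "gptq", "squeezellm")


def resolve_quantization(quantization: Optional[str],
                         model_config: dict) -> Optional[str]:
    """Hypothetical helper showing the documented fallback order.

    `model_config` is a plain dict standing in for the model config file.
    """
    if quantization is None:
        # Step 2: check the `quantization_config` attribute of the model config.
        quant_cfg = model_config.get("quantization_config")
        if quant_cfg is None:
            # Step 3: weights are treated as unquantized; `dtype` decides
            # the data type of the weights.
            return None
        # Assumption: the method name is stored under a "quant_method" key.
        quantization = str(quant_cfg.get("quant_method", "")).lower()
    if quantization not in SUPPORTED_METHODS:
        raise ValueError(f"Unsupported quantization method: {quantization!r}")
    return quantization
```

An explicit value short-circuits the lookup, so `resolve_quantization("awq", {})` returns `"awq"` without consulting the config. How a mismatch between an explicit argument and the config file is handled is not described in this diff and is left out of the sketch.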