OpenDAS / vllm_cscc
Commit 1c2bec0f (Unverified)
Authored Mar 22, 2025 by wwl2755; committed by GitHub, Mar 21, 2025
[Doc] add load_format items in docs (#14804)
Signed-off-by: wwl2755 <wangwenlong2755@gmail.com>
parent ec870fba
Showing 2 changed files with 14 additions and 2 deletions
vllm/config.py  +6 -0
vllm/engine/arg_utils.py  +8 -2
vllm/config.py
@@ -1294,6 +1294,12 @@ class LoadConfig:
     "tensorizer" will use CoreWeave's tensorizer library for
         fast weight loading.
     "bitsandbytes" will load nf4 type weights.
+    "sharded_state" will load weights from pre-sharded checkpoint files,
+        supporting efficient loading of tensor-parallel models.
+    "gguf" will load weights from GGUF format files.
+    "mistral" will load weights from consolidated safetensors files used
+        by Mistral models.
+    "runai_streamer" will load weights from RunAI streamer format files.
 model_loader_extra_config: The extra config for the model loader.
 ignore_patterns: The list of patterns to ignore when loading the model.
     Default to "original/**/*" to avoid repeated loading of llama's
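The docstring above lists the accepted `load_format` strings. As a minimal sketch of how such a value could be validated, the snippet below uses an enum lookup. `SketchLoadFormat` and `SketchLoadConfig` are hypothetical names made up for illustration, not vLLM's classes, and the enum lists only the formats that appear in this hunk; the real option set may be larger.

```python
from dataclasses import dataclass
from enum import Enum


class SketchLoadFormat(str, Enum):
    """Hypothetical enum; lists only the formats visible in this hunk."""
    TENSORIZER = "tensorizer"
    BITSANDBYTES = "bitsandbytes"
    SHARDED_STATE = "sharded_state"
    GGUF = "gguf"
    MISTRAL = "mistral"
    RUNAI_STREAMER = "runai_streamer"


@dataclass
class SketchLoadConfig:
    """Hypothetical stand-in for a loader config, not vLLM's LoadConfig."""
    load_format: str

    def __post_init__(self) -> None:
        # Looking an enum up by value raises ValueError for unknown strings,
        # so a typo in the format name fails at construction time.
        self.load_format = SketchLoadFormat(self.load_format.lower())


cfg = SketchLoadConfig("GGUF")
print(cfg.load_format.value)
```

Normalizing to lowercase before the lookup is a design choice of this sketch, not something the diff states.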
vllm/engine/arg_utils.py
@@ -339,9 +339,15 @@ class EngineArgs:
 'CoreWeave. See the Tensorize vLLM Model script in the Examples '
 'section for more information.\n'
 '* "runai_streamer" will load the Safetensors weights using Run:ai'
-'Model Streamer\n'
+'Model Streamer.\n'
 '* "bitsandbytes" will load the weights using bitsandbytes '
-'quantization.\n')
+'quantization.\n'
+'* "sharded_state" will load weights from pre-sharded checkpoint '
+'files, supporting efficient loading of tensor-parallel models\n'
+'* "gguf" will load weights from GGUF format files (details '
+'specified in https://github.com/ggml-org/ggml/blob/master/docs/gguf.md).\n'
+'* "mistral" will load weights from consolidated safetensors files '
+'used by Mistral models.\n')
 parser.add_argument(
     '--config-format',
     default=EngineArgs.config_format,
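The help text in this hunk is assembled from adjacent string literals, which Python joins with no separator, so every fragment has to carry its own trailing space or newline. The following is a rough, self-contained sketch of that pattern with `argparse`. The parser name, the `SKETCH_FORMATS` list, and the shortened help strings are assumptions for illustration; they are not copied from vLLM's `EngineArgs`, and the real flag may accept more values.

```python
import argparse

# Formats named in this diff only; an assumption, not the full real list.
SKETCH_FORMATS = [
    "tensorizer", "runai_streamer", "bitsandbytes",
    "sharded_state", "gguf", "mistral",
]

# Adjacent literals are concatenated at compile time with nothing in between,
# so the first fragment of each bullet ends in a space and the last in "\n".
LOAD_FORMAT_HELP = (
    'Placeholder summary line for this sketch.\n\n'
    '* "sharded_state" will load weights from pre-sharded checkpoint '
    'files.\n'
    '* "gguf" will load weights from GGUF format files.\n'
)


def build_parser() -> argparse.ArgumentParser:
    """Build a toy parser exposing a --load-format flag."""
    parser = argparse.ArgumentParser(prog="sketch")
    parser.add_argument(
        "--load-format",
        type=str,
        choices=SKETCH_FORMATS,
        help=LOAD_FORMAT_HELP,
    )
    return parser


args = build_parser().parse_args(["--load-format", "gguf"])
print(args.load_format)
```

Passing a value outside `choices` makes `argparse` exit with a usage error.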