OpenDAS / vllm_cscc
Commit e656f638 (Unverified)
Authored Feb 26, 2025 by Albert
Committed by GitHub, Feb 25, 2025
[Doc] fix the incorrect module path of tensorize_vllm_model (#13863)
Parent: 145944cb
Showing 1 changed file with 5 additions and 5 deletions
examples/other/tensorize_vllm_model.py
@@ -27,7 +27,7 @@ https://github.com/coreweave/tensorizer
 To serialize a model, install vLLM from source, then run something
 like this from the root level of this repository:
-python -m examples.offline_inference.tensorize_vllm_model \
+python -m examples.other.tensorize_vllm_model \
    --model facebook/opt-125m \
    serialize \
    --serialized-directory s3://my-bucket \
@@ -47,7 +47,7 @@ providing a `--keyfile` argument.
 To deserialize a model, you can run something like this from the root
 level of this repository:
-python -m examples.offline_inference.tensorize_vllm_model \
+python -m examples.other.tensorize_vllm_model \
    --model EleutherAI/gpt-j-6B \
    --dtype float16 \
    deserialize \
@@ -65,11 +65,11 @@ shard's rank. Sharded models serialized with this script will be named as
 model-rank-%03d.tensors
 For more information on the available arguments for serializing, run
-`python -m examples.offline_inference.tensorize_vllm_model serialize --help`.
+`python -m examples.other.tensorize_vllm_model serialize --help`.
 Or for deserializing:
-`python -m examples.offline_inference.tensorize_vllm_model deserialize --help`.
+`python -m examples.other.tensorize_vllm_model deserialize --help`.
 Once a model is serialized, tensorizer can be invoked with the `LLM` class
 directly to load models:
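
Note on the shard naming in this hunk: the `model-rank-%03d.tensors` pattern is a printf-style template, so each shard's rank is zero-padded to three digits. A minimal sketch of how the names expand, assuming ranks are zero-based and contiguous (the helper name `shard_filenames` and the `world_size` parameter are hypothetical and not part of the script):

```python
# Pattern quoted from the docstring in this diff.
SHARD_PATTERN = "model-rank-%03d.tensors"


def shard_filenames(world_size: int) -> list[str]:
    """Expand the shard pattern for ranks 0..world_size-1 (assumed zero-based)."""
    if world_size < 1:
        raise ValueError("world_size must be at least 1")
    return [SHARD_PATTERN % rank for rank in range(world_size)]


if __name__ == "__main__":
    for name in shard_filenames(2):
        print(name)
```

With two shards this yields `model-rank-000.tensors` and `model-rank-001.tensors`.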
@@ -90,7 +90,7 @@ TensorizerConfig arguments desired.
 In order to see all of the available arguments usable to configure
 loading with tensorizer that are given to `TensorizerConfig`, run:
-`python -m examples.offline_inference.tensorize_vllm_model deserialize --help`
+`python -m examples.other.tensorize_vllm_model deserialize --help`
 under the `tensorizer options` section. These can also be used for
 deserialization in this example script, although `--tensorizer-uri` and
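
Note on why the path matters: `python -m` takes a dotted module name that has to mirror the script's location relative to the directory it is run from, so a script at `examples/other/tensorize_vllm_model.py` is invoked as `examples.other.tensorize_vllm_model` from the repository root. A minimal sketch of that mapping (the helper `module_name_for` is hypothetical and only illustrates the rule; it does not check that the file exists):

```python
from pathlib import PurePosixPath


def module_name_for(script_path: str) -> str:
    """Turn a repo-relative .py path into the dotted name `python -m` expects."""
    path = PurePosixPath(script_path)
    if path.suffix != ".py":
        raise ValueError(f"not a Python source file: {script_path}")
    # Drop the .py suffix and join the path components with dots.
    return ".".join(path.with_suffix("").parts)


if __name__ == "__main__":
    print(module_name_for("examples/other/tensorize_vllm_model.py"))
```

Applied to the file changed in this commit, the helper returns `examples.other.tensorize_vllm_model`, which matches the corrected commands in the diff.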