OpenDAS / vllm_cscc
Commit e656f638 (Unverified)
Authored Feb 26, 2025 by Albert, committed by GitHub Feb 25, 2025

[Doc] fix the incorrect module path of tensorize_vllm_model (#13863)

Parent: 145944cb
Showing 1 changed file with 5 additions and 5 deletions

examples/other/tensorize_vllm_model.py (+5 -5) @ e656f638
@@ -27,7 +27,7 @@ https://github.com/coreweave/tensorizer
 To serialize a model, install vLLM from source, then run something
 like this from the root level of this repository:
-python -m examples.offline_inference.tensorize_vllm_model \
+python -m examples.other.tensorize_vllm_model \
 --model facebook/opt-125m \
 serialize \
 --serialized-directory s3://my-bucket \
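
The fix in this hunk follows from how `python -m` resolves a dotted module name against the directory tree when run from the repository root: the name has to mirror the script's path, which the commit gives as `examples/other/tensorize_vllm_model.py`. A minimal sketch of that mapping, and of assembling the arguments visible in the hunk, might look like the following. The helper names `module_name` and `serialize_argv` are hypothetical and not part of vLLM, and the real command continues past the lines shown in the diff.

```python
from pathlib import PurePosixPath


def module_name(script_path: str) -> str:
    """Turn a repo-relative script path into the dotted name `python -m` expects."""
    return ".".join(PurePosixPath(script_path).with_suffix("").parts)


def serialize_argv(script_path: str, model: str, serialized_directory: str) -> list[str]:
    """Assemble only the arguments visible in the hunk (hypothetical helper)."""
    return [
        "python", "-m", module_name(script_path),
        "--model", model,
        "serialize",
        "--serialized-directory", serialized_directory,
    ]


argv = serialize_argv(
    "examples/other/tensorize_vllm_model.py",  # path from the commit
    "facebook/opt-125m",                       # model from the hunk
    "s3://my-bucket",                          # bucket placeholder from the hunk
)
print(" ".join(argv))
```

The old docstring named `examples.offline_inference.tensorize_vllm_model`, which corresponds to a path under `examples/offline_inference/` rather than the `examples/other/` location this commit documents.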
@@ -47,7 +47,7 @@ providing a `--keyfile` argument.
 To deserialize a model, you can run something like this from the root
 level of this repository:
-python -m examples.offline_inference.tensorize_vllm_model \
+python -m examples.other.tensorize_vllm_model \
 --model EleutherAI/gpt-j-6B \
 --dtype float16 \
 deserialize \
@@ -65,11 +65,11 @@ shard's rank. Sharded models serialized with this script will be named as
 model-rank-%03d.tensors
 For more information on the available arguments for serializing, run
-`python -m examples.offline_inference.tensorize_vllm_model serialize --help`.
+`python -m examples.other.tensorize_vllm_model serialize --help`.
 Or for deserializing:
-`python -m examples.offline_inference.tensorize_vllm_model deserialize --help`.
+`python -m examples.other.tensorize_vllm_model deserialize --help`.
 Once a model is serialized, tensorizer can be invoked with the `LLM` class
 directly to load models:
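
As a side note on the shard naming at the top of this hunk: `model-rank-%03d.tensors` is a printf-style template in which `%03d` is replaced by the shard's rank, zero-padded to three digits. The sketch below only illustrates that expansion. The function name `shard_filename` and the choice of four ranks are assumptions for illustration, not taken from the script.

```python
# Template string as quoted in the docstring.
SHARD_TEMPLATE = "model-rank-%03d.tensors"


def shard_filename(rank: int) -> str:
    """Expand the template for one shard rank (hypothetical helper)."""
    return SHARD_TEMPLATE % rank


# Assumed example: a model split across four ranks (0 through 3).
names = [shard_filename(rank) for rank in range(4)]
print(names)
# ['model-rank-000.tensors', 'model-rank-001.tensors',
#  'model-rank-002.tensors', 'model-rank-003.tensors']
```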
@@ -90,7 +90,7 @@ TensorizerConfig arguments desired.
 In order to see all of the available arguments usable to configure
 loading with tensorizer that are given to `TensorizerConfig`, run:
-`python -m examples.offline_inference.tensorize_vllm_model deserialize --help`
+`python -m examples.other.tensorize_vllm_model deserialize --help`
 under the `tensorizer options` section. These can also be used for
 deserialization in this example script, although `--tensorizer-uri` and
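
The command shape in these hunks (top-level flags, then a `serialize` or `deserialize` subcommand, with a named `tensorizer options` section in `--help`) matches what `argparse` subparsers and argument groups produce. The toy parser below is a sketch of that layout only. It is not the script's actual parser: it registers just the flags quoted in the diff, the program name `toy_tensorize` and the URI value are made up, and the real script may define its options differently.

```python
import argparse


def build_parser():
    """Build a toy parser mirroring the command layout seen in the diff."""
    parser = argparse.ArgumentParser(prog="toy_tensorize")  # hypothetical name
    parser.add_argument("--model", required=True)
    parser.add_argument("--dtype")

    subparsers = parser.add_subparsers(dest="command", required=True)

    serialize = subparsers.add_parser("serialize")
    serialize.add_argument("--serialized-directory", required=True)

    deserialize = subparsers.add_parser("deserialize")
    # A named argument group is what shows up as its own section in --help.
    group = deserialize.add_argument_group("tensorizer options")
    group.add_argument("--tensorizer-uri")

    return parser, deserialize


parser, deserialize_parser = build_parser()
args = parser.parse_args([
    "--model", "EleutherAI/gpt-j-6B",
    "--dtype", "float16",
    "deserialize",
    "--tensorizer-uri", "s3://my-bucket/model.tensors",  # made-up URI
])
print(args.command, args.model, args.dtype, args.tensorizer_uri)

# The group title appears as a section heading in the subcommand's help text.
help_text = deserialize_parser.format_help()
print("tensorizer options" in help_text)
```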