OpenDAS / vllm_cscc
Commit a7bab0c9 (Unverified)
Authored Jul 04, 2025 by Reid
Committed by GitHub, Jul 03, 2025

[Misc] small update (#20462)

Signed-off-by: reidliu41 <reid201711@gmail.com>

Parent: 25950dca

Changes: 3
Showing 3 changed files with 18 additions and 9 deletions (+18 -9)

examples/offline_inference/profiling_tpu/README.md  +4 -1
examples/online_serving/structured_outputs/README.md  +7 -3
examples/others/tensorize_vllm_model.py  +7 -5

examples/offline_inference/profiling_tpu/README.md
@@ -57,7 +57,10 @@ Once you have collected your profiles with this script, you can visualize them u
 Here are most likely the dependencies you need to install:

 ```bash
-pip install tensorflow-cpu tensorboard-plugin-profile etils importlib_resources
+pip install tensorflow-cpu \
+    tensorboard-plugin-profile \
+    etils \
+    importlib_resources
 ```

 Then you just need to point TensorBoard to the directory where you saved the profiles and visit `http://localhost:6006/` in your browser:

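The hunk above only reflows the `pip install` command. In a POSIX shell, a backslash immediately followed by a newline is a line continuation, so both spellings pass the same arguments. A minimal Python sketch of that equivalence follows; `join_continuations` is a hypothetical helper written for this illustration, and it deliberately ignores quoting edge cases such as a backslash-newline inside single quotes.

```python
import shlex


def join_continuations(command: str) -> str:
    """Drop backslash-newline pairs, as a POSIX shell does outside single quotes.

    Hypothetical helper for illustration only; it does not model quoting.
    """
    return command.replace("\\\n", "")


# The command as it appeared before the change.
one_line = (
    "pip install tensorflow-cpu tensorboard-plugin-profile "
    "etils importlib_resources"
)

# The command as it appears after the change.
multi_line = (
    "pip install tensorflow-cpu \\\n"
    "    tensorboard-plugin-profile \\\n"
    "    etils \\\n"
    "    importlib_resources"
)

# Both spellings tokenize to the same argument vector.
assert shlex.split(join_continuations(multi_line)) == shlex.split(one_line)
print(shlex.split(one_line))
```

Note that `shlex.split` on its own does not remove an escaped newline the way a shell does, which is why the sketch strips the continuation pairs first.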
examples/online_serving/structured_outputs/README.md
@@ -13,13 +13,15 @@ vllm serve Qwen/Qwen2.5-3B-Instruct
 To serve a reasoning model, you can use the following command:

 ```bash
-vllm serve deepseek-ai/DeepSeek-R1-Distill-Qwen-7B --reasoning-parser deepseek_r1
+vllm serve deepseek-ai/DeepSeek-R1-Distill-Qwen-7B \
+    --reasoning-parser deepseek_r1
 ```

 If you want to run this script standalone with `uv`, you can use the following:

 ```bash
-uvx --from git+https://github.com/vllm-project/vllm#subdirectory=examples/online_serving/structured_outputs structured-output
+uvx --from git+https://github.com/vllm-project/vllm#subdirectory=examples/online_serving/structured_outputs \
+    structured-output
 ```

 See [feature docs](https://docs.vllm.ai/en/latest/features/structured_outputs.html) for more information.

@@ -44,7 +46,9 @@ uv run structured_outputs.py --stream
 Run certain constraints, for example `structural_tag` and `regex`, streaming:

 ```bash
-uv run structured_outputs.py --constraint structural_tag regex --stream
+uv run structured_outputs.py \
+    --constraint structural_tag regex \
+    --stream
 ```

 Run all constraints, with reasoning models and streaming:

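For readers unfamiliar with multi-value flags, the sketch below shows one way a command line like `--constraint structural_tag regex --stream` could be parsed. It assumes an `argparse` parser with `nargs="+"`; `build_parser` and the flag definitions are hypothetical, and the real `structured_outputs.py` may define its options differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags used in the README command."""
    parser = argparse.ArgumentParser(prog="structured_outputs.py")
    # nargs="+" collects every value up to the next flag into one list.
    parser.add_argument(
        "--constraint",
        nargs="+",
        default=[],
        help="One or more constraint names to run (assumed semantics).",
    )
    # A boolean switch: present means True, absent means False.
    parser.add_argument("--stream", action="store_true")
    return parser


# The argument vector is the same whether the command was typed on one line
# or split with trailing backslashes.
args = build_parser().parse_args(
    ["--constraint", "structural_tag", "regex", "--stream"]
)
print(args.constraint, args.stream)
```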
examples/others/tensorize_vllm_model.py
@@ -202,7 +202,7 @@ def parse_args():
-def deserialize():
+def deserialize(args, tensorizer_config):
     if args.lora_path:
         tensorizer_config.lora_dir = tensorizer_config.tensorizer_dir
         llm = LLM(model=args.model,
@@ -242,7 +242,7 @@ def deserialize():
     return llm


-if __name__ == '__main__':
+def main():
     args = parse_args()

     s3_access_key_id = (getattr(args, 's3_access_key_id', None)
@@ -260,8 +260,6 @@ if __name__ == '__main__':
     model_ref = args.model

-    model_name = model_ref.split("/")[1]
-
     if args.command == "serialize" or args.command == "deserialize":
         keyfile = args.keyfile
     else:
@@ -309,6 +307,10 @@ if __name__ == '__main__':
             encryption_keyfile=keyfile,
             **credentials
         )
-        deserialize()
+        deserialize(args, tensorizer_config)
     else:
         raise ValueError("Either serialize or deserialize must be specified.")
+
+
+if __name__ == "__main__":
+    main()
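
The signature change and the new `main()` go together. While the setup code lived directly under `if __name__ == '__main__':`, names such as `args` and `tensorizer_config` were module globals that `deserialize()` could read implicitly. Once that code moves into `main()`, they become locals and have to be passed in. The sketch below illustrates the pattern in isolation; `DemoConfig`, the `--lora-path` flag spelling, the `/tmp/demo` path, and the optional `argv` parameter are assumptions made for this example, not the script's actual definitions.

```python
import argparse
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class DemoConfig:
    """Hypothetical stand-in for the tensorizer configuration object."""

    tensorizer_dir: str
    lora_dir: Optional[str] = None


def parse_args(argv: Optional[Sequence[str]] = None) -> argparse.Namespace:
    # Simplified, assumed CLI: one positional command plus an optional LoRA path.
    parser = argparse.ArgumentParser()
    parser.add_argument("command", choices=["serialize", "deserialize"])
    parser.add_argument("--lora-path", default=None)
    return parser.parse_args(argv)


def deserialize(args: argparse.Namespace, config: DemoConfig) -> DemoConfig:
    # Inputs arrive as parameters instead of being read from module globals.
    if args.lora_path:
        config.lora_dir = config.tensorizer_dir
    return config


def main(argv: Optional[Sequence[str]] = None) -> DemoConfig:
    # Everything defined here is local, so importing the module has no side effects.
    args = parse_args(argv)
    config = DemoConfig(tensorizer_dir="/tmp/demo")
    if args.command == "deserialize":
        return deserialize(args, config)
    raise ValueError("Only deserialize is implemented in this sketch.")


if __name__ == "__main__":
    print(main(["deserialize", "--lora-path", "adapters/"]))
```

With this layout the module can be imported, and `deserialize` called with explicit arguments, without running the command-line entry point.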