OpenDAS
vllm_cscc
Commit eaffe448 (Unverified)
Authored Sep 18, 2025 by Kay Yan
Committed by GitHub, Sep 18, 2025
[Docs] Fix pooling-params doc references in openai_compatible_server.md (#24939)
Parent: 8ed039d5

Showing 3 changed files with 26 additions and 15 deletions:
docs/api/README.md (+0 -1)
docs/serving/openai_compatible_server.md (+12 -8)
vllm/pooling_params.py (+14 -6)
docs/api/README.md
@@ -46,7 +46,6 @@ Engine classes for offline and online inference.
 Inference parameters for vLLM APIs.
 [](){ #sampling-params }
-[](){ #pooling-params }
 - [vllm.SamplingParams][]
 - [vllm.PoolingParams][]
...
...
docs/serving/openai_compatible_server.md
@@ -317,10 +317,11 @@ Full example: <gh-file:examples/online_serving/pooling/openai_chat_embedding_cli
 #### Extra parameters
-The following [pooling parameters][pooling-params] are supported.
+The following [pooling parameters][vllm.PoolingParams] are supported.
 ```python
---8<-- "vllm/entrypoints/openai/protocol.py:embedding-pooling-params"
+--8<-- "vllm/pooling_params.py:common-pooling-params"
+--8<-- "vllm/pooling_params.py:embedding-pooling-params"
 ```
 The following extra parameters are supported by default:
...
...
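As a rough illustration of how the pooling parameters pulled in by this hunk might be supplied by a client, the sketch below builds a JSON body for an embeddings request. The helper `build_embedding_request`, the model name, and the `/v1/embeddings` route mentioned in the comment are assumptions for illustration; only the field names `truncate_prompt_tokens`, `dimensions`, and `normalize` are taken from `vllm/pooling_params.py` as shown in this commit.

```python
import json
from typing import Any, Dict, Optional


def build_embedding_request(
    model: str,
    text: str,
    truncate_prompt_tokens: Optional[int] = None,
    dimensions: Optional[int] = None,
    normalize: Optional[bool] = None,
) -> Dict[str, Any]:
    """Build a request body, leaving out pooling parameters that are None."""
    body: Dict[str, Any] = {"model": model, "input": text}
    extras = {
        "truncate_prompt_tokens": truncate_prompt_tokens,
        "dimensions": dimensions,
        "normalize": normalize,
    }
    # Only send parameters the caller actually set.
    body.update({key: value for key, value in extras.items() if value is not None})
    return body


# Hypothetical model name; the body would be POSTed to the server's
# embeddings route (assumed here to be /v1/embeddings).
payload = build_embedding_request(
    "my-embedding-model",
    "Hello, world",
    truncate_prompt_tokens=128,
    normalize=True,
)
print(json.dumps(payload, indent=2))
```

Omitting unset fields keeps the request close to the plain OpenAI-style body, so the server-side defaults apply wherever the caller has no preference.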
@@ -527,10 +528,11 @@ curl -v "http://127.0.0.1:8000/classify" \
 #### Extra parameters
-The following [pooling parameters][pooling-params] are supported.
+The following [pooling parameters][vllm.PoolingParams] are supported.
 ```python
---8<-- "vllm/entrypoints/openai/protocol.py:classification-pooling-params"
+--8<-- "vllm/pooling_params.py:common-pooling-params"
+--8<-- "vllm/pooling_params.py:classification-pooling-params"
 ```
 The following extra parameters are supported:
...
...
@@ -733,10 +735,11 @@ Full example: <gh-file:examples/online_serving/openai_cross_encoder_score_for_mu
 #### Extra parameters
-The following [pooling parameters][pooling-params] are supported.
+The following [pooling parameters][vllm.PoolingParams] are supported.
 ```python
---8<-- "vllm/entrypoints/openai/protocol.py:score-pooling-params"
+--8<-- "vllm/pooling_params.py:common-pooling-params"
+--8<-- "vllm/pooling_params.py:classification-pooling-params"
 ```
 The following extra parameters are supported:
...
...
@@ -815,10 +818,11 @@ Result documents will be sorted by relevance, and the `index` property can be us
 #### Extra parameters
-The following [pooling parameters][pooling-params] are supported.
+The following [pooling parameters][vllm.PoolingParams] are supported.
 ```python
---8<-- "vllm/entrypoints/openai/protocol.py:rerank-pooling-params"
+--8<-- "vllm/pooling_params.py:common-pooling-params"
+--8<-- "vllm/pooling_params.py:classification-pooling-params"
 ```
 The following extra parameters are supported:
...
...
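The `--8<--` lines in these hunks include a named section of a source file, delimited there by `# --8<-- [start:NAME]` and `# --8<-- [end:NAME]` comments. The sketch below mimics that lookup in plain Python to show which lines would end up in the rendered docs. `extract_section` is a hypothetical helper and a simplified stand-in, not the snippet extension's actual implementation.

```python
def extract_section(source: str, name: str) -> str:
    """Return the lines between the start and end markers for `name`."""
    start_marker = f"# --8<-- [start:{name}]"
    end_marker = f"# --8<-- [end:{name}]"
    collected = []
    inside = False
    for line in source.splitlines():
        stripped = line.strip()
        if stripped == start_marker:
            inside = True
            continue
        if stripped == end_marker and inside:
            return "\n".join(collected)
        if inside:
            collected.append(line)
    raise KeyError(f"section {name!r} not found or not closed")


# Excerpt modelled on the markers this commit adds to vllm/pooling_params.py.
SOURCE = """\
    # --8<-- [start:embedding-pooling-params]
    dimensions: Optional[int] = None
    normalize: Optional[bool] = None
    # --8<-- [end:embedding-pooling-params]

    # --8<-- [start:classification-pooling-params]
    activation: Optional[bool] = None
    # --8<-- [end:classification-pooling-params]
"""

section = extract_section(SOURCE, "classification-pooling-params")
print(section)
```

Because each docs page names the sections it needs, the score and rerank pages can reuse `classification-pooling-params` without duplicating the field definitions.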
vllm/pooling_params.py
@@ -20,25 +20,33 @@ class PoolingParams(
     """API parameters for pooling models.
     Attributes:
+        truncate_prompt_tokens: Controls prompt truncation.
+            Set to -1 to use the model's default truncation size.
+            Set to k to keep only the last k tokens (left truncation).
+            Set to None to disable truncation.
         normalize: Whether to normalize the embeddings outputs.
         dimensions: Reduce the dimensions of embeddings
-            if model support matryoshka representation.
+            if model support matryoshka representation.
         activation: Whether to apply activation function to
-            the classification outputs.
+            the classification outputs.
         softmax: Whether to apply softmax to the reward outputs.
     """
+    # --8<-- [start:common-pooling-params]
     truncate_prompt_tokens: Optional[Annotated[int, msgspec.Meta(ge=-1)]] = None
     """If set to -1, will use the truncation size supported by the model. If
     set to an integer k, will use only the last k tokens from the prompt
     (i.e., left truncation). If set to `None`, truncation is disabled."""
+    # --8<-- [end:common-pooling-params]
     ## for embeddings models
+    # --8<-- [start:embedding-pooling-params]
     dimensions: Optional[int] = None
     normalize: Optional[bool] = None
+    # --8<-- [end:embedding-pooling-params]
-    ## for classification models
+    ## for classification, scoring and rerank
+    # --8<-- [start:classification-pooling-params]
     activation: Optional[bool] = None
+    # --8<-- [end:classification-pooling-params]
     ## for reward models
     softmax: Optional[bool] = None
...
...
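The `truncate_prompt_tokens` docstring describes three cases: `None` disables truncation, `-1` defers to the model's own truncation size, and an integer `k` keeps only the last `k` tokens. A minimal sketch of that rule on a plain list of token IDs follows. The names `apply_truncation` and `model_max_len` are hypothetical, the handling of `k = 0` is an assumption, and this is not vLLM's implementation.

```python
from typing import List, Optional


def apply_truncation(
    token_ids: List[int],
    truncate_prompt_tokens: Optional[int],
    model_max_len: int,
) -> List[int]:
    """Left-truncate token_ids following the documented three cases."""
    if truncate_prompt_tokens is None:
        # Truncation disabled: return the prompt unchanged.
        return list(token_ids)
    if truncate_prompt_tokens < -1:
        # Mirrors the ge=-1 constraint on the field.
        raise ValueError("truncate_prompt_tokens must be >= -1")
    if truncate_prompt_tokens == -1:
        limit = model_max_len  # assumed stand-in for the model's truncation size
    else:
        limit = truncate_prompt_tokens
    if limit == 0:
        # Assumption: k = 0 keeps no tokens (a bare [-0:] slice would keep all).
        return []
    # Keep the last `limit` tokens, dropping from the left.
    return list(token_ids[-limit:])


tokens = [101, 7, 8, 9, 102]
print(apply_truncation(tokens, 2, model_max_len=4))
print(apply_truncation(tokens, -1, model_max_len=4))
print(apply_truncation(tokens, None, model_max_len=4))
```

Left truncation keeps the end of the prompt, so anything at the start of a long input is what gets dropped.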