sglang, commit 949b3fbf
Authored Jan 20, 2025 by Hongpeng Guo; committed by GitHub, Jan 20, 2025
[Doc] Update doc of custom logit processor (#3021)
Signed-off-by: Hongpeng Guo <hpguo@anyscale.com>
Parent: da4e8b38
Showing 2 changed files with 75 additions and 4 deletions
docs/references/sampling_params.md (+68 -0)
python/sglang/srt/managers/io_struct.py (+7 -4)
docs/references/sampling_params.md
@@ -32,6 +32,20 @@ class GenerateReqInput:
     return_text_in_logprobs: bool = False
     # Whether to stream output.
     stream: bool = False
+    # Whether to log metrics for this request (e.g. health_generate calls do not log metrics)
+    log_metrics: bool = True
+    # The modalities of the image data [image, multi-images, video]
+    modalities: Optional[List[str]] = None
+    # LoRA related
+    lora_path: Optional[Union[List[Optional[str]], Optional[str]]] = None
+    # Session info for continual prompting
+    session_params: Optional[Union[List[Dict], Dict]] = None
+    # Custom logit processor for advanced sampling control. Must be a serialized instance
+    # of `CustomLogitProcessor` in python/sglang/srt/sampling/custom_logit_processor.py
+    # Use the processor's `to_str()` method to generate the serialized string.
+    custom_logit_processor: Optional[Union[List[Optional[str]], str]] = None
 ```
 The `sampling_params` follows this format
@@ -90,6 +104,14 @@ repetition_penalty: float = 1.0,
 # difficult to infer the correct token ID by given `stop` strings.
 # Must be 0 <= value < max_new_tokens. Setting to 0 (default) will disable this penalty.
 min_new_tokens: int = 0,
+## Custom Parameters for Custom Logit Processor.
+# A dictionary of custom parameters for the custom logit processor.
+# The custom logit processor takes a list of dictionaries as input, where each
+# dictionary is the custom parameters for one token in a batch of the input.
+# See also python/sglang/srt/sampling/custom_logit_processor.py
+custom_params: Optional[Dict[str, Any]] = None,
 ```
 ## Examples
@@ -253,3 +275,49 @@ response = requests.post(
 )
 print(response.json())
 ```
+### Custom Logit Processor
+Launch a server with `--enable-custom-logit-processor` flag on.
+```
+python -m sglang.launch_server --model-path meta-llama/Meta-Llama-3-8B-Instruct --port 30000 --enable-custom-logit-processor
+```
+Define a custom logit processor that will always sample a specific token id.
+```python
+from sglang.srt.sampling.custom_logit_processor import CustomLogitProcessor
+class DeterministicLogitProcessor(CustomLogitProcessor):
+    """A dummy logit processor that changes the logits to always
+    sample the given token id.
+    """
+    def __call__(self, logits, custom_param_list):
+        # Check that the number of logits matches the number of custom parameters
+        assert logits.shape[0] == len(custom_param_list)
+        key = "token_id"
+        for i, param_dict in enumerate(custom_param_list):
+            # Mask all other tokens
+            logits[i, :] = -float("inf")
+            # Assign highest probability to the specified token
+            logits[i, param_dict[key]] = 0.0
+        return logits
+```
+Send a request
+```python
+import requests
+response = requests.post(
+    "http://localhost:30000/generate",
+    json={
+        "text": "The capital of France is",
+        "custom_logit_processor": DeterministicLogitProcessor().to_str(),
+        "sampling_params": {
+            "temperature": 0.0,
+            "max_new_tokens": 32,
+            "custom_params": {"token_id": 5},
+        },
+    },
+)
+print(response.json())
+```
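Reviewer note: the documented `DeterministicLogitProcessor` sets every logit in a row to negative infinity and then sets the chosen token's logit to `0.0`. The sketch below is an illustration only, assuming one row of logits per request and plain Python lists instead of tensors. The names `force_token` and `softmax` are hypothetical and not part of sglang. It shows why that masking puts all of the softmax probability on the requested token.

```python
import math


def force_token(logits, custom_param_list, key="token_id"):
    """Hypothetical pure-Python stand-in for the masking step in the docs.

    `logits` is a list of rows (one per request); `custom_param_list`
    holds one parameter dict per row.
    """
    assert len(logits) == len(custom_param_list)
    for row, params in zip(logits, custom_param_list):
        target = params[key]
        # Mask every token in this row ...
        for j in range(len(row)):
            row[j] = -math.inf
        # ... then make the requested token the only finite logit.
        row[target] = 0.0
    return logits


def softmax(row):
    """Numerically stable softmax over one row of logits."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


logits = [[0.3, 1.2, -0.7, 2.0], [1.5, 0.1, 0.4, -2.0]]
params = [{"token_id": 2}, {"token_id": 0}]
probs = [softmax(row) for row in force_token(logits, params)]
print(probs)  # [[0.0, 0.0, 1.0, 0.0], [1.0, 0.0, 0.0, 0.0]]
```

Because `exp(-inf)` is exactly `0.0`, the target token ends up with probability 1 regardless of the original logits.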
python/sglang/srt/managers/io_struct.py
@@ -69,8 +69,10 @@ class GenerateReqInput:
     # Session info for continual prompting
     session_params: Optional[Union[List[Dict], Dict]] = None
-    # Custom logit processor (serialized function)
-    custom_logit_processor: Optional[Union[List[Optional[str]], Optional[str]]] = None
+    # Custom logit processor for advanced sampling control. Must be a serialized instance
+    # of `CustomLogitProcessor` in python/sglang/srt/sampling/custom_logit_processor.py
+    # Use the processor's `to_str()` method to generate the serialized string.
+    custom_logit_processor: Optional[Union[List[Optional[str]], str]] = None
     def normalize_batch_and_arguments(self):
         if (
@@ -248,8 +250,9 @@ class TokenizedGenerateReqInput:
     # Session info for continual prompting
     session_params: Optional[SessionParams] = None
-    # Custom logit processor (serialized function)
-    # TODO (hpguo): Add an example and update doc string here
+    # Custom logit processor for advanced sampling control. Must be a serialized instance
+    # of `CustomLogitProcessor` in python/sglang/srt/sampling/custom_logit_processor.py
+    # Use the processor's `to_str()` method to generate the serialized string.
     custom_logit_processor: Optional[str] = None
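Reviewer note: the `GenerateReqInput` annotation `Optional[Union[List[Optional[str]], str]]` accepts either one serialized processor string or a list with one optional entry per request, while `TokenizedGenerateReqInput` carries a single `Optional[str]`. This diff does not show how sglang maps one to the other. The helper below is a hypothetical sketch of one way such a value could be expanded to a per-request list. The name `expand_processors` and its error behaviour are assumptions, not sglang's implementation.

```python
from typing import List, Optional, Union


def expand_processors(
    value: Optional[Union[List[Optional[str]], str]],
    batch_size: int,
) -> List[Optional[str]]:
    """Hypothetical helper: return one optional serialized processor per request."""
    if value is None:
        # No processor configured for any request.
        return [None] * batch_size
    if isinstance(value, str):
        # A single serialized processor is shared by the whole batch.
        return [value] * batch_size
    if len(value) != batch_size:
        raise ValueError(
            f"expected {batch_size} processors, got {len(value)}"
        )
    # Already a per-request list; copy it so the caller's list is not aliased.
    return list(value)


shared = expand_processors("serialized-processor", 3)
mixed = expand_processors(["serialized-processor", None], 2)
print(shared)
print(mixed)
```

The string `"serialized-processor"` is a placeholder for the output of a processor's `to_str()` method.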