OpenDAS / vllm_cscc
Commit 6d70198b (Unverified)
Authored Jan 01, 2025 by Kazuhiro Serizawa
Committed by GitHub, Jan 01, 2025

[Doc] Fix typo (#11666)

Signed-off-by: Kazuhiro Serizawa <nserihiro@gmail.com>

Parent: f962f426
Showing 2 changed files with 2 additions and 2 deletions (+2 -2)
vllm/model_executor/layers/rejection_sampler.py (+1 -1)
vllm/v1/sample/ops/topk_topp_sampler.py (+1 -1)
vllm/model_executor/layers/rejection_sampler.py
@@ -39,7 +39,7 @@ class RejectionSampler(SpecDecodeStochasticBaseSampler):
             strict_mode: Whether or not to perform shape/device/dtype checks
                 during sampling. This catches correctness issues but adds
                 nontrivial latency.
-            use_falshinfer: We will use this parameter to determine whether
+            use_flashinfer: We will use this parameter to determine whether
                 to use the FlashInfer rejection sampling kernel or not. If it's
                 None, we will use the default value from the environment variable.
                 This parameter is only used for testing purposes.
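The docstring in this hunk describes a flag whose value falls back to an environment variable when it is None. A minimal sketch of that pattern follows. The function name `resolve_use_flashinfer` and the variable name `EXAMPLE_USE_FLASHINFER` are hypothetical and are not taken from the commit, and the set of accepted truthy strings is likewise an assumption.

```python
import os
from typing import Optional


def resolve_use_flashinfer(
    use_flashinfer: Optional[bool] = None,
    env_var: str = "EXAMPLE_USE_FLASHINFER",  # hypothetical variable name
) -> bool:
    """Return the explicit flag if given, else a default read from the environment."""
    if use_flashinfer is not None:
        # An explicit argument (e.g. passed by a test) always wins.
        return use_flashinfer
    # Assumption: "1" or "true" (case-insensitive) enables the kernel.
    return os.environ.get(env_var, "0").strip().lower() in ("1", "true")
```

Passing True or False overrides the environment entirely, which is what makes such a parameter convenient for tests.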
vllm/v1/sample/ops/topk_topp_sampler.py
@@ -44,7 +44,7 @@ class TopKTopPSampler(nn.Module):
                 logger.warning(
                     "FlashInfer is not available. Falling back to the PyTorch-"
                     "native implementation of top-p & top-k sampling. For the "
-                    "best performance, please install FalshInfer.")
+                    "best performance, please install FlashInfer.")
                 self.forward = self.forward_native
         else:
             self.forward = self.forward_native
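The hunk above shows only the fallback branches, where `self.forward` is bound to `forward_native` after a warning. As a rough illustration of that dispatch-at-construction pattern, here is a self-contained sketch. The class `FallbackSampler`, the method `forward_accelerated`, and the argmax stand-in logic are all hypothetical; this is not the code of `TopKTopPSampler`, and the availability check via `importlib.util.find_spec` is an assumption about how such a check could be done.

```python
import importlib.util
import logging
from typing import Sequence

logger = logging.getLogger(__name__)


class FallbackSampler:
    """Hypothetical sampler that picks an implementation once, at construction."""

    def __init__(self, optional_module: str = "flashinfer") -> None:
        # find_spec returns None for a top-level module that is not installed.
        if importlib.util.find_spec(optional_module) is not None:
            self.forward = self.forward_accelerated
        else:
            logger.warning(
                "%s is not available. Falling back to the native "
                "implementation.", optional_module)
            self.forward = self.forward_native

    def forward_native(self, probs: Sequence[float]) -> int:
        # Stand-in logic: return the index of the largest probability.
        return max(range(len(probs)), key=lambda i: probs[i])

    def forward_accelerated(self, probs: Sequence[float]) -> int:
        # Placeholder: a real version would call into the optional library.
        return self.forward_native(probs)
```

Binding `self.forward` once in `__init__` keeps the availability check out of the per-call path, so callers always invoke `sampler.forward(...)` regardless of which implementation was chosen.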