Commit 76a2c86b in sglang (Unverified)
Authored Sep 07, 2025 by Lianmin Zheng
Committed by GitHub, Sep 07, 2025

Fix flashinfer version in sgl-kernel (#10135)

parent e719bb0e
Changes: 2 changed files with 6 additions and 2 deletions (+6 -2)

python/sglang/srt/layers/attention/flashinfer_backend.py  +5 -1
sgl-kernel/CMakeLists.txt  +1 -1
python/sglang/srt/layers/attention/flashinfer_backend.py @ 76a2c86b

...
@@ -1187,7 +1187,7 @@ class FlashInferMultiStepDraftBackend:
     def init_cuda_graph_state(self, max_bs: int, max_num_tokens: int):
         self.cuda_graph_kv_indices = torch.zeros(
-            (self.speculative_num_steps, max_bs * self.max_context_len),
+            (self.speculative_num_steps, max_bs * self.topk * self.max_context_len),
             dtype=torch.int32,
             device="cuda",
         )
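
The hunk above multiplies the second dimension of `cuda_graph_kv_indices` by `self.topk`. Below is a minimal sketch of that sizing arithmetic, in plain Python so it runs without a GPU. The helper name `kv_indices_shape` and the example values are hypothetical. The reading that each request may carry `topk` draft branches, each needing up to `max_context_len` KV indices, is an assumption drawn from the changed expression, not something the commit states.

```python
def kv_indices_shape(
    speculative_num_steps: int,
    max_bs: int,
    topk: int,
    max_context_len: int,
) -> tuple[int, int]:
    """Return the (rows, cols) shape of the preallocated index buffer.

    Hypothetical helper: it mirrors the expression in the new line of the
    diff, (speculative_num_steps, max_bs * topk * max_context_len).
    """
    for name, value in (
        ("speculative_num_steps", speculative_num_steps),
        ("max_bs", max_bs),
        ("topk", topk),
        ("max_context_len", max_context_len),
    ):
        if value <= 0:
            raise ValueError(f"{name} must be positive, got {value}")
    return (speculative_num_steps, max_bs * topk * max_context_len)


# Example values are made up for illustration.
print(kv_indices_shape(4, 8, 2, 1024))  # (4, 16384)
```

With `topk == 1` the shape matches the old expression, so the change only matters when more than one draft branch is kept per request.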
...
@@ -1349,6 +1349,10 @@ def fast_decode_plan(
             self.device, non_blocking=non_blocking
         )
+    # TODO:
+    # We want to cache `empty_q_data`, `empty_kv_cache`, `last_page_len_host` (if it is ones) in the wrapper
+    # so that we do not need to create them every time.
     # Create empty tensors for dtype info if needed
     empty_q_data = torch.empty(
         0,
...
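
The TODO in this hunk proposes caching the placeholder tensors on the wrapper instead of recreating them on every call. One way this could look is sketched below in plain Python. `PlaceholderCache`, its `get` method, and the factory function are hypothetical names and are not part of sglang or flashinfer. An `object()` stands in for the `torch.empty(0, ...)` call so the sketch runs without PyTorch.

```python
from typing import Any, Callable, Dict, Hashable, Tuple


class PlaceholderCache:
    """Hypothetical per-wrapper cache for small placeholder objects."""

    def __init__(self) -> None:
        self._items: Dict[Tuple[str, Hashable], Any] = {}

    def get(self, name: str, key: Hashable, factory: Callable[[], Any]) -> Any:
        # Build the placeholder only the first time this (name, key) is seen.
        cache_key = (name, key)
        if cache_key not in self._items:
            self._items[cache_key] = factory()
        return self._items[cache_key]


calls = []


def make_empty_q_data() -> object:
    # Stand-in for creating an empty tensor that only carries dtype info.
    calls.append("empty_q_data")
    return object()


cache = PlaceholderCache()
first = cache.get("empty_q_data", "float16", make_empty_q_data)
second = cache.get("empty_q_data", "float16", make_empty_q_data)
print(first is second, len(calls))  # True 1
```

In real code the key would presumably include whatever determines the placeholder, such as dtype and device, so that a change in either produces a fresh tensor rather than a stale one.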
sgl-kernel/CMakeLists.txt @ 76a2c86b

...
@@ -81,7 +81,7 @@ FetchContent_Populate(repo-triton)
 FetchContent_Declare(
     repo-flashinfer
     GIT_REPOSITORY https://github.com/flashinfer-ai/flashinfer.git
-    GIT_TAG        018b551825c8e5579206e6eb9d3229fa679202b3
+    GIT_TAG        1a85c439a064c1609568675aa580a409a53fb183
     GIT_SHALLOW    OFF
 )
 FetchContent_Populate(repo-flashinfer)
...