sglang: Commit ad1ae7f7 (Unverified)
Authored Mar 14, 2025 by Yineng Zhang. Committed by GitHub, Mar 14, 2025.
use topk_softmax with sgl-kernel (#4439)
Parent: e73167ad
Showing 18 changed files with 48 additions and 35 deletions (+48 -35)
.github/workflows/execute-notebook.yml  +1 -1
.github/workflows/experiment-runner.yml  +1 -1
.github/workflows/lint.yml  +1 -1
.github/workflows/nightly-test.yml  +1 -1
.github/workflows/pr-test-amd.yml  +2 -2
.github/workflows/pr-test-rust.yml  +2 -2
.github/workflows/pr-test-sgl-kernel.yml  +1 -1
.github/workflows/pr-test.yml  +9 -9
.github/workflows/release-docker-amd-nightly.yml  +1 -1
.github/workflows/release-docker-amd.yml  +1 -1
.github/workflows/release-docker-dev.yml  +1 -1
.github/workflows/release-docker.yml  +1 -1
.github/workflows/release-docs.yml  +1 -1
.github/workflows/release-fake-tag.yml  +1 -1
.github/workflows/release-pypi.yml  +1 -1
python/pyproject.toml  +1 -1
python/sglang/srt/layers/moe/topk.py  +21 -8
scripts/ci_install_dependency.sh  +1 -1
.github/workflows/execute-notebook.yml
@@ -20,7 +20,7 @@ jobs:
     if: github.repository == 'sgl-project/sglang' || github.event_name == 'pull_request'
     steps:
       - name: Checkout code
-        uses: actions/checkout@v3
+        uses: actions/checkout@v4
       - name: Set up Python
         uses: actions/setup-python@v4
.github/workflows/experiment-runner.yml
@@ -17,7 +17,7 @@ jobs:
     runs-on: 1-gpu-runner
     steps:
       - name: Checkout code
-        uses: actions/checkout@v3
+        uses: actions/checkout@v4
       - name: Install dependencies
         run: |
.github/workflows/lint.yml
@@ -6,7 +6,7 @@ jobs:
   lint:
     runs-on: ubuntu-latest
     steps:
-      - uses: actions/checkout@v2
+      - uses: actions/checkout@v4
       - name: Set up Python
         uses: actions/setup-python@v4
.github/workflows/nightly-test.yml
@@ -20,7 +20,7 @@ jobs:
     runs-on: 2-gpu-runner
     steps:
       - name: Checkout code
-        uses: actions/checkout@v3
+        uses: actions/checkout@v4
       - name: Install dependencies
         run: |
.github/workflows/pr-test-amd.yml
@@ -25,7 +25,7 @@ jobs:
     runs-on: linux-mi300-gpu-1
     steps:
       - name: Checkout code
-        uses: actions/checkout@v3
+        uses: actions/checkout@v4
       - name: Setup docker
         run: |
@@ -64,7 +64,7 @@ jobs:
     runs-on: linux-mi300-gpu-1
     steps:
       - name: Checkout code
-        uses: actions/checkout@v3
+        uses: actions/checkout@v4
       - name: Setup docker
         run: |
.github/workflows/pr-test-rust.yml
@@ -21,7 +21,7 @@ jobs:
     runs-on: ubuntu-latest
     steps:
       - name: Checkout code
-        uses: actions/checkout@v3
+        uses: actions/checkout@v4
       - name: Install dependencies
         run: |
@@ -45,7 +45,7 @@ jobs:
     runs-on: 2-gpu-runner
     steps:
       - name: Checkout code
-        uses: actions/checkout@v3
+        uses: actions/checkout@v4
       - name: Install rust dependencies
         run: |
.github/workflows/pr-test-sgl-kernel.yml
@@ -20,7 +20,7 @@ jobs:
     runs-on: ubuntu-latest
     steps:
       - name: Checkout code
-        uses: actions/checkout@v3
+        uses: actions/checkout@v4
       - name: Check clang-format
         uses: DoozyX/clang-format-lint-action@v0.18.1
.github/workflows/pr-test.yml
@@ -39,7 +39,7 @@ jobs:
       run_tests: ${{ steps.set_run_tests.outputs.run_tests }}
     steps:
       - name: Checkout code
-        uses: actions/checkout@v3
+        uses: actions/checkout@v4
       - name: Filter changes
         id: filter
         uses: dorny/paths-filter@v2
@@ -72,7 +72,7 @@ jobs:
     runs-on: 1-gpu-runner
     steps:
       - name: Checkout code
-        uses: actions/checkout@v3
+        uses: actions/checkout@v4
       - name: Install dependencies
         env:
@@ -98,7 +98,7 @@ jobs:
         part: [0, 1, 2, 3, 4, 5, 6]
     steps:
       - name: Checkout code
-        uses: actions/checkout@v3
+        uses: actions/checkout@v4
       - name: Install dependencies
         env:
@@ -120,7 +120,7 @@ jobs:
     runs-on: 2-gpu-runner
     steps:
       - name: Checkout code
-        uses: actions/checkout@v3
+        uses: actions/checkout@v4
       - name: Install dependencies
         env:
@@ -172,7 +172,7 @@ jobs:
     runs-on: 1-gpu-runner
     steps:
       - name: Checkout code
-        uses: actions/checkout@v3
+        uses: actions/checkout@v4
       - name: Install dependencies
         env:
@@ -218,7 +218,7 @@ jobs:
     runs-on: 1-gpu-runner
     steps:
       - name: Checkout code
-        uses: actions/checkout@v3
+        uses: actions/checkout@v4
       - name: Install dependencies
         env:
@@ -252,7 +252,7 @@ jobs:
     runs-on: 2-gpu-runner
     steps:
       - name: Checkout code
-        uses: actions/checkout@v3
+        uses: actions/checkout@v4
       - name: Install dependencies
         env:
@@ -294,7 +294,7 @@ jobs:
     runs-on: 1-gpu-runner
     steps:
       - name: Checkout code
-        uses: actions/checkout@v3
+        uses: actions/checkout@v4
       - name: Install dependencies
         env:
@@ -319,7 +319,7 @@ jobs:
     runs-on: 2-gpu-runner
     steps:
       - name: Checkout code
-        uses: actions/checkout@v3
+        uses: actions/checkout@v4
       - name: Install dependencies
         env:
.github/workflows/release-docker-amd-nightly.yml
@@ -23,7 +23,7 @@ jobs:
         build_type: ['all', 'srt']
     steps:
       - name: Checkout repository
-        uses: actions/checkout@v3
+        uses: actions/checkout@v4
       - name: "Set Date"
         run: |
.github/workflows/release-docker-amd.yml
@@ -18,7 +18,7 @@ jobs:
         build_type: ['all', 'srt']
     steps:
       - name: Checkout repository
-        uses: actions/checkout@v3
+        uses: actions/checkout@v4
       - name: Free disk space
         uses: jlumbroso/free-disk-space@main
.github/workflows/release-docker-dev.yml
@@ -10,7 +10,7 @@ jobs:
     runs-on: ubuntu-22.04
     steps:
       - name: Checkout repository
-        uses: actions/checkout@v3
+        uses: actions/checkout@v4
       - name: Free disk space
         uses: jlumbroso/free-disk-space@main
.github/workflows/release-docker.yml
@@ -21,7 +21,7 @@ jobs:
         run: rm -rf /opt/hostedtoolcache
       - name: Checkout repository
-        uses: actions/checkout@v3
+        uses: actions/checkout@v4
       - name: Login to Docker Hub
         uses: docker/login-action@v2
.github/workflows/release-docs.yml
@@ -20,7 +20,7 @@ jobs:
     if: github.repository == 'sgl-project/sglang'
     steps:
       - name: Checkout code
-        uses: actions/checkout@v3
+        uses: actions/checkout@v4
       - name: Set up Python
         uses: actions/setup-python@v4
.github/workflows/release-fake-tag.yml
@@ -17,7 +17,7 @@ jobs:
     environment: 'prod'
     steps:
       - name: Checkout repository
-        uses: actions/checkout@v3
+        uses: actions/checkout@v4
       - name: Get version
         id: get_version
.github/workflows/release-pypi.yml
@@ -19,7 +19,7 @@ jobs:
           python-version: '3.9'
       - name: Checkout repository
-        uses: actions/checkout@v3
+        uses: actions/checkout@v4
       - name: Upload to pypi
         run: |
python/pyproject.toml
@@ -43,7 +43,7 @@ runtime_common = [
 srt = [
     "sglang[runtime_common]",
-    "sgl-kernel==0.0.5",
+    "sgl-kernel==0.0.5.post1",
     "flashinfer_python==0.2.3",
     "torch==2.5.1",
     "vllm>=0.6.4.post1,<=0.7.2",
python/sglang/srt/layers/moe/topk.py
@@ -17,7 +17,9 @@ from typing import Callable, Optional
 import torch
 import torch.nn.functional as F
 
-from sglang.srt.utils import get_compiler_backend
+from sglang.srt.utils import get_compiler_backend, is_cuda
+
+_is_cuda = is_cuda()
 
 
 def fused_topk_native(
@@ -47,7 +49,10 @@ def fused_topk(
     topk: int,
     renormalize: bool,
 ):
-    from vllm import _custom_ops as ops
+    if _is_cuda:
+        from sgl_kernel import topk_softmax
+    else:
+        from vllm import _custom_ops as ops
 
     assert hidden_states.shape[0] == gating_output.shape[0], "Number of tokens mismatch"
@@ -61,12 +66,20 @@ def fused_topk(
         M, topk, dtype=torch.int32, device=hidden_states.device
     )
 
-    ops.topk_softmax(
-        topk_weights,
-        topk_ids,
-        token_expert_indicies,
-        gating_output.float(),
-    )
+    if _is_cuda:
+        topk_softmax(
+            topk_weights,
+            topk_ids,
+            token_expert_indicies,
+            gating_output.float(),
+        )
+    else:
+        ops.topk_softmax(
+            topk_weights,
+            topk_ids,
+            token_expert_indicies,
+            gating_output.float(),
+        )
     del token_expert_indicies
 
     if renormalize:
...
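As a rough illustration of the result both `topk_softmax` backends are expected to produce for `fused_topk`, here is a minimal pure-Python sketch. It assumes the kernel applies a softmax over each token's expert logits and keeps the `topk` largest probabilities, and that `renormalize` rescales the kept weights so they sum to 1 (the body of that branch is elided in the diff above). `topk_softmax_reference` is a hypothetical name, not part of `sgl_kernel` or vLLM, and the real kernels write into preallocated tensors rather than returning lists.

```python
import math
from typing import List, Tuple


def topk_softmax_reference(
    gating_logits: List[List[float]], topk: int, renormalize: bool
) -> Tuple[List[List[float]], List[List[int]]]:
    """Hypothetical reference: softmax each token's logits, keep the topk experts."""
    all_weights: List[List[float]] = []
    all_ids: List[List[int]] = []
    for row in gating_logits:
        # Numerically stable softmax over the expert dimension.
        row_max = max(row)
        exps = [math.exp(x - row_max) for x in row]
        total = sum(exps)
        probs = [e / total for e in exps]

        # Select the topk experts by probability (ties go to the lower index).
        ids = sorted(range(len(probs)), key=lambda i: (-probs[i], i))[:topk]
        weights = [probs[i] for i in ids]

        # Assumed behaviour of the renormalize branch: rescale to sum to 1.
        if renormalize:
            kept = sum(weights)
            weights = [w / kept for w in weights]

        all_weights.append(weights)
        all_ids.append(ids)
    return all_weights, all_ids


if __name__ == "__main__":
    # One token routed across four experts, keeping the best two.
    weights, ids = topk_softmax_reference(
        [[1.0, 3.0, 2.0, 0.0]], topk=2, renormalize=True
    )
    print(ids, weights)
```

A reference like this can be handy when checking that the CUDA path (`sgl_kernel`) and the fallback path (`vllm._custom_ops`) agree on the same inputs, up to floating-point tolerance.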
scripts/ci_install_dependency.sh
@@ -26,4 +26,4 @@ pip install transformers==4.45.2 sentence_transformers accelerate==1.4.0 peft pa
 pip install cuda-python nvidia-cuda-nvrtc-cu12
 # reinstall sgl-kernel
-pip install sgl-kernel==0.0.5 --force-reinstall --no-deps
+pip install sgl-kernel==0.0.5.post1 --force-reinstall --no-deps