Commit ad0ff62a
authored Sep 12, 2024 by Lianmin Zheng, committed by GitHub Sep 12, 2024
Balance test in CI (#1411)
parent 9a903a87
Showing 3 changed files with 18 additions and 18 deletions
.github/workflows/pr-test.yml +16 -16
python/sglang/README.md +1 -1
test/srt/test_bench_serving.py +1 -1
.github/workflows/pr-test.yml
@@ -88,29 +88,23 @@ jobs:
           pip install -e "python[all]"
           pip install flashinfer -i https://flashinfer.ai/whl/cu121/torch2.4/ --force-reinstall
 
-      - name: Benchmark Offline Throughput
-        timeout-minutes: 10
-        run: |
-          cd test/srt
-          python3 -m unittest test_bench_serving.TestBenchServing.test_offline_throughput_default
-
-      - name: Benchmark Offline Throughput (w/o RadixAttention)
+      - name: Benchmark Single Latency
         timeout-minutes: 10
         run: |
           cd test/srt
-          python3 -m unittest test_bench_serving.TestBenchServing.test_offline_throughput_without_radix_cache
+          python3 -m unittest test_bench_latency.TestBenchLatency.test_default
 
-      - name: Benchmark Offline Throughput (w/o ChunkedPrefill)
+      - name: Benchmark Online Latency
         timeout-minutes: 10
         run: |
           cd test/srt
-          python3 -m unittest test_bench_serving.TestBenchServing.test_offline_throughput_without_chunked_prefill
+          python3 -m unittest test_bench_serving.TestBenchServing.test_online_latency_default
 
-      - name: Benchmark Offline Throughput (w/ Triton)
+      - name: Benchmark Offline Throughput
         timeout-minutes: 10
         run: |
           cd test/srt
-          python3 -m unittest test_bench_serving.TestBenchServing.test_offline_throughput_with_triton_attention_backend
+          python3 -m unittest test_bench_serving.TestBenchServing.test_offline_throughput_default
 
   performance-test-1-gpu-part-2:
     if: github.repository == 'sgl-project/sglang' || github.event_name == 'pull_request'
@@ -125,17 +119,23 @@ jobs:
           pip install -e "python[all]"
           pip install flashinfer -i https://flashinfer.ai/whl/cu121/torch2.4/ --force-reinstall
 
-      - name: Benchmark Single Latency
+      - name: Benchmark Offline Throughput (w/o RadixAttention)
         timeout-minutes: 10
         run: |
           cd test/srt
-          python3 -m unittest test_bench_latency.TestBenchLatency.test_default
+          python3 -m unittest test_bench_serving.TestBenchServing.test_offline_throughput_without_radix_cache
 
-      - name: Benchmark Online Latency
+      - name: Benchmark Offline Throughput (w/o ChunkedPrefill)
         timeout-minutes: 10
         run: |
           cd test/srt
-          python3 -m unittest test_bench_serving.TestBenchServing.test_online_latency_default
+          python3 -m unittest test_bench_serving.TestBenchServing.test_offline_throughput_without_chunked_prefill
 
+      - name: Benchmark Offline Throughput (w/ Triton)
+        timeout-minutes: 10
+        run: |
+          cd test/srt
+          python3 -m unittest test_bench_serving.TestBenchServing.test_offline_throughput_with_triton_attention_backend
+
   performance-test-2-gpu:
     if: github.repository == 'sgl-project/sglang' || github.event_name == 'pull_request'
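To make the rebalancing in the two workflow hunks above easier to see, here is a small sketch that transcribes the benchmark targets of each job before and after the commit. The test identifiers are copied from the diff; the key `FIRST_JOB` is a placeholder, because the diff does not show the name of the job that precedes `performance-test-1-gpu-part-2`. The helper names `step_counts` and `unittest_commands` are hypothetical, and the jobs may contain other steps outside the hunks shown.

```python
# Unittest targets per job, transcribed from the two pr-test.yml hunks.
# FIRST_JOB is a placeholder: the diff does not show that job's name.
FIRST_JOB = "<job preceding performance-test-1-gpu-part-2>"
SECOND_JOB = "performance-test-1-gpu-part-2"

SERVING = "test_bench_serving.TestBenchServing."
LATENCY = "test_bench_latency.TestBenchLatency."

# Layout before the commit: four benchmark steps versus two.
BEFORE = {
    FIRST_JOB: [
        SERVING + "test_offline_throughput_default",
        SERVING + "test_offline_throughput_without_radix_cache",
        SERVING + "test_offline_throughput_without_chunked_prefill",
        SERVING + "test_offline_throughput_with_triton_attention_backend",
    ],
    SECOND_JOB: [
        LATENCY + "test_default",
        SERVING + "test_online_latency_default",
    ],
}

# Layout after the commit: three benchmark steps in each job.
AFTER = {
    FIRST_JOB: [
        LATENCY + "test_default",
        SERVING + "test_online_latency_default",
        SERVING + "test_offline_throughput_default",
    ],
    SECOND_JOB: [
        SERVING + "test_offline_throughput_without_radix_cache",
        SERVING + "test_offline_throughput_without_chunked_prefill",
        SERVING + "test_offline_throughput_with_triton_attention_backend",
    ],
}


def step_counts(layout):
    """Number of benchmark targets per job (hypothetical helper)."""
    return {job: len(tests) for job, tests in layout.items()}


def unittest_commands(layout, job):
    """Build the `python3 -m unittest ...` lines a job would run."""
    return [f"python3 -m unittest {target}" for target in layout[job]]
```

The same six targets run before and after; only their assignment to jobs changes. The diff gives no per-test runtimes, so this sketch compares step counts only.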
python/sglang/README.md
@@ -7,5 +7,5 @@
 - `bench_latency.py`: Benchmark a single static batch.
 - `bench_serving.py`: Benchmark online serving with dynamic requests.
 - `global_config.py`: The global configs and constants.
-- `launch_server.py`: The entry point of launching local server.
+- `launch_server.py`: The entry point for launching the local server.
 - `utils.py`: Common utilities.
test/srt/test_bench_serving.py
@@ -69,7 +69,7 @@ class TestBenchServing(unittest.TestCase):
 
         if os.getenv("SGLANG_IS_IN_CI", "false") == "true":
             assert res["median_e2e_latency_ms"] < 12000
-            assert res["median_ttft_ms"] < 78
+            assert res["median_ttft_ms"] < 80
             assert res["median_itl_ms"] < 12
 
     def test_moe_offline_throughput_default(self):
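The hunk above relaxes the median TTFT bound from 78 ms to 80 ms, and the assertions only fire when `SGLANG_IS_IN_CI` is `"true"`. A minimal sketch of that gating pattern follows. The metric keys and limits are taken from the diff; the function names, the `env` parameter, and the sample values used in the usage note are assumptions for illustration and are not part of the sglang test suite.

```python
import os

# Limits as asserted in test_bench_serving.py after this commit (milliseconds).
CI_LIMITS_MS = {
    "median_e2e_latency_ms": 12000,
    "median_ttft_ms": 80,
    "median_itl_ms": 12,
}


def in_ci(env=None):
    """True when SGLANG_IS_IN_CI is exactly the string "true"."""
    env = os.environ if env is None else env
    return env.get("SGLANG_IS_IN_CI", "false") == "true"


def exceeded_limits(res, env=None):
    """Return the metric names at or above their limit.

    Outside CI the list is always empty, mirroring the `if` guard in the diff.
    Hypothetical helper: the original test uses bare `assert` statements.
    """
    if not in_ci(env):
        return []
    return [name for name, limit in CI_LIMITS_MS.items() if not res[name] < limit]
```

With a made-up result such as `{"median_e2e_latency_ms": 11000, "median_ttft_ms": 79, "median_itl_ms": 10}` and the CI flag set, `exceeded_limits` returns an empty list under the new 80 ms bound, whereas the old 78 ms bound would have rejected the same run.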