Commit 61bb223e
Authored Aug 25, 2024 by Lianmin Zheng. Committed by GitHub, Aug 25, 2024.

Update CI runner docs (#1213)

Parent: 15f1a49d
Showing 2 changed files with 30 additions and 75 deletions:

.github/workflows/moe-test.yml  +2 -2
docs/en/setup_github_runner.md  +28 -73
.github/workflows/moe-test.yml
@@ -33,13 +33,13 @@ jobs:
           pip install flashinfer -i https://flashinfer.ai/whl/cu121/torch2.4/ --force-reinstall
 
       - name: Benchmark MoE Serving Throughput
-        timeout_minutes: 10
+        timeout-minutes: 10
         run: |
           cd test/srt
           python3 -m unittest test_moe_serving_throughput.TestServingThroughput.test_default
 
       - name: Benchmark MoE Serving Throughput (w/o RadixAttention)
-        timeout_minutes: 10
+        timeout-minutes: 10
         run: |
           cd test/srt
           python3 -m unittest test_moe_serving_throughput.TestServingThroughput.test_default_without_radix_cache
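The hunk above replaces the underscore spelling `timeout_minutes` with `timeout-minutes`, the step-level key that GitHub Actions documents. A minimal sketch for spotting the underscore spelling in other workflow files follows. The function name `find_bad_timeout_keys` and the default directory are assumptions made for illustration; they are not part of this commit.

```shell
#!/bin/sh
# Hypothetical helper (not part of the commit): list workflow lines that use
# the underscore spelling `timeout_minutes` instead of `timeout-minutes`.
find_bad_timeout_keys() {
  # -r: recurse into the directory, -n: print line numbers, -E: extended regex.
  # The pattern matches the key at the start of a line, allowing leading
  # indentation and a YAML list dash before it.
  grep -rnE '^[[:space:]-]*timeout_minutes[[:space:]]*:' "${1:-.github/workflows}"
}
```

Calling `find_bad_timeout_keys` from the repository root prints one `file:line:content` entry per match, and returns a non-zero status when nothing matches.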
docs/en/setup_github_runner.md
-# Set up self hosted runner for GitHub Action
+# Set Up Self-hosted Runners for GitHub Action
 
-## Config Runner
+## Add a Runner
 
-```bash
-# https://github.com/sgl-project/sglang/settings/actions/runners/new?arch=x64&os=linux
-# Involves some TOKEN and other private information, click the link to view specific steps.
-```
+### Step 1: Start a docker container.
 
-## Start Runner
+You can mount a folder for the shared huggingface model weights cache. The command below uses `/tmp/huggingface` as an example.
 
-add `/lib/systemd/system/e2e.service`
 ```
-[Unit]
-StartLimitIntervalSec=0
-[Service]
-Environment="CUDA_VISIBLE_DEVICES=7"
-Environment="XDG_CACHE_HOME=/data/.cache"
-Environment="HF_TOKEN=hf_xx"
-Environment="OPENAI_API_KEY=sk-xx"
-Environment="HOME=/data/zhyncs/runner-v1"
-Environment="SGLANG_IS_IN_CI=true"
-Restart=always
-RestartSec=1
-ExecStart=/data/zhyncs/runner-v1/actions-runner/run.sh
-[Install]
-WantedBy=multi-user.target
+docker pull nvidia/cuda:12.1.1-devel-ubuntu22.04
+docker run --shm-size 64g -it -v /tmp/huggingface:/hf_home --gpus all nvidia/cuda:12.1.1-devel-ubuntu22.04 /bin/bash
 ```
 
-add `/lib/systemd/system/unit.service`
-```
-[Unit]
-StartLimitIntervalSec=0
-[Service]
-Environment="CUDA_VISIBLE_DEVICES=6"
-Environment="XDG_CACHE_HOME=/data/.cache"
-Environment="HF_TOKEN=hf_xx"
-Environment="OPENAI_API_KEY=sk-xx"
-Environment="HOME=/data/zhyncs/runner-v2"
-Environment="SGLANG_IS_IN_CI=true"
-Restart=always
-RestartSec=1
-ExecStart=/data/zhyncs/runner-v2/actions-runner/run.sh
-[Install]
-WantedBy=multi-user.target
-```
+### Step 2: Configure the runner by `config.sh`
+
+Run these commands inside the container.
 
-add `/lib/systemd/system/accuracy.service`
 ```
-[Unit]
-StartLimitIntervalSec=0
-[Service]
-Environment="CUDA_VISIBLE_DEVICES=5"
-Environment="XDG_CACHE_HOME=/data/.cache"
-Environment="HF_TOKEN=hf_xx"
-Environment="OPENAI_API_KEY=sk-xx"
-Environment="HOME=/data/zhyncs/runner-v3"
-Environment="SGLANG_IS_IN_CI=true"
-Restart=always
-RestartSec=1
-ExecStart=/data/zhyncs/runner-v3/actions-runner/run.sh
-[Install]
-WantedBy=multi-user.target
+apt update && apt install -y curl python3-pip git
+export RUNNER_ALLOW_RUNASROOT=1
 ```
 
-```bash
-cd /data/zhyncs/runner-v1
-python3 -m venv venv
+Then follow https://github.com/sgl-project/sglang/settings/actions/runners/new?arch=x64&os=linux to run `config.sh`
 
-cd /data/zhyncs/runner-v2
-python3 -m venv venv
+**Notes**
+- Do not need to specify the runner group
+- Give it a name (e.g., `test-sgl-gpu-0`) and some labels (e.g., `unit-test`). The labels can be editted later in Github Settings.
+- Do not need to change the work folder.
 
-cd /data/zhyncs/runner-v3
-python3 -m venv venv
+### Step 3: Run the runner by `run.sh`
 
-sudo systemctl daemon-reload
-
-sudo systemctl start e2e
-sudo systemctl enable e2e
-sudo systemctl status e2e
-
-sudo systemctl start unit
-sudo systemctl enable unit
-sudo systemctl status unit
+- Set up environment variables
+```
+export HF_HOME=/hf_home
+export SGLANG_IS_IN_CI=true
+export HF_TOKEN=hf_xxx
+export OPENAI_API_KEY=sk-xxx
+export CUDA_VISIBLE_DEVICES=0
+```
 
-sudo systemctl start accuracy
-sudo systemctl enable accuracy
-sudo systemctl status accuracy
+- Run it forever
 ```
+while true; do ./run.sh; echo "Restarting..."; sleep 2; done
+```
\ No newline at end of file
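Step 3 of the new document exports five variables before starting `run.sh` in a restart loop. A minimal sketch of a pre-flight check for those variables follows. The function name `check_runner_env` is hypothetical and not part of the commit; the variable list is copied from the exports in the diff above.

```shell
#!/bin/sh
# Hypothetical pre-flight check (not part of the commit): report which of the
# variables exported in Step 3 are unset or empty.
# The function body is a subshell, so its helper variables do not leak.
check_runner_env() (
  missing=0
  for var in HF_HOME SGLANG_IS_IN_CI HF_TOKEN OPENAI_API_KEY CUDA_VISIBLE_DEVICES; do
    # Look up the variable named by $var; the names are a fixed list, so eval is safe here.
    eval "val=\${$var:-}"
    if [ -z "$val" ]; then
      echo "missing: $var" >&2
      missing=1
    fi
  done
  # Exit status 0 only when every variable is set and non-empty.
  [ "$missing" -eq 0 ]
)
```

One possible use is to gate the restart loop on it: `check_runner_env && while true; do ./run.sh; echo "Restarting..."; sleep 2; done`.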