@@ -7,14 +7,14 @@ The router is a independent Python package, and it can be used as a drop-in repl
## Installation
```bash
pip install sglang-router
```
For detailed usage of the router, see [launch_router](https://github.com/sgl-project/sglang/blob/main/rust/py_src/sglang_router/launch_router.py) and [launch_server](https://github.com/sgl-project/sglang/blob/main/rust/py_src/sglang/launch_server.py). You can also run the following commands to print the available options.
```bash
python -m sglang_router.launch_server --help
python -m sglang_router.launch_router --help
```
The router supports two working modes:
...
@@ -27,7 +27,7 @@ The router supports two working modes:
This is a drop-in replacement for the existing `--dp-size` argument of SGLang Runtime. Under the hood, it uses multiple processes to launch multiple workers, waits for them to be ready, and then connects the router to all workers.
After the server is ready, you can send requests to the router in the same way you would send them to a single worker.
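As a minimal sketch of what "the same way" means in practice, the helper below builds and sends a request using only the Python standard library. The address `http://localhost:30000`, the `/generate` path, and the `{"text": ...}` payload shape are assumptions for illustration. Substitute whatever host, port, and endpoint your workers actually expose.

```python
import json
import urllib.request

# Assumption: the router listens on this address; adjust to your setup.
ROUTER_URL = "http://localhost:30000"


def build_generate_request(prompt, base_url=ROUTER_URL):
    """Build a POST request aimed at the router instead of a single worker.

    The `/generate` path and the payload shape are assumed to match what a
    single worker accepts; only the base URL changes.
    """
    payload = {"text": prompt}
    return urllib.request.Request(
        url=f"{base_url}/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, base_url=ROUTER_URL, timeout=30.0):
    """Send the request and decode the JSON response body."""
    request = build_generate_request(prompt, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example (requires a running router):
#   print(generate("What is the capital of France?"))
```

Because the router is meant to be addressed like a worker, switching an existing client over should only require changing the base URL.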
...
@@ -47,12 +47,62 @@ print(response.json())
This mode is useful for multi-node data parallelism (DP). First, launch workers on multiple nodes. Then launch a router on the main node and connect it to all workers.
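The second step can be sketched as a small helper that assembles the router launch command from a list of worker addresses. The `--worker-urls` flag and the example addresses are assumptions here. Confirm the exact option name with `python -m sglang_router.launch_router --help` before relying on it.

```python
import shlex


def build_router_command(worker_urls):
    """Return the argv list that would launch a router for the given workers.

    Assumption: the router accepts worker addresses through a
    `--worker-urls` option that takes one or more URLs.
    """
    if not worker_urls:
        raise ValueError("at least one worker URL is required")
    return [
        "python",
        "-m",
        "sglang_router.launch_router",
        "--worker-urls",
        *worker_urls,
    ]


# Hypothetical worker addresses, one per node.
command = build_router_command(["http://node1:8000", "http://node2:8000"])

# Print a shell-ready command line; pass `command` to subprocess.run to execute it.
print(shlex.join(command))
```

Building the command as a list rather than a single string keeps each URL a separate argument and avoids shell quoting problems if it is later passed to `subprocess.run`.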
@@ -158,7 +62,7 @@ For development purposes, you can install the package in editable mode:
Warning: An editable Python binding can suffer from performance degradation. Build a fresh wheel after every update if you want to test performance.
```bash
pip install -e .
```
**Note:** When modifying Rust code, you must rebuild the wheel for changes to take effect.