Unverified Commit e79f6cd7 authored by Lianmin Zheng, committed by GitHub

Release v0.3.1 (#1430)

parent 9ba1f097
@@ -60,7 +60,7 @@ pip install flashinfer -i https://flashinfer.ai/whl/cu121/torch2.4/
 ### Method 2: From source
 ```
 # Use the last release branch
-git clone -b v0.3.0 https://github.com/sgl-project/sglang.git
+git clone -b v0.3.1 https://github.com/sgl-project/sglang.git
 cd sglang
 pip install --upgrade pip
@@ -139,7 +139,7 @@ sky status --endpoint 30000 sglang
 ### Common Notes
-- [FlashInfer](https://github.com/flashinfer-ai/flashinfer) is the default attention kernel backend. It only supports sm75 and above. If you encounter any FlashInfer-related issues on sm75+ devices (e.g., T4, A10, A100, L4, L40S, H100), please disable it by adding `--disable-flashinfer --disable-flashinfer-sampling` and open an issue on GitHub.
+- [FlashInfer](https://github.com/flashinfer-ai/flashinfer) is the default attention kernel backend. It only supports sm75 and above. If you encounter any FlashInfer-related issues on sm75+ devices (e.g., T4, A10, A100, L4, L40S, H100), please switch to other kernels by adding `--attention-backend triton --sampling-backend pytorch` and open an issue on GitHub.
 - If you only need to use the OpenAI backend, you can avoid installing other dependencies by using `pip install "sglang[openai]"`.
 ## Backend: SGLang Runtime (SRT)
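For context, the new note above replaces the removed `--disable-flashinfer --disable-flashinfer-sampling` flags with explicit backend selection. A minimal sketch of a launch command using the fallback kernels, following the launch pattern from the SGLang README; the model path is an illustrative placeholder:

```
# Launch the SRT server with the Triton attention kernel and the
# PyTorch sampling kernel instead of FlashInfer.
# The model path below is only an example.
python -m sglang.launch_server \
  --model-path meta-llama/Meta-Llama-3.1-8B-Instruct \
  --port 30000 \
  --attention-backend triton \
  --sampling-backend pytorch
```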
@@ -92,5 +92,5 @@ sky status --endpoint 30000 sglang
 </details>
 ### Common Notes
-- [FlashInfer](https://github.com/flashinfer-ai/flashinfer) is the default attention kernel backend. It only supports sm75 and above. If you encounter any FlashInfer-related issues on sm75+ devices (e.g., T4, A10, A100, L4, L40S, H100), please disable it by adding `--disable-flashinfer --disable-flashinfer-sampling` and open an issue on GitHub.
+- [FlashInfer](https://github.com/flashinfer-ai/flashinfer) is the default attention kernel backend. It only supports sm75 and above. If you encounter any FlashInfer-related issues on sm75+ devices (e.g., T4, A10, A100, L4, L40S, H100), please switch to other kernels by adding `--attention-backend triton --sampling-backend pytorch` and open an issue on GitHub.
 - If you only need to use the OpenAI backend, you can avoid installing other dependencies by using `pip install "sglang[openai]"`.
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
 [project]
 name = "sglang"
-version = "0.3.0"
+version = "0.3.1"
 description = "SGLang is yet another fast serving framework for large language models and vision language models."
 readme = "README.md"
 requires-python = ">=3.8"
-__version__ = "0.3.0"
+__version__ = "0.3.1"
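Once this release is published, a quick way to confirm the version bump in an installed environment; this check is an illustration, not part of the commit, and assumes the package re-exports `__version__` at the top level:

```
# Upgrade to the latest release and print the installed package version;
# it should report 0.3.1 after this commit is released.
pip install --upgrade sglang
python -c "import sglang; print(sglang.__version__)"
```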