Unverified commit f7dd651d, authored by hzh0425, committed by GitHub

feat(hicache-3fs): 3FS-SGLang Hierarchical Cache Deployment Guide (#9213)

parent 9d54c6e6
-# HF3FS as L3 KV Cache
-This document describes how to use deepseek-hf3fs as the L3 KV cache for SGLang.
-## Step1: Install deepseek-3fs by 3fs-Operator (Coming Soon)
-## Step2: Setup usrbio client
-Please follow the document [setup_usrbio_client.md](setup_usrbio_client.md) to setup usrbio client.
-## Step3: Deployment
-### Single node deployment
+# Using HF3FS as L3 Global KV Cache
+This document provides step-by-step instructions for setting up a k8s + 3FS + SGLang runtime environment from scratch, describing how to utilize deepseek-hf3fs as the L3 KV cache for SGLang.
+The process consists of five main steps:
+## Step 1: Install deepseek-3fs via 3fs-Operator
+Refer to the [3fs-operator documentation](https://github.com/aliyun/kvc-3fs-operator/blob/main/README_en.md) to deploy 3FS components in your Kubernetes environment using the Operator with one-click deployment.
+## Step 2: Launch SGLang Pod
+Start your SGLang Pod while specifying 3FS-related labels in the YAML configuration. Follow the [fuse-client-creation guide](https://github.com/aliyun/kvc-3fs-operator/blob/main/README_en.md#fuse-client-creation).
+## Step 3: Configure Usrbio Client in SGLang Pod
+The Usrbio client is required for accessing 3FS. Install it in your SGLang Pod using either method below:
+**Alternative 1 (Recommend):** Build from source (refer to [setup_usrbio_client.md](setup_usrbio_client.md))
+**Alternative 2:** Run `pip3 install hf3fs-py-usrbio` (Follow https://pypi.org/project/hf3fs-py-usrbio/#files)
+## Step 4: Deploy Model Serving
+### Single Node Deployment
 ```bash
-export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib/python3.10/dist-packages
+export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib/python3.12/dist-packages
 python3 -m sglang.launch_server \
 --model-path /code/models/Qwen3-32B/ \
 --host 0.0.0.0 --port 10000 \
@@ -24,6 +31,5 @@ python3 -m sglang.launch_server \
 --hicache-storage-backend hf3fs
 ```
-### Multi nodes deployment to share KV cache
-Please follow the document [deploy_sglang_3fs_multinode.md](deploy_sglang_3fs_multinode.md) to deploy SGLang with 3FS on multiple nodes to share KV cache.
+### Multi-Node Deployment (Shared KV Cache)
+Follow the [deploy_sglang_3fs_multinode.md](deploy_sglang_3fs_multinode.md) guide to deploy SGLang with 3FS across multiple nodes for shared KV caching.
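
Editor's note: every launch command touched by this commit swaps a hard-coded interpreter version (`python3.10` to `python3.12`) in `LD_LIBRARY_PATH`. One way to avoid repeating that edit is to derive the directory from the interpreter that is actually installed. The following is a minimal sketch, assuming the usrbio shared libraries live under `/usr/local/lib/pythonX.Y/dist-packages` as in the commands in this diff. `append_ld_path` and `python_dist_packages` are hypothetical helper names and are not part of SGLang or 3FS.

```shell
#!/usr/bin/env bash
# Sketch only: build LD_LIBRARY_PATH from the installed interpreter's version
# instead of hard-coding python3.10 or python3.12.

# append_ld_path DIR: add DIR to LD_LIBRARY_PATH unless it is already listed.
# (Hypothetical helper, not provided by SGLang or 3FS.)
append_ld_path() {
    case ":${LD_LIBRARY_PATH:-}:" in
        *":$1:"*) ;;  # already present, nothing to do
        *) export LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+${LD_LIBRARY_PATH}:}$1" ;;
    esac
}

# python_dist_packages X.Y: print the dist-packages directory for version X.Y.
# The /usr/local/lib prefix is an assumption taken from the commands in this diff.
python_dist_packages() {
    printf '/usr/local/lib/python%s/dist-packages\n' "$1"
}

# Ask the interpreter for its major.minor version, if python3 is available.
if command -v python3 >/dev/null 2>&1; then
    py_ver="$(python3 -c 'import sys; print("%d.%d" % sys.version_info[:2])')"
    append_ld_path "$(python_dist_packages "$py_ver")"
fi
```

Sourcing this before `python3 -m sglang.launch_server` would stand in for the `export LD_LIBRARY_PATH=...` line. Because the helper skips directories that are already listed, re-sourcing it in the same shell does not grow the path.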
@@ -20,7 +20,7 @@ vim /sgl-workspace/sglang/benchmark/hf3fs/hf3fs_config.json
 ## node1
 ```bash
 export SGLANG_HICACHE_HF3FS_CONFIG_PATH=/sgl-workspace/sglang/benchmark/hf3fs/hf3fs_config.json
-export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib/python3.10/dist-packages
+export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib/python3.12/dist-packages
 rm -rf instance1.out && \
 nohup python3 -m sglang.launch_server \
 --model-path /code/models/Qwen3-32B/ \
@@ -35,7 +35,7 @@ nohup python3 -m sglang.launch_server \
 ## node2
 ```bash
 export SGLANG_HICACHE_HF3FS_CONFIG_PATH=/sgl-workspace/sglang/benchmark/hf3fs/hf3fs_config.json
-export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib/python3.10/dist-packages
+export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib/python3.12/dist-packages
 rm -rf instance2.out && \
 nohup python3 -m sglang.launch_server \
 --model-path /code/models/Qwen3-32B/ \
......