update docker images and FA3

Commit 739b3308 authored Aug 01, 2025 by helloyongyang
4 changed files
--- a/Dockerfile
+++ b/Dockerfile
@@ -34,4 +34,9 @@ RUN cd flash-attention && python setup.py install && rm -rf build

 RUN cd flash-attention/hopper && python setup.py install && rm -rf build

+# RUN git clone https://github.com/thu-ml/SageAttention.git
+
+# # install sageattention with hopper gpu sm9.0
+# RUN cd SageAttention && sed -i 's/set()/{"9.0"}/' setup.py && EXT_PARALLEL=4 NVCC_APPEND_FLAGS="--threads 8" MAX_JOBS=32 pip install --no-cache-dir -v -e .
+
 WORKDIR /workspace
--- a/docs/EN/source/getting_started/quickstart.md
+++ b/docs/EN/source/getting_started/quickstart.md
@@ -27,11 +27,28 @@ We strongly recommend using the Docker environment, which is the simplest and fa

 #### 1. Pull Image

-Visit LightX2V's [Docker Hub](https://hub.docker.com/r/lightx2v/lightx2v/tags) and select a tag with the latest date, such as `25061301`:
+Visit LightX2V's [Docker Hub](https://hub.docker.com/r/lightx2v/lightx2v/tags) and select a tag with the latest date, such as `25080104`:

 ```bash
 # Pull the latest version of LightX2V image
-docker pull lightx2v/lightx2v:25061301
+docker pull lightx2v/lightx2v:25080104
+```
+
+If you need to use `SageAttention`, you can use docker image versions with the `-SageSmXX` suffix. The use of `SageAttention` requires selection based on GPU type, where:
+
+1. A100: -SageSm80
+2. RTX30 series: -SageSm86
+3. RTX40 series: -SageSm89
+4. H100: -SageSm90
+5. RTX50 series: -SageSm120
+
+For example, to use `SageAttention` on 4090 or H100, the docker image pull command would be:
+
+```bash
+# For 4090
+docker pull lightx2v/lightx2v:25080104-SageSm89
+# For H100
+docker pull lightx2v/lightx2v:25080104-SageSm90
 ```

 #### 2. Run Container

--- a/docs/ZH_CN/source/getting_started/quickstart.md
+++ b/docs/ZH_CN/source/getting_started/quickstart.md
@@ -27,11 +27,28 @@

 #### 1. 拉取镜像

-访问 LightX2V 的 [Docker Hub](https://hub.docker.com/r/lightx2v/lightx2v/tags)，选择一个最新日期的 tag，比如 `25061301`：
+访问 LightX2V 的 [Docker Hub](https://hub.docker.com/r/lightx2v/lightx2v/tags)，选择一个最新日期的 tag，比如 `25080104`：

 ```bash
 # 拉取最新版本的 LightX2V 镜像
-docker pull lightx2v/lightx2v:25061301
+docker pull lightx2v/lightx2v:25080104
+```
+
+如果需要使用`SageAttention`，可以使用带`-SageSmXX`后缀的镜像版本，`SageAttention`的使用需要针对GPU类型进行选择，其中：
+
+1. A100: -SageSm80
+2. RTX30系列: -SageSm86
+3. RTX40系列: -SageSm89
+4. H100: -SageSm90
+5. RTX50系列: -SageSm120
+
+比如要在4090或者H100上使用`SageAttention`，则拉取镜像命令为：
+
+```bash
+# 对于4090
+docker pull lightx2v/lightx2v:25080104-SageSm89
+# 对于H100
+docker pull lightx2v/lightx2v:25080104-SageSm90
 ```

 #### 2. 运行容器

--- a/lightx2v/common/ops/attn/flash_attn.py
+++ b/lightx2v/common/ops/attn/flash_attn.py
@@ -71,5 +71,5 @@ class FlashAttn3Weight(AttnWeightTemplate):
            cu_seqlens_kv,
            max_seqlen_q,
            max_seqlen_kv,
-        )[0].reshape(max_seqlen_q, -1)
+        ).reshape(max_seqlen_q, -1)
        return x
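
Note on the `flash_attn.py` hunk: dropping `[0]` suggests that the FA3 build shipped in the updated images returns the attention output directly, where the earlier build returned a tuple whose first element was the output. If that reading is right, keeping `[0]` on a bare tensor would select the first row along the leading dimension rather than the whole output, so the subsequent `reshape(max_seqlen_q, -1)` would fail or produce the wrong shape. For code that has to run against both builds, a small compatibility helper is one option. The sketch below is an assumption, not part of this commit: `unwrap_attn_output` is a hypothetical name, and the tuple layout `(out, softmax_lse)` is assumed rather than taken from the diff.

```python
def unwrap_attn_output(result):
    """Return the attention output regardless of the kernel's return style.

    Assumption: older FA3 builds return a tuple such as (out, softmax_lse),
    while newer builds return the output tensor itself.
    """
    if isinstance(result, tuple):
        # Tuple-style return: the output is the first element.
        return result[0]
    # Bare return: the value already is the output.
    return result


# Stand-in values instead of real tensors keep the sketch dependency-free.
assert unwrap_attn_output(("out", "lse")) == "out"
assert unwrap_attn_output("out") == "out"
```

In `FlashAttn3Weight`, the result of the FA3 call would be passed through such a helper before `.reshape(max_seqlen_q, -1)`. The commit instead targets the new return style only, which is the simpler choice when the Docker image pins the FA3 version.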