@@ -34,9 +34,8 @@ RUN cd flash-attention && python setup.py install && rm -rf build
RUN cd flash-attention/hopper && python setup.py install && rm -rf build
# RUN git clone https://github.com/thu-ml/SageAttention.git
RUN git clone https://github.com/ModelTC/SageAttention.git
# Install SageAttention for the listed GPU architectures (Ampere through Blackwell)
RUN cd SageAttention && CUDA_ARCHITECTURES="8.0,8.6,8.9,9.0,12.0" EXT_PARALLEL=4 NVCC_APPEND_FLAGS="--threads 8" MAX_JOBS=32 pip install --no-cache-dir -v -e .
# RUN cd SageAttention && sed -i 's/set()/{"9.0"}/' setup.py && EXT_PARALLEL=4 NVCC_APPEND_FLAGS="--threads 8" MAX_JOBS=32 pip install --no-cache-dir -v -e .
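For context, build front ends typically expand a `CUDA_ARCHITECTURES`-style list into per-architecture `nvcc` `-gencode` flags. A minimal Python sketch of that expansion (illustrative only, not SageAttention's actual setup code):

```python
def gencode_flags(arch_list: str) -> list[str]:
    """Expand a comma-separated compute-capability list (e.g. "8.0,9.0")
    into nvcc -gencode flags, one flag per architecture."""
    flags = []
    for arch in arch_list.split(","):
        sm = arch.strip().replace(".", "")  # "9.0" -> "90"
        flags.append(f"-gencode=arch=compute_{sm},code=sm_{sm}")
    return flags

print(gencode_flags("8.0,8.6,8.9,9.0,12.0"))
```

Trimming the list to only your GPU's architecture shortens compile time considerably.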
@@ -27,39 +27,16 @@ We strongly recommend using the Docker environment, which is the simplest and fa
#### 1. Pull Image
Visit LightX2V's [Docker Hub](https://hub.docker.com/r/lightx2v/lightx2v/tags), select a tag with the latest date, such as `25082901-cu128`:
```bash
# Pull the latest version of LightX2V image, this image does not have SageAttention installed
docker pull lightx2v/lightx2v:25082901-cu128
```
If you need `SageAttention`, use an image version with a `-SageSmXX` suffix, choosing the suffix that matches your GPU architecture:
1. A100: `-SageSm80`
2. RTX 30 series: `-SageSm86`
3. RTX 40 series: `-SageSm89`
4. H100: `-SageSm90`
5. RTX 50 series: `-SageSm120`
For example, to use `SageAttention` on a 4090 or an H100, pull the image carrying the `-SageSm89` or `-SageSm90` suffix respectively.
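A sketch of such pull commands, assuming the `-SageSmXX` suffix is appended to the dated CUDA tag (these exact tag names are an assumption; verify them on Docker Hub):

```shell
# Hypothetical tags -- confirm on hub.docker.com/r/lightx2v/lightx2v/tags
docker pull lightx2v/lightx2v:25082901-cu128-SageSm89   # RTX 4090
docker pull lightx2v/lightx2v:25082901-cu128-SageSm90   # H100
```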
We recommend using the `cuda128` environment for faster inference speed. If you need to use the `cuda124` environment, you can use image versions with the `-cu124` suffix:
```bash
# cuda124 version, without SageAttention installed
docker pull lightx2v/lightx2v:25082901-cu124
# For 4090, cuda124 version, with SageAttention installed