We recommend using a Docker environment. Images for lightx2v are published on [Docker Hub](https://hub.docker.com/r/lightx2v/lightx2v/tags); please select the tag with the latest date, for example `25042502`.
```shell
docker pull lightx2v/lightx2v:25042502
docker run --gpus all -itd --ipc=host --name [container_name] -v [mount_settings] --entrypoint /bin/bash [image_id]
```
If you want to set up the environment yourself with conda, you can refer to the following steps:
```shell
# Install again separately to bypass the version conflict check
# The Hunyuan model needs to run under this version of transformers. If you do not need to run the Hunyuan model, you can ignore this step.
pip install transformers==4.45.2
# install flash-attention 2
cd lightx2v/3rd/flash-attention && pip install --no-cache-dir -v -e .
# install flash-attention 3 (only needed on Hopper GPUs)
cd lightx2v/3rd/flash-attention/hopper && pip install --no-cache-dir -v -e .
```
# Infer
```shell
# Modify the path in the script
bash scripts/run_wan_t2v.sh
```
In addition to the existing input arguments in the script, there are also some necessary parameters in the `${lightx2v_path}/configs/wan_t2v.json` file specified by `--config_json`. You can modify them as needed.
# Service

lightx2v provides asynchronous service functionality. The code entry point is [here](https://github.com/ModelTC/lightx2v/blob/main/lightx2v/api_server.py).
### Start the Service
```shell
# Modify the paths in the script
bash scripts/start_server.sh
```
The `--port 8000` option means the service will bind to port `8000` on the local machine. You can change this as needed.
### Client Sends Request
```shell
python scripts/post.py
```
The service endpoint is: `/v1/local/video/generate`
The `message` parameter in `scripts/post.py` is as follows:
```python
message = {
    "task_id": generate_task_id(),
    "task_id_must_unique": True,
    "prompt": "Two anthropomorphic cats in comfy boxing gear and bright gloves fight intensely on a spotlighted stage.",
    "negative_prompt": "",
    "image_path": "",  # empty string means no image input
    "save_video_path": "/path/to/save_video.mp4",  # preferably an absolute path on the server
}
```
1. `prompt`, `negative_prompt`, and `image_path` are the basic inputs for video generation. `image_path` can be an empty string, indicating that no image input is needed.
2. `save_video_path` specifies the path where the generated video will be saved on the server. A relative path is resolved against the server's startup directory, so it is recommended to use an absolute path suited to your environment.
3. `task_id` is the ID of the task, given as a string. You can supply your own string or use the `generate_task_id()` function to generate a random one. The task ID is used to distinguish between different video generation tasks.
4. `task_id_must_unique` indicates whether each `task_id` must be unique. If set to `False`, there is no such restriction; if duplicate `task_id`s are sent, the server's `task` record is overwritten by the newer task with the same `task_id`. If you do not need to keep a record of all tasks for querying, you can set this to `False`.
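
If you prefer not to use the script, the endpoint can also be called directly over HTTP. The following is only a sketch: it assumes the server is listening on `localhost:8000` and accepts a JSON POST body, and the `task_id` and `save_video_path` values are placeholders; `scripts/post.py` remains the reference client.

```shell
# Sketch only: assumes the server is on localhost:8000 and takes a JSON POST body.
curl -X POST http://localhost:8000/v1/local/video/generate \
  -H "Content-Type: application/json" \
  -d '{
        "task_id": "task_0001",
        "task_id_must_unique": true,
        "prompt": "Two anthropomorphic cats in comfy boxing gear and bright gloves fight intensely on a spotlighted stage.",
        "negative_prompt": "",
        "image_path": "",
        "save_video_path": "/path/to/save_video.mp4"
      }'
```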
### Client Checks Server Status
```shell
python scripts/check_status.py
```
The service endpoints include:
1. `/v1/local/video/generate/service_status` is used to check the status of the service. It returns whether the service is `busy` or `idle`. The service only accepts new requests when it is `idle`.
2. `/v1/local/video/generate/get_all_tasks` is used to get all tasks received and completed by the server.
3. `/v1/local/video/generate/task_status` is used to get the status of a specified `task_id`. It returns whether the task is `processing` or `completed`.
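
These endpoints can also be queried directly. The sketch below assumes the server is on `localhost:8000`, that the endpoints respond to GET, and that `task_status` takes the `task_id` as a query parameter; see `scripts/check_status.py` for the actual usage.

```shell
# Sketch only: the HTTP method and the way task_id is passed are assumptions.
curl http://localhost:8000/v1/local/video/generate/service_status
curl http://localhost:8000/v1/local/video/generate/get_all_tasks
curl "http://localhost:8000/v1/local/video/generate/task_status?task_id=task_0001"
```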
### Client Stops the Current Task on the Server at Any Time
```shell
python scripts/stop_running_task.py
```
The service endpoint is: `/v1/local/video/generate/stop_running_task`
After terminating the task, the server will not exit but will return to waiting for new requests.
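
The stop endpoint can likewise be called directly; the HTTP method below is an assumption, and `scripts/stop_running_task.py` is the reference client.

```shell
# Sketch only: assumes a POST with no body on localhost:8000.
curl -X POST http://localhost:8000/v1/local/video/generate/stop_running_task
```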
# Quantization

lightx2v supports quantized inference for linear layers, with both w8a8 and fp8 matrix multiplication.
### Run Quantized Inference
```shell
# Modify the path in the script
bash scripts/run_wan_t2v_save_quant.sh
```
There are two execution commands in the script:
#### Save Quantization Weights
Set the `RUNNING_FLAG` environment variable to `save_naive_quant`, and set `--config_json` to the corresponding `json` file: `${lightx2v_path}/configs/wan_t2v_save_quant.json`. In this file, `quant_model_path` specifies the path to save the quantized model.
#### Load Quantization Weights and Inference
Set the `RUNNING_FLAG` environment variable to `infer`, and set `--config_json` to the `json` file from the previous step.
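
Conceptually, the two commands differ only in the `RUNNING_FLAG` value; the launch command itself is the one already in `scripts/run_wan_t2v_save_quant.sh`, so the sketch below is not a complete command.

```shell
# Step 1: quantize and save the weights to the quant_model_path given in the json config.
export RUNNING_FLAG=save_naive_quant
# ... run the inference command from the script with
#     --config_json ${lightx2v_path}/configs/wan_t2v_save_quant.json

# Step 2: load the saved quantized weights and run inference.
export RUNNING_FLAG=infer
# ... run the same command with the same --config_json
```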
### Start Quantization Service
After saving the quantized weights, set the `RUNNING_FLAG` environment variable to `infer` (as in the loading step above), and set `--config_json` to the `json` file from the first step.
For example, modify the `scripts/start_server.sh` script as follows:
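
The sketch below only shows the lines that change; the server launch command already in `scripts/start_server.sh` stays the same (the exact contents of your copy of the script may differ).

```shell
# Only the environment variable and the --config_json value change.
export RUNNING_FLAG=infer

# ... existing server launch command in the script, with:
#     --config_json ${lightx2v_path}/configs/wan_t2v_save_quant.json
```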