Commit 398b598a authored by PengGao's avatar PengGao Committed by GitHub

Refactor/api (#94)

* fix: correct frequency computation in WanTransformerInfer

* refactor: restructure API server and distributed inference services

- Removed the old api_server_dist.py file and integrated its functionality into a new modular structure.
- Created a new ApiServer class to handle API routes and services.
- Introduced DistributedInferenceService and FileService for better separation of concerns.
- Updated the main entry point to initialize and run the new API server with distributed inference capabilities.
- Added schema definitions for task requests and responses to improve data handling.
- Enhanced error handling and logging throughout the services.
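The separation of concerns described above could be sketched roughly as follows. The class names (`ApiServer`, `DistributedInferenceService`, `FileService`) come from this commit, but every method body here is an assumption, including the path-validation logic:

```python
from pathlib import Path


class DistributedInferenceService:
    """Owns the lifecycle of the distributed inference workers (sketch)."""

    def __init__(self):
        self.running = False

    def start(self):
        # The real service would spawn worker subprocesses here.
        self.running = True

    def stop(self):
        self.running = False


class FileService:
    """Resolves and validates paths for generated video files (sketch)."""

    def __init__(self, output_dir: str):
        self.output_dir = Path(output_dir)

    def resolve(self, filename: str) -> Path:
        # Reject paths that would escape the output directory.
        path = (self.output_dir / filename).resolve()
        if self.output_dir.resolve() not in path.parents:
            raise ValueError(f"illegal path: {filename}")
        return path


class ApiServer:
    """Wires API routes to the two services instead of doing everything inline."""

    def __init__(self, inference: DistributedInferenceService, files: FileService):
        self.inference = inference
        self.files = files
```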

* refactor: enhance API structure and file handling in server

- Introduced APIRouter for modular route management in the ApiServer class.
- Updated task creation and file download endpoints to improve clarity and functionality.
- Implemented a new method for streaming file responses with proper MIME type handling.
- Refactored task request schema to auto-generate task IDs and handle optional video save paths.
- Improved error handling and logging for better debugging and user feedback.
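A minimal stdlib sketch of two of the ideas above — auto-generated task IDs and MIME-aware file downloads. Field names are assumptions (the real schema likely uses pydantic models):

```python
import mimetypes
import uuid
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TaskRequest:
    """Sketch of the task request schema: task_id is generated
    automatically and the video save path is optional."""

    prompt: str
    task_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    video_save_path: Optional[str] = None


def guess_media_type(filename: str) -> str:
    """Pick a MIME type for a file download, with a safe fallback."""
    media_type, _ = mimetypes.guess_type(filename)
    return media_type or "application/octet-stream"
```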

* feat: add configurable parameters for video generation

- Introduced new parameters: infer_steps, target_video_length, and seed to the API and task request schema.
- Updated DefaultRunner and VideoGenerationService to handle the new parameters for enhanced video generation control.
- Improved default values for parameters to ensure consistent behavior.
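Overlaying user-supplied values on a defaults table is one way to get the consistent behavior described above. The parameter names come from this commit; the default values below are placeholders, not the project's actual defaults:

```python
# Placeholder defaults, not the project's actual values.
GENERATION_DEFAULTS = {
    "infer_steps": 40,
    "target_video_length": 81,
    "seed": 42,
}


def resolve_generation_params(request: dict) -> dict:
    """Overlay user-supplied parameters on the defaults so every task
    sees a fully populated, consistent set of values."""
    params = dict(GENERATION_DEFAULTS)
    params.update({k: v for k, v in request.items() if v is not None})
    return params
```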

* refactor: enhance profiling context for async support

* refactor: improve signal handling in API server
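Improved signal handling in a server of this kind usually means converting SIGINT/SIGTERM into an orderly shutdown rather than dying mid-request. A minimal sketch (the handler and flag are assumptions, not this commit's actual code):

```python
import signal


class GracefulShutdown:
    """Flip a flag on SIGINT/SIGTERM so the server can stop its
    inference workers cleanly before exiting (sketch)."""

    def __init__(self):
        self.should_exit = False

    def install(self):
        # Must run in the main thread, per the signal module's rules.
        signal.signal(signal.SIGINT, self._handle)
        signal.signal(signal.SIGTERM, self._handle)

    def _handle(self, signum, frame):
        self.should_exit = True
```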

* feat: enhance video generation capabilities with audio support

* refactor: improve subprocess call for audio-video merging in wan_audio_runner.py
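Audio-video merging is typically an ffmpeg subprocess call. This sketch only builds the argument list and wraps the call; the flags are standard ffmpeg options, but the runner's actual command in wan_audio_runner.py is an assumption:

```python
import subprocess


def build_merge_cmd(video_path: str, audio_path: str, output_path: str) -> list:
    """Copy the video stream, encode audio to AAC, stop at the shorter input."""
    return [
        "ffmpeg", "-y",
        "-i", video_path,
        "-i", audio_path,
        "-c:v", "copy",
        "-c:a", "aac",
        "-shortest",
        output_path,
    ]


def merge_audio_video(video_path: str, audio_path: str, output_path: str):
    # check=True raises CalledProcessError instead of failing silently;
    # capture_output keeps ffmpeg's noisy stderr out of the server log.
    subprocess.run(build_merge_cmd(video_path, audio_path, output_path),
                   check=True, capture_output=True)
```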

* refactor: enhance API server argument parsing and improve code readability

* refactor: enhance logging and improve code comments for clarity

* refactor: update response model for task listing endpoint to return a dictionary
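A dictionary response for the listing endpoint might look like this sketch; the field names are assumptions, not the project's actual response model:

```python
def list_tasks(tasks: dict) -> dict:
    """Envelope the task table in a dictionary keyed by task ID instead of
    a bare list, so clients can index a task without scanning (sketch)."""
    return {
        "count": len(tasks),
        "tasks": {tid: info.get("status", "unknown") for tid, info in tasks.items()},
    }
```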

* docs: update API endpoints and improve documentation clarity

* refactor: update API endpoints in scripts for task management and remove unused code

* fix: pre-commit
parent 1e422663
#!/bin/bash
# Set paths
lightx2v_path=/mnt/aigc/users/lijiaqi2/ComfyUI/custom_nodes/ComfyUI-Lightx2vWrapper/lightx2v
model_path=/mnt/aigc/users/lijiaqi2/wan_model/Wan2.1-I2V-14B-720P-cfg

# Check parameters
if [ -z "${CUDA_VISIBLE_DEVICES}" ]; then
    cuda_devices=2,3
    echo "Warn: CUDA_VISIBLE_DEVICES is not set, using default value: ${cuda_devices}"
    export CUDA_VISIBLE_DEVICES=${cuda_devices}
fi

if [ -z "${lightx2v_path}" ]; then
    echo "Error: lightx2v_path is not set. Please set this variable first."
    exit 1
fi

if [ -z "${model_path}" ]; then
    echo "Error: model_path is not set. Please set this variable first."
    exit 1
fi

# Set environment variables
export TOKENIZERS_PARALLELISM=false
export PYTHONPATH=${lightx2v_path}:$PYTHONPATH
export ENABLE_PROFILING_DEBUG=true
export ENABLE_GRAPH_MODE=false
export DTYPE=BF16

echo "=========================================="
echo "Starting distributed inference API server"
echo "Model path: $model_path"
echo "CUDA devices: $CUDA_VISIBLE_DEVICES"
echo "API port: 8000"
echo "=========================================="

# Start the API server together with the distributed inference service
python -m lightx2v.api_server_dist \
    --model_cls wan2.1 \
    --task i2v \
    --model_path $model_path \
    --config_json ${lightx2v_path}/configs/wan/wan_i2v_dist.json \
    --port 8000 \
    --start_inference \
    --nproc_per_node 2

echo "Service stopped"
@@ ... @@
 #!/bin/bash
-# set path and first
+# Set paths
 lightx2v_path=
 model_path=
-# check section
+# Check parameters
 if [ -z "${CUDA_VISIBLE_DEVICES}" ]; then
 cuda_devices=0
-echo "Warn: CUDA_VISIBLE_DEVICES is not set, using default value: ${cuda_devices}, change at shell script or set env variable."
+echo "Warn: CUDA_VISIBLE_DEVICES is not set, using default value: ${cuda_devices}"
 export CUDA_VISIBLE_DEVICES=${cuda_devices}
 fi
@@ -21,17 +21,28 @@ if [ -z "${model_path}" ]; then
 exit 1
 fi
+# Set environment variables
 export TOKENIZERS_PARALLELISM=false
 export PYTHONPATH=${lightx2v_path}:$PYTHONPATH
 export ENABLE_PROFILING_DEBUG=true
 export ENABLE_GRAPH_MODE=false
 export DTYPE=BF16
+echo "=========================================="
+echo "Starting distributed inference API server"
+echo "Model path: $model_path"
+echo "CUDA devices: $CUDA_VISIBLE_DEVICES"
+echo "API port: 8000"
+echo "=========================================="
+# Start API server with distributed inference service
 python -m lightx2v.api_server \
 --model_cls wan2.1 \
---task t2v \
+--task i2v \
 --model_path $model_path \
---config_json ${lightx2v_path}/configs/wan/wan_t2v.json \
---port 8000
+--config_json ${lightx2v_path}/configs/wan/wan_i2v_dist.json \
+--port 8000 \
+--start_inference \
+--nproc_per_node 1
+echo "Service stopped"