Enable 720p model inference on low-spec GPUs/CPUs and accelerate quantized T5/CLIP models with vLLM operators

Co-authored-by: gushiqiao <gushiqiao@sensetime.com>