Enable 720p model inference on low-spec GPUs/CPUs and accelerate quantized T5/CLIP models with vLLM operators