LMDeploy supports LLM inference with 4-bit weights. It requires an NVIDIA GPU with compute capability sm80 or higher, such as the A10, A100, or GeForce 30/40 series.
Before running inference, please ensure that lmdeploy (>= v0.0.4) is installed.
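The two prerequisites above (GPU compute capability and lmdeploy version) can be checked before running inference. The following is a minimal sketch, not part of LMDeploy itself: the helper names `parse_version`, `capability_ok`, `version_ok`, and `installed_lmdeploy_version` are hypothetical, and the GPU query assumes PyTorch is installed and falls back to a message if it is not.

```python
import re
from importlib import metadata

# Thresholds taken from the requirements stated above.
MIN_COMPUTE_CAPABILITY = (8, 0)   # sm80
MIN_LMDEPLOY_VERSION = (0, 0, 4)  # v0.0.4


def parse_version(text):
    """Return the leading numeric components of a version string as a tuple.

    Suffixes such as "rc1" are ignored, so a pre-release compares equal to
    its final release. That is a simplification for this sketch.
    """
    match = re.match(r"v?(\d+(?:\.\d+)*)", text.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def capability_ok(capability, minimum=MIN_COMPUTE_CAPABILITY):
    """True if a (major, minor) compute capability meets the minimum."""
    return tuple(capability) >= tuple(minimum)


def version_ok(version_text, minimum=MIN_LMDEPLOY_VERSION):
    """True if the given version string is at least the minimum version."""
    return parse_version(version_text) >= tuple(minimum)


def installed_lmdeploy_version():
    """Return the installed lmdeploy version string, or None if absent."""
    try:
        return metadata.version("lmdeploy")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_lmdeploy_version()
    if version is None:
        print("lmdeploy is not installed")
    else:
        print(f"lmdeploy {version}: ok={version_ok(version)}")

    try:
        import torch  # assumption: PyTorch is used only to query the GPU
    except ImportError:
        print("PyTorch not installed; skipping the GPU check")
    else:
        if torch.cuda.is_available():
            cap = torch.cuda.get_device_capability(0)
            print(f"GPU 0 is sm{cap[0]}{cap[1]}: ok={capability_ok(cap)}")
        else:
            print("no CUDA device detected")
```

The comparison logic is kept in pure functions so it can be exercised without a GPU; only the `__main__` section touches the local environment.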