Commit 734a433d authored by zhuwenwen's avatar zhuwenwen
Browse files

update supported models

parent 58042164
...@@ -8,38 +8,40 @@ vLLM是一个快速且易于使用的LLM推理和服务库,使用PageAttention ...@@ -8,38 +8,40 @@ vLLM是一个快速且易于使用的LLM推理和服务库,使用PageAttention
## 支持模型结构列表 ## 支持模型结构列表
| 结构 | 模型 | FP16/BF16 | AWQ | GPTQ |
| :------: | :------: | :------: | :------: |:------: | | 结构 | 模型 | FP16/BF16 | AWQ | GPTQ | 支持版本 | 是否优化 |
| LlamaForCausalLM | Llama 3.2, Llama 3.1,Llama 3,Llama 2,Llama,Yi,Codellama,DeepSeek-R1-Distill-Llama | Yes | Yes | Yes | | :------: | :------: | :------: | :------: |:------: | :------: |:------: |
| Llama4ForConditionalGeneration | Llama 4 | No/Yes | - | - | | LlamaForCausalLM | Llama 3.2, Llama 3.1,Llama 3,Llama 2,Llama,Yi,Codellama,DeepSeek-R1-Distill-Llama | Yes | Yes | Yes | v0.5.0,Llama 3.2>=v0.6.2 | Yes |
| QWenLMHeadModel | QWen,Qwen-VL | Yes | Yes | Yes | | Llama4ForConditionalGeneration | Llama 4 | No/Yes | - | - | v0.8.5.post1 | No |
| Qwen2ForCausalLM | QWen2,QWen1.5,CodeQwen1.5,DeepSeek-R1-Distill-Qwen,gte_Qwen2-1.5B-instruct | Yes | Yes | Yes | | QWenLMHeadModel | QWen,Qwen-VL | Yes | Yes | Yes | v0.5.0,Qwen-VL>=v0.6.2 | Yes |
| Qwen3ForCausalLM | QWen3 | Yes | - | - | | Qwen2ForCausalLM | QWen2,QWen1.5,CodeQwen1.5,DeepSeek-R1-Distill-Qwen,gte_Qwen2-1.5B-instruct | Yes | Yes | Yes | v0.5.0,gte>=v0.7.2 | Yes |
| Qwen3MoeForCausalLM | QWen3MoE | Yes | - | - | | Qwen3ForCausalLM | QWen3 | Yes | - | - | v0.8.4 | Yes |
| ChatGLMModel | glm-4v-9b,chatglm3,chatglm2 | Yes | No | Yes | | Qwen3MoeForCausalLM | QWen3MoE | Yes | - | - | v0.8.4 | Yes |
| Glm4ForCausalLM | GLM-4-0414 | No/Yes | - | - | | ChatGLMModel | glm-4v-9b,chatglm3,chatglm2 | Yes | No | Yes | v0.5.0 | Yes |
| DeepseekForCausalLM | Deepseek | Yes | No | - | | Glm4ForCausalLM | GLM-4-0414 | No/Yes | - | - | v0.8.5.post1 | Yes |
| DeepseekV2ForCausalLM | DeepSeek-V2 | Yes | No | - | | DeepseekForCausalLM | Deepseek | Yes | No | - | v0.5.0 | Yes |
| DeepseekV3ForCausalLM | DeepSeek-V3 | Yes | Yes | - | | DeepseekV2ForCausalLM | DeepSeek-V2 | Yes | No | - | v0.6.2 | Yes |
| BaiChuanForCausalLM | Baichuan2,Baichuan | Yes | Yes | - | | DeepseekVLV2ForCausalLM | DeepSeek-VL2 | Yes | No | - | v0.7.2 | Yes |
| BloomForCausalLM | BLOOM | Yes | No | Yes | | DeepseekV3ForCausalLM | DeepSeek-V3 | Yes | Yes | - | v0.7.2 | Yes |
| InternLMForCausalLM | InternLM | Yes | No | - | | BaiChuanForCausalLM | Baichuan2,Baichuan | Yes | Yes | - | v0.5.0 | Yes |
| InternLM2ForCausalLM | InternLM2 | Yes | No | - | | BloomForCausalLM | BLOOM | Yes | No | Yes | v0.5.0 | Yes |
| FalconForCausalLM | falcon | Yes | No | Yes | | InternLMForCausalLM | InternLM | Yes | No | - | v0.5.0 | Yes |
| TeleChat2ForCausalLM | TeleChat2 | Yes | No | - | | InternLM2ForCausalLM | InternLM2 | Yes | No | - | v0.5.0 | Yes |
| MiniCPMForCausalLM | MiniCPM | Yes | No | - | | FalconForCausalLM | falcon | Yes | No | Yes | v0.5.0 | Yes |
| MiniCPM3ForCausalLM | MiniCPM3 | Yes | No | - | | TeleChat2ForCausalLM | TeleChat2 | Yes | No | - | v0.7.2 | Yes |
| MixtralForCausalLM | Mixtral-8x7B,Mixtral-8x7B-Instruct | Yes | No | - | | MiniCPMForCausalLM | MiniCPM | Yes | No | - | v0.5.0 | Yes |
| Qwen2MoeForCausalLM | Qwen2-57B-A14B,Qwen2-57B-A14B-Instruct | Yes | No | - | | MiniCPM3ForCausalLM | MiniCPM3 | Yes | No | - | v0.6.2 | Yes |
| LlavaForConditionalGeneration | LLaMA,LLaMA-2,LLaMA-3 | Yes | No | - | | MixtralForCausalLM | Mixtral-8x7B,Mixtral-8x7B-Instruct | Yes | No | - | v0.5.0 | Yes |
| Qwen2VLForConditionalGeneration | Qwen2-VL | Yes | No | Yes | | Qwen2MoeForCausalLM | Qwen2-57B-A14B,Qwen2-57B-A14B-Instruct | Yes | No | - | v0.5.0 | No |
| Qwen2_5_VLForConditionalGeneration | Qwen.5-VL | Yes | No | Yes | | LlavaForConditionalGeneration | LLaMA,LLaMA-2,LLaMA-3 | Yes | No | - | v0.6.2 | No |
| Gemma3ForConditionalGeneration | Gemma 3 | Yes | - | - | | Qwen2VLForConditionalGeneration | Qwen2-VL | Yes | No | Yes | v0.6.2 | No |
| MiniCPMV | MiniCPM-V | Yes | No | - | | Qwen2_5_VLForConditionalGeneration | Qwen.5-VL | Yes | No | Yes | v0.7.2 | No |
| Phi3VForCausalLM | Phi-3.5-vision | Yes | No | - | | Gemma3ForConditionalGeneration | Gemma 3 | Yes | - | - | v0.8.5.post1 | No |
| BertModel | bge-large-zh-v1.5 | Yes | No | - | | MiniCPMV | MiniCPM-V | Yes | No | - | v0.6.2 | No |
| XLMRobertaModel | bge-m3 | Yes | No | - | | Phi3VForCausalLM | Phi-3.5-vision | Yes | No | - | v0.6.2 | No |
| XLMRobertaForSequenceClassification | bge-reranker-v2-m3 | Yes | No | - | | BertModel | bge-large-zh-v1.5 | Yes | No | - | v0.7.2 | No |
| XLMRobertaModel | bge-m3 | Yes | No | - | v0.7.2 | No |
| XLMRobertaForSequenceClassification | bge-reranker-v2-m3 | Yes | No | - | v0.7.2 | No |
## 安装 ## 安装
......
Markdown is supported
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment