## Multi-Modal Documentation

### 📚 Tutorial

1. [MLLM Deployment Documentation](mutlimodal-deployment.md)

### Multi-Modal Best Practice

A single round of dialogue can contain multiple images (or no images):

1. [Qwen-VL Best Practice](qwen-vl-best-practice.md)
2. [Qwen-Audio Best Practice](qwen-audio-best-practice.md)
3. [Deepseek-VL Best Practice](deepseek-vl-best-practice.md)
4. [InternLM-XComposer2 Best Practice](internlm-xcomposer2-best-practice.md)
5. [Phi3-Vision Best Practice](phi3-vision-best-practice.md)

A single round of dialogue can contain only one image:

1. [Llava Best Practice](llava-best-practice.md)
2. [Yi-VL Best Practice](yi-vl-best-practice.md)

The entire conversation revolves around one image:

1. [CogVLM Best Practice](cogvlm-best-practice.md), [CogVLM2 Best Practice](cogvlm2-best-practice.md), [GLM4V Best Practice](glm4v-best-practice.md)
2. [MiniCPM-V Best Practice](minicpm-v-best-practice.md)
3. [InternVL-Chat-V1.5 Best Practice](internvl-best-practice.md)