"A complete reference for the API is available in the [OpenAI API Reference](https://platform.openai.com/docs/guides/vision).\n",
"A complete reference for the API is available in the [OpenAI API Reference](https://platform.openai.com/docs/guides/vision).\n",
"This tutorial covers the vision APIs for vision language models.\n",
"This tutorial covers the vision APIs for vision language models.\n",
"\n",
"\n",
"SGLang supports vision language models such as Llama 3.2, LLaVA-OneVision, and QWen-VL2 \n",
"SGLang supports various vision language models such as Llama 3.2, LLaVA-OneVision, Qwen2.5-VL, Gemma3 and [more](https://docs.sglang.ai/references/supported_models): \n",
"As an alternative to the OpenAI API, you can also use the [SGLang offline engine](https://github.com/sgl-project/sglang/blob/main/examples/runtime/engine/offline_batch_inference_vlm.py)."
"As an alternative to the OpenAI API, you can also use the [SGLang offline engine](https://github.com/sgl-project/sglang/blob/main/examples/runtime/engine/offline_batch_inference_vlm.py)."
]
]
...
@@ -26,7 +29,7 @@
...
@@ -26,7 +29,7 @@
"\n",
"\n",
"Launch the server in your terminal and wait for it to initialize.\n",
"Launch the server in your terminal and wait for it to initialize.\n",
"\n",
"\n",
"**Remember to add** `--chat-template llama_3_vision` **to specify the vision chat template, otherwise the server only supports text, and performance degradation may occur.**\n",
"**Remember to add** `--chat-template llama_3_vision` **to specify the [vision chat template](https://docs.sglang.ai/backend/openai_api_vision.html#Chat-Template), otherwise, the server will only support text (images won’t be passed in), which can lead to degraded performance.**\n",
"\n",
"\n",
"We need to specify `--chat-template` for vision language models because the chat template provided in Hugging Face tokenizer only supports text."
"We need to specify `--chat-template` for vision language models because the chat template provided in Hugging Face tokenizer only supports text."
]
]
...
@@ -259,7 +262,10 @@
...
@@ -259,7 +262,10 @@
"We list popular vision models with their chat templates:\n",
"We list popular vision models with their chat templates:\n",