"This tutorial shows how to format model outputs using constrained decoding in SGLang."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Constrained Decoding\n",
"\n",
"With SGLang, You can define a JSON schema, EBNF or regular expression to constrain the model's output.\n",
"\n",
"[JSON Schema](https://json-schema.org/): Formats output into structured JSON objects with validation rules.\n",
"\n",
"[EBNF (Extended Backus-Naur Form)](https://en.wikipedia.org/wiki/Extended_Backus%E2%80%93Naur_form): Defines complex syntax rules, especially for recursive patterns like nested structures.\n",
"\n",
"[Regular Expressions](https://en.wikipedia.org/wiki/Regular_expression): Matches text patterns for simple validation and formatting.\n",
"\n",
"### Constrained Decoding Backends\n",
"\n",
"SGLang has two backends: [Outlines](https://github.com/dottxt-ai/outlines) (default) and [XGrammar](https://blog.mlc.ai/2024/11/22/achieving-efficient-flexible-portable-structured-generation-with-xgrammar). We suggest using XGrammar whenever possible for its better performance. For more details, see [XGrammar technical overview](https://blog.mlc.ai/2024/11/22/achieving-efficient-flexible-portable-structured-generation-with-xgrammar).\n",
"\n",
"* Xgrammar Backend: JSON and EBNF\n",
"* Outlines Backend: JSON and regular expressions"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## OpenAI Compatible API"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To use Xgrammar, simply add `--grammar-backend xgrammar` when launching the server. If no backend is specified, Outlines will be used as the default."