Unverified Commit 4592afc2 authored by simveit, committed by GitHub

Docs: Fix layout to docs (#3733)

parent 9af0e21e
@@ -6,7 +6,7 @@
    "source": [
     "# Tool and Function Calling\n",
     "\n",
-    "This guide demonstrates how to use SGLang’s **Tool Calling** functionality."
+    "This guide demonstrates how to use SGLang’s [Function calling](https://platform.openai.com/docs/guides/function-calling) functionality."
    ]
   },
   {
...
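For context on the hunk above: the linked function-calling guide boils down to passing a `tools` list through the OpenAI-compatible endpoint. A minimal sketch, assuming an SGLang server is already running at `http://localhost:30000`; the model name and the `get_weather` tool are illustrative placeholders, not part of the docs being edited:

```python
# Hedged sketch of tool calling via the OpenAI-compatible API.
# Assumes an SGLang server is already running at http://localhost:30000;
# the model name and the get_weather tool are illustrative placeholders.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:30000/v1", api_key="None")

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
]

response = client.chat.completions.create(
    model="meta-llama/Llama-3.1-8B-Instruct",  # whatever model the server was launched with
    messages=[{"role": "user", "content": "What is the weather in Berlin?"}],
    tools=tools,
)
# If the model decides to call the tool, the arguments arrive as a JSON string.
print(response.choices[0].message.tool_calls)
```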
@@ -15,7 +15,7 @@
    "- `completions`\n",
    "- `batches`\n",
    "\n",
-   "Check out other tutorials to learn about vision APIs for vision-language models and embedding APIs for embedding models."
+   "Check out other tutorials to learn about [vision APIs](https://docs.sglang.ai/backend/openai_api_vision.html) for vision-language models and [embedding APIs](https://docs.sglang.ai/backend/openai_api_embeddings.html) for embedding models."
    ]
   },
   {
...
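The `completions` endpoint listed above is the plain-text counterpart to chat. A minimal sketch under the same assumptions as before (a running local server, illustrative model name):

```python
# Hedged sketch of the completions endpoint; same assumptions as above
# (a local SGLang server, illustrative model name).
from openai import OpenAI

client = OpenAI(base_url="http://localhost:30000/v1", api_key="None")

resp = client.completions.create(
    model="meta-llama/Llama-3.1-8B-Instruct",
    prompt="The capital of France is",
    max_tokens=16,
    temperature=0,
)
print(resp.choices[0].text)
```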
@@ -13,7 +13,9 @@
    "SGLang supports vision language models such as Llama 3.2, LLaVA-OneVision, and QWen-VL2 \n",
    "- [meta-llama/Llama-3.2-11B-Vision-Instruct](https://huggingface.co/meta-llama/Llama-3.2-11B-Vision-Instruct) \n",
    "- [lmms-lab/llava-onevision-qwen2-72b-ov-chat](https://huggingface.co/lmms-lab/llava-onevision-qwen2-72b-ov-chat) \n",
-   "- [Qwen/Qwen2-VL-7B-Instruct](https://huggingface.co/Qwen/Qwen2-VL-7B-Instruct) "
+   "- [Qwen/Qwen2-VL-7B-Instruct](https://huggingface.co/Qwen/Qwen2-VL-7B-Instruct) \n",
+   "\n",
+   "As an alternative to the OpenAI API, you can also use the [SGLang offline engine](https://github.com/sgl-project/sglang/blob/main/examples/runtime/engine/offline_batch_inference_vlm.py)."
    ]
   },
   {
...
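For the vision API linked in the new text, images travel as `image_url` content parts in an otherwise ordinary chat request. A hedged sketch, assuming a local server is serving one of the listed vision-language models; the image URL is a placeholder:

```python
# Hedged sketch of a vision request: the image is sent as an image_url
# content part. Assumes a local SGLang server serving a vision-language
# model such as Qwen/Qwen2-VL-7B-Instruct; the image URL is a placeholder.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:30000/v1", api_key="None")

resp = client.chat.completions.create(
    model="Qwen/Qwen2-VL-7B-Instruct",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this image in one sentence."},
                {"type": "image_url", "image_url": {"url": "https://example.com/cat.png"}},
            ],
        }
    ],
)
print(resp.choices[0].message.content)
```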
@@ -10,7 +10,7 @@ Online quantization dynamically computes scaling parameters—such as the maximu
 ## Offline Quantization
-To load already quantized models, simply load the model weights and config. **Again, if the model has been quantized offline, there's no need to add the "--quantization" argument when starting the engine. The quantization method will be parsed from the downloaded Hugging Face config. For example, DeepSeek V3/R1 models are already in FP8, so do not add redundant parameters.**
+To load already quantized models, simply load the model weights and config. **Again, if the model has been quantized offline, there's no need to add the `--quantization` argument when starting the engine. The quantization method will be parsed from the downloaded Hugging Face config. For example, DeepSeek V3/R1 models are already in FP8, so do not add redundant parameters.**
 ```bash
 python3 -m sglang.launch_server \
...
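To illustrate the note above: an offline-quantized checkpoint is loaded like any other model, with no quantization argument anywhere, since the method is read from the checkpoint's Hugging Face config. A sketch using the SGLang offline engine; the FP8 checkpoint name is an illustrative placeholder:

```python
# Hedged sketch: loading an offline-quantized checkpoint with the SGLang
# offline engine. Note there is no quantization argument; the method is
# parsed from the checkpoint's Hugging Face config. The FP8 model name
# below is an illustrative placeholder.
import sglang as sgl

llm = sgl.Engine(model_path="neuralmagic/Meta-Llama-3.1-8B-Instruct-FP8")
outputs = llm.generate(["The capital of France is"])
print(outputs[0]["text"])
```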
@@ -4,7 +4,7 @@ SGLang provides several optimizations specifically designed for the DeepSeek mod
 ## Launch DeepSeek V3 with SGLang
-SGLang is recognized as one of the top engines for [DeepSeek model inference](https://github.com/sgl-project/sglang/tree/main/benchmark/deepseek_v3).
+SGLang is recognized as one of the top engines for [DeepSeek model inference](https://github.com/sgl-project/sglang/tree/main/benchmark/deepseek_v3). Refer to [installation and launch](https://github.com/sgl-project/sglang/tree/main/benchmark/deepseek_v3#installation--launch) to learn how to run fast inference of DeepSeek V3/R1 with SGLang.
 ### Download Weights
...
@@ -22,7 +22,7 @@ python -m sglang.launch_server --model-path meta-llama/Meta-Llama-3.1-405B-Instr
 ## DeepSeek V3/R1
-Please refer to [DeepSeek documents for reference.](https://docs.sglang.ai/references/deepseek.html#running-examples-on-multi-node).
+Please refer to the [DeepSeek documents](https://docs.sglang.ai/references/deepseek.html#running-examples-on-multi-node).
 ## Multi-Node Inference on SLURM
...
@@ -7,14 +7,14 @@ The router is an independent Python package, and it can be used as a drop-in rep
 ## Installation
 ```bash
-$ pip install sglang-router
+pip install sglang-router
 ```
 Detailed usage of the router can be found in [launch_router](https://github.com/sgl-project/sglang/blob/main/sgl-router/py_src/sglang_router/launch_router.py) and [launch_server](https://github.com/sgl-project/sglang/blob/main/sgl-router/py_src/sglang/launch_server.py). Also, you can directly run the following commands to see the usage of the router.
 ```bash
-$ python -m sglang_router.launch_server --help
-$ python -m sglang_router.launch_router --help
+python -m sglang_router.launch_server --help
+python -m sglang_router.launch_router --help
 ```
 The router supports two working modes:
...
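To see the drop-in behavior described in this hunk in practice: once the router fronts one or more workers, clients talk to it exactly as they would to a single `launch_server` instance. A sketch assuming the router listens on port 30000; the worker URL, port, and model name are illustrative, and the linked `launch_router.py` is the authority on the actual flags:

```python
# Hedged sketch: the router is a drop-in replacement for a single server,
# so an ordinary OpenAI client pointed at the router port just works.
# Assumes the router was started in front of running workers, e.g. via
# `python -m sglang_router.launch_router --worker-urls http://localhost:31000`
# (see the linked launch_router.py for the real flags); the port and
# model name here are illustrative.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:30000/v1", api_key="None")

resp = client.chat.completions.create(
    model="meta-llama/Llama-3.1-8B-Instruct",
    messages=[{"role": "user", "content": "Hello from behind the router!"}],
)
print(resp.choices[0].message.content)
```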