Commit 2134f089 authored by Lianmin Zheng

Fix links in the docs (#1878)

parent a54f278d
@@ -20,7 +20,7 @@ curl http://localhost:30000/generate \
```
}'
```
-Learn more about the argument specification, streaming, and multi-modal support [here](https://sgl-project.github.io/sampling_params.html).
+Learn more about the argument specification, streaming, and multi-modal support [here](https://sgl-project.github.io/references/sampling_params.html).
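For reference, the `/generate` request whose tail is shown in the snippet above can also be issued from Python. A minimal sketch, assuming a server already running on `localhost:30000`; the prompt text and `sampling_params` values are illustrative placeholders:

```python
import requests

# Call the /generate endpoint from the curl snippet above.
# The prompt and sampling_params values here are illustrative.
response = requests.post(
    "http://localhost:30000/generate",
    json={
        "text": "Once upon a time,",
        "sampling_params": {"temperature": 0, "max_new_tokens": 32},
    },
)
print(response.json())
```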
## OpenAI Compatible API
In addition, the server supports OpenAI-compatible APIs.
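Because the APIs are OpenAI-compatible, the official `openai` Python client can talk to the server directly. A minimal sketch, assuming the server above is listening on `localhost:30000`; the placeholder API key and example message are assumptions:

```python
import openai

# Point the official OpenAI client at the local SGLang server.
client = openai.Client(base_url="http://localhost:30000/v1", api_key="None")

response = client.chat.completions.create(
    model="meta-llama/Meta-Llama-3-8B-Instruct",
    messages=[{"role": "user", "content": "Name three U.S. national parks."}],
    temperature=0,
    max_tokens=64,
)
print(response.choices[0].message.content)
```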
@@ -74,7 +74,7 @@ python -m sglang.launch_server --model-path meta-llama/Meta-Llama-3-8B-Instruct
```
python -m sglang.launch_server --model-path meta-llama/Meta-Llama-3-8B-Instruct --mem-fraction-static 0.7
```
-- See [hyperparameter tuning](https://sgl-project.github.io/hyperparameter_tuning.html) on tuning hyperparameters for better performance.
+- See [hyperparameter tuning](https://sgl-project.github.io/references/hyperparameter_tuning.html) on tuning hyperparameters for better performance.
- If you see out-of-memory errors during prefill for long prompts, try to set a smaller chunked prefill size.
```
python -m sglang.launch_server --model-path meta-llama/Meta-Llama-3-8B-Instruct --chunked-prefill-size 4096
```
@@ -161,7 +161,7 @@ You can view the full example [here](https://github.com/sgl-project/sglang/tree/
- gte-Qwen2
- `python -m sglang.launch_server --model-path Alibaba-NLP/gte-Qwen2-7B-instruct --is-embedding`
-Instructions for supporting a new model are [here](https://sgl-project.github.io/model_support.html).
+Instructions for supporting a new model are [here](https://sgl-project.github.io/references/model_support.html).
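A model launched with `--is-embedding`, as in the gte-Qwen2 command above, should be queryable through the OpenAI-compatible embeddings API. A minimal sketch; the base URL, placeholder key, and input string are assumptions:

```python
import openai

client = openai.Client(base_url="http://localhost:30000/v1", api_key="None")

# Request an embedding from the server launched with --is-embedding above.
response = client.embeddings.create(
    model="Alibaba-NLP/gte-Qwen2-7B-instruct",
    input="What is the capital of France?",
)
print(len(response.data[0].embedding))  # vector dimensionality
```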
### Use Models From ModelScope
<details>
...
+.. _custom-chat-template:
# Custom Chat Template in SGLang Runtime
**NOTE**: There are two chat template systems in the SGLang project. This document is about setting a custom chat template for the OpenAI-compatible API server (defined at [conversation.py](https://github.com/sgl-project/sglang/blob/main/python/sglang/srt/conversation.py)). It is NOT related to the chat template used in the SGLang language frontend (defined at [chat_template.py](https://github.com/sgl-project/sglang/blob/main/python/sglang/lang/chat_template.py)).
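To make the note above concrete, here is a hedged sketch of writing a server-side template as a JSON file and pointing the server at it with `--chat-template`. The exact field set is an assumption; `conversation.py`, linked above, defines the authoritative schema:

```python
import json

# A hypothetical ChatML-style template for the OpenAI-compatible server.
# Field names are illustrative; check conversation.py for the real schema.
template = {
    "name": "my_model",
    "system": "<|im_start|>system",
    "user": "<|im_start|>user",
    "assistant": "<|im_start|>assistant",
    "sep_style": "CHATML",
    "sep": "<|im_end|>",
    "stop_str": ["<|im_end|>"],
}

with open("my_model_template.json", "w") as f:
    json.dump(template, f, indent=2)

# Then launch the server with the custom template, e.g.:
#   python -m sglang.launch_server --model-path <model> \
#       --chat-template ./my_model_template.json
```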
...
-Here’s the text with corrected grammar and refined phrasing in U.S. English:
# Frequently Asked Questions
## The results are not deterministic, even with a temperature of 0
@@ -14,4 +12,4 @@ We are still investigating the root causes and potential solutions. In the short
We have two issues to track our progress:
- The deterministic mode is tracked at [https://github.com/sgl-project/sglang/issues/1729](https://github.com/sgl-project/sglang/issues/1729).
- The per-request random seed is tracked at [https://github.com/sgl-project/sglang/issues/1335](https://github.com/sgl-project/sglang/issues/1335).
\ No newline at end of file
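One way to observe the behavior this FAQ entry describes is to send the same greedy (temperature 0) request repeatedly and compare completions. A minimal sketch, assuming the `/generate` endpoint from earlier in these docs and a `"text"` field in its JSON response:

```python
import requests

# Send an identical greedy request several times; under dynamic batching,
# completions may still differ slightly, as described above.
outputs = set()
for _ in range(5):
    r = requests.post(
        "http://localhost:30000/generate",
        json={
            "text": "The capital of France is",
            "sampling_params": {"temperature": 0, "max_new_tokens": 16},
        },
    )
    outputs.add(r.json()["text"])

print(f"{len(outputs)} distinct completion(s) across 5 identical requests")
```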
+.. _sampling-parameters:
# Sampling Parameters in SGLang Runtime
This doc describes the sampling parameters of the SGLang Runtime. These parameters are passed to `/generate`, the low-level endpoint of the runtime.
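As a quick orientation before the full parameter list (elided in this diff), here is a sketch of a `sampling_params` payload with a few commonly used fields; treat the exact field set as an assumption and defer to the parameter list in this document:

```python
# An illustrative sampling_params payload for the low-level /generate endpoint.
sampling_params = {
    "temperature": 0.7,     # softmax temperature; 0 selects greedy decoding
    "top_p": 0.95,          # nucleus sampling cutoff
    "top_k": 40,            # sample only from the k most likely tokens
    "max_new_tokens": 128,  # cap on generated length
    "stop": ["\n\n"],       # stop strings that end generation early
}
```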
...