Commit a5e0defb authored by Xuehai Pan, committed by GitHub

minor: Add basic editorconfig and pre-commit hooks to enforce style for whitespaces (#1926)

parent 96766101
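
The hunks below change only whitespace: trailing spaces are removed and files that ended without a newline (the `\ No newline at end of file` markers) gain one. As a rough illustration of that kind of normalization, here is a minimal Python sketch. The function name `normalize_whitespace` is hypothetical, and this is not the implementation of the hooks configured in this commit, whose configuration is not shown on this page.

```python
def normalize_whitespace(text: str) -> str:
    """Strip trailing whitespace on every line and end the text with one newline."""
    # rstrip() removes trailing spaces and tabs from each line
    lines = [line.rstrip() for line in text.splitlines()]
    # Drop blank lines at the very end so the file ends with a single newline
    while lines and lines[-1] == "":
        lines.pop()
    # An empty file stays empty; otherwise add exactly one final newline
    return "\n".join(lines) + "\n" if lines else ""


if __name__ == "__main__":
    # A file whose last line lacks a newline gains one
    print(repr(normalize_whitespace("wget data.tar  \ntar xf data.tar")))
```

Applied to a file that is already clean, the function returns its input unchanged, so a check built on it can be run repeatedly without producing further diffs.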
File mode changed from 100644 to 100755
...@@ -30,4 +30,4 @@ python3 bench_other.py --backend guidance --num-questions 25 --parallel 1 --n-ct ...@@ -30,4 +30,4 @@ python3 bench_other.py --backend guidance --num-questions 25 --parallel 1 --n-ct
``` ```
python3 bench_other.py --backend lmql --num-questions 25 --parallel 1 python3 bench_other.py --backend lmql --num-questions 25 --parallel 1
``` ```
\ No newline at end of file
wget https://people.eecs.berkeley.edu/~hendrycks/data.tar wget https://people.eecs.berkeley.edu/~hendrycks/data.tar
tar xf data.tar tar xf data.tar
\ No newline at end of file
...@@ -43,7 +43,7 @@ python3 bench_other.py --num-questions 8 --backend guidance --parallel 1 --n-ctx ...@@ -43,7 +43,7 @@ python3 bench_other.py --num-questions 8 --backend guidance --parallel 1 --n-ctx
``` ```
### Benchmark lmql ### Benchmark lmql
``` ```
python3 bench_other.py --num-questions 64 --backend lmql --parallel 1 python3 bench_other.py --num-questions 64 --backend lmql --parallel 1
``` ```
...@@ -31,4 +31,4 @@ python3 bench_other.py --num-questions 100 --backend guidance --parallel 1 --n-c ...@@ -31,4 +31,4 @@ python3 bench_other.py --num-questions 100 --backend guidance --parallel 1 --n-c
``` ```
python3 bench_other.py --num-questions 100 --backend lmql --parallel 1 python3 bench_other.py --num-questions 100 --backend lmql --parallel 1
``` ```
\ No newline at end of file
...@@ -11,7 +11,7 @@ from sglang.utils import dump_state_text, read_jsonl ...@@ -11,7 +11,7 @@ from sglang.utils import dump_state_text, read_jsonl
def get_prompt(question): def get_prompt(question):
prompt = ( prompt = (
"""Solve a question answering task with interleaving Thought, Action, Observation steps. Thought can reason about the current situation, and Action can be three types: """Solve a question answering task with interleaving Thought, Action, Observation steps. Thought can reason about the current situation, and Action can be three types:
(1) Search[entity], which searches the exact entity on Wikipedia and returns the first paragraph if it exists. If not, it will return some similar entities to search. (1) Search[entity], which searches the exact entity on Wikipedia and returns the first paragraph if it exists. If not, it will return some similar entities to search.
(2) Lookup[keyword], which returns the next sentence containing keyword in the current passage. (2) Lookup[keyword], which returns the next sentence containing keyword in the current passage.
(3) Finish[answer], which returns the answer and finishes the task. (3) Finish[answer], which returns the answer and finishes the task.
...@@ -37,7 +37,7 @@ Action 1: Search[Milhouse] ...@@ -37,7 +37,7 @@ Action 1: Search[Milhouse]
Observation 1: Milhouse Mussolini Van Houten is a recurring character in the Fox animated television series The Simpsons voiced by Pamela Hayden and created by Matt Groening. Observation 1: Milhouse Mussolini Van Houten is a recurring character in the Fox animated television series The Simpsons voiced by Pamela Hayden and created by Matt Groening.
Thought 2: The paragraph does not tell who Milhouse is named after, maybe I can look up "named after". Thought 2: The paragraph does not tell who Milhouse is named after, maybe I can look up "named after".
Action 2: Lookup[named after] Action 2: Lookup[named after]
Observation 2: (Result 1 / 1) Milhouse was named after U.S. president Richard Nixon, whose middle name was Milhous. Observation 2: (Result 1 / 1) Milhouse was named after U.S. president Richard Nixon, whose middle name was Milhous.
Thought 3: Milhouse was named after U.S. president Richard Nixon, so the answer is Richard Nixon. Thought 3: Milhouse was named after U.S. president Richard Nixon, so the answer is Richard Nixon.
Action 3: Finish[Richard Nixon] Action 3: Finish[Richard Nixon]
Question: Which documentary is about Finnish rock groups, Adam Clayton Powell or The Saimaa Gesture? Question: Which documentary is about Finnish rock groups, Adam Clayton Powell or The Saimaa Gesture?
...@@ -62,10 +62,10 @@ Action 3: Finish[director, screenwriter, actor] ...@@ -62,10 +62,10 @@ Action 3: Finish[director, screenwriter, actor]
Question: Which magazine was started first Arthur's Magazine or First for Women? Question: Which magazine was started first Arthur's Magazine or First for Women?
Thought 1: I need to search Arthur's Magazine and First for Women, and find which was started first. Thought 1: I need to search Arthur's Magazine and First for Women, and find which was started first.
Action 1: Search[Arthur's Magazine] Action 1: Search[Arthur's Magazine]
Observation 1: Arthur's Magazine (1844-1846) was an American literary periodical published in Philadelphia in the 19th century. Observation 1: Arthur's Magazine (1844-1846) was an American literary periodical published in Philadelphia in the 19th century.
Thought 2: Arthur's Magazine was started in 1844. I need to search First for Women next. Thought 2: Arthur's Magazine was started in 1844. I need to search First for Women next.
Action 2: Search[First for Women] Action 2: Search[First for Women]
Observation 2: First for Women is a woman's magazine published by Bauer Media Group in the USA.[1] The magazine was started in 1989. Observation 2: First for Women is a woman's magazine published by Bauer Media Group in the USA.[1] The magazine was started in 1989.
Thought 3: First for Women was started in 1989. 1844 (Arthur's Magazine) < 1989 (First for Women), so Arthur's Magazine was started first. Thought 3: First for Women was started in 1989. 1844 (Arthur's Magazine) < 1989 (First for Women), so Arthur's Magazine was started first.
Action 3: Finish[Arthur's Magazine] Action 3: Finish[Arthur's Magazine]
Question: Were Pavel Urysohn and Leonid Levin known for the same type of work? Question: Were Pavel Urysohn and Leonid Levin known for the same type of work?
...@@ -74,8 +74,8 @@ Action 1: Search[Pavel Urysohn] ...@@ -74,8 +74,8 @@ Action 1: Search[Pavel Urysohn]
Observation 1: Pavel Samuilovich Urysohn (February 3, 1898 – August 17, 1924) was a Soviet mathematician who is best known for his contributions in dimension theory. Observation 1: Pavel Samuilovich Urysohn (February 3, 1898 – August 17, 1924) was a Soviet mathematician who is best known for his contributions in dimension theory.
Thought 2: Pavel Urysohn is a mathematician. I need to search Leonid Levin next and find its type of work. Thought 2: Pavel Urysohn is a mathematician. I need to search Leonid Levin next and find its type of work.
Action 2: Search[Leonid Levin] Action 2: Search[Leonid Levin]
Observation 2: Leonid Anatolievich Levin is a Soviet-American mathematician and computer scientist. Observation 2: Leonid Anatolievich Levin is a Soviet-American mathematician and computer scientist.
Thought 3: Leonid Levin is a mathematician and computer scientist. So Pavel Urysohn and Leonid Levin have the same type of work. Thought 3: Leonid Levin is a mathematician and computer scientist. So Pavel Urysohn and Leonid Levin have the same type of work.
Action 3: Finish[yes] Action 3: Finish[yes]
""" """
+ question + question
......
...@@ -13,7 +13,7 @@ from sglang.utils import dump_state_text, read_jsonl ...@@ -13,7 +13,7 @@ from sglang.utils import dump_state_text, read_jsonl
@sgl.function @sgl.function
def webthink(s, question, triplets): def webthink(s, question, triplets):
s += ( s += (
"""Solve a question answering task with interleaving Thought, Action, Observation steps. Thought can reason about the current situation, and Action can be three types: """Solve a question answering task with interleaving Thought, Action, Observation steps. Thought can reason about the current situation, and Action can be three types:
(1) Search[entity], which searches the exact entity on Wikipedia and returns the first paragraph if it exists. If not, it will return some similar entities to search. (1) Search[entity], which searches the exact entity on Wikipedia and returns the first paragraph if it exists. If not, it will return some similar entities to search.
(2) Lookup[keyword], which returns the next sentence containing keyword in the current passage. (2) Lookup[keyword], which returns the next sentence containing keyword in the current passage.
(3) Finish[answer], which returns the answer and finishes the task. (3) Finish[answer], which returns the answer and finishes the task.
...@@ -39,7 +39,7 @@ Action 1: Search[Milhouse] ...@@ -39,7 +39,7 @@ Action 1: Search[Milhouse]
Observation 1: Milhouse Mussolini Van Houten is a recurring character in the Fox animated television series The Simpsons voiced by Pamela Hayden and created by Matt Groening. Observation 1: Milhouse Mussolini Van Houten is a recurring character in the Fox animated television series The Simpsons voiced by Pamela Hayden and created by Matt Groening.
Thought 2: The paragraph does not tell who Milhouse is named after, maybe I can look up "named after". Thought 2: The paragraph does not tell who Milhouse is named after, maybe I can look up "named after".
Action 2: Lookup[named after] Action 2: Lookup[named after]
Observation 2: (Result 1 / 1) Milhouse was named after U.S. president Richard Nixon, whose middle name was Milhous. Observation 2: (Result 1 / 1) Milhouse was named after U.S. president Richard Nixon, whose middle name was Milhous.
Thought 3: Milhouse was named after U.S. president Richard Nixon, so the answer is Richard Nixon. Thought 3: Milhouse was named after U.S. president Richard Nixon, so the answer is Richard Nixon.
Action 3: Finish[Richard Nixon] Action 3: Finish[Richard Nixon]
Question: Which documentary is about Finnish rock groups, Adam Clayton Powell or The Saimaa Gesture? Question: Which documentary is about Finnish rock groups, Adam Clayton Powell or The Saimaa Gesture?
...@@ -64,10 +64,10 @@ Action 3: Finish[director, screenwriter, actor] ...@@ -64,10 +64,10 @@ Action 3: Finish[director, screenwriter, actor]
Question: Which magazine was started first Arthur's Magazine or First for Women? Question: Which magazine was started first Arthur's Magazine or First for Women?
Thought 1: I need to search Arthur's Magazine and First for Women, and find which was started first. Thought 1: I need to search Arthur's Magazine and First for Women, and find which was started first.
Action 1: Search[Arthur's Magazine] Action 1: Search[Arthur's Magazine]
Observation 1: Arthur's Magazine (1844-1846) was an American literary periodical published in Philadelphia in the 19th century. Observation 1: Arthur's Magazine (1844-1846) was an American literary periodical published in Philadelphia in the 19th century.
Thought 2: Arthur's Magazine was started in 1844. I need to search First for Women next. Thought 2: Arthur's Magazine was started in 1844. I need to search First for Women next.
Action 2: Search[First for Women] Action 2: Search[First for Women]
Observation 2: First for Women is a woman's magazine published by Bauer Media Group in the USA.[1] The magazine was started in 1989. Observation 2: First for Women is a woman's magazine published by Bauer Media Group in the USA.[1] The magazine was started in 1989.
Thought 3: First for Women was started in 1989. 1844 (Arthur's Magazine) < 1989 (First for Women), so Arthur's Magazine was started first. Thought 3: First for Women was started in 1989. 1844 (Arthur's Magazine) < 1989 (First for Women), so Arthur's Magazine was started first.
Action 3: Finish[Arthur's Magazine] Action 3: Finish[Arthur's Magazine]
Question: Were Pavel Urysohn and Leonid Levin known for the same type of work? Question: Were Pavel Urysohn and Leonid Levin known for the same type of work?
...@@ -76,8 +76,8 @@ Action 1: Search[Pavel Urysohn] ...@@ -76,8 +76,8 @@ Action 1: Search[Pavel Urysohn]
Observation 1: Pavel Samuilovich Urysohn (February 3, 1898 – August 17, 1924) was a Soviet mathematician who is best known for his contributions in dimension theory. Observation 1: Pavel Samuilovich Urysohn (February 3, 1898 – August 17, 1924) was a Soviet mathematician who is best known for his contributions in dimension theory.
Thought 2: Pavel Urysohn is a mathematician. I need to search Leonid Levin next and find its type of work. Thought 2: Pavel Urysohn is a mathematician. I need to search Leonid Levin next and find its type of work.
Action 2: Search[Leonid Levin] Action 2: Search[Leonid Levin]
Observation 2: Leonid Anatolievich Levin is a Soviet-American mathematician and computer scientist. Observation 2: Leonid Anatolievich Levin is a Soviet-American mathematician and computer scientist.
Thought 3: Leonid Levin is a mathematician and computer scientist. So Pavel Urysohn and Leonid Levin have the same type of work. Thought 3: Leonid Levin is a mathematician and computer scientist. So Pavel Urysohn and Leonid Levin have the same type of work.
Action 3: Finish[yes] Action 3: Finish[yes]
""" """
+ question + question
......
!topic.jsonl !topic.jsonl
\ No newline at end of file
...@@ -30,4 +30,4 @@ python3 bench_other.py --backend guidance --num-questions 32 --parallel 1 --n-ct ...@@ -30,4 +30,4 @@ python3 bench_other.py --backend guidance --num-questions 32 --parallel 1 --n-ct
``` ```
python3 bench_other.py --backend lmql --num-questions 32 --parallel 1 python3 bench_other.py --backend lmql --num-questions 32 --parallel 1
``` ```
\ No newline at end of file
...@@ -47,4 +47,4 @@ ...@@ -47,4 +47,4 @@
{"topic": "self-publishing a book", "number": 7} {"topic": "self-publishing a book", "number": 7}
{"topic": "starting an urban farm", "number": 6} {"topic": "starting an urban farm", "number": 6}
{"topic": "improving your memory", "number": 8} {"topic": "improving your memory", "number": 8}
{"topic": "creating a personal brand online", "number": 9} {"topic": "creating a personal brand online", "number": 9}
\ No newline at end of file
...@@ -31,4 +31,4 @@ Clone [sgl-project.github.io](https://github.com/sgl-project/sgl-project.github. ...@@ -31,4 +31,4 @@ Clone [sgl-project.github.io](https://github.com/sgl-project/sgl-project.github.
```bash ```bash
export DOC_SITE_PATH=../../sgl-project.github.io # update this with your path export DOC_SITE_PATH=../../sgl-project.github.io # update this with your path
python3 deploy.py python3 deploy.py
``` ```
\ No newline at end of file
...@@ -5,13 +5,13 @@ ...@@ -5,13 +5,13 @@
table.autosummary td { table.autosummary td {
width: 50% width: 50%
} }
img.align-center { img.align-center {
display: block; display: block;
margin-left: auto; margin-left: auto;
margin-right: auto; margin-right: auto;
} }
.output_area.stderr { .output_area.stderr {
color: #d3d3d3 !important; color: #d3d3d3 !important;
} }
...@@ -26,4 +26,4 @@ div.output_area.stderr { ...@@ -26,4 +26,4 @@ div.output_area.stderr {
div.output_area.stdout { div.output_area.stdout {
color: #d3d3d3 !important; color: #d3d3d3 !important;
} }
\ No newline at end of file
...@@ -147,7 +147,7 @@ docker run --gpus all \ ...@@ -147,7 +147,7 @@ docker run --gpus all \
lmsysorg/sglang:latest \ lmsysorg/sglang:latest \
python3 -m sglang.launch_server --model-path Qwen/Qwen2.5-7B-Instruct --host 0.0.0.0 --port 30000 python3 -m sglang.launch_server --model-path Qwen/Qwen2.5-7B-Instruct --host 0.0.0.0 --port 30000
``` ```
</details> </details>
## Example: Run Llama 3.1 405B ## Example: Run Llama 3.1 405B
......
...@@ -198,4 +198,4 @@ nbsphinx_prolog = """ ...@@ -198,4 +198,4 @@ nbsphinx_prolog = """
color: #d3d3d3 !important; /* light gray */ color: #d3d3d3 !important; /* light gray */
} }
</style> </style>
""" """
\ No newline at end of file
# Deploy the documents # Deploy the documents
import os import os
from datetime import datetime from datetime import datetime
def run_cmd(cmd): def run_cmd(cmd):
print(cmd) print(cmd)
os.system(cmd) os.system(cmd)
run_cmd("cd $DOC_SITE_PATH; git pull") run_cmd("cd $DOC_SITE_PATH; git pull")
# (Optional) Remove old files # (Optional) Remove old files
# run_cmd("rm -rf $ALPA_SITE_PATH/*") # run_cmd("rm -rf $ALPA_SITE_PATH/*")
run_cmd("cp -r _build/html/* $DOC_SITE_PATH") run_cmd("cp -r _build/html/* $DOC_SITE_PATH")
cmd_message = f"Update {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}" cmd_message = f"Update {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}"
run_cmd( run_cmd(
f"cd $DOC_SITE_PATH; git add .; git commit -m '{cmd_message}'; git push origin main" f"cd $DOC_SITE_PATH; git add .; git commit -m '{cmd_message}'; git push origin main"
) )
...@@ -74,4 +74,4 @@ def example(s): ...@@ -74,4 +74,4 @@ def example(s):
choices_method=sgl.unconditional_likelihood_normalized, choices_method=sgl.unconditional_likelihood_normalized,
) )
) )
``` ```
\ No newline at end of file
...@@ -37,4 +37,4 @@ You can also use the Jinja template format, defined by Hugging Face transformers ...@@ -37,4 +37,4 @@ You can also use the Jinja template format, defined by Hugging Face transformers
``` ```
python -m sglang.launch_server --model-path meta-llama/Llama-2-7b-chat-hf --port 30000 --chat-template ./my_model_template.jinja python -m sglang.launch_server --model-path meta-llama/Llama-2-7b-chat-hf --port 30000 --chat-template ./my_model_template.jinja
``` ```
\ No newline at end of file
...@@ -25,9 +25,9 @@ If you see `decode out of memory happened` occasionally but not frequently, it i ...@@ -25,9 +25,9 @@ If you see `decode out of memory happened` occasionally but not frequently, it i
Data parallelism is better for throughput. When there is enough GPU memory, always favor data parallelism for throughput. Data parallelism is better for throughput. When there is enough GPU memory, always favor data parallelism for throughput.
### Avoid out-of-memory by Tuning `--chunked-prefill-size`, `--mem-fraction-static`, `--max-running-requests` ### Avoid out-of-memory by Tuning `--chunked-prefill-size`, `--mem-fraction-static`, `--max-running-requests`
If you see out of memory (OOM) errors, you can try to tune the following parameters. If you see out of memory (OOM) errors, you can try to tune the following parameters.
If OOM happens during prefill, try to decrease `--chunked-prefill-size` to `4096` or `2048`. If OOM happens during prefill, try to decrease `--chunked-prefill-size` to `4096` or `2048`.
If OOM happens during decoding, try to decrease `--max-running-requests`. If OOM happens during decoding, try to decrease `--max-running-requests`.
You can also try to decrease `--mem-fraction-static`, which reduces the memory usage of the KV cache memory pool and helps both prefill and decoding. You can also try to decrease `--mem-fraction-static`, which reduces the memory usage of the KV cache memory pool and helps both prefill and decoding.
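
As a rough illustration of the tuning advice above, the sketch below assembles a `sglang.launch_server` command line with one of the three flags lowered, depending on where the OOM was observed. The helper `build_launch_command` and its `oom_phase` argument are hypothetical, and the values `64` and `0.8` are placeholders chosen for illustration, not recommended settings. Only `4096` comes from the text above.

```python
import shlex


def build_launch_command(model_path, oom_phase=None, port=30000):
    """Return the launch command as a list of arguments.

    oom_phase is a hypothetical switch: "prefill", "decode", or "both".
    """
    cmd = [
        "python3", "-m", "sglang.launch_server",
        "--model-path", model_path,
        "--port", str(port),
    ]
    if oom_phase == "prefill":
        # OOM during prefill: use a smaller chunked prefill size
        cmd += ["--chunked-prefill-size", "4096"]
    elif oom_phase == "decode":
        # OOM during decoding: cap concurrent requests (64 is a placeholder)
        cmd += ["--max-running-requests", "64"]
    elif oom_phase == "both":
        # Shrink the KV cache memory pool (0.8 is a placeholder)
        cmd += ["--mem-fraction-static", "0.8"]
    return cmd


if __name__ == "__main__":
    # Print a shell-ready command for the prefill case
    print(shlex.join(build_launch_command("Qwen/Qwen2.5-7B-Instruct", "prefill")))
```

Returning a list rather than a single string keeps each flag and its value as separate arguments, so the result can be passed to `subprocess.run` directly or joined for display.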
### Try Advanced Options ### Try Advanced Options
......
# Learn more # Learn more
You can find more blogs, slides, and videos about SGLang at [https://github.com/sgl-project/sgl-learning-materials](https://github.com/sgl-project/sgl-learning-materials). You can find more blogs, slides, and videos about SGLang at [https://github.com/sgl-project/sgl-learning-materials](https://github.com/sgl-project/sgl-learning-materials).
\ No newline at end of file
...@@ -223,4 +223,4 @@ response = requests.post( ...@@ -223,4 +223,4 @@ response = requests.post(
}, },
) )
print(response.json()) print(response.json())
``` ```
\ No newline at end of file