"src/vscode:/vscode.git/clone" did not exist on "aab6de22c33cc01fb7bc81c0807d6109e2c998c9"
Unverified Commit a5e0defb authored by Xuehai Pan's avatar Xuehai Pan Committed by GitHub
Browse files

minor: Add basic editorconfig and pre-commit hooks to enforce style for whitespaces (#1926)

parent 96766101
# https://editorconfig.org/
root = true
[*]
charset = utf-8
end_of_line = lf
indent_style = space
indent_size = 4
trim_trailing_whitespace = true
insert_final_newline = true
[*.{json,yaml,yml}]
indent_size = 2
[*.md]
indent_size = 2
x-soft-wrap-text = true
[*.rst]
indent_size = 4
x-soft-wrap-text = true
[Makefile]
indent_style = tab
......@@ -12,4 +12,4 @@
- [ ] Format your code according to the [Contributor Guide](https://github.com/sgl-project/sglang/blob/main/docs/contributor_guide.md).
- [ ] Add unit tests as outlined in the [Contributor Guide](https://github.com/sgl-project/sglang/blob/main/docs/contributor_guide.md).
- [ ] Update documentation as needed, including docstrings or example tutorials.
\ No newline at end of file
- [ ] Update documentation as needed, including docstrings or example tutorials.
......@@ -20,10 +20,10 @@ jobs:
github-token: ${{secrets.GITHUB_TOKEN}}
script: |
const sixtyDaysAgo = new Date(Date.now() - 60 * 24 * 60 * 60 * 1000);
const [owner, repo] = process.env.GITHUB_REPOSITORY.split('/');
console.log(`Owner: ${owner}, Repo: ${repo}`);
async function fetchIssues(page = 1) {
console.log(`Fetching issues for ${owner}/${repo}, page ${page}`);
return await github.rest.issues.listForRepo({
......@@ -36,23 +36,23 @@ jobs:
page: page
});
}
async function processIssues() {
console.log('Starting to process issues');
console.log(`Repository: ${owner}/${repo}`);
let page = 1;
let hasMoreIssues = true;
while (hasMoreIssues) {
try {
const issues = await fetchIssues(page);
console.log(`Fetched ${issues.data.length} issues on page ${page}`);
if (issues.data.length === 0) {
hasMoreIssues = false;
break;
}
for (const issue of issues.data) {
if (new Date(issue.updated_at) < sixtyDaysAgo) {
try {
......@@ -87,5 +87,5 @@ jobs:
}
console.log('Finished processing issues');
}
await processIssues();
......@@ -18,7 +18,7 @@ concurrency:
group: execute-notebook-${{ github.ref }}
cancel-in-progress: true
jobs:
run-all-notebooks:
runs-on: 1-gpu-runner
......@@ -45,4 +45,4 @@ jobs:
run: |
cd docs
make clean
make compile
\ No newline at end of file
make compile
......@@ -36,4 +36,4 @@ jobs:
run: |
source "$HOME/.cargo/env"
cd rust/
cargo test
\ No newline at end of file
cargo test
......@@ -237,7 +237,7 @@ jobs:
run: |
cd test/srt
python3 test_moe_eval_accuracy_large.py
- name: Evaluate MLA Accuracy (TP=2)
timeout-minutes: 10
run: |
......
......@@ -47,7 +47,7 @@ jobs:
make html
cd _build/html
git clone https://$GITHUB_TOKEN@github.com/sgl-project/sgl-project.github.io.git ../sgl-project.github.io --depth 1
rm -rf ../sgl-project.github.io/*
cp -r * ../sgl-project.github.io
......
......@@ -185,4 +185,4 @@ tmp*.txt
work_dirs/
*.csv
!logo.png
\ No newline at end of file
!logo.png
[settings]
profile=black
known_first_party=sglang
\ No newline at end of file
known_first_party=sglang
default_language_version:
python: python3.9
default_stages: [pre-commit, pre-push, manual]
repos:
- repo: https://github.com/pre-commit/pre-commit-hooks
rev: v5.0.0
hooks:
- id: check-symlinks
- id: destroyed-symlinks
- id: trailing-whitespace
- id: end-of-file-fixer
- id: check-yaml
args: [--allow-multiple-documents]
- id: check-toml
- id: check-ast
- id: check-added-large-files
- id: check-merge-conflict
- id: check-executables-have-shebangs
- id: check-shebang-scripts-are-executable
- id: detect-private-key
- id: debug-statements
- id: no-commit-to-branch
- repo: https://github.com/PyCQA/isort
rev: 5.13.2
hooks:
......@@ -13,8 +33,3 @@ repos:
additional_dependencies: ['.[jupyter]']
types: [python, jupyter]
types_or: [python, jupyter]
- repo: https://github.com/pre-commit/pre-commit-hooks
rev: v5.0.0
hooks:
- id: no-commit-to-branch
\ No newline at end of file
......@@ -6,5 +6,3 @@ Two primary methods are covered:
- [Torch Profiler](https://pytorch.org/tutorials/recipes/recipes/profiler_recipe.html)
......@@ -29,18 +29,18 @@ def _triton_kernel_funtion():
...
```
## 2. Torch Tunable Operations
**TunableOp** is a feature in PyTorch that allows for the definition and optimization of custom kernels with tunable parameters. This feature is particularly useful for enhancing the performance of kernels by experimenting with different configurations.
**TunableOp** is a feature in PyTorch that allows for the definition and optimization of custom kernels with tunable parameters. This feature is particularly useful for enhancing the performance of kernels by experimenting with different configurations.
### Key Environment Variables:
1. **PYTORCH_TUNABLEOP_ENABLED**:
1. **PYTORCH_TUNABLEOP_ENABLED**:
- Default: `0`
- Set to `1` to enable TunableOp.
2. **PYTORCH_TUNABLEOP_TUNING**:
2. **PYTORCH_TUNABLEOP_TUNING**:
- Default: `1`
- Set to `0` to disable tuning. If a tuned entry is not found, it will run the tuning step and record the entry when PYTORCH_TUNABLEOP_ENABLED is enabled.
3. **PYTORCH_TUNABLEOP_VERBOSE**:
3. **PYTORCH_TUNABLEOP_VERBOSE**:
- Default: `0`
- Set to `1` to enable verbose output for TunableOp.
......@@ -66,20 +66,20 @@ The following are suggestions for optimizing matrix multiplication (GEMM) and co
To tune Triton kernels with GEMM and convolution ops (conv), use the `torch.compile` function with the max-autotune mode. This benchmarks a predefined list of Triton configurations and selects the fastest one for each shape.
### Key Configurations:
1. **Max Autotune**:
1. **Max Autotune**:
- Set `torch._inductor.config.max_autotune = True` or `TORCHINDUCTOR_MAX_AUTOTUNE=1`.
2. **Fine-Grained Control**:
- Enable GEMM tuning: `torch._inductor.config.max_autotune_gemm = True`.
- Enable tuning for pointwise/reduction ops: `torch._inductor.config.max_autotune.pointwise = True`.
3. **Backend Selection**:
3. **Backend Selection**:
- Use `torch._inductor.max_autotune_gemm_backends` to limit backends to TRITON for better performance.
4. **Freezing for Inference**:
4. **Freezing for Inference**:
- Use `torch._inductor.config.freezing=True` to enable constant folding optimizations.
5. **Debugging**:
5. **Debugging**:
- Set `TORCH_COMPILE_DEBUG=1` to extract Triton kernels generated by Inductor.
### Example Code Block:
......@@ -98,4 +98,4 @@ TORCHINDUCTOR_FREEZING=1 your_script.sh
For more detailed information on tuning SGLang performance with AMD GPUs, please refer to the following link:
[ROCm Documentation: Triton Kernel Performance Optimization](https://rocm.docs.amd.com/en/latest/how-to/tuning-guides/mi300x/workload.html#triton-kernel-performance-optimization)
\ No newline at end of file
[ROCm Documentation: Triton Kernel Performance Optimization](https://rocm.docs.amd.com/en/latest/how-to/tuning-guides/mi300x/workload.html#triton-kernel-performance-optimization)
......@@ -21,4 +21,4 @@ python3 -m sglang.bench_serving --backend sglang --dataset-name random --num-pro
python3 -m sglang.bench_serving --backend sglang --dataset-name random --num-prompt 600 --request-rate 2 --random-input 1024 --random-output 1024 > sglang_log32
python3 -m sglang.bench_serving --backend sglang --dataset-name random --num-prompt 1200 --request-rate 4 --random-input 1024 --random-output 1024 > sglang_log33
python3 -m sglang.bench_serving --backend sglang --dataset-name random --num-prompt 2400 --request-rate 8 --random-input 1024 --random-output 1024 > sglang_log34
python3 -m sglang.bench_serving --backend sglang --dataset-name random --num-prompt 3200 --request-rate 16 --random-input 1024 --random-output 1024 > sglang_log35
\ No newline at end of file
python3 -m sglang.bench_serving --backend sglang --dataset-name random --num-prompt 3200 --request-rate 16 --random-input 1024 --random-output 1024 > sglang_log35
......@@ -21,4 +21,4 @@ python3 ../../python/sglang/bench_serving.py --backend vllm --dataset-name rando
python3 ../../python/sglang/bench_serving.py --backend vllm --dataset-name random --num-prompt 600 --request-rate 2 --random-input 1024 --random-output 1024 > vllm_log32
python3 ../../python/sglang/bench_serving.py --backend vllm --dataset-name random --num-prompt 1200 --request-rate 4 --random-input 1024 --random-output 1024 > vllm_log33
python3 ../../python/sglang/bench_serving.py --backend vllm --dataset-name random --num-prompt 2400 --request-rate 8 --random-input 1024 --random-output 1024 > vllm_log34
python3 ../../python/sglang/bench_serving.py --backend vllm --dataset-name random --num-prompt 3200 --request-rate 16 --random-input 1024 --random-output 1024 > vllm_log35
\ No newline at end of file
python3 ../../python/sglang/bench_serving.py --backend vllm --dataset-name random --num-prompt 3200 --request-rate 16 --random-input 1024 --random-output 1024 > vllm_log35
......@@ -30,22 +30,22 @@ def poignancy_event_prompt(persona_name, persona_iss, event):
@sgl.function
def generate_event_triple(s, persona_name, action):
s += """Task: Turn the input into (subject, predicate, object).
Input: Sam Johnson is eating breakfast.
Output: (Dolores Murphy, eat, breakfast)
---
Input: Sam Johnson is eating breakfast.
Output: (Dolores Murphy, eat, breakfast)
---
Input: Joon Park is brewing coffee.
Output: (Joon Park, brew, coffee)
---
Input: Jane Cook is sleeping.
Input: Jane Cook is sleeping.
Output: (Jane Cook, is, sleep)
---
Input: Michael Bernstein is writing email on a computer.
Input: Michael Bernstein is writing email on a computer.
Output: (Michael Bernstein, write, email)
---
Input: Percy Liang is teaching students in a classroom.
Input: Percy Liang is teaching students in a classroom.
Output: (Percy Liang, teach, students)
---
Input: Merrie Morris is running on a treadmill.
Input: Merrie Morris is running on a treadmill.
Output: (Merrie Morris, run, treadmill)
---"""
s += persona_name + "is" + action + ".\n"
......@@ -56,22 +56,22 @@ Output: (Merrie Morris, run, treadmill)
def generate_event_triple_prompt(persona_name, action):
s = ""
s += """Task: Turn the input into (subject, predicate, object).
Input: Sam Johnson is eating breakfast.
Output: (Dolores Murphy, eat, breakfast)
---
Input: Sam Johnson is eating breakfast.
Output: (Dolores Murphy, eat, breakfast)
---
Input: Joon Park is brewing coffee.
Output: (Joon Park, brew, coffee)
---
Input: Jane Cook is sleeping.
Input: Jane Cook is sleeping.
Output: (Jane Cook, is, sleep)
---
Input: Michael Bernstein is writing email on a computer.
Input: Michael Bernstein is writing email on a computer.
Output: (Michael Bernstein, write, email)
---
Input: Percy Liang is teaching students in a classroom.
Input: Percy Liang is teaching students in a classroom.
Output: (Percy Liang, teach, students)
---
Input: Merrie Morris is running on a treadmill.
Input: Merrie Morris is running on a treadmill.
Output: (Merrie Morris, run, treadmill)
---"""
s += persona_name + "is" + action + ".\n"
......@@ -107,9 +107,9 @@ def action_location_sector(
current_action,
next_action,
):
s += """Task -- choose an appropriate area from the area options for a task at hand.
s += """Task -- choose an appropriate area from the area options for a task at hand.
Sam Kim lives in {Sam Kim's house} that has Sam Kim's room, bathroom, kitchen.
Sam Kim is currently in {Sam Kim's house} that has Sam Kim's room, bathroom, kitchen.
Sam Kim is currently in {Sam Kim's house} that has Sam Kim's room, bathroom, kitchen.
Area options: {Sam Kim's house, The Rose and Crown Pub, Hobbs Cafe, Oak Hill College, Johnson Park, Harvey Oak Supply Store, The Willows Market and Pharmacy}.
* Stay in the current area if the activity can be done there. Only go out if the activity needs to take place in another place.
* Must be one of the "Area options," verbatim.
......@@ -117,7 +117,7 @@ For taking a walk, Sam Kim should go to the following area: {Johnson Park}
---
Jane Anderson lives in {Oak Hill College Student Dormatory} that has Jane Anderson's room.
Jane Anderson is currently in {Oak Hill College} that has a classroom, library
Area options: {Oak Hill College Student Dormatory, The Rose and Crown Pub, Hobbs Cafe, Oak Hill College, Johnson Park, Harvey Oak Supply Store, The Willows Market and Pharmacy}.
Area options: {Oak Hill College Student Dormatory, The Rose and Crown Pub, Hobbs Cafe, Oak Hill College, Johnson Park, Harvey Oak Supply Store, The Willows Market and Pharmacy}.
* Stay in the current area if the activity can be done there. Only go out if the activity needs to take place in another place.
* Must be one of the "Area options," verbatim.
For eating dinner, Jane Anderson should go to the following area: {Hobbs Cafe}
......@@ -167,9 +167,9 @@ def action_location_sector_prompt(
next_action,
):
s = ""
s += """Task -- choose an appropriate area from the area options for a task at hand.
s += """Task -- choose an appropriate area from the area options for a task at hand.
Sam Kim lives in {Sam Kim's house} that has Sam Kim's room, bathroom, kitchen.
Sam Kim is currently in {Sam Kim's house} that has Sam Kim's room, bathroom, kitchen.
Sam Kim is currently in {Sam Kim's house} that has Sam Kim's room, bathroom, kitchen.
Area options: {Sam Kim's house, The Rose and Crown Pub, Hobbs Cafe, Oak Hill College, Johnson Park, Harvey Oak Supply Store, The Willows Market and Pharmacy}.
* Stay in the current area if the activity can be done there. Only go out if the activity needs to take place in another place.
* Must be one of the "Area options," verbatim.
......@@ -177,7 +177,7 @@ For taking a walk, Sam Kim should go to the following area: {Johnson Park}
---
Jane Anderson lives in {Oak Hill College Student Dormatory} that has Jane Anderson's room.
Jane Anderson is currently in {Oak Hill College} that has a classroom, library
Area options: {Oak Hill College Student Dormatory, The Rose and Crown Pub, Hobbs Cafe, Oak Hill College, Johnson Park, Harvey Oak Supply Store, The Willows Market and Pharmacy}.
Area options: {Oak Hill College Student Dormatory, The Rose and Crown Pub, Hobbs Cafe, Oak Hill College, Johnson Park, Harvey Oak Supply Store, The Willows Market and Pharmacy}.
* Stay in the current area if the activity can be done there. Only go out if the activity needs to take place in another place.
* Must be one of the "Area options," verbatim.
For eating dinner, Jane Anderson should go to the following area: {Hobbs Cafe}
......@@ -226,7 +226,7 @@ Stay in the current area if the activity can be done there. Never go into other
For cooking, Jane Anderson should go to the following area in Jane Anderson's house:
Answer: {kitchen}
---
Tom Watson is in common room in Tom Watson's apartment.
Tom Watson is in common room in Tom Watson's apartment.
Tom Watson is going to Hobbs Cafe that has the following areas: {cafe}
Stay in the current area if the activity can be done there. Never go into other people's rooms unless necessary.
For getting coffee, Tom Watson should go to the following area in Hobbs Cafe:
......@@ -240,7 +240,7 @@ Answer: {cafe}
+ target_sector_areas
+ "}\n"
)
s += """* Stay in the current area if the activity can be done there.
s += """* Stay in the current area if the activity can be done there.
* NEVER go into other people's rooms unless necessary."""
s += (
persona_name
......@@ -268,7 +268,7 @@ Stay in the current area if the activity can be done there. Never go into other
For cooking, Jane Anderson should go to the following area in Jane Anderson's house:
Answer: {kitchen}
---
Tom Watson is in common room in Tom Watson's apartment.
Tom Watson is in common room in Tom Watson's apartment.
Tom Watson is going to Hobbs Cafe that has the following areas: {cafe}
Stay in the current area if the activity can be done there. Never go into other people's rooms unless necessary.
For getting coffee, Tom Watson should go to the following area in Hobbs Cafe:
......@@ -282,7 +282,7 @@ Answer: {cafe}
+ target_sector_areas
+ "}\n"
)
s += """* Stay in the current area if the activity can be done there.
s += """* Stay in the current area if the activity can be done there.
* NEVER go into other people's rooms unless necessary."""
s += (
persona_name
......
......@@ -20,7 +20,7 @@ outlines 0.0.22
Run Llama-7B
```
python3 -m sglang.launch_server --model-path meta-llama/Llama-2-7b-chat-hf --port 30000
python3 -m sglang.launch_server --model-path meta-llama/Llama-2-7b-chat-hf --port 30000
```
Run Mixtral-8x7B
......
......@@ -23,7 +23,7 @@ python3 build_dataset.py
Run Llama-7B
```bash
python3 -m sglang.launch_server --model-path meta-llama/Llama-2-7b-chat-hf --port 30000
python3 -m sglang.launch_server --model-path meta-llama/Llama-2-7b-chat-hf --port 30000
```
Benchmark Character Generation
......
......@@ -47,4 +47,4 @@ Quirinus Quirrell
Nearly Headless Nick
Aunt Marge
Griphook
Ludo Bagman
\ No newline at end of file
Ludo Bagman
......@@ -4,7 +4,7 @@
python3 download_images.py
```
image benchmark source: https://huggingface.co/datasets/liuhaotian/llava-bench-in-the-wild
image benchmark source: https://huggingface.co/datasets/liuhaotian/llava-bench-in-the-wild
### Other Dependency
```
......
File mode changed from 100644 to 100755
Markdown is supported
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment