Commit 41bb1ab1, authored by Meng, Peng, committed by GitHub

fix nsys cannot profile cuda kernel (#957)

parent 87e8c090
@@ -34,7 +34,7 @@ python3 bench_serving.py --backend srt --port 30000 --tokenizer meta-llama/Llama
 ### Profile with Nsight
 1. To profile a single batch, use `nsys profile --cuda-graph-trace=node python3 -m sglang.bench_latency --model meta-llama/Meta-Llama-3-8B --batch-size 64 --input-len 512`
-2. To profile a server, use `nsys profile --cuda-graph-trace=node python3 -m sglang.launch_server --model meta-llama/Meta-Llama-3-8B`.
+2. To profile a server, use `nsys profile --trace-fork-before-exec=true --cuda-graph-trace=node python3 -m sglang.launch_server --model meta-llama/Meta-Llama-3-8B`.
 ## Other baselines
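
The change adds `--trace-fork-before-exec=true` to the server command only; the single-batch command is left unchanged. A plausible reading, which the commit does not state, is that the server runs its GPU work in forked child processes that `nsys` does not trace by default. The sketch below shows one way to keep the two command lines consistent from a script. The helper `build_nsys_command` and its parameters are hypothetical and not part of sglang; the flags and module names are taken from the README lines in the diff.

```python
import shlex


def build_nsys_command(target_module, model, profile_server=False, extra_args=()):
    """Return the argv list for an `nsys profile` run (hypothetical helper).

    target_module: Python module to run with `python3 -m`.
    profile_server: when True, add the fork-tracing flag from this commit.
    extra_args: additional arguments passed through to the target module.
    """
    cmd = ["nsys", "profile"]
    if profile_server:
        # Flag added by this commit for the server case only.
        cmd.append("--trace-fork-before-exec=true")
    # Flag present in both README commands.
    cmd.append("--cuda-graph-trace=node")
    cmd += ["python3", "-m", target_module, "--model", model, *extra_args]
    return cmd


# Single-batch profiling, as in step 1 of the README.
single_batch = build_nsys_command(
    "sglang.bench_latency",
    "meta-llama/Meta-Llama-3-8B",
    extra_args=("--batch-size", "64", "--input-len", "512"),
)

# Server profiling, as in step 2 of the README after this commit.
server = build_nsys_command(
    "sglang.launch_server",
    "meta-llama/Meta-Llama-3-8B",
    profile_server=True,
)

print(shlex.join(single_batch))
print(shlex.join(server))
```

The lists can be passed to `subprocess.run` directly; `shlex.join` is used here only to print a copy-pasteable command line.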