Update server args doc (#3273)

Co-authored-by: Shi Shuai <126407087+shuaills@users.noreply.github.com>

Commit 7b5a3741, authored Feb 04, 2025 by simveit, committed by GitHub Feb 03, 2025

Showing with 1 addition and 1 deletion

docs/backend/server_arguments.md +1 -1

--- a/docs/backend/server_arguments.md
+++ b/docs/backend/server_arguments.md
@@ -159,7 +159,7 @@ Please consult the documentation below to learn more about the parameters you ma
 * `disable_radix_cache`: Disable [Radix](https://lmsys.org/blog/2024-01-17-sglang/) backend for prefix caching.
 * `disable_jump_forward`: Disable [jump-forward](https://lmsys.org/blog/2024-02-05-compressed-fsm/#our-method-jump-forward-decoding-with-a-compressed-finite-state-machine) for outlines grammar backend.
-* `disable_cuda_graph`: Disable [cuda graph](https://pytorch.org/blog/accelerating-pytorch-with-cuda-graphs/) for model forward.
+* `disable_cuda_graph`: Disable [cuda graph](https://pytorch.org/blog/accelerating-pytorch-with-cuda-graphs/) for model forward. Use if encountering uncorrectable CUDA ECC errors.
 * `disable_cuda_graph_padding`: Disable cuda graph when padding is needed. In other case still use cuda graph.
 * `disable_outlines_disk_cache`: Disable disk cache for outlines grammar backend.
 * `disable_custom_all_reduce`: Disable usage of custom all reduce kernel.
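
For reviewers who want to try the documented workaround, here is a minimal sketch of building a launch command with CUDA graphs turned off. It assumes the `disable_cuda_graph` argument is exposed on the command line as `--disable-cuda-graph` and that the server is started through `python -m sglang.launch_server`. The helper name `build_launch_command` and the model path are hypothetical placeholders, not part of this change.

```python
import shlex


def build_launch_command(model_path, disable_cuda_graph=False, port=30000):
    """Return the server launch command as a list of arguments.

    Hypothetical helper for illustration only. It assumes the
    `disable_cuda_graph` server argument maps to the CLI flag
    `--disable-cuda-graph`.
    """
    cmd = [
        "python", "-m", "sglang.launch_server",
        "--model-path", model_path,
        "--port", str(port),
    ]
    if disable_cuda_graph:
        # Workaround described in the doc line added by this commit:
        # turn CUDA graphs off when hitting uncorrectable CUDA ECC errors.
        cmd.append("--disable-cuda-graph")
    return cmd


if __name__ == "__main__":
    # Placeholder model path; substitute the model you actually serve.
    command = build_launch_command("your-org/your-model", disable_cuda_graph=True)
    print(shlex.join(command))
```

Keeping the flag optional matches the doc wording: CUDA graphs stay enabled unless the ECC errors actually occur.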