Commit 21d5daa4 authored by Woosuk Kwon, committed by GitHub

Add warning on CUDA graph memory usage (#2182)

parent 290e015c
...
@@ -395,6 +395,9 @@ class ModelRunner:
                     "unexpected consequences if the model is not static. To "
                     "run the model in eager mode, set 'enforce_eager=True' or "
                     "use '--enforce-eager' in the CLI.")
+        logger.info("CUDA graphs can take additional 1~3 GiB memory per GPU. "
+                    "If you are running out of memory, consider decreasing "
+                    "`gpu_memory_utilization` or enforcing eager mode.")
         start_time = time.perf_counter()
         # Prepare dummy inputs. These will be reused for all batch sizes.
...
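
The new log message suggests lowering `gpu_memory_utilization` when CUDA graph capture runs out of memory. A rough back-of-the-envelope sketch of that trade-off follows. It assumes, which the diff does not state, that the 1~3 GiB for CUDA graphs must fit in the memory left outside the `gpu_memory_utilization` fraction. The helper names `headroom_gib` and `max_utilization_for_graphs` are hypothetical and not part of the codebase.

```python
# Hypothetical helpers (not part of the commit) for estimating whether
# CUDA graph capture fits in the memory left over on one GPU.
# Assumption: graphs need up to ~3 GiB outside the gpu_memory_utilization budget.


def headroom_gib(total_gib: float, gpu_memory_utilization: float) -> float:
    """Return the GiB left outside the gpu_memory_utilization fraction."""
    if not 0.0 < gpu_memory_utilization <= 1.0:
        raise ValueError("gpu_memory_utilization must be in (0, 1]")
    return total_gib * (1.0 - gpu_memory_utilization)


def max_utilization_for_graphs(total_gib: float,
                               graph_overhead_gib: float = 3.0) -> float:
    """Return the largest utilization that still leaves graph_overhead_gib free.

    A result of 0.0 means the GPU is too small for that overhead, in which
    case enforcing eager mode is the remaining option named in the log message.
    """
    if total_gib <= 0:
        raise ValueError("total_gib must be positive")
    return max(0.0, 1.0 - graph_overhead_gib / total_gib)


if __name__ == "__main__":
    # Example: a 24 GiB GPU with an illustrative utilization of 0.9.
    print(headroom_gib(24.0, 0.9) >= 3.0)
    print(max_utilization_for_graphs(24.0))
```

If the estimated headroom is too small, the two remedies named in the log message apply: decrease `gpu_memory_utilization`, or run in eager mode with `enforce_eager=True` (`--enforce-eager` in the CLI).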