Commit 846197f5, authored by Wentao Ye, committed by GitHub

[Log] Optimize kv cache memory log from Bytes to GiB (#25204)


Signed-off-by: yewentao256 <zhyanwentao@126.com>
parent 2357480b
@@ -383,11 +383,13 @@ class Worker(WorkerBase):
 f"for non-torch memory, and {GiB(cuda_graph_memory_bytes)} "
 f"GiB for CUDAGraph memory. Replace gpu_memory_utilization "
 f"config with `--kv-cache-memory="
-f"{kv_cache_memory_bytes_to_requested_limit}` to fit into "
-f"requested memory, or `--kv-cache-memory="
-f"{kv_cache_memory_bytes_to_gpu_limit}` to fully "
+f"{kv_cache_memory_bytes_to_requested_limit}` "
+f"({GiB(kv_cache_memory_bytes_to_requested_limit)} GiB) to fit "
+f"into requested memory, or `--kv-cache-memory="
+f"{kv_cache_memory_bytes_to_gpu_limit}` "
+f"({GiB(kv_cache_memory_bytes_to_gpu_limit)} GiB) to fully "
 f"utilize gpu memory. Current kv cache memory in use is "
-f"{int(self.available_kv_cache_memory_bytes)} bytes.")
+f"{GiB(self.available_kv_cache_memory_bytes)} GiB.")
 logger.debug(msg)
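
For illustration only, here is a minimal standalone sketch of the message format this change produces: each raw byte value stays in the `--kv-cache-memory=` flag so it can be copied verbatim, and a GiB figure follows in parentheses for readability. The helper `gib`, the function `kv_cache_hint`, and the two-decimal rounding are assumptions made for this sketch; the real `GiB` helper used in `Worker` is not shown in the diff and may format values differently.

```python
GIB_BYTES = 1 << 30  # number of bytes in one GiB (2**30)


def gib(num_bytes: int) -> str:
    """Format a byte count as GiB with two decimals (hypothetical helper)."""
    return f"{num_bytes / GIB_BYTES:.2f}"


def kv_cache_hint(requested_limit: int, gpu_limit: int, in_use: int) -> str:
    """Build a hint that pairs each raw byte value with its GiB equivalent.

    All three arguments are byte counts; the names are illustrative and
    only mirror the variables that appear in the diff.
    """
    return (
        "Replace gpu_memory_utilization config with "
        f"`--kv-cache-memory={requested_limit}` "
        f"({gib(requested_limit)} GiB) to fit into requested memory, or "
        f"`--kv-cache-memory={gpu_limit}` "
        f"({gib(gpu_limit)} GiB) to fully utilize gpu memory. "
        f"Current kv cache memory in use is {gib(in_use)} GiB."
    )


if __name__ == "__main__":
    # Example byte counts chosen for the sketch: 2 GiB, 3 GiB, 1.5 GiB.
    print(kv_cache_hint(2 * GIB_BYTES, 3 * GIB_BYTES, 3 * (1 << 29)))
```

Keeping the exact byte count inside the flag avoids rounding loss when the value is pasted back on the command line, while the parenthesized GiB figure makes the log line easier to scan.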