OpenDAS / dynamo
Commit f6fef485
Authored Aug 13, 2025 by Tanmay Verma; committed by GitHub, Aug 13, 2025
fix(ci): Reduce the free gpu memory fraction (#2433)
Parent: cebe9219
Changes: 3 changed files with 3 additions and 3 deletions (+3 -3)
components/backends/trtllm/engine_configs/agg.yaml      +1 -1
components/backends/trtllm/engine_configs/decode.yaml   +1 -1
components/backends/trtllm/engine_configs/prefill.yaml  +1 -1

components/backends/trtllm/engine_configs/agg.yaml
@@ -22,7 +22,7 @@ backend: pytorch
 enable_chunked_prefill: true
 kv_cache_config:
-  free_gpu_memory_fraction: 0.95
+  free_gpu_memory_fraction: 0.85
 # NOTE: pytorch_backend_config section flattened since: https://github.com/NVIDIA/TensorRT-LLM/pull/4603
 # NOTE: overlap_scheduler enabled by default since this commit and changed

components/backends/trtllm/engine_configs/decode.yaml
@@ -25,7 +25,7 @@ cuda_graph_config:
   max_batch_size: 16
 kv_cache_config:
-  free_gpu_memory_fraction: 0.95
+  free_gpu_memory_fraction: 0.85
 cache_transceiver_config:
   backend: default

components/backends/trtllm/engine_configs/prefill.yaml
@@ -24,7 +24,7 @@ disable_overlap_scheduler: true
 cuda_graph_config:
   max_batch_size: 16
 kv_cache_config:
-  free_gpu_memory_fraction: 0.95
+  free_gpu_memory_fraction: 0.85
 cache_transceiver_config:
   backend: default
\ No newline at end of file
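
To make the effect of the change concrete, here is a rough sketch of the arithmetic behind free_gpu_memory_fraction, on the understanding that TensorRT-LLM treats it as the share of currently free GPU memory that the KV cache may claim. The helper names and the 40 GiB figure are hypothetical and not taken from the dynamo or TensorRT-LLM code; the real allocator may round or reserve memory differently.

```python
GIB = 1024 ** 3


def kv_cache_budget_bytes(free_bytes: int, fraction: float) -> int:
    """Approximate bytes the KV cache may claim from free GPU memory.

    Hypothetical helper: mirrors the documented meaning of
    kv_cache_config.free_gpu_memory_fraction, not the actual allocator.
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in the range (0, 1]")
    return int(free_bytes * fraction)


def headroom_bytes(free_bytes: int, fraction: float) -> int:
    """Free memory left over for everything other than the KV cache."""
    return free_bytes - kv_cache_budget_bytes(free_bytes, fraction)


if __name__ == "__main__":
    free = 40 * GIB  # assumed free memory after model load; illustrative only
    for fraction in (0.95, 0.85):  # old and new values from this commit
        budget = kv_cache_budget_bytes(free, fraction) / GIB
        spare = headroom_bytes(free, fraction) / GIB
        print(f"fraction={fraction}: KV cache ~{budget:.1f} GiB, headroom ~{spare:.1f} GiB")
```

Under this reading, moving from 0.95 to 0.85 leaves an additional tenth of the free memory unclaimed by the KV cache in each of the three engine configs, which is presumably the headroom the CI fix is after.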