.nfs00000000a870410c000018da 24.4 KB
nohup: ignoring input
&&&& RUNNING TensorRT.trtexec [TensorRT v8503] # trtexec --onnx=checkpoints/yolov5x/yolov5x.onnx --saveEngine=checkpoints/yolov5x/yolov5x.trt --int8
[03/20/2024-05:59:19] [I] === Model Options ===
[03/20/2024-05:59:19] [I] Format: ONNX
[03/20/2024-05:59:19] [I] Model: checkpoints/yolov5x/yolov5x.onnx
[03/20/2024-05:59:19] [I] Output:
[03/20/2024-05:59:19] [I] === Build Options ===
[03/20/2024-05:59:19] [I] Max batch: explicit batch
[03/20/2024-05:59:19] [I] Memory Pools: workspace: default, dlaSRAM: default, dlaLocalDRAM: default, dlaGlobalDRAM: default
[03/20/2024-05:59:19] [I] minTiming: 1
[03/20/2024-05:59:19] [I] avgTiming: 8
[03/20/2024-05:59:19] [I] Precision: FP32+INT8
[03/20/2024-05:59:19] [I] LayerPrecisions: 
[03/20/2024-05:59:19] [I] Calibration: Dynamic
[03/20/2024-05:59:19] [I] Refit: Disabled
[03/20/2024-05:59:19] [I] Sparsity: Disabled
[03/20/2024-05:59:19] [I] Safe mode: Disabled
[03/20/2024-05:59:19] [I] DirectIO mode: Disabled
[03/20/2024-05:59:19] [I] Restricted mode: Disabled
[03/20/2024-05:59:19] [I] Build only: Disabled
[03/20/2024-05:59:19] [I] Save engine: checkpoints/yolov5x/yolov5x.trt
[03/20/2024-05:59:19] [I] Load engine: 
[03/20/2024-05:59:19] [I] Profiling verbosity: 0
[03/20/2024-05:59:19] [I] Tactic sources: Using default tactic sources
[03/20/2024-05:59:19] [I] timingCacheMode: local
[03/20/2024-05:59:19] [I] timingCacheFile: 
[03/20/2024-05:59:19] [I] Heuristic: Disabled
[03/20/2024-05:59:19] [I] Preview Features: Use default preview flags.
[03/20/2024-05:59:19] [I] Input(s)s format: fp32:CHW
[03/20/2024-05:59:19] [I] Output(s)s format: fp32:CHW
[03/20/2024-05:59:19] [I] Input build shapes: model
[03/20/2024-05:59:19] [I] Input calibration shapes: model
[03/20/2024-05:59:19] [I] === System Options ===
[03/20/2024-05:59:19] [I] Device: 0
[03/20/2024-05:59:19] [I] DLACore: 
[03/20/2024-05:59:19] [I] Plugins:
[03/20/2024-05:59:19] [I] === Inference Options ===
[03/20/2024-05:59:19] [I] Batch: Explicit
[03/20/2024-05:59:19] [I] Input inference shapes: model
[03/20/2024-05:59:19] [I] Iterations: 10
[03/20/2024-05:59:19] [I] Duration: 3s (+ 200ms warm up)
[03/20/2024-05:59:19] [I] Sleep time: 0ms
[03/20/2024-05:59:19] [I] Idle time: 0ms
[03/20/2024-05:59:19] [I] Streams: 1
[03/20/2024-05:59:19] [I] ExposeDMA: Disabled
[03/20/2024-05:59:19] [I] Data transfers: Enabled
[03/20/2024-05:59:19] [I] Spin-wait: Disabled
[03/20/2024-05:59:19] [I] Multithreading: Disabled
[03/20/2024-05:59:19] [I] CUDA Graph: Disabled
[03/20/2024-05:59:19] [I] Separate profiling: Disabled
[03/20/2024-05:59:19] [I] Time Deserialize: Disabled
[03/20/2024-05:59:19] [I] Time Refit: Disabled
[03/20/2024-05:59:19] [I] NVTX verbosity: 0
[03/20/2024-05:59:19] [I] Persistent Cache Ratio: 0
[03/20/2024-05:59:19] [I] Inputs:
[03/20/2024-05:59:19] [I] === Reporting Options ===
[03/20/2024-05:59:19] [I] Verbose: Disabled
[03/20/2024-05:59:19] [I] Averages: 10 inferences
[03/20/2024-05:59:19] [I] Percentiles: 90,95,99
[03/20/2024-05:59:19] [I] Dump refittable layers:Disabled
[03/20/2024-05:59:19] [I] Dump output: Disabled
[03/20/2024-05:59:19] [I] Profile: Disabled
[03/20/2024-05:59:19] [I] Export timing to JSON file: 
[03/20/2024-05:59:19] [I] Export output to JSON file: 
[03/20/2024-05:59:19] [I] Export profile to JSON file: 
[03/20/2024-05:59:19] [I] 
[03/20/2024-05:59:48] [I] === Device Information ===
[03/20/2024-05:59:48] [I] Selected Device: NVIDIA A800 80GB PCIe
[03/20/2024-05:59:48] [I] Compute Capability: 8.0
[03/20/2024-05:59:48] [I] SMs: 108
[03/20/2024-05:59:48] [I] Compute Clock Rate: 1.41 GHz
[03/20/2024-05:59:48] [I] Device Global Memory: 81050 MiB
[03/20/2024-05:59:48] [I] Shared Memory per SM: 164 KiB
[03/20/2024-05:59:48] [I] Memory Bus Width: 5120 bits (ECC enabled)
[03/20/2024-05:59:48] [I] Memory Clock Rate: 1.512 GHz
[03/20/2024-05:59:48] [I] 
[03/20/2024-05:59:48] [I] TensorRT version: 8.5.3
[03/20/2024-06:00:17] [I] [TRT] [MemUsageChange] Init CUDA: CPU +14, GPU +0, now: CPU 27, GPU 24558 (MiB)
[03/20/2024-06:00:32] [I] [TRT] [MemUsageChange] Init builder kernel library: CPU +661, GPU +166, now: CPU 740, GPU 24724 (MiB)
[03/20/2024-06:00:32] [I] Start parsing network model
[03/20/2024-06:00:32] [I] [TRT] ----------------------------------------------------------------
[03/20/2024-06:00:32] [I] [TRT] Input filename:   checkpoints/yolov5x/yolov5x.onnx
[03/20/2024-06:00:32] [I] [TRT] ONNX IR version:  0.0.7
[03/20/2024-06:00:32] [I] [TRT] Opset version:    13
[03/20/2024-06:00:32] [I] [TRT] Producer name:    pytorch
[03/20/2024-06:00:32] [I] [TRT] Producer version: 2.0.1
[03/20/2024-06:00:32] [I] [TRT] Domain:           
[03/20/2024-06:00:32] [I] [TRT] Model version:    0
[03/20/2024-06:00:32] [I] [TRT] Doc string:       
[03/20/2024-06:00:32] [I] [TRT] ----------------------------------------------------------------
[03/20/2024-06:00:35] [W] [TRT] parsers/onnx/onnx2trt_utils.cpp:375: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[03/20/2024-06:00:35] [I] Finish parsing network model
[03/20/2024-06:00:38] [I] FP32 and INT8 precisions have been specified - more performance might be enabled by additionally specifying --fp16 or --best
[03/20/2024-06:00:41] [W] [TRT] Calibrator won't be used in explicit precision mode. Use quantization aware training to generate network with Quantize/Dequantize nodes.
[03/20/2024-17:33:14] [I] [TRT] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +6, GPU -21, now: CPU 1416, GPU 25098 (MiB)
[03/20/2024-17:33:19] [I] [TRT] [MemUsageChange] Init cuDNN: CPU +2, GPU +26, now: CPU 1418, GPU 25124 (MiB)
[03/20/2024-17:33:20] [I] [TRT] Local timing cache in use. Profiling results in this builder pass will not be stored.
[03/20/2024-18:11:03] [I] [TRT] Detected 1 inputs and 7 output network tensors.
[03/20/2024-18:14:25] [I] [TRT] Total Host Persistent Memory: 372224
[03/20/2024-18:14:25] [I] [TRT] Total Device Persistent Memory: 0
[03/20/2024-18:14:25] [I] [TRT] Total Scratch Memory: 0
[03/20/2024-18:14:25] [I] [TRT] [MemUsageStats] Peak memory usage of TRT CPU/GPU memory allocators: CPU 418 MiB, GPU 216 MiB
[03/20/2024-18:14:26] [I] [TRT] [BlockAssignment] Algorithm ShiftNTopDown took 45.2042ms to assign 7 blocks to 184 nodes requiring 34969600 bytes.
[03/20/2024-18:14:26] [I] [TRT] Total Activation Memory: 34969600
[03/20/2024-18:14:28] [W] [TRT] TensorRT encountered issues when converting weights between types and that could affect accuracy.
[03/20/2024-18:14:28] [W] [TRT] If this is not the desired behavior, please modify the weights or retrain with regularization to adjust the magnitude of the weights.
[03/20/2024-18:14:28] [W] [TRT] Check verbose logs for the list of affected weights.
[03/20/2024-18:14:28] [W] [TRT] - 25 weights are affected by this issue: Detected values which are outside of int8_t range and clipped them to int8_t range.
[03/20/2024-18:14:28] [I] [TRT] [MemUsageChange] TensorRT-managed allocation in building engine: CPU +83, GPU +84, now: CPU 83, GPU 84 (MiB)
[03/20/2024-18:20:20] [I] Engine built in 44431.5 sec.
[03/20/2024-18:20:26] [I] [TRT] Loaded engine size: 86 MiB
[03/20/2024-18:20:32] [I] [TRT] [MemUsageChange] TensorRT-managed allocation in engine deserialization: CPU +0, GPU +83, now: CPU 0, GPU 83 (MiB)
[03/20/2024-18:20:32] [I] Engine deserialized in 5.81573 sec.
[03/20/2024-18:20:35] [I] [TRT] [MemUsageChange] TensorRT-managed allocation in IExecutionContext creation: CPU +1, GPU +34, now: CPU 1, GPU 117 (MiB)
[03/20/2024-18:20:35] [I] Setting persistentCacheLimit to 0 bytes.
[03/20/2024-18:20:35] [I] Using random values for input images
[03/20/2024-18:20:36] [I] Created input binding for images with dimensions 1x3x640x640
[03/20/2024-18:20:36] [I] Using random values for output onnx::Sigmoid_2177
[03/20/2024-18:20:37] [I] Created output binding for onnx::Sigmoid_2177 with dimensions 1x3x80x80x85
[03/20/2024-18:20:37] [I] Using random values for output onnx::Sigmoid_2229
[03/20/2024-18:20:38] [I] Created output binding for onnx::Sigmoid_2229 with dimensions 1x3x40x40x85
[03/20/2024-18:20:38] [I] Using random values for output onnx::Sigmoid_2281
[03/20/2024-18:20:38] [I] Created output binding for onnx::Sigmoid_2281 with dimensions 1x3x20x20x85
[03/20/2024-18:20:38] [I] Using random values for output outputs
[03/20/2024-18:20:38] [I] Created output binding for outputs with dimensions 1x25200x85
[03/20/2024-18:20:38] [I] Starting inference
[03/20/2024-18:20:41] [I] Warmup completed 82 queries over 200 ms
[03/20/2024-18:20:41] [I] Timing trace has 1239 queries over 3.00919 s
[03/20/2024-18:20:41] [I] 
[03/20/2024-18:20:41] [I] === Trace details ===
[03/20/2024-18:20:41] [I] Trace averages of 10 runs:
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42463 ms - Host latency: 3.67126 ms (enqueue 0.945071 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42299 ms - Host latency: 3.67298 ms (enqueue 1.04116 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.43364 ms - Host latency: 3.68269 ms (enqueue 0.956358 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.436 ms - Host latency: 3.6895 ms (enqueue 0.948724 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42319 ms - Host latency: 3.6725 ms (enqueue 0.954166 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42237 ms - Host latency: 3.67505 ms (enqueue 0.967197 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.43375 ms - Host latency: 3.67804 ms (enqueue 0.954947 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42954 ms - Host latency: 3.67611 ms (enqueue 0.972562 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42268 ms - Host latency: 3.67251 ms (enqueue 0.950684 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42064 ms - Host latency: 3.66315 ms (enqueue 0.958334 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42104 ms - Host latency: 3.6598 ms (enqueue 0.942892 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.4232 ms - Host latency: 3.66109 ms (enqueue 0.94252 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42339 ms - Host latency: 3.66243 ms (enqueue 0.94342 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42147 ms - Host latency: 3.66012 ms (enqueue 0.945453 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42648 ms - Host latency: 3.66543 ms (enqueue 0.945605 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42227 ms - Host latency: 3.66091 ms (enqueue 0.942908 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42288 ms - Host latency: 3.66246 ms (enqueue 0.946649 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42267 ms - Host latency: 3.66882 ms (enqueue 0.947675 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42556 ms - Host latency: 3.6711 ms (enqueue 0.948084 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42576 ms - Host latency: 3.6694 ms (enqueue 1.19636 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42053 ms - Host latency: 3.66154 ms (enqueue 0.956055 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42329 ms - Host latency: 3.67522 ms (enqueue 0.951782 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42267 ms - Host latency: 3.66431 ms (enqueue 0.946771 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42607 ms - Host latency: 3.66413 ms (enqueue 0.940643 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.41919 ms - Host latency: 3.65739 ms (enqueue 0.941699 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42064 ms - Host latency: 3.65871 ms (enqueue 0.941052 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.4236 ms - Host latency: 3.66242 ms (enqueue 0.942377 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42146 ms - Host latency: 3.65906 ms (enqueue 0.943616 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42391 ms - Host latency: 3.66592 ms (enqueue 0.94458 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42299 ms - Host latency: 3.66309 ms (enqueue 0.949152 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42217 ms - Host latency: 3.6623 ms (enqueue 0.950781 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42043 ms - Host latency: 3.65912 ms (enqueue 0.965393 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42288 ms - Host latency: 3.66077 ms (enqueue 0.941821 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42422 ms - Host latency: 3.66156 ms (enqueue 0.939288 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42157 ms - Host latency: 3.65942 ms (enqueue 0.946912 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.41923 ms - Host latency: 3.65651 ms (enqueue 0.942017 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.41962 ms - Host latency: 3.65717 ms (enqueue 0.943994 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42388 ms - Host latency: 3.66847 ms (enqueue 1.00292 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42343 ms - Host latency: 3.66206 ms (enqueue 0.913086 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42021 ms - Host latency: 3.6624 ms (enqueue 0.91488 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42246 ms - Host latency: 3.67434 ms (enqueue 1.00906 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42062 ms - Host latency: 3.65934 ms (enqueue 0.919775 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42391 ms - Host latency: 3.66812 ms (enqueue 0.916284 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42493 ms - Host latency: 3.66461 ms (enqueue 0.911426 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.43384 ms - Host latency: 3.67771 ms (enqueue 0.915234 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42354 ms - Host latency: 3.66333 ms (enqueue 0.913477 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42145 ms - Host latency: 3.66732 ms (enqueue 0.9245 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42568 ms - Host latency: 3.66515 ms (enqueue 0.917236 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.4236 ms - Host latency: 3.66152 ms (enqueue 0.911279 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42297 ms - Host latency: 3.66169 ms (enqueue 0.90813 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.41912 ms - Host latency: 3.65688 ms (enqueue 0.908411 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42267 ms - Host latency: 3.66293 ms (enqueue 0.986707 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42141 ms - Host latency: 3.6601 ms (enqueue 0.938635 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42107 ms - Host latency: 3.65969 ms (enqueue 0.921899 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42311 ms - Host latency: 3.66389 ms (enqueue 1.01439 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42207 ms - Host latency: 3.6714 ms (enqueue 1.00046 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42186 ms - Host latency: 3.67098 ms (enqueue 0.940332 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42271 ms - Host latency: 3.65981 ms (enqueue 0.934937 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42104 ms - Host latency: 3.65934 ms (enqueue 0.932422 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42106 ms - Host latency: 3.6592 ms (enqueue 0.934497 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42246 ms - Host latency: 3.66042 ms (enqueue 0.937061 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.41971 ms - Host latency: 3.65696 ms (enqueue 0.936389 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42206 ms - Host latency: 3.66028 ms (enqueue 0.932471 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42167 ms - Host latency: 3.65941 ms (enqueue 0.934399 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42402 ms - Host latency: 3.66168 ms (enqueue 0.936987 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42183 ms - Host latency: 3.66056 ms (enqueue 0.938965 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.4229 ms - Host latency: 3.66044 ms (enqueue 0.936267 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42228 ms - Host latency: 3.65992 ms (enqueue 0.936157 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42115 ms - Host latency: 3.6614 ms (enqueue 0.935718 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42229 ms - Host latency: 3.66204 ms (enqueue 0.9771 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42228 ms - Host latency: 3.65969 ms (enqueue 0.935083 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.41992 ms - Host latency: 3.65706 ms (enqueue 0.935071 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42136 ms - Host latency: 3.65869 ms (enqueue 0.93512 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.41971 ms - Host latency: 3.6576 ms (enqueue 0.934998 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42156 ms - Host latency: 3.66012 ms (enqueue 0.936743 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42523 ms - Host latency: 3.66317 ms (enqueue 0.933447 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42057 ms - Host latency: 3.65875 ms (enqueue 0.934448 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42085 ms - Host latency: 3.65979 ms (enqueue 0.937305 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42146 ms - Host latency: 3.65918 ms (enqueue 0.936401 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42334 ms - Host latency: 3.66221 ms (enqueue 0.938501 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42651 ms - Host latency: 3.66423 ms (enqueue 0.935229 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.43682 ms - Host latency: 3.67505 ms (enqueue 0.934497 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.41895 ms - Host latency: 3.65618 ms (enqueue 0.93313 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.41904 ms - Host latency: 3.65657 ms (enqueue 0.94397 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42024 ms - Host latency: 3.66345 ms (enqueue 0.938745 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42075 ms - Host latency: 3.66128 ms (enqueue 0.939771 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.43738 ms - Host latency: 3.68162 ms (enqueue 0.951147 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.41826 ms - Host latency: 3.65764 ms (enqueue 0.9375 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.41931 ms - Host latency: 3.65972 ms (enqueue 0.948682 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42058 ms - Host latency: 3.65789 ms (enqueue 0.936865 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.41924 ms - Host latency: 3.65674 ms (enqueue 0.937549 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.41836 ms - Host latency: 3.65586 ms (enqueue 0.937134 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.41853 ms - Host latency: 3.6564 ms (enqueue 0.936279 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42043 ms - Host latency: 3.65774 ms (enqueue 0.938257 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.41963 ms - Host latency: 3.65896 ms (enqueue 0.941895 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42056 ms - Host latency: 3.66707 ms (enqueue 1.01245 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.41853 ms - Host latency: 3.6614 ms (enqueue 0.958081 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42051 ms - Host latency: 3.66458 ms (enqueue 1.04944 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42004 ms - Host latency: 3.66091 ms (enqueue 0.931348 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42075 ms - Host latency: 3.66101 ms (enqueue 0.930615 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.41948 ms - Host latency: 3.66685 ms (enqueue 0.936377 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42754 ms - Host latency: 3.67422 ms (enqueue 0.940063 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42412 ms - Host latency: 3.67114 ms (enqueue 0.937329 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42029 ms - Host latency: 3.66443 ms (enqueue 0.936279 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.41902 ms - Host latency: 3.67385 ms (enqueue 1.0353 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42273 ms - Host latency: 3.67058 ms (enqueue 0.9573 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.41768 ms - Host latency: 3.66528 ms (enqueue 0.940088 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42485 ms - Host latency: 3.6843 ms (enqueue 1.11968 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.4313 ms - Host latency: 3.6865 ms (enqueue 1.58699 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42156 ms - Host latency: 3.68008 ms (enqueue 1.09146 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42075 ms - Host latency: 3.6707 ms (enqueue 0.981885 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42014 ms - Host latency: 3.65872 ms (enqueue 0.933691 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42341 ms - Host latency: 3.68196 ms (enqueue 0.946704 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42031 ms - Host latency: 3.66436 ms (enqueue 0.951001 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42014 ms - Host latency: 3.6584 ms (enqueue 0.936011 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.41819 ms - Host latency: 3.65571 ms (enqueue 0.936182 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.41885 ms - Host latency: 3.65647 ms (enqueue 0.933203 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42004 ms - Host latency: 3.65845 ms (enqueue 0.93501 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.41921 ms - Host latency: 3.65681 ms (enqueue 0.936646 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.4207 ms - Host latency: 3.65876 ms (enqueue 0.933789 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42468 ms - Host latency: 3.66323 ms (enqueue 0.934644 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42092 ms - Host latency: 3.65869 ms (enqueue 0.93501 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.41921 ms - Host latency: 3.66094 ms (enqueue 0.93877 ms)
[03/20/2024-18:20:41] [I] 
[03/20/2024-18:20:41] [I] === Performance summary ===
[03/20/2024-18:20:41] [I] Throughput: 411.738 qps
[03/20/2024-18:20:41] [I] Latency: min = 3.6438 ms, max = 3.81909 ms, mean = 3.66416 ms, median = 3.66025 ms, percentile(90%) = 3.67578 ms, percentile(95%) = 3.68628 ms, percentile(99%) = 3.71466 ms
[03/20/2024-18:20:41] [I] Enqueue Time: min = 0.897339 ms, max = 3.44153 ms, mean = 0.954946 ms, median = 0.938965 ms, percentile(90%) = 0.954285 ms, percentile(95%) = 0.984131 ms, percentile(99%) = 1.58911 ms
[03/20/2024-18:20:41] [I] H2D Latency: min = 0.245483 ms, max = 0.280029 ms, mean = 0.247243 ms, median = 0.246582 ms, percentile(90%) = 0.249084 ms, percentile(95%) = 0.250122 ms, percentile(99%) = 0.254517 ms
[03/20/2024-18:20:41] [I] GPU Compute Time: min = 2.41064 ms, max = 2.57324 ms, mean = 2.4225 ms, median = 2.42072 ms, percentile(90%) = 2.42993 ms, percentile(95%) = 2.43298 ms, percentile(99%) = 2.44116 ms
[03/20/2024-18:20:41] [I] D2H Latency: min = 0.986084 ms, max = 1.06665 ms, mean = 0.994418 ms, median = 0.991882 ms, percentile(90%) = 1.00171 ms, percentile(95%) = 1.01141 ms, percentile(99%) = 1.03223 ms
[03/20/2024-18:20:41] [I] Total Host Walltime: 3.00919 s
[03/20/2024-18:20:41] [I] Total GPU Compute Time: 3.00148 s
[03/20/2024-18:20:41] [I] Explanations of the performance metrics are printed in the verbose logs.
[03/20/2024-18:20:41] [I] 
&&&& PASSED TensorRT.trtexec [TensorRT v8503] # trtexec --onnx=checkpoints/yolov5x/yolov5x.onnx --saveEngine=checkpoints/yolov5x/yolov5x.trt --int8
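
The figures in the performance summary above can be cross-checked against each other. The sketch below is not part of trtexec; it is a small, hypothetical helper (`parse_summary` and `LOG_EXCERPT` are names invented here) that parses a few lines copied from this log, trimmed after the `mean` field, and recomputes two quantities: throughput as queries divided by host walltime, and mean host latency as the sum of the mean H2D, GPU compute, and D2H times. It assumes the line layout shown in this TensorRT 8.5.3 log; other versions may format these lines differently.

```python
import re

# Lines copied from the log above, trimmed after the "mean" field.
LOG_EXCERPT = """\
[03/20/2024-18:20:41] [I] Timing trace has 1239 queries over 3.00919 s
[03/20/2024-18:20:41] [I] Throughput: 411.738 qps
[03/20/2024-18:20:41] [I] Latency: min = 3.6438 ms, max = 3.81909 ms, mean = 3.66416 ms
[03/20/2024-18:20:41] [I] H2D Latency: min = 0.245483 ms, max = 0.280029 ms, mean = 0.247243 ms
[03/20/2024-18:20:41] [I] GPU Compute Time: min = 2.41064 ms, max = 2.57324 ms, mean = 2.4225 ms
[03/20/2024-18:20:41] [I] D2H Latency: min = 0.986084 ms, max = 1.06665 ms, mean = 0.994418 ms
"""

# Matches metric lines such as "[I] H2D Latency: min = ... ms, max = ... ms, mean = ... ms".
METRIC_RE = re.compile(
    r"\[I\] ([\w ]+): min = ([\d.]+) ms, max = ([\d.]+) ms, mean = ([\d.]+) ms"
)


def parse_summary(text):
    """Extract query count, walltime, throughput, and per-metric means (ms)."""
    trace = re.search(r"Timing trace has (\d+) queries over ([\d.]+) s", text)
    qps = re.search(r"Throughput: ([\d.]+) qps", text)
    means = {name: float(mean) for name, _min, _max, mean in METRIC_RE.findall(text)}
    return {
        "queries": int(trace.group(1)),
        "walltime_s": float(trace.group(2)),
        "throughput_qps": float(qps.group(1)),
        "mean_ms": means,
    }


summary = parse_summary(LOG_EXCERPT)

# Throughput recomputed from the timing trace line.
recomputed_qps = summary["queries"] / summary["walltime_s"]

# Mean host latency recomputed as H2D + GPU compute + D2H.
recomputed_latency_ms = (
    summary["mean_ms"]["H2D Latency"]
    + summary["mean_ms"]["GPU Compute Time"]
    + summary["mean_ms"]["D2H Latency"]
)

print(f"reported qps: {summary['throughput_qps']}, recomputed: {recomputed_qps:.3f}")
print(f"reported latency: {summary['mean_ms']['Latency']} ms, "
      f"recomputed: {recomputed_latency_ms:.5f} ms")
```

Both recomputed values agree with the reported ones to within rounding. Note that the reported throughput is higher than the reciprocal of the mean host latency and closer to the reciprocal of the mean GPU compute time, which is consistent with data transfers overlapping compute across consecutive queries. Also note that the benchmark used random input values, so these numbers describe speed only and say nothing about detection accuracy of the INT8 engine.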