nohup: ignoring input
&&&& RUNNING TensorRT.trtexec [TensorRT v8503] # trtexec --onnx=checkpoints/yolov5x/yolov5x.onnx --saveEngine=checkpoints/yolov5x/yolov5x.trt --int8
[03/20/2024-05:59:19] [I] === Model Options ===
[03/20/2024-05:59:19] [I] Format: ONNX
[03/20/2024-05:59:19] [I] Model: checkpoints/yolov5x/yolov5x.onnx
[03/20/2024-05:59:19] [I] Output:
[03/20/2024-05:59:19] [I] === Build Options ===
[03/20/2024-05:59:19] [I] Max batch: explicit batch
[03/20/2024-05:59:19] [I] Memory Pools: workspace: default, dlaSRAM: default, dlaLocalDRAM: default, dlaGlobalDRAM: default
[03/20/2024-05:59:19] [I] minTiming: 1
[03/20/2024-05:59:19] [I] avgTiming: 8
[03/20/2024-05:59:19] [I] Precision: FP32+INT8
[03/20/2024-05:59:19] [I] LayerPrecisions:
[03/20/2024-05:59:19] [I] Calibration: Dynamic
[03/20/2024-05:59:19] [I] Refit: Disabled
[03/20/2024-05:59:19] [I] Sparsity: Disabled
[03/20/2024-05:59:19] [I] Safe mode: Disabled
[03/20/2024-05:59:19] [I] DirectIO mode: Disabled
[03/20/2024-05:59:19] [I] Restricted mode: Disabled
[03/20/2024-05:59:19] [I] Build only: Disabled
[03/20/2024-05:59:19] [I] Save engine: checkpoints/yolov5x/yolov5x.trt
[03/20/2024-05:59:19] [I] Load engine:
[03/20/2024-05:59:19] [I] Profiling verbosity: 0
[03/20/2024-05:59:19] [I] Tactic sources: Using default tactic sources
[03/20/2024-05:59:19] [I] timingCacheMode: local
[03/20/2024-05:59:19] [I] timingCacheFile:
[03/20/2024-05:59:19] [I] Heuristic: Disabled
[03/20/2024-05:59:19] [I] Preview Features: Use default preview flags.
[03/20/2024-05:59:19] [I] Input(s)s format: fp32:CHW
[03/20/2024-05:59:19] [I] Output(s)s format: fp32:CHW
[03/20/2024-05:59:19] [I] Input build shapes: model
[03/20/2024-05:59:19] [I] Input calibration shapes: model
[03/20/2024-05:59:19] [I] === System Options ===
[03/20/2024-05:59:19] [I] Device: 0
[03/20/2024-05:59:19] [I] DLACore:
[03/20/2024-05:59:19] [I] Plugins:
[03/20/2024-05:59:19] [I] === Inference Options ===
[03/20/2024-05:59:19] [I] Batch: Explicit
[03/20/2024-05:59:19] [I] Input inference shapes: model
[03/20/2024-05:59:19] [I] Iterations: 10
[03/20/2024-05:59:19] [I] Duration: 3s (+ 200ms warm up)
[03/20/2024-05:59:19] [I] Sleep time: 0ms
[03/20/2024-05:59:19] [I] Idle time: 0ms
[03/20/2024-05:59:19] [I] Streams: 1
[03/20/2024-05:59:19] [I] ExposeDMA: Disabled
[03/20/2024-05:59:19] [I] Data transfers: Enabled
[03/20/2024-05:59:19] [I] Spin-wait: Disabled
[03/20/2024-05:59:19] [I] Multithreading: Disabled
[03/20/2024-05:59:19] [I] CUDA Graph: Disabled
[03/20/2024-05:59:19] [I] Separate profiling: Disabled
[03/20/2024-05:59:19] [I] Time Deserialize: Disabled
[03/20/2024-05:59:19] [I] Time Refit: Disabled
[03/20/2024-05:59:19] [I] NVTX verbosity: 0
[03/20/2024-05:59:19] [I] Persistent Cache Ratio: 0
[03/20/2024-05:59:19] [I] Inputs:
[03/20/2024-05:59:19] [I] === Reporting Options ===
[03/20/2024-05:59:19] [I] Verbose: Disabled
[03/20/2024-05:59:19] [I] Averages: 10 inferences
[03/20/2024-05:59:19] [I] Percentiles: 90,95,99
[03/20/2024-05:59:19] [I] Dump refittable layers:Disabled
[03/20/2024-05:59:19] [I] Dump output: Disabled
[03/20/2024-05:59:19] [I] Profile: Disabled
[03/20/2024-05:59:19] [I] Export timing to JSON file:
[03/20/2024-05:59:19] [I] Export output to JSON file:
[03/20/2024-05:59:19] [I] Export profile to JSON file:
[03/20/2024-05:59:19] [I]
[03/20/2024-05:59:48] [I] === Device Information ===
[03/20/2024-05:59:48] [I] Selected Device: NVIDIA A800 80GB PCIe
[03/20/2024-05:59:48] [I] Compute Capability: 8.0
[03/20/2024-05:59:48] [I] SMs: 108
[03/20/2024-05:59:48] [I] Compute Clock Rate: 1.41 GHz
[03/20/2024-05:59:48] [I] Device Global Memory: 81050 MiB
[03/20/2024-05:59:48] [I] Shared Memory per SM: 164 KiB
[03/20/2024-05:59:48] [I] Memory Bus Width: 5120 bits (ECC enabled)
[03/20/2024-05:59:48] [I] Memory Clock Rate: 1.512 GHz
[03/20/2024-05:59:48] [I]
[03/20/2024-05:59:48] [I] TensorRT version: 8.5.3
[03/20/2024-06:00:17] [I] [TRT] [MemUsageChange] Init CUDA: CPU +14, GPU +0, now: CPU 27, GPU 24558 (MiB)
[03/20/2024-06:00:32] [I] [TRT] [MemUsageChange] Init builder kernel library: CPU +661, GPU +166, now: CPU 740, GPU 24724 (MiB)
[03/20/2024-06:00:32] [I] Start parsing network model
[03/20/2024-06:00:32] [I] [TRT] ----------------------------------------------------------------
[03/20/2024-06:00:32] [I] [TRT] Input filename: checkpoints/yolov5x/yolov5x.onnx
[03/20/2024-06:00:32] [I] [TRT] ONNX IR version: 0.0.7
[03/20/2024-06:00:32] [I] [TRT] Opset version: 13
[03/20/2024-06:00:32] [I] [TRT] Producer name: pytorch
[03/20/2024-06:00:32] [I] [TRT] Producer version: 2.0.1
[03/20/2024-06:00:32] [I] [TRT] Domain:
[03/20/2024-06:00:32] [I] [TRT] Model version: 0
[03/20/2024-06:00:32] [I] [TRT] Doc string:
[03/20/2024-06:00:32] [I] [TRT] ----------------------------------------------------------------
[03/20/2024-06:00:35] [W] [TRT] parsers/onnx/onnx2trt_utils.cpp:375: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[03/20/2024-06:00:35] [I] Finish parsing network model
[03/20/2024-06:00:38] [I] FP32 and INT8 precisions have been specified - more performance might be enabled by additionally specifying --fp16 or --best
[03/20/2024-06:00:41] [W] [TRT] Calibrator won't be used in explicit precision mode. Use quantization aware training to generate network with Quantize/Dequantize nodes.
[03/20/2024-17:33:14] [I] [TRT] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +6, GPU -21, now: CPU 1416, GPU 25098 (MiB)
[03/20/2024-17:33:19] [I] [TRT] [MemUsageChange] Init cuDNN: CPU +2, GPU +26, now: CPU 1418, GPU 25124 (MiB)
[03/20/2024-17:33:20] [I] [TRT] Local timing cache in use. Profiling results in this builder pass will not be stored.
[03/20/2024-18:11:03] [I] [TRT] Detected 1 inputs and 7 output network tensors.
[03/20/2024-18:14:25] [I] [TRT] Total Host Persistent Memory: 372224
[03/20/2024-18:14:25] [I] [TRT] Total Device Persistent Memory: 0
[03/20/2024-18:14:25] [I] [TRT] Total Scratch Memory: 0
[03/20/2024-18:14:25] [I] [TRT] [MemUsageStats] Peak memory usage of TRT CPU/GPU memory allocators: CPU 418 MiB, GPU 216 MiB
[03/20/2024-18:14:26] [I] [TRT] [BlockAssignment] Algorithm ShiftNTopDown took 45.2042ms to assign 7 blocks to 184 nodes requiring 34969600 bytes.
[03/20/2024-18:14:26] [I] [TRT] Total Activation Memory: 34969600
[03/20/2024-18:14:28] [W] [TRT] TensorRT encountered issues when converting weights between types and that could affect accuracy.
[03/20/2024-18:14:28] [W] [TRT] If this is not the desired behavior, please modify the weights or retrain with regularization to adjust the magnitude of the weights.
[03/20/2024-18:14:28] [W] [TRT] Check verbose logs for the list of affected weights.
[03/20/2024-18:14:28] [W] [TRT] - 25 weights are affected by this issue: Detected values which are outside of int8_t range and clipped them to int8_t range.
[03/20/2024-18:14:28] [I] [TRT] [MemUsageChange] TensorRT-managed allocation in building engine: CPU +83, GPU +84, now: CPU 83, GPU 84 (MiB)
[03/20/2024-18:20:20] [I] Engine built in 44431.5 sec.
[03/20/2024-18:20:26] [I] [TRT] Loaded engine size: 86 MiB
[03/20/2024-18:20:32] [I] [TRT] [MemUsageChange] TensorRT-managed allocation in engine deserialization: CPU +0, GPU +83, now: CPU 0, GPU 83 (MiB)
[03/20/2024-18:20:32] [I] Engine deserialized in 5.81573 sec.
[03/20/2024-18:20:35] [I] [TRT] [MemUsageChange] TensorRT-managed allocation in IExecutionContext creation: CPU +1, GPU +34, now: CPU 1, GPU 117 (MiB)
[03/20/2024-18:20:35] [I] Setting persistentCacheLimit to 0 bytes.
[03/20/2024-18:20:35] [I] Using random values for input images
[03/20/2024-18:20:36] [I] Created input binding for images with dimensions 1x3x640x640
[03/20/2024-18:20:36] [I] Using random values for output onnx::Sigmoid_2177
[03/20/2024-18:20:37] [I] Created output binding for onnx::Sigmoid_2177 with dimensions 1x3x80x80x85
[03/20/2024-18:20:37] [I] Using random values for output onnx::Sigmoid_2229
[03/20/2024-18:20:38] [I] Created output binding for onnx::Sigmoid_2229 with dimensions 1x3x40x40x85
[03/20/2024-18:20:38] [I] Using random values for output onnx::Sigmoid_2281
[03/20/2024-18:20:38] [I] Created output binding for onnx::Sigmoid_2281 with dimensions 1x3x20x20x85
[03/20/2024-18:20:38] [I] Using random values for output outputs
[03/20/2024-18:20:38] [I] Created output binding for outputs with dimensions 1x25200x85
[03/20/2024-18:20:38] [I] Starting inference
[03/20/2024-18:20:41] [I] Warmup completed 82 queries over 200 ms
[03/20/2024-18:20:41] [I] Timing trace has 1239 queries over 3.00919 s
[03/20/2024-18:20:41] [I]
[03/20/2024-18:20:41] [I] === Trace details ===
[03/20/2024-18:20:41] [I] Trace averages of 10 runs:
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42463 ms - Host latency: 3.67126 ms (enqueue 0.945071 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42299 ms - Host latency: 3.67298 ms (enqueue 1.04116 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.43364 ms - Host latency: 3.68269 ms (enqueue 0.956358 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.436 ms - Host latency: 3.6895 ms (enqueue 0.948724 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42319 ms - Host latency: 3.6725 ms (enqueue 0.954166 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42237 ms - Host latency: 3.67505 ms (enqueue 0.967197 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.43375 ms - Host latency: 3.67804 ms (enqueue 0.954947 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42954 ms - Host latency: 3.67611 ms (enqueue 0.972562 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42268 ms - Host latency: 3.67251 ms (enqueue 0.950684 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42064 ms - Host latency: 3.66315 ms (enqueue 0.958334 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42104 ms - Host latency: 3.6598 ms (enqueue 0.942892 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.4232 ms - Host latency: 3.66109 ms (enqueue 0.94252 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42339 ms - Host latency: 3.66243 ms (enqueue 0.94342 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42147 ms - Host latency: 3.66012 ms (enqueue 0.945453 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42648 ms - Host latency: 3.66543 ms (enqueue 0.945605 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42227 ms - Host latency: 3.66091 ms (enqueue 0.942908 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42288 ms - Host latency: 3.66246 ms (enqueue 0.946649 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42267 ms - Host latency: 3.66882 ms (enqueue 0.947675 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42556 ms - Host latency: 3.6711 ms (enqueue 0.948084 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42576 ms - Host latency: 3.6694 ms (enqueue 1.19636 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42053 ms - Host latency: 3.66154 ms (enqueue 0.956055 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42329 ms - Host latency: 3.67522 ms (enqueue 0.951782 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42267 ms - Host latency: 3.66431 ms (enqueue 0.946771 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42607 ms - Host latency: 3.66413 ms (enqueue 0.940643 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.41919 ms - Host latency: 3.65739 ms (enqueue 0.941699 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42064 ms - Host latency: 3.65871 ms (enqueue 0.941052 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.4236 ms - Host latency: 3.66242 ms (enqueue 0.942377 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42146 ms - Host latency: 3.65906 ms (enqueue 0.943616 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42391 ms - Host latency: 3.66592 ms (enqueue 0.94458 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42299 ms - Host latency: 3.66309 ms (enqueue 0.949152 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42217 ms - Host latency: 3.6623 ms (enqueue 0.950781 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42043 ms - Host latency: 3.65912 ms (enqueue 0.965393 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42288 ms - Host latency: 3.66077 ms (enqueue 0.941821 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42422 ms - Host latency: 3.66156 ms (enqueue 0.939288 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42157 ms - Host latency: 3.65942 ms (enqueue 0.946912 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.41923 ms - Host latency: 3.65651 ms (enqueue 0.942017 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.41962 ms - Host latency: 3.65717 ms (enqueue 0.943994 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42388 ms - Host latency: 3.66847 ms (enqueue 1.00292 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42343 ms - Host latency: 3.66206 ms (enqueue 0.913086 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42021 ms - Host latency: 3.6624 ms (enqueue 0.91488 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42246 ms - Host latency: 3.67434 ms (enqueue 1.00906 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42062 ms - Host latency: 3.65934 ms (enqueue 0.919775 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42391 ms - Host latency: 3.66812 ms (enqueue 0.916284 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42493 ms - Host latency: 3.66461 ms (enqueue 0.911426 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.43384 ms - Host latency: 3.67771 ms (enqueue 0.915234 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42354 ms - Host latency: 3.66333 ms (enqueue 0.913477 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42145 ms - Host latency: 3.66732 ms (enqueue 0.9245 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42568 ms - Host latency: 3.66515 ms (enqueue 0.917236 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.4236 ms - Host latency: 3.66152 ms (enqueue 0.911279 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42297 ms - Host latency: 3.66169 ms (enqueue 0.90813 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.41912 ms - Host latency: 3.65688 ms (enqueue 0.908411 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42267 ms - Host latency: 3.66293 ms (enqueue 0.986707 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42141 ms - Host latency: 3.6601 ms (enqueue 0.938635 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42107 ms - Host latency: 3.65969 ms (enqueue 0.921899 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42311 ms - Host latency: 3.66389 ms (enqueue 1.01439 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42207 ms - Host latency: 3.6714 ms (enqueue 1.00046 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42186 ms - Host latency: 3.67098 ms (enqueue 0.940332 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42271 ms - Host latency: 3.65981 ms (enqueue 0.934937 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42104 ms - Host latency: 3.65934 ms (enqueue 0.932422 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42106 ms - Host latency: 3.6592 ms (enqueue 0.934497 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42246 ms - Host latency: 3.66042 ms (enqueue 0.937061 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.41971 ms - Host latency: 3.65696 ms (enqueue 0.936389 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42206 ms - Host latency: 3.66028 ms (enqueue 0.932471 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42167 ms - Host latency: 3.65941 ms (enqueue 0.934399 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42402 ms - Host latency: 3.66168 ms (enqueue 0.936987 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42183 ms - Host latency: 3.66056 ms (enqueue 0.938965 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.4229 ms - Host latency: 3.66044 ms (enqueue 0.936267 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42228 ms - Host latency: 3.65992 ms (enqueue 0.936157 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42115 ms - Host latency: 3.6614 ms (enqueue 0.935718 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42229 ms - Host latency: 3.66204 ms (enqueue 0.9771 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42228 ms - Host latency: 3.65969 ms (enqueue 0.935083 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.41992 ms - Host latency: 3.65706 ms (enqueue 0.935071 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42136 ms - Host latency: 3.65869 ms (enqueue 0.93512 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.41971 ms - Host latency: 3.6576 ms (enqueue 0.934998 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42156 ms - Host latency: 3.66012 ms (enqueue 0.936743 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42523 ms - Host latency: 3.66317 ms (enqueue 0.933447 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42057 ms - Host latency: 3.65875 ms (enqueue 0.934448 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42085 ms - Host latency: 3.65979 ms (enqueue 0.937305 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42146 ms - Host latency: 3.65918 ms (enqueue 0.936401 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42334 ms - Host latency: 3.66221 ms (enqueue 0.938501 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42651 ms - Host latency: 3.66423 ms (enqueue 0.935229 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.43682 ms - Host latency: 3.67505 ms (enqueue 0.934497 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.41895 ms - Host latency: 3.65618 ms (enqueue 0.93313 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.41904 ms - Host latency: 3.65657 ms (enqueue 0.94397 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42024 ms - Host latency: 3.66345 ms (enqueue 0.938745 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42075 ms - Host latency: 3.66128 ms (enqueue 0.939771 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.43738 ms - Host latency: 3.68162 ms (enqueue 0.951147 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.41826 ms - Host latency: 3.65764 ms (enqueue 0.9375 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.41931 ms - Host latency: 3.65972 ms (enqueue 0.948682 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42058 ms - Host latency: 3.65789 ms (enqueue 0.936865 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.41924 ms - Host latency: 3.65674 ms (enqueue 0.937549 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.41836 ms - Host latency: 3.65586 ms (enqueue 0.937134 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.41853 ms - Host latency: 3.6564 ms (enqueue 0.936279 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42043 ms - Host latency: 3.65774 ms (enqueue 0.938257 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.41963 ms - Host latency: 3.65896 ms (enqueue 0.941895 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42056 ms - Host latency: 3.66707 ms (enqueue 1.01245 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.41853 ms - Host latency: 3.6614 ms (enqueue 0.958081 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42051 ms - Host latency: 3.66458 ms (enqueue 1.04944 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42004 ms - Host latency: 3.66091 ms (enqueue 0.931348 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42075 ms - Host latency: 3.66101 ms (enqueue 0.930615 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.41948 ms - Host latency: 3.66685 ms (enqueue 0.936377 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42754 ms - Host latency: 3.67422 ms (enqueue 0.940063 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42412 ms - Host latency: 3.67114 ms (enqueue 0.937329 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42029 ms - Host latency: 3.66443 ms (enqueue 0.936279 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.41902 ms - Host latency: 3.67385 ms (enqueue 1.0353 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42273 ms - Host latency: 3.67058 ms (enqueue 0.9573 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.41768 ms - Host latency: 3.66528 ms (enqueue 0.940088 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42485 ms - Host latency: 3.6843 ms (enqueue 1.11968 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.4313 ms - Host latency: 3.6865 ms (enqueue 1.58699 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42156 ms - Host latency: 3.68008 ms (enqueue 1.09146 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42075 ms - Host latency: 3.6707 ms (enqueue 0.981885 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42014 ms - Host latency: 3.65872 ms (enqueue 0.933691 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42341 ms - Host latency: 3.68196 ms (enqueue 0.946704 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42031 ms - Host latency: 3.66436 ms (enqueue 0.951001 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42014 ms - Host latency: 3.6584 ms (enqueue 0.936011 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.41819 ms - Host latency: 3.65571 ms (enqueue 0.936182 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.41885 ms - Host latency: 3.65647 ms (enqueue 0.933203 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42004 ms - Host latency: 3.65845 ms (enqueue 0.93501 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.41921 ms - Host latency: 3.65681 ms (enqueue 0.936646 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.4207 ms - Host latency: 3.65876 ms (enqueue 0.933789 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42468 ms - Host latency: 3.66323 ms (enqueue 0.934644 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.42092 ms - Host latency: 3.65869 ms (enqueue 0.93501 ms)
[03/20/2024-18:20:41] [I] Average on 10 runs - GPU latency: 2.41921 ms - Host latency: 3.66094 ms (enqueue 0.93877 ms)
[03/20/2024-18:20:41] [I]
[03/20/2024-18:20:41] [I] === Performance summary ===
[03/20/2024-18:20:41] [I] Throughput: 411.738 qps
[03/20/2024-18:20:41] [I] Latency: min = 3.6438 ms, max = 3.81909 ms, mean = 3.66416 ms, median = 3.66025 ms, percentile(90%) = 3.67578 ms, percentile(95%) = 3.68628 ms, percentile(99%) = 3.71466 ms
[03/20/2024-18:20:41] [I] Enqueue Time: min = 0.897339 ms, max = 3.44153 ms, mean = 0.954946 ms, median = 0.938965 ms, percentile(90%) = 0.954285 ms, percentile(95%) = 0.984131 ms, percentile(99%) = 1.58911 ms
[03/20/2024-18:20:41] [I] H2D Latency: min = 0.245483 ms, max = 0.280029 ms, mean = 0.247243 ms, median = 0.246582 ms, percentile(90%) = 0.249084 ms, percentile(95%) = 0.250122 ms, percentile(99%) = 0.254517 ms
[03/20/2024-18:20:41] [I] GPU Compute Time: min = 2.41064 ms, max = 2.57324 ms, mean = 2.4225 ms, median = 2.42072 ms, percentile(90%) = 2.42993 ms, percentile(95%) = 2.43298 ms, percentile(99%) = 2.44116 ms
[03/20/2024-18:20:41] [I] D2H Latency: min = 0.986084 ms, max = 1.06665 ms, mean = 0.994418 ms, median = 0.991882 ms, percentile(90%) = 1.00171 ms, percentile(95%) = 1.01141 ms, percentile(99%) = 1.03223 ms
[03/20/2024-18:20:41] [I] Total Host Walltime: 3.00919 s
[03/20/2024-18:20:41] [I] Total GPU Compute Time: 3.00148 s
[03/20/2024-18:20:41] [I] Explanations of the performance metrics are printed in the verbose logs.
[03/20/2024-18:20:41] [I] &&&& PASSED TensorRT.trtexec [TensorRT v8503] # trtexec --onnx=checkpoints/yolov5x/yolov5x.onnx --saveEngine=checkpoints/yolov5x/yolov5x.trt --int8
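
The summary figures in the log above can be cross-checked against each other. The following is a minimal Python sketch, not part of trtexec: the `SUMMARY` string is copied from the log with the timestamps and percentile fields dropped, and `parse_means` is a hypothetical helper written for this illustration. It assumes that throughput is the query count divided by the trace walltime and that mean host latency is the sum of the mean H2D, GPU compute, and D2H times, which is what the numbers in this run suggest.

```python
import re

# Lines copied from the log above; timestamps and percentile fields are dropped.
SUMMARY = """\
[I] Timing trace has 1239 queries over 3.00919 s
[I] Throughput: 411.738 qps
[I] Latency: min = 3.6438 ms, max = 3.81909 ms, mean = 3.66416 ms, median = 3.66025 ms
[I] H2D Latency: min = 0.245483 ms, max = 0.280029 ms, mean = 0.247243 ms, median = 0.246582 ms
[I] GPU Compute Time: min = 2.41064 ms, max = 2.57324 ms, mean = 2.4225 ms, median = 2.42072 ms
[I] D2H Latency: min = 0.986084 ms, max = 1.06665 ms, mean = 0.994418 ms, median = 0.991882 ms
"""


def parse_means(text):
    """Hypothetical helper: map each 'Name: min = ..., mean = X ms' line to X."""
    pattern = re.compile(r"\[I\] ([A-Za-z0-9 ]+): min = .*?mean = ([0-9.]+) ms")
    return {m.group(1): float(m.group(2)) for m in pattern.finditer(text)}


means = parse_means(SUMMARY)

# Throughput recomputed from the timing trace line.
trace = re.search(r"Timing trace has (\d+) queries over ([0-9.]+) s", SUMMARY)
throughput = int(trace.group(1)) / float(trace.group(2))

# Sum of the three per-stage means, to compare against the reported mean latency.
stage_sum = means["H2D Latency"] + means["GPU Compute Time"] + means["D2H Latency"]

print(f"recomputed throughput: {throughput:.3f} qps")
print(f"H2D + GPU + D2H mean:  {stage_sum:.5f} ms")
print(f"reported mean latency: {means['Latency']:.5f} ms")
```

For this run both checks agree with the reported values to within rounding. Note that D2H transfer (about 0.99 ms on average) accounts for a substantial share of the 3.66 ms mean host latency.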