Commit f7f2bf4f authored by gracehoney, committed by Karmel Allison

Fix TensorRT test output and add int8 result. (#4004)

* Fix TensorRT test output and add int8 result.

* Fix comments

* Removing footnote declaration as well
parent e8863ed6
...@@ -9,12 +9,10 @@ Here we provide a sample script that can: ...@@ -9,12 +9,10 @@ Here we provide a sample script that can:
1. Convert a TensorFlow SavedModel to a Frozen Graph. 1. Convert a TensorFlow SavedModel to a Frozen Graph.
2. Load a Frozen Graph for inference. 2. Load a Frozen Graph for inference.
3. Time inference loops using the native TensorFlow graph. 3. Time inference loops using the native TensorFlow graph.
4. Time inference loops using FP32, FP16, or INT8<sup>1</sup> precision modes from TensorRT. 4. Time inference loops using FP32, FP16, or INT8 precision modes from TensorRT.
We provide some results below, as well as instructions for running this script. We provide some results below, as well as instructions for running this script.
<sup>1</sup> INT8 mode is a work in progress; please see [INT8 Mode is the Bleeding Edge](#int8-mode-is-the-bleeding-edge) below.
## How to Run This Script ## How to Run This Script
### Step 1: Install Prerequisites ### Step 1: Install Prerequisites
...@@ -63,12 +61,12 @@ you would run: ...@@ -63,12 +61,12 @@ you would run:
``` ```
python tensorrt.py --frozen_graph=resnetv2_imagenet_frozen_graph.pb \ python tensorrt.py --frozen_graph=resnetv2_imagenet_frozen_graph.pb \
--image_file=image.jpg --native --fp32 --fp16 --output_dir=/my/output --image_file=image.jpg --native --fp32 --fp16 --int8 --output_dir=/my/output
``` ```
This will print the predictions for each of the precision modes that were run This will print the predictions for each of the precision modes that were run
(native, which is the native precision of the model passed in, as well (native, which is the native precision of the model passed in, as well
as the TensorRT version of the graph at precisions of fp32 and fp16): as the TensorRT version of the graph at precisions of fp32, fp16 and int8):
``` ```
INFO:tensorflow:Starting timing. INFO:tensorflow:Starting timing.
...@@ -76,7 +74,8 @@ INFO:tensorflow:Timing loop done! ...@@ -76,7 +74,8 @@ INFO:tensorflow:Timing loop done!
Predictions: Predictions:
Precision: native [u'seashore, coast, seacoast, sea-coast', u'promontory, headland, head, foreland', u'breakwater, groin, groyne, mole, bulwark, seawall, jetty', u'lakeside, lakeshore', u'grey whale, gray whale, devilfish, Eschrichtius gibbosus, Eschrichtius robustus'] Precision: native [u'seashore, coast, seacoast, sea-coast', u'promontory, headland, head, foreland', u'breakwater, groin, groyne, mole, bulwark, seawall, jetty', u'lakeside, lakeshore', u'grey whale, gray whale, devilfish, Eschrichtius gibbosus, Eschrichtius robustus']
Precision: FP32 [u'seashore, coast, seacoast, sea-coast', u'promontory, headland, head, foreland', u'breakwater, groin, groyne, mole, bulwark, seawall, jetty', u'lakeside, lakeshore', u'sandbar, sand bar'] Precision: FP32 [u'seashore, coast, seacoast, sea-coast', u'promontory, headland, head, foreland', u'breakwater, groin, groyne, mole, bulwark, seawall, jetty', u'lakeside, lakeshore', u'sandbar, sand bar']
Precision: FP16 [u'seashore, coast, seacoast, sea-coast', u'promontory, headland, head, foreland', u'lakeside, lakeshore', u'sandbar, sand bar', u'breakwater, groin, groyne, mole, bulwark, seawall, jetty'] Precision: FP16 [u'seashore, coast, seacoast, sea-coast', u'promontory, headland, head, foreland', u'breakwater, groin, groyne, mole, bulwark, seawall, jetty', u'lakeside, lakeshore', u'sandbar, sand bar']
Precision: INT8 [u'seashore, coast, seacoast, sea-coast', u'promontory, headland, head, foreland', u'breakwater, groin, groyne, mole, bulwark, seawall, jetty', u'grey whale, gray whale, devilfish, Eschrichtius gibbosus, Eschrichtius robustus', u'lakeside, lakeshore']
``` ```
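The predictions above differ slightly between precision modes, so it can help to compare the top-5 lists programmatically. The following is a minimal sketch, not part of `tensorrt.py`: it assumes the `Precision: <mode> [...]` line layout shown above, and the helper names `parse_predictions` and `top_k_overlap` are hypothetical.

```python
import ast

# Sample lines copied from the new script output shown above.
SAMPLE_OUTPUT = """\
Predictions:
Precision: native [u'seashore, coast, seacoast, sea-coast', u'promontory, headland, head, foreland', u'breakwater, groin, groyne, mole, bulwark, seawall, jetty', u'lakeside, lakeshore', u'grey whale, gray whale, devilfish, Eschrichtius gibbosus, Eschrichtius robustus']
Precision: FP32 [u'seashore, coast, seacoast, sea-coast', u'promontory, headland, head, foreland', u'breakwater, groin, groyne, mole, bulwark, seawall, jetty', u'lakeside, lakeshore', u'sandbar, sand bar']
Precision: INT8 [u'seashore, coast, seacoast, sea-coast', u'promontory, headland, head, foreland', u'breakwater, groin, groyne, mole, bulwark, seawall, jetty', u'grey whale, gray whale, devilfish, Eschrichtius gibbosus, Eschrichtius robustus', u'lakeside, lakeshore']
"""


def parse_predictions(text):
    """Map each precision mode to its ordered list of predicted labels.

    Assumes every prediction line looks like 'Precision: <mode> [<labels>]'.
    """
    predictions = {}
    for line in text.splitlines():
        if not line.startswith("Precision:"):
            continue
        rest = line[len("Precision:"):].strip()
        mode, _, labels = rest.partition(" ")
        # The label list is a Python list literal, so literal_eval can read it.
        predictions[mode] = ast.literal_eval(labels.strip())
    return predictions


def top_k_overlap(a, b):
    """Count labels that appear in both lists, ignoring order."""
    return len(set(a) & set(b))


if __name__ == "__main__":
    preds = parse_predictions(SAMPLE_OUTPUT)
    for mode, labels in preds.items():
        print(mode, "overlap with native:", top_k_overlap(preds["native"], labels))
```

Comparing by set overlap rather than exact list equality is a deliberate choice here: reduced-precision modes may reorder lower-ranked labels without changing the top prediction.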
The script will generate or append to a file in the output_dir, `log.txt`, The script will generate or append to a file in the output_dir, `log.txt`,
...@@ -84,20 +83,24 @@ which includes the timing information for each of the models: ...@@ -84,20 +83,24 @@ which includes the timing information for each of the models:
``` ```
========================== ==========================
network: native_resnetv2_imagenet_frozen_graph.pb, batchsize 128, steps 100 network: native_resnetv2_imagenet_frozen_graph.pb, batchsize 128, steps 100
fps median: 1041.4, mean: 1056.6, uncertainty: 2.8, jitter: 6.1 fps median: 468.2, mean: 469.0, uncertainty: 0.3, jitter: 1.6
latency median: 0.12292, mean: 0.12123, 99th_p: 0.13151, 99th_uncertainty: 0.00024 latency median: 0.27336, mean: 0.27290, 99th_p: 0.27475, 99th_uncertainty: 0.00027
========================== ==========================
network: tftrt_fp32_resnetv2_imagenet_frozen_graph.pb, batchsize 128, steps 100 network: tftrt_fp32_resnetv2_imagenet_frozen_graph.pb, batchsize 128, steps 100
fps median: 1253.0, mean: 1250.8, uncertainty: 3.4, jitter: 17.3 fps median: 627.7, mean: 628.9, uncertainty: 0.5, jitter: 3.6
latency median: 0.10215, mean: 0.10241, 99th_p: 0.11482, 99th_uncertainty: 0.01109 latency median: 0.20392, mean: 0.20354, 99th_p: 0.20608, 99th_uncertainty: 0.00083
========================== ==========================
network: tftrt_fp16_resnetv2_imagenet_frozen_graph.pb, batchsize 128, steps 100 network: tftrt_fp16_resnetv2_imagenet_frozen_graph.pb, batchsize 128, steps 100
fps median: 2280.2, mean: 2312.8, uncertainty: 10.3, jitter: 100.1 fps median: 626.8, mean: 628.8, uncertainty: 0.5, jitter: 3.1
latency median: 0.05614, mean: 0.05546, 99th_p: 0.06103, 99th_uncertainty: 0.00781 latency median: 0.20421, mean: 0.20359, 99th_p: 0.20555, 99th_uncertainty: 0.00019
==========================
network: tftrt_int8_resnetv2_imagenet_frozen_graph.pb, batchsize 128, steps 100
fps median: 1362.4, mean: 1368.1, uncertainty: 2.2, jitter: 14.4
latency median: 0.09396, mean: 0.09359, 99th_p: 0.09546, 99th_uncertainty: 0.00021
``` ```
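To compare modes, the `log.txt` entries can be parsed and each mode's median throughput divided by the native one. This is a rough sketch under assumptions: it relies on the field layout shown above, it identifies the baseline by the `native_` filename prefix, and the function names `parse_log` and `speedup_vs_native` are hypothetical, not part of the script.

```python
import re

# Sample text copied from the new log.txt output shown above.
SAMPLE_LOG = """\
==========================
network: native_resnetv2_imagenet_frozen_graph.pb, batchsize 128, steps 100
  fps median: 468.2, mean: 469.0, uncertainty: 0.3, jitter: 1.6
  latency median: 0.27336, mean: 0.27290, 99th_p: 0.27475, 99th_uncertainty: 0.00027
==========================
network: tftrt_fp32_resnetv2_imagenet_frozen_graph.pb, batchsize 128, steps 100
  fps median: 627.7, mean: 628.9, uncertainty: 0.5, jitter: 3.6
  latency median: 0.20392, mean: 0.20354, 99th_p: 0.20608, 99th_uncertainty: 0.00083
==========================
network: tftrt_fp16_resnetv2_imagenet_frozen_graph.pb, batchsize 128, steps 100
  fps median: 626.8, mean: 628.8, uncertainty: 0.5, jitter: 3.1
  latency median: 0.20421, mean: 0.20359, 99th_p: 0.20555, 99th_uncertainty: 0.00019
==========================
network: tftrt_int8_resnetv2_imagenet_frozen_graph.pb, batchsize 128, steps 100
  fps median: 1362.4, mean: 1368.1, uncertainty: 2.2, jitter: 14.4
  latency median: 0.09396, mean: 0.09359, 99th_p: 0.09546, 99th_uncertainty: 0.00021
"""

NETWORK_RE = re.compile(r"network:\s*([^,\s]+),\s*batchsize\s+(\d+),\s*steps\s+(\d+)")
FPS_RE = re.compile(r"fps\s+median:\s*([\d.]+)")
LATENCY_RE = re.compile(r"latency\s+median:\s*([\d.]+)")


def parse_log(text):
    """Return {network name: {'batchsize', 'steps', 'fps_median', 'latency_median'}}."""
    results = {}
    current = None
    for line in text.splitlines():
        match = NETWORK_RE.search(line)
        if match:
            current = match.group(1)
            results[current] = {
                "batchsize": int(match.group(2)),
                "steps": int(match.group(3)),
            }
            continue
        if current is None:
            continue  # Skip separators that precede the first entry.
        match = FPS_RE.search(line)
        if match:
            results[current]["fps_median"] = float(match.group(1))
            continue
        match = LATENCY_RE.search(line)
        if match:
            results[current]["latency_median"] = float(match.group(1))
    return results


def speedup_vs_native(results):
    """Divide each network's median fps by the native network's median fps."""
    # Assumption: the baseline entry is the one whose name starts with 'native_'.
    native = next(name for name in results if name.startswith("native_"))
    base = results[native]["fps_median"]
    return {name: entry["fps_median"] / base for name, entry in results.items()}


if __name__ == "__main__":
    for name, ratio in speedup_vs_native(parse_log(SAMPLE_LOG)).items():
        print("%s: %.2fx" % (name, ratio))
```

Because the script appends to `log.txt`, a repeated run would produce duplicate network names; in this sketch the later entry simply overwrites the earlier one.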
The script will also output the GraphDefs used for each of the modes run, The script will also output the GraphDefs used for each of the modes run,
...@@ -106,22 +109,14 @@ for future use and inspection: ...@@ -106,22 +109,14 @@ for future use and inspection:
``` ```
ls /my/output ls /my/output
log.txt log.txt
tftrt_fp16_imagenet_frozen_graph.pb tftrt_fp16_resnetv2_imagenet_frozen_graph.pb
tftrt_fp32_imagenet_frozen_graph.pb tftrt_fp32_resnetv2_imagenet_frozen_graph.pb
tftrt_int8_calib_resnetv2_imagenet_frozen_graph.pb
tftrt_int8_resnetv2_imagenet_frozen_graph.pb
``` ```
## Troubleshooting and Notes ## Troubleshooting and Notes
### INT8 Mode is the Bleeding Edge
Note that currently, INT8 mode results in a segfault using the models provided.
We are working on it.
```
E tensorflow/contrib/tensorrt/log/trt_logger.cc:38] DefaultLogger Parameter check failed at: Network.cpp::addScale::118, condition: shift.count == 0 || shift.count == weightCount
Segmentation fault (core dumped)
```
### GPU/Precision Compatibility ### GPU/Precision Compatibility
Not all GPUs support the ops required for all precisions. For example, P100s Not all GPUs support the ops required for all precisions. For example, P100s
......