1) The benchmark class should extend the TensorFlow python class
[tensorflow.test.Benchmark](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/platform/benchmark.py) and should have a
constructor with the signature `__init__(self, output_dir, root_data_dir, **kwargs)`.
Below is the usage of each argument (a minimal sketch of such a class follows this list):
- The benchmark method should put all generated files (e.g. logs) in `output_dir` so that PerfZero can
upload these files to Google Cloud Storage when `--output_gcs_url` is specified.
- The benchmark method should read data from under `root_data_dir`, e.g. from `${root_data_dir}/cifar-10-binary`.
- `**kwargs` keeps the benchmark constructor forward compatible in case PerfZero passes additional named arguments
to the constructor before the benchmark class is updated.
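The snippet below is a minimal, hypothetical sketch of such a benchmark class; the class name, method name, data subdirectory, and metric values are made up for illustration.
```
import os
import time

import tensorflow as tf


class FooBenchmark(tf.test.Benchmark):  # Hypothetical class name.
  """Illustrative benchmark class with the constructor PerfZero expects."""

  def __init__(self, output_dir=None, root_data_dir=None, **kwargs):
    self.output_dir = output_dir        # Put logs and other generated files here.
    self.root_data_dir = root_data_dir  # Read input data from under this directory.

  def benchmark_foo(self):  # Hypothetical benchmark method.
    data_dir = os.path.join(self.root_data_dir, 'cifar-10-binary')
    start_time = time.time()
    # ... run the workload against data_dir and compute metrics ...
    self.report_benchmark(
        iters=1,
        wall_time=time.time() - start_time,
        metrics=[{'name': 'accuracy_top_1', 'value': 0.99}])
```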
2) At the end of the benchmark method execution, the method should call [report_benchmark()](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/platform/benchmark.py)
with the following parameters:
```
self.report_benchmark(       # Inherited from tf.test.Benchmark.
    iters=num_iteration,     # The number of iterations of the benchmark.
    wall_time=wall_time_sec, # Total wall time in seconds for all iterations.
    metrics=[                # List of metric entries.
        {
            'name': 'accuracy_top_5',  # Metric name.
            'value': 80,               # Metric value.
            'min_value': 90,  # Optional. Minimum acceptable metric value for the benchmark to succeed.
            'max_value': 99   # Optional. Maximum acceptable metric value for the benchmark to succeed.
        },
        {
            'name': 'accuracy_top_1',
            'value': 99.5
        }
    ]
)
```
This format allows PerfZero to report in its summary whether the benchmark has succeeded
(e.g. for a convergence test), based on logic determined by the benchmark developer,
as sketched below.
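For illustration only (this is not PerfZero's actual implementation), the success semantics implied by `min_value`/`max_value` can be thought of as the following check:
```
def metric_succeeded(metric):
  """Returns True if the metric value is within its optional bounds."""
  value = metric['value']
  if 'min_value' in metric and value < metric['min_value']:
    return False
  if 'max_value' in metric and value > metric['max_value']:
    return False
  return True


# The benchmark as a whole succeeds only if every reported metric succeeds
# (and the benchmark method finished without raising an exception).
def benchmark_succeeded(metrics):
  return all(metric_succeeded(m) for m in metrics)
```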
3) Include dependent libraries in `--git_repos` and `--python_path`. These
libraries will be checked out in the directory
`path_to_perfzero/workspace/site-packages` by default. Developers can edit these
libraries directly and execute the benchmark with the local changes, for example
as shown below.
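As an example, a hypothetical invocation that checks out the tensorflow/models repository and puts it on the Python path could look like the following. The repository URL and the exact value format accepted by `--git_repos` are assumptions here, and the command is assumed to be run from the repository root; check `python3 benchmarks/perfzero/lib/benchmark.py -h` for the authoritative syntax.
```
python3 benchmarks/perfzero/lib/benchmark.py \
  --git_repos="https://github.com/tensorflow/models.git" \
  --python_path=models \
  --benchmark_methods=official.resnet.estimator_cifar_benchmark.EstimatorCifar10BenchmarkTests.unit_test
```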
# Deep dive into individual tools
The sections below go into details about the individual components of PerfZero.
## Build docker image
The command below builds the docker image named `perfzero/tensorflow`, which contains the
libraries (e.g. TensorFlow) needed to run the benchmark.
```
python3 benchmarks/perfzero/lib/setup.py
```
Here are a few selected optional flags. Run `python3 setup.py -h` to see
detailed documentation for all supported flags.
1) Use `--dockerfile_path=docker/Dockerfile_ubuntu_1804_tf_v2` to build the docker image for TensorFlow v2.
2) Use `--tensorflow_pip_spec` to specify the TensorFlow pip package name (and optionally version) to be
installed in the docker image, e.g. `--tensorflow_pip_spec=tensorflow==1.12.0`. See the example below.
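For example, the two flags above can be combined in a single invocation:
```
python3 benchmarks/perfzero/lib/setup.py \
  --dockerfile_path=docker/Dockerfile_ubuntu_1804_tf_v2 \
  --tensorflow_pip_spec=tensorflow==1.12.0
```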
## Run benchmark
The command below executes the benchmark method specified by `--benchmark_methods`.
```
export ROOT_DATA_DIR=/data
nvidia-docker run -it --rm -v $(pwd):/workspace -v $ROOT_DATA_DIR:$ROOT_DATA_DIR perfzero/tensorflow \
python3 /workspace/benchmarks/perfzero/lib/benchmark.py \
--benchmark_methods=official.resnet.estimator_cifar_benchmark.EstimatorCifar10BenchmarkTests.unit_test \
--root_data_dir=$ROOT_DATA_DIR
```
Here is an example output from PerfZero. Explanations are provided inline for keys whose
names are not sufficiently self-explanatory.
```
{
  "ml_framework_info": {                         # Summary of the machine learning framework
    "version": "1.13.0-dev20190206",             # Short version. It is tf.__version__ for TensorFlow
    "name": "tensorflow",                        # Machine learning framework name such as PyTorch
    "build_label": "ml_framework_build_label",   # Specified by the flag --ml_framework_build_label
    "build_version": "v1.12.0-7504-g9b32b5742b"  # Long version. It is tf.__git_version__ for TensorFlow
  },
  "execution_timestamp": 1550040322.8991697,     # Timestamp when the benchmark is executed
  "execution_id": "2019-02-13-06-45-22-899128",  # A string that uniquely identifies this benchmark execution
  "benchmark_info": {                            # Summary of the benchmark framework setup
    "output_url": "gs://tf-performance/test-results/2019-02-13-06-45-22-899128/",  # Google Cloud Storage URL that contains the log files from this benchmark execution
    "execution_label": "execution_label"         # Specified by the flag --execution_label
  },
  "system_info": {                               # Summary of the resources in the system that is used to execute the benchmark
    "system_name": "system_name",                # Specified by the flag --system_name
    "accelerator_count": 2,                      # Number of GPUs in the system
    "physical_cpu_count": 8,                     # Number of physical CPU cores in the system; hyper-threaded CPUs are excluded
    "logical_cpu_count": 16,                     # Number of logical CPU cores in the system; hyper-threaded CPUs are included
    "cpu_socket_count": 1,                       # Number of CPU sockets in the system
    "platform_name": "platform_name",            # Specified by the flag --platform_name
    "accelerator_model": "Tesla V100-SXM2-16GB",
    "accelerator_driver_version": "410.48",
    "cpu_model": "Intel(R) Xeon(R) CPU @ 2.20GHz"
  },
  "process_info": {                              # Summary of the resources used by the process to execute the benchmark
    "max_rss": 4269047808,                       # Maximum physical memory in bytes used by the process
    "max_vms": 39894450176,                      # Maximum virtual memory in bytes used by the process
    "max_cpu_percent": 771.1                     # CPU utilization as a percentage. See psutil.Process.cpu_percent() for more information
  },
  "benchmark_result": {                          # Summary of the benchmark execution results. This is essentially the same data structure defined in test_log.proto.
                                                 # Most values are read from test_log.proto, which is written by tf.test.Benchmark.report_benchmark() defined in the TensorFlow library.
    "metrics": [                                 # This is derived from `extras` in [test_log.proto](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/util/test_log.proto),
                                                 # which is written by report_benchmark().
                                                 # If the EntryValue is a double, then name is the extra's key and value is the extra's double value.
                                                 # If the EntryValue is a string, then name is the extra's key and the string value is a JSON-formatted string whose keys
                                                 # include `value`, `succeeded` and `description`. The benchmark method can provide arbitrary metric key/value pairs here.
      {
        "name": "accuracy_top_5",
        "value": 0.7558000087738037
      },
      {
        "name": "accuracy_top_1",
        "value": 0.2639999985694885
      }
    ],
    "name": "official.resnet.estimator_cifar_benchmark.EstimatorCifar10BenchmarkTests.unit_test",  # Full path to the benchmark method, i.e. module_path.class_name.method_name
    "succeeded": true,                           # True iff the benchmark method execution finishes without exception and no metric in metrics shows succeeded = false
    "wall_time": 14.552583694458008              # The value is determined by tf.test.Benchmark.report_benchmark() called by the benchmark method. It is -1 if report_benchmark() is not called.
  }
}
```
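The summary above can also be consumed programmatically. The sketch below assumes the JSON (without the inline `#` comments) has been saved to a local file; the file name `perfzero_summary.json` is hypothetical and should be replaced with wherever the summary is stored in your setup.
```
import json

# Hypothetical file name; point this at wherever the PerfZero summary JSON is stored.
with open('perfzero_summary.json') as f:
  summary = json.load(f)

result = summary['benchmark_result']
print('benchmark:', result['name'])
print('succeeded:', result['succeeded'])
for metric in result['metrics']:
  print('  {}: {}'.format(metric['name'], metric['value']))
```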
### Profiling
When the flag `--profiler_enabled_time=start_time:end_time` is specified, the