The SDK can record the time consumed by each module in the pipeline. This feature is disabled by default. To use it, two steps are required:
- Generate profiler data
- Analyze profiler data
## Generate profiler data
Take the C interface and the classification pipeline as an example. When creating the pipeline, the create API that accepts context information must be used, and a profiler handle must be added to the context, as shown in the code below. Running the demo normally will then generate the profiler data file "profiler_data.txt" in the current directory.
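The following is a minimal sketch of that setup. The API names (`mmdeploy_profiler_create`, `mmdeploy_context_add`, `mmdeploy_classifier_create_v2`, etc.) are taken from the MMDeploy C interface; the model path and device name are illustrative, and you should check the headers of your SDK version for the exact signatures.
```cpp
#include "mmdeploy/classifier.h"

int main() {
  // Illustrative model path and device; adjust to your environment.
  mmdeploy_model_t model{};
  mmdeploy_model_create_by_path("/data/resnet", &model);

  mmdeploy_device_t device{};
  mmdeploy_device_create("cpu", 0, &device);

  // Create a profiler; its data will be written to "profiler_data.txt"
  // in the current directory.
  mmdeploy_profiler_t profiler{};
  mmdeploy_profiler_create("profiler_data.txt", &profiler);

  // Add both the device and the profiler to the context.
  mmdeploy_context_t context{};
  mmdeploy_context_create(&context);
  mmdeploy_context_add(context, MMDEPLOY_TYPE_DEVICE, nullptr, device);
  mmdeploy_context_add(context, MMDEPLOY_TYPE_PROFILER, nullptr, profiler);

  // Use the context-aware create API so the profiler takes effect.
  mmdeploy_classifier_t classifier{};
  mmdeploy_classifier_create_v2(model, context, &classifier);

  // ... run inference as usual ...

  // Destroying the handles flushes the profiler data to disk.
  mmdeploy_classifier_destroy(classifier);
  mmdeploy_context_destroy(context);
  mmdeploy_profiler_destroy(profiler);
  mmdeploy_device_destroy(device);
  mmdeploy_model_destroy(model);
  return 0;
}
```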
The performance data can then be parsed and visualized with the provided script:
```bash
python tools/sdk_analyze.py profiler_data.txt
```
In the parsed results, "name" is the name of the node, "n_call" is the number of calls, "t_mean" is the average time consumption, and "t_50%" and "t_90%" are the 50th and 90th percentiles of the time consumption.
In terms of model deployment, most ML models require preprocessing of the input data and postprocessing of the raw output to obtain structured results. The MMDeploy SDK provides the common pre-processing and post-processing operations, so when you convert and deploy a model you get a complete inference pipeline out of the box.
## Model Conversion
You can refer to [convert model](../02-how-to-run/convert_model.md) for more details.
After model conversion with `--dump-info`, the structure of the model directory (here, a TensorRT model) is as follows. If you convert to another backend, the structure will differ slightly. The two images are for quick validation of the conversion.
```bash
├── deploy.json
├── detail.json
├── pipeline.json
├── end2end.onnx
├── end2end.engine
├── output_pytorch.jpg
└── output_tensorrt.jpg
```
The files related to the SDK are:
- deploy.json // model information.
- pipeline.json // inference information.
- end2end.engine // model file for TensorRT; this will differ for other backends.
The SDK can read the model directory directly, or you can pack the related files into a zip archive for easier distribution or encryption. To read a zip file, the SDK must be built with `-DMMDEPLOY_ZIP_MODEL=ON`.
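As an illustrative sketch (assuming an SDK built with `-DMMDEPLOY_ZIP_MODEL=ON` and the buffer-based `mmdeploy_model_create` API from the MMDeploy C interface), a zipped model can also be loaded from an in-memory buffer, which is convenient when the archive is decrypted or downloaded at runtime:
```cpp
#include <fstream>
#include <vector>

#include "mmdeploy/model.h"

// Read the zip archive into memory; the path is illustrative.
std::ifstream ifs("/data/resnet.zip", std::ios::binary);
std::vector<char> buffer((std::istreambuf_iterator<char>(ifs)),
                         std::istreambuf_iterator<char>());

// Create the model directly from the in-memory archive.
mmdeploy_model_t model{};
mmdeploy_model_create(buffer.data(), static_cast<int>(buffer.size()), &model);
```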
## SDK Inference
Generally speaking, there are three steps to run inference with a model:
- Create a pipeline
- Load the data
- Model inference
We use `classifier` as an example to show these three steps.
### Create a pipeline
#### Load model from disk
```cpp
std::string model_path = "/data/resnet";  // or "/data/resnet.zip" if built with `-DMMDEPLOY_ZIP_MODEL=ON`
mmdeploy_model_t model{};
mmdeploy_model_create_by_path(model_path.c_str(), &model);
```
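With the model handle ready, the pipeline itself can be created. A minimal sketch using the C classification API (assuming `mmdeploy_classifier_create` with a device name and device id, as in recent MMDeploy releases):
```cpp
// Device name and id are illustrative; use "cuda" and the GPU index for GPU inference.
mmdeploy_classifier_t classifier{};
mmdeploy_classifier_create(model, "cpu", 0, &classifier);
```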