# *BLS* Triton Backend

The [*BLS*](../bls) backend demonstrates using the in-process C-API to execute inferences within the backend. This backend serves as an example for backend developers implementing their own custom pipeline in C++. For Python use cases, please refer to the [Business Logic Scripting](https://github.com/triton-inference-server/python_backend/blob/main/README.md#business-logic-scripting) section in the Python backend.

The source code for the *bls* backend is contained in [src](./src).

* [backend.cc](./src/backend.cc) contains the main backend implementation. The content of this file is not BLS specific. It only includes the required Triton backend functions that are standard for any backend implementation. The BLS logic is invoked in `TRITONBACKEND_ModelInstanceExecute` by the line `bls_executor.Execute(requests[r], &responses[r]);` (a condensed sketch of this dispatch follows the limitations list below).

* [bls.h](./src/bls.h) is where the BLS logic (class `BLSExecutor`) of this example is located. You can refer to this file to see how to interact with the Triton in-process C-API to build a custom execution pipeline.

* [bls_utils.h](./src/bls_utils.h) is where all the utilities that are not BLS dependent are located.

The source code contains extensive documentation describing the operation of the backend and the use of the [Triton Backend API](../../../README.md#triton-backend-api) and the [Triton Server API](https://github.com/triton-inference-server/server/blob/main/docs/customization_guide/inference_protocols.md#in-process-triton-server-api). Before reading the source code, make sure you understand the concepts associated with the Triton backend abstractions [TRITONBACKEND_Backend](../../../README.md#tritonbackend_backend), [TRITONBACKEND_Model](../../../README.md#tritonbackend_model), and [TRITONBACKEND_ModelInstance](../../../README.md#tritonbackend_modelinstance).

The *bls* backend sends two requests, one to the 'addsub_python' model and one to the 'addsub_tf' model. After the inference requests complete, this backend extracts OUTPUT0 from the 'addsub_python' response and OUTPUT1 from the 'addsub_tf' response to construct the final inference response object from these tensors.

There are some self-imposed limitations that were made for the simplicity of this example:

1. This backend does not support batching.
2. This backend does not support decoupled models.
3. This backend does not support GPU tensors.
4. The model configuration should be set strictly as described in the comments in [backend.cc](./src/backend.cc).

You can implement your own custom backend without these limitations.
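The following is a condensed, hypothetical sketch of that dispatch, not the verbatim source: it shows only the shape of the per-request loop, obtains the server handle through the standard backend API, and assumes a `BLSExecutor` constructor taking that handle (see [bls.h](./src/bls.h) for the real interface). Error handling, response error reporting, and statistics are omitted.

```cpp
// Hypothetical condensed sketch of the dispatch in backend.cc.
#include <cstdint>

#include "bls.h"                            // BLSExecutor (this example)
#include "triton/backend/backend_common.h"  // RETURN_IF_ERROR, LOG_IF_ERROR
#include "triton/core/tritonbackend.h"

extern "C" TRITONSERVER_Error*
TRITONBACKEND_ModelInstanceExecute(
    TRITONBACKEND_ModelInstance* instance, TRITONBACKEND_Request** requests,
    const uint32_t request_count)
{
  // Reach the in-process server handle through the standard backend API.
  TRITONBACKEND_Model* model;
  RETURN_IF_ERROR(TRITONBACKEND_ModelInstanceModel(instance, &model));
  TRITONSERVER_Server* server;
  RETURN_IF_ERROR(TRITONBACKEND_ModelServer(model, &server));

  // Assumed constructor: the executor drives the in-process pipeline.
  BLSExecutor bls_executor(server);

  for (uint32_t r = 0; r < request_count; r++) {
    // One response per request; the executor fills it with OUTPUT0 from
    // 'addsub_python' and OUTPUT1 from 'addsub_tf'.
    TRITONBACKEND_Response* response;
    RETURN_IF_ERROR(TRITONBACKEND_ResponseNew(&response, requests[r]));

    bls_executor.Execute(requests[r], &response);

    // The backend owns the requests and must release them when done.
    LOG_IF_ERROR(
        TRITONBACKEND_RequestRelease(
            requests[r], TRITONSERVER_REQUEST_RELEASE_ALL),
        "failed releasing request");
  }

  return nullptr;  // success
}
```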
## Building the *BLS* Backend

[backends/bls/CMakeLists.txt](CMakeLists.txt) shows the recommended build and install script for a Triton backend. Building and installing is the same as described in [Building the *Minimal* Backend](../../README.md#building-the-minimal-backend).

## Running Triton with the *BLS* Backend

After adding the *bls* backend to the Triton server as described in [Backend Shared Library](../../../README.md#backend-shared-library), you can run Triton and have it load the models in [model_repos/bls_models](../../model_repos/bls_models). Assuming you have created a *tritonserver* Docker image by adding the *bls* backend to Triton, the following command will run Triton:

```
$ docker run --rm -it --net=host -v/path/to/model_repos/bls_models:/models tritonserver --model-repository=/models
```

The console output will show something similar to the following, indicating that the *bls_fp32*, *addsub_python* and *addsub_tf* models from the bls_models repository have loaded correctly.

```
I0616 09:34:47.767433 19214 server.cc:629]
+---------------+---------+--------+
| Model         | Version | Status |
+---------------+---------+--------+
| addsub_python | 1       | READY  |
| addsub_tf     | 1       | READY  |
| bls_fp32      | 1       | READY  |
+---------------+---------+--------+
```

## Testing the *BLS* Backend

The [clients](../../clients) directory holds example clients. The [bls_client](../../clients/bls_client) Python script demonstrates sending an inference request to the *bls* backend. With Triton running as described in [Running Triton with the *BLS* Backend](#running-triton-with-the-bls-backend), execute the client:

```
$ clients/bls_client
```

You should see output similar to the following:

```
INPUT0 ([0.42935285 0.51512766 0.43625894 ... 0.6670954 0.17747518 0.7976901 ]) + INPUT1 ([6.7752063e-01 2.4223252e-01 6.7743927e-01 ... 4.1531715e-01 2.5451833e-01 7.9097062e-01]) = OUTPUT0 ([1.1068735 0.75736016 1.1136982 ... 1.0824126 0.4319935 1.5886607 ])
INPUT0 ([0.42935285 0.51512766 0.43625894 ... 0.6670954 0.17747518 0.7976901 ]) - INPUT1 ([6.7752063e-01 2.4223252e-01 6.7743927e-01 ... 4.1531715e-01 2.5451833e-01 7.9097062e-01]) = OUTPUT1 ([-0.24816778 0.27289516 -0.24118033 ... 0.25177827 -0.07704315 0.00671947])
PASS
```
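PASS confirms that OUTPUT0 is the element-wise sum produced by 'addsub_python' and OUTPUT1 is the element-wise difference produced by 'addsub_tf'. For reference, below is a minimal sketch of the gathering step described earlier, which copies one tensor from a completed in-process response into the final backend response. The helper name is invented for illustration and error checking is omitted; see [bls.h](./src/bls.h) for the real implementation.

```cpp
// Hypothetical helper: pull the single output tensor from a completed
// in-process inference response and copy it into the final response under
// the requested name ("OUTPUT0" or "OUTPUT1"). Error checking omitted.
#include <cstring>

#include "triton/core/tritonbackend.h"
#include "triton/core/tritonserver.h"

void
CopyCompletedOutput(
    TRITONSERVER_InferenceResponse* completed,  // from the in-process infer
    const char* wanted_name,                    // "OUTPUT0" or "OUTPUT1"
    TRITONBACKEND_Response* final_response)
{
  const char* name;
  TRITONSERVER_DataType datatype;
  const int64_t* shape;
  uint64_t dim_count;
  const void* base;
  size_t byte_size;
  TRITONSERVER_MemoryType memory_type;
  int64_t memory_type_id;
  void* userp;

  // Each sub-model response contributes exactly one tensor here, so only
  // output index 0 is queried.
  TRITONSERVER_InferenceResponseOutput(
      completed, 0 /* index */, &name, &datatype, &shape, &dim_count, &base,
      &byte_size, &memory_type, &memory_type_id, &userp);

  // Create the output tensor on the final response and copy the data in.
  TRITONBACKEND_Output* output;
  TRITONBACKEND_ResponseOutput(
      final_response, &output, wanted_name, datatype, shape,
      static_cast<uint32_t>(dim_count));

  void* buffer;
  TRITONSERVER_MemoryType actual_type = TRITONSERVER_MEMORY_CPU;
  int64_t actual_type_id = 0;
  TRITONBACKEND_OutputBuffer(
      output, &buffer, byte_size, &actual_type, &actual_type_id);
  memcpy(buffer, base, byte_size);  // CPU tensors only (see limitations)
}
```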