This guide demonstrates two setups.
The EPP-unaware setup treats each Dynamo deployment as a black box and routes traffic randomly among the deployments.
The EPP-aware setup first uses the Dynamo Router to pick the worker instance ID that will serve the model. Traffic is then directed straight to the selected worker.
Currently, both setups are supported only with the kgateway-based Inference Gateway.
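
To make the difference between the two setups concrete, here is a minimal Python sketch of the two routing decisions. It is illustrative only and does not use any Dynamo or kgateway API: the `Worker` type, the `route_epp_unaware` and `route_epp_aware` functions, and the `least_loaded` policy are all hypothetical names, and `least_loaded` is merely a stand-in for whatever criteria the Dynamo Router actually applies.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Worker:
    instance_id: str      # hypothetical identifier for a worker instance
    deployment: str       # the deployment this worker belongs to
    active_requests: int  # hypothetical load signal, for illustration only


def route_epp_unaware(deployments, rng=random):
    """EPP-unaware: pick a deployment at random, treating it as a black box."""
    return rng.choice(deployments)


def route_epp_aware(workers, pick_worker):
    """EPP-aware: a router picks a worker instance id, then that worker is targeted directly."""
    instance_id = pick_worker(workers)
    by_id = {w.instance_id: w for w in workers}
    return by_id[instance_id]


def least_loaded(workers):
    """Assumed placeholder policy (not the real Dynamo Router logic): fewest active requests."""
    return min(workers, key=lambda w: w.active_requests).instance_id


if __name__ == "__main__":
    workers = [
        Worker("w0", "dep-a", 3),
        Worker("w1", "dep-a", 0),
        Worker("w2", "dep-b", 5),
    ]
    print(route_epp_unaware(["dep-a", "dep-b"]))
    print(route_epp_aware(workers, least_loaded).instance_id)
```

In the first function the gateway never sees individual workers; in the second, the routing decision is made per worker before any traffic is sent.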