- `endpoint`: The API endpoint that received the request (e.g., `chat_completions`, `completions`, `embeddings`)
- This metric is incremented each time the router returns a `ResourceExhausted` error because all workers are busy. The client receives the rejection as an HTTP 503 (Service Unavailable) response.
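
The behavior described above can be illustrated with a minimal, self-contained sketch. It is not the router's actual implementation: the `ResourceExhausted` class here is a local stand-in, and `rejected_requests`, `route`, and `handle_request` are hypothetical names chosen only for illustration. The sketch assumes a request is rejected when every worker is busy, and that the counter is keyed by the `endpoint` label.

```python
from collections import Counter


class ResourceExhausted(Exception):
    """Local stand-in for the router's 'all workers are busy' error."""


# Hypothetical in-memory counter, keyed by the `endpoint` label.
rejected_requests = Counter()


def route(endpoint, busy_workers, total_workers):
    """Toy router: raises ResourceExhausted when no worker is free."""
    if busy_workers >= total_workers:
        raise ResourceExhausted("all workers are busy")
    return "routed"


def handle_request(endpoint, busy_workers, total_workers):
    """Return an HTTP status code, counting rejections per endpoint."""
    try:
        route(endpoint, busy_workers, total_workers)
    except ResourceExhausted:
        # The rejection is counted once, then surfaced to the client as 503.
        rejected_requests[endpoint] += 1
        return 503
    return 200
```

Calling `handle_request("chat_completions", 2, 2)` in this sketch returns 503 and raises the count for `chat_completions` by one, while a request that finds a free worker returns 200 and leaves the counter unchanged.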