This document describes the `/v1/classify` API endpoint implementation in SGLang, which is compatible with vLLM's classification API format.
## Overview
The classification API allows you to classify text inputs using classification models. The request and response formats follow vLLM's classification API (as of vLLM 0.7.0).
## API Endpoint
```
POST /v1/classify
```
## Request Format
```json
{
  "model": "model_name",
  "input": "text to classify"
}
```
### Parameters
- `model` (string, required): The name of the classification model to use
- `input` (string, required): The text to classify
- `user` (string, optional): User identifier for tracking
- `rid` (string, optional): Request ID for tracking
- `priority` (integer, optional): Request priority
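As a sketch, a request body with the optional fields can be assembled and sent like this (the helper names and the default server URL are illustrative, not part of SGLang):

```python
import json
import urllib.request

def build_classify_request(model, text, user=None, rid=None, priority=None):
    """Assemble a /v1/classify request body; optional fields are omitted when unset."""
    payload = {"model": model, "input": text}
    if user is not None:
        payload["user"] = user
    if rid is not None:
        payload["rid"] = rid
    if priority is not None:
        payload["priority"] = priority
    return payload

def classify(payload, base_url="http://localhost:30000"):
    """POST the payload to /v1/classify and return the parsed JSON response."""
    req = urllib.request.Request(
        f"{base_url}/v1/classify",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())
```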
## Response Format
```json
{
  "id": "classify-9bf17f2847b046c7b2d5495f4b4f9682",
  "object": "list",
  "created": 1745383213,
  "model": "jason9693/Qwen2.5-1.5B-apeach",
  "data": [
    {
      "index": 0,
      "label": "Default",
      "probs": [0.565970778465271, 0.4340292513370514],
      "num_classes": 2
    }
  ],
  "usage": {
    "prompt_tokens": 10,
    "total_tokens": 10,
    "completion_tokens": 0,
    "prompt_tokens_details": null
  }
}
```
### Response Fields
- `id`: Unique identifier for the classification request
- `object`: Always `"list"`
- `created`: Unix timestamp when the request was created
- `model`: The model used for classification
- `data`: Array of classification results
  - `index`: Index of the result
  - `label`: Predicted class label
  - `probs`: Array of probabilities for each class
  - `num_classes`: Total number of classes
- `usage`: Token usage information
  - `prompt_tokens`: Number of input tokens
  - `total_tokens`: Total number of tokens
  - `completion_tokens`: Number of completion tokens (always 0 for classification)
**Label Mapping**: The API automatically uses the `id2label` mapping from the model's `config.json` file to provide meaningful label names instead of generic class names. If `id2label` is not available, it falls back to `LABEL_0`, `LABEL_1`, etc., or `Class_0`, `Class_1` as a last resort.
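On the client side, the predicted class is simply the argmax over `probs`. The sketch below also mirrors the fallback behavior described above (the two-class mapping in the example is hypothetical):

```python
def top_label(probs, id2label=None):
    """Return (label, probability) for the highest-probability class.

    Mirrors the fallback described above: use the id2label mapping from the
    model's config.json when available, else a generic LABEL_<i> name. Note
    that config.json stores id2label keys as strings ("0", "1", ...).
    """
    top = max(range(len(probs)), key=probs.__getitem__)
    if id2label:
        return id2label.get(str(top), f"LABEL_{top}"), probs[top]
    return f"LABEL_{top}", probs[top]

# Hypothetical two-class mapping, applied to the probs from the example above:
label, p = top_label([0.565970778465271, 0.4340292513370514],
                     {"0": "Default", "1": "Other"})
```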
### Reward Models (Single score)
- `InternLM2ForRewardModel` - Single reward score
- `Qwen2ForRewardModel` - Single reward score
- `LlamaForSequenceClassificationWithNormal_Weights` - Special reward model
**Note**: The `/classify` endpoint in SGLang was originally designed for reward models but now supports all non-generative models. Our `/v1/classify` endpoint provides a standardized vLLM-compatible interface for classification tasks.
## Error Handling
The API returns appropriate HTTP status codes and error messages:
- `400 Bad Request`: Invalid request format or missing required fields
- `500 Internal Server Error`: Server-side processing error
Error response format:
```json
{
  "error": "Error message",
  "type": "error_type",
  "code": 400
}
```
## Implementation Details
The classification API is implemented using:
1. **Rust Router**: Handles routing and request/response models in `sgl-router/src/protocols/spec.rs`
2. **Python HTTP Server**: Implements the actual endpoint in `python/sglang/srt/entrypoints/http_server.py`
3. **Classification Service**: Handles the classification logic in `python/sglang/srt/entrypoints/openai/serving_classify.py`
## Testing
Use the provided test script to verify the implementation:
```bash
python test_classify_api.py
```
## Compatibility
This implementation is compatible with vLLM's classification API format, allowing seamless migration from vLLM to SGLang for classification tasks.