A tool to convert CSV files containing ChatGPT/GPT-4 conversation logs into mooncake-style JSONL format for load testing and simulation.
> [!NOTE]
> Currently, KV reuse is not considered in the output. We will update the script once [BurstGPT](https://github.com/HPMLL/BurstGPT) adds user session information.
## Input Format
The input CSV can be downloaded from [BurstGPT Release v1.1](https://github.com/HPMLL/BurstGPT/releases/tag/v1.1). It contains the following columns:
- `Timestamp`: Request timestamp in seconds
- `Model`: Model name (e.g., "ChatGPT", "GPT-4")
- `Request tokens`: Number of input tokens
- `Response tokens`: Number of output tokens
- `Total tokens`: Total tokens (not used)
- `Log Type`: Type of log (e.g., "Conversation log", "API log")
Example:
```csv
Timestamp,Model,Request tokens,Response tokens,Total tokens,Log Type
5,ChatGPT,472,18,490,Conversation log
45,ChatGPT,1087,230,1317,Conversation log
118,GPT-4,417,276,693,Conversation log
```
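As a rough illustration of how these columns can be read, here is a minimal Python sketch using only the standard library. It is not the tool's actual implementation: the function name `read_burstgpt_rows`, its `model` and `log_type` filter parameters, and the lowercase keys of the returned dicts are all hypothetical and chosen for this example only.

```python
import csv
import io


def read_burstgpt_rows(csv_file, model=None, log_type=None):
    """Yield one dict per CSV row, with numeric columns converted.

    Hypothetical helper for illustration; not part of the tool itself.
    `model` and `log_type` are optional exact-match filters.
    """
    for row in csv.DictReader(csv_file):
        if model is not None and row["Model"] != model:
            continue
        if log_type is not None and row["Log Type"] != log_type:
            continue
        yield {
            "timestamp": float(row["Timestamp"]),  # seconds
            "model": row["Model"],
            "request_tokens": int(row["Request tokens"]),
            "response_tokens": int(row["Response tokens"]),
            "log_type": row["Log Type"],
            # "Total tokens" is skipped because it is not used.
        }


# The example rows from above, held in memory instead of a file.
sample = """Timestamp,Model,Request tokens,Response tokens,Total tokens,Log Type
5,ChatGPT,472,18,490,Conversation log
45,ChatGPT,1087,230,1317,Conversation log
118,GPT-4,417,276,693,Conversation log
"""

rows = list(read_burstgpt_rows(io.StringIO(sample), model="ChatGPT"))
for r in rows:
    print(r["timestamp"], r["request_tokens"], r["response_tokens"])
```

With a real file, pass an open file handle instead of `io.StringIO`, for example `open(path, newline="")`, as the `csv` module recommends.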
## Output Format
The output is a JSONL file where each line is a JSON object: