Commit 636ee223 authored by Russell Bryant, committed by GitHub

[Docs] Document security risks of GPT-OSS Python tool (#35139)


Signed-off-by: Russell Bryant <rbryant@redhat.com>
parent b7d59ffc
......@@ -219,6 +219,47 @@ The most effective approach is to deploy vLLM behind a reverse proxy (such as ng
- Blocks all other endpoints, including the unauthenticated inference and operational control endpoints
- Implements additional authentication, rate limiting, and logging at the proxy layer
## Tool Server and MCP Security
vLLM supports connecting to external tool servers via the `--tool-server` argument. This enables models to call tools through the Responses API (`/v1/responses`). Tool server support works with all models — it is not limited to specific model architectures.
**Important:** No tool servers are enabled by default. They must be explicitly opted into via configuration.
### Built-in Demo Tools (GPT-OSS)
Passing `--tool-server demo` enables built-in demo tools that work with any model that supports tool calling. The tool implementations are not part of vLLM — they are provided by the separately installed [`gpt-oss`](https://github.com/openai/gpt-oss) package. vLLM provides thin wrappers that delegate to `gpt-oss`.
- **Code interpreter** (`python`): Runs Python code in a Docker container (implemented by `gpt_oss.tools.python_docker`)
- **Web browser** (`browser`): Search via Exa API, requires `EXA_API_KEY` (via `gpt_oss.tools.simple_browser`)
#### Code Interpreter (Python Tool) Security Risks
The code interpreter executes model-generated code inside a Docker container. However, the container is **not configured with network isolation by default**. It inherits the host's Docker networking configuration (e.g., default bridge network or `--network=host`), which means:
- The container may be able to access the host network and LAN.
- Internal services reachable from the container may be exploited via SSRF (Server-Side Request Forgery).
- Cloud metadata services (e.g., `169.254.169.254`) may be accessible.
- If vulnerable internal services (such as `torch.distributed` endpoints) are reachable from the container, model-generated code running in the container could be used to attack them.
This is particularly concerning because the code being executed is generated by the model, which may be influenced by adversarial inputs (prompt injection).
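To make the affected address ranges concrete, here is a minimal sketch that uses Python's standard `ipaddress` module to flag destinations that sandboxed code should usually not reach. The helper name `is_sensitive_destination` is hypothetical and is not part of vLLM or `gpt-oss`. It only illustrates the categories listed above; real enforcement belongs at the network layer (Docker network configuration or firewall rules), not in a check like this.

```python
import ipaddress


def is_sensitive_destination(address: str) -> bool:
    """Return True if an IP literal falls in a loopback, link-local, or private range.

    Hypothetical helper for illustration only; it does not resolve hostnames.
    """
    ip = ipaddress.ip_address(address)
    return ip.is_loopback or ip.is_link_local or ip.is_private


# The cloud metadata address is link-local; 10.0.0.5 is a typical LAN address.
for candidate in ("169.254.169.254", "10.0.0.5", "127.0.0.1", "8.8.8.8"):
    print(candidate, is_sensitive_destination(candidate))
```

A check like this is easy to bypass from inside the container (for example through DNS names or redirects), which is why isolating the container's network is the more robust control.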
#### Controlling Built-in Tool Availability
Built-in demo tools are controlled by two settings:
1. **`--tool-server demo`**: Enables the built-in demo tools (browser and Python code interpreter).
2. **`VLLM_GPT_OSS_SYSTEM_TOOL_MCP_LABELS`**: When built-in tools are requested via the `mcp` tool type in the Responses API, this comma-separated allowlist controls which tool labels are permitted. Valid values are:
    - `container` - Container tool
    - `code_interpreter` - Python code execution tool
    - `web_search_preview` - Web search/browser tool

    If this variable is unset or empty, no built-in tools requested via the `mcp` tool type are enabled.
To disable the Python code interpreter specifically, omit `code_interpreter` from `VLLM_GPT_OSS_SYSTEM_TOOL_MCP_LABELS`.
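The allowlist semantics described above can be sketched as follows. This is an illustrative parser, not vLLM's actual implementation: the function name `allowed_labels` is hypothetical, and stripping whitespace around entries is an assumption. How vLLM treats unknown labels is not modeled here.

```python
from __future__ import annotations

import os

ENV_VAR = "VLLM_GPT_OSS_SYSTEM_TOOL_MCP_LABELS"


def allowed_labels(raw: str | None) -> set[str]:
    """Parse a comma-separated allowlist; unset or empty means nothing is allowed."""
    if not raw:
        return set()
    # Assumption: surrounding whitespace is ignored and empty entries are skipped.
    return {item.strip() for item in raw.split(",") if item.strip()}


enabled = allowed_labels(os.environ.get(ENV_VAR))
print("code_interpreter enabled:", "code_interpreter" in enabled)
```

Under this reading, setting the variable to `web_search_preview` alone would permit the browser tool while leaving the Python code interpreter unavailable through the `mcp` tool type.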
**Consider a custom implementation**: The GPT-OSS Python tool is a reference implementation. For production deployments, consider implementing a custom code execution sandbox with stricter isolation guarantees. See the [GPT-OSS documentation](https://github.com/openai/gpt-oss?tab=readme-ov-file#python) for guidance.
## Reporting Security Vulnerabilities
If you believe you have found a security vulnerability in vLLM, please report it following the project's security policy. For more information on how to report security issues and the project's security policy, please see the [vLLM Security Policy](https://github.com/vllm-project/vllm/blob/main/SECURITY.md).
......@@ -125,8 +125,11 @@ class BaseFrontendArgs:
     `--tool-call-parser`."""
     tool_server: str | None = None
     """Comma-separated list of host:port pairs (IPv4, IPv6, or hostname).
-    Examples: 127.0.0.1:8000, [::1]:8000, localhost:1234. Or `demo` for demo
-    purpose."""
+    Examples: 127.0.0.1:8000, [::1]:8000, localhost:1234. Or `demo` for
+    built-in demo tools (browser and Python code interpreter). WARNING:
+    The `demo` Python tool executes model-generated code in Docker without
+    network isolation by default. See the security guide for more
+    information."""
     log_config_file: str | None = envs.VLLM_LOGGING_CONFIG_PATH
     """Path to logging config JSON file for both vllm and uvicorn"""
     max_log_len: int | None = None
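
The value format described in the `tool_server` docstring (a comma-separated list of host:port pairs, or the literal `demo`) can be illustrated with a small parser. This is a sketch under stated assumptions, not vLLM's own parsing code: the function name `parse_tool_server` and its error handling are hypothetical.

```python
from __future__ import annotations


def parse_tool_server(value: str) -> list[tuple[str, int]] | str:
    """Return "demo", or a list of (host, port) pairs parsed from `value`."""
    if value.strip() == "demo":
        return "demo"
    servers = []
    for entry in value.split(","):
        entry = entry.strip()
        # Split on the last colon so bracketed IPv6 literals keep their colons.
        host, sep, port = entry.rpartition(":")
        if not sep or not host or not port.isdigit():
            raise ValueError(f"expected host:port, got {entry!r}")
        # Drop the brackets around an IPv6 literal such as [::1].
        if host.startswith("[") and host.endswith("]"):
            host = host[1:-1]
        servers.append((host, int(port)))
    return servers


print(parse_tool_server("127.0.0.1:8000, [::1]:8000, localhost:1234"))
```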
......