[Docs] Document security risks of GPT-OSS Python tool (#35139)

Signed-off-by: Russell Bryant <rbryant@redhat.com>

Commit 636ee223 (unverified), authored Mar 04, 2026 by Russell Bryant; committed by GitHub Mar 04, 2026

Showing with 46 additions and 2 deletions

docs/usage/security.md +41 -0

vllm/entrypoints/openai/cli_args.py +5 -2

--- a/docs/usage/security.md
+++ b/docs/usage/security.md
@@ -219,6 +219,47 @@ The most effective approach is to deploy vLLM behind a reverse proxy (such as ng
 - Blocks all other endpoints, including the unauthenticated inference and operational control endpoints
 - Implements additional authentication, rate limiting, and logging at the proxy layer

+## Tool Server and MCP Security
+
+vLLM supports connecting to external tool servers via the `--tool-server` argument. This enables models to call tools through the Responses API (`/v1/responses`). Tool server support works with all models — it is not limited to specific model architectures.
+
+**Important:** No tool servers are enabled by default. They must be explicitly opted into via configuration.
+
+### Built-in Demo Tools (GPT-OSS)
+
+Passing `--tool-server demo` enables built-in demo tools that work with any model that supports tool calling. The tool implementations are not part of vLLM — they are provided by the separately installed [`gpt-oss`](https://github.com/openai/gpt-oss) package. vLLM provides thin wrappers that delegate to `gpt-oss`.
+
+- **Code interpreter** (`python`): Python execution in Docker (via `gpt_oss.tools.python_docker`)
+- **Web browser** (`browser`): Search via Exa API, requires `EXA_API_KEY` (via `gpt_oss.tools.simple_browser`)
+
+#### Code Interpreter (Python Tool) Security Risks
+
+The code interpreter executes model-generated code inside a Docker container. However, the container is **not configured with network isolation by default**. It inherits the host's Docker networking configuration (e.g., default bridge network or `--network=host`), which means:
+
+- The container may be able to access the host network and LAN.
+- Internal services reachable from the container may be exploited via SSRF (Server-Side Request Forgery).
+- Cloud metadata services (e.g., `169.254.169.254`) may be accessible.
+- If vulnerable internal services (such as `torch.distributed` endpoints) are reachable from the container, model-generated code could be used to attack them.
+
+This is particularly concerning because the code being executed is generated by the model, which may be influenced by adversarial inputs (prompt injection).
+
+#### Controlling Built-in Tool Availability
+
+Built-in demo tools are controlled by two settings:
+
+1. **`--tool-server demo`**: Enables the built-in demo tools (browser and Python code interpreter).
+
+2. **`VLLM_GPT_OSS_SYSTEM_TOOL_MCP_LABELS`**: When built-in tools are requested via the `mcp` tool type in the Responses API, this comma-separated allowlist controls which tool labels are permitted. Valid values are:
+   - `container` - Container tool
+   - `code_interpreter` - Python code execution tool
+   - `web_search_preview` - Web search/browser tool
+
+   If this variable is not set or is empty, no built-in tools requested via the `mcp` tool type will be enabled.
+
+To disable the Python code interpreter specifically, omit `code_interpreter` from `VLLM_GPT_OSS_SYSTEM_TOOL_MCP_LABELS`.
+
+**Consider a custom implementation**: The GPT-OSS Python tool is a reference implementation. For production deployments, consider implementing a custom code execution sandbox with stricter isolation guarantees. See the [GPT-OSS documentation](https://github.com/openai/gpt-oss?tab=readme-ov-file#python) for guidance.
+
 ## Reporting Security Vulnerabilities

 If you believe you have found a security vulnerability in vLLM, please report it following the project's security policy. For more information on how to report security issues and the project's security policy, please see the [vLLM Security Policy](https://github.com/vllm-project/vllm/blob/main/SECURITY.md).
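
Reader's note (not part of the commit): the `VLLM_GPT_OSS_SYSTEM_TOOL_MCP_LABELS` allowlist documented in the hunk above can be pictured with the minimal sketch below. It is an illustration of the documented semantics only (comma-separated labels; unset or empty means nothing is enabled) and is not vLLM's actual implementation. The helper names `parse_mcp_label_allowlist` and `is_tool_allowed` are hypothetical, and rejecting unknown labels with an error is an assumption, since the documentation does not say how unknown values are handled.

```python
import os
from typing import Optional, Set

# Labels listed as valid in the documentation added by this commit.
VALID_LABELS = {"container", "code_interpreter", "web_search_preview"}
ENV_VAR = "VLLM_GPT_OSS_SYSTEM_TOOL_MCP_LABELS"


def parse_mcp_label_allowlist(raw: Optional[str]) -> Set[str]:
    """Hypothetical helper: turn the comma-separated value into a set of labels.

    An unset or empty value yields an empty set, i.e. no built-in tool
    requested via the `mcp` tool type is permitted.
    """
    if not raw:
        return set()
    labels = {item.strip() for item in raw.split(",") if item.strip()}
    unknown = labels - VALID_LABELS
    if unknown:
        # Assumption: fail loudly on typos rather than silently ignoring them.
        raise ValueError(f"unknown tool labels: {sorted(unknown)}")
    return labels


def is_tool_allowed(label: str, raw: Optional[str]) -> bool:
    """Hypothetical helper: True only if `label` appears in the allowlist."""
    return label in parse_mcp_label_allowlist(raw)


if __name__ == "__main__":
    # Omitting `code_interpreter` from the variable disables the Python tool.
    current = os.environ.get(ENV_VAR)
    print("code_interpreter allowed:", is_tool_allowed("code_interpreter", current))
```

Under this reading, setting the variable to `container,web_search_preview` keeps the container and browser labels available while leaving the Python code interpreter disabled.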
--- a/vllm/entrypoints/openai/cli_args.py
+++ b/vllm/entrypoints/openai/cli_args.py
@@ -125,8 +125,11 @@ class BaseFrontendArgs:
    `--tool-call-parser`."""
    tool_server: str | None = None
    """Comma-separated list of host:port pairs (IPv4, IPv6, or hostname).
-    Examples: 127.0.0.1:8000, [::1]:8000, localhost:1234. Or `demo` for demo
-    purpose."""
+    Examples: 127.0.0.1:8000, [::1]:8000, localhost:1234. Or `demo` for
+    built-in demo tools (browser and Python code interpreter). WARNING:
+    The `demo` Python tool executes model-generated code in Docker without
+    network isolation by default. See the security guide for more
+    information."""
    log_config_file: str | None = envs.VLLM_LOGGING_CONFIG_PATH
    """Path to logging config JSON file for both vllm and uvicorn"""
    max_log_len: int | None = None
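
Reader's note (not part of the commit): the `tool_server` docstring above describes a value that is either the literal `demo` or a comma-separated list of `host:port` pairs (IPv4, bracketed IPv6, or hostname). The sketch below shows one way such a value could be split, using only the standard library. The function name `parse_tool_server` is hypothetical, and this is not how vLLM itself parses the argument.

```python
from typing import List, Tuple, Union
from urllib.parse import urlsplit


def parse_tool_server(value: str) -> Union[str, List[Tuple[str, int]]]:
    """Hypothetical helper: return "demo" or a list of (host, port) pairs."""
    if value.strip() == "demo":
        return "demo"
    pairs = []
    for item in value.split(","):
        item = item.strip()
        # Prefixing "//" makes urlsplit treat the item as a network location,
        # which handles bracketed IPv6 literals such as [::1]:8000.
        parts = urlsplit(f"//{item}")
        if parts.hostname is None or parts.port is None:
            raise ValueError(f"expected host:port, got {item!r}")
        pairs.append((parts.hostname, parts.port))
    return pairs


if __name__ == "__main__":
    print(parse_tool_server("127.0.0.1:8000, [::1]:8000, localhost:1234"))
```

Treating `demo` as a distinct return value keeps the security-relevant case (the built-in tools, including the Python code interpreter without network isolation by default) easy to detect and warn about before any connection is attempted.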