Unverified commit 7d7e3b78 authored by Woosuk Kwon, committed by GitHub

Use `--ipc=host` in docker run for distributed inference (#1125)

parent f98b745a
@@ -46,4 +46,5 @@ You can also build and install vLLM from source:

 .. code-block:: console

     $ # Pull the Docker image with CUDA 11.8.
-    $ docker run --gpus all -it --rm --shm-size=8g nvcr.io/nvidia/pytorch:22.12-py3
+    $ # Use `--ipc=host` to make sure the shared memory is large enough.
+    $ docker run --gpus all -it --rm --ipc=host nvcr.io/nvidia/pytorch:22.12-py3
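The change replaces a fixed `--shm-size=8g` with `--ipc=host`, which gives the container the host's full shared-memory segment; distributed inference workers (e.g. NCCL and multiprocessing queues) communicate through `/dev/shm`, and Docker's default of 64 MB is too small. As a quick sanity check (a sketch, not part of this commit), the shared memory actually available inside a running container can be inspected from Python:

```python
import os

# Inspect the shared-memory filesystem that inter-process communication
# relies on. With Docker defaults this reports ~64 MiB; with --ipc=host
# (or a larger --shm-size) it reflects the host's shared-memory segment.
stats = os.statvfs("/dev/shm")
shm_bytes = stats.f_blocks * stats.f_frsize
print(f"/dev/shm size: {shm_bytes / (1024 ** 2):.0f} MiB")
```

If this prints a small number (tens of MiB) inside the container, distributed runs are likely to fail with shared-memory errors, which is exactly what the `--ipc=host` flag avoids.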