@@ -95,6 +100,23 @@ You can build and install vLLM from source:
...
@@ -95,6 +100,23 @@ You can build and install vLLM from source:
Build a docker image from `Dockerfile.rocm`, and launch a docker container.
Build a docker image from `Dockerfile.rocm`, and launch a docker container.
The `Dokerfile.rocm` is designed to support both ROCm 5.7 and ROCm 6.0 and later versions. It provides flexibility to customize the build of docker image using the following arguments:
* `BASE_IMAGE`: specifies the base image used when running ``docker build``, specifically the PyTorch on ROCm base image. We have tested ROCm 5.7 and ROCm 6.0. The default is `rocm/pytorch:rocm6.0_ubuntu20.04_py3.9_pytorch_2.1.1`
* `FX_GFX_ARCHS`: specifies the GFX architecture that is used to build flash-attention, for example, `gfx90a;gfx942` for MI200 and MI300. The default is `gfx90a;gfx942`
* `FA_BRANCH`: specifies the branch used to build the flash-attention in `ROCmSoftwarePlatform's flash-attention repo <https://github.com/ROCmSoftwarePlatform/flash-attention>`_. The default is `3d2b6f5`
Their values can be passed in when running ``docker build`` with ``--build-arg`` options.
For example, to build docker image for vllm on ROCm 5.7, you can run:
To build vllm on ROCm 6.0, you can use the default:
.. code-block:: console
.. code-block:: console
$ docker build -f Dockerfile.rocm -t vllm-rocm .
$ docker build -f Dockerfile.rocm -t vllm-rocm .
...
@@ -142,3 +164,8 @@ Alternatively, if you plan to install vLLM-ROCm on a local machine or start from
...
@@ -142,3 +164,8 @@ Alternatively, if you plan to install vLLM-ROCm on a local machine or start from
$ cd vllm
$ cd vllm
$ pip install -U -r requirements-rocm.txt
$ pip install -U -r requirements-rocm.txt
$ python setup.py install # This may take 5-10 minutes.
$ python setup.py install # This may take 5-10 minutes.
.. note::
- You may need to turn on the ``--enforce-eager`` flag if you experience process hang when running the `benchmark_thoughput.py` script to test your installation.