# Update PyTorch version on vLLM OSS CI/CD

vLLM's current policy is to always use the latest PyTorch stable
release in CI/CD. It is standard practice to submit a PR to update the
PyTorch version as early as possible when a new [PyTorch stable
release](https://github.com/pytorch/pytorch/blob/main/RELEASE.md#release-cadence) becomes available.
This process is non-trivial due to the gap between PyTorch
releases. Using <https://github.com/vllm-project/vllm/pull/16859> as an example, this document outlines common steps to achieve this
update along with a list of potential issues and how to address them.

## Test PyTorch release candidates (RCs)

Updating PyTorch in vLLM after the official release is not
ideal because any issues discovered at that point can only be resolved
by waiting for the next release or by implementing hacky workarounds in vLLM.
The better solution is to test vLLM with PyTorch release candidates (RC) to ensure
compatibility before each release.

PyTorch release candidates can be downloaded from the [PyTorch test index](https://download.pytorch.org/whl/test).
For example, the `torch2.7.0+cu12.8` RC can be installed using the following command:

```bash
uv pip install torch torchvision torchaudio \
    --index-url https://download.pytorch.org/whl/test/cu128
```

When the final RC is ready for testing, it will be announced to the community
on the [PyTorch dev-discuss forum](https://dev-discuss.pytorch.org/c/release-announcements).
After this announcement, we can begin testing vLLM integration by drafting a pull request
following this 3-step process:

1. Update [requirements files](https://github.com/vllm-project/vllm/tree/main/requirements)
to point to the new releases for `torch`, `torchvision`, and `torchaudio`.

2. Use the following option to get the final release candidates' wheels. Some common platforms are `cpu`, `cu128`, and `rocm6.2.4`.

    ```bash
    --extra-index-url https://download.pytorch.org/whl/test/<PLATFORM>
    ```

3. Since vLLM uses `uv`, ensure the following index strategy is applied:

    - Via environment variable:

    ```bash
    export UV_INDEX_STRATEGY=unsafe-best-match
    ```

    - Or via CLI flag:

    ```bash
    --index-strategy unsafe-best-match
    ```
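
Putting the three steps together, a local install against the RC index might look like the sketch below. The `PLATFORM` default of `cu128` and the `requirements/cuda.txt` path are assumptions chosen for illustration; substitute the platform and requirements file you are actually testing. The snippet only assembles and prints the command so that it can be reviewed before it is run.

```bash
# Assumption: cu128 is the platform under test; override with PLATFORM=cpu, etc.
PLATFORM="${PLATFORM:-cu128}"

# Test index that hosts the release-candidate wheels (step 2).
RC_INDEX_URL="https://download.pytorch.org/whl/test/${PLATFORM}"

# Assumption: this is the requirements file updated in step 1.
REQUIREMENTS_FILE="${REQUIREMENTS_FILE:-requirements/cuda.txt}"

# Assemble the command, including the uv index strategy from step 3.
INSTALL_CMD="uv pip install -r ${REQUIREMENTS_FILE} --extra-index-url ${RC_INDEX_URL} --index-strategy unsafe-best-match"

# Print rather than execute, so the command can be checked first.
echo "${INSTALL_CMD}"
```

Once the printed command looks right, it can be pasted into a shell and run as-is.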

If failures are found in the pull request, raise them as issues on vLLM and
cc the PyTorch release team to initiate discussion on how to address them.

## Update CUDA version

The PyTorch release matrix includes both stable and experimental [CUDA versions](https://github.com/pytorch/pytorch/blob/main/RELEASE.md#release-compatibility-matrix). Due to limitations, only the latest stable CUDA version (for example, torch `2.7.1+cu126`) is uploaded to PyPI. However, vLLM may require a different CUDA version,
such as 12.8 for Blackwell support.
This complicates the process as we cannot use the out-of-the-box
`pip install torch torchvision torchaudio` command. The solution is to use
`--extra-index-url` in vLLM's Dockerfiles.

- Important indexes at the moment include:

| Platform | `--extra-index-url` |
| -------- | ------------------- |
| CUDA 12.8 | [https://download.pytorch.org/whl/cu128](https://download.pytorch.org/whl/cu128) |
| CPU | [https://download.pytorch.org/whl/cpu](https://download.pytorch.org/whl/cpu) |
| ROCm 6.2 | [https://download.pytorch.org/whl/rocm6.2.4](https://download.pytorch.org/whl/rocm6.2.4) |
| ROCm 6.3 | [https://download.pytorch.org/whl/rocm6.3](https://download.pytorch.org/whl/rocm6.3) |
| XPU | [https://download.pytorch.org/whl/xpu](https://download.pytorch.org/whl/xpu) |

- Update the files below to match the CUDA version selected from the indexes above. This ensures that the released vLLM wheel is tested in CI.
    - `.buildkite/release-pipeline.yaml`
    - `.buildkite/scripts/upload-wheels.sh`
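
The index table above can be captured in a small shell helper so that scripts do not hard-code URLs. This is only a sketch: `pytorch_index_url` is a hypothetical function name, not something defined in the vLLM repository, and the platform list simply mirrors the table.

```bash
# Hypothetical helper: map a platform tag from the table to its PyTorch index URL.
pytorch_index_url() {
    case "$1" in
        cu128|cpu|rocm6.2.4|rocm6.3|xpu)
            echo "https://download.pytorch.org/whl/$1"
            ;;
        *)
            # Reject anything that is not listed in the table.
            echo "unknown platform: $1" >&2
            return 1
            ;;
    esac
}
```

With this in place, an install line could pass `--extra-index-url "$(pytorch_index_url cu128)"` rather than spelling out the URL.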

## Manually running vLLM builds on Buildkite CI

When building vLLM with a new PyTorch/CUDA version, the vLLM sccache S3 bucket
will not have any cached artifacts, which can cause CI build jobs to exceed 5 hours.
Furthermore, vLLM's fastcheck pipeline operates in read-only mode and does not
populate the cache, making it ineffective for cache warm-up purposes.

To address this, manually trigger a build on Buildkite to accomplish two objectives:

1. Run the complete test suite against the PyTorch RC build by setting the environment variables `RUN_ALL=1` and `NIGHTLY=1`.
2. Populate the vLLM sccache S3 bucket with compiled artifacts, enabling faster subsequent builds

<p align="center" width="100%">
<img width="60%" alt="Buildkite new build popup" src="https://github.com/user-attachments/assets/3b07f71b-bb18-4ca3-aeaf-da0fe79d315f" />
</p>
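
For reference, the two variables from step 1 would be entered in the environment variables field of the new build dialog, one `KEY=value` pair per line. The sketch below shows that field's contents only and assumes the other fields (branch, commit, message) are filled in as for any manual build.

```bash
RUN_ALL=1
NIGHTLY=1
```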

## Update all the different vLLM platforms

Rather than attempting to update all vLLM platforms in a single pull request, it's more manageable
to handle some platforms separately. The separation of requirements and Dockerfiles
for different platforms in vLLM CI/CD allows us to selectively choose
which platforms to update. For instance, updating XPU requires the corresponding
release from [Intel Extension for PyTorch](https://github.com/intel/intel-extension-for-pytorch) by Intel.
While <https://github.com/vllm-project/vllm/pull/16859> updated vLLM to PyTorch 2.7.0 on CPU, CUDA, and ROCm,
<https://github.com/vllm-project/vllm/pull/17444> completed the update for XPU.