Project: OpenDAS / vllm_cscc

Commit b5bae42f (Unverified)
Authored Oct 30, 2025 by Kunshang Ji
Committed by GitHub, Oct 30, 2025

[XPU] Update latest IPEX 2.8 release (#27735)

Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>

Parent: d7fb10c5

Changes: 4 changed files with 14 additions and 20 deletions

.buildkite/scripts/hardware_ci/run-xpu-test.sh      +5  -2
docs/getting_started/installation/gpu.xpu.inc.md    +3  -1
requirements/xpu.txt                                +1  -1
vllm/_ipex_ops.py                                   +5  -16

.buildkite/scripts/hardware_ci/run-xpu-test.sh
@@ -20,7 +20,10 @@ trap remove_docker_container EXIT
 # Run the image and test offline inference/tensor parallel
 docker run \
-    --device /dev/dri \
+    --device /dev/dri:/dev/dri \
+    --net=host \
+    --ipc=host \
+    --privileged \
     -v /dev/dri/by-path:/dev/dri/by-path \
     --entrypoint="" \
     -e "HF_TOKEN=${HF_TOKEN}" \
@@ -42,7 +45,7 @@ docker run \
     pytest -v -s v1/sample --ignore=v1/sample/test_logprobs.py --ignore=v1/sample/test_logprobs_e2e.py
     pytest -v -s v1/worker --ignore=v1/worker/test_gpu_model_runner.py
     pytest -v -s v1/structured_output
-    pytest -v -s v1/spec_decode --ignore=v1/spec_decode/test_max_len.py --ignore=v1/spec_decode/test_tree_attention.py
+    pytest -v -s v1/spec_decode --ignore=v1/spec_decode/test_max_len.py --ignore=v1/spec_decode/test_tree_attention.py --ignore=v1/spec_decode/test_speculators_eagle3.py
     pytest -v -s v1/kv_connector/unit --ignore=v1/kv_connector/unit/test_multi_connector.py --ignore=v1/kv_connector/unit/test_nixl_connector.py --ignore=v1/kv_connector/unit/test_shared_storage_connector.py
     pytest -v -s v1/test_serial_utils.py
 '

docs/getting_started/installation/gpu.xpu.inc.md
@@ -56,8 +56,10 @@ docker build -f docker/Dockerfile.xpu -t vllm-xpu-env --shm-size=4g .
 docker run -it \
              --rm \
              --network=host \
-             --device /dev/dri \
+             --device /dev/dri:/dev/dri \
              -v /dev/dri/by-path:/dev/dri/by-path \
+             --ipc=host \
+             --privileged \
              vllm-xpu-env
 ```

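The `--device /dev/dri:/dev/dri` and `-v /dev/dri/by-path:/dev/dri/by-path` mappings above only help if the DRM device nodes are actually visible inside the container. A minimal sketch for checking that from Python, assuming the usual Linux naming of `card*` and `renderD*` nodes. `list_dri_nodes` is a hypothetical helper for illustration, not part of vLLM or IPEX.

```python
import tempfile
from pathlib import Path


def list_dri_nodes(root: str = "/dev/dri") -> list[str]:
    """Return the sorted names of DRM nodes (card*, renderD*) found under root.

    Hypothetical helper: returns an empty list when root does not exist,
    which is what you would see if the device was not passed to the container.
    """
    base = Path(root)
    if not base.is_dir():
        return []
    return sorted(
        p.name for p in base.iterdir() if p.name.startswith(("card", "renderD"))
    )


# Demo against a throwaway directory that imitates a /dev/dri layout,
# so the example runs on machines without an Intel GPU.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "card0").touch()
    (Path(tmp) / "renderD128").touch()
    (Path(tmp) / "by-path").mkdir()
    demo = list_dri_nodes(tmp)

print(demo)              # nodes found in the fake layout
print(list_dri_nodes())  # nodes found on this machine, possibly []
```

Run inside the container, an empty result from `list_dri_nodes()` suggests the `--device` mapping did not take effect.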
requirements/xpu.txt
@@ -15,4 +15,4 @@ torchaudio
 torchvision
 --extra-index-url=https://download.pytorch.org/whl/xpu
-intel-extension-for-pytorch @ https://intel-extension-for-pytorch.s3.us-east-1.amazonaws.com/ipex_dev/xpu/intel_extension_for_pytorch-2.8.10.post0%2Bxpu-cp312-cp312-linux_x86_64.whl
+intel-extension-for-pytorch @ https://intel-extension-for-pytorch.s3.us-east-1.amazonaws.com/ipex_dev/xpu/intel_extension_for_pytorch-2.8.10.post1%2Bxpu-cp312-cp312-linux_x86_64.whl

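The only change in the pinned wheel is `post0` to `post1`; the `%2B` in the URL is a percent-encoded `+`, so the version is `2.8.10.post1+xpu`. As an illustrative sketch, the wheel file name can be split into its fields with the standard library. `parse_wheel_url` is a hypothetical helper, and it assumes a wheel name without the optional build tag.

```python
from urllib.parse import unquote, urlsplit


def parse_wheel_url(url: str) -> dict[str, str]:
    """Split a wheel URL into the fields encoded in its file name.

    Assumes the five-field form name-version-python-abi-platform.whl
    (no optional build tag), which matches the URL in requirements/xpu.txt.
    """
    # Take the last path segment and decode percent-escapes (%2B -> '+').
    filename = unquote(urlsplit(url).path.rsplit("/", 1)[-1])
    stem = filename.removesuffix(".whl")
    name, version, python_tag, abi_tag, platform_tag = stem.split("-")
    return {
        "name": name,
        "version": version,
        "python_tag": python_tag,
        "abi_tag": abi_tag,
        "platform_tag": platform_tag,
    }


URL = (
    "https://intel-extension-for-pytorch.s3.us-east-1.amazonaws.com/ipex_dev/xpu/"
    "intel_extension_for_pytorch-2.8.10.post1%2Bxpu-cp312-cp312-linux_x86_64.whl"
)
info = parse_wheel_url(URL)
print(info["version"])       # 2.8.10.post1+xpu
print(info["python_tag"])    # cp312
print(info["platform_tag"])  # linux_x86_64
```

The `cp312` tags mean this pin only resolves on CPython 3.12.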
vllm/_ipex_ops.py
@@ -151,7 +151,9 @@ class ipex_ops:
     def rms_norm(
         input: torch.Tensor, weight: torch.Tensor, epsilon: float
     ) -> torch.Tensor:
-        return ipex.llm.functional.rms_norm(input, weight, epsilon)
+        out = torch.empty_like(input)
+        torch.ops.torch_ipex.rms_norm_vllm(out, input.contiguous(), weight, epsilon)
+        return out
 
     @staticmethod
     def fused_add_rms_norm(
@@ -160,10 +162,7 @@ class ipex_ops:
         weight: torch.Tensor,
         epsilon: float,
     ) -> None:
-        tmp = ipex.llm.functional.add_rms_norm(
-            residual, input, weight, None, epsilon, True
-        )
-        input.copy_(tmp)
+        torch.ops.torch_ipex.fused_add_rms_norm_vllm(input, residual, weight, epsilon)
 
     @staticmethod
     def varlen_attention(
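The two hunks above swap the `ipex.llm.functional` wrappers for ops that write into caller-provided tensors. As a rough reference for what those calls are expected to compute, here is a sketch in plain Python lists so it runs without PyTorch or IPEX. It assumes the standard RMSNorm definition, and it assumes that the fused op updates both `input` and `residual` in place, which is inferred from the `-> None` signature and the removed `input.copy_(tmp)`, not from IPEX documentation.

```python
import math


def rms_norm(x: list[float], weight: list[float], epsilon: float) -> list[float]:
    """Return a new list: x / sqrt(mean(x^2) + epsilon) * weight."""
    inv_rms = 1.0 / math.sqrt(sum(v * v for v in x) / len(x) + epsilon)
    return [v * inv_rms * w for v, w in zip(x, weight)]


def fused_add_rms_norm(
    x: list[float], residual: list[float], weight: list[float], epsilon: float
) -> None:
    """Assumed semantics: residual += x, then x = rms_norm(residual), in place."""
    for i, v in enumerate(x):
        residual[i] += v
    x[:] = rms_norm(residual, weight, epsilon)


weight = [1.0, 1.0]

out = rms_norm([3.0, 4.0], weight, 0.0)
# With unit weights and epsilon = 0, the output has mean square 1.
mean_square = sum(v * v for v in out) / len(out)

x = [1.0, 2.0]
residual = [2.0, 2.0]
fused_add_rms_norm(x, residual, weight, 0.0)
print(residual)  # the sum, kept for the next layer
print(x == out)  # the normalized sum overwrites x
```

On real tensors the same check can be used to compare the IPEX ops against a reference implementation after upgrading.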
@@ -296,16 +295,6 @@ class ipex_ops:
         num_splits=0,
         s_aux: torch.Tensor | None = None,
     ):
-        if cu_seqlens_k is None:
-            # cu_seqlens_k is not used in ipex kernel.
-            cu_seqlens_k = torch.cumsum(seqused_k, dim=0)
-            cu_seqlens_k = torch.cat(
-                [
-                    torch.tensor([0], device=seqused_k.device, dtype=torch.int32),
-                    cu_seqlens_k,
-                ]
-            ).to(torch.int32)
-
         real_window_size: tuple[int, int]
         if window_size is None:
             real_window_size = (-1, -1)
@@ -318,7 +307,7 @@ class ipex_ops:
             k,
             v,
             cu_seqlens_q,
-            cu_seqlens_k,
+            seqused_k,
             max_seqlen_q,
             max_seqlen_k,
             softmax_scale,
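
The last two hunks drop the fallback that built `cu_seqlens_k` from `seqused_k` and pass `seqused_k` to the kernel instead. The relationship between the two, as the removed block computed it, is a cumulative sum with a leading zero. A small sketch in plain Python, using `itertools.accumulate` in place of `torch.cumsum` and `torch.cat`; the function names are illustrative only.

```python
from itertools import accumulate


def cu_seqlens_from_seqused(seqused_k: list[int]) -> list[int]:
    """Cumulative key offsets with a leading 0, as the removed block built them."""
    return [0, *accumulate(seqused_k)]


def seqused_from_cu_seqlens(cu_seqlens_k: list[int]) -> list[int]:
    """Inverse direction: per-sequence lengths are adjacent differences."""
    return [b - a for a, b in zip(cu_seqlens_k, cu_seqlens_k[1:])]


seqused_k = [3, 5, 2]  # keys used by each of three sequences
cu_seqlens_k = cu_seqlens_from_seqused(seqused_k)
print(cu_seqlens_k)                           # [0, 3, 8, 10]
print(seqused_from_cu_seqlens(cu_seqlens_k))  # [3, 5, 2]
```

Sequence `i` occupies positions `cu_seqlens_k[i]` up to but not including `cu_seqlens_k[i + 1]`, so the two forms are equivalent only when the sequences are packed back to back.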