Commit 341923b9, authored by Aziz, committed by GitHub

fix(tests): Ensure reliable CUDA cache clearing in MoE test (#23416)


Signed-off-by: AzizCode92 <azizbenothman76@gmail.com>
Signed-off-by: Michael Goin <mgoin64@gmail.com>
Co-authored-by: Michael Goin <mgoin64@gmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
parent 424fb7a5
@@ -429,11 +429,11 @@ def test_mixtral_moe(dtype: torch.dtype, padding: bool, use_rocm_aiter: bool,
             vllm_moe.experts.w13_weight, (0, 128), "constant", 0)[...,
                                                                   0:-128],
                                                 requires_grad=False)
-        torch.cuda.empty_cache()
         vllm_moe.experts.w2_weight = Parameter(F.pad(
             vllm_moe.experts.w2_weight, (0, 128), "constant", 0)[...,
                                                                  0:-128],
                                                 requires_grad=False)
+        torch.cuda.synchronize()
         torch.cuda.empty_cache()
     # Run forward passes for both MoE blocks
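
For readers who want the pattern in isolation: the change drops the intermediate `torch.cuda.empty_cache()` and instead calls `torch.cuda.synchronize()` once, right before the final `torch.cuda.empty_cache()`. Going by the commit title, the intent appears to be that queued GPU work finishes before cached memory is released. The following is a minimal sketch of that ordering, not code from the commit; the helper name `clear_cuda_cache` and its optional `cuda` argument are hypothetical and exist only so the call order can be exercised without a GPU.

```python
def clear_cuda_cache(cuda=None):
    """Synchronize the device, then release cached allocator memory.

    `cuda` defaults to `torch.cuda`. Any object exposing `is_available`,
    `synchronize`, and `empty_cache` may be passed instead (a hypothetical
    hook for this sketch, not part of the commit).
    Returns True if the cache was cleared, False if CUDA is unavailable.
    """
    if cuda is None:
        # Imported lazily so the helper can be exercised without PyTorch.
        import torch
        cuda = torch.cuda

    if not cuda.is_available():
        return False

    # CUDA kernels are launched asynchronously; block until queued work
    # on the device has finished.
    cuda.synchronize()
    # Release unoccupied cached memory held by PyTorch's caching allocator.
    cuda.empty_cache()
    return True
```

In a test this would be called once after the padded weights have been reassigned, e.g. `clear_cuda_cache()`, mirroring the two lines at the end of the hunk above.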