Interpolate fix on cuda for large output tensors (#10067)

* Workaround for upscale with large output tensors. Fixes #10040.
* Fix scale when output_size is given
* Style

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

Commit 2312b27f, authored Dec 03, 2024 by Pedro Cuenca, committed by GitHub Dec 02, 2024

Showing with 8 additions and 0 deletions

src/diffusers/models/upsampling.py +8 -0

--- a/src/diffusers/models/upsampling.py
+++ b/src/diffusers/models/upsampling.py
@@ -165,6 +165,14 @@ class Upsample2D(nn.Module):
        # if `output_size` is passed we force the interpolation output
        # size and do not make use of `scale_factor=2`
        if self.interpolate:
+            # upsample_nearest_nhwc also fails when the number of output elements is large
+            # https://github.com/pytorch/pytorch/issues/141831
+            scale_factor = (
+                2 if output_size is None else max([f / s for f, s in zip(output_size, hidden_states.shape[-2:])])
+            )
+            if hidden_states.numel() * scale_factor > pow(2, 31):
+                hidden_states = hidden_states.contiguous()
            if output_size is None:
                hidden_states = F.interpolate(hidden_states, scale_factor=2.0, mode="nearest")
            else:
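
The size check in the hunk above can be tried out in isolation. The following is a minimal sketch, not code from the commit: `needs_contiguous` is a hypothetical helper name, and it works on plain shape tuples so it runs without PyTorch. It mirrors the patch's logic as written: the element count of the input is multiplied by a single per-axis scale factor (2 by default, otherwise the largest ratio between `output_size` and the current spatial size) and compared against 2**31. Note that this is the patch's own estimate, not the exact number of output elements.

```python
import math


def needs_contiguous(input_shape, output_size=None, limit=2**31):
    """Hypothetical helper mirroring the size check added in this commit."""
    if output_size is None:
        # Default path: nearest-neighbour upsampling with scale_factor=2.
        scale_factor = 2
    else:
        # Largest per-axis ratio between requested and current spatial size.
        scale_factor = max(f / s for f, s in zip(output_size, input_shape[-2:]))
    # Number of elements in the input tensor (what `numel()` would return).
    numel = math.prod(input_shape)
    return numel * scale_factor > limit


# A large feature map crosses the threshold; a small latent does not.
print(needs_contiguous((1, 320, 2048, 2048)))
print(needs_contiguous((1, 4, 64, 64)))
```

In the patch itself, a true result leads to `hidden_states = hidden_states.contiguous()` before `F.interpolate` is called, which per the comment in the diff sidesteps the `upsample_nearest_nhwc` failure tracked in the linked PyTorch issue.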