Unverified Commit e649678b authored by C, committed by GitHub

[Flux] Optimize guidance creation in flux pipeline by moving it outside the loop (#9153)



* optimize guidance creation in flux pipeline by moving it outside the loop

* use torch.full instead of torch.tensor to create a tensor with a single value

---------
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
parent 39b87b14
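
For context, a minimal standalone sketch of the change (variable names and values here are illustrative, not the pipeline's actual locals): the guidance tensor does not depend on the loop variable, so it can be created once before the denoising loop, and `torch.full` fills a single-element tensor with the scalar directly, with an explicit dtype.

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
guidance_scale = 3.5   # illustrative value
batch_size = 4         # stands in for latents.shape[0]
num_inference_steps = 28

# Before: the tensor was rebuilt on every denoising step.
for _ in range(num_inference_steps):
    guidance = torch.tensor([guidance_scale], device=device)
    guidance = guidance.expand(batch_size)

# After: built once before the loop. torch.full fills a single-element
# tensor with the scalar (here pinned to float32), and expand() returns
# a broadcast view across the batch without copying data.
guidance = torch.full([1], guidance_scale, device=device, dtype=torch.float32)
guidance = guidance.expand(batch_size)
```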
@@ -677,6 +677,13 @@ class FluxPipeline(DiffusionPipeline, FluxLoraLoaderMixin):
         num_warmup_steps = max(len(timesteps) - num_inference_steps * self.scheduler.order, 0)
         self._num_timesteps = len(timesteps)
 
+        # handle guidance
+        if self.transformer.config.guidance_embeds:
+            guidance = torch.full([1], guidance_scale, device=device, dtype=torch.float32)
+            guidance = guidance.expand(latents.shape[0])
+        else:
+            guidance = None
+
         # 6. Denoising loop
         with self.progress_bar(total=num_inference_steps) as progress_bar:
             for i, t in enumerate(timesteps):
@@ -686,13 +693,6 @@ class FluxPipeline(DiffusionPipeline, FluxLoraLoaderMixin):
                 # broadcast to batch dimension in a way that's compatible with ONNX/Core ML
                 timestep = t.expand(latents.shape[0]).to(latents.dtype)
 
-                # handle guidance
-                if self.transformer.config.guidance_embeds:
-                    guidance = torch.tensor([guidance_scale], device=device)
-                    guidance = guidance.expand(latents.shape[0])
-                else:
-                    guidance = None
-
                 noise_pred = self.transformer(
                     hidden_states=latents,
                     # YiYi notes: divide it by 1000 for now because we scale it by 1000 in the transformer model (we should not keep it but I want to keep the inputs same for the model for testing)
...
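
For reference, a usage sketch of the code path this commit touches; the model id and argument values are assumptions for illustration, not part of this commit:

```python
import torch
from diffusers import FluxPipeline

# Model id and parameter values are illustrative assumptions.
pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
).to("cuda")

# guidance_scale is the scalar that becomes the per-batch `guidance`
# tensor, now created once before the denoising loop in the diff above.
image = pipe(
    "a tiny astronaut hatching from an egg on the moon",
    guidance_scale=3.5,
    num_inference_steps=28,
).images[0]
image.save("flux_out.png")
```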