Unverified Commit 51fd3dd2 authored by M. Tolga Cangöz, committed by GitHub

[`Docs`] Remove `.to('cuda')` before `.enable_model_cpu_offload()` (#5795)

Remove .to('cuda') before cpu_offload, trim trailing whitespaces
parent 98457580
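The pattern this commit applies throughout the docs: `enable_model_cpu_offload()` replaces an up-front `.to("cuda")`. A minimal sketch of the intended usage (model and prompt chosen for illustration, both taken from snippets in this diff):

```py
import torch
from diffusers import AutoPipelineForText2Image

pipeline = AutoPipelineForText2Image.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
)
# Note: no .to("cuda") here. enable_model_cpu_offload() manages device
# placement itself, moving each sub-model to the GPU only while it runs
# and back to the CPU afterwards, which lowers peak VRAM usage.
pipeline.enable_model_cpu_offload()

image = pipeline("Astronaut in a jungle, cold color palette, detailed, 8k").images[0]
```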
@@ -50,7 +50,7 @@ Under the hood, [`AutoPipelineForText2Image`]:
1. automatically detects a `"stable-diffusion"` class from the [`model_index.json`](https://huggingface.co/runwayml/stable-diffusion-v1-5/blob/main/model_index.json) file
2. loads the corresponding text-to-image [`StableDiffusionPipeline`] based on the `"stable-diffusion"` class name
Likewise, for image-to-image, [`AutoPipelineForImage2Image`] detects a `"stable-diffusion"` checkpoint from the `model_index.json` file and loads the corresponding [`StableDiffusionImg2ImgPipeline`] behind the scenes. You can also pass any additional arguments specific to the pipeline class, such as `strength`, which determines the amount of noise or variation added to an input image:
```py
from diffusers import AutoPipelineForImage2Image
...
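To make the `strength` behavior above concrete, a short sketch (the input image URL is the sketch-mountains sample used elsewhere in this diff; the prompt and value are illustrative):

```py
import torch
from diffusers import AutoPipelineForImage2Image
from diffusers.utils import load_image

pipeline = AutoPipelineForImage2Image.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
)
pipeline.enable_model_cpu_offload()

init_image = load_image(
    "https://raw.githubusercontent.com/CompVis/stable-diffusion/main/assets/stable-samples/img2img/sketch-mountains-input.jpg"
)
# low strength stays close to the input image; higher strength adds more
# noise, so the prompt dominates and the result drifts further from the input
image = pipeline("A fantasy landscape", image=init_image, strength=0.8).images[0]
```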
...
@@ -12,7 +12,7 @@ specific language governing permissions and limitations under the License.
# Overview
Welcome to 🧨 Diffusers! If you're new to diffusion models and generative AI, and want to learn more, then you've come to the right place. These beginner-friendly tutorials are designed to provide a gentle introduction to diffusion models and help you understand the library fundamentals - the core components and how 🧨 Diffusers is meant to be used.
You'll learn how to use a pipeline for inference to rapidly generate things, and then deconstruct that pipeline to really understand how to use the library as a modular toolbox for building your own diffusion systems. In the next lesson, you'll learn how to train your own diffusion model to generate what you want.
...
@@ -58,7 +58,7 @@ image
```
![toy-face](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/peft_integration/diffusers_peft_lora_inference_8_1.png)
With the `adapter_name` parameter, it is really easy to use another adapter for inference! Load the [nerijs/pixel-art-xl](https://huggingface.co/nerijs/pixel-art-xl) adapter that has been fine-tuned to generate pixel art images, and let's call it `"pixel"`.
@@ -80,7 +80,7 @@ image
```
![pixel-art](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/peft_integration/diffusers_peft_lora_inference_12_1.png)
## Combine multiple adapters
You can also perform multi-adapter inference by combining different adapter checkpoints.
@@ -112,7 +112,7 @@ image
```
![toy-face-pixel-art](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/peft_integration/diffusers_peft_lora_inference_16_1.png)
Impressive! As you can see, the model was able to generate an image that mixes the characteristics of both adapters.
If you want to go back to using only one adapter, use the [`~diffusers.loaders.UNet2DConditionLoadersMixin.set_adapters`] method to activate the `"toy"` adapter:
...
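As a quick illustration of the `set_adapters` calls described above — a sketch assuming a pipeline `pipe` that already has two LoRAs loaded with `adapter_name="toy"` and `adapter_name="pixel"` (the prompt is made up):

```py
# activate both adapters and weight their contributions
pipe.set_adapters(["toy", "pixel"], adapter_weights=[0.5, 1.0])
image = pipe("toy_face of a hacker with a hoodie, pixel art style").images[0]

# revert to a single active adapter
pipe.set_adapters("toy")
```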
@@ -226,7 +226,7 @@ pipeline = AutoPipelineForText2Image.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")
image = pipeline(
    prompt="Astronaut in a jungle, cold color palette, muted colors, detailed, 8k",
    negative_prompt="ugly, deformed, disfigured, poor details, bad anatomy",
).images[0]
image
@@ -258,7 +258,7 @@ pipeline = AutoPipelineForText2Image.from_pretrained(
).to("cuda")
generator = torch.Generator(device="cuda").manual_seed(30)
image = pipeline(
    "Astronaut in a jungle, cold color palette, muted colors, detailed, 8k",
    generator=generator,
).images[0]
image
...
@@ -49,7 +49,7 @@ To ensure your pipeline and its components (`unet` and `scheduler`) can be saved
+ self.register_modules(unet=unet, scheduler=scheduler)
```
Cool, the `__init__` step is done and you can move to the forward pass now! 🔥
## Define the forward pass
...
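For context, `register_modules` lives inside the pipeline's `__init__`; a minimal sketch of the surrounding class (names follow the snippet above):

```py
from diffusers import DiffusionPipeline


class MyPipeline(DiffusionPipeline):
    def __init__(self, unet, scheduler):
        super().__init__()
        # registering the components makes them accessible as self.unet /
        # self.scheduler and lets save_pretrained()/from_pretrained() track them
        self.register_modules(unet=unet, scheduler=scheduler)
```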
@@ -86,7 +86,7 @@ import torch
controlnet = ControlNetModel.from_pretrained("lllyasviel/sd-controlnet-canny", torch_dtype=torch.float16, use_safetensors=True)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16, use_safetensors=True
-).to("cuda")
+)
pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config)
pipe.enable_model_cpu_offload()
@@ -146,7 +146,7 @@ import torch
controlnet = ControlNetModel.from_pretrained("lllyasviel/control_v11f1p_sd15_depth", torch_dtype=torch.float16, use_safetensors=True)
pipe = StableDiffusionControlNetImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16, use_safetensors=True
-).to("cuda")
+)
pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config)
pipe.enable_model_cpu_offload()
@@ -231,7 +231,7 @@ import torch
controlnet = ControlNetModel.from_pretrained("lllyasviel/control_v11p_sd15_inpaint", torch_dtype=torch.float16, use_safetensors=True)
pipe = StableDiffusionControlNetInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16, use_safetensors=True
-).to("cuda")
+)
pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config)
pipe.enable_model_cpu_offload()
...
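As an aside, the canny ControlNet above expects an edge map as its conditioning image; a sketch of the usual preprocessing (thresholds are illustrative, and `input_image` is assumed to be a PIL image):

```py
import cv2
import numpy as np
from PIL import Image

# detect edges, then replicate the single channel to RGB for the pipeline
edges = cv2.Canny(np.array(input_image), 100, 200)
canny_image = Image.fromarray(np.concatenate([edges[:, :, None]] * 3, axis=2))
```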
@@ -74,7 +74,7 @@ diffuser_pipeline = DiffusionPipeline.from_pretrained(
diffuser_pipeline.enable_attention_slicing()
diffuser_pipeline = diffuser_pipeline.to(device)
prompt = ["a photograph of an astronaut riding a horse",
          "Una casa en la playa",
          "Ein Hund, der Orange isst",
          "Un restaurant parisien"]
...
@@ -117,10 +117,10 @@ from pipeline_t2v_base_pixel import TextToVideoIFPipeline
import torch
pipeline = TextToVideoIFPipeline(
    unet=unet,
    text_encoder=text_encoder,
    tokenizer=tokenizer,
    scheduler=scheduler,
    feature_extractor=feature_extractor
)
pipeline = pipeline.to(device="cuda")
...
@@ -160,12 +160,12 @@ Check out the [generation strategy](https://huggingface.co/docs/transformers/mai
Load the text encoder model used by the [`StableDiffusionDiffEditPipeline`] to encode the text. You'll use the text encoder to compute the text embeddings:
```py
import torch
from diffusers import StableDiffusionDiffEditPipeline
pipeline = StableDiffusionDiffEditPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-1", torch_dtype=torch.float16, use_safetensors=True
-).to("cuda")
+)
pipeline.enable_model_cpu_offload()
pipeline.enable_vae_slicing()
...
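For orientation (not part of the diff): DiffEdit needs a mask and inverted latents in addition to the prompts. As best I recall the pipeline's interface, the surrounding steps look roughly like this — treat the exact signatures and prompts as assumptions, with image loading omitted:

```py
# generate a mask highlighting what should change, then invert the input image
mask_image = pipeline.generate_mask(
    image=raw_image, source_prompt="a bowl of fruits", target_prompt="a bowl of pears"
)
inv_latents = pipeline.invert(prompt="a bowl of fruits", image=raw_image).latents
```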
@@ -14,12 +14,12 @@ specific language governing permissions and limitations under the License.
[[open-in-colab]]
The UNet is responsible for denoising during the reverse diffusion process, and there are two distinct features in its architecture:
1. Backbone features primarily contribute to the denoising process
2. Skip features mainly introduce high-frequency features into the decoder module and can make the network overlook the semantics in the backbone features
However, the skip connection can sometimes introduce unnatural image details. [FreeU](https://hf.co/papers/2309.11497) is a technique for improving image quality by rebalancing the contributions from the UNet’s skip connections and backbone feature maps.
FreeU is applied during inference and it does not require any additional training. The technique works for different tasks such as text-to-image, image-to-image, and text-to-video.
@@ -27,11 +27,11 @@ In this guide, you will apply FreeU to the [`StableDiffusionPipeline`], [`Stable
## StableDiffusionPipeline
Load the pipeline:
```py
from diffusers import DiffusionPipeline
import torch
pipeline = DiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16, safety_checker=None
@@ -70,7 +70,7 @@ Let's see how Stable Diffusion 2 results are impacted:
```py
from diffusers import DiffusionPipeline
import torch
pipeline = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-1", torch_dtype=torch.float16, safety_checker=None
@@ -92,7 +92,7 @@ Finally, let's take a look at how FreeU affects Stable Diffusion XL results:
```py
from diffusers import DiffusionPipeline
import torch
pipeline = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16,
...
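After loading each pipeline above, FreeU is switched on with a single call; a sketch, where the scaling factors shown are the ones commonly suggested for Stable Diffusion v1.5 — treat the exact values as an assumption and tune them per model:

```py
# b1/b2 amplify backbone features, s1/s2 attenuate skip features
pipeline.enable_freeu(s1=0.9, s2=0.2, b1=1.2, b2=1.4)
image = pipeline("A squirrel eating a burger").images[0]  # illustrative prompt
pipeline.disable_freeu()  # turn it back off
```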
@@ -27,7 +27,7 @@ from diffusers.utils import load_image, make_image_grid
pipeline = AutoPipelineForImage2Image.from_pretrained(
    "kandinsky-community/kandinsky-2-2-decoder", torch_dtype=torch.float16, use_safetensors=True
-).to("cuda")
+)
pipeline.enable_model_cpu_offload()
# remove following line if xFormers is not installed or you have PyTorch 2.0 or higher installed
pipeline.enable_xformers_memory_efficient_attention()
@@ -79,7 +79,7 @@ from diffusers.utils import make_image_grid, load_image
pipeline = AutoPipelineForImage2Image.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16, variant="fp16", use_safetensors=True
-).to("cuda")
+)
pipeline.enable_model_cpu_offload()
# remove following line if xFormers is not installed or you have PyTorch 2.0 or higher installed
pipeline.enable_xformers_memory_efficient_attention()
@@ -117,7 +117,7 @@ from diffusers.utils import make_image_grid, load_image
pipeline = AutoPipelineForImage2Image.from_pretrained(
    "stabilityai/stable-diffusion-xl-refiner-1.0", torch_dtype=torch.float16, variant="fp16", use_safetensors=True
-).to("cuda")
+)
pipeline.enable_model_cpu_offload()
# remove following line if xFormers is not installed or you have PyTorch 2.0 or higher installed
pipeline.enable_xformers_memory_efficient_attention()
@@ -157,7 +157,7 @@ from diffusers.utils import make_image_grid, load_image
pipeline = AutoPipelineForImage2Image.from_pretrained(
    "kandinsky-community/kandinsky-2-2-decoder", torch_dtype=torch.float16, use_safetensors=True
-).to("cuda")
+)
pipeline.enable_model_cpu_offload()
# remove following line if xFormers is not installed or you have PyTorch 2.0 or higher installed
pipeline.enable_xformers_memory_efficient_attention()
@@ -204,7 +204,7 @@ from diffusers.utils import make_image_grid, load_image
pipeline = AutoPipelineForImage2Image.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16, variant="fp16", use_safetensors=True
-).to("cuda")
+)
pipeline.enable_model_cpu_offload()
# remove following line if xFormers is not installed or you have PyTorch 2.0 or higher installed
pipeline.enable_xformers_memory_efficient_attention()
@@ -248,7 +248,7 @@ from diffusers.utils import make_image_grid, load_image
pipeline = AutoPipelineForImage2Image.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16, variant="fp16", use_safetensors=True
-).to("cuda")
+)
pipeline.enable_model_cpu_offload()
# remove following line if xFormers is not installed or you have PyTorch 2.0 or higher installed
pipeline.enable_xformers_memory_efficient_attention()
@@ -290,7 +290,7 @@ from diffusers.utils import make_image_grid, load_image
pipeline = AutoPipelineForImage2Image.from_pretrained(
    "stabilityai/stable-diffusion-xl-refiner-1.0", torch_dtype=torch.float16, variant="fp16", use_safetensors=True
-).to("cuda")
+)
pipeline.enable_model_cpu_offload()
# remove following line if xFormers is not installed or you have PyTorch 2.0 or higher installed
pipeline.enable_xformers_memory_efficient_attention()
@@ -335,7 +335,7 @@ from diffusers.utils import make_image_grid
pipeline = AutoPipelineForText2Image.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16, variant="fp16", use_safetensors=True
-).to("cuda")
+)
pipeline.enable_model_cpu_offload()
# remove following line if xFormers is not installed or you have PyTorch 2.0 or higher installed
pipeline.enable_xformers_memory_efficient_attention()
@@ -349,7 +349,7 @@ Now you can pass this generated image to the image-to-image pipeline:
```py
pipeline = AutoPipelineForImage2Image.from_pretrained(
    "kandinsky-community/kandinsky-2-2-decoder", torch_dtype=torch.float16, use_safetensors=True
-).to("cuda")
+)
pipeline.enable_model_cpu_offload()
# remove following line if xFormers is not installed or you have PyTorch 2.0 or higher installed
pipeline.enable_xformers_memory_efficient_attention()
@@ -371,7 +371,7 @@ from diffusers.utils import make_image_grid, load_image
pipeline = AutoPipelineForImage2Image.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16, variant="fp16", use_safetensors=True
-).to("cuda")
+)
pipeline.enable_model_cpu_offload()
# remove following line if xFormers is not installed or you have PyTorch 2.0 or higher installed
pipeline.enable_xformers_memory_efficient_attention()
@@ -397,7 +397,7 @@ Pass the latent output from this pipeline to the next pipeline to generate an im
```py
pipeline = AutoPipelineForImage2Image.from_pretrained(
    "ogkalu/Comic-Diffusion", torch_dtype=torch.float16
-).to("cuda")
+)
pipeline.enable_model_cpu_offload()
# remove following line if xFormers is not installed or you have PyTorch 2.0 or higher installed
pipeline.enable_xformers_memory_efficient_attention()
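The "latent output" hand-off mentioned in the hunk above avoids a decode/encode round trip through the VAE when chaining pipelines; a sketch of the idea (pipeline names and prompt are placeholders):

```py
# keep the first pipeline's output in latent space...
latents = pipeline(prompt, image=init_image, output_type="latent").images[0]
# ...and feed it straight into the next pipeline as its input image
image = next_pipeline(prompt, image=latents).images[0]
```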
@@ -411,7 +411,7 @@ Repeat one more time to generate the final image in a [pixel art style](https://
```py
pipeline = AutoPipelineForImage2Image.from_pretrained(
    "kohbanye/pixel-art-style", torch_dtype=torch.float16
-).to("cuda")
+)
pipeline.enable_model_cpu_offload()
# remove following line if xFormers is not installed or you have PyTorch 2.0 or higher installed
pipeline.enable_xformers_memory_efficient_attention()
@@ -434,7 +434,7 @@ from diffusers.utils import make_image_grid, load_image
pipeline = AutoPipelineForImage2Image.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16, variant="fp16", use_safetensors=True
-).to("cuda")
+)
pipeline.enable_model_cpu_offload()
# remove following line if xFormers is not installed or you have PyTorch 2.0 or higher installed
pipeline.enable_xformers_memory_efficient_attention()
@@ -462,7 +462,7 @@ from diffusers import StableDiffusionLatentUpscalePipeline
upscaler = StableDiffusionLatentUpscalePipeline.from_pretrained(
    "stabilityai/sd-x2-latent-upscaler", torch_dtype=torch.float16, variant="fp16", use_safetensors=True
-).to("cuda")
+)
upscaler.enable_model_cpu_offload()
upscaler.enable_xformers_memory_efficient_attention()
@@ -476,7 +476,7 @@ from diffusers import StableDiffusionUpscalePipeline
super_res = StableDiffusionUpscalePipeline.from_pretrained(
    "stabilityai/stable-diffusion-x4-upscaler", torch_dtype=torch.float16, variant="fp16", use_safetensors=True
-).to("cuda")
+)
super_res.enable_model_cpu_offload()
super_res.enable_xformers_memory_efficient_attention()
@@ -500,7 +500,7 @@ import torch
pipeline = AutoPipelineForImage2Image.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16, variant="fp16", use_safetensors=True
-).to("cuda")
+)
pipeline.enable_model_cpu_offload()
# remove following line if xFormers is not installed or you have PyTorch 2.0 or higher installed
pipeline.enable_xformers_memory_efficient_attention()
@@ -537,7 +537,7 @@ import torch
controlnet = ControlNetModel.from_pretrained("lllyasviel/control_v11f1p_sd15_depth", torch_dtype=torch.float16, variant="fp16", use_safetensors=True)
pipeline = AutoPipelineForImage2Image.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16, variant="fp16", use_safetensors=True
-).to("cuda")
+)
pipeline.enable_model_cpu_offload()
# remove following line if xFormers is not installed or you have PyTorch 2.0 or higher installed
pipeline.enable_xformers_memory_efficient_attention()
@@ -571,7 +571,7 @@ Let's apply a new [style](https://huggingface.co/nitrosocke/elden-ring-diffusion
```py
pipeline = AutoPipelineForImage2Image.from_pretrained(
    "nitrosocke/elden-ring-diffusion", torch_dtype=torch.float16,
-).to("cuda")
+)
pipeline.enable_model_cpu_offload()
# remove following line if xFormers is not installed or you have PyTorch 2.0 or higher installed
pipeline.enable_xformers_memory_efficient_attention()
...
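A note on the xFormers comment repeated throughout these hunks: with PyTorch 2.0+, Diffusers uses PyTorch's native scaled dot-product attention, so the xFormers call is only worth making on older PyTorch. A hedged sketch of a version guard:

```py
import torch

# only fall back to xFormers where native SDPA is unavailable (PyTorch < 2.0)
if not hasattr(torch.nn.functional, "scaled_dot_product_attention"):
    pipeline.enable_xformers_memory_efficient_attention()
```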
@@ -27,7 +27,7 @@ from diffusers.utils import load_image, make_image_grid
pipeline = AutoPipelineForInpainting.from_pretrained(
    "kandinsky-community/kandinsky-2-2-decoder-inpaint", torch_dtype=torch.float16
-).to("cuda")
+)
pipeline.enable_model_cpu_offload()
# remove following line if xFormers is not installed or you have PyTorch 2.0 or higher installed
pipeline.enable_xformers_memory_efficient_attention()
@@ -98,7 +98,7 @@ from diffusers.utils import load_image, make_image_grid
pipeline = AutoPipelineForInpainting.from_pretrained(
    "runwayml/stable-diffusion-inpainting", torch_dtype=torch.float16, variant="fp16"
-).to("cuda")
+)
pipeline.enable_model_cpu_offload()
# remove following line if xFormers is not installed or you have PyTorch 2.0 or higher installed
pipeline.enable_xformers_memory_efficient_attention()
@@ -124,7 +124,7 @@ from diffusers.utils import load_image, make_image_grid
pipeline = AutoPipelineForInpainting.from_pretrained(
    "diffusers/stable-diffusion-xl-1.0-inpainting-0.1", torch_dtype=torch.float16, variant="fp16"
-).to("cuda")
+)
pipeline.enable_model_cpu_offload()
# remove following line if xFormers is not installed or you have PyTorch 2.0 or higher installed
pipeline.enable_xformers_memory_efficient_attention()
@@ -150,7 +150,7 @@ from diffusers.utils import load_image, make_image_grid
pipeline = AutoPipelineForInpainting.from_pretrained(
    "kandinsky-community/kandinsky-2-2-decoder-inpaint", torch_dtype=torch.float16
-).to("cuda")
+)
pipeline.enable_model_cpu_offload()
# remove following line if xFormers is not installed or you have PyTorch 2.0 or higher installed
pipeline.enable_xformers_memory_efficient_attention()
@@ -379,7 +379,7 @@ from diffusers.utils import load_image, make_image_grid
pipeline = AutoPipelineForInpainting.from_pretrained(
    "runwayml/stable-diffusion-inpainting", torch_dtype=torch.float16, variant="fp16"
-).to("cuda")
+)
pipeline.enable_model_cpu_offload()
# remove following line if xFormers is not installed or you have PyTorch 2.0 or higher installed
pipeline.enable_xformers_memory_efficient_attention()
@@ -424,7 +424,7 @@ from diffusers.utils import load_image, make_image_grid
pipeline = AutoPipelineForInpainting.from_pretrained(
    "runwayml/stable-diffusion-inpainting", torch_dtype=torch.float16, variant="fp16"
-).to("cuda")
+)
pipeline.enable_model_cpu_offload()
# remove following line if xFormers is not installed or you have PyTorch 2.0 or higher installed
pipeline.enable_xformers_memory_efficient_attention()
@@ -464,7 +464,7 @@ from diffusers.utils import load_image, make_image_grid
pipeline = AutoPipelineForInpainting.from_pretrained(
    "runwayml/stable-diffusion-inpainting", torch_dtype=torch.float16, variant="fp16"
-).to("cuda")
+)
pipeline.enable_model_cpu_offload()
# remove following line if xFormers is not installed or you have PyTorch 2.0 or higher installed
pipeline.enable_xformers_memory_efficient_attention()
@@ -503,7 +503,7 @@ from diffusers.utils import load_image, make_image_grid
pipeline = AutoPipelineForText2Image.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16, variant="fp16", use_safetensors=True
-).to("cuda")
+)
pipeline.enable_model_cpu_offload()
# remove following line if xFormers is not installed or you have PyTorch 2.0 or higher installed
pipeline.enable_xformers_memory_efficient_attention()
@@ -522,7 +522,7 @@ And let's inpaint the masked area with a waterfall:
```py
pipeline = AutoPipelineForInpainting.from_pretrained(
    "kandinsky-community/kandinsky-2-2-decoder-inpaint", torch_dtype=torch.float16
-).to("cuda")
+)
pipeline.enable_model_cpu_offload()
# remove following line if xFormers is not installed or you have PyTorch 2.0 or higher installed
pipeline.enable_xformers_memory_efficient_attention()
@@ -556,7 +556,7 @@ from diffusers.utils import load_image, make_image_grid
pipeline = AutoPipelineForInpainting.from_pretrained(
    "runwayml/stable-diffusion-inpainting", torch_dtype=torch.float16, variant="fp16"
-).to("cuda")
+)
pipeline.enable_model_cpu_offload()
# remove following line if xFormers is not installed or you have PyTorch 2.0 or higher installed
pipeline.enable_xformers_memory_efficient_attention()
@@ -577,7 +577,7 @@ Now let's pass the image to another inpainting pipeline with SDXL's refiner mode
```py
pipeline = AutoPipelineForInpainting.from_pretrained(
    "stabilityai/stable-diffusion-xl-refiner-1.0", torch_dtype=torch.float16, variant="fp16"
-).to("cuda")
+)
pipeline.enable_model_cpu_offload()
# remove following line if xFormers is not installed or you have PyTorch 2.0 or higher installed
pipeline.enable_xformers_memory_efficient_attention()
@@ -636,7 +636,7 @@ from diffusers.utils import make_image_grid
pipeline = AutoPipelineForInpainting.from_pretrained(
    "runwayml/stable-diffusion-inpainting", torch_dtype=torch.float16,
-).to("cuda")
+)
pipeline.enable_model_cpu_offload()
# remove following line if xFormers is not installed or you have PyTorch 2.0 or higher installed
pipeline.enable_xformers_memory_efficient_attention()
@@ -667,7 +667,7 @@ controlnet = ControlNetModel.from_pretrained("lllyasviel/control_v11p_sd15_inpai
# pass ControlNet to the pipeline
pipeline = StableDiffusionControlNetInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting", controlnet=controlnet, torch_dtype=torch.float16, variant="fp16"
-).to("cuda")
+)
pipeline.enable_model_cpu_offload()
# remove following line if xFormers is not installed or you have PyTorch 2.0 or higher installed
pipeline.enable_xformers_memory_efficient_attention()
@@ -705,7 +705,7 @@ from diffusers import AutoPipelineForImage2Image
pipeline = AutoPipelineForImage2Image.from_pretrained(
    "nitrosocke/elden-ring-diffusion", torch_dtype=torch.float16,
-).to("cuda")
+)
pipeline.enable_model_cpu_offload()
# remove following line if xFormers is not installed or you have PyTorch 2.0 or higher installed
pipeline.enable_xformers_memory_efficient_attention()
...
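For completeness, the inpainting pipelines above are all invoked the same way — an input image plus a white-on-black mask marking the region to repaint. A sketch with hypothetical placeholder paths (the prompt comes from the hunks above):

```py
from diffusers.utils import load_image

init_image = load_image("path/to/image.png")   # hypothetical input image
mask_image = load_image("path/to/mask.png")    # white = repaint, black = keep
image = pipeline(
    prompt="a waterfall", image=init_image, mask_image=mask_image
).images[0]
```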
<!--Copyright 2023 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->
# Kandinsky
[[open-in-colab]]
@@ -91,7 +103,7 @@ Use the [`AutoPipelineForText2Image`] to automatically call the combined pipelin
from diffusers import AutoPipelineForText2Image
import torch
-pipeline = AutoPipelineForText2Image.from_pretrained("kandinsky-community/kandinsky-2-1", torch_dtype=torch.float16).to("cuda")
+pipeline = AutoPipelineForText2Image.from_pretrained("kandinsky-community/kandinsky-2-1", torch_dtype=torch.float16)
pipeline.enable_model_cpu_offload()
prompt = "A alien cheeseburger creature eating itself, claymation, cinematic, moody lighting"
@@ -107,7 +119,7 @@ image = pipeline(prompt=prompt, negative_prompt=negative_prompt, prior_guidance_
from diffusers import AutoPipelineForText2Image
import torch
-pipeline = AutoPipelineForText2Image.from_pretrained("kandinsky-community/kandinsky-2-2-decoder", torch_dtype=torch.float16).to("cuda")
+pipeline = AutoPipelineForText2Image.from_pretrained("kandinsky-community/kandinsky-2-2-decoder", torch_dtype=torch.float16)
pipeline.enable_model_cpu_offload()
prompt = "A alien cheeseburger creature eating itself, claymation, cinematic, moody lighting"
@@ -217,14 +229,14 @@ from io import BytesIO
from PIL import Image
import os
-pipeline = AutoPipelineForImage2Image.from_pretrained("kandinsky-community/kandinsky-2-1", torch_dtype=torch.float16, use_safetensors=True).to("cuda")
+pipeline = AutoPipelineForImage2Image.from_pretrained("kandinsky-community/kandinsky-2-1", torch_dtype=torch.float16, use_safetensors=True)
pipeline.enable_model_cpu_offload()
prompt = "A fantasy landscape, Cinematic lighting"
negative_prompt = "low quality, bad quality"
url = "https://raw.githubusercontent.com/CompVis/stable-diffusion/main/assets/stable-samples/img2img/sketch-mountains-input.jpg"
response = requests.get(url)
original_image = Image.open(BytesIO(response.content)).convert("RGB")
original_image.thumbnail((768, 768))
@@ -243,14 +255,14 @@ from io import BytesIO
from PIL import Image
import os
-pipeline = AutoPipelineForImage2Image.from_pretrained("kandinsky-community/kandinsky-2-2-decoder", torch_dtype=torch.float16).to("cuda")
+pipeline = AutoPipelineForImage2Image.from_pretrained("kandinsky-community/kandinsky-2-2-decoder", torch_dtype=torch.float16)
pipeline.enable_model_cpu_offload()
prompt = "A fantasy landscape, Cinematic lighting"
negative_prompt = "low quality, bad quality"
url = "https://raw.githubusercontent.com/CompVis/stable-diffusion/main/assets/stable-samples/img2img/sketch-mountains-input.jpg"
response = requests.get(url)
original_image = Image.open(BytesIO(response.content)).convert("RGB")
original_image.thumbnail((768, 768))
...
@@ -12,7 +12,7 @@ specific language governing permissions and limitations under the License.
# Performing inference with LCM
Latent Consistency Models (LCMs) enable quality image generation in typically 2-4 steps, making it possible to use diffusion models in almost real-time settings.
From the [official website](https://latent-consistency-models.github.io/):
@@ -59,7 +59,7 @@ Some details to keep in mind:
## Image-to-image
The findings above apply to image-to-image tasks too. Let's look at how we can perform image-to-image generation with LCMs:
```python
from diffusers import AutoPipelineForImage2Image, UNet2DConditionModel, LCMScheduler
@@ -96,8 +96,8 @@ image = pipe(
It is possible to generalize the LCM framework for use with [LoRA](../training/lora.md). This effectively eliminates the need for expensive fine-tuning runs, since LoRA training involves only a small number of parameters compared to full fine-tuning. During inference, the [`LCMScheduler`] is particularly advantageous because it enables inference in very few steps without compromising quality.
We recommend disabling `guidance_scale` by setting it to 0. The model is trained to follow prompts accurately
even without guidance. You can, however, still use a guidance scale, in which case we recommend
using values between 1.0 and 2.0.
### Text-to-image
...
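Tying the guidance advice above to code — a sketch of few-step LCM inference, assuming a pipeline `pipe` already set up with the [`LCMScheduler`] (prompt and step count are illustrative):

```py
# LCMs need only a handful of steps; guidance is disabled (0) or kept low (1.0-2.0)
image = pipe(
    "a close-up picture of an old man standing in the rain",
    num_inference_steps=4,
    guidance_scale=1.0,
).images[0]
```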
@@ -232,7 +232,7 @@ TODO(Patrick) - Make sure to uncomment this part as soon as things are deprecate
#### Using `revision` to load pipeline variants is deprecated
Previously, the `revision` argument of [`DiffusionPipeline.from_pretrained`] was heavily used to
load model variants, e.g.:
```python
@@ -247,7 +247,7 @@ The above example is therefore deprecated and won't be supported anymore for `di
<Tip warning={true}>
If you load diffusers pipelines or models with `revision="fp16"` or `revision="non_ema"`,
please make sure to update the code and use `variant="fp16"` or `variant="non_ema"` respectively
instead.
...
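The migration the Tip describes is a one-argument change; a sketch (model chosen for illustration):

```py
import torch
from diffusers import DiffusionPipeline

# deprecated: DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", revision="fp16")
pipe = DiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", variant="fp16", torch_dtype=torch.float16
)
```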
@@ -189,7 +189,7 @@ pipeline = StableDiffusionXLPipeline.from_pretrained(
).to("cuda")
```
Next, load the LoRA checkpoint and fuse it with the original weights. The `lora_scale` parameter controls how much the output is scaled by the LoRA weights. It is important to make the `lora_scale` adjustment in the [`~loaders.LoraLoaderMixin.fuse_lora`] method because it won't work if you try to pass `scale` to `cross_attention_kwargs` in the pipeline.
If you need to reset the original model weights for any reason (for example, to use a different `lora_scale`), you should use the [`~loaders.LoraLoaderMixin.unfuse_lora`] method.
...
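A sketch of the fuse/unfuse cycle described above, assuming a LoRA has already been loaded with [`~loaders.LoraLoaderMixin.load_lora_weights`] (the scale value is illustrative):

```py
# bake the LoRA into the base weights at the chosen scale
pipeline.fuse_lora(lora_scale=0.7)
image = pipeline("Astronaut in a jungle, cold color palette, detailed, 8k").images[0]

# restore the original, unfused weights (e.g. to try a different lora_scale)
pipeline.unfuse_lora()
```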
...@@ -34,11 +34,11 @@ There are two options for converting a `.ckpt` file: use a Space to convert the ...@@ -34,11 +34,11 @@ There are two options for converting a `.ckpt` file: use a Space to convert the
The easiest and most convenient way to convert a `.ckpt` file is to use the [SD to Diffusers](https://huggingface.co/spaces/diffusers/sd-to-diffusers) Space. You can follow the instructions on the Space to convert the `.ckpt` file. The easiest and most convenient way to convert a `.ckpt` file is to use the [SD to Diffusers](https://huggingface.co/spaces/diffusers/sd-to-diffusers) Space. You can follow the instructions on the Space to convert the `.ckpt` file.
This approach works well for basic models, but it may struggle with more customized models. You'll know the Space failed if it returns an empty pull request or error. In this case, you can try converting the `.ckpt` file with a script. This approach works well for basic models, but it may struggle with more customized models. You'll know the Space failed if it returns an empty pull request or error. In this case, you can try converting the `.ckpt` file with a script.
### Convert with a script ### Convert with a script
🤗 Diffusers provides a [conversion script](https://github.com/huggingface/diffusers/blob/main/scripts/convert_original_stable_diffusion_to_diffusers.py) for converting `.ckpt` files. This approach is more reliable than the Space above. 🤗 Diffusers provides a [conversion script](https://github.com/huggingface/diffusers/blob/main/scripts/convert_original_stable_diffusion_to_diffusers.py) for converting `.ckpt` files. This approach is more reliable than the Space above.
Before you start, make sure you have a local clone of 🤗 Diffusers to run the script and log in to your Hugging Face account so you can open pull requests and push your converted model to the Hub. Before you start, make sure you have a local clone of 🤗 Diffusers to run the script and log in to your Hugging Face account so you can open pull requests and push your converted model to the Hub.
...@@ -86,11 +86,11 @@ git push origin pr/13:refs/pr/13 ...@@ -86,11 +86,11 @@ git push origin pr/13:refs/pr/13
<Tip warning={true}> <Tip warning={true}>
🧪 This is an experimental feature. Only Stable Diffusion v1 checkpoints are supported by the Convert KerasCV Space at the moment. 🧪 This is an experimental feature. Only Stable Diffusion v1 checkpoints are supported by the Convert KerasCV Space at the moment.
</Tip> </Tip>
[KerasCV](https://keras.io/keras_cv/) supports training for [Stable Diffusion](https://github.com/keras-team/keras-cv/blob/master/keras_cv/models/stable_diffusion) v1 and v2. However, it offers limited support for experimenting with Stable Diffusion models for inference and deployment whereas 🤗 Diffusers has a more complete set of features for this purpose, such as different [noise schedulers](https://huggingface.co/docs/diffusers/using-diffusers/schedulers), [flash attention](https://huggingface.co/docs/diffusers/optimization/xformers), and [other [KerasCV](https://keras.io/keras_cv/) supports training for [Stable Diffusion](https://github.com/keras-team/keras-cv/blob/master/keras_cv/models/stable_diffusion) v1 and v2. However, it offers limited support for experimenting with Stable Diffusion models for inference and deployment whereas 🤗 Diffusers has a more complete set of features for this purpose, such as different [noise schedulers](https://huggingface.co/docs/diffusers/using-diffusers/schedulers), [flash attention](https://huggingface.co/docs/diffusers/optimization/xformers), and [other
optimization techniques](https://huggingface.co/docs/diffusers/optimization/fp16). optimization techniques](https://huggingface.co/docs/diffusers/optimization/fp16).
The [Convert KerasCV](https://huggingface.co/spaces/sayakpaul/convert-kerascv-sd-diffusers) Space converts `.pb` or `.h5` files to PyTorch, and then wraps them in a [`StableDiffusionPipeline`] so it is ready for inference. The converted checkpoint is stored in a repository on the Hugging Face Hub. The [Convert KerasCV](https://huggingface.co/spaces/sayakpaul/convert-kerascv-sd-diffusers) Space converts `.pb` or `.h5` files to PyTorch, and then wraps them in a [`StableDiffusionPipeline`] so it is ready for inference. The converted checkpoint is stored in a repository on the Hugging Face Hub.
@@ -14,10 +14,10 @@ specific language governing permissions and limitations under the License.
[[open-in-colab]]
Diffusion pipelines are inherently a collection of diffusion models and schedulers that are partly independent from each other. This means that one is able to switch out parts of the pipeline to better customize a pipeline to one's use case. The best example of this is the [Schedulers](../api/schedulers/overview).
Whereas diffusion models usually simply define the forward pass from noise to a less noisy sample, schedulers define the whole denoising process, *i.e.*:
- How many denoising steps?
- Stochastic or deterministic?
- What algorithm to use to find the denoised sample?
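As a quick sketch of how to check which scheduler a pipeline was loaded with (reusing the Stable Diffusion checkpoint from earlier in this guide):

```python
from diffusers import DiffusionPipeline

pipeline = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", use_safetensors=True)
# the scheduler is exposed as an attribute of the pipeline
pipeline.scheduler
```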
@@ -77,7 +77,7 @@
```
PNDMScheduler {
  ...
}
```
We can see that the scheduler is of type [`PNDMScheduler`].
Cool, now let's compare the scheduler's performance to that of other schedulers.
First, we define a prompt on which we will test all the different schedulers:
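A minimal sketch of such a setup is shown below; the prompt text is just an example, and it assumes the pipeline has already been moved to a GPU (e.g. with `pipeline.to("cuda")`):

```python
import torch

prompt = "A photograph of an astronaut riding a horse on Mars, high resolution, high definition."

# fix the seed so every scheduler denoises the same starting latents
generator = torch.Generator(device="cuda").manual_seed(8)
image = pipeline(prompt, generator=generator).images[0]
image
```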
@@ -102,7 +102,7 @@ image
## Changing the scheduler
Now we show how easy it is to change the scheduler of a pipeline. Every scheduler has a property [`~SchedulerMixin.compatibles`] which defines all compatible schedulers. You can take a look at all available, compatible schedulers for the Stable Diffusion pipeline as follows.
@@ -127,7 +127,7 @@
```python
pipeline.scheduler.compatibles
[...,
 diffusers.schedulers.scheduling_k_dpm_2_ancestral_discrete.KDPM2AncestralDiscreteScheduler]
```
Cool, lots of schedulers to choose from. Feel free to take a look at their respective class definitions:
- [`EulerDiscreteScheduler`],
- [`LMSDiscreteScheduler`],
@@ -143,7 +143,7 @@ Cool, lots of schedulers to look at. Feel free to have a look at their respectiv
- [`DPMSolverSinglestepScheduler`],
- [`KDPM2AncestralDiscreteScheduler`].
We will now run the same input prompt through these other schedulers and compare the results. To change the scheduler of the pipeline, you can use the convenient [`~ConfigMixin.config`] property in combination with the [`~ConfigMixin.from_config`] function.
@@ -171,7 +171,7 @@
```python
pipeline.scheduler.config
FrozenDict([('num_train_timesteps', 1000),
            ...])
```
This configuration can then be used to instantiate a scheduler of a different class that is compatible with the pipeline. Here, we change the scheduler to the [`DDIMScheduler`].
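A minimal sketch of that swap, reusing the existing scheduler's configuration:

```python
from diffusers import DDIMScheduler

# instantiate a compatible scheduler from the current scheduler's config
pipeline.scheduler = DDIMScheduler.from_config(pipeline.scheduler.config)
```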
@@ -198,7 +198,7 @@ If you are a JAX/Flax user, please check [this section](#changing-the-scheduler-
## Compare schedulers
So far we have tried running the Stable Diffusion pipeline with two schedulers: [`PNDMScheduler`] and [`DDIMScheduler`]. A number of better schedulers have been released that can run with far fewer steps; let's compare them here:
[`LMSDiscreteScheduler`] usually leads to better results:
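For example, a sketch of swapping it in and rerunning the same seeded prompt as before (reusing `pipeline`, `prompt`, and the generator pattern from above):

```python
from diffusers import LMSDiscreteScheduler

pipeline.scheduler = LMSDiscreteScheduler.from_config(pipeline.scheduler.config)

# same seed as before, so the comparison is apples-to-apples
generator = torch.Generator(device="cuda").manual_seed(8)
image = pipeline(prompt, generator=generator).images[0]
image
```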
@@ -95,7 +95,7 @@ state_dict
There are two tensors, `"clip_g"` and `"clip_l"`. `"clip_g"` corresponds to the bigger text encoder in SDXL and refers to `pipe.text_encoder_2`, while `"clip_l"` refers to `pipe.text_encoder`. Now you can load each tensor separately by passing them along with the correct text encoder and tokenizer:
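A minimal sketch of that, assuming `pipe` is an SDXL pipeline and `state_dict` is the dictionary loaded above; the `"<token>"` placeholder stands in for the embedding's actual trigger word:

```py
# "<token>" is a placeholder; use the trigger word the embedding was trained with
pipe.load_textual_inversion(state_dict["clip_g"], token="<token>", text_encoder=pipe.text_encoder_2, tokenizer=pipe.tokenizer_2)
pipe.load_textual_inversion(state_dict["clip_l"], token="<token>", text_encoder=pipe.text_encoder, tokenizer=pipe.tokenizer)
```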
@@ -35,7 +35,7 @@
```py
from diffusers import DiffusionPipeline

generator = DiffusionPipeline.from_pretrained("anton-l/ddpm-butterflies-128", use_safetensors=True)
```
The [`DiffusionPipeline`] downloads and caches all modeling, tokenization, and scheduling components. Because the model consists of roughly 1.4 billion parameters, we strongly recommend running it on a GPU. You can move the generator object to a GPU, just like you would in PyTorch:
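For example, assuming a CUDA device is available:

```py
generator.to("cuda")
```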