"src/vscode:/vscode.git/clone" did not exist on "d8b6f5d09eb0cb9b7913c235a6fc69b698a5b1a3"
Unverified Commit 7457aa67 authored by Steven Liu, committed by GitHub

[docs] Loader APIs (#5813)

* first draft

* remove old loader doc

* start adding lora code examples

* finish

* add link to loralinearlayer

* feedback

* fix
parent c72a1739
@@ -186,13 +186,21 @@
  - sections:
    - local: api/configuration
      title: Configuration
    - local: api/logging
      title: Logging
    - local: api/outputs
      title: Outputs
    title: Main Classes
  - sections:
    - local: api/loaders/lora
      title: LoRA
    - local: api/loaders/single_file
      title: Single files
    - local: api/loaders/textual_inversion
      title: Textual Inversion
    - local: api/loaders/unet
      title: UNet
    title: Loaders
  - sections:
    - local: api/models/overview
      title: Overview
...
<!--Copyright 2023 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->
# LoRA
LoRA is a fast and lightweight training method that inserts and trains a significantly smaller number of parameters instead of all the model parameters. This produces a smaller file (~100 MB) and makes it easier to quickly train a model to learn a new concept. LoRA weights are typically loaded into the UNet, the text encoder, or both. There are two classes for loading LoRA weights:
- [`LoraLoaderMixin`] provides functions for loading and unloading, fusing and unfusing, enabling and disabling, and otherwise managing LoRA weights. This class can be used with any model.
- [`StableDiffusionXLLoraLoaderMixin`] is a [Stable Diffusion (SDXL)](../../api/pipelines/stable_diffusion/stable_diffusion_xl) version of the [`LoraLoaderMixin`] class for loading and saving LoRA weights. It can only be used with the SDXL model.
<Tip>
To learn more about how to load LoRA weights, see the [LoRA](../../using-diffusers/loading_adapters#lora) loading guide.
</Tip>
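As a quick orientation (a minimal sketch, not part of the reference below; the checkpoint and adapter name mirror examples that appear later in this PR), loading LoRA weights into a pipeline looks like this:

```py
from diffusers import DiffusionPipeline
import torch

# load the base model, then attach the LoRA weights to its UNet and text encoder
pipeline = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")
pipeline.load_lora_weights(
    "nerijs/pixel-art-xl", weight_name="pixel-art-xl.safetensors", adapter_name="pixel"
)
```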
## LoraLoaderMixin
[[autodoc]] loaders.lora.LoraLoaderMixin
## StableDiffusionXLLoraLoaderMixin
[[autodoc]] loaders.lora.StableDiffusionXLLoraLoaderMixin
\ No newline at end of file
@@ -10,40 +10,28 @@ an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express o
specific language governing permissions and limitations under the License.
-->
# Single files
Diffusers supports loading pretrained pipeline (or model) weights stored in a single file, such as a `ckpt` or `safetensors` file. These single file types are typically produced from community trained models. There are three classes for loading single file weights:
- [`FromSingleFileMixin`] supports loading pretrained pipeline weights stored in a single file, which can either be a `ckpt` or `safetensors` file.
- [`FromOriginalVAEMixin`] supports loading a pretrained [`AutoencoderKL`] from VAE weights stored in a single file, which can either be a `ckpt` or `safetensors` file.
- [`FromOriginalControlnetMixin`] supports loading pretrained ControlNet weights stored in a single file, which can either be a `ckpt` or `safetensors` file.
<Tip>
To learn more about how to load single file weights, see the [Load different Stable Diffusion formats](../../using-diffusers/other-formats) loading guide.
</Tip>
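For a concrete sense of the API (an illustrative sketch; the file path below is a placeholder for a community checkpoint you have downloaded):

```py
from diffusers import StableDiffusionPipeline

# "./my-checkpoint.safetensors" is a hypothetical local path to a single-file checkpoint
pipeline = StableDiffusionPipeline.from_single_file("./my-checkpoint.safetensors")
```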
## FromSingleFileMixin
[[autodoc]] loaders.single_file.FromSingleFileMixin
## FromOriginalVAEMixin
[[autodoc]] loaders.single_file.FromOriginalVAEMixin
## FromOriginalControlnetMixin
[[autodoc]] loaders.single_file.FromOriginalControlnetMixin
\ No newline at end of file
<!--Copyright 2023 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->
# Textual Inversion
Textual Inversion is a training method for personalizing models by learning new text embeddings from a few example images. The file produced from training is extremely small (a few KB), and the new embeddings can be loaded into the text encoder.
[`TextualInversionLoaderMixin`] provides a function for loading Textual Inversion embeddings from Diffusers and Automatic1111 into the text encoder and for registering a special token to activate the embeddings.
<Tip>
To learn more about how to load Textual Inversion embeddings, see the [Textual Inversion](../../using-diffusers/loading_adapters#textual-inversion) loading guide.
</Tip>
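A minimal sketch of the workflow (the embedding repository and its `<cat-toy>` trigger token are illustrative assumptions, not part of the reference below):

```py
from diffusers import StableDiffusionPipeline
import torch

pipeline = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")
# load the learned embedding and register its activation token with the tokenizer
pipeline.load_textual_inversion("sd-concepts-library/cat-toy")
image = pipeline("A <cat-toy> backpack").images[0]
```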
## TextualInversionLoaderMixin
[[autodoc]] loaders.textual_inversion.TextualInversionLoaderMixin
\ No newline at end of file
<!--Copyright 2023 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->
# UNet
Some training methods, like LoRA and Custom Diffusion, typically target the UNet's attention layers, but they can also target other non-attention layers. Instead of training all of a model's parameters, only a subset is trained, which is faster and more efficient. This class is useful if you're *only* loading weights into a UNet. If you need to load weights into the text encoder or a text encoder and UNet, try using the [`~loaders.LoraLoaderMixin.load_lora_weights`] function instead.
The [`UNet2DConditionLoadersMixin`] class provides functions for loading and saving weights, fusing and unfusing LoRAs, disabling and enabling LoRAs, and setting and deleting adapters.
<Tip>
To learn more about how to load LoRA weights, see the [LoRA](../../using-diffusers/loading_adapters#lora) loading guide.
</Tip>
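As a quick sketch (mirroring the `load_attn_procs` example added later in this PR), loading LoRA weights directly into the UNet looks like this:

```py
from diffusers import AutoPipelineForText2Image
import torch

pipeline = AutoPipelineForText2Image.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")
# load LoRA weights into the UNet only (the text encoder is left untouched)
pipeline.unet.load_attn_procs(
    "jbilcke-hf/sdxl-cinematic-1", weight_name="pytorch_lora_weights.safetensors", adapter_name="cinematic"
)
```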
## UNet2DConditionLoadersMixin
[[autodoc]] loaders.unet.UNet2DConditionLoadersMixin
\ No newline at end of file
@@ -68,8 +68,7 @@ LORA_DEPRECATION_MESSAGE = "You are using an old version of LoRA backend. This w
class LoraLoaderMixin:
r"""
Load LoRA layers into [`UNet2DConditionModel`] and [`~transformers.CLIPTextModel`].
"""
text_encoder_name = TEXT_ENCODER_NAME
@@ -95,12 +94,28 @@ class LoraLoaderMixin:
Parameters:
pretrained_model_name_or_path_or_dict (`str` or `os.PathLike` or `dict`):
A string (model id of a pretrained model hosted on the Hub), a path to a directory containing the model
weights, or a [torch state
dict](https://pytorch.org/tutorials/beginner/saving_loading_models.html#what-is-a-state-dict).
kwargs (`dict`, *optional*):
See [`~loaders.LoraLoaderMixin.lora_state_dict`].
adapter_name (`str`, *optional*):
Name for referencing the loaded adapter model. If not specified, it will use `default_{i}` where `i` is
the total number of adapters being loaded. Must have PEFT installed to use.
Example:
```py
from diffusers import DiffusionPipeline
import torch
pipeline = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16).to(
"cuda"
)
pipeline.load_lora_weights(
"Yntec/pineappleAnimeMix", weight_name="pineappleAnimeMix_pineapple10.1.safetensors", adapter_name="anime"
)
```
""" """
# First, ensure that the checkpoint is a compatible one and can be successfully loaded. # First, ensure that the checkpoint is a compatible one and can be successfully loaded.
state_dict, network_alphas = self.lora_state_dict(pretrained_model_name_or_path_or_dict, **kwargs) state_dict, network_alphas = self.lora_state_dict(pretrained_model_name_or_path_or_dict, **kwargs)
...@@ -138,15 +153,7 @@ class LoraLoaderMixin: ...@@ -138,15 +153,7 @@ class LoraLoaderMixin:
**kwargs, **kwargs,
): ):
r""" r"""
Return state dict for lora weights and the network alphas. Return state dict and network alphas of the LoRA weights.
<Tip warning={true}>
We support loading A1111 formatted LoRA checkpoints in a limited capacity.
This function is experimental and might change in the future.
</Tip>
Parameters: Parameters:
pretrained_model_name_or_path_or_dict (`str` or `os.PathLike` or `dict`): pretrained_model_name_or_path_or_dict (`str` or `os.PathLike` or `dict`):
@@ -154,8 +161,7 @@ class LoraLoaderMixin:
- A string, the *model id* (for example `google/ddpm-celebahq-256`) of a pretrained model hosted on
the Hub.
- A path to a *directory* (for example `./my_model_directory`) containing the model weights.
- A [torch state
dict](https://pytorch.org/tutorials/beginner/saving_loading_models.html#what-is-a-state-dict).
@@ -191,7 +197,6 @@ class LoraLoaderMixin:
Mirror source to resolve accessibility issues if you're downloading a model in China. We do not
guarantee the timeliness or safety of the source, and you should refer to the mirror site for more
information.
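Example (an illustrative sketch, not part of this commit; the checkpoint mirrors the examples used elsewhere in this PR):
```py
from diffusers import StableDiffusionXLPipeline
import torch

pipeline = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
)
# returns the parsed LoRA state dict and the network alpha values from the checkpoint
state_dict, network_alphas = pipeline.lora_state_dict(
    "nerijs/pixel-art-xl", weight_name="pixel-art-xl.safetensors"
)
```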
""" """
# Load the main state dict first which has the LoRA layers for either of # Load the main state dict first which has the LoRA layers for either of
# UNet and text encoder or both. # UNet and text encoder or both.
...@@ -468,25 +473,27 @@ class LoraLoaderMixin: ...@@ -468,25 +473,27 @@ class LoraLoaderMixin:
cls, state_dict, network_alphas, unet, low_cpu_mem_usage=None, adapter_name=None, _pipeline=None cls, state_dict, network_alphas, unet, low_cpu_mem_usage=None, adapter_name=None, _pipeline=None
): ):
""" """
This will load the LoRA layers specified in `state_dict` into `unet`. Load LoRA layers specified in `state_dict` into `unet`.
Parameters: Parameters:
state_dict (`dict`): state_dict (`dict`):
A standard state dict containing the lora layer parameters. The keys can either be indexed directly A standard state dict containing the LoRA layer parameters. The keys can either be indexed directly
into the unet or prefixed with an additional `unet` which can be used to distinguish between text into the `unet` or prefixed with an additional `unet`, which can be used to distinguish between text
encoder lora layers. encoder LoRA layers.
network_alphas (`Dict[str, float]`): network_alphas (`Dict[str, float]`):
See `LoRALinearLayer` for more details. See
[`LoRALinearLayer`](https://github.com/huggingface/diffusers/blob/c697f524761abd2314c030221a3ad2f7791eab4e/src/diffusers/models/lora.py#L182)
for more details.
unet (`UNet2DConditionModel`): unet (`UNet2DConditionModel`):
The UNet model to load the LoRA layers into. The UNet model to load the LoRA layers into.
low_cpu_mem_usage (`bool`, *optional*, defaults to `True` if torch version >= 1.9.0 else `False`): low_cpu_mem_usage (`bool`, *optional*, defaults to `True` if torch version >= 1.9.0 else `False`):
Speed up model loading only loading the pretrained weights and not initializing the weights. This also Only load and not initialize the pretrained weights. This can speedup model loading and also tries to
tries to not use more than 1x model size in CPU memory (including peak memory) while loading the model. not use more than 1x model size in CPU memory (including peak memory) while loading the model. Only
Only supported for PyTorch >= 1.9.0. If you are using an older version of PyTorch, setting this supported for PyTorch >= 1.9.0. If you are using an older version of PyTorch, setting this argument to
argument to `True` will raise an error. `True` will raise an error.
adapter_name (`str`, *optional*): adapter_name (`str`, *optional*):
Adapter name to be used for referencing the loaded adapter model. If not specified, it will use Name for referencing the loaded adapter model. If not specified, it will use `default_{i}` where `i` is
`default_{i}` where i is the total number of adapters being loaded. the total number of adapters being loaded.
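Example (an illustrative sketch, not part of this commit; `lora_state_dict` provides the two arguments this method expects):
```py
from diffusers import StableDiffusionXLPipeline
import torch

pipeline = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
)
state_dict, network_alphas = pipeline.lora_state_dict(
    "nerijs/pixel-art-xl", weight_name="pixel-art-xl.safetensors"
)
# load only the UNet-targeted LoRA layers from the state dict
pipeline.load_lora_into_unet(state_dict, network_alphas, unet=pipeline.unet)
```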
""" """
low_cpu_mem_usage = low_cpu_mem_usage if low_cpu_mem_usage is not None else _LOW_CPU_MEM_USAGE_DEFAULT low_cpu_mem_usage = low_cpu_mem_usage if low_cpu_mem_usage is not None else _LOW_CPU_MEM_USAGE_DEFAULT
# If the serialization format is new (introduced in https://github.com/huggingface/diffusers/pull/2918), # If the serialization format is new (introduced in https://github.com/huggingface/diffusers/pull/2918),
...@@ -580,26 +587,27 @@ class LoraLoaderMixin: ...@@ -580,26 +587,27 @@ class LoraLoaderMixin:
_pipeline=None, _pipeline=None,
): ):
""" """
This will load the LoRA layers specified in `state_dict` into `text_encoder` Load LoRA layers specified in `state_dict` into `text_encoder`.
Parameters: Parameters:
state_dict (`dict`): state_dict (`dict`):
A standard state dict containing the lora layer parameters. The key should be prefixed with an A standard state dict containing the LoRA layer parameters. The key should be prefixed with an
additional `text_encoder` to distinguish between unet lora layers. additional `text_encoder` to distinguish between UNet LoRA layers.
network_alphas (`Dict[str, float]`): network_alphas (`Dict[str, float]`):
See `LoRALinearLayer` for more details. See
[`LoRALinearLayer`](https://github.com/huggingface/diffusers/blob/c697f524761abd2314c030221a3ad2f7791eab4e/src/diffusers/models/lora.py#L182)
for more details.
text_encoder (`CLIPTextModel`): text_encoder (`CLIPTextModel`):
The text encoder model to load the LoRA layers into. The text encoder model to load the LoRA layers into.
prefix (`str`): prefix (`str`):
Expected prefix of the `text_encoder` in the `state_dict`. Expected prefix of the `text_encoder` in the `state_dict`.
lora_scale (`float`): lora_scale (`float`):
How much to scale the output of the lora linear layer before it is added with the output of the regular Scale of `LoRALinearLayer`'s output before it is added with the output of the regular LoRA layer.
lora layer.
low_cpu_mem_usage (`bool`, *optional*, defaults to `True` if torch version >= 1.9.0 else `False`): low_cpu_mem_usage (`bool`, *optional*, defaults to `True` if torch version >= 1.9.0 else `False`):
Speed up model loading only loading the pretrained weights and not initializing the weights. This also Only load and not initialize the pretrained weights. This can speedup model loading and also tries to
tries to not use more than 1x model size in CPU memory (including peak memory) while loading the model. not use more than 1x model size in CPU memory (including peak memory) while loading the model. Only
Only supported for PyTorch >= 1.9.0. If you are using an older version of PyTorch, setting this supported for PyTorch >= 1.9.0. If you are using an older version of PyTorch, setting this argument to
argument to `True` will raise an error. `True` will raise an error.
adapter_name (`str`, *optional*): adapter_name (`str`, *optional*):
Adapter name to be used for referencing the loaded adapter model. If not specified, it will use Adapter name to be used for referencing the loaded adapter model. If not specified, it will use
`default_{i}` where i is the total number of adapters being loaded. `default_{i}` where i is the total number of adapters being loaded.
@@ -884,11 +892,11 @@ class LoraLoaderMixin:
safe_serialization: bool = True,
):
r"""
Save the UNet and text encoder LoRA parameters.
Arguments:
save_directory (`str` or `os.PathLike`):
Directory to save LoRA parameters to (will be created if it doesn't exist).
unet_lora_layers (`Dict[str, torch.nn.Module]` or `Dict[str, torch.Tensor]`):
State dict of the LoRA layers corresponding to the `unet`.
text_encoder_lora_layers (`Dict[str, torch.nn.Module]` or `Dict[str, torch.Tensor]`):
@@ -899,11 +907,30 @@ class LoraLoaderMixin:
need to call this function on all processes. In this case, set `is_main_process=True` only on the main
process to avoid race conditions.
save_function (`Callable`):
The function to use to save the state dict. Useful during distributed training when you need to replace
`torch.save` with another method. Can be configured with the environment variable `DIFFUSERS_SAVE_MODE`.
safe_serialization (`bool`, *optional*, defaults to `True`):
Whether to save the model using `safetensors` or with `pickle`.
Example:
```py
from diffusers import StableDiffusionXLPipeline
from peft.utils import get_peft_model_state_dict
import torch
pipeline = StableDiffusionXLPipeline.from_pretrained(
"stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")
pipeline.load_lora_weights("nerijs/pixel-art-xl", weight_name="pixel-art-xl.safetensors", adapter_name="pixel")
pipeline.fuse_lora()
# get and save unet state dict
unet_state_dict = get_peft_model_state_dict(pipeline.unet, adapter_name="pixel")
pipeline.save_lora_weights("fused-model", unet_lora_layers=unet_state_dict)
pipeline.load_lora_weights("fused-model", weight_name="pytorch_lora_weights.safetensors")
```
""" """
# Create a flat dictionary. # Create a flat dictionary.
state_dict = {} state_dict = {}
...@@ -1139,14 +1166,19 @@ class LoraLoaderMixin: ...@@ -1139,14 +1166,19 @@ class LoraLoaderMixin:
def unload_lora_weights(self): def unload_lora_weights(self):
""" """
Unloads the LoRA parameters. Unload the LoRA parameters from a pipeline.
Examples: Examples:
```python ```py
>>> # Assuming `pipeline` is already loaded with the LoRA parameters. from diffusers import DiffusionPipeline
>>> pipeline.unload_lora_weights() import torch
>>> ...
pipeline = DiffusionPipeline.from_pretrained(
"stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")
pipeline.load_lora_weights("nerijs/pixel-art-xl", weight_name="pixel-art-xl.safetensors", adapter_name="pixel")
pipeline.unload_lora_weights()
```
"""
if not USE_PEFT_BACKEND:
@@ -1175,7 +1207,7 @@ class LoraLoaderMixin:
safe_fusing: bool = False,
):
r"""
Fuse the LoRA parameters with the original parameters in their corresponding blocks.
<Tip warning={true}>
@@ -1189,9 +1221,23 @@ class LoraLoaderMixin:
Whether to fuse the text encoder LoRA parameters. If the text encoder wasn't monkey-patched with the
LoRA parameters then it won't have any effect.
lora_scale (`float`, defaults to 1.0):
Controls LoRA influence on the outputs.
safe_fusing (`bool`, defaults to `False`):
Whether to check fused weights for `NaN` values before fusing, and skip fusing them if `NaN` values are found.
Example:
```py
from diffusers import DiffusionPipeline
import torch
pipeline = DiffusionPipeline.from_pretrained(
"stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")
pipeline.load_lora_weights("nerijs/pixel-art-xl", weight_name="pixel-art-xl.safetensors", adapter_name="pixel")
pipeline.fuse_lora(lora_scale=0.7)
```
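The fusion itself follows the standard LoRA update rule; the snippet below is a reference sketch (not from this commit, and ignoring the per-layer alpha/rank scaling) of what `fuse_lora(lora_scale=0.7)` folds into each weight:
```py
import torch

W = torch.randn(8, 8)  # frozen base weight of a layer
B, A = torch.randn(8, 4), torch.randn(4, 8)  # trained low-rank LoRA factors
lora_scale = 0.7

# fusing bakes the scaled low-rank update into the base weight,
# so later forward passes need no extra LoRA matmuls
W_fused = W + lora_scale * (B @ A)
```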
""" """
if fuse_unet or fuse_text_encoder: if fuse_unet or fuse_text_encoder:
self.num_fused_loras += 1 self.num_fused_loras += 1
...@@ -1240,8 +1286,7 @@ class LoraLoaderMixin: ...@@ -1240,8 +1286,7 @@ class LoraLoaderMixin:
def unfuse_lora(self, unfuse_unet: bool = True, unfuse_text_encoder: bool = True): def unfuse_lora(self, unfuse_unet: bool = True, unfuse_text_encoder: bool = True):
r""" r"""
Reverses the effect of Unfuse the LoRA parameters from the original parameters in their corresponding blocks.
[`pipe.fuse_lora()`](https://huggingface.co/docs/diffusers/main/en/api/loaders#diffusers.loaders.LoraLoaderMixin.fuse_lora).
<Tip warning={true}> <Tip warning={true}>
...@@ -1254,6 +1299,20 @@ class LoraLoaderMixin: ...@@ -1254,6 +1299,20 @@ class LoraLoaderMixin:
unfuse_text_encoder (`bool`, defaults to `True`): unfuse_text_encoder (`bool`, defaults to `True`):
Whether to unfuse the text encoder LoRA parameters. If the text encoder wasn't monkey-patched with the Whether to unfuse the text encoder LoRA parameters. If the text encoder wasn't monkey-patched with the
LoRA parameters then it won't have any effect. LoRA parameters then it won't have any effect.
Example:
```py
from diffusers import DiffusionPipeline
import torch
pipeline = DiffusionPipeline.from_pretrained(
"stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")
pipeline.load_lora_weights("nerijs/pixel-art-xl", weight_name="pixel-art-xl.safetensors", adapter_name="pixel")
pipeline.fuse_lora(lora_scale=0.7)
pipeline.unfuse_lora()
```
""" """
if unfuse_unet: if unfuse_unet:
if not USE_PEFT_BACKEND: if not USE_PEFT_BACKEND:
...@@ -1305,16 +1364,32 @@ class LoraLoaderMixin: ...@@ -1305,16 +1364,32 @@ class LoraLoaderMixin:
text_encoder_weights: List[float] = None, text_encoder_weights: List[float] = None,
): ):
""" """
Sets the adapter layers for the text encoder. Set the currently active adapter for use in the text encoder.
Args: Args:
adapter_names (`List[str]` or `str`): adapter_names (`List[str]` or `str`):
The names of the adapters to use. The adapter to activate.
text_encoder (`torch.nn.Module`, *optional*): text_encoder (`torch.nn.Module`, *optional*):
The text encoder module to set the adapter layers for. If `None`, it will try to get the `text_encoder` The text encoder module to activate the adapter layers for. If `None`, it will try to get the
attribute. `text_encoder` attribute.
text_encoder_weights (`List[float]`, *optional*): text_encoder_weights (`List[float]`, *optional*):
The weights to use for the text encoder. If `None`, the weights are set to `1.0` for all the adapters. The weights to use for the text encoder. If `None`, the weights are set to `1.0` for all the adapters.
Example:
```py
from diffusers import DiffusionPipeline
import torch
pipeline = DiffusionPipeline.from_pretrained(
"stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")
pipeline.load_lora_weights("nerijs/pixel-art-xl", weight_name="pixel-art-xl.safetensors", adapter_name="pixel")
pipeline.load_lora_weights(
"jbilcke-hf/sdxl-cinematic-1", weight_name="pytorch_lora_weights.safetensors", adapter_name="cinematic"
)
pipeline.set_adapters_for_text_encoder("pixel")
```
""" """
if not USE_PEFT_BACKEND: if not USE_PEFT_BACKEND:
raise ValueError("PEFT backend is required for this method.") raise ValueError("PEFT backend is required for this method.")
...@@ -1342,12 +1417,25 @@ class LoraLoaderMixin: ...@@ -1342,12 +1417,25 @@ class LoraLoaderMixin:
def disable_lora_for_text_encoder(self, text_encoder: Optional["PreTrainedModel"] = None): def disable_lora_for_text_encoder(self, text_encoder: Optional["PreTrainedModel"] = None):
""" """
Disables the LoRA layers for the text encoder. Disable the text encoder's LoRA layers.
Args: Args:
text_encoder (`torch.nn.Module`, *optional*): text_encoder (`torch.nn.Module`, *optional*):
The text encoder module to disable the LoRA layers for. If `None`, it will try to get the The text encoder module to disable the LoRA layers for. If `None`, it will try to get the
`text_encoder` attribute. `text_encoder` attribute.
Example:
```py
from diffusers import DiffusionPipeline
import torch
pipeline = DiffusionPipeline.from_pretrained(
"stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")
pipeline.load_lora_weights("nerijs/pixel-art-xl", weight_name="pixel-art-xl.safetensors", adapter_name="pixel")
pipeline.disable_lora_for_text_encoder()
```
""" """
if not USE_PEFT_BACKEND: if not USE_PEFT_BACKEND:
raise ValueError("PEFT backend is required for this method.") raise ValueError("PEFT backend is required for this method.")
...@@ -1359,12 +1447,25 @@ class LoraLoaderMixin: ...@@ -1359,12 +1447,25 @@ class LoraLoaderMixin:
def enable_lora_for_text_encoder(self, text_encoder: Optional["PreTrainedModel"] = None): def enable_lora_for_text_encoder(self, text_encoder: Optional["PreTrainedModel"] = None):
""" """
Enables the LoRA layers for the text encoder. Enables the text encoder's LoRA layers.
Args: Args:
text_encoder (`torch.nn.Module`, *optional*): text_encoder (`torch.nn.Module`, *optional*):
The text encoder module to enable the LoRA layers for. If `None`, it will try to get the `text_encoder` The text encoder module to enable the LoRA layers for. If `None`, it will try to get the `text_encoder`
attribute. attribute.
Example:
```py
from diffusers import DiffusionPipeline
import torch
pipeline = DiffusionPipeline.from_pretrained(
"stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")
pipeline.load_lora_weights("nerijs/pixel-art-xl", weight_name="pixel-art-xl.safetensors", adapter_name="pixel")
pipeline.enable_lora_for_text_encoder()
```
""" """
if not USE_PEFT_BACKEND: if not USE_PEFT_BACKEND:
raise ValueError("PEFT backend is required for this method.") raise ValueError("PEFT backend is required for this method.")
...@@ -1415,10 +1516,24 @@ class LoraLoaderMixin: ...@@ -1415,10 +1516,24 @@ class LoraLoaderMixin:
def delete_adapters(self, adapter_names: Union[List[str], str]): def delete_adapters(self, adapter_names: Union[List[str], str]):
""" """
Delete an adapter's LoRA layers from the UNet and text encoder(s).
Args: Args:
Deletes the LoRA layers of `adapter_name` for the unet and text-encoder(s).
adapter_names (`Union[List[str], str]`): adapter_names (`Union[List[str], str]`):
The names of the adapter to delete. Can be a single string or a list of strings The names (single string or list of strings) of the adapter to delete.
Example:
```py
from diffusers import DiffusionPipeline
import torch
pipeline = DiffusionPipeline.from_pretrained(
"stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")
pipeline.load_lora_weights("nerijs/pixel-art-xl", weight_name="pixel-art-xl.safetensors", adapter_name="pixel")
pipeline.delete_adapters("pixel")
```
""" """
if not USE_PEFT_BACKEND: if not USE_PEFT_BACKEND:
raise ValueError("PEFT backend is required for this method.") raise ValueError("PEFT backend is required for this method.")
...@@ -1438,7 +1553,7 @@ class LoraLoaderMixin: ...@@ -1438,7 +1553,7 @@ class LoraLoaderMixin:
def get_active_adapters(self) -> List[str]: def get_active_adapters(self) -> List[str]:
""" """
Gets the list of the current active adapters. Get a list of currently active adapters.
Example: Example:
...@@ -1470,7 +1585,22 @@ class LoraLoaderMixin: ...@@ -1470,7 +1585,22 @@ class LoraLoaderMixin:
def get_list_adapters(self) -> Dict[str, List[str]]: def get_list_adapters(self) -> Dict[str, List[str]]:
""" """
Gets the current list of all available adapters in the pipeline. Get a list of all currently available adapters for each component in the pipeline.
Example:
```py
from diffusers import DiffusionPipeline
pipeline = DiffusionPipeline.from_pretrained(
"stabilityai/stable-diffusion-xl-base-1.0",
).to("cuda")
pipeline.load_lora_weights(
"jbilcke-hf/sdxl-cinematic-1", weight_name="pytorch_lora_weights.safetensors", adapter_name="cinematic"
)
pipeline.load_lora_weights("nerijs/pixel-art-xl", weight_name="pixel-art-xl.safetensors", adapter_name="pixel")
pipeline.get_list_adapters()
```
""" """
if not USE_PEFT_BACKEND: if not USE_PEFT_BACKEND:
raise ValueError( raise ValueError(
...@@ -1492,14 +1622,27 @@ class LoraLoaderMixin: ...@@ -1492,14 +1622,27 @@ class LoraLoaderMixin:
def set_lora_device(self, adapter_names: List[str], device: Union[torch.device, str, int]) -> None: def set_lora_device(self, adapter_names: List[str], device: Union[torch.device, str, int]) -> None:
""" """
Moves the LoRAs listed in `adapter_names` to a target device. Useful for offloading the LoRA to the CPU in case Move a LoRA to a target device. Useful for offloading a LoRA to the CPU in case you want to load multiple
you want to load multiple adapters and free some GPU memory. adapters and free some GPU memory.
Args: Args:
adapter_names (`List[str]`): adapter_names (`List[str]`):
List of adapters to send device to. List of adapters to send to device.
device (`Union[torch.device, str, int]`): device (`Union[torch.device, str, int]`):
Device to send the adapters to. Can be either a torch device, a str or an integer. Device (can be a `torch.device`, `str` or `int`) to place adapters on.
Example:
```py
from diffusers import DiffusionPipeline
import torch
pipeline = DiffusionPipeline.from_pretrained(
"stabilityai/stable-diffusion-xl-base-1.0",
).to("cuda")
pipeline.load_lora_weights("nerijs/pixel-art-xl", weight_name="pixel-art-xl.safetensors", adapter_name="pixel")
pipeline.set_lora_device(["pixel"], device="cuda")
```
""" """
if not USE_PEFT_BACKEND: if not USE_PEFT_BACKEND:
raise ValueError("PEFT backend is required for this method.") raise ValueError("PEFT backend is required for this method.")
...@@ -1531,7 +1674,7 @@ class LoraLoaderMixin: ...@@ -1531,7 +1674,7 @@ class LoraLoaderMixin:
class StableDiffusionXLLoraLoaderMixin(LoraLoaderMixin): class StableDiffusionXLLoraLoaderMixin(LoraLoaderMixin):
"""This class overrides `LoraLoaderMixin` with LoRA loading/saving code that's specific to SDXL""" """This class overrides [`LoraLoaderMixin`] with LoRA loading/saving code that's specific to SDXL."""
# Overrride to properly handle the loading and unloading of the additional text encoder. # Overrride to properly handle the loading and unloading of the additional text encoder.
def load_lora_weights( def load_lora_weights(
...@@ -1556,12 +1699,26 @@ class StableDiffusionXLLoraLoaderMixin(LoraLoaderMixin): ...@@ -1556,12 +1699,26 @@ class StableDiffusionXLLoraLoaderMixin(LoraLoaderMixin):
Parameters: Parameters:
pretrained_model_name_or_path_or_dict (`str` or `os.PathLike` or `dict`): pretrained_model_name_or_path_or_dict (`str` or `os.PathLike` or `dict`):
See [`~loaders.LoraLoaderMixin.lora_state_dict`]. A string (model id of a pretrained model hosted on the Hub), a path to a directory containing the model
adapter_name (`str`, *optional*): weights, or a [torch state
Adapter name to be used for referencing the loaded adapter model. If not specified, it will use dict](https://pytorch.org/tutorials/beginner/saving_loading_models.html#what-is-a-state-dict).
`default_{i}` where i is the total number of adapters being loaded.
kwargs (`dict`, *optional*): kwargs (`dict`, *optional*):
See [`~loaders.LoraLoaderMixin.lora_state_dict`]. See [`~loaders.LoraLoaderMixin.lora_state_dict`].
adapter_name (`str`, *optional*):
Name for referencing the loaded adapter model. If not specified, it will use `default_{i}` where `i` is
the total number of adapters being loaded. Must have PEFT installed to use.
Example:
```py
from diffusers import StableDiffusionXLPipeline
import torch
pipeline = StableDiffusionXLPipeline.from_pretrained(
"stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")
pipeline.load_lora_weights("nerijs/pixel-art-xl", weight_name="pixel-art-xl.safetensors", adapter_name="pixel")
```
""" """
# We could have accessed the unet config from `lora_state_dict()` too. We pass # We could have accessed the unet config from `lora_state_dict()` too. We pass
# it here explicitly to be able to tell that it's coming from an SDXL # it here explicitly to be able to tell that it's coming from an SDXL
......
...@@ -288,12 +288,15 @@ class FromSingleFileMixin: ...@@ -288,12 +288,15 @@ class FromSingleFileMixin:
class FromOriginalVAEMixin: class FromOriginalVAEMixin:
"""
Load pretrained ControlNet weights saved in the `.ckpt` or `.safetensors` format into an [`AutoencoderKL`].
"""
@classmethod @classmethod
def from_single_file(cls, pretrained_model_link_or_path, **kwargs): def from_single_file(cls, pretrained_model_link_or_path, **kwargs):
r""" r"""
Instantiate a [`AutoencoderKL`] from pretrained controlnet weights saved in the original `.ckpt` or Instantiate a [`AutoencoderKL`] from pretrained ControlNet weights saved in the original `.ckpt` or
`.safetensors` format. The pipeline is format. The pipeline is set in evaluation mode (`model.eval()`) by `.safetensors` format. The pipeline is set in evaluation mode (`model.eval()`) by default.
default.
Parameters: Parameters:
pretrained_model_link_or_path (`str` or `os.PathLike`, *optional*): pretrained_model_link_or_path (`str` or `os.PathLike`, *optional*):
...@@ -348,8 +351,8 @@ class FromOriginalVAEMixin: ...@@ -348,8 +351,8 @@ class FromOriginalVAEMixin:
<Tip warning={true}> <Tip warning={true}>
Make sure to pass both `image_size` and `scaling_factor` to `from_single_file()` if you want to load Make sure to pass both `image_size` and `scaling_factor` to `from_single_file()` if you're loading
a VAE that does accompany a stable diffusion model of v2 or higher or SDXL. a VAE from SDXL or a Stable Diffusion v2 model or higher.
</Tip> </Tip>
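Example (an illustrative sketch, not part of this commit; the file path is a placeholder for a downloaded VAE checkpoint in the original format):
```py
from diffusers import AutoencoderKL

# "./vae.safetensors" is a hypothetical local path to a single-file VAE checkpoint
vae = AutoencoderKL.from_single_file("./vae.safetensors")
```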
@@ -482,10 +485,14 @@ class FromOriginalVAEMixin:
class FromOriginalControlnetMixin:
"""
Load pretrained ControlNet weights saved in the `.ckpt` or `.safetensors` format into a [`ControlNetModel`].
"""
@classmethod
def from_single_file(cls, pretrained_model_link_or_path, **kwargs):
r"""
Instantiate a [`ControlNetModel`] from pretrained ControlNet weights saved in the original `.ckpt` or
`.safetensors` format. The model is set in evaluation mode (`model.eval()`) by default.
Parameters:
...
@@ -116,7 +116,7 @@ def load_textual_inversion_state_dicts(pretrained_model_name_or_paths, **kwargs)
class TextualInversionLoaderMixin:
r"""
Load Textual Inversion tokens and embeddings to the tokenizer and text encoder.
"""
def maybe_convert_prompt(self, prompt: Union[str, List[str]], tokenizer: "PreTrainedTokenizer"):  # noqa: F821
@@ -276,7 +276,7 @@ class TextualInversionLoaderMixin:
**kwargs,
):
r"""
Load Textual Inversion embeddings into the text encoder of [`StableDiffusionPipeline`] (both 🤗 Diffusers and
Automatic1111 formats are supported).
Parameters:
@@ -335,7 +335,7 @@ class TextualInversionLoaderMixin:
Example:
To load a Textual Inversion embedding vector in 🤗 Diffusers format:
```py
from diffusers import StableDiffusionPipeline
@@ -352,7 +352,7 @@ class TextualInversionLoaderMixin:
image.save("cat-backpack.png")
```
To load a Textual Inversion embedding vector in Automatic1111 format, make sure to download the vector first
(for example from [civitAI](https://civitai.com/models/3036?modelVersionId=9857)) and then load the vector
locally:
...
@@ -53,6 +53,10 @@ CUSTOM_DIFFUSION_WEIGHT_NAME_SAFE = "pytorch_custom_diffusion_weights.safetensor
class UNet2DConditionLoadersMixin:
"""
Load LoRA layers into a [`UNet2DConditionModel`].
"""
text_encoder_name = TEXT_ENCODER_NAME
unet_name = UNET_NAME
@@ -107,6 +111,19 @@ class UNet2DConditionLoadersMixin:
guarantee the timeliness or safety of the source, and you should refer to the mirror site for more
information.
Example:
```py
from diffusers import AutoPipelineForText2Image
import torch
pipeline = AutoPipelineForText2Image.from_pretrained(
"stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")
pipeline.unet.load_attn_procs(
"jbilcke-hf/sdxl-cinematic-1", weight_name="pytorch_lora_weights.safetensors", adapter_name="cinematic"
)
```
""" """
from ..models.attention_processor import CustomDiffusionAttnProcessor from ..models.attention_processor import CustomDiffusionAttnProcessor
from ..models.lora import LoRACompatibleConv, LoRACompatibleLinear, LoRAConv2dLayer, LoRALinearLayer from ..models.lora import LoRACompatibleConv, LoRACompatibleLinear, LoRAConv2dLayer, LoRALinearLayer
...@@ -393,12 +410,12 @@ class UNet2DConditionLoadersMixin: ...@@ -393,12 +410,12 @@ class UNet2DConditionLoadersMixin:
**kwargs, **kwargs,
): ):
r""" r"""
Save an attention processor to a directory so that it can be reloaded using the Save attention processor layers to a directory so that it can be reloaded with the
[`~loaders.UNet2DConditionLoadersMixin.load_attn_procs`] method. [`~loaders.UNet2DConditionLoadersMixin.load_attn_procs`] method.
Arguments: Arguments:
save_directory (`str` or `os.PathLike`): save_directory (`str` or `os.PathLike`):
Directory to save an attention processor to. Will be created if it doesn't exist. Directory to save an attention processor to (will be created if it doesn't exist).
is_main_process (`bool`, *optional*, defaults to `True`): is_main_process (`bool`, *optional*, defaults to `True`):
Whether the process calling this is the main process or not. Useful during distributed training and you Whether the process calling this is the main process or not. Useful during distributed training and you
need to call this function on all processes. In this case, set `is_main_process=True` only on the main need to call this function on all processes. In this case, set `is_main_process=True` only on the main
...@@ -408,7 +425,21 @@ class UNet2DConditionLoadersMixin: ...@@ -408,7 +425,21 @@ class UNet2DConditionLoadersMixin:
replace `torch.save` with another method. Can be configured with the environment variable replace `torch.save` with another method. Can be configured with the environment variable
`DIFFUSERS_SAVE_MODE`. `DIFFUSERS_SAVE_MODE`.
safe_serialization (`bool`, *optional*, defaults to `True`): safe_serialization (`bool`, *optional*, defaults to `True`):
Whether to save the model using `safetensors` or the traditional PyTorch way with `pickle`. Whether to save the model using `safetensors` or with `pickle`.
Example:
```py
import torch
from diffusers import DiffusionPipeline
pipeline = DiffusionPipeline.from_pretrained(
"CompVis/stable-diffusion-v1-4",
torch_dtype=torch.float16,
).to("cuda")
pipeline.unet.load_attn_procs("path-to-save-model", weight_name="pytorch_custom_diffusion_weights.bin")
pipeline.unet.save_attn_procs("path-to-save-model", weight_name="pytorch_custom_diffusion_weights.bin")
```
""" """
from ..models.attention_processor import ( from ..models.attention_processor import (
CustomDiffusionAttnProcessor, CustomDiffusionAttnProcessor,
...@@ -507,14 +538,30 @@ class UNet2DConditionLoadersMixin: ...@@ -507,14 +538,30 @@ class UNet2DConditionLoadersMixin:
weights: Optional[Union[List[float], float]] = None, weights: Optional[Union[List[float], float]] = None,
): ):
""" """
Sets the adapter layers for the unet. Set the currently active adapters for use in the UNet.
Args: Args:
adapter_names (`List[str]` or `str`): adapter_names (`List[str]` or `str`):
The names of the adapters to use. The names of the adapters to use.
weights (`Union[List[float], float]`, *optional*): adapter_weights (`Union[List[float], float]`, *optional*):
The adapter(s) weights to use with the UNet. If `None`, the weights are set to `1.0` for all the The adapter(s) weights to use with the UNet. If `None`, the weights are set to `1.0` for all the
adapters. adapters.
Example:
```py
from diffusers import AutoPipelineForText2Image
import torch
pipeline = AutoPipelineForText2Image.from_pretrained(
"stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")
pipeline.load_lora_weights(
"jbilcke-hf/sdxl-cinematic-1", weight_name="pytorch_lora_weights.safetensors", adapter_name="cinematic"
)
pipeline.load_lora_weights("nerijs/pixel-art-xl", weight_name="pixel-art-xl.safetensors", adapter_name="pixel")
pipeline.set_adapters(["cinematic", "pixel"], adapter_weights=[0.5, 0.5])
```
""" """
if not USE_PEFT_BACKEND: if not USE_PEFT_BACKEND:
raise ValueError("PEFT backend is required for `set_adapters()`.") raise ValueError("PEFT backend is required for `set_adapters()`.")
...@@ -535,7 +582,22 @@ class UNet2DConditionLoadersMixin: ...@@ -535,7 +582,22 @@ class UNet2DConditionLoadersMixin:
def disable_lora(self): def disable_lora(self):
""" """
Disables the active LoRA layers for the unet. Disable the UNet's active LoRA layers.
Example:
```py
from diffusers import AutoPipelineForText2Image
import torch
pipeline = AutoPipelineForText2Image.from_pretrained(
"stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")
pipeline.load_lora_weights(
"jbilcke-hf/sdxl-cinematic-1", weight_name="pytorch_lora_weights.safetensors", adapter_name="cinematic"
)
pipeline.disable_lora()
```
""" """
if not USE_PEFT_BACKEND: if not USE_PEFT_BACKEND:
raise ValueError("PEFT backend is required for this method.") raise ValueError("PEFT backend is required for this method.")
...@@ -543,7 +605,22 @@ class UNet2DConditionLoadersMixin: ...@@ -543,7 +605,22 @@ class UNet2DConditionLoadersMixin:
def enable_lora(self): def enable_lora(self):
""" """
Enables the active LoRA layers for the unet. Enable the UNet's active LoRA layers.
Example:
```py
from diffusers import AutoPipelineForText2Image
import torch
pipeline = AutoPipelineForText2Image.from_pretrained(
"stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")
pipeline.load_lora_weights(
"jbilcke-hf/sdxl-cinematic-1", weight_name="pytorch_lora_weights.safetensors", adapter_name="cinematic"
)
pipeline.enable_lora()
```
""" """
if not USE_PEFT_BACKEND: if not USE_PEFT_BACKEND:
raise ValueError("PEFT backend is required for this method.") raise ValueError("PEFT backend is required for this method.")
...@@ -551,10 +628,26 @@ class UNet2DConditionLoadersMixin: ...@@ -551,10 +628,26 @@ class UNet2DConditionLoadersMixin:
def delete_adapters(self, adapter_names: Union[List[str], str]): def delete_adapters(self, adapter_names: Union[List[str], str]):
""" """
Delete an adapter's LoRA layers from the UNet.
Args: Args:
Deletes the LoRA layers of `adapter_name` for the unet.
adapter_names (`Union[List[str], str]`): adapter_names (`Union[List[str], str]`):
The names of the adapter to delete. Can be a single string or a list of strings The names (single string or list of strings) of the adapter to delete.
Example:
```py
from diffusers import AutoPipelineForText2Image
import torch
pipeline = AutoPipelineForText2Image.from_pretrained(
"stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")
pipeline.load_lora_weights(
"jbilcke-hf/sdxl-cinematic-1", weight_name="pytorch_lora_weights.safetensors", adapter_names="cinematic"
)
pipeline.delete_adapters("cinematic")
```
""" """
if not USE_PEFT_BACKEND: if not USE_PEFT_BACKEND:
raise ValueError("PEFT backend is required for this method.") raise ValueError("PEFT backend is required for this method.")
......