[Refactor] Better align `from_single_file` logic with `from_pretrained` (#7496)

* refactor unet single file loading a bit. * retrieve the unet from create_diffusers_unet_model_from_ldm * update * update * updae * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * tests * update * update * update * Update docs/source/en/api/single_file.md Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> * Update docs/source/en/api/single_file.md Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> * update * update * update * update * update * update * update * update * update * update * update * update * update * Update docs/source/en/api/loaders/single_file.md Co-authored-by: YiYi Xu <yixu310@gmail.com> * Update src/diffusers/loaders/single_file.py Co-authored-by: YiYi Xu <yixu310@gmail.com> * Update docs/source/en/api/loaders/single_file.md Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> * Update docs/source/en/api/loaders/single_file.md Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> * Update docs/source/en/api/loaders/single_file.md Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> * Update docs/source/en/api/loaders/single_file.md Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update --------- Co-authored-by: sayakpaul <spsayakpaul@gmail.com> Co-authored-by: YiYi Xu <yixu310@gmail.com>

cb0f3b49 · Dhruv Nair · GitHub · caf9e985 · cb0f3b49 · cb0f3b49
Unverified Commit cb0f3b49 authored May 09, 2024 by Dhruv Nair Committed by GitHub May 09, 2024
20 changed files
--- a/.github/workflows/push_tests.yml
+++ b/.github/workflows/push_tests.yml
@@ -124,7 +124,7 @@ jobs:
        shell: bash
    strategy:
      matrix:
-        module: [models, schedulers, lora, others]
+        module: [models, schedulers, lora, others, single_file]
    steps:
    - name: Checkout diffusers
      uses: actions/checkout@v3

--- a/docs/source/en/api/loaders/single_file.md
+++ b/docs/source/en/api/loaders/single_file.md
@@ -10,13 +10,124 @@ an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express o
 specific language governing permissions and limitations under the License.
 -->
-# Single files
+# Loading Pipelines and Models via `from_single_file`
-Diffusers supports loading pretrained pipeline (or model) weights stored in a single file, such as a `ckpt` or `safetensors` file. These single file types are typically produced from community trained models. There are three classes for loading single file weights:
+The `from_single_file` method allows you to load supported pipelines using a single checkpoint file as opposed to the folder format used by Diffusers. This is useful if you are working with Stable Diffusion web UIs (such as A1111) that rely extensively on a single file to distribute all the components of a diffusion model.
- [`FromSingleFileMixin`] supports loading pretrained pipeline weights stored in a single file, which can either be a `ckpt` or `safetensors` file.
+The `from_single_file` method also supports loading models in their originally distributed format. This means that supported models that have been finetuned with other services can be loaded directly into supported Diffusers model objects and pipelines.
- [`FromOriginalVAEMixin`] supports loading a pretrained [`AutoencoderKL`] from pretrained ControlNet weights stored in a single file, which can either be a `ckpt` or `safetensors` file.
- [`FromOriginalControlnetMixin`] supports loading pretrained ControlNet weights stored in a single file, which can either be a `ckpt` or `safetensors` file.
+## Pipelines that currently support `from_single_file` loading
+- [`StableDiffusionPipeline`]
+- [`StableDiffusionImg2ImgPipeline`]
+- [`StableDiffusionInpaintPipeline`]
+- [`StableDiffusionControlNetPipeline`]
+- [`StableDiffusionControlNetImg2ImgPipeline`]
+- [`StableDiffusionControlNetInpaintPipeline`]
+- [`StableDiffusionUpscalePipeline`]
+- [`StableDiffusionXLPipeline`]
+- [`StableDiffusionXLImg2ImgPipeline`]
+- [`StableDiffusionXLInpaintPipeline`]
+- [`StableDiffusionXLInstructPix2PixPipeline`]
+- [`StableDiffusionXLControlNetPipeline`]
+- [`StableDiffusionXLKDiffusionPipeline`]
+- [`LatentConsistencyModelPipeline`]
+- [`LatentConsistencyModelImg2ImgPipeline`]
+- [`StableDiffusionControlNetXSPipeline`]
+- [`StableDiffusionXLControlNetXSPipeline`]
+- [`LEditsPPPipelineStableDiffusion`]
+- [`LEditsPPPipelineStableDiffusionXL`]
+- [`PIAPipeline`]
+## Models that currently support `from_single_file` loading
+- [`UNet2DConditionModel`]
+- [`StableCascadeUNet`]
+- [`AutoencoderKL`]
+- [`ControlNetModel`]
+## Usage Examples
+## Loading a Pipeline using `from_single_file`
+```python
+from diffusers import StableDiffusionXLPipeline
+ckpt_path = "https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/blob/main/sd_xl_base_1.0_0.9vae.safetensors"
+pipe = StableDiffusionXLPipeline.from_single_file(ckpt_path)
+```
+## Setting components in a Pipeline using `from_single_file`
+Swap components of the pipeline by passing them directly to the `from_single_file` method, for example if you would like to use a different scheduler than the pipeline default.
+```python
+from diffusers import StableDiffusionXLPipeline, DDIMScheduler
+ckpt_path = "https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/blob/main/sd_xl_base_1.0_0.9vae.safetensors"
+scheduler = DDIMScheduler()
+pipe = StableDiffusionXLPipeline.from_single_file(ckpt_path, scheduler=scheduler)
+```
+```python
+from diffusers import StableDiffusionControlNetPipeline, ControlNetModel
+ckpt_path = "https://huggingface.co/runwayml/stable-diffusion-v1-5/blob/main/v1-5-pruned-emaonly.safetensors"
+# Load the ControlNet from a Diffusers-format repo, not from the base model checkpoint URL
+controlnet = ControlNetModel.from_pretrained("lllyasviel/control_v11p_sd15_canny")
+pipe = StableDiffusionControlNetPipeline.from_single_file(ckpt_path, controlnet=controlnet)
+```
+## Loading a Model using `from_single_file`
+```python
+from diffusers import StableCascadeUNet
+ckpt_path = "https://huggingface.co/stabilityai/stable-cascade/blob/main/stage_b_lite.safetensors"
+model = StableCascadeUNet.from_single_file(ckpt_path)
+```
+## Using a Diffusers model repository to configure single file loading
+Under the hood, `from_single_file` will try to determine a model repository to use to configure the components of the pipeline. You can also pass in a repository id to the `config` argument of the `from_single_file` method to explicitly set the repository to use.
+```python
+from diffusers import StableDiffusionXLPipeline
+ckpt_path = "https://huggingface.co/segmind/SSD-1B/blob/main/SSD-1B.safetensors"
+repo_id = "segmind/SSD-1B"
+pipe = StableDiffusionXLPipeline.from_single_file(ckpt_path, config=repo_id)
+```
+## Override configuration options when using single file loading
+Override the default model or pipeline configuration options when using `from_single_file` by passing in the relevant arguments directly to the `from_single_file` method. Any argument that is supported by the model or pipeline class can be configured in this way:
+```python
+from diffusers import StableDiffusionXLInstructPix2PixPipeline
+ckpt_path = "https://huggingface.co/stabilityai/cosxl/blob/main/cosxl_edit.safetensors"
+pipe = StableDiffusionXLInstructPix2PixPipeline.from_single_file(ckpt_path, config="diffusers/sdxl-instructpix2pix-768", is_cosxl_edit=True)
+```
+```python
+from diffusers import UNet2DConditionModel
+ckpt_path = "https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/blob/main/sd_xl_base_1.0_0.9vae.safetensors"
+model = UNet2DConditionModel.from_single_file(ckpt_path, upcast_attention=True)
+```
+In the `config` example above, since we explicitly passed `config="segmind/SSD-1B"`, `from_single_file` will use this [configuration file](https://huggingface.co/segmind/SSD-1B/blob/main/unet/config.json) from the `"unet"` subfolder in `"segmind/SSD-1B"` to configure the unet component included in the checkpoint. Similarly, it will use the `config.json` file from the `"vae"` subfolder to configure the vae model, the `config.json` file from the `"text_encoder"` subfolder to configure the text encoder, and so on.
+Note that most of the time you do not need to explicitly pass a `config` argument; `from_single_file` will automatically map the checkpoint to a repo id (this is discussed in more detail in the next section). However, this can be useful in cases where model components might have been changed from what was originally distributed or in cases where a checkpoint file might not have the necessary metadata to correctly determine the configuration to use for the pipeline.
 <Tip>
@@ -24,14 +135,114 @@ To learn more about how to load single file weights, see the [Load different Sta
 </Tip>
-## FromSingleFileMixin
+## Working with local files
-[[autodoc]] loaders.single_file.FromSingleFileMixin
+As of `diffusers>=0.28.0` the `from_single_file` method will attempt to configure a pipeline or model by first inferring the model type from the checkpoint file and then using the model type to determine the appropriate model repo configuration to use from the Hugging Face Hub. For example, any single file checkpoint based on the Stable Diffusion XL base model will use the [`stabilityai/stable-diffusion-xl-base-1.0`](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0) model repo to configure the pipeline.
-## FromOriginalVAEMixin
+If you are working in an environment with restricted internet access, it is recommended to download the config files and checkpoints for the model to your preferred directory and pass the local paths to the `pretrained_model_link_or_path` and `config` arguments of the `from_single_file` method.
-[[autodoc]] loaders.autoencoder.FromOriginalVAEMixin
+```python
+from diffusers import StableDiffusionXLPipeline
+from huggingface_hub import hf_hub_download, snapshot_download
+my_local_checkpoint_path = hf_hub_download(
+    repo_id="segmind/SSD-1B",
+    filename="SSD-1B.safetensors"
+)
+my_local_config_path = snapshot_download(
+    repo_id="segmind/SSD-1B",
+    allow_patterns=["*.json", "**/*.json", "*.txt", "**/*.txt"]
+)
+pipe = StableDiffusionXLPipeline.from_single_file(my_local_checkpoint_path, config=my_local_config_path, local_files_only=True)
+```
+By default this will download the checkpoints and config files to the [Hugging Face Hub cache directory](https://huggingface.co/docs/huggingface_hub/en/guides/manage-cache). You can also specify a local directory to download the files to by passing the `local_dir` argument to the `hf_hub_download` and `snapshot_download` functions.
+```python
+from diffusers import StableDiffusionXLPipeline
+from huggingface_hub import hf_hub_download, snapshot_download
+my_local_checkpoint_path = hf_hub_download(
+    repo_id="segmind/SSD-1B",
+    filename="SSD-1B.safetensors",
+    local_dir="my_local_checkpoints"
+)
+my_local_config_path = snapshot_download(
+    repo_id="segmind/SSD-1B",
+    allow_patterns=["*.json", "**/*.json", "*.txt", "**/*.txt"],
+    local_dir="my_local_config"
+)
+pipe = StableDiffusionXLPipeline.from_single_file(my_local_checkpoint_path, config=my_local_config_path, local_files_only=True)
+```
+## Working with local files on file systems that do not support symlinking
+By default the `from_single_file` method relies on the `huggingface_hub` caching mechanism to fetch and store checkpoints and config files for models and pipelines. If you are working with a file system that does not support symlinking, it is recommended that you first download the checkpoint file to a local directory and disable symlinking by passing the `local_dir_use_symlinks=False` argument to the `hf_hub_download` and `snapshot_download` functions.
+```python
+from huggingface_hub import hf_hub_download, snapshot_download
+my_local_checkpoint_path = hf_hub_download(
+    repo_id="segmind/SSD-1B",
+    filename="SSD-1B.safetensors",
+    local_dir="my_local_checkpoints",
+    local_dir_use_symlinks=False
+)
+print("My local checkpoint: ", my_local_checkpoint_path)
+my_local_config_path = snapshot_download(
+    repo_id="segmind/SSD-1B",
+    allow_patterns=["*.json", "**/*.json", "*.txt", "**/*.txt"],
+    local_dir_use_symlinks=False,
+)
+print("My local config: ", my_local_config_path)
+```
+Then pass the local paths to the `pretrained_model_link_or_path` and `config` arguments of the `from_single_file` method.
+```python
+from diffusers import StableDiffusionXLPipeline
+pipe = StableDiffusionXLPipeline.from_single_file(my_local_checkpoint_path, config=my_local_config_path, local_files_only=True)
+```
+<Tip>
+Disabling symlinking means that the `huggingface_hub` caching mechanism has no way to determine whether a file has already been downloaded to the local directory. This means that the `hf_hub_download` and `snapshot_download` functions will download files to the local directory each time they are executed. If you are disabling symlinking, it is recommended that you separate the model download and loading steps to avoid downloading the same file multiple times.
+</Tip>
+## Using the original configuration file of a model
+If you would like to configure the parameters of the model components in the pipeline using the original YAML configuration file, you can pass a local path or URL to the original configuration file to the `original_config` argument of the `from_single_file` method.
+```python
+from diffusers import StableDiffusionXLPipeline
+ckpt_path = "https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/blob/main/sd_xl_base_1.0_0.9vae.safetensors"
+repo_id = "stabilityai/stable-diffusion-xl-base-1.0"
+original_config = "https://raw.githubusercontent.com/Stability-AI/generative-models/main/configs/inference/sd_xl_base.yaml"
+pipe = StableDiffusionXLPipeline.from_single_file(ckpt_path, original_config=original_config)
+```
+In the example above, the `original_config` file is only used to configure the parameters of the individual model components of the pipeline. For example it will be used to configure parameters such as the `in_channels` of the `vae` model and `unet` model. It is not used to determine the type of component objects in the pipeline.
+<Tip>
+When using `original_config` with `local_files_only=True`, Diffusers will attempt to infer the components based on the type signatures of the pipeline class, rather than attempting to fetch the pipeline config from the Hugging Face Hub. This is to prevent backwards breaking changes in existing code that might not be able to connect to the internet to fetch the necessary pipeline config files.
+This is not as reliable as providing a path to a local config repo and might lead to errors when configuring the pipeline. To avoid this, please run the pipeline with `local_files_only=False` once to download the appropriate pipeline config files to the local cache.
+</Tip>
+## FromSingleFileMixin
+[[autodoc]] loaders.single_file.FromSingleFileMixin
-## FromOriginalControlnetMixin
+## FromOriginalModelMixin
-[[autodoc]] loaders.controlnet.FromOriginalControlNetMixin
+[[autodoc]] loaders.single_file_model.FromOriginalModelMixin
\ No newline at end of file
--- a/src/diffusers/__init__.py
+++ b/src/diffusers/__init__.py
@@ -27,6 +27,7 @@ from .utils import (
 _import_structure = {
    "configuration_utils": ["ConfigMixin"],
+    "loaders": ["FromOriginalModelMixin"],
    "models": [],
    "pipelines": [],
    "schedulers": [],

--- a/src/diffusers/configuration_utils.py
+++ b/src/diffusers/configuration_utils.py
@@ -340,6 +340,8 @@ class ConfigMixin:
        """
        cache_dir = kwargs.pop("cache_dir", None)
+        local_dir = kwargs.pop("local_dir", None)
+        local_dir_use_symlinks = kwargs.pop("local_dir_use_symlinks", "auto")
        force_download = kwargs.pop("force_download", False)
        resume_download = kwargs.pop("resume_download", None)
        proxies = kwargs.pop("proxies", None)
@@ -364,13 +366,13 @@ class ConfigMixin:
        if os.path.isfile(pretrained_model_name_or_path):
            config_file = pretrained_model_name_or_path
        elif os.path.isdir(pretrained_model_name_or_path):
-            if os.path.isfile(os.path.join(pretrained_model_name_or_path, cls.config_name)):
+            if subfolder is not None and os.path.isfile(
-                # Load from a PyTorch checkpoint
-                config_file = os.path.join(pretrained_model_name_or_path, cls.config_name)
-            elif subfolder is not None and os.path.isfile(
                os.path.join(pretrained_model_name_or_path, subfolder, cls.config_name)
            ):
                config_file = os.path.join(pretrained_model_name_or_path, subfolder, cls.config_name)
+            elif os.path.isfile(os.path.join(pretrained_model_name_or_path, cls.config_name)):
+                # Load from a PyTorch checkpoint
+                config_file = os.path.join(pretrained_model_name_or_path, cls.config_name)
            else:
                raise EnvironmentError(
                    f"Error no file named {cls.config_name} found in directory {pretrained_model_name_or_path}."
@@ -390,6 +392,8 @@ class ConfigMixin:
                    user_agent=user_agent,
                    subfolder=subfolder,
                    revision=revision,
+                    local_dir=local_dir,
+                    local_dir_use_symlinks=local_dir_use_symlinks,
                )
            except RepositoryNotFoundError:
                raise EnvironmentError(

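The reordered branch in `ConfigMixin.load_config` above means that, for a local directory, a config file inside `subfolder` now takes precedence over one at the directory root. A minimal sketch of that lookup order, assuming a hypothetical standalone helper `resolve_config_file` (not part of Diffusers) and `config.json` as the config file name; the Hub download path is left out:

```python
import os
import tempfile


def resolve_config_file(path, config_name="config.json", subfolder=None):
    """Hypothetical helper mirroring the local lookup order after this change."""
    if os.path.isfile(path):
        return path
    if os.path.isdir(path):
        # The subfolder is checked first ...
        if subfolder is not None:
            candidate = os.path.join(path, subfolder, config_name)
            if os.path.isfile(candidate):
                return candidate
        # ... and the directory root is only a fallback.
        candidate = os.path.join(path, config_name)
        if os.path.isfile(candidate):
            return candidate
    # The real method would try the Hub for non-local paths; omitted here.
    raise EnvironmentError(f"No file named {config_name} found in {path}.")


# Build a throwaway repo layout with a config at the root and in "unet/".
repo_dir = tempfile.mkdtemp()
os.makedirs(os.path.join(repo_dir, "unet"))
for location in ("", "unet"):
    with open(os.path.join(repo_dir, location, "config.json"), "w") as f:
        f.write("{}")

resolved_with_subfolder = resolve_config_file(repo_dir, subfolder="unet")
resolved_without_subfolder = resolve_config_file(repo_dir)
```

With both files present, passing `subfolder="unet"` resolves to the component config rather than the root-level file, which is what a pipeline-style repo layout passed via `config` appears to need.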
--- a/src/diffusers/loaders/__init__.py
+++ b/src/diffusers/loaders/__init__.py
@@ -54,9 +54,7 @@ if is_transformers_available():
 _import_structure = {}
 if is_torch_available():
-    _import_structure["autoencoder"] = ["FromOriginalVAEMixin"]
+    _import_structure["single_file_model"] = ["FromOriginalModelMixin"]
-    _import_structure["controlnet"] = ["FromOriginalControlNetMixin"]
    _import_structure["unet"] = ["UNet2DConditionLoadersMixin"]
    _import_structure["utils"] = ["AttnProcsLayers"]
    if is_transformers_available():
@@ -70,8 +68,7 @@ _import_structure["peft"] = ["PeftAdapterMixin"]
 if TYPE_CHECKING or DIFFUSERS_SLOW_IMPORT:
    if is_torch_available():
-        from .autoencoder import FromOriginalVAEMixin
+        from .single_file_model import FromOriginalModelMixin
-        from .controlnet import FromOriginalControlNetMixin
        from .unet import UNet2DConditionLoadersMixin
        from .utils import AttnProcsLayers

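The new `single_file_model.py` module in this commit forwards user keyword arguments to its mapping functions only when the function's signature accepts them (see `_get_mapping_function_kwargs`). A minimal sketch of that filtering idea; `example_config_mapping_fn` and the option names passed to it are hypothetical stand-ins, not Diffusers functions:

```python
import inspect


def get_mapping_function_kwargs(mapping_fn, **kwargs):
    """Keep only the keyword arguments that appear in mapping_fn's signature."""
    parameters = inspect.signature(mapping_fn).parameters
    return {name: kwargs[name] for name in parameters if name in kwargs}


def example_config_mapping_fn(original_config, checkpoint, image_size=None, upcast_attention=None):
    """Hypothetical stand-in for a config mapping function."""
    return {"image_size": image_size, "upcast_attention": upcast_attention}


# One option the function understands, one it does not.
user_kwargs = {"upcast_attention": True, "unrelated_option": 123}
filtered = get_mapping_function_kwargs(example_config_mapping_fn, **user_kwargs)
config = example_config_mapping_fn(original_config={}, checkpoint={}, **filtered)
```

Filtering by signature lets a single `**kwargs` bag be shared across mapping functions with different parameters without raising `TypeError` on unknown names.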
--- a/src/diffusers/loaders/single_file.py
+++ b/src/diffusers/loaders/single_file.py
--- a/src/diffusers/loaders/single_file_model.py
+++ b/src/diffusers/loaders/single_file_model.py
+# Copyright 2024 The HuggingFace Team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+import inspect
+import re
+from contextlib import nullcontext
+from typing import Optional
+from huggingface_hub.utils import validate_hf_hub_args
+from ..utils import deprecate, is_accelerate_available, logging
+from .single_file_utils import (
+    SingleFileComponentError,
+    convert_controlnet_checkpoint,
+    convert_ldm_unet_checkpoint,
+    convert_ldm_vae_checkpoint,
+    convert_stable_cascade_unet_single_file_to_diffusers,
+    create_controlnet_diffusers_config_from_ldm,
+    create_unet_diffusers_config_from_ldm,
+    create_vae_diffusers_config_from_ldm,
+    fetch_diffusers_config,
+    fetch_original_config,
+    load_single_file_checkpoint,
+)
+logger = logging.get_logger(__name__)
+if is_accelerate_available():
+    from accelerate import init_empty_weights
+    from ..models.modeling_utils import load_model_dict_into_meta
+SINGLE_FILE_LOADABLE_CLASSES = {
+    "StableCascadeUNet": {
+        "checkpoint_mapping_fn": convert_stable_cascade_unet_single_file_to_diffusers,
+    },
+    "UNet2DConditionModel": {
+        "checkpoint_mapping_fn": convert_ldm_unet_checkpoint,
+        "config_mapping_fn": create_unet_diffusers_config_from_ldm,
+        "default_subfolder": "unet",
+        "legacy_kwargs": {
+            "num_in_channels": "in_channels",  # Legacy kwargs supported by `from_single_file` mapped to new args
+        },
+    },
+    "AutoencoderKL": {
+        "checkpoint_mapping_fn": convert_ldm_vae_checkpoint,
+        "config_mapping_fn": create_vae_diffusers_config_from_ldm,
+        "default_subfolder": "vae",
+    },
+    "ControlNetModel": {
+        "checkpoint_mapping_fn": convert_controlnet_checkpoint,
+        "config_mapping_fn": create_controlnet_diffusers_config_from_ldm,
+    },
+}
+def _get_mapping_function_kwargs(mapping_fn, **kwargs):
+    parameters = inspect.signature(mapping_fn).parameters
+    mapping_kwargs = {}
+    for parameter in parameters:
+        if parameter in kwargs:
+            mapping_kwargs[parameter] = kwargs[parameter]
+    return mapping_kwargs
+class FromOriginalModelMixin:
+    """
+    Load pretrained weights saved in the `.ckpt` or `.safetensors` format into a model.
+    """
+    @classmethod
+    @validate_hf_hub_args
+    def from_single_file(cls, pretrained_model_link_or_path_or_dict: Optional[str] = None, **kwargs):
+        r"""
+        Instantiate a model from pretrained weights saved in the original `.ckpt` or `.safetensors` format. The model
+        is set in evaluation mode (`model.eval()`) by default.
+        Parameters:
+            pretrained_model_link_or_path_or_dict (`str`, *optional*):
+                Can be either:
+                    - A link to the `.safetensors` or `.ckpt` file (for example
+                      `"https://huggingface.co/<repo_id>/blob/main/<path_to_file>.safetensors"`) on the Hub.
+                    - A path to a local *file* containing the weights of the component model.
+                    - A state dict containing the component model weights.
+            config (`str`, *optional*):
+                - A string, the *repo id* (for example `CompVis/ldm-text2im-large-256`) of a pretrained pipeline hosted
+                  on the Hub.
+                - A path to a *directory* (for example `./my_pipeline_directory/`) containing the pipeline component
+                  configs in Diffusers format.
+            subfolder (`str`, *optional*, defaults to `""`):
+                The subfolder location of a model file within a larger model repository on the Hub or locally.
+            original_config (`str`, *optional*):
+                Dict or path to a yaml file containing the configuration for the model in its original format.
+                    If a dict is provided, it will be used to initialize the model configuration.
+            torch_dtype (`str` or `torch.dtype`, *optional*):
+                Override the default `torch.dtype` and load the model with another dtype. If `"auto"` is passed, the
+                dtype is automatically derived from the model's weights.
+            force_download (`bool`, *optional*, defaults to `False`):
+                Whether or not to force the (re-)download of the model weights and configuration files, overriding the
+                cached versions if they exist.
+            cache_dir (`Union[str, os.PathLike]`, *optional*):
+                Path to a directory where a downloaded pretrained model configuration is cached if the standard cache
+                is not used.
+            resume_download (`bool`, *optional*, defaults to `False`):
+                Whether or not to resume downloading the model weights and configuration files. If set to `False`, any
+                incompletely downloaded files are deleted.
+            proxies (`Dict[str, str]`, *optional*):
+                A dictionary of proxy servers to use by protocol or endpoint, for example, `{'http': 'foo.bar:3128',
+                'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.
+            local_files_only (`bool`, *optional*, defaults to `False`):
+                Whether to only load local model weights and configuration files or not. If set to True, the model
+                won't be downloaded from the Hub.
+            token (`str` or *bool*, *optional*):
+                The token to use as HTTP bearer authorization for remote files. If `True`, the token generated from
+                `diffusers-cli login` (stored in `~/.huggingface`) is used.
+            revision (`str`, *optional*, defaults to `"main"`):
+                The specific model version to use. It can be a branch name, a tag name, a commit id, or any identifier
+                allowed by Git.
+            kwargs (remaining dictionary of keyword arguments, *optional*):
+                Can be used to overwrite load and saveable variables (for example the pipeline components of the
+                specific pipeline class). The overwritten components are directly passed to the pipelines `__init__`
+                method. See example below for more information.
+        ```py
+        >>> from diffusers import StableCascadeUNet
+        >>> ckpt_path = "https://huggingface.co/stabilityai/stable-cascade/blob/main/stage_b_lite.safetensors"
+        >>> model = StableCascadeUNet.from_single_file(ckpt_path)
+        ```
+        """
+        class_name = cls.__name__
+        if class_name not in SINGLE_FILE_LOADABLE_CLASSES:
+            raise ValueError(
+                f"FromOriginalModelMixin is currently only compatible with {', '.join(SINGLE_FILE_LOADABLE_CLASSES.keys())}"
+            )
+        pretrained_model_link_or_path = kwargs.get("pretrained_model_link_or_path", None)
+        if pretrained_model_link_or_path is not None:
+            deprecation_message = (
+                "Please use `pretrained_model_link_or_path_or_dict` argument instead for model classes"
+            )
+            deprecate("pretrained_model_link_or_path", "1.0.0", deprecation_message)
+            pretrained_model_link_or_path_or_dict = pretrained_model_link_or_path
+        config = kwargs.pop("config", None)
+        original_config = kwargs.pop("original_config", None)
+        if config is not None and original_config is not None:
+            raise ValueError(
+                "`from_single_file` cannot accept both `config` and `original_config` arguments. Please provide only one of these arguments"
+            )
+        resume_download = kwargs.pop("resume_download", False)
+        force_download = kwargs.pop("force_download", False)
+        proxies = kwargs.pop("proxies", None)
+        token = kwargs.pop("token", None)
+        cache_dir = kwargs.pop("cache_dir", None)
+        local_files_only = kwargs.pop("local_files_only", None)
+        subfolder = kwargs.pop("subfolder", None)
+        revision = kwargs.pop("revision", None)
+        torch_dtype = kwargs.pop("torch_dtype", None)
+        if isinstance(pretrained_model_link_or_path_or_dict, dict):
+            checkpoint = pretrained_model_link_or_path_or_dict
+        else:
+            checkpoint = load_single_file_checkpoint(
+                pretrained_model_link_or_path_or_dict,
+                resume_download=resume_download,
+                force_download=force_download,
+                proxies=proxies,
+                token=token,
+                cache_dir=cache_dir,
+                local_files_only=local_files_only,
+                revision=revision,
+            )
+        mapping_functions = SINGLE_FILE_LOADABLE_CLASSES[class_name]
+        checkpoint_mapping_fn = mapping_functions["checkpoint_mapping_fn"]
+        if original_config:
+            if "config_mapping_fn" in mapping_functions:
+                config_mapping_fn = mapping_functions["config_mapping_fn"]
+            else:
+                config_mapping_fn = None
+            if config_mapping_fn is None:
+                raise ValueError(
+                    (
+                        f"`original_config` has been provided for {class_name} but no mapping function "
+                        "was found to convert the original config to a Diffusers config in "
+                        "`diffusers.loaders.single_file_utils`"
+                    )
+                )
+            if isinstance(original_config, str):
+                # If original_config is a URL or filepath fetch the original_config dict
+                original_config = fetch_original_config(original_config, local_files_only=local_files_only)
+            config_mapping_kwargs = _get_mapping_function_kwargs(config_mapping_fn, **kwargs)
+            diffusers_model_config = config_mapping_fn(
+                original_config=original_config, checkpoint=checkpoint, **config_mapping_kwargs
+            )
+        else:
+            if config:
+                if isinstance(config, str):
+                    default_pretrained_model_config_name = config
+                else:
+                    raise ValueError(
+                        (
+                            "Invalid `config` argument. Please provide a string representing a repo id "
+                            "or path to a local Diffusers model repo."
+                        )
+                    )
+            else:
+                config = fetch_diffusers_config(checkpoint)
+                default_pretrained_model_config_name = config["pretrained_model_name_or_path"]
+                if "default_subfolder" in mapping_functions:
+                    subfolder = mapping_functions["default_subfolder"]
+                subfolder = subfolder or config.pop(
+                    "subfolder", None
+                )  # some configs contain a subfolder key, e.g. StableCascadeUNet
+            diffusers_model_config = cls.load_config(
+                pretrained_model_name_or_path=default_pretrained_model_config_name,
+                subfolder=subfolder,
+                local_files_only=local_files_only,
+            )
+            expected_kwargs, optional_kwargs = cls._get_signature_keys(cls)
+            # Map legacy kwargs to new kwargs
+            if "legacy_kwargs" in mapping_functions:
+                legacy_kwargs = mapping_functions["legacy_kwargs"]
+                for legacy_key, new_key in legacy_kwargs.items():
+                    if legacy_key in kwargs:
+                        kwargs[new_key] = kwargs.pop(legacy_key)
+            model_kwargs = {k: kwargs.get(k) for k in kwargs if k in expected_kwargs or k in optional_kwargs}
+            diffusers_model_config.update(model_kwargs)
+        checkpoint_mapping_kwargs = _get_mapping_function_kwargs(checkpoint_mapping_fn, **kwargs)
+        diffusers_format_checkpoint = checkpoint_mapping_fn(
+            config=diffusers_model_config, checkpoint=checkpoint, **checkpoint_mapping_kwargs
+        )
+        if not diffusers_format_checkpoint:
+            raise SingleFileComponentError(
+                f"Failed to load {class_name}. Weights for this component appear to be missing in the checkpoint."
+            )
+        ctx = init_empty_weights if is_accelerate_available() else nullcontext
+        with ctx():
+            model = cls.from_config(diffusers_model_config)
+        if is_accelerate_available():
+            unexpected_keys = load_model_dict_into_meta(model, diffusers_format_checkpoint, dtype=torch_dtype)
+            if model._keys_to_ignore_on_load_unexpected is not None:
+                for pat in model._keys_to_ignore_on_load_unexpected:
+                    unexpected_keys = [k for k in unexpected_keys if re.search(pat, k) is None]
+            if len(unexpected_keys) > 0:
+                logger.warning(
+                    f"Some weights of the model checkpoint were not used when initializing {cls.__name__}: \n {', '.join(unexpected_keys)}"
+                )
+        else:
+            model.load_state_dict(diffusers_format_checkpoint)
+        if torch_dtype is not None:
+            model.to(torch_dtype)
+        model.eval()
+        return model
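
The override handling in the `else` branch above (legacy kwargs renamed, then filtered against the `__init__` signature before updating the loaded config) can be sketched in isolation. This is a minimal illustration, not the Diffusers implementation: `ToyModel`, `get_signature_keys`, `build_model_kwargs` and the `{"image_size": "sample_size"}` legacy mapping are hypothetical names chosen for the example.

```python
import inspect


class ToyModel:
    """Hypothetical stand-in for a Diffusers model class."""

    def __init__(self, in_channels, sample_size=512, scaling_factor=0.18215):
        self.in_channels = in_channels
        self.sample_size = sample_size
        self.scaling_factor = scaling_factor


def get_signature_keys(obj):
    # Split __init__ parameters into required (no default) and optional (has default).
    parameters = inspect.signature(obj.__init__).parameters
    required = {k for k, v in parameters.items() if v.default is inspect.Parameter.empty} - {"self"}
    optional = {k for k, v in parameters.items() if v.default is not inspect.Parameter.empty}
    return required, optional


def build_model_kwargs(cls, kwargs, legacy_kwargs):
    # Work on a copy so the caller's kwargs are left untouched.
    kwargs = dict(kwargs)
    # Rename legacy keys to their current names.
    for legacy_key, new_key in legacy_kwargs.items():
        if legacy_key in kwargs:
            kwargs[new_key] = kwargs.pop(legacy_key)
    # Keep only keys that the model's __init__ actually accepts.
    expected, optional = get_signature_keys(cls)
    return {k: v for k, v in kwargs.items() if k in expected or k in optional}


overrides = build_model_kwargs(
    ToyModel,
    {"image_size": 256, "scaling_factor": 2.0, "unrelated": 1},
    legacy_kwargs={"image_size": "sample_size"},  # assumed mapping, for illustration only
)
print(overrides)
```

Here `image_size` is renamed to `sample_size`, `unrelated` is dropped because `ToyModel.__init__` does not accept it, and the remaining entries are what would be merged into the loaded config.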
--- a/src/diffusers/loaders/single_file_utils.py
+++ b/src/diffusers/loaders/single_file_utils.py
--- a/src/diffusers/loaders/unet.py
+++ b/src/diffusers/loaders/unet.py
@@ -44,11 +44,6 @@ from ..utils import (
    set_adapter_layers,
    set_weights_and_activate_adapters,
 )
-from .single_file_utils import (
-    convert_stable_cascade_unet_single_file_to_diffusers,
-    infer_stable_cascade_single_file_config,
-    load_single_file_model_checkpoint,
-)
 from .unet_loader_utils import _maybe_expand_lora_scales
 from .utils import AttnProcsLayers
@@ -1059,103 +1054,3 @@ class UNet2DConditionLoadersMixin:
                        }
                    )
        return lora_dicts
-class FromOriginalUNetMixin:
-    """
-    Load pretrained UNet model weights saved in the `.ckpt` or `.safetensors` format into a [`StableCascadeUNet`].
-    """
-    @classmethod
-    @validate_hf_hub_args
-    def from_single_file(cls, pretrained_model_link_or_path, **kwargs):
-        r"""
-        Instantiate a [`StableCascadeUNet`] from pretrained StableCascadeUNet weights saved in the original `.ckpt` or
-        `.safetensors` format. The pipeline is set in evaluation mode (`model.eval()`) by default.
-        Parameters:
-            pretrained_model_link_or_path (`str` or `os.PathLike`, *optional*):
-                Can be either:
-                    - A link to the `.ckpt` file (for example
-                      `"https://huggingface.co/<repo_id>/blob/main/<path_to_file>.ckpt"`) on the Hub.
-                    - A path to a *file* containing all pipeline weights.
-            config: (`dict`, *optional*):
-                Dictionary containing the configuration of the model:
-            torch_dtype (`str` or `torch.dtype`, *optional*):
-                Override the default `torch.dtype` and load the model with another dtype. If `"auto"` is passed, the
-                dtype is automatically derived from the model's weights.
-            force_download (`bool`, *optional*, defaults to `False`):
-                Whether or not to force the (re-)download of the model weights and configuration files, overriding the
-                cached versions if they exist.
-            cache_dir (`Union[str, os.PathLike]`, *optional*):
-                Path to a directory where a downloaded pretrained model configuration is cached if the standard cache
-                is not used.
-            resume_download:
-                Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v1
-                of Diffusers.
-            proxies (`Dict[str, str]`, *optional*):
-                A dictionary of proxy servers to use by protocol or endpoint, for example, `{'http': 'foo.bar:3128',
-                'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.
-            local_files_only (`bool`, *optional*, defaults to `False`):
-                Whether to only load local model weights and configuration files or not. If set to True, the model
-                won't be downloaded from the Hub.
-            token (`str` or *bool*, *optional*):
-                The token to use as HTTP bearer authorization for remote files. If `True`, the token generated from
-                `diffusers-cli login` (stored in `~/.huggingface`) is used.
-            revision (`str`, *optional*, defaults to `"main"`):
-                The specific model version to use. It can be a branch name, a tag name, a commit id, or any identifier
-                allowed by Git.
-            kwargs (remaining dictionary of keyword arguments, *optional*):
-                Can be used to overwrite load and saveable variables of the model.
-        """
-        class_name = cls.__name__
-        if class_name != "StableCascadeUNet":
-            raise ValueError("FromOriginalUNetMixin is currently only compatible with StableCascadeUNet")
-        config = kwargs.pop("config", None)
-        resume_download = kwargs.pop("resume_download", None)
-        force_download = kwargs.pop("force_download", False)
-        proxies = kwargs.pop("proxies", None)
-        token = kwargs.pop("token", None)
-        cache_dir = kwargs.pop("cache_dir", None)
-        local_files_only = kwargs.pop("local_files_only", None)
-        revision = kwargs.pop("revision", None)
-        torch_dtype = kwargs.pop("torch_dtype", None)
-        checkpoint = load_single_file_model_checkpoint(
-            pretrained_model_link_or_path,
-            resume_download=resume_download,
-            force_download=force_download,
-            proxies=proxies,
-            token=token,
-            cache_dir=cache_dir,
-            local_files_only=local_files_only,
-            revision=revision,
-        )
-        if config is None:
-            config = infer_stable_cascade_single_file_config(checkpoint)
-            model_config = cls.load_config(**config, **kwargs)
-        else:
-            model_config = config
-        ctx = init_empty_weights if is_accelerate_available() else nullcontext
-        with ctx():
-            model = cls.from_config(model_config, **kwargs)
-        diffusers_format_checkpoint = convert_stable_cascade_unet_single_file_to_diffusers(checkpoint)
-        if is_accelerate_available():
-            unexpected_keys = load_model_dict_into_meta(model, diffusers_format_checkpoint, dtype=torch_dtype)
-            if len(unexpected_keys) > 0:
-                logger.warning(
-                    f"Some weights of the model checkpoint were not used when initializing {cls.__name__}: \n {[', '.join(unexpected_keys)]}"
-                )
-        else:
-            model.load_state_dict(diffusers_format_checkpoint)
-        if torch_dtype is not None:
-            model.to(torch_dtype)
-        return model
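
The removed loader above warned about every unexpected key, whereas the new `FromOriginalModelMixin.from_single_file` first drops keys matching `_keys_to_ignore_on_load_unexpected`. A minimal sketch of that filtering step follows; the helper name `filter_unexpected_keys` and the sample key names are hypothetical and only illustrate the pattern.

```python
import re


def filter_unexpected_keys(unexpected_keys, ignore_patterns):
    """Drop keys that match any ignore pattern; keep the rest in order."""
    remaining = list(unexpected_keys)
    if ignore_patterns is None:
        return remaining
    for pat in ignore_patterns:
        # re.search matches anywhere in the key, not only at the start.
        remaining = [k for k in remaining if re.search(pat, k) is None]
    return remaining


keys = ["encoder.position_ids", "decoder.extra.weight"]
remaining = filter_unexpected_keys(keys, [r"position_ids"])
if remaining:
    print(f"Some weights were not used: {', '.join(remaining)}")
```

Only keys that survive the filter would trigger the warning, so keys a model class is known to ignore stay silent.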
--- a/src/diffusers/models/autoencoders/autoencoder_kl.py
+++ b/src/diffusers/models/autoencoders/autoencoder_kl.py
@@ -17,7 +17,7 @@ import torch
 import torch.nn as nn
 from ...configuration_utils import ConfigMixin, register_to_config
-from ...loaders import FromOriginalVAEMixin
+from ...loaders.single_file_model import FromOriginalModelMixin
 from ...utils.accelerate_utils import apply_forward_hook
 from ..attention_processor import (
    ADDED_KV_ATTENTION_PROCESSORS,
@@ -32,7 +32,7 @@ from ..modeling_utils import ModelMixin
 from .vae import Decoder, DecoderOutput, DiagonalGaussianDistribution, Encoder
-class AutoencoderKL(ModelMixin, ConfigMixin, FromOriginalVAEMixin):
+class AutoencoderKL(ModelMixin, ConfigMixin, FromOriginalModelMixin):
    r"""
    A VAE model with KL loss for encoding images into latents and decoding latent representations into images.

--- a/src/diffusers/models/controlnet.py
+++ b/src/diffusers/models/controlnet.py
@@ -19,7 +19,7 @@ from torch import nn
 from torch.nn import functional as F
 from ..configuration_utils import ConfigMixin, register_to_config
-from ..loaders import FromOriginalControlNetMixin
+from ..loaders.single_file_model import FromOriginalModelMixin
 from ..utils import BaseOutput, logging
 from .attention_processor import (
    ADDED_KV_ATTENTION_PROCESSORS,
@@ -108,7 +108,7 @@ class ControlNetConditioningEmbedding(nn.Module):
        return embedding
-class ControlNetModel(ModelMixin, ConfigMixin, FromOriginalControlNetMixin):
+class ControlNetModel(ModelMixin, ConfigMixin, FromOriginalModelMixin):
    """
    A ControlNet model.

--- a/src/diffusers/models/modeling_utils.py
+++ b/src/diffusers/models/modeling_utils.py
@@ -963,6 +963,15 @@ class ModelMixin(torch.nn.Module, PushToHubMixin):
        return model, missing_keys, unexpected_keys, mismatched_keys, error_msgs
+    @classmethod
+    def _get_signature_keys(cls, obj):
+        parameters = inspect.signature(obj.__init__).parameters
+        required_parameters = {k: v for k, v in parameters.items() if v.default is inspect.Parameter.empty}
+        optional_parameters = {k for k, v in parameters.items() if v.default is not inspect.Parameter.empty}
+        expected_modules = set(required_parameters.keys()) - {"self"}
+        return expected_modules, optional_parameters
    # Adapted from `transformers` modeling_utils.py
    def _get_no_split_modules(self, device_map: str):
        """

--- a/src/diffusers/models/unets/unet_2d_condition.py
+++ b/src/diffusers/models/unets/unet_2d_condition.py
@@ -20,6 +20,7 @@ import torch.utils.checkpoint
 from ...configuration_utils import ConfigMixin, register_to_config
 from ...loaders import PeftAdapterMixin, UNet2DConditionLoadersMixin
+from ...loaders.single_file_model import FromOriginalModelMixin
 from ...utils import USE_PEFT_BACKEND, BaseOutput, deprecate, logging, scale_lora_layers, unscale_lora_layers
 from ..activations import get_activation
 from ..attention_processor import (
@@ -66,7 +67,9 @@ class UNet2DConditionOutput(BaseOutput):
    sample: torch.FloatTensor = None
-class UNet2DConditionModel(ModelMixin, ConfigMixin, UNet2DConditionLoadersMixin, PeftAdapterMixin):
+class UNet2DConditionModel(
+    ModelMixin, ConfigMixin, FromOriginalModelMixin, UNet2DConditionLoadersMixin, PeftAdapterMixin
+):
    r"""
    A conditional 2D UNet model that takes a noisy sample, conditional state, and a timestep and returns a sample
    shaped output.

--- a/src/diffusers/models/unets/unet_stable_cascade.py
+++ b/src/diffusers/models/unets/unet_stable_cascade.py
@@ -21,7 +21,7 @@ import torch
 import torch.nn as nn
 from ...configuration_utils import ConfigMixin, register_to_config
-from ...loaders.unet import FromOriginalUNetMixin
+from ...loaders import FromOriginalModelMixin
 from ...utils import BaseOutput
 from ..attention_processor import Attention
 from ..modeling_utils import ModelMixin
@@ -134,7 +134,7 @@ class StableCascadeUNetOutput(BaseOutput):
    sample: torch.FloatTensor = None
-class StableCascadeUNet(ModelMixin, ConfigMixin, FromOriginalUNetMixin):
+class StableCascadeUNet(ModelMixin, ConfigMixin, FromOriginalModelMixin):
    _supports_gradient_checkpointing = True
    @register_to_config

--- a/src/diffusers/pipelines/pipeline_loading_utils.py
+++ b/src/diffusers/pipelines/pipeline_loading_utils.py
@@ -609,6 +609,7 @@ def load_sub_model(
 ):
    """Helper method to load the module `name` from `library_name` and `class_name`"""
    # retrieve class candidates
    class_obj, class_candidates = get_class_obj_and_candidates(
        library_name,
        class_name,

--- a/tests/models/autoencoders/test_models_vae.py
+++ b/tests/models/autoencoders/test_models_vae.py
@@ -791,62 +791,6 @@ class AutoencoderKLIntegrationTests(unittest.TestCase):
        tolerance = 3e-3 if torch_device != "mps" else 1e-2
        assert torch_all_close(output_slice, expected_output_slice, atol=tolerance)
-    def test_stable_diffusion_model_local(self):
-        model_id = "stabilityai/sd-vae-ft-mse"
-        model_1 = AutoencoderKL.from_pretrained(model_id).to(torch_device)
-        url = "https://huggingface.co/stabilityai/sd-vae-ft-mse-original/blob/main/vae-ft-mse-840000-ema-pruned.safetensors"
-        model_2 = AutoencoderKL.from_single_file(url).to(torch_device)
-        image = self.get_sd_image(33)
-        with torch.no_grad():
-            sample_1 = model_1(image).sample
-            sample_2 = model_2(image).sample
-        assert sample_1.shape == sample_2.shape
-        output_slice_1 = sample_1[-1, -2:, -2:, :2].flatten().float().cpu()
-        output_slice_2 = sample_2[-1, -2:, -2:, :2].flatten().float().cpu()
-        assert torch_all_close(output_slice_1, output_slice_2, atol=3e-3)
-    def test_single_file_component_configs(self):
-        vae_single_file = AutoencoderKL.from_single_file(
-            "https://huggingface.co/stabilityai/sd-vae-ft-mse-original/blob/main/vae-ft-mse-840000-ema-pruned.safetensors"
-        )
-        vae = AutoencoderKL.from_pretrained("CompVis/stable-diffusion-v1-4", subfolder="vae")
-        PARAMS_TO_IGNORE = ["torch_dtype", "_name_or_path", "_use_default_values"]
-        for param_name, param_value in vae_single_file.config.items():
-            if param_name in PARAMS_TO_IGNORE:
-                continue
-            assert (
-                vae.config[param_name] == param_value
-            ), f"{param_name} differs between single file loading and pretrained loading"
-    def test_single_file_arguments(self):
-        vae_default = AutoencoderKL.from_single_file(
-            "https://huggingface.co/stabilityai/sd-vae-ft-mse-original/blob/main/vae-ft-mse-840000-ema-pruned.safetensors",
-        )
-        assert vae_default.config.scaling_factor == 0.18215
-        assert vae_default.config.sample_size == 512
-        assert vae_default.dtype == torch.float32
-        scaling_factor = 2.0
-        image_size = 256
-        torch_dtype = torch.float16
-        vae = AutoencoderKL.from_single_file(
-            "https://huggingface.co/stabilityai/sd-vae-ft-mse-original/blob/main/vae-ft-mse-840000-ema-pruned.safetensors",
-            image_size=image_size,
-            scaling_factor=scaling_factor,
-            torch_dtype=torch_dtype,
-        )
-        assert vae.config.scaling_factor == scaling_factor
-        assert vae.config.sample_size == image_size
-        assert vae.dtype == torch_dtype
 @slow
 class AsymmetricAutoencoderKLIntegrationTests(unittest.TestCase):

--- a/tests/models/unets/test_models_unet_stable_cascade.py
+++ b/tests/models/unets/test_models_unet_stable_cascade.py
@@ -56,7 +56,7 @@ class StableCascadeUNetModelSlowTests(unittest.TestCase):
        gc.collect()
        torch.cuda.empty_cache()
-        PARAMS_TO_IGNORE = ["torch_dtype", "_name_or_path", "_use_default_values"]
+        PARAMS_TO_IGNORE = ["torch_dtype", "_name_or_path", "_use_default_values", "_diffusers_version"]
        for param_name, param_value in single_file_unet_config.items():
            if param_name in PARAMS_TO_IGNORE:
                continue
@@ -78,7 +78,7 @@ class StableCascadeUNetModelSlowTests(unittest.TestCase):
        gc.collect()
        torch.cuda.empty_cache()
-        PARAMS_TO_IGNORE = ["torch_dtype", "_name_or_path", "_use_default_values"]
+        PARAMS_TO_IGNORE = ["torch_dtype", "_name_or_path", "_use_default_values", "_diffusers_version"]
        for param_name, param_value in single_file_unet_config.items():
            if param_name in PARAMS_TO_IGNORE:
                continue
@@ -97,7 +97,7 @@ class StableCascadeUNetModelSlowTests(unittest.TestCase):
        gc.collect()
        torch.cuda.empty_cache()
-        PARAMS_TO_IGNORE = ["torch_dtype", "_name_or_path", "_use_default_values"]
+        PARAMS_TO_IGNORE = ["torch_dtype", "_name_or_path", "_use_default_values", "_diffusers_version"]
        for param_name, param_value in config.items():
            if param_name in PARAMS_TO_IGNORE:
                continue

--- a/tests/pipelines/controlnet/test_controlnet.py
+++ b/tests/pipelines/controlnet/test_controlnet.py
@@ -38,7 +38,6 @@ from diffusers.utils.testing_utils import (
    get_python_version,
    load_image,
    load_numpy,
-    numpy_cosine_similarity_distance,
    require_python39_or_higher,
    require_torch_2,
    require_torch_gpu,
@@ -1063,97 +1062,6 @@ class ControlNetPipelineSlowTests(unittest.TestCase):
        expected_slice = np.array([0.1338, 0.1597, 0.1202, 0.1687, 0.1377, 0.1017, 0.2070, 0.1574, 0.1348])
        assert np.abs(image_slice.flatten() - expected_slice).max() < 1e-2
-    def test_load_local(self):
-        controlnet = ControlNetModel.from_pretrained("lllyasviel/control_v11p_sd15_canny")
-        pipe = StableDiffusionControlNetPipeline.from_pretrained(
-            "runwayml/stable-diffusion-v1-5", safety_checker=None, controlnet=controlnet
-        )
-        pipe.unet.set_default_attn_processor()
-        pipe.enable_model_cpu_offload()
-        controlnet = ControlNetModel.from_single_file(
-            "https://huggingface.co/lllyasviel/ControlNet-v1-1/blob/main/control_v11p_sd15_canny.pth"
-        )
-        pipe_sf = StableDiffusionControlNetPipeline.from_single_file(
-            "https://huggingface.co/runwayml/stable-diffusion-v1-5/blob/main/v1-5-pruned-emaonly.safetensors",
-            safety_checker=None,
-            controlnet=controlnet,
-            scheduler_type="pndm",
-        )
-        pipe_sf.unet.set_default_attn_processor()
-        pipe_sf.enable_model_cpu_offload()
-        control_image = load_image(
-            "https://huggingface.co/datasets/hf-internal-testing/diffusers-images/resolve/main/sd_controlnet/bird_canny.png"
-        ).resize((512, 512))
-        prompt = "bird"
-        generator = torch.Generator(device="cpu").manual_seed(0)
-        output = pipe(
-            prompt,
-            image=control_image,
-            generator=generator,
-            output_type="np",
-            num_inference_steps=3,
-        ).images[0]
-        generator = torch.Generator(device="cpu").manual_seed(0)
-        output_sf = pipe_sf(
-            prompt,
-            image=control_image,
-            generator=generator,
-            output_type="np",
-            num_inference_steps=3,
-        ).images[0]
-        max_diff = numpy_cosine_similarity_distance(output_sf.flatten(), output.flatten())
-        assert max_diff < 1e-3
-    def test_single_file_component_configs(self):
-        controlnet = ControlNetModel.from_pretrained("lllyasviel/control_v11p_sd15_canny", variant="fp16")
-        pipe = StableDiffusionControlNetPipeline.from_pretrained(
-            "runwayml/stable-diffusion-v1-5", variant="fp16", safety_checker=None, controlnet=controlnet
-        )
-        controlnet_single_file = ControlNetModel.from_single_file(
-            "https://huggingface.co/lllyasviel/ControlNet-v1-1/blob/main/control_v11p_sd15_canny.pth"
-        )
-        single_file_pipe = StableDiffusionControlNetPipeline.from_single_file(
-            "https://huggingface.co/runwayml/stable-diffusion-v1-5/blob/main/v1-5-pruned-emaonly.safetensors",
-            safety_checker=None,
-            controlnet=controlnet_single_file,
-            scheduler_type="pndm",
-        )
-        PARAMS_TO_IGNORE = ["torch_dtype", "_name_or_path", "architectures", "_use_default_values"]
-        for param_name, param_value in single_file_pipe.controlnet.config.items():
-            if param_name in PARAMS_TO_IGNORE:
-                continue
-            # This parameter doesn't appear to be loaded from the config.
-            # So when it is registered to config, it remains a tuple as this is the default in the class definition
-            # from_pretrained, does load from config and converts to a list when registering to config
-            if param_name == "conditioning_embedding_out_channels" and isinstance(param_value, tuple):
-                param_value = list(param_value)
-            assert (
-                pipe.controlnet.config[param_name] == param_value
-            ), f"{param_name} differs between single file loading and pretrained loading"
-        for param_name, param_value in single_file_pipe.unet.config.items():
-            if param_name in PARAMS_TO_IGNORE:
-                continue
-            assert (
-                pipe.unet.config[param_name] == param_value
-            ), f"{param_name} differs between single file loading and pretrained loading"
-        for param_name, param_value in single_file_pipe.vae.config.items():
-            if param_name in PARAMS_TO_IGNORE:
-                continue
-            assert (
-                pipe.vae.config[param_name] == param_value
-            ), f"{param_name} differs between single file loading and pretrained loading"
 @slow
 @require_torch_gpu

--- a/tests/pipelines/controlnet/test_controlnet_img2img.py
+++ b/tests/pipelines/controlnet/test_controlnet_img2img.py
@@ -39,7 +39,6 @@ from diffusers.utils.testing_utils import (
    enable_full_determinism,
    floats_tensor,
    load_numpy,
-    numpy_cosine_similarity_distance,
    require_torch_gpu,
    slow,
    torch_device,
@@ -441,56 +440,3 @@ class ControlNetImg2ImgPipelineSlowTests(unittest.TestCase):
        )
        assert np.abs(expected_image - image).max() < 9e-2
-    def test_load_local(self):
-        controlnet = ControlNetModel.from_pretrained("lllyasviel/control_v11p_sd15_canny")
-        pipe = StableDiffusionControlNetImg2ImgPipeline.from_pretrained(
-            "runwayml/stable-diffusion-v1-5", safety_checker=None, controlnet=controlnet
-        )
-        pipe.unet.set_default_attn_processor()
-        pipe.enable_model_cpu_offload()
-        controlnet = ControlNetModel.from_single_file(
-            "https://huggingface.co/lllyasviel/ControlNet-v1-1/blob/main/control_v11p_sd15_canny.pth"
-        )
-        pipe_sf = StableDiffusionControlNetImg2ImgPipeline.from_single_file(
-            "https://huggingface.co/runwayml/stable-diffusion-v1-5/blob/main/v1-5-pruned-emaonly.safetensors",
-            safety_checker=None,
-            controlnet=controlnet,
-            scheduler_type="pndm",
-        )
-        pipe_sf.unet.set_default_attn_processor()
-        pipe_sf.enable_model_cpu_offload()
-        control_image = load_image(
-            "https://huggingface.co/datasets/hf-internal-testing/diffusers-images/resolve/main/sd_controlnet/bird_canny.png"
-        ).resize((512, 512))
-        image = load_image(
-            "https://huggingface.co/lllyasviel/sd-controlnet-canny/resolve/main/images/bird.png"
-        ).resize((512, 512))
-        prompt = "bird"
-        generator = torch.Generator(device="cpu").manual_seed(0)
-        output = pipe(
-            prompt,
-            image=image,
-            control_image=control_image,
-            strength=0.9,
-            generator=generator,
-            output_type="np",
-            num_inference_steps=3,
-        ).images[0]
-        generator = torch.Generator(device="cpu").manual_seed(0)
-        output_sf = pipe_sf(
-            prompt,
-            image=image,
-            control_image=control_image,
-            strength=0.9,
-            generator=generator,
-            output_type="np",
-            num_inference_steps=3,
-        ).images[0]
-        max_diff = numpy_cosine_similarity_distance(output_sf.flatten(), output.flatten())
-        assert max_diff < 1e-3
--- a/tests/pipelines/controlnet/test_controlnet_inpaint.py
+++ b/tests/pipelines/controlnet/test_controlnet_inpaint.py
@@ -556,55 +556,3 @@ class ControlNetInpaintPipelineSlowTests(unittest.TestCase):
        )
        assert numpy_cosine_similarity_distance(expected_image.flatten(), image.flatten()) < 1e-2
-    def test_load_local(self):
-        controlnet = ControlNetModel.from_pretrained("lllyasviel/control_v11p_sd15_canny")
-        pipe_1 = StableDiffusionControlNetInpaintPipeline.from_pretrained(
-            "runwayml/stable-diffusion-inpainting", safety_checker=None, controlnet=controlnet
-        )
-        controlnet = ControlNetModel.from_single_file(
-            "https://huggingface.co/lllyasviel/ControlNet-v1-1/blob/main/control_v11p_sd15_canny.pth"
-        )
-        pipe_2 = StableDiffusionControlNetInpaintPipeline.from_single_file(
-            "https://huggingface.co/runwayml/stable-diffusion-inpainting/blob/main/sd-v1-5-inpainting.ckpt",
-            safety_checker=None,
-            controlnet=controlnet,
-        )
-        control_image = load_image(
-            "https://huggingface.co/datasets/hf-internal-testing/diffusers-images/resolve/main/sd_controlnet/bird_canny.png"
-        ).resize((512, 512))
-        image = load_image(
-            "https://huggingface.co/lllyasviel/sd-controlnet-canny/resolve/main/images/bird.png"
-        ).resize((512, 512))
-        mask_image = load_image(
-            "https://huggingface.co/datasets/diffusers/test-arrays/resolve/main"
-            "/stable_diffusion_inpaint/input_bench_mask.png"
-        ).resize((512, 512))
-        pipes = [pipe_1, pipe_2]
-        images = []
-        for pipe in pipes:
-            pipe.enable_model_cpu_offload()
-            pipe.set_progress_bar_config(disable=None)
-            generator = torch.Generator(device="cpu").manual_seed(0)
-            prompt = "bird"
-            output = pipe(
-                prompt,
-                image=image,
-                control_image=control_image,
-                mask_image=mask_image,
-                strength=0.9,
-                generator=generator,
-                output_type="np",
-                num_inference_steps=3,
-            )
-            images.append(output.images[0])
-            del pipe
-            gc.collect()
-            torch.cuda.empty_cache()
-        max_diff = numpy_cosine_similarity_distance(images[0].flatten(), images[1].flatten())
-        assert max_diff < 1e-3