Unverified Commit 14e3a28c authored by Naoki Ainoya, committed by GitHub

Rename 'CLIPFeatureExtractor' class to 'CLIPImageProcessor' (#2732)

The 'CLIPFeatureExtractor' class has been renamed to 'CLIPImageProcessor', since the old name is slated for deprecation in transformers. This commit updates the affected files accordingly.
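For downstream code that has to run on both sides of this rename, a minimal compatibility sketch (the try/except fallback and the checkpoint name are illustrative, not part of this commit):

```python
# Prefer the new class name; fall back on transformers releases
# that predate the rename.
try:
    from transformers import CLIPImageProcessor
except ImportError:
    from transformers import CLIPFeatureExtractor as CLIPImageProcessor

# Both names load the same preprocessing config, so this works either way
# (checkpoint name is just an example).
feature_extractor = CLIPImageProcessor.from_pretrained("openai/clip-vit-base-patch32")
```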
parent 8e35ef01
@@ -19,9 +19,9 @@ components - all of which are needed to have a functioning end-to-end diffusion
 As an example, [Stable Diffusion](https://huggingface.co/blog/stable_diffusion) has three independently trained models:
 - [Autoencoder](./api/models#vae)
 - [Conditional Unet](./api/models#UNet2DConditionModel)
-- [CLIP text encoder](https://huggingface.co/docs/transformers/v4.21.2/en/model_doc/clip#transformers.CLIPTextModel)
+- [CLIP text encoder](https://huggingface.co/docs/transformers/v4.27.1/en/model_doc/clip#transformers.CLIPTextModel)
 - a scheduler component, [scheduler](./api/scheduler#pndm),
-- a [CLIPFeatureExtractor](https://huggingface.co/docs/transformers/v4.21.2/en/model_doc/clip#transformers.CLIPFeatureExtractor),
+- a [CLIPImageProcessor](https://huggingface.co/docs/transformers/v4.27.1/en/model_doc/clip#transformers.CLIPImageProcessor),
 - as well as a [safety checker](./stable_diffusion#safety_checker).
 All of these components are necessary to run stable diffusion in inference even though they were trained
 or created independently from each other.
...
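As a quick illustration of the documentation hunk above (a sketch, not part of the commit): the listed components surface as attributes on a loaded pipeline. The checkpoint name is assumed for the example:

```python
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")

print(type(pipe.vae).__name__)                # Autoencoder (AutoencoderKL)
print(type(pipe.unet).__name__)               # Conditional UNet (UNet2DConditionModel)
print(type(pipe.text_encoder).__name__)       # CLIP text encoder (CLIPTextModel)
print(type(pipe.scheduler).__name__)          # scheduler (PNDMScheduler for this checkpoint)
print(type(pipe.feature_extractor).__name__)  # CLIPImageProcessor after this commit
print(type(pipe.safety_checker).__name__)     # StableDiffusionSafetyChecker
```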
@@ -45,11 +45,11 @@ The following code requires roughly 12GB of GPU RAM.
 ```python
 from diffusers import DiffusionPipeline
-from transformers import CLIPFeatureExtractor, CLIPModel
+from transformers import CLIPImageProcessor, CLIPModel
 import torch
-feature_extractor = CLIPFeatureExtractor.from_pretrained("laion/CLIP-ViT-B-32-laion2B-s34B-b79K")
+feature_extractor = CLIPImageProcessor.from_pretrained("laion/CLIP-ViT-B-32-laion2B-s34B-b79K")
 clip_model = CLIPModel.from_pretrained("laion/CLIP-ViT-B-32-laion2B-s34B-b79K", torch_dtype=torch.float16)
...
@@ -50,11 +50,11 @@ and passing pipeline modules directly.
 ```python
 from diffusers import DiffusionPipeline
-from transformers import CLIPFeatureExtractor, CLIPModel
+from transformers import CLIPImageProcessor, CLIPModel
 clip_model_id = "laion/CLIP-ViT-B-32-laion2B-s34B-b79K"
-feature_extractor = CLIPFeatureExtractor.from_pretrained(clip_model_id)
+feature_extractor = CLIPImageProcessor.from_pretrained(clip_model_id)
 clip_model = CLIPModel.from_pretrained(clip_model_id)
 pipeline = DiffusionPipeline.from_pretrained(
...
@@ -415,7 +415,7 @@ print(pipe)
 StableDiffusionPipeline {
   "feature_extractor": [
     "transformers",
-    "CLIPFeatureExtractor"
+    "CLIPImageProcessor"
   ],
   "safety_checker": [
     "stable_diffusion",
@@ -445,7 +445,7 @@ StableDiffusionPipeline {
 ```
 First, we see that the official pipeline is the [`StableDiffusionPipeline`], and second we see that the `StableDiffusionPipeline` consists of 7 components:
-- `"feature_extractor"` of class `CLIPFeatureExtractor` as defined [in `transformers`](https://huggingface.co/docs/transformers/main/en/model_doc/clip#transformers.CLIPFeatureExtractor).
+- `"feature_extractor"` of class `CLIPImageProcessor` as defined [in `transformers`](https://huggingface.co/docs/transformers/main/en/model_doc/clip#transformers.CLIPImageProcessor).
 - `"safety_checker"` as defined [here](https://github.com/huggingface/diffusers/blob/e55687e1e15407f60f32242027b7bb8170e58266/src/diffusers/pipelines/stable_diffusion/safety_checker.py#L32).
 - `"scheduler"` of class [`PNDMScheduler`].
 - `"text_encoder"` of class `CLIPTextModel` as defined [in `transformers`](https://huggingface.co/docs/transformers/main/en/model_doc/clip#transformers.CLIPTextModel).
@@ -493,7 +493,7 @@ In the case of `runwayml/stable-diffusion-v1-5` the `model_index.json` is theref
   "_diffusers_version": "0.6.0",
   "feature_extractor": [
     "transformers",
-    "CLIPFeatureExtractor"
+    "CLIPImageProcessor"
   ],
   "safety_checker": [
     "stable_diffusion",
...
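The two hunks above change the same class name in the printed pipeline config and in `model_index.json`. As a brief sketch of the correspondence (checkpoint name assumed): `from_pretrained` reads `model_index.json`, instantiates each listed `(library, class)` pair, and `pipe.components` exposes the results under the same keys.

```python
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")

# The "feature_extractor" entry in model_index.json now names
# ("transformers", "CLIPImageProcessor"), which is what gets loaded:
print(type(pipe.components["feature_extractor"]).__name__)  # CLIPImageProcessor
```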
@@ -50,11 +50,11 @@ The following code requires roughly 12GB of GPU RAM.
 ```python
 from diffusers import DiffusionPipeline
-from transformers import CLIPFeatureExtractor, CLIPModel
+from transformers import CLIPImageProcessor, CLIPModel
 import torch
-feature_extractor = CLIPFeatureExtractor.from_pretrained("laion/CLIP-ViT-B-32-laion2B-s34B-b79K")
+feature_extractor = CLIPImageProcessor.from_pretrained("laion/CLIP-ViT-B-32-laion2B-s34B-b79K")
 clip_model = CLIPModel.from_pretrained("laion/CLIP-ViT-B-32-laion2B-s34B-b79K", torch_dtype=torch.float16)
...
@@ -5,7 +5,7 @@ import torch
 from torch import nn
 from torch.nn import functional as F
 from torchvision import transforms
-from transformers import CLIPFeatureExtractor, CLIPModel, CLIPTextModel, CLIPTokenizer
+from transformers import CLIPImageProcessor, CLIPModel, CLIPTextModel, CLIPTokenizer
 from diffusers import (
     AutoencoderKL,
@@ -64,7 +64,7 @@ class CLIPGuidedStableDiffusion(DiffusionPipeline):
         tokenizer: CLIPTokenizer,
         unet: UNet2DConditionModel,
         scheduler: Union[PNDMScheduler, LMSDiscreteScheduler, DDIMScheduler],
-        feature_extractor: CLIPFeatureExtractor,
+        feature_extractor: CLIPImageProcessor,
     ):
         super().__init__()
         self.register_modules(
...
@@ -17,7 +17,7 @@ from typing import Callable, List, Optional, Union
 import torch
 from packaging import version
-from transformers import CLIPFeatureExtractor, CLIPTextModel, CLIPTokenizer
+from transformers import CLIPImageProcessor, CLIPTextModel, CLIPTokenizer
 from diffusers import DiffusionPipeline
 from diffusers.configuration_utils import FrozenDict
@@ -64,7 +64,7 @@ class ComposableStableDiffusionPipeline(DiffusionPipeline):
         safety_checker ([`StableDiffusionSafetyChecker`]):
             Classification module that estimates whether generated images could be considered offensive or harmful.
             Please, refer to the [model card](https://huggingface.co/runwayml/stable-diffusion-v1-5) for details.
-        feature_extractor ([`CLIPFeatureExtractor`]):
+        feature_extractor ([`CLIPImageProcessor`]):
             Model that extracts features from generated images to be used as inputs for the `safety_checker`.
     """
     _optional_components = ["safety_checker", "feature_extractor"]
@@ -84,7 +84,7 @@ class ComposableStableDiffusionPipeline(DiffusionPipeline):
             DPMSolverMultistepScheduler,
         ],
         safety_checker: StableDiffusionSafetyChecker,
-        feature_extractor: CLIPFeatureExtractor,
+        feature_extractor: CLIPImageProcessor,
         requires_safety_checker: bool = True,
     ):
         super().__init__()
...
@@ -15,7 +15,7 @@ from accelerate import Accelerator
 # TODO: remove and import from diffusers.utils when the new version of diffusers is released
 from packaging import version
 from tqdm.auto import tqdm
-from transformers import CLIPFeatureExtractor, CLIPTextModel, CLIPTokenizer
+from transformers import CLIPImageProcessor, CLIPTextModel, CLIPTokenizer
 from diffusers import DiffusionPipeline
 from diffusers.models import AutoencoderKL, UNet2DConditionModel
@@ -80,7 +80,7 @@ class ImagicStableDiffusionPipeline(DiffusionPipeline):
         safety_checker ([`StableDiffusionSafetyChecker`]):
             Classification module that estimates whether generated images could be considered offsensive or harmful.
             Please, refer to the [model card](https://huggingface.co/CompVis/stable-diffusion-v1-4) for details.
-        feature_extractor ([`CLIPFeatureExtractor`]):
+        feature_extractor ([`CLIPImageProcessor`]):
             Model that extracts features from generated images to be used as inputs for the `safety_checker`.
     """
@@ -92,7 +92,7 @@ class ImagicStableDiffusionPipeline(DiffusionPipeline):
         unet: UNet2DConditionModel,
         scheduler: Union[DDIMScheduler, PNDMScheduler, LMSDiscreteScheduler],
         safety_checker: StableDiffusionSafetyChecker,
-        feature_extractor: CLIPFeatureExtractor,
+        feature_extractor: CLIPImageProcessor,
     ):
         super().__init__()
         self.register_modules(
...
@@ -4,7 +4,7 @@ from typing import Callable, List, Optional, Tuple, Union
 import numpy as np
 import PIL
 import torch
-from transformers import CLIPFeatureExtractor, CLIPTextModel, CLIPTokenizer
+from transformers import CLIPImageProcessor, CLIPTextModel, CLIPTokenizer
 from diffusers import DiffusionPipeline
 from diffusers.configuration_utils import FrozenDict
@@ -79,7 +79,7 @@ class ImageToImageInpaintingPipeline(DiffusionPipeline):
         safety_checker ([`StableDiffusionSafetyChecker`]):
             Classification module that estimates whether generated images could be considered offensive or harmful.
             Please, refer to the [model card](https://huggingface.co/runwayml/stable-diffusion-v1-5) for details.
-        feature_extractor ([`CLIPFeatureExtractor`]):
+        feature_extractor ([`CLIPImageProcessor`]):
             Model that extracts features from generated images to be used as inputs for the `safety_checker`.
     """
@@ -91,7 +91,7 @@ class ImageToImageInpaintingPipeline(DiffusionPipeline):
         unet: UNet2DConditionModel,
         scheduler: Union[DDIMScheduler, PNDMScheduler, LMSDiscreteScheduler],
         safety_checker: StableDiffusionSafetyChecker,
-        feature_extractor: CLIPFeatureExtractor,
+        feature_extractor: CLIPImageProcessor,
     ):
         super().__init__()
...
@@ -5,7 +5,7 @@ from typing import Callable, List, Optional, Union
 import numpy as np
 import torch
-from transformers import CLIPFeatureExtractor, CLIPTextModel, CLIPTokenizer
+from transformers import CLIPImageProcessor, CLIPTextModel, CLIPTokenizer
 from diffusers import DiffusionPipeline
 from diffusers.configuration_utils import FrozenDict
@@ -70,7 +70,7 @@ class StableDiffusionWalkPipeline(DiffusionPipeline):
         safety_checker ([`StableDiffusionSafetyChecker`]):
             Classification module that estimates whether generated images could be considered offensive or harmful.
             Please, refer to the [model card](https://huggingface.co/CompVis/stable-diffusion-v1-4) for details.
-        feature_extractor ([`CLIPFeatureExtractor`]):
+        feature_extractor ([`CLIPImageProcessor`]):
             Model that extracts features from generated images to be used as inputs for the `safety_checker`.
     """
@@ -82,7 +82,7 @@ class StableDiffusionWalkPipeline(DiffusionPipeline):
         unet: UNet2DConditionModel,
         scheduler: Union[DDIMScheduler, PNDMScheduler, LMSDiscreteScheduler],
         safety_checker: StableDiffusionSafetyChecker,
-        feature_extractor: CLIPFeatureExtractor,
+        feature_extractor: CLIPImageProcessor,
     ):
         super().__init__()
...
@@ -6,7 +6,7 @@ import numpy as np
 import PIL
 import torch
 from packaging import version
-from transformers import CLIPFeatureExtractor, CLIPTextModel, CLIPTokenizer
+from transformers import CLIPImageProcessor, CLIPTextModel, CLIPTokenizer
 import diffusers
 from diffusers import SchedulerMixin, StableDiffusionPipeline
@@ -422,7 +422,7 @@ class StableDiffusionLongPromptWeightingPipeline(StableDiffusionPipeline):
         safety_checker ([`StableDiffusionSafetyChecker`]):
             Classification module that estimates whether generated images could be considered offensive or harmful.
             Please, refer to the [model card](https://huggingface.co/CompVis/stable-diffusion-v1-4) for details.
-        feature_extractor ([`CLIPFeatureExtractor`]):
+        feature_extractor ([`CLIPImageProcessor`]):
             Model that extracts features from generated images to be used as inputs for the `safety_checker`.
     """
@@ -436,7 +436,7 @@ class StableDiffusionLongPromptWeightingPipeline(StableDiffusionPipeline):
         unet: UNet2DConditionModel,
         scheduler: SchedulerMixin,
         safety_checker: StableDiffusionSafetyChecker,
-        feature_extractor: CLIPFeatureExtractor,
+        feature_extractor: CLIPImageProcessor,
         requires_safety_checker: bool = True,
     ):
         super().__init__(
@@ -461,7 +461,7 @@ class StableDiffusionLongPromptWeightingPipeline(StableDiffusionPipeline):
         unet: UNet2DConditionModel,
         scheduler: SchedulerMixin,
         safety_checker: StableDiffusionSafetyChecker,
-        feature_extractor: CLIPFeatureExtractor,
+        feature_extractor: CLIPImageProcessor,
     ):
         super().__init__(
             vae=vae,
...
@@ -6,7 +6,7 @@ import numpy as np
 import PIL
 import torch
 from packaging import version
-from transformers import CLIPFeatureExtractor, CLIPTokenizer
+from transformers import CLIPImageProcessor, CLIPTokenizer
 import diffusers
 from diffusers import OnnxRuntimeModel, OnnxStableDiffusionPipeline, SchedulerMixin
@@ -441,7 +441,7 @@ class OnnxStableDiffusionLongPromptWeightingPipeline(OnnxStableDiffusionPipeline
         unet: OnnxRuntimeModel,
         scheduler: SchedulerMixin,
         safety_checker: OnnxRuntimeModel,
-        feature_extractor: CLIPFeatureExtractor,
+        feature_extractor: CLIPImageProcessor,
         requires_safety_checker: bool = True,
     ):
         super().__init__(
@@ -468,7 +468,7 @@ class OnnxStableDiffusionLongPromptWeightingPipeline(OnnxStableDiffusionPipeline
         unet: OnnxRuntimeModel,
         scheduler: SchedulerMixin,
         safety_checker: OnnxRuntimeModel,
-        feature_extractor: CLIPFeatureExtractor,
+        feature_extractor: CLIPImageProcessor,
     ):
         super().__init__(
             vae_encoder=vae_encoder,
...
@@ -3,7 +3,7 @@ from typing import Callable, List, Optional, Union
 import torch
 from transformers import (
-    CLIPFeatureExtractor,
+    CLIPImageProcessor,
     CLIPTextModel,
     CLIPTokenizer,
     MBart50TokenizerFast,
@@ -79,7 +79,7 @@ class MultilingualStableDiffusion(DiffusionPipeline):
         safety_checker ([`StableDiffusionSafetyChecker`]):
             Classification module that estimates whether generated images could be considered offensive or harmful.
             Please, refer to the [model card](https://huggingface.co/runwayml/stable-diffusion-v1-5) for details.
-        feature_extractor ([`CLIPFeatureExtractor`]):
+        feature_extractor ([`CLIPImageProcessor`]):
             Model that extracts features from generated images to be used as inputs for the `safety_checker`.
     """
@@ -94,7 +94,7 @@ class MultilingualStableDiffusion(DiffusionPipeline):
         unet: UNet2DConditionModel,
         scheduler: Union[DDIMScheduler, PNDMScheduler, LMSDiscreteScheduler],
         safety_checker: StableDiffusionSafetyChecker,
-        feature_extractor: CLIPFeatureExtractor,
+        feature_extractor: CLIPImageProcessor,
     ):
         super().__init__()
...
@@ -65,7 +65,7 @@ class StableDiffusionPipeline(DiffusionPipeline):
         safety_checker ([`StableDiffusionSafetyChecker`]):
             Classification module that estimates whether generated images could be considered offensive or harmful.
             Please, refer to the [model card](https://huggingface.co/runwayml/stable-diffusion-v1-5) for details.
-        feature_extractor ([`CLIPFeatureExtractor`]):
+        feature_extractor ([`CLIPImageProcessor`]):
             Model that extracts features from generated images to be used as inputs for the `safety_checker`.
     """
     _optional_components = ["safety_checker", "feature_extractor"]
...
@@ -5,7 +5,7 @@ import inspect
 from typing import Callable, List, Optional, Union
 import torch
-from transformers import CLIPFeatureExtractor, CLIPTextModel, CLIPTokenizer
+from transformers import CLIPImageProcessor, CLIPTextModel, CLIPTokenizer
 from diffusers import DiffusionPipeline
 from diffusers.models import AutoencoderKL, UNet2DConditionModel
@@ -42,7 +42,7 @@ class SeedResizeStableDiffusionPipeline(DiffusionPipeline):
         safety_checker ([`StableDiffusionSafetyChecker`]):
             Classification module that estimates whether generated images could be considered offensive or harmful.
             Please, refer to the [model card](https://huggingface.co/CompVis/stable-diffusion-v1-4) for details.
-        feature_extractor ([`CLIPFeatureExtractor`]):
+        feature_extractor ([`CLIPImageProcessor`]):
             Model that extracts features from generated images to be used as inputs for the `safety_checker`.
     """
@@ -54,7 +54,7 @@ class SeedResizeStableDiffusionPipeline(DiffusionPipeline):
         unet: UNet2DConditionModel,
         scheduler: Union[DDIMScheduler, PNDMScheduler, LMSDiscreteScheduler],
         safety_checker: StableDiffusionSafetyChecker,
-        feature_extractor: CLIPFeatureExtractor,
+        feature_extractor: CLIPImageProcessor,
     ):
         super().__init__()
         self.register_modules(
...
@@ -3,7 +3,7 @@ from typing import Callable, List, Optional, Union
 import torch
 from transformers import (
-    CLIPFeatureExtractor,
+    CLIPImageProcessor,
     CLIPTextModel,
     CLIPTokenizer,
     WhisperForConditionalGeneration,
@@ -37,7 +37,7 @@ class SpeechToImagePipeline(DiffusionPipeline):
         unet: UNet2DConditionModel,
         scheduler: Union[DDIMScheduler, PNDMScheduler, LMSDiscreteScheduler],
         safety_checker: StableDiffusionSafetyChecker,
-        feature_extractor: CLIPFeatureExtractor,
+        feature_extractor: CLIPImageProcessor,
     ):
         super().__init__()
...
 from typing import Any, Callable, Dict, List, Optional, Union
 import torch
-from transformers import CLIPFeatureExtractor, CLIPTextModel, CLIPTokenizer
+from transformers import CLIPImageProcessor, CLIPTextModel, CLIPTokenizer
 from diffusers import (
     AutoencoderKL,
@@ -46,7 +46,7 @@ class StableDiffusionComparisonPipeline(DiffusionPipeline):
         safety_checker ([`StableDiffusionMegaSafetyChecker`]):
             Classification module that estimates whether generated images could be considered offensive or harmful.
             Please, refer to the [model card](https://huggingface.co/runwayml/stable-diffusion-v1-5) for details.
-        feature_extractor ([`CLIPFeatureExtractor`]):
+        feature_extractor ([`CLIPImageProcessor`]):
             Model that extracts features from generated images to be used as inputs for the `safety_checker`.
     """
@@ -58,7 +58,7 @@ class StableDiffusionComparisonPipeline(DiffusionPipeline):
         unet: UNet2DConditionModel,
         scheduler: Union[DDIMScheduler, PNDMScheduler, LMSDiscreteScheduler],
         safety_checker: StableDiffusionSafetyChecker,
-        feature_extractor: CLIPFeatureExtractor,
+        feature_extractor: CLIPImageProcessor,
         requires_safety_checker: bool = True,
     ):
         super()._init_()
...
@@ -6,7 +6,7 @@ from typing import Any, Callable, Dict, List, Optional, Union
 import numpy as np
 import PIL.Image
 import torch
-from transformers import CLIPFeatureExtractor, CLIPTextModel, CLIPTokenizer
+from transformers import CLIPImageProcessor, CLIPTextModel, CLIPTokenizer
 from diffusers import AutoencoderKL, ControlNetModel, DiffusionPipeline, UNet2DConditionModel, logging
 from diffusers.pipelines.stable_diffusion import StableDiffusionPipelineOutput, StableDiffusionSafetyChecker
@@ -135,7 +135,7 @@ class StableDiffusionControlNetImg2ImgPipeline(DiffusionPipeline):
         controlnet: ControlNetModel,
         scheduler: KarrasDiffusionSchedulers,
         safety_checker: StableDiffusionSafetyChecker,
-        feature_extractor: CLIPFeatureExtractor,
+        feature_extractor: CLIPImageProcessor,
         requires_safety_checker: bool = True,
     ):
         super().__init__()
...
@@ -7,7 +7,7 @@ import numpy as np
 import PIL.Image
 import torch
 import torch.nn.functional as F
-from transformers import CLIPFeatureExtractor, CLIPTextModel, CLIPTokenizer
+from transformers import CLIPImageProcessor, CLIPTextModel, CLIPTokenizer
 from diffusers import AutoencoderKL, ControlNetModel, DiffusionPipeline, UNet2DConditionModel, logging
 from diffusers.pipelines.stable_diffusion import StableDiffusionPipelineOutput, StableDiffusionSafetyChecker
@@ -233,7 +233,7 @@ class StableDiffusionControlNetInpaintPipeline(DiffusionPipeline):
         controlnet: ControlNetModel,
         scheduler: KarrasDiffusionSchedulers,
         safety_checker: StableDiffusionSafetyChecker,
-        feature_extractor: CLIPFeatureExtractor,
+        feature_extractor: CLIPImageProcessor,
         requires_safety_checker: bool = True,
     ):
         super().__init__()
...
@@ -7,7 +7,7 @@ import numpy as np
 import PIL.Image
 import torch
 import torch.nn.functional as F
-from transformers import CLIPFeatureExtractor, CLIPTextModel, CLIPTokenizer
+from transformers import CLIPImageProcessor, CLIPTextModel, CLIPTokenizer
 from diffusers import AutoencoderKL, ControlNetModel, DiffusionPipeline, UNet2DConditionModel, logging
 from diffusers.pipelines.stable_diffusion import StableDiffusionPipelineOutput, StableDiffusionSafetyChecker
@@ -233,7 +233,7 @@ class StableDiffusionControlNetInpaintImg2ImgPipeline(DiffusionPipeline):
         controlnet: ControlNetModel,
         scheduler: KarrasDiffusionSchedulers,
         safety_checker: StableDiffusionSafetyChecker,
-        feature_extractor: CLIPFeatureExtractor,
+        feature_extractor: CLIPImageProcessor,
         requires_safety_checker: bool = True,
     ):
         super().__init__()
...