Unverified Commit 1f020876 authored by Steven Liu, committed by GitHub

[docs] More API stuff (#3835)

* clean up loaders

* clean up rest of main class apis

* apply feedback
parent 95ea538c
@@ -149,7 +149,7 @@
   - local: api/utilities
     title: Utilities
   - local: api/image_processor
-    title: Vae Image Processor
+    title: VAE Image Processor
   title: Main Classes
 - sections:
   - local: api/pipelines/overview
...
@@ -12,8 +12,13 @@ specific language governing permissions and limitations under the License.

 # Configuration

-Schedulers from [`~schedulers.scheduling_utils.SchedulerMixin`] and models from [`ModelMixin`] inherit from [`ConfigMixin`] which conveniently takes care of storing all the parameters that are
-passed to their respective `__init__` methods in a JSON-configuration file.
+Schedulers from [`~schedulers.scheduling_utils.SchedulerMixin`] and models from [`ModelMixin`] inherit from [`ConfigMixin`] which stores all the parameters that are passed to their respective `__init__` methods in a JSON-configuration file.
+
+<Tip>
+
+To use private or [gated](https://huggingface.co/docs/hub/models-gated#gated-models) models, log-in with `huggingface-cli login`.
+
+</Tip>

 ## ConfigMixin
...
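For illustration, the pattern this page documents looks roughly like the following sketch (the scheduler classes and checkpoint name are placeholder choices, not taken from the commit):

```python
from diffusers import DDPMScheduler, DDIMScheduler

# A ConfigMixin subclass records its __init__ arguments in `.config`,
# backed by the JSON configuration file saved alongside the weights.
scheduler = DDPMScheduler.from_pretrained(
    "runwayml/stable-diffusion-v1-5", subfolder="scheduler"
)
print(scheduler.config)

# The stored config can instantiate a compatible class.
ddim = DDIMScheduler.from_config(scheduler.config)
```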
@@ -12,12 +12,12 @@ specific language governing permissions and limitations under the License.

 # Pipelines

-The [`DiffusionPipeline`] is the easiest way to load any pretrained diffusion pipeline from the [Hub](https://huggingface.co/models?library=diffusers) and use it for inference.
+The [`DiffusionPipeline`] is the quickest way to load any pretrained diffusion pipeline from the [Hub](https://huggingface.co/models?library=diffusers) for inference.

 <Tip>

 You shouldn't use the [`DiffusionPipeline`] class for training or finetuning a diffusion model. Individual
-components (for example, [`UNetModel`] and [`UNetConditionModel`]) of diffusion pipelines are usually trained individually, so we suggest directly working with instead.
+components (for example, [`UNet2DModel`] and [`UNet2DConditionModel`]) of diffusion pipelines are usually trained individually, so we suggest directly working with them instead.

 </Tip>
...
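For context, the usual loading pattern is a one-liner (a sketch; the checkpoint is a placeholder and a CUDA device is assumed):

```python
import torch
from diffusers import DiffusionPipeline

# Downloads and assembles all pipeline components (UNet, VAE, scheduler, ...).
pipeline = DiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")
image = pipeline("a photo of an astronaut riding a horse").images[0]
```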
@@ -10,24 +10,18 @@ an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express o
 specific language governing permissions and limitations under the License.
 -->

-# Image Processor for VAE
+# VAE Image Processor

-Image processor provides a unified API for Stable Diffusion pipelines to prepare their image inputs for VAE encoding, as well as post-processing their outputs once decoded. This includes transformations such as resizing, normalization, and conversion between PIL Image, PyTorch, and Numpy arrays.
-
-All pipelines with VAE image processor will accept image inputs in the format of PIL Image, PyTorch tensor, or Numpy array, and will able to return outputs in the format of PIL Image, Pytorch tensor, and Numpy array based on the `output_type` argument from the user. Additionally, the User can pass encoded image latents directly to the pipeline, or ask the pipeline to return latents as output with `output_type = 'pt'` argument. This allows you to take the generated latents from one pipeline and pass it to another pipeline as input, without ever having to leave the latent space. It also makes it much easier to use multiple pipelines together, by passing PyTorch tensors directly between different pipelines.
-
-# Image Processor for VAE adapted to LDM3D
-
-LDM3D Image processor does the same as the Image processor for VAE but accepts both RGB and depth inputs and will return RGB and depth outputs.
+The [`VaeImageProcessor`] provides a unified API for [`StableDiffusionPipeline`]s to prepare image inputs for VAE encoding and to post-process outputs once they're decoded. This includes transformations such as resizing, normalization, and conversion between PIL Image, PyTorch, and NumPy arrays.
+
+All pipelines with a [`VaeImageProcessor`] accept PIL Image, PyTorch tensor, or NumPy arrays as image inputs and return outputs based on the `output_type` argument from the user. You can pass encoded image latents directly to a pipeline, and return latents from a pipeline with the `output_type` argument (for example, `output_type="pt"`). This allows you to take the generated latents from one pipeline and pass them to another pipeline as input without ever leaving the latent space. It also makes it much easier to use multiple pipelines together by passing PyTorch tensors directly between them.

 ## VaeImageProcessor

 [[autodoc]] image_processor.VaeImageProcessor

 ## VaeImageProcessorLDM3D

+The [`VaeImageProcessorLDM3D`] accepts RGB and depth inputs and returns RGB and depth outputs.
+
 [[autodoc]] image_processor.VaeImageProcessorLDM3D
\ No newline at end of file
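The latent hand-off described above looks roughly like this sketch (the checkpoint is a placeholder, a CUDA device is assumed, and `output_type="latent"` is assumed to be supported by the pipeline):

```python
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Keep the result in latent space instead of decoding to PIL right away.
latents = pipe("a photo of a cat", output_type="latent").images

# Decode manually; the pipeline's image processor handles postprocessing.
with torch.no_grad():
    decoded = pipe.vae.decode(latents / pipe.vae.config.scaling_factor).sample
image = pipe.image_processor.postprocess(decoded, output_type="pil")[0]
```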
@@ -12,31 +12,26 @@ specific language governing permissions and limitations under the License.

 # Loaders

-There are many ways to train adapter neural networks for diffusion models, such as
-
-- [Textual Inversion](./training/text_inversion.mdx)
-- [LoRA](https://github.com/cloneofsimo/lora)
-- [Hypernetworks](https://arxiv.org/abs/1609.09106)
-
-Such adapter neural networks often only consist of a fraction of the number of weights compared
-to the pretrained model and as such are very portable. The Diffusers library offers an easy-to-use
-API to load such adapter neural networks via the [`loaders.py` module](https://github.com/huggingface/diffusers/blob/main/src/diffusers/loaders.py).
-
-**Note**: This module is still highly experimental and prone to future changes.
+Adapters (textual inversion, LoRA, hypernetworks) allow you to modify a diffusion model to generate images in a specific style without training or finetuning the entire model. The adapter weights are typically only a tiny fraction of the pretrained model's, which makes them very portable. 🤗 Diffusers provides an easy-to-use `LoaderMixin` API to load adapter weights.
+
+<Tip warning={true}>
+
+🧪 The `LoaderMixins` are highly experimental and prone to future changes. To use private or [gated](https://huggingface.co/docs/hub/models-gated#gated-models) models, log-in with `huggingface-cli login`.
+
+</Tip>

-## LoaderMixins
-
-### UNet2DConditionLoadersMixin
+## UNet2DConditionLoadersMixin

 [[autodoc]] loaders.UNet2DConditionLoadersMixin

-### TextualInversionLoaderMixin
+## TextualInversionLoaderMixin

 [[autodoc]] loaders.TextualInversionLoaderMixin

-### LoraLoaderMixin
+## LoraLoaderMixin

 [[autodoc]] loaders.LoraLoaderMixin

-### FromCkptMixin
+## FromCkptMixin

 [[autodoc]] loaders.FromCkptMixin
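These mixins are exposed directly on the pipelines, so loading adapters is roughly the following sketch (the repository IDs, path, and file name are placeholders):

```python
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
)

# TextualInversionLoaderMixin: load a learned token embedding.
pipe.load_textual_inversion("sd-concepts-library/cat-toy")

# LoraLoaderMixin: load LoRA weights into the UNet and text encoder.
pipe.load_lora_weights("path/to/lora", weight_name="pytorch_lora_weights.bin")
```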
@@ -12,12 +12,9 @@ specific language governing permissions and limitations under the License.

 # Logging

-🧨 Diffusers has a centralized logging system, so that you can setup the verbosity of the library easily.
-
-Currently the default verbosity of the library is `WARNING`.
-
-To change the level of verbosity, just use one of the direct setters. For instance, here is how to change the verbosity
-to the INFO level.
+🤗 Diffusers has a centralized logging system to easily manage the verbosity of the library. The default verbosity is set to `WARNING`.
+
+To change the verbosity level, use one of the direct setters. For instance, to change the verbosity to the `INFO` level:

 ```python
 import diffusers
@@ -33,7 +30,7 @@ DIFFUSERS_VERBOSITY=error ./myprogram.py
 ```

 Additionally, some `warnings` can be disabled by setting the environment variable
-`DIFFUSERS_NO_ADVISORY_WARNINGS` to a true value, like *1*. This will disable any warning that is logged using
+`DIFFUSERS_NO_ADVISORY_WARNINGS` to a true value, like `1`. This disables any warning logged by
 [`logger.warning_advice`]. For example:

 ```bash
@@ -52,20 +49,21 @@ logger.warning("WARN")
 ```

-All the methods of this logging module are documented below, the main ones are
+All methods of the logging module are documented below. The main methods are
 [`logging.get_verbosity`] to get the current level of verbosity in the logger and
-[`logging.set_verbosity`] to set the verbosity to the level of your choice. In order (from the least
-verbose to the most verbose), those levels (with their corresponding int values in parenthesis) are:
-
-- `diffusers.logging.CRITICAL` or `diffusers.logging.FATAL` (int value, 50): only report the most
-  critical errors.
-- `diffusers.logging.ERROR` (int value, 40): only report errors.
-- `diffusers.logging.WARNING` or `diffusers.logging.WARN` (int value, 30): only reports error and
-  warnings. This is the default level used by the library.
-- `diffusers.logging.INFO` (int value, 20): reports error, warnings and basic information.
-- `diffusers.logging.DEBUG` (int value, 10): report all information.
+[`logging.set_verbosity`] to set the verbosity to the level of your choice.
+
+In order from the least verbose to the most verbose:
+
+| Method                                                     | Integer value | Description                                          |
+|-----------------------------------------------------------:|--------------:|-----------------------------------------------------:|
+| `diffusers.logging.CRITICAL` or `diffusers.logging.FATAL`  | 50            | only report the most critical errors                 |
+| `diffusers.logging.ERROR`                                   | 40            | only report errors                                   |
+| `diffusers.logging.WARNING` or `diffusers.logging.WARN`     | 30            | only report errors and warnings (default)            |
+| `diffusers.logging.INFO`                                    | 20            | only report errors, warnings, and basic information  |
+| `diffusers.logging.DEBUG`                                   | 10            | report all information                               |

-By default, `tqdm` progress bars will be displayed during model download. [`logging.disable_progress_bar`] and [`logging.enable_progress_bar`] can be used to suppress or unsuppress this behavior.
+By default, `tqdm` progress bars are displayed during model download. [`logging.disable_progress_bar`] and [`logging.enable_progress_bar`] are used to enable or disable this behavior.

 ## Base setters
...
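Putting the setters from the table together, a short sketch:

```python
import diffusers

# Equivalent ways to switch to the INFO level (integer value 20).
diffusers.logging.set_verbosity(diffusers.logging.INFO)
diffusers.logging.set_verbosity_info()

assert diffusers.logging.get_verbosity() == 20

# Silence or restore the tqdm download progress bars.
diffusers.logging.disable_progress_bar()
diffusers.logging.enable_progress_bar()
```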
@@ -10,11 +10,9 @@ an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express o
 specific language governing permissions and limitations under the License.
 -->

-# BaseOutputs
+# Outputs

-All models have outputs that are subclasses of [`~utils.BaseOutput`]. Those are
-data structures containing all the information returned by the model, but they can also be used as tuples or
-dictionaries.
+All model outputs are subclasses of [`~utils.BaseOutput`], data structures containing all the information returned by the model. The outputs can also be used as tuples or dictionaries.

 For example:
...
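For a rough sense of how such an output behaves, a minimal sketch (the checkpoint is a placeholder choice):

```python
from diffusers import DDPMPipeline

pipeline = DDPMPipeline.from_pretrained("google/ddpm-cifar10-32")
outputs = pipeline()

# Attribute, dictionary-style, and integer access all work.
image = outputs.images[0]
image = outputs["images"][0]
image = outputs[0][0]

# Unpacking requires an explicit conversion first.
(images,) = outputs.to_tuple()
```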
@@ -81,10 +81,9 @@ class FrozenDict(OrderedDict):
 class ConfigMixin:
     r"""
-    Base class for all configuration classes. Stores all configuration parameters under `self.config` Also handles all
-    methods for loading/downloading/saving classes inheriting from [`ConfigMixin`] with
-        - [`~ConfigMixin.from_config`]
-        - [`~ConfigMixin.save_config`]
+    Base class for all configuration classes. All configuration parameters are stored under `self.config`. Also
+    provides the [`~ConfigMixin.from_config`] and [`~ConfigMixin.save_config`] methods for loading, downloading, and
+    saving classes that inherit from [`ConfigMixin`].

     Class attributes:
     - **config_name** (`str`) -- A filename under which the config should stored when calling
@@ -92,7 +91,7 @@ class ConfigMixin:
     - **ignore_for_config** (`List[str]`) -- A list of attributes that should not be saved in the config (should be
       overridden by subclass).
     - **has_compatibles** (`bool`) -- Whether the class has compatible classes (should be overridden by subclass).
-    - **_deprecated_kwargs** (`List[str]`) -- Keyword arguments that are deprecated. Note that the init function
+    - **_deprecated_kwargs** (`List[str]`) -- Keyword arguments that are deprecated. Note that the `init` function
       should only have a `kwargs` argument if at least one argument is deprecated (should be overridden by
       subclass).
     """
@@ -139,12 +138,12 @@ class ConfigMixin:
     def save_config(self, save_directory: Union[str, os.PathLike], push_to_hub: bool = False, **kwargs):
         """
-        Save a configuration object to the directory `save_directory`, so that it can be re-loaded using the
+        Save a configuration object to the directory specified in `save_directory` so that it can be reloaded using the
         [`~ConfigMixin.from_config`] class method.

         Args:
             save_directory (`str` or `os.PathLike`):
-                Directory where the configuration JSON file will be saved (will be created if it does not exist).
+                Directory where the configuration JSON file is saved (will be created if it does not exist).
         """
         if os.path.isfile(save_directory):
             raise AssertionError(f"Provided path ({save_directory}) should be a directory, not a file")
@@ -164,15 +163,14 @@ class ConfigMixin:
         Parameters:
             config (`Dict[str, Any]`):
-                A config dictionary from which the Python class will be instantiated. Make sure to only load
-                configuration files of compatible classes.
+                A config dictionary from which the Python class is instantiated. Make sure to only load configuration
+                files of compatible classes.
             return_unused_kwargs (`bool`, *optional*, defaults to `False`):
                 Whether kwargs that are not consumed by the Python class should be returned or not.
             kwargs (remaining dictionary of keyword arguments, *optional*):
                 Can be used to update the configuration object (after it is loaded) and initiate the Python class.
-                `**kwargs` are directly passed to the underlying scheduler/model's `__init__` method and eventually
-                overwrite same named arguments in `config`.
+                `**kwargs` are passed directly to the underlying scheduler/model's `__init__` method and eventually
+                overwrite the same named arguments in `config`.

         Returns:
             [`ModelMixin`] or [`SchedulerMixin`]:
@@ -280,16 +278,16 @@ class ConfigMixin:
                 Whether or not to force the (re-)download of the model weights and configuration files, overriding the
                 cached versions if they exist.
             resume_download (`bool`, *optional*, defaults to `False`):
-                Whether or not to resume downloading the model weights and configuration files. If set to False, any
+                Whether or not to resume downloading the model weights and configuration files. If set to `False`, any
                 incompletely downloaded files are deleted.
             proxies (`Dict[str, str]`, *optional*):
                 A dictionary of proxy servers to use by protocol or endpoint, for example, `{'http': 'foo.bar:3128',
                 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.
             output_loading_info(`bool`, *optional*, defaults to `False`):
                 Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
-            local_files_only(`bool`, *optional*, defaults to `False`):
-                Whether to only load local model weights and configuration files or not. If set to True, the model
-                wont be downloaded from the Hub.
+            local_files_only (`bool`, *optional*, defaults to `False`):
+                Whether to only load local model weights and configuration files or not. If set to `True`, the model
+                won't be downloaded from the Hub.
             use_auth_token (`str` or *bool*, *optional*):
                 The token to use as HTTP bearer authorization for remote files. If `True`, the token generated from
                 `diffusers-cli login` (stored in `~/.huggingface`) is used.
@@ -307,14 +305,6 @@ class ConfigMixin:
             `dict`:
                 A dictionary of all the parameters stored in a JSON configuration file.

-        <Tip>
-
-        To use private or [gated models](https://huggingface.co/docs/hub/models-gated#gated-models), log-in with
-        `huggingface-cli login`. You can also activate the special
-        ["offline-mode"](https://huggingface.co/transformers/installation.html#offline-mode) to use this method in a
-        firewalled environment.
-
-        </Tip>
         """
         cache_dir = kwargs.pop("cache_dir", DIFFUSERS_CACHE)
         force_download = kwargs.pop("force_download", False)
@@ -536,10 +526,11 @@ class ConfigMixin:
     def to_json_string(self) -> str:
         """
-        Serializes this instance to a JSON string.
+        Serializes the configuration instance to a JSON string.

         Returns:
-            `str`: String containing all the attributes that make up this configuration instance in JSON format.
+            `str`:
+                String containing all the attributes that make up the configuration instance in JSON format.
         """
         config_dict = self._internal_dict if hasattr(self, "_internal_dict") else {}
         config_dict["_class_name"] = self.__class__.__name__
@@ -560,11 +551,11 @@ class ConfigMixin:
     def to_json_file(self, json_file_path: Union[str, os.PathLike]):
         """
-        Save this instance to a JSON file.
+        Save the configuration instance's parameters to a JSON file.

         Args:
             json_file_path (`str` or `os.PathLike`):
-                Path to the JSON file in which this configuration instance's parameters will be saved.
+                Path to the JSON file to save a configuration instance's parameters.
         """
         with open(json_file_path, "w", encoding="utf-8") as writer:
             writer.write(self.to_json_string())
...
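Tying these `ConfigMixin` methods together, a rough sketch (`DDIMScheduler` and the directory name are placeholder choices):

```python
from diffusers import DDIMScheduler

scheduler = DDIMScheduler(num_train_timesteps=1000)

# save_config writes the JSON configuration file into the directory.
scheduler.save_config("./ddim-config")

# from_config re-instantiates the class; extra kwargs overwrite
# same-named values in `config`, as documented above.
restored = DDIMScheduler.from_config(scheduler.config, beta_end=0.02)

print(scheduler.to_json_string())
```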
@@ -26,19 +26,18 @@ from .utils import CONFIG_NAME, PIL_INTERPOLATION, deprecate
 class VaeImageProcessor(ConfigMixin):
     """
-    Image Processor for VAE
+    Image processor for VAE.

     Args:
         do_resize (`bool`, *optional*, defaults to `True`):
             Whether to downscale the image's (height, width) dimensions to multiples of `vae_scale_factor`. Can accept
-            `height` and `width` arguments from `preprocess` method
+            `height` and `width` arguments from [`image_processor.VaeImageProcessor.preprocess`] method.
         vae_scale_factor (`int`, *optional*, defaults to `8`):
-            VAE scale factor. If `do_resize` is True, the image will be automatically resized to multiples of this
-            factor.
+            VAE scale factor. If `do_resize` is `True`, the image is automatically resized to multiples of this factor.
         resample (`str`, *optional*, defaults to `lanczos`):
             Resampling filter to use when resizing the image.
         do_normalize (`bool`, *optional*, defaults to `True`):
-            Whether to normalize the image to [-1,1]
+            Whether to normalize the image to [-1,1].
         do_convert_rgb (`bool`, *optional*, defaults to be `False`):
             Whether to convert the images to RGB format.
     """
@@ -75,7 +74,7 @@ class VaeImageProcessor(ConfigMixin):
     @staticmethod
     def pil_to_numpy(images: Union[List[PIL.Image.Image], PIL.Image.Image]) -> np.ndarray:
         """
-        Convert a PIL image or a list of PIL images to numpy arrays.
+        Convert a PIL image or a list of PIL images to NumPy arrays.
         """
         if not isinstance(images, list):
             images = [images]
@@ -87,7 +86,7 @@ class VaeImageProcessor(ConfigMixin):
     @staticmethod
     def numpy_to_pt(images: np.ndarray) -> torch.FloatTensor:
         """
-        Convert a numpy image to a pytorch tensor
+        Convert a NumPy image to a PyTorch tensor.
         """
         if images.ndim == 3:
             images = images[..., None]
@@ -98,7 +97,7 @@ class VaeImageProcessor(ConfigMixin):
     @staticmethod
     def pt_to_numpy(images: torch.FloatTensor) -> np.ndarray:
         """
-        Convert a pytorch tensor to a numpy image
+        Convert a PyTorch tensor to a NumPy image.
         """
         images = images.cpu().permute(0, 2, 3, 1).float().numpy()
         return images
@@ -106,14 +105,14 @@ class VaeImageProcessor(ConfigMixin):
     @staticmethod
     def normalize(images):
         """
-        Normalize an image array to [-1,1]
+        Normalize an image array to [-1,1].
         """
         return 2.0 * images - 1.0

     @staticmethod
     def denormalize(images):
         """
-        Denormalize an image array to [0,1]
+        Denormalize an image array to [0,1].
         """
         return (images / 2 + 0.5).clamp(0, 1)
@@ -132,7 +131,7 @@ class VaeImageProcessor(ConfigMixin):
         width: Optional[int] = None,
     ) -> PIL.Image.Image:
         """
-        Resize a PIL image. Both height and width will be downscaled to the next integer multiple of `vae_scale_factor`
+        Resize a PIL image. Both height and width are downscaled to the next integer multiple of `vae_scale_factor`.
         """
         if height is None:
             height = image.height
@@ -152,7 +151,7 @@ class VaeImageProcessor(ConfigMixin):
         width: Optional[int] = None,
     ) -> torch.Tensor:
         """
-        Preprocess the image input, accepted formats are PIL images, numpy arrays or pytorch tensors"
+        Preprocess the image input. Accepted formats are PIL images, NumPy arrays or PyTorch tensors.
         """
         supported_formats = (PIL.Image.Image, np.ndarray, torch.Tensor)
         if isinstance(image, supported_formats):
@@ -255,18 +254,17 @@ class VaeImageProcessor(ConfigMixin):
 class VaeImageProcessorLDM3D(VaeImageProcessor):
     """
-    Image Processor for VAE LDM3D.
+    Image processor for VAE LDM3D.

     Args:
         do_resize (`bool`, *optional*, defaults to `True`):
             Whether to downscale the image's (height, width) dimensions to multiples of `vae_scale_factor`.
         vae_scale_factor (`int`, *optional*, defaults to `8`):
-            VAE scale factor. If `do_resize` is True, the image will be automatically resized to multiples of this
-            factor.
+            VAE scale factor. If `do_resize` is `True`, the image is automatically resized to multiples of this factor.
         resample (`str`, *optional*, defaults to `lanczos`):
             Resampling filter to use when resizing the image.
         do_normalize (`bool`, *optional*, defaults to `True`):
-            Whether to normalize the image to [-1,1]
+            Whether to normalize the image to [-1,1].
     """

     config_name = CONFIG_NAME
@@ -284,7 +282,7 @@ class VaeImageProcessorLDM3D(VaeImageProcessor):
     @staticmethod
     def numpy_to_pil(images):
         """
-        Convert a numpy image or a batch of images to a PIL image.
+        Convert a NumPy image or a batch of images to a PIL image.
         """
         if images.ndim == 3:
             images = images[None, ...]
@@ -310,7 +308,7 @@ class VaeImageProcessorLDM3D(VaeImageProcessor):
     def numpy_to_depth(self, images):
         """
-        Convert a numpy depth image or a batch of images to a PIL image.
+        Convert a NumPy depth image or a batch of images to a PIL image.
        """
         if images.ndim == 3:
             images = images[None, ...]
...
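The conversion helpers above compose into the usual round trip, roughly as in this sketch (the 515×517 size is just an arbitrary non-multiple of 8; `postprocess` is assumed available on the processor):

```python
import PIL.Image
from diffusers.image_processor import VaeImageProcessor

processor = VaeImageProcessor(vae_scale_factor=8)

image = PIL.Image.new("RGB", (515, 517))  # dimensions are not multiples of 8

# PIL -> resized (512x512), [-1,1]-normalized tensor of shape (1, 3, H, W).
tensor = processor.preprocess(image)

# Tensor -> denormalized PIL image (or "np"/"pt" via output_type).
pil_again = processor.postprocess(tensor, output_type="pil")[0]
```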
This diff is collapsed.
@@ -83,8 +83,8 @@ class FlaxImagePipelineOutput(BaseOutput):
     Args:
         images (`List[PIL.Image.Image]` or `np.ndarray`)
-            List of denoised PIL images of length `batch_size` or numpy array of shape `(batch_size, height, width,
-            num_channels)`. PIL images or numpy array present the denoised images of the diffusion pipeline.
+            List of denoised PIL images of length `batch_size` or NumPy array of shape `(batch_size, height, width,
+            num_channels)`.
     """

     images: Union[List[PIL.Image.Image], np.ndarray]
...
This diff is collapsed.
@@ -68,11 +68,11 @@ class ImageTextPipelineOutput(BaseOutput):
     Args:
         images (`List[PIL.Image.Image]` or `np.ndarray`)
-            List of denoised PIL images of length `batch_size` or numpy array of shape `(batch_size, height, width,
-            num_channels)`. PIL images or numpy array present the denoised images of the diffusion pipeline.
+            List of denoised PIL images of length `batch_size` or NumPy array of shape `(batch_size, height, width,
+            num_channels)`.
         text (`List[str]` or `List[List[str]]`)
             List of generated text strings of length `batch_size` or a list of list of strings whose outer list has
-            length `batch_size`. Text generated by the diffusion pipeline.
+            length `batch_size`.
     """

     images: Optional[Union[List[PIL.Image.Image], np.ndarray]]
...
@@ -124,22 +124,19 @@ def get_logger(name: Optional[str] = None) -> logging.Logger:
 def get_verbosity() -> int:
     """
-    Return the current level for the 🤗 Diffusers' root logger as an int.
+    Return the current level for the 🤗 Diffusers' root logger as an `int`.

     Returns:
-        `int`: The logging level.
-
-    <Tip>
-
-    🤗 Diffusers has following logging levels:
-
-    - 50: `diffusers.logging.CRITICAL` or `diffusers.logging.FATAL`
-    - 40: `diffusers.logging.ERROR`
-    - 30: `diffusers.logging.WARNING` or `diffusers.logging.WARN`
-    - 20: `diffusers.logging.INFO`
-    - 10: `diffusers.logging.DEBUG`
-
-    </Tip>"""
+        `int`:
+            Logging level integers which can be one of:
+
+            - `50`: `diffusers.logging.CRITICAL` or `diffusers.logging.FATAL`
+            - `40`: `diffusers.logging.ERROR`
+            - `30`: `diffusers.logging.WARNING` or `diffusers.logging.WARN`
+            - `20`: `diffusers.logging.INFO`
+            - `10`: `diffusers.logging.DEBUG`
+
+    """

     _configure_library_root_logger()
     return _get_library_root_logger().getEffectiveLevel()
@@ -151,7 +148,7 @@ def set_verbosity(verbosity: int) -> None:
     Args:
         verbosity (`int`):
-            Logging level, e.g., one of:
+            Logging level which can be one of:

             - `diffusers.logging.CRITICAL` or `diffusers.logging.FATAL`
             - `diffusers.logging.ERROR`
@@ -185,7 +182,7 @@ def set_verbosity_error():
 def disable_default_handler() -> None:
-    """Disable the default handler of the HuggingFace Diffusers' root logger."""
+    """Disable the default handler of the 🤗 Diffusers' root logger."""

     _configure_library_root_logger()
@@ -194,7 +191,7 @@ def disable_default_handler() -> None:
 def enable_default_handler() -> None:
-    """Enable the default handler of the HuggingFace Diffusers' root logger."""
+    """Enable the default handler of the 🤗 Diffusers' root logger."""

     _configure_library_root_logger()
@@ -241,9 +238,9 @@ def enable_propagation() -> None:
 def enable_explicit_format() -> None:
     """
-    Enable explicit formatting for every HuggingFace Diffusers' logger. The explicit formatter is as follows:
+    Enable explicit formatting for every 🤗 Diffusers' logger. The explicit formatter is as follows:
     ```
     [LEVELNAME|FILENAME|LINE NUMBER] TIME >> MESSAGE
     ```
     All handlers currently bound to the root logger are affected by this method.
     """
@@ -256,7 +253,7 @@ def enable_explicit_format() -> None:
 def reset_format() -> None:
     """
-    Resets the formatting for HuggingFace Diffusers' loggers.
+    Resets the formatting for 🤗 Diffusers' loggers.

     All handlers currently bound to the root logger are affected by this method.
     """
...
@@ -41,12 +41,12 @@ class BaseOutput(OrderedDict):
     """
     Base class for all model outputs as dataclass. Has a `__getitem__` that allows indexing by integer or slice (like a
     tuple) or strings (like a dictionary) that will ignore the `None` attributes. Otherwise behaves like a regular
-    python dictionary.
+    Python dictionary.

     <Tip warning={true}>

-    You can't unpack a `BaseOutput` directly. Use the [`~utils.BaseOutput.to_tuple`] method to convert it to a tuple
-    before.
+    You can't unpack a [`BaseOutput`] directly. Use the [`~utils.BaseOutput.to_tuple`] method to convert it to a tuple
+    first.

     </Tip>
     """
...