Unverified Commit 7071b746 authored by Tolga Cangöz, committed by GitHub

Errata: Fix typos & `\s+$` (#9008)



* Fix typos

* chore: Fix typos

* chore: Update README.md for promptdiffusion example

* Trim trailing white spaces

* Fix a typo

* update number

* chore: update number

* Trim trailing white space

* Update README.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update README.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
parent a054c784
@@ -46,5 +46,4 @@ pipe.enable_model_cpu_offload()
# generate image
generator = torch.manual_seed(0)
image = pipe("a tortoise", num_inference_steps=20, generator=generator, image_pair=[image_a,image_b], image=query).images[0]
```
@@ -2051,7 +2051,7 @@ if __name__ == "__main__":
default=512,
type=int,
help=(
"The image size that the model was trained on. Use 512 for Stable Diffusion v1.X and Stable Siffusion v2"
"The image size that the model was trained on. Use 512 for Stable Diffusion v1.X and Stable Diffusion v2"
" Base. Use 768 for Stable Diffusion v2."
),
)
@@ -1253,7 +1253,7 @@ class PromptDiffusionPipeline(
)
if guess_mode and self.do_classifier_free_guidance:
- # Infered ControlNet only for the conditional batch.
+ # Inferred ControlNet only for the conditional batch.
# To apply the output of ControlNet to both the unconditional and conditional batches,
# add 0 to the unconditional batch to keep it unchanged.
down_block_res_samples = [torch.cat([torch.zeros_like(d), d]) for d in down_block_res_samples]
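# (Annotation, not repository code; shapes are illustrative.) With classifier-free
# guidance the batch is [unconditional, conditional]; in guess mode ControlNet saw
# only the conditional half, so torch.zeros_like pads each residual for the
# unconditional half. For example, a (2, 320, 64, 64) residual becomes
# (4, 320, 64, 64), and the zero half leaves the unconditional UNet pass unchanged.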
@@ -11,28 +11,28 @@ huggingface-cli login
This will also allow us to push the trained model parameters to the Hugging Face Hub platform.
For setup, inference code, and details on how to run the code, please follow the Colab Notebook provided above.

## How
We make use of several techniques to make this possible:
* Compute the embeddings from the instance prompt and serialize them for later reuse. This is implemented in the [`compute_embeddings.py`](./compute_embeddings.py) script. We use an 8bit (as introduced in [`LLM.int8()`](https://arxiv.org/abs/2208.07339)) T5 to reduce memory requirements to ~10.5GB.
* In the `train_dreambooth_sd3_lora_miniature.py` script, we make use of:
* 8bit Adam for optimization through the `bitsandbytes` library.
* Gradient checkpointing and gradient accumulation.
* FP16 precision.
* Flash attention through `F.scaled_dot_product_attention()`.
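A minimal sketch of how these techniques fit together, using a toy model rather than the actual SD3 LoRA training loop (assumes a CUDA device and the `bitsandbytes` package; the model, shapes, and hyperparameters are placeholders):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.checkpoint import checkpoint
import bitsandbytes as bnb

model = nn.Sequential(nn.Linear(64, 64), nn.GELU(), nn.Linear(64, 64)).cuda()
optimizer = bnb.optim.AdamW8bit(model.parameters(), lr=1e-4)  # 8bit Adam
scaler = torch.cuda.amp.GradScaler()                          # loss scaling for FP16

x = torch.randn(8, 64, device="cuda")
target = torch.randn(8, 64, device="cuda")

accumulation_steps = 4
for _ in range(accumulation_steps):
    with torch.autocast("cuda", dtype=torch.float16):        # FP16 precision
        out = checkpoint(model, x, use_reentrant=False)      # gradient checkpointing
        loss = F.mse_loss(out, target) / accumulation_steps  # gradient accumulation
    scaler.scale(loss).backward()

scaler.step(optimizer)
scaler.update()
optimizer.zero_grad()

# Flash/memory-efficient attention kernels are selected automatically here:
q = k = v = torch.randn(1, 8, 16, 32, device="cuda", dtype=torch.float16)
attn = F.scaled_dot_product_attention(q, k, v)
```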
Computing the text embeddings is arguably the most memory-intensive part in the pipeline as SD3 employs three text encoders. If we run them in FP32, it will take about 20GB of VRAM. With FP16, we are down to 12GB.
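As a rough sketch of what the embedding precomputation could look like (not the actual `compute_embeddings.py`; the checkpoint id, prompt, and sequence length below are placeholder assumptions):

```python
import torch
from transformers import AutoTokenizer, BitsAndBytesConfig, T5EncoderModel

repo_id = "stabilityai/stable-diffusion-3-medium-diffusers"  # assumed checkpoint
tokenizer = AutoTokenizer.from_pretrained(repo_id, subfolder="tokenizer_3")
text_encoder = T5EncoderModel.from_pretrained(
    repo_id,
    subfolder="text_encoder_3",
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),  # LLM.int8()
)

with torch.no_grad():
    input_ids = tokenizer(
        "photo of a sks dog",  # placeholder instance prompt
        padding="max_length",
        max_length=256,        # placeholder sequence length
        truncation=True,
        return_tensors="pt",
    ).input_ids.to(text_encoder.device)
    prompt_embeds = text_encoder(input_ids)[0]

torch.save({"prompt_embeds": prompt_embeds.cpu()}, "sample_embeddings.pt")
```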
## Gotchas
This project is educational. It exists to showcase the possibility of fine-tuning a big diffusion system on consumer GPUs. But additional components might have to be added to obtain state-of-the-art performance. Below are some commonly known gotchas that users should be aware of:
* Training of text encoders is purposefully disabled.
* Techniques such as prior-preservation are unsupported.
* Custom instance captions for instance images are unsupported, but this should be relatively easy to integrate.
Hopefully, this project gives you a template to extend it further to suit your needs.
@@ -42,7 +42,7 @@ if __name__ == "__main__":
default=512,
type=int,
help=(
"The image size that the model was trained on. Use 512 for Stable Diffusion v1.X and Stable Siffusion v2"
"The image size that the model was trained on. Use 512 for Stable Diffusion v1.X and Stable Diffusion v2"
" Base. Use 768 for Stable Diffusion v2."
),
)
@@ -67,7 +67,7 @@ if __name__ == "__main__":
default=None,
type=int,
help=(
"The image size that the model was trained on. Use 512 for Stable Diffusion v1.X and Stable Siffusion v2"
"The image size that the model was trained on. Use 512 for Stable Diffusion v1.X and Stable Diffusion v2"
" Base. Use 768 for Stable Diffusion v2."
),
)
@@ -302,7 +302,7 @@ def get_2d_rotary_pos_embed(embed_dim, crops_coords, grid_size, use_real=True):
If True, return real part and imaginary part separately. Otherwise, return complex numbers.
Returns:
- `torch.Tensor`: positional embdding with shape `( grid_size * grid_size, embed_dim/2)`.
+ `torch.Tensor`: positional embedding with shape `( grid_size * grid_size, embed_dim/2)`.
"""
start, stop = crops_coords
grid_h = np.linspace(start[0], stop[0], grid_size[0], endpoint=False, dtype=np.float32)
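A small usage sketch for this docstring; it assumes the function is importable from `diffusers.models.embeddings`, and the argument values are illustrative:

```python
from diffusers.models.embeddings import get_2d_rotary_pos_embed

# A 32x32 grid over the un-cropped region; with use_real=True the real (cos)
# and imaginary (sin) parts come back separately, flattened over grid positions.
cos, sin = get_2d_rotary_pos_embed(
    embed_dim=64, crops_coords=((0, 0), (32, 32)), grid_size=(32, 32), use_real=True
)
print(cos.shape, sin.shape)
```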
@@ -902,7 +902,7 @@ class HunyuanCombinedTimestepTextSizeStyleEmbedding(nn.Module):
pooled_projections = self.pooler(encoder_hidden_states) # (N, 1024)
if self.use_style_cond_and_image_meta_size:
- # extra condition2: image meta size embdding
+ # extra condition2: image meta size embedding
image_meta_size = self.size_proj(image_meta_size.view(-1))
image_meta_size = image_meta_size.to(dtype=hidden_dtype)
image_meta_size = image_meta_size.view(-1, 6 * 256) # (N, 1536)
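# (Annotation, not repository code.) Shape walk-through of the lines above, N = batch:
#   image_meta_size    : (N, 6)       six size conditions per sample
#   .view(-1)          : (N * 6,)     flatten so each scalar is embedded on its own
#   size_proj(...)     : (N * 6, 256) 256-dim projection per scalar
#   .view(-1, 6 * 256) : (N, 1536)    matches the "(N, 1536)" comment above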
@@ -87,7 +87,7 @@ def get_piecewise_constant_schedule(optimizer: Optimizer, step_rules: str, last_
The optimizer for which to schedule the learning rate.
step_rules (`string`):
The rules for the learning rate. ex: rule_steps="1:10,0.1:20,0.01:30,0.005" it means that the learning rate
- if multiple 1 for the first 10 steps, mutiple 0.1 for the next 20 steps, multiple 0.01 for the next 30
+ if multiple 1 for the first 10 steps, multiple 0.1 for the next 20 steps, multiple 0.01 for the next 30
steps and multiple 0.005 for the other steps.
last_epoch (`int`, *optional*, defaults to -1):
The index of the last epoch when resuming training.
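For reference, a small usage sketch of the rule string described above (assuming a base learning rate of 0.01; in diffusers this function lives in `diffusers.optimization`):

```python
import torch
from diffusers.optimization import get_piecewise_constant_schedule

param = torch.nn.Parameter(torch.zeros(1))
optimizer = torch.optim.SGD([param], lr=0.01)
scheduler = get_piecewise_constant_schedule(optimizer, step_rules="1:10,0.1:20,0.01:30,0.005")

# steps  0-9 : lr = 0.01 * 1     = 0.01
# steps 10-29: lr = 0.01 * 0.1   = 0.001
# steps 30-59: lr = 0.01 * 0.01  = 0.0001
# steps 60+  : lr = 0.01 * 0.005 = 0.00005
for _ in range(60):
    optimizer.step()
    scheduler.step()
```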
@@ -1272,7 +1272,7 @@ class StableDiffusionControlNetPipeline(
)
if guess_mode and self.do_classifier_free_guidance:
- # Infered ControlNet only for the conditional batch.
+ # Inferred ControlNet only for the conditional batch.
# To apply the output of ControlNet to both the unconditional and conditional batches,
# add 0 to the unconditional batch to keep it unchanged.
down_block_res_samples = [torch.cat([torch.zeros_like(d), d]) for d in down_block_res_samples]
@@ -1244,7 +1244,7 @@ class StableDiffusionControlNetImg2ImgPipeline(
)
if guess_mode and self.do_classifier_free_guidance:
- # Infered ControlNet only for the conditional batch.
+ # Inferred ControlNet only for the conditional batch.
# To apply the output of ControlNet to both the unconditional and conditional batches,
# add 0 to the unconditional batch to keep it unchanged.
down_block_res_samples = [torch.cat([torch.zeros_like(d), d]) for d in down_block_res_samples]
@@ -1408,7 +1408,7 @@ class StableDiffusionControlNetInpaintPipeline(
)
if guess_mode and self.do_classifier_free_guidance:
- # Infered ControlNet only for the conditional batch.
+ # Inferred ControlNet only for the conditional batch.
# To apply the output of ControlNet to both the unconditional and conditional batches,
# add 0 to the unconditional batch to keep it unchanged.
down_block_res_samples = [torch.cat([torch.zeros_like(d), d]) for d in down_block_res_samples]
@@ -1739,7 +1739,7 @@ class StableDiffusionXLControlNetInpaintPipeline(
)
if guess_mode and self.do_classifier_free_guidance:
- # Infered ControlNet only for the conditional batch.
+ # Inferred ControlNet only for the conditional batch.
# To apply the output of ControlNet to both the unconditional and conditional batches,
# add 0 to the unconditional batch to keep it unchanged.
down_block_res_samples = [torch.cat([torch.zeros_like(d), d]) for d in down_block_res_samples]
@@ -1487,7 +1487,7 @@ class StableDiffusionXLControlNetPipeline(
)
if guess_mode and self.do_classifier_free_guidance:
- # Infered ControlNet only for the conditional batch.
+ # Inferred ControlNet only for the conditional batch.
# To apply the output of ControlNet to both the unconditional and conditional batches,
# add 0 to the unconditional batch to keep it unchanged.
down_block_res_samples = [torch.cat([torch.zeros_like(d), d]) for d in down_block_res_samples]
@@ -1551,7 +1551,7 @@ class StableDiffusionXLControlNetImg2ImgPipeline(
)
if guess_mode and self.do_classifier_free_guidance:
- # Infered ControlNet only for the conditional batch.
+ # Inferred ControlNet only for the conditional batch.
# To apply the output of ControlNet to both the unconditional and conditional batches,
# add 0 to the unconditional batch to keep it unchanged.
down_block_res_samples = [torch.cat([torch.zeros_like(d), d]) for d in down_block_res_samples]
@@ -1249,7 +1249,7 @@ class StableDiffusionControlNetPAGPipeline(
)
if guess_mode and self.do_classifier_free_guidance:
- # Infered ControlNet only for the conditional batch.
+ # Inferred ControlNet only for the conditional batch.
# To apply the output of ControlNet to both the unconditional and conditional batches,
# add 0 to the unconditional batch to keep it unchanged.
down_block_res_samples = [torch.cat([torch.zeros_like(d), d]) for d in down_block_res_samples]
@@ -106,7 +106,7 @@ def checkout_commit(repo: Repo, commit_id: str):
def clean_code(content: str) -> str:
"""
Remove docstrings, empty line or comments from some code (used to detect if a diff is real or only concern
- comments or docstings).
+ comments or docstrings).
Args:
content (`str`): The code to clean
@@ -165,7 +165,7 @@ def keep_doc_examples_only(content: str) -> str:
def get_all_tests() -> List[str]:
"""
Walks the `tests` folder to return a list of files/subfolders. This is used to split the tests to run when using
- paralellism. The split is:
+ parallelism. The split is:
- folders under `tests`: (`tokenization`, `pipelines`, etc) except the subfolder `models` is excluded.
- folders under `tests/models`: `bert`, `gpt2`, etc.
@@ -635,7 +635,7 @@ def get_tree_starting_at(module: str, edges: List[Tuple[str, str]]) -> List[Unio
Args:
module (`str`): The module that will be the root of the subtree we want.
- eges (`List[Tuple[str, str]]`): The list of all edges of the tree.
+ edges (`List[Tuple[str, str]]`): The list of all edges of the tree.
Returns:
`List[Union[str, List[str]]]`: The tree to print in the following format: [module, [list of edges
@@ -663,7 +663,7 @@ def print_tree_deps_of(module, all_edges=None):
Args:
module (`str`): The module that will be the root of the subtree we want.
- all_eges (`List[Tuple[str, str]]`, *optional*):
+ all_edges (`List[Tuple[str, str]]`, *optional*):
The list of all edges of the tree. Will be set to `create_reverse_dependency_tree()` if not passed.
"""
if all_edges is None:
@@ -706,7 +706,7 @@ def init_test_examples_dependencies() -> Tuple[Dict[str, List[str]], List[str]]:
for framework in ["flax", "pytorch", "tensorflow"]:
test_files = list((PATH_TO_EXAMPLES / framework).glob("test_*.py"))
all_examples.extend(test_files)
- # Remove the files at the root of examples/framework since they are not proper examples (they are eith utils
+ # Remove the files at the root of examples/framework since they are not proper examples (they are either utils
# or example test files).
examples = [
f for f in (PATH_TO_EXAMPLES / framework).glob("**/*.py") if f.parent != PATH_TO_EXAMPLES / framework