Unverified commit 8edaf3b7 authored by Bagheera, committed by GitHub

7879 - adjust documentation to use naruto dataset, since pokemon is now gated (#7880)



* 7879 - adjust documentation to use naruto dataset, since pokemon is now gated

* replace references to pokemon in docs

* more references to pokemon replaced

* Japanese translation update

---------
Co-authored-by: bghira <bghira@users.github.com>
parent 23e09156
@@ -60,7 +60,7 @@ logger = get_logger(__name__)
 DATASET_NAME_MAPPING = {
-    "lambdalabs/pokemon-blip-captions": ("image", "text"),
+    "lambdalabs/naruto-blip-captions": ("image", "text"),
 }
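The `DATASET_NAME_MAPPING` dictionaries touched throughout this commit tell the example scripts which columns hold the images and captions for known Hub datasets. A minimal sketch of how such a mapping is typically consumed (illustrative only, not the scripts' exact code):

```python
# Illustrative sketch: resolve image/caption column names for a dataset,
# falling back to the dataset's own column order when it is not in the mapping.
DATASET_NAME_MAPPING = {
    "lambdalabs/naruto-blip-captions": ("image", "text"),
}

def resolve_columns(dataset_name: str, column_names: list[str]) -> tuple[str, str]:
    mapped = DATASET_NAME_MAPPING.get(dataset_name)
    if mapped is not None:
        return mapped
    # Hypothetical fallback: use the first two columns of the loaded dataset.
    return column_names[0], column_names[1]

print(resolve_columns("lambdalabs/naruto-blip-captions", ["image", "text"]))
# -> ('image', 'text')
```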
@@ -57,7 +57,7 @@ With `gradient_checkpointing` and `mixed_precision` it should be possible to fin
 <!-- accelerate_snippet_start -->
 ```bash
 export MODEL_NAME="CompVis/stable-diffusion-v1-4"
-export DATASET_NAME="lambdalabs/pokemon-blip-captions"
+export DATASET_NAME="lambdalabs/naruto-blip-captions"
 accelerate launch --mixed_precision="fp16" train_text_to_image.py \
   --pretrained_model_name_or_path=$MODEL_NAME \
@@ -136,7 +136,7 @@ for running distributed training with `accelerate`. Here is an example command:
 ```bash
 export MODEL_NAME="CompVis/stable-diffusion-v1-4"
-export DATASET_NAME="lambdalabs/pokemon-blip-captions"
+export DATASET_NAME="lambdalabs/naruto-blip-captions"
 accelerate launch --mixed_precision="fp16" --multi_gpu train_text_to_image.py \
   --pretrained_model_name_or_path=$MODEL_NAME \
@@ -192,7 +192,7 @@ on consumer GPUs like Tesla T4, Tesla V100.
 ### Training
-First, you need to set up your development environment as is explained in the [installation section](#installing-the-dependencies). Make sure to set the `MODEL_NAME` and `DATASET_NAME` environment variables. Here, we will use [Stable Diffusion v1-4](https://hf.co/CompVis/stable-diffusion-v1-4) and the [Pokemons dataset](https://huggingface.co/datasets/lambdalabs/pokemon-blip-captions).
+First, you need to set up your development environment as is explained in the [installation section](#installing-the-dependencies). Make sure to set the `MODEL_NAME` and `DATASET_NAME` environment variables. Here, we will use [Stable Diffusion v1-4](https://hf.co/CompVis/stable-diffusion-v1-4) and the [Naruto dataset](https://huggingface.co/datasets/lambdalabs/naruto-blip-captions).
 **___Note: Change the `resolution` to 768 if you are using the [stable-diffusion-2](https://huggingface.co/stabilityai/stable-diffusion-2) 768x768 model.___**
@@ -200,7 +200,7 @@ First, you need to set up your development environment as is explained in the [i
 ```bash
 export MODEL_NAME="CompVis/stable-diffusion-v1-4"
-export DATASET_NAME="lambdalabs/pokemon-blip-captions"
+export DATASET_NAME="lambdalabs/naruto-blip-captions"
 ```
 For this example we want to directly store the trained LoRA embeddings on the Hub, so
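Since the motivation for this commit is that the Pokémon dataset is now gated, a quick sanity check that the replacement dataset loads without authentication and exposes the expected columns can help (a hedged sketch, assuming the 🤗 `datasets` library from the example requirements):

```python
import os

from datasets import load_dataset

# Sanity check: the dataset referenced by DATASET_NAME should load anonymously
# and expose the "image"/"text" columns the training scripts expect.
dataset_name = os.environ.get("DATASET_NAME", "lambdalabs/naruto-blip-captions")
dataset = load_dataset(dataset_name, split="train")
print(dataset.column_names)  # expected: ['image', 'text']
print(dataset.num_rows)
```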
@@ -282,7 +282,7 @@ pip install -U -r requirements_flax.txt
 ```bash
 export MODEL_NAME="duongna/stable-diffusion-v1-4-flax"
-export DATASET_NAME="lambdalabs/pokemon-blip-captions"
+export DATASET_NAME="lambdalabs/naruto-blip-captions"
 python train_text_to_image_flax.py \
   --pretrained_model_name_or_path=$MODEL_NAME \
@@ -52,7 +52,7 @@ Note also that we use PEFT library as backend for LoRA training, make sure to ha
 ```bash
 export MODEL_NAME="stabilityai/stable-diffusion-xl-base-1.0"
 export VAE_NAME="madebyollin/sdxl-vae-fp16-fix"
-export DATASET_NAME="lambdalabs/pokemon-blip-captions"
+export DATASET_NAME="lambdalabs/naruto-blip-captions"
 accelerate launch train_text_to_image_sdxl.py \
   --pretrained_model_name_or_path=$MODEL_NAME \
@@ -76,7 +76,7 @@ accelerate launch train_text_to_image_sdxl.py \
 **Notes**:
-* The `train_text_to_image_sdxl.py` script pre-computes text embeddings and the VAE encodings and keeps them in memory. While for smaller datasets like [`lambdalabs/pokemon-blip-captions`](https://hf.co/datasets/lambdalabs/pokemon-blip-captions), it might not be a problem, it can definitely lead to memory problems when the script is used on a larger dataset. For those purposes, you would want to serialize these pre-computed representations to disk separately and load them during the fine-tuning process. Refer to [this PR](https://github.com/huggingface/diffusers/pull/4505) for a more in-depth discussion.
+* The `train_text_to_image_sdxl.py` script pre-computes text embeddings and the VAE encodings and keeps them in memory. While for smaller datasets like [`lambdalabs/naruto-blip-captions`](https://hf.co/datasets/lambdalabs/naruto-blip-captions), it might not be a problem, it can definitely lead to memory problems when the script is used on a larger dataset. For those purposes, you would want to serialize these pre-computed representations to disk separately and load them during the fine-tuning process. Refer to [this PR](https://github.com/huggingface/diffusers/pull/4505) for a more in-depth discussion.
 * The training script is compute-intensive and may not run on a consumer GPU like Tesla T4.
 * The training command shown above performs intermediate quality validation in between the training epochs and logs the results to Weights and Biases. `--report_to`, `--validation_prompt`, and `--validation_epochs` are the relevant CLI arguments here.
 * SDXL's VAE is known to suffer from numerical instability issues. This is why we also expose a CLI argument namely `--pretrained_vae_model_name_or_path` that lets you specify the location of a better VAE (such as [this one](https://huggingface.co/madebyollin/sdxl-vae-fp16-fix)).
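The first note above only describes serializing the pre-computed representations to disk; a rough sketch of that idea with hypothetical helper names, assuming PyTorch (the linked PR discusses a fuller approach):

```python
import os

import torch

# Rough sketch of the caching idea: write pre-computed text embeddings and VAE
# latents to disk per example, then load them lazily during fine-tuning instead
# of holding everything in memory. File layout and names here are hypothetical,
# not the training script's actual API.
CACHE_DIR = "precomputed_cache"
os.makedirs(CACHE_DIR, exist_ok=True)

def cache_example(idx: int, text_embeds: torch.Tensor, vae_latents: torch.Tensor) -> None:
    torch.save(
        {"text_embeds": text_embeds.cpu(), "vae_latents": vae_latents.cpu()},
        os.path.join(CACHE_DIR, f"{idx:08d}.pt"),
    )

def load_example(idx: int) -> dict:
    # Called lazily from the Dataset/DataLoader inside the fine-tuning loop.
    return torch.load(os.path.join(CACHE_DIR, f"{idx:08d}.pt"))
```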
@@ -142,14 +142,14 @@ on consumer GPUs like Tesla T4, Tesla V100.
 ### Training
-First, you need to set up your development environment as is explained in the [installation section](#installing-the-dependencies). Make sure to set the `MODEL_NAME` and `DATASET_NAME` environment variables and, optionally, the `VAE_NAME` variable. Here, we will use [Stable Diffusion XL 1.0-base](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0) and the [Pokemons dataset](https://huggingface.co/datasets/lambdalabs/pokemon-blip-captions).
+First, you need to set up your development environment as is explained in the [installation section](#installing-the-dependencies). Make sure to set the `MODEL_NAME` and `DATASET_NAME` environment variables and, optionally, the `VAE_NAME` variable. Here, we will use [Stable Diffusion XL 1.0-base](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0) and the [Naruto dataset](https://huggingface.co/datasets/lambdalabs/naruto-blip-captions).
 **___Note: It is quite useful to monitor the training progress by regularly generating sample images during training. [Weights and Biases](https://docs.wandb.ai/quickstart) is a nice solution to easily see generating images during training. All you need to do is to run `pip install wandb` before training to automatically log images.___**
 ```bash
 export MODEL_NAME="stabilityai/stable-diffusion-xl-base-1.0"
 export VAE_NAME="madebyollin/sdxl-vae-fp16-fix"
-export DATASET_NAME="lambdalabs/pokemon-blip-captions"
+export DATASET_NAME="lambdalabs/naruto-blip-captions"
 ```
 For this example we want to directly store the trained LoRA embeddings on the Hub, so
@@ -219,7 +219,7 @@ You need to save the mentioned configuration as an `accelerate_config.yaml` file
 ```shell
 export MODEL_NAME="stabilityai/stable-diffusion-xl-base-1.0"
 export VAE_NAME="madebyollin/sdxl-vae-fp16-fix"
-export DATASET_NAME="lambdalabs/pokemon-blip-captions"
+export DATASET_NAME="lambdalabs/naruto-blip-captions"
 export ACCELERATE_CONFIG_FILE="your accelerate_config.yaml"
 accelerate launch --config_file $ACCELERATE_CONFIG_FILE train_text_to_image_lora_sdxl.py \
@@ -62,7 +62,7 @@ check_min_version("0.28.0.dev0")
 logger = get_logger(__name__, log_level="INFO")
 DATASET_NAME_MAPPING = {
-    "lambdalabs/pokemon-blip-captions": ("image", "text"),
+    "lambdalabs/naruto-blip-captions": ("image", "text"),
 }
@@ -250,7 +250,7 @@ def parse_args():
 dataset_name_mapping = {
-    "lambdalabs/pokemon-blip-captions": ("image", "text"),
+    "lambdalabs/naruto-blip-captions": ("image", "text"),
 }
@@ -387,7 +387,7 @@ def parse_args():
 DATASET_NAME_MAPPING = {
-    "lambdalabs/pokemon-blip-captions": ("image", "text"),
+    "lambdalabs/naruto-blip-captions": ("image", "text"),
 }
@@ -454,7 +454,7 @@ def parse_args(input_args=None):
 DATASET_NAME_MAPPING = {
-    "lambdalabs/pokemon-blip-captions": ("image", "text"),
+    "lambdalabs/naruto-blip-captions": ("image", "text"),
 }
@@ -61,7 +61,7 @@ logger = get_logger(__name__)
 DATASET_NAME_MAPPING = {
-    "lambdalabs/pokemon-blip-captions": ("image", "text"),
+    "lambdalabs/naruto-blip-captions": ("image", "text"),
 }
@@ -37,7 +37,7 @@ You can fine-tune the Würstchen prior model with the `train_text_to_image_prior
 <!-- accelerate_snippet_start -->
 ```bash
-export DATASET_NAME="lambdalabs/pokemon-blip-captions"
+export DATASET_NAME="lambdalabs/naruto-blip-captions"
 accelerate launch train_text_to_image_prior.py \
   --mixed_precision="fp16" \
@@ -72,10 +72,10 @@ In a nutshell, LoRA allows adapting pretrained models by adding pairs of rank-de
 ### Prior Training
-First, you need to set up your development environment as explained in the [installation](#Running-locally-with-PyTorch) section. Make sure to set the `DATASET_NAME` environment variable. Here, we will use the [Pokemon captions dataset](https://huggingface.co/datasets/lambdalabs/pokemon-blip-captions).
+First, you need to set up your development environment as explained in the [installation](#Running-locally-with-PyTorch) section. Make sure to set the `DATASET_NAME` environment variable. Here, we will use the [Naruto captions dataset](https://huggingface.co/datasets/lambdalabs/naruto-blip-captions).
 ```bash
-export DATASET_NAME="lambdalabs/pokemon-blip-captions"
+export DATASET_NAME="lambdalabs/naruto-blip-captions"
 accelerate launch train_text_to_image_lora_prior.py \
   --mixed_precision="fp16" \
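The hunk header above notes that LoRA adapts a pretrained model by adding pairs of rank-decomposition matrices. A minimal numeric illustration of that idea (not the training code itself), assuming PyTorch:

```python
import torch

# LoRA in one picture: keep the pretrained weight W frozen and learn a low-rank
# update B @ A, so only r * (d + k) parameters are trained instead of d * k.
d, k, r = 768, 768, 4

W = torch.randn(d, k)          # frozen pretrained weight
A = torch.randn(r, k) * 0.01   # trainable, small random init
B = torch.zeros(d, r)          # trainable, zero init -> no change at step 0

W_adapted = W + B @ A          # effective weight after adaptation
print(W_adapted.shape)         # torch.Size([768, 768])
print(r * (d + k), "trainable parameters instead of", d * k)
```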
@@ -55,7 +55,7 @@ check_min_version("0.28.0.dev0")
 logger = get_logger(__name__, log_level="INFO")
 DATASET_NAME_MAPPING = {
-    "lambdalabs/pokemon-blip-captions": ("image", "text"),
+    "lambdalabs/naruto-blip-captions": ("image", "text"),
 }
@@ -56,7 +56,7 @@ check_min_version("0.28.0.dev0")
 logger = get_logger(__name__, log_level="INFO")
 DATASET_NAME_MAPPING = {
-    "lambdalabs/pokemon-blip-captions": ("image", "text"),
+    "lambdalabs/naruto-blip-captions": ("image", "text"),
 }