create a script to train autoencoderkl (#10605)

* create a script to train vae
* update main.py
* update train_autoencoderkl.py
* update train_autoencoderkl.py
* add a check of --pretrained_model_name_or_path and --model_config_name_or_path
* remove the comment, remove diffusers in requirements.txt, add validation_image note
* update autoencoderkl.py
* quality

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

Commit 4fa24591 · authored Jan 27, 2025 by Yuqian Hong · committed by GitHub Jan 27, 2025
3 changed files
--- a/examples/research_projects/autoencoderkl/README.md
+++ b/examples/research_projects/autoencoderkl/README.md
+# AutoencoderKL training example
+## Installing the dependencies
+Before running the scripts, make sure to install the library's training dependencies:
+**Important**
+To make sure you can successfully run the latest versions of the example scripts, we highly recommend **installing from source** and keeping the install up to date as we update the example scripts frequently and install some example-specific requirements. To do this, execute the following steps in a new virtual environment:
+```bash
+git clone https://github.com/huggingface/diffusers
+cd diffusers
+pip install .
+```
+Then cd into the example folder and run:
+```bash
+pip install -r requirements.txt
+```
+And initialize an [🤗Accelerate](https://github.com/huggingface/accelerate/) environment with:
+```bash
+accelerate config
+```
+## Training on CIFAR10
+Replace the validation images below with paths to your own images.
+```bash
+accelerate launch train_autoencoderkl.py \
+    --pretrained_model_name_or_path stabilityai/sd-vae-ft-mse \
+    --dataset_name=cifar10 \
+    --image_column=img \
+    --validation_image images/bird.jpg images/car.jpg images/dog.jpg images/frog.jpg \
+    --num_train_epochs 100 \
+    --gradient_accumulation_steps 2 \
+    --learning_rate 4.5e-6 \
+    --lr_scheduler cosine \
+    --report_to wandb
+```
+## Training on ImageNet
+```bash
+accelerate launch train_autoencoderkl.py \
+    --pretrained_model_name_or_path stabilityai/sd-vae-ft-mse \
+    --num_train_epochs 100 \
+    --gradient_accumulation_steps 2 \
+    --learning_rate 4.5e-6 \
+    --lr_scheduler cosine \
+    --report_to wandb \
+    --mixed_precision bf16 \
+    --train_data_dir /path/to/ImageNet/train \
+    --validation_image ./image.png \
+    --decoder_only
+```
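Reviewer note: both README commands rely on shell line continuations, where a stray trailing backslash is easy to introduce. Below is a minimal sketch of assembling the same invocation as an argument list in Python, assuming `accelerate` is on the PATH. `build_launch_command` and the `imagenet_options` dict are hypothetical names used for illustration and are not part of this commit; the flags and values are copied from the ImageNet example above.

```python
import shlex


def build_launch_command(script, options):
    """Return an argv list for `accelerate launch <script> ...`.

    Hypothetical helper (not part of the commit): True becomes a bare flag,
    None/False is skipped, and lists are expanded into multiple values.
    """
    cmd = ["accelerate", "launch", script]
    for name, value in options.items():
        if value is None or value is False:
            continue
        cmd.append(f"--{name}")
        if value is True:
            continue  # store_true style flag, no value follows
        if isinstance(value, (list, tuple)):
            cmd.extend(str(v) for v in value)
        else:
            cmd.append(str(value))
    return cmd


# Values taken from the ImageNet command in the README.
imagenet_options = {
    "pretrained_model_name_or_path": "stabilityai/sd-vae-ft-mse",
    "num_train_epochs": 100,
    "gradient_accumulation_steps": 2,
    "learning_rate": "4.5e-6",
    "lr_scheduler": "cosine",
    "report_to": "wandb",
    "mixed_precision": "bf16",
    "train_data_dir": "/path/to/ImageNet/train",
    "validation_image": ["./image.png"],
    "decoder_only": True,
}

command = build_launch_command("train_autoencoderkl.py", imagenet_options)
print(shlex.join(command))  # shell-quoted form, handy for logging
```

To actually start training, the list could be passed to `subprocess.run(command, check=True)` from the example folder.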
--- a/examples/research_projects/autoencoderkl/requirements.txt
+++ b/examples/research_projects/autoencoderkl/requirements.txt
+accelerate>=0.16.0
+bitsandbytes
+datasets
+huggingface_hub
+lpips
+numpy
+packaging
+Pillow
+taming_transformers
+torch
+torchvision
+tqdm
+transformers
+wandb
+xformers
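Reviewer note: since the README asks users to run `pip install -r requirements.txt` before training, a quick pre-flight check can report which of the listed distributions are still absent. This is a minimal sketch using only the standard library; `requirement_name` and `missing_distributions` are hypothetical helpers, not part of this commit. It checks presence only (the `>=0.16.0` constraint on `accelerate` is not verified), and name matching is left to `importlib.metadata`.

```python
import re
from importlib import metadata

# Contents of requirements.txt as added in this commit.
REQUIREMENTS = """\
accelerate>=0.16.0
bitsandbytes
datasets
huggingface_hub
lpips
numpy
packaging
Pillow
taming_transformers
torch
torchvision
tqdm
transformers
wandb
xformers
"""


def requirement_name(line):
    """Return the distribution name without version specifiers, or None."""
    match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line.strip())
    return match.group(0) if match else None


def missing_distributions(requirements_text):
    """List requirement names with no installed distribution metadata."""
    missing = []
    for line in requirements_text.splitlines():
        name = requirement_name(line)
        if name is None:  # blank line or comment
            continue
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


print("Missing:", missing_distributions(REQUIREMENTS))
```

An empty list suggests every listed distribution is installed in the current environment, though not necessarily at a compatible version.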
--- a/examples/research_projects/autoencoderkl/train_autoencoderkl.py
+++ b/examples/research_projects/autoencoderkl/train_autoencoderkl.py