    Add features to the Dreambooth LoRA SDXL training script (#5508) · 6fac1369
    Linoy Tsaban authored
    
    
    * Additions:
    - support for a separate learning rate for the text encoder
    - support for the Prodigy optimizer
    - support for min-SNR gamma loss weighting
    - support for custom captions and dataset loading from the Hub
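
To illustrate the min-SNR gamma option listed above, here is a minimal sketch of the per-timestep loss weight for epsilon-prediction, min(SNR, gamma) / SNR. The helper name `min_snr_weight` is hypothetical and the code is plain Python on scalars. It is not the script's code: the training script works on tensors and obtains the SNR through compute_snr.

```python
def min_snr_weight(snr: float, gamma: float) -> float:
    """Min-SNR loss weight for epsilon-prediction: min(snr, gamma) / snr.

    Hypothetical helper for illustration only.
    """
    if snr <= 0:
        raise ValueError("snr must be positive")
    return min(snr, gamma) / snr


# High-SNR (low-noise) timesteps are down-weighted; low-SNR ones keep weight 1.
weights = [min_snr_weight(s, gamma=5.0) for s in (0.5, 5.0, 20.0)]
print(weights)  # [1.0, 1.0, 0.25]
```

Clamping the SNR at gamma keeps nearly noise-free timesteps from dominating the loss.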
    
    * adjusted --caption_column behaviour: when --caption_column is not provided, the second column of the dataset is no longer used by default
    
    * fixed --output_dir / --model_dir_name confusion
    
    * added --repeats, --adam_weight_decay_text_encoder
    + some fixes
    
    * Update examples/dreambooth/train_dreambooth_lora_sdxl.py
    Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
    
    * Update examples/dreambooth/train_dreambooth_lora_sdxl.py
    Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
    
    * Update examples/dreambooth/train_dreambooth_lora_sdxl.py
    Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
    
    * - import compute_snr from diffusers/training_utils.py
    - cluster adamw together
    - when using 'prodigy' with --train_text_encoder, if --text_encoder_lr differs from --learning_rate, the lr of the text encoder optimization params is set to --learning_rate (otherwise it errors)
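
The Prodigy rule above can be sketched as follows. This is an assumption-laden illustration, not the script's code: `build_param_groups` and its arguments are hypothetical names, and parameters are plain lists so the example runs without torch.

```python
import warnings


def build_param_groups(unet_params, text_encoder_params, learning_rate,
                       text_encoder_lr=None, optimizer_name="adamw"):
    """Return optimizer param groups (hypothetical helper for illustration)."""
    groups = [{"params": unet_params, "lr": learning_rate}]
    if text_encoder_params:
        # Fall back to the main learning rate when no separate one is given.
        te_lr = learning_rate if text_encoder_lr is None else text_encoder_lr
        if optimizer_name.lower() == "prodigy" and te_lr != learning_rate:
            # As described above: with Prodigy, the text encoder lr is
            # overridden by the main learning rate instead of erroring.
            warnings.warn(
                f"prodigy: overriding text encoder lr {te_lr} with {learning_rate}"
            )
            te_lr = learning_rate
        groups.append({"params": text_encoder_params, "lr": te_lr})
    return groups


adamw_groups = build_param_groups(["u"], ["t"], 1e-4, 5e-6, "adamw")
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    prodigy_groups = build_param_groups(["u"], ["t"], 1.0, 5e-6, "prodigy")
```

With AdamW the two groups keep distinct learning rates; with Prodigy both groups end up at --learning_rate.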
    
    * shape fixes when custom captions are used
    
    * formatting and a little cleanup
    
    * code styling
    
    * --repeats default value fixed, changed to 1
    
    * bug fix - removed redundant lines of embedding concatenation when using prior_preservation (that duplicated class_prompt embeddings)
    
    * changed dataset loading logic according to the following use cases (to avoid an unnecessary dependency on datasets):
    1. user provides --dataset_name
    2. user provides a local dir --instance_data_dir that contains a metadata .jsonl file
    3. user provides a local dir --instance_data_dir that contains only images
    In cases 1 and 2 we import datasets and use the load_dataset method; in case 3 we process the data the same way as the original script does
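
The three cases above can be sketched as a small dispatch function. The function name, the returned labels, and the rule "any *.jsonl file in the directory counts as metadata" are assumptions made for illustration; the actual script may detect metadata differently.

```python
from pathlib import Path
from typing import Optional


def choose_loading_mode(dataset_name: Optional[str],
                        instance_data_dir: Optional[str]) -> str:
    """Pick a loading path for the three use cases (hypothetical helper)."""
    if dataset_name is not None:
        return "hub_dataset"            # case 1: needs `datasets`
    if instance_data_dir is None:
        raise ValueError("provide either dataset_name or instance_data_dir")
    root = Path(instance_data_dir)
    if not root.is_dir():
        raise ValueError(f"{instance_data_dir!r} is not a directory")
    # Assumption: a top-level .jsonl file signals caption metadata.
    if any(root.glob("*.jsonl")):
        return "local_with_metadata"    # case 2: needs `datasets`
    return "local_images_only"          # case 3: no `datasets` import
```

Only the first two labels would trigger an import of datasets, which keeps the plain image-folder workflow free of that dependency.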
    
    * styling fix
    
    * arg name fix
    
    * adjusted the --repeats logic
    
    * - removed redundant arg and 'if' when loading local folder with prompts
    - updated readme template
    - some default val fixes
    - custom caption tests
    
    * image path fix for readme
    
    * code style
    
    * bug fix
    
    * --caption_column arg
    
    * readme fix
    
    ---------
    Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
    Co-authored-by: Linoy Tsaban <linoy@huggingface.co>
test_examples.py 70.6 KB