1. 06 Jul, 2023 7 commits
    • Aisuko's avatar
    • Patrick von Platen's avatar
      Improve SD XL (#3968) · 187ea539
      Patrick von Platen authored
      * improve sd xl
      
      * correct more
      
      * finish
      
      * make style
      
      * fix more
      187ea539
    • YiYi Xu's avatar
      Add Shap-E (#3742) · 45f6d52b
      YiYi Xu authored
      
      
      * refactor prior_transformer
      
      adding conversion script
      
      add pipeline
      
      add step_index from pipeline, + remove permute
      
      add zero pad token
      
      remove copy from statement for betas_for_alpha_bar function
      
      * add
      
      * add
      
      * update conversion script for renderer model
      
      * refactor camera a little bit
      
      * clean up
      
      * style
      
      * fix copies
      
      * Update src/diffusers/schedulers/scheduling_heun_discrete.py
      Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
      
      * Update src/diffusers/pipelines/shap_e/pipeline_shap_e.py
      Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
      
      * Update src/diffusers/pipelines/shap_e/pipeline_shap_e.py
      Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
      
      * alpha_transform_type
      
      * remove step_index argument
      
      * remove get_sigmas_karras
      
      * remove _yiyi_sigma_to_t
      
      * move the rescale prompt_embeds from prior_transformer to pipeline
      
      * replace baddbmm with einsum to match origial repo
      
      * Revert "replace baddbmm with einsum to match origial repo"
      
      This reverts commit 3f6b435d65dad3e5514cad2f5dd9e4419ca78e0b.
      
      * add step_index to scale_model_input
      
      * Revert "move the rescale prompt_embeds from prior_transformer to pipeline"
      
      This reverts commit 5b5a8e6be918fefd114a2945ed89d8e8fa8be21b.
      
      * move rescale from prior_transformer to pipeline
      
      * correct step_index in scale_model_input
      
      * remove print lines
      
      * refactor prior - reduce arguments
      
      * make style
      
      * add prior_image
      
      * arg embedding_proj_norm -> norm_embedding_proj
      
      * add pre-norm for proj_embedding
      
      * move rescale prompt from pipeline to _encode_prompt
      
      * add img2img pipeline
      
      * style
      
      * copies
      
      * Update src/diffusers/models/prior_transformer.py
      Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
      
      * Update src/diffusers/models/prior_transformer.py
      Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
      
      * Update src/diffusers/models/prior_transformer.py
      Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
      
      * Update src/diffusers/models/prior_transformer.py
      
      add arg: encoder_hid_proj
      Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
      
      * Update src/diffusers/models/prior_transformer.py
      
      add new config: norm_in_type
      Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
      
      * Update src/diffusers/models/prior_transformer.py
      
      add new config: added_emb_type
      Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
      
      * Update src/diffusers/models/prior_transformer.py
      
      rename out_dim -> clip_embed_dim
      Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
      
      * Update src/diffusers/models/prior_transformer.py
      
      rename config: out_dim -> clip_embed_dim
      Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
      
      * Update src/diffusers/models/prior_transformer.py
      Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
      
      * Update src/diffusers/models/prior_transformer.py
      Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
      
      * finish refactor prior_tranformer
      
      * make style
      
      * refactor renderer
      
      * fix
      
      * make style
      
      * refactor img2img
      
      * remove params_proj
      
      * add test
      
      * add upcast_softmax to prior_transformer
      
      * enable num_images_per_prompt, add save_gif utility
      
      * add
      
      * add fast test
      
      * make style
      
      * add slow test
      
      * style
      
      * add test for img2img
      
      * refactor
      
      * enable batching
      
      * style
      
      * refactor scheduler
      
      * update test
      
      * style
      
      * attempt to solve batch related tests timeout
      
      * add doc
      
      * Update src/diffusers/pipelines/shap_e/pipeline_shap_e.py
      Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
      
      * Update src/diffusers/pipelines/shap_e/pipeline_shap_e_img2img.py
      Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
      
      * hardcode rendering related config
      
      * update betas_for_alpha_bar on ddpm_scheduler
      
      * fix copies
      
      * fix
      
      * export_to_gif
      
      * style
      
      * second attempt to speed up batching tests
      
      * add doc page to index
      
      * Remove intermediate clipping
      
      * 3rd attempt to speed up batching tests
      
      * Remvoe time index
      
      * simplify scheduler
      
      * Fix more
      
      * Fix more
      
      * fix more
      
      * make style
      
      * fix schedulers
      
      * fix some more tests
      
      * finish
      
      * add one more test
      
      * Apply suggestions from code review
      Co-authored-by: default avatarSayak Paul <spsayakpaul@gmail.com>
      Co-authored-by: default avatarPedro Cuenca <pedro@huggingface.co>
      Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
      
      * style
      
      * apply feedbacks
      
      * style
      
      * fix copies
      
      * add one example
      
      * style
      
      * add example for img2img
      
      * fix doc
      
      * fix more doc strings
      
      * size -> frame_size
      
      * style
      
      * update doc
      
      * style
      
      * fix on doc
      
      * update repo name
      
      * improve the usage example in shap-e img2img
      
      * add usage examples in the shap-e docs.
      
      * consolidate examples.
      
      * minor fix.
      
      * update doc
      
      * Apply suggestions from code review
      
      * Apply suggestions from code review
      
      * remove upcast
      
      * Make sure background is white
      
      * Update src/diffusers/pipelines/shap_e/pipeline_shap_e.py
      
      * Apply suggestions from code review
      
      * Finish
      
      * Apply suggestions from code review
      
      * Update src/diffusers/pipelines/shap_e/pipeline_shap_e.py
      
      * Make style
      
      ---------
      Co-authored-by: default avataryiyixuxu <yixu310@gmail,com>
      Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
      Co-authored-by: default avatarSayak Paul <spsayakpaul@gmail.com>
      Co-authored-by: default avatarPedro Cuenca <pedro@huggingface.co>
      45f6d52b
    • YiYi Xu's avatar
      Kandinsky_v22_yiyi (#3936) · 74621567
      YiYi Xu authored
      
      
      * Kandinsky2_2
      
      * fix init kandinsky2_2
      
      * kandinsky2_2 fix inpainting
      
      * rename pipelines: remove decoder + 2_2 -> V22
      
      * Update scheduling_unclip.py
      
      * remove text_encoder and tokenizer arguments from doc string
      
      * add test for text2img
      
      * add tests for text2img & img2img
      
      * fix
      
      * add test for inpaint
      
      * add prior tests
      
      * style
      
      * copies
      
      * add controlnet test
      
      * style
      
      * add a test for controlnet_img2img
      
      * update prior_emb2emb api to accept image_embedding or image
      
      * add a test for prior_emb2emb
      
      * style
      
      * remove try except
      
      * example
      
      * fix
      
      * add doc string examples to all kandinsky pipelines
      
      * style
      
      * update doc
      
      * style
      
      * add a top about 2.2
      
      * Apply suggestions from code review
      Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
      
      * vae -> movq
      
      * vae -> movq
      
      * style
      
      * fix the #copied from
      
      * remove decoder from file name
      
      * update doc: add a section for kandinsky 2.2
      
      * fix
      
      * fix-copies
      
      * add coped from
      
      * add copies from for prior
      
      * add copies from for prior emb2emb
      
      * copy from for img2img
      
      * copied from for inpaint
      
      * more copied from
      
      * more copies from
      
      * more copies
      
      * remove the yiyi comments
      
      * Apply suggestions from code review
      
      * Self-contained example, pipeline order
      
      * Import prior output instead of redefining.
      
      * Style
      
      * Make VQModel compatible with model offload.
      
      * Fix copies
      
      ---------
      Co-authored-by: default avatarShahmatov Arseniy <62886550+cene555@users.noreply.github.com>
      Co-authored-by: default avataryiyixuxu <yixu310@gmail,com>
      Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
      Co-authored-by: default avatarPedro Cuenca <pedro@huggingface.co>
      74621567
    • Patrick von Platen's avatar
      [SD-XL] Add new pipelines (#3859) · bc9a8cef
      Patrick von Platen authored
      
      
      * Add new text encoder
      
      * add transformers depth
      
      * More
      
      * Correct conversion script
      
      * Fix more
      
      * Fix more
      
      * Correct more
      
      * correct text encoder
      
      * Finish all
      
      * proof that in works in run local xl
      
      * clean up
      
      * Get refiner to work
      
      * Add red castle
      
      * Fix batch size
      
      * Improve pipelines more
      
      * Finish text2image tests
      
      * Add img2img test
      
      * Fix more
      
      * fix import
      
      * Fix embeddings for classic models (#3888)
      
      Fix embeddings for classic SD models.
      
      * Allow multiple prompts to be passed to the refiner (#3895)
      
      * finish more
      
      * Apply suggestions from code review
      
      * add watermarker
      
      * Model offload (#3889)
      
      * Model offload.
      
      * Model offload for refiner / img2img
      
      * Hardcode encoder offload on img2img vae encode
      
      Saves some GPU RAM in img2img / refiner tasks so it remains below 8 GB.
      
      ---------
      Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
      
      * correct
      
      * fix
      
      * clean print
      
      * Update install warning for `invisible-watermark`
      
      * add: missing docstrings.
      
      * fix and simplify the usage example in img2img.
      
      * fix setup for watermarking.
      
      * Revert "fix setup for watermarking."
      
      This reverts commit 491bc9f5a640bbf46a97a8e52d6eff7e70eb8e4b.
      
      * fix: watermarking setup.
      
      * fix: op.
      
      * run make fix-copies.
      
      * make sure tests pass
      
      * improve convert
      
      * make tests pass
      
      * make tests pass
      
      * better error message
      
      * fiinsh
      
      * finish
      
      * Fix final test
      
      ---------
      Co-authored-by: default avatarPedro Cuenca <pedro@huggingface.co>
      Co-authored-by: default avatarSayak Paul <spsayakpaul@gmail.com>
      bc9a8cef
    • Sayak Paul's avatar
      [Consistency Models] correct checkpoint url in the doc (#3962) · 46af9826
      Sayak Paul authored
      correct checkpoint url.
      46af9826
    • Sayak Paul's avatar
      Update consistency_models.mdx (#3961) · 41ea88f3
      Sayak Paul authored
      41ea88f3
  2. 05 Jul, 2023 1 commit
    • dg845's avatar
      Add Consistency Models Pipeline (#3492) · aed7499a
      dg845 authored
      
      
      * initial commit
      
      * Improve consistency models sampling implementation.
      
      * Add CMStochasticIterativeScheduler, which implements the multi-step sampler (stochastic_iterative_sampler) in the original code, and make further improvements to sampling.
      
      * Add Unet blocks for consistency models
      
      * Add conversion script for Unet
      
      * Fix bug in new unet blocks
      
      * Fix attention weight loading
      
      * Make design improvements to ConsistencyModelPipeline and CMStochasticIterativeScheduler and add initial version of tests.
      
      * make style
      
      * Make small random test UNet class conditional and set resnet_time_scale_shift to 'scale_shift' to better match consistency model checkpoints.
      
      * Add support for converting a test UNet and non-class-conditional UNets to the consistency models conversion script.
      
      * make style
      
      * Change num_class_embeds to 1000 to better match the original consistency models implementation.
      
      * Add support for distillation in pipeline_consistency_models.py.
      
      * Improve consistency model tests:
      	- Get small testing checkpoints from hub
      	- Modify tests to take into account "distillation" parameter of ConsistencyModelPipeline
      	- Add onestep, multistep tests for distillation and distillation + class conditional
      	- Add expected image slices for onestep tests
      
      * make style
      
      * Improve ConsistencyModelPipeline:
      	- Add initial support for class-conditional generation
      	- Fix initial sigma for onestep generation
      	- Fix some sigma shape issues
      
      * make style
      
      * Improve ConsistencyModelPipeline:
      	- add latents __call__ argument and prepare_latents method
      	- add check_inputs method
      	- add initial docstrings for ConsistencyModelPipeline.__call__
      
      * make style
      
      * Fix bug when randomly generating class labels for class-conditional generation.
      
      * Switch CMStochasticIterativeScheduler to configuring a sigma schedule and make related changes to the pipeline and tests.
      
      * Remove some unused code and make style.
      
      * Fix small bug in CMStochasticIterativeScheduler.
      
      * Add expected slices for multistep sampling tests and make them pass.
      
      * Work on consistency model fast tests:
      	- in pipeline, call self.scheduler.scale_model_input before denoising
      	- get expected slices for Euler and Heun scheduler tests
      	- make Euler test pass
      	- mark Heun test as expected fail because it doesn't support prediction_type "sample" yet
      	- remove DPM and Euler Ancestral tests because they don't support use_karras_sigmas
      
      * make style
      
      * Refactor conversion script to make it easier to add more model architectures to convert in the future.
      
      * Work on ConsistencyModelPipeline tests:
      	- Fix device bug when handling class labels in ConsistencyModelPipeline.__call__
      	- Add slow tests for onestep and multistep sampling and make them pass
      	- Refactor fast tests
      	- Refactor ConsistencyModelPipeline.__init__
      
      * make style
      
      * Remove the add_noise and add_noise_to_input methods from CMStochasticIterativeScheduler for now.
      
      * Run python utils/check_copies.py --fix_and_overwrite
      python utils/check_dummies.py --fix_and_overwrite to make dummy objects for new pipeline and scheduler.
      
      * Make fast tests from PipelineTesterMixin pass.
      
      * make style
      
      * Refactor consistency models pipeline and scheduler:
      	- Remove support for Karras schedulers (only support CMStochasticIterativeScheduler)
      	- Move sigma manipulation, input scaling, denoising from pipeline to scheduler
      	- Make corresponding changes to tests and ensure they pass
      
      * make style
      
      * Add docstrings and further refactor pipeline and scheduler.
      
      * make style
      
      * Add initial version of the consistency models documentation.
      
      * Refactor custom timesteps logic following DDPMScheduler/IFPipeline and temporarily add torch 2.0 SDPA kernel selection logic for debugging.
      
      * make style
      
      * Convert current slow tests to use fp16 and flash attention.
      
      * make style
      
      * Add slow tests for normal attention on cuda device.
      
      * make style
      
      * Fix attention weights loading
      
      * Update consistency model fast tests for new test checkpoints with attention fix.
      
      * make style
      
      * apply suggestions
      
      * Add add_noise method to CMStochasticIterativeScheduler (copied from EulerDiscreteScheduler).
      
      * Conversion script now outputs pipeline instead of UNet and add support for LSUN-256 models and different schedulers.
      
      * When both timesteps and num_inference_steps are supplied, raise warning instead of error (timesteps take precedence).
      
      * make style
      
      * Add remaining diffusers model checkpoints for models in the original consistency model release and update usage example.
      
      * apply suggestions from review
      
      * make style
      
      * fix attention naming
      
      * Add tests for CMStochasticIterativeScheduler.
      
      * make style
      
      * Make CMStochasticIterativeScheduler tests pass.
      
      * make style
      
      * Override test_step_shape in CMStochasticIterativeSchedulerTest instead of modifying it in SchedulerCommonTest.
      
      * make style
      
      * rename some models
      
      * Improve API
      
      * rename some models
      
      * Remove duplicated block
      
      * Add docstring and make torch compile work
      
      * More fixes
      
      * Fixes
      
      * Apply suggestions from code review
      
      * Apply suggestions from code review
      
      * add more docstring
      
      * update consistency conversion script
      
      ---------
      Co-authored-by: default avatarayushmangal <ayushmangal@microsoft.com>
      Co-authored-by: default avatarAyush Mangal <43698245+ayushtues@users.noreply.github.com>
      Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
      aed7499a
  3. 03 Jul, 2023 3 commits
  4. 02 Jul, 2023 1 commit
    • Patrick von Platen's avatar
      Add video img2img (#3900) · 62825064
      Patrick von Platen authored
      * Add image to image video
      
      * Improve
      
      * better naming
      
      * make fix copies
      
      * add docs
      
      * finish tests
      
      * trigger tests
      
      * make style
      
      * correct
      
      * finish
      
      * Fix more
      
      * make style
      
      * finish
      62825064
  5. 01 Jul, 2023 1 commit
  6. 30 Jun, 2023 1 commit
    • Steven Liu's avatar
      [docs] Model API (#3562) · 174dcd69
      Steven Liu authored
      * add modelmixin and unets
      
      * remove old model page
      
      * minor fixes
      
      * fix unet2dcondition
      
      * add vqmodel and autoencoderkl
      
      * add rest of models
      
      * fix autoencoderkl path
      
      * fix toctree
      
      * fix toctree again
      
      * apply feedback
      
      * apply feedback
      
      * fix copies
      
      * fix controlnet copy
      
      * fix copies
      174dcd69
  7. 24 Jun, 2023 1 commit
  8. 22 Jun, 2023 1 commit
  9. 21 Jun, 2023 2 commits
  10. 20 Jun, 2023 3 commits
  11. 19 Jun, 2023 1 commit
  12. 16 Jun, 2023 1 commit
  13. 15 Jun, 2023 1 commit
  14. 14 Jun, 2023 1 commit
  15. 12 Jun, 2023 2 commits
  16. 09 Jun, 2023 1 commit
  17. 07 Jun, 2023 2 commits
  18. 06 Jun, 2023 4 commits
    • Patrick von Platen's avatar
      Add draft for lora text encoder scale (#3626) · 74fd735e
      Patrick von Platen authored
      
      
      * Add draft for lora text encoder scale
      
      * Improve naming
      
      * fix: training dreambooth lora script.
      
      * Apply suggestions from code review
      
      * Update examples/dreambooth/train_dreambooth_lora.py
      
      * Apply suggestions from code review
      
      * Apply suggestions from code review
      
      * add lora mixin when fit
      
      * add lora mixin when fit
      
      * add lora mixin when fit
      
      * fix more
      
      * fix more
      
      ---------
      Co-authored-by: default avatarSayak Paul <spsayakpaul@gmail.com>
      74fd735e
    • Isotr0py's avatar
      Support views batch for panorama (#3632) · 11b3002b
      Isotr0py authored
      
      
      * support views batch for panorama
      
      * add entry for the new argument
      
      * format entry for the new argument
      
      * add view_batch_size test
      
      * fix batch test and a boundary condition
      
      * add more docstrings
      
      * fix a typos
      
      * fix typos
      
      * add: entry to the doc about view_batch_size.
      
      * Revert "add: entry to the doc about view_batch_size."
      
      This reverts commit a36aeaa9edf9b662d09bbfd6e18cbc556ed38187.
      
      * add a tip on .
      
      ---------
      Co-authored-by: default avatarSayak Paul <spsayakpaul@gmail.com>
      11b3002b
    • Sayak Paul's avatar
      [LoRA] feat: add lora attention processor for pt 2.0. (#3594) · 8669e831
      Sayak Paul authored
      * feat: add lora attention processor for pt 2.0.
      
      * explicit context manager for SDPA.
      
      * switch to flash attention
      
      * make shapes compatible to work optimally with SDPA.
      
      * fix: circular import problem.
      
      * explicitly specify the flash attention kernel in sdpa
      
      * fall back to efficient attention context manager.
      
      * remove explicit dispatch.
      
      * fix: removed processor.
      
      * fix: remove optional from type annotation.
      
      * feat: make changes regarding LoRAAttnProcessor2_0.
      
      * remove confusing warning.
      
      * formatting.
      
      * relax tolerance for PT 2.0
      
      * fix: loading message.
      
      * remove unnecessary logging.
      
      * add: entry to the docs.
      
      * add: network_alpha argument.
      
      * relax tolerance.
      8669e831
    • Steven Liu's avatar
      [docs] Fix link to loader method (#3680) · a8b0f42c
      Steven Liu authored
      fix link to load_lora_weights
      a8b0f42c
  19. 05 Jun, 2023 4 commits
  20. 02 Jun, 2023 2 commits
    • Will Berman's avatar
      dreambooth if docs - stage II, more info (#3628) · 5911a3aa
      Will Berman authored
      
      
      * dreambooth if docs - stage II, more info
      
      * Update docs/source/en/training/dreambooth.mdx
      Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
      
      * Update docs/source/en/training/dreambooth.mdx
      Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
      
      * Update docs/source/en/training/dreambooth.mdx
      Co-authored-by: default avatarSayak Paul <spsayakpaul@gmail.com>
      
      * download instructions for downsized images
      
      * update source README to match docs
      
      ---------
      Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
      Co-authored-by: default avatarSayak Paul <spsayakpaul@gmail.com>
      5911a3aa
    • Takuma Mori's avatar
      Support Kohya-ss style LoRA file format (in a limited capacity) (#3437) · 8e552bb4
      Takuma Mori authored
      
      
      * add _convert_kohya_lora_to_diffusers
      
      * make style
      
      * add scaffold
      
      * match result: unet attention only
      
      * fix monkey-patch for text_encoder
      
      * with CLIPAttention
      
      While the terrible images are no longer produced,
      the results do not match those from the hook ver.
      This may be due to not setting the network_alpha value.
      
      * add to support network_alpha
      
      * generate diff image
      
      * fix monkey-patch for text_encoder
      
      * add test_text_encoder_lora_monkey_patch()
      
      * verify that it's okay to release the attn_procs
      
      * fix closure version
      
      * add comment
      
      * Revert "fix monkey-patch for text_encoder"
      
      This reverts commit bb9c61e6faecc1935c9c4319c77065837655d616.
      
      * Fix to reuse utility functions
      
      * make LoRAAttnProcessor targets to self_attn
      
      * fix LoRAAttnProcessor target
      
      * make style
      
      * fix split key
      
      * Update src/diffusers/loaders.py
      
      * remove TEXT_ENCODER_TARGET_MODULES loop
      
      * add print memory usage
      
      * remove test_kohya_loras_scaffold.py
      
      * add: doc on LoRA civitai
      
      * remove print statement and refactor in the doc.
      
      * fix state_dict test for kohya-ss style lora
      
      * Apply suggestions from code review
      Co-authored-by: default avatarTakuma Mori <takuma104@gmail.com>
      
      ---------
      Co-authored-by: default avatarSayak Paul <spsayakpaul@gmail.com>
      8e552bb4