1. 09 Jan, 2026 1 commit
  2. 19 Jun, 2025 1 commit
  3. 18 Mar, 2025 1 commit
  4. 10 May, 2024 1 commit
    • Mark Van Aken's avatar
      #7535 Update FloatTensor type hints to Tensor (#7883) · be4afa0b
      Mark Van Aken authored
      * find & replace all FloatTensors to Tensor
      
      * apply formatting
      
      * Update torch.FloatTensor to torch.Tensor in the remaining files
      
      * formatting
      
      * Fix the rest of the places where FloatTensor is used as well as in documentation
      
      * formatting
      
      * Update new file from FloatTensor to Tensor
      be4afa0b
  5. 02 Apr, 2024 2 commits
  6. 13 Mar, 2024 1 commit
    • Sayak Paul's avatar
      [LoRA] use the PyTorch classes wherever needed and start depcrecation cycles (#7204) · 531e7191
      Sayak Paul authored
      * fix PyTorch classes and start deprecsation cycles.
      
      * remove args crafting for accommodating scale.
      
      * remove scale check in feedforward.
      
      * assert against nn.Linear and not CompatibleLinear.
      
      * remove conv_cls and lineaR_cls.
      
      * remove scale
      
      * 👋
      
       scale.
      
      * fix: unet2dcondition
      
      * fix attention.py
      
      * fix: attention.py again
      
      * fix: unet_2d_blocks.
      
      * fix-copies.
      
      * more fixes.
      
      * fix: resnet.py
      
      * more fixes
      
      * fix i2vgenxl unet.
      
      * depcrecate scale gently.
      
      * fix-copies
      
      * Apply suggestions from code review
      Co-authored-by: default avatarYiYi Xu <yixu310@gmail.com>
      
      * quality
      
      * throw warning when scale is passed to the the BasicTransformerBlock class.
      
      * remove scale from signature.
      
      * cross_attention_kwargs, very nice catch by Yiyi
      
      * fix: logger.warn
      
      * make deprecation message clearer.
      
      * address final comments.
      
      * maintain same depcrecation message and also add it to activations.
      
      * address yiyi
      
      * fix copies
      
      * Apply suggestions from code review
      Co-authored-by: default avatarYiYi Xu <yixu310@gmail.com>
      
      * more depcrecation
      
      * fix-copies
      
      ---------
      Co-authored-by: default avatarYiYi Xu <yixu310@gmail.com>
      531e7191
  7. 08 Feb, 2024 1 commit
  8. 31 Jan, 2024 1 commit
  9. 10 Jan, 2024 1 commit
  10. 20 Dec, 2023 1 commit
  11. 29 Nov, 2023 1 commit
    • Suraj Patil's avatar
      Add SVD (#5895) · 63f767ef
      Suraj Patil authored
      
      
      * begin model
      
      * finish blocks
      
      * add_embedding
      
      * addition_time_embed_dim
      
      * use TimestepEmbedding
      
      * fix temporal res block
      
      * fix time_pos_embed
      
      * fix add_embedding
      
      * add conversion script
      
      * fix model
      
      * up
      
      * add new resnet blocks
      
      * make forward work
      
      * return sample in original shape
      
      * fix temb shape in TemporalResnetBlock
      
      * add spatio temporal transformers
      
      * add vae blocks
      
      * fix blocks
      
      * update
      
      * update
      
      * fix shapes in Alphablender and add time activation in res blcok
      
      * use new blocks
      
      * style
      
      * fix temb shape
      
      * fix SpatioTemporalResBlock
      
      * reuse TemporalBasicTransformerBlock
      
      * fix TemporalBasicTransformerBlock
      
      * use TransformerSpatioTemporalModel
      
      * fix TransformerSpatioTemporalModel
      
      * fix time_context dim
      
      * clean up
      
      * make temb optional
      
      * add blocks
      
      * rename model
      
      * update conversion script
      
      * remove UNetMidBlockSpatioTemporal
      
      * add in init
      
      * remove unused arg
      
      * remove unused arg
      
      * remove more unsed args
      
      * up
      
      * up
      
      * check for None
      
      * update vae
      
      * update up/mid blocks for decoder
      
      * begin pipeline
      
      * adapt scheduler
      
      * add guidance scalings
      
      * fix norm eps in temporal transformers
      
      * add temporal autoencoder
      
      * make pipeline run
      
      * fix frame decodig
      
      * decode in float32
      
      * decode n frames at a time
      
      * pass decoding_t to decode_latents
      
      * fix decode_latents
      
      * vae encode/decode in fp32
      
      * fix dtype in TransformerSpatioTemporalModel
      
      * type image_latents same as image_embeddings
      
      * allow using differnt eps in temporal block for video decoder
      
      * fix default values in vae
      
      * pass num frames in decode
      
      * switch spatial to temporal for mixing in VAE
      
      * fix num frames during split decoding
      
      * cast alpha to sample dtype
      
      * fix attention in MidBlockTemporalDecoder
      
      * fix typo
      
      * fix guidance_scales dtype
      
      * fix missing activation in TemporalDecoder
      
      * skip_post_quant_conv
      
      * add vae conversion
      
      * style
      
      * take guidance scale as input
      
      * up
      
      * allow passing PIL to export_video
      
      * accept fps as arg
      
      * add pipeline and vae in init
      
      * remove hack
      
      * use AutoencoderKLTemporalDecoder
      
      * don't scale image latents
      
      * add unet tests
      
      * clean up unet
      
      * clean TransformerSpatioTemporalModel
      
      * add slow svd test
      
      * clean up
      
      * make temb optional in Decoder mid block
      
      * fix norm eps in TransformerSpatioTemporalModel
      
      * clean up temp decoder
      
      * clean up
      
      * clean up
      
      * use c_noise values for timesteps
      
      * use math for log
      
      * update
      
      * fix copies
      
      * doc
      
      * upcast vae
      
      * update forward pass for gradient checkpointing
      
      * make added_time_ids is tensor
      
      * up
      
      * fix upcasting
      
      * remove post quant conv
      
      * add _resize_with_antialiasing
      
      * fix _compute_padding
      
      * cleanup model
      
      * more cleanup
      
      * more cleanup
      
      * more cleanup
      
      * remove freeu
      
      * remove attn slice
      
      * small clean
      
      * up
      
      * up
      
      * remove extra step kwargs
      
      * remove eta
      
      * remove dropout
      
      * remove callback
      
      * remove merge factor args
      
      * clean
      
      * clean up
      
      * move to dedicated folder
      
      * remove attention_head_dim
      
      * docstr and small fix
      
      * update unet doc strings
      
      * rename decoding_t
      
      * correct linting
      
      * store c_skip and c_out
      
      * cleanup
      
      * clean TemporalResnetBlock
      
      * more cleanup
      
      * clean up vae
      
      * clean up
      
      * begin doc
      
      * more cleanup
      
      * up
      
      * up
      
      * doc
      
      * Improve
      
      * better naming
      
      * better naming
      
      * better naming
      
      * better naming
      
      * better naming
      
      * better naming
      
      * better naming
      
      * better naming
      
      * Apply suggestions from code review
      
      * Default chunk size to None
      
      * add example
      
      * Better
      
      * Apply suggestions from code review
      
      * update doc
      
      * Update src/diffusers/pipelines/stable_diffusion_video/pipeline_stable_diffusion_video.py
      Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
      
      * style
      
      * Get torch compile working
      
      * up
      
      * rename
      
      * fix doc
      
      * add chunking
      
      * torch compile
      
      * torch compile
      
      * add modelling outputs
      
      * torch compile
      
      * Improve chunking
      
      * Apply suggestions from code review
      
      * Update docs/source/en/using-diffusers/svd.md
      
      * Close diff tag
      
      * remove slicing
      
      * resnet docstr
      
      * add docstr in resnet
      
      * rename
      
      * Apply suggestions from code review
      
      * update tests
      
      * Fix output type latents
      
      * fix more
      
      * fix more
      
      * Update docs/source/en/using-diffusers/svd.md
      
      * fix more
      
      * add pipeline tests
      
      * remove unused arg
      
      * clean  up
      
      * make sure get_scaling receives tensors
      
      * fix euler scheduler
      
      * fix get_scalings
      
      * simply euler for now
      
      * remove old test file
      
      * use randn_tensor to create noise
      
      * fix device for rand tensor
      
      * increase expected_max_difference
      
      * fix test_inference_batch_single_identical
      
      * actually fix test_inference_batch_single_identical
      
      * disable test_save_load_float16
      
      * skip test_float16_inference
      
      * skip test_inference_batch_single_identical
      
      * fix test_xformers_attention_forwardGenerator_pass
      
      * Apply suggestions from code review
      
      * update StableVideoDiffusionPipelineSlowTests
      
      * update image
      
      * add diffusers example
      
      * fix more
      
      ---------
      Co-authored-by: default avatarDhruv Nair <dhruv.nair@gmail.com>
      Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
      Co-authored-by: default avatarapolinário <joaopaulo.passos@gmail.com>
      63f767ef
  12. 16 Nov, 2023 1 commit
  13. 15 Nov, 2023 1 commit
  14. 08 Nov, 2023 1 commit
    • Chi's avatar
      Replacing the nn.Mish activation function with a get_activation function. (#5651) · 9ae90593
      Chi authored
      
      
      * I added a new doc string to the class. This is more flexible to understanding other developers what are doing and where it's using.
      
      * Update src/diffusers/models/unet_2d_blocks.py
      
      This changes suggest by maintener.
      Co-authored-by: default avatarSayak Paul <spsayakpaul@gmail.com>
      
      * Update src/diffusers/models/unet_2d_blocks.py
      
      Add suggested text
      Co-authored-by: default avatarSayak Paul <spsayakpaul@gmail.com>
      
      * Update unet_2d_blocks.py
      
      I changed the Parameter to Args text.
      
      * Update unet_2d_blocks.py
      
      proper indentation set in this file.
      
      * Update unet_2d_blocks.py
      
      a little bit of change in the act_fun argument line.
      
      * I run the black command to reformat style in the code
      
      * Update unet_2d_blocks.py
      
      similar doc-string add to have in the original diffusion repository.
      
      * I removed the dummy variable defined in both the encoder and decoder.
      
      * Now, I run black package to reformat my file
      
      * Remove the redundant line from the adapter.py file.
      
      * Black package using to reformated my file
      
      * Replacing the nn.Mish activation function with a get_activation function allows developers to more easily choose the right activation function for their task. Additionally, removing redundant variables can improve code readability and maintainability.
      
      * I try to fix this: Fast tests for PRs / Fast PyTorch Models & Schedulers CPU tests (pull_request)
      
      * Update src/diffusers/models/resnet.py
      Co-authored-by: default avatarYiYi Xu <yixu310@gmail.com>
      
      ---------
      Co-authored-by: default avatarSayak Paul <spsayakpaul@gmail.com>
      Co-authored-by: default avatarDhruv Nair <dhruv.nair@gmail.com>
      Co-authored-by: default avatarYiYi Xu <yixu310@gmail.com>
      9ae90593
  15. 24 Oct, 2023 1 commit
  16. 13 Oct, 2023 1 commit
  17. 09 Oct, 2023 1 commit
  18. 04 Sep, 2023 1 commit
    • Sayak Paul's avatar
      [Core] LoRA improvements pt. 3 (#4842) · c81a88b2
      Sayak Paul authored
      
      
      * throw warning when more than one lora is attempted to be fused.
      
      * introduce support of lora scale during fusion.
      
      * change test name
      
      * changes
      
      * change to _lora_scale
      
      * lora_scale to call whenever applicable.
      
      * debugging
      
      * lora_scale additional.
      
      * cross_attention_kwargs
      
      * lora_scale -> scale.
      
      * lora_scale fix
      
      * lora_scale in patched projection.
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * styling.
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * remove unneeded prints.
      
      * remove unneeded prints.
      
      * assign cross_attention_kwargs.
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * clean up.
      
      * refactor scale retrieval logic a bit.
      
      * fix nonetypw
      
      * fix: tests
      
      * add more tests
      
      * more fixes.
      
      * figure out a way to pass lora_scale.
      
      * Apply suggestions from code review
      Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
      
      * unify the retrieval logic of lora_scale.
      
      * move adjust_lora_scale_text_encoder to lora.py.
      
      * introduce dynamic adjustment lora scale support to sd
      
      * fix up copies
      
      * Empty-Commit
      
      * add: test to check fusion equivalence on different scales.
      
      * handle lora fusion warning.
      
      * make lora smaller
      
      * make lora smaller
      
      * make lora smaller
      
      ---------
      Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
      c81a88b2
  19. 28 Jul, 2023 1 commit
    • Sayak Paul's avatar
      [Feat] Support SDXL Kohya-style LoRA (#4287) · 4a4cdd6b
      Sayak Paul authored
      
      
      * sdxl lora changes.
      
      * better name replacement.
      
      * better replacement.
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * remove print.
      
      * print state dict keys.
      
      * print
      
      * distingisuih better
      
      * debuggable.
      
      * fxi: tyests
      
      * fix: arg from training script.
      
      * access from class.
      
      * run style
      
      * debug
      
      * save intermediate
      
      * some simplifications for SDXL LoRA
      
      * styling
      
      * unet config is not needed in diffusers format.
      
      * fix: dynamic SGM block mapping for SDXL kohya loras (#4322)
      
      * Use lora compatible layers for linear proj_in/proj_out (#4323)
      
      * improve condition for using the sgm_diffusers mapping
      
      * informative comment.
      
      * load compatible keys and embedding layer maaping.
      
      * Get SDXL 1.0 example lora to load
      
      * simplify
      
      * specif ranks and hidden sizes.
      
      * better handling of k rank and hidden
      
      * debug
      
      * debug
      
      * debug
      
      * debug
      
      * debug
      
      * fix: alpha keys
      
      * add check for handling LoRAAttnAddedKVProcessor
      
      * sanity comment
      
      * modifications for text encoder SDXL
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * denugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * up
      
      * up
      
      * up
      
      * up
      
      * up
      
      * up
      
      * unneeded comments.
      
      * unneeded comments.
      
      * kwargs for the other attention processors.
      
      * kwargs for the other attention processors.
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * improve
      
      * debugging
      
      * debugging
      
      * more print
      
      * Fix alphas
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * debugging
      
      * clean up
      
      * clean up.
      
      * debugging
      
      * fix: text
      
      ---------
      Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
      Co-authored-by: default avatarBatuhan Taskaya <batuhan@python.org>
      4a4cdd6b
  20. 28 Jun, 2023 1 commit
  21. 05 Jun, 2023 1 commit
  22. 26 May, 2023 1 commit
  23. 25 May, 2023 1 commit
  24. 24 May, 2023 1 commit
  25. 16 May, 2023 1 commit
  26. 10 Apr, 2023 1 commit
  27. 22 Mar, 2023 1 commit
    • Patrick von Platen's avatar
      [MS Text To Video] Add first text to video (#2738) · ca1a2229
      Patrick von Platen authored
      
      
      * [MS Text To Video} Add first text to video
      
      * upload
      
      * make first model example
      
      * match unet3d params
      
      * make sure weights are correcctly converted
      
      * improve
      
      * forward pass works, but diff result
      
      * make forward work
      
      * fix more
      
      * finish
      
      * refactor video output class.
      
      * feat: add support for a video export utility.
      
      * fix: opencv availability check.
      
      * run make fix-copies.
      
      * add: docs for the model components.
      
      * add: standalone pipeline doc.
      
      * edit docstring of the pipeline.
      
      * add: right path to TransformerTempModel
      
      * add: first set of tests.
      
      * complete fast tests for text to video.
      
      * fix bug
      
      * up
      
      * three fast tests failing.
      
      * add: note on slow tests
      
      * make work with all schedulers
      
      * apply styling.
      
      * add slow tests
      
      * change file name
      
      * update
      
      * more correction
      
      * more fixes
      
      * finish
      
      * up
      
      * Apply suggestions from code review
      
      * up
      
      * finish
      
      * make copies
      
      * fix pipeline tests
      
      * fix more tests
      
      * Apply suggestions from code review
      Co-authored-by: default avatarPedro Cuenca <pedro@huggingface.co>
      
      * apply suggestions
      
      * up
      
      * revert
      
      ---------
      Co-authored-by: default avatarSayak Paul <spsayakpaul@gmail.com>
      Co-authored-by: default avatarPedro Cuenca <pedro@huggingface.co>
      ca1a2229
  28. 21 Mar, 2023 1 commit
  29. 07 Feb, 2023 1 commit
    • YiYi Xu's avatar
      Stable Diffusion Latent Upscaler (#2059) · 1051ca81
      YiYi Xu authored
      
      
      * Modify UNet2DConditionModel
      
      - allow skipping mid_block
      
      - adding a norm_group_size argument so that we can set the `num_groups` for group norm using `num_channels//norm_group_size`
      
      - allow user to set dimension for the timestep embedding (`time_embed_dim`)
      
      - the kernel_size for `conv_in` and `conv_out` is now configurable
      
      - add random fourier feature layer (`GaussianFourierProjection`) for `time_proj`
      
      - allow user to add the time and class embeddings before passing through the projection layer together - `time_embedding(t_emb + class_label))`
      
      - added 2 arguments `attn1_types` and `attn2_types`
      
        * currently we have argument `only_cross_attention`: when it's set to `True`, we will have a to the
      `BasicTransformerBlock` block with 2 cross-attention , otherwise we
      get a self-attention followed by a cross-attention; in k-upscaler, we need to have blocks that include just one cross-attention, or self-attention -> cross-attention;
      so I added `attn1_types` and `attn2_types` to the unet's argument list to allow user specify the attention types for the 2 positions in each block;  note that I stil kept
      the `only_cross_attention` argument for unet for easy configuration, but it will be converted to `attn1_type` and `attn2_type` when passing down to the down blocks
      
      - the position of downsample layer and upsample layer is now configurable
      
      - in k-upscaler unet, there is only one skip connection per each up/down block (instead of each layer in stable diffusion unet), added `skip_freq = "block"` to support
      this use case
      
      - if user passes attention_mask to unet, it will prepare the mask and pass a flag to cross attention processer to skip the `prepare_attention_mask` step
      inside cross attention block
      
      add up/down blocks for k-upscaler
      
      modify CrossAttention class
      
      - make the `dropout` layer in `to_out` optional
      
      - `use_conv_proj` - use conv instead of linear for all projection layers (i.e. `to_q`, `to_k`, `to_v`, `to_out`) whenever possible. note that when it's used to do cross
      attention, to_k, to_v has to be linear because the `encoder_hidden_states` is not 2d
      
      - `cross_attention_norm` - add an optional layernorm on encoder_hidden_states
      
      - `attention_dropout`: add an optional dropout on attention score
      
      adapt BasicTransformerBlock
      
      - add an ada groupnorm layer  to conditioning attention input with timestep embedding
      
      - allow skipping the FeedForward layer in between the attentions
      
      - replaced the only_cross_attention argument with attn1_type and attn2_type for more flexible configuration
      
      update timestep embedding: add new act_fn  gelu and an optional act_2
      
      modified ResnetBlock2D
      
      - refactored with AdaGroupNorm class (the timestep scale shift normalization)
      
      - add `mid_channel` argument - allow the first conv to have a different output dimension from the second conv
      
      - add option to use input AdaGroupNorm on the input instead of groupnorm
      
      - add options to add a dropout layer after each conv
      
      - allow user to set the bias in conv_shortcut (needed for k-upscaler)
      
      - add gelu
      
      adding conversion script for k-upscaler unet
      
      add pipeline
      
      * fix attention mask
      
      * fix a typo
      
      * fix a bug
      
      * make sure model can be used with GPU
      
      * make pipeline work with fp16
      
      * fix an error in BasicTransfomerBlock
      
      * make style
      
      * fix typo
      
      * some more fixes
      
      * uP
      
      * up
      
      * correct more
      
      * some clean-up
      
      * clean time proj
      
      * up
      
      * uP
      
      * more changes
      
      * remove the upcast_attention=True from unet config
      
      * remove attn1_types, attn2_types etc
      
      * fix
      
      * revert incorrect changes up/down samplers
      
      * make style
      
      * remove outdated files
      
      * Apply suggestions from code review
      
      * attention refactor
      
      * refactor cross attention
      
      * Apply suggestions from code review
      
      * update
      
      * up
      
      * update
      
      * Apply suggestions from code review
      
      * finish
      
      * Update src/diffusers/models/cross_attention.py
      
      * more fixes
      
      * up
      
      * up
      
      * up
      
      * finish
      
      * more corrections of conversion state
      
      * act_2 -> act_2_fn
      
      * remove dropout_after_conv from ResnetBlock2D
      
      * make style
      
      * simplify KAttentionBlock
      
      * add fast test for latent upscaler pipeline
      
      * add slow test
      
      * slow test fp16
      
      * make style
      
      * add doc string for pipeline_stable_diffusion_latent_upscale
      
      * add api doc page for latent upscaler pipeline
      
      * deprecate attention mask
      
      * clean up embeddings
      
      * simplify resnet
      
      * up
      
      * clean up resnet
      
      * up
      
      * correct more
      
      * up
      
      * up
      
      * improve a bit more
      
      * correct more
      
      * more clean-ups
      
      * Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx
      Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
      
      * Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx
      Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
      
      * add docstrings for new unet config
      
      * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py
      Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
      
      * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py
      Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
      
      * # Copied from
      
      * encode the image if not latent
      
      * remove force casting vae to fp32
      
      * fix
      
      * add comments about preconditioning parameters from k-diffusion paper
      
      * attn1_type, attn2_type -> add_self_attention
      
      * clean up get_down_block and get_up_block
      
      * fix
      
      * fixed a typo(?) in ada group norm
      
      * update slice attention processer for cross attention
      
      * update slice
      
      * fix fast test
      
      * update the checkpoint
      
      * finish tests
      
      * fix-copies
      
      * fix-copy for modeling_text_unet.py
      
      * make style
      
      * make style
      
      * fix f-string
      
      * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py
      Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
      
      * fix import
      
      * correct changes
      
      * fix resnet
      
      * make fix-copies
      
      * correct euler scheduler
      
      * add missing #copied from for preprocess
      
      * revert
      
      * fix
      
      * fix copies
      
      * Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx
      Co-authored-by: default avatarPedro Cuenca <pedro@huggingface.co>
      
      * Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx
      Co-authored-by: default avatarPedro Cuenca <pedro@huggingface.co>
      
      * Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx
      Co-authored-by: default avatarPedro Cuenca <pedro@huggingface.co>
      
      * Update docs/source/en/api/pipelines/stable_diffusion/latent_upscale.mdx
      Co-authored-by: default avatarPedro Cuenca <pedro@huggingface.co>
      
      * Update src/diffusers/models/cross_attention.py
      Co-authored-by: default avatarPedro Cuenca <pedro@huggingface.co>
      
      * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py
      Co-authored-by: default avatarPedro Cuenca <pedro@huggingface.co>
      
      * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py
      Co-authored-by: default avatarPedro Cuenca <pedro@huggingface.co>
      
      * clean up conversion script
      
      * KDownsample2d,KUpsample2d -> KDownsample2D,KUpsample2D
      
      * more
      
      * Update src/diffusers/models/unet_2d_condition.py
      Co-authored-by: default avatarPedro Cuenca <pedro@huggingface.co>
      
      * remove prepare_extra_step_kwargs
      
      * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py
      Co-authored-by: default avatarPedro Cuenca <pedro@huggingface.co>
      
      * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py
      Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
      
      * fix a typo in timestep embedding
      
      * remove num_image_per_prompt
      
      * fix fasttest
      
      * make style + fix-copies
      
      * fix
      
      * fix xformer test
      
      * fix style
      
      * doc string
      
      * make style
      
      * fix-copies
      
      * docstring for time_embedding_norm
      
      * make style
      
      * final finishes
      
      * make fix-copies
      
      * fix tests
      
      ---------
      Co-authored-by: default avataryiyixuxu <yixu@yis-macbook-pro.lan>
      Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
      Co-authored-by: default avatarPedro Cuenca <pedro@huggingface.co>
      1051ca81
  30. 18 Dec, 2022 1 commit
    • Will Berman's avatar
      kakaobrain unCLIP (#1428) · 2dcf64b7
      Will Berman authored
      
      
      * [wip] attention block updates
      
      * [wip] unCLIP unet decoder and super res
      
      * [wip] unCLIP prior transformer
      
      * [wip] scheduler changes
      
      * [wip] text proj utility class
      
      * [wip] UnCLIPPipeline
      
      * [wip] kakaobrain unCLIP convert script
      
      * [unCLIP pipeline] fixes re: @patrickvonplaten
      
      remove callbacks
      
      move denoising loops into call function
      
      * UNCLIPScheduler re: @patrickvonplaten
      
      Revert changes to DDPMScheduler. Make UNCLIPScheduler, a modified
      DDPM scheduler with changes to support karlo
      
      * mask -> attention_mask re: @patrickvonplaten
      
      * [DDPMScheduler] remove leftover change
      
      * [docs] PriorTransformer
      
      * [docs] UNet2DConditionModel and UNet2DModel
      
      * [nit] UNCLIPScheduler -> UnCLIPScheduler
      
      matches existing unclip naming better
      
      * [docs] SchedulingUnCLIP
      
      * [docs] UnCLIPTextProjModel
      
      * refactor
      
      * finish licenses
      
      * rename all to attention_mask and prep in models
      
      * more renaming
      
      * don't expose unused configs
      
      * final renaming fixes
      
      * remove x attn mask when not necessary
      
      * configure kakao script to use new class embedding config
      
      * fix copies
      
      * [tests] UnCLIPScheduler
      
      * finish x attn
      
      * finish
      
      * remove more
      
      * rename condition blocks
      
      * clean more
      
      * Apply suggestions from code review
      
      * up
      
      * fix
      
      * [tests] UnCLIPPipelineFastTests
      
      * remove unused imports
      
      * [tests] UnCLIPPipelineIntegrationTests
      
      * correct
      
      * make style
      Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
      2dcf64b7
  31. 14 Nov, 2022 1 commit
    • Nathan Lambert's avatar
      Add UNet 1d for RL model for planning + colab (#105) · 7c5fef81
      Nathan Lambert authored
      
      
      * re-add RL model code
      
      * match model forward api
      
      * add register_to_config, pass training tests
      
      * fix tests, update forward outputs
      
      * remove unused code, some comments
      
      * add to docs
      
      * remove extra embedding code
      
      * unify time embedding
      
      * remove conv1d output sequential
      
      * remove sequential from conv1dblock
      
      * style and deleting duplicated code
      
      * clean files
      
      * remove unused variables
      
      * clean variables
      
      * add 1d resnet block structure for downsample
      
      * rename as unet1d
      
      * fix renaming
      
      * rename files
      
      * add get_block(...) api
      
      * unify args for model1d like model2d
      
      * minor cleaning
      
      * fix docs
      
      * improve 1d resnet blocks
      
      * fix tests, remove permuts
      
      * fix style
      
      * add output activation
      
      * rename flax blocks file
      
      * Add Value Function and corresponding example script to Diffuser implementation (#884)
      
      * valuefunction code
      
      * start example scripts
      
      * missing imports
      
      * bug fixes and placeholder example script
      
      * add value function scheduler
      
      * load value function from hub and get best actions in example
      
      * very close to working example
      
      * larger batch size for planning
      
      * more tests
      
      * merge unet1d changes
      
      * wandb for debugging, use newer models
      
      * success!
      
      * turns out we just need more diffusion steps
      
      * run on modal
      
      * merge and code cleanup
      
      * use same api for rl model
      
      * fix variance type
      
      * wrong normalization function
      
      * add tests
      
      * style
      
      * style and quality
      
      * edits based on comments
      
      * style and quality
      
      * remove unused var
      
      * hack unet1d into a value function
      
      * add pipeline
      
      * fix arg order
      
      * add pipeline to core library
      
      * community pipeline
      
      * fix couple shape bugs
      
      * style
      
      * Apply suggestions from code review
      Co-authored-by: default avatarNathan Lambert <nathan@huggingface.co>
      
      * update post merge of scripts
      
      * add mdiblock / outblock architecture
      
      * Pipeline cleanup (#947)
      
      * valuefunction code
      
      * start example scripts
      
      * missing imports
      
      * bug fixes and placeholder example script
      
      * add value function scheduler
      
      * load value function from hub and get best actions in example
      
      * very close to working example
      
      * larger batch size for planning
      
      * more tests
      
      * merge unet1d changes
      
      * wandb for debugging, use newer models
      
      * success!
      
      * turns out we just need more diffusion steps
      
      * run on modal
      
      * merge and code cleanup
      
      * use same api for rl model
      
      * fix variance type
      
      * wrong normalization function
      
      * add tests
      
      * style
      
      * style and quality
      
      * edits based on comments
      
      * style and quality
      
      * remove unused var
      
      * hack unet1d into a value function
      
      * add pipeline
      
      * fix arg order
      
      * add pipeline to core library
      
      * community pipeline
      
      * fix couple shape bugs
      
      * style
      
      * Apply suggestions from code review
      
      * clean up comments
      
      * convert older script to using pipeline and add readme
      
      * rename scripts
      
      * style, update tests
      
      * delete unet rl model file
      
      * remove imports in src
      Co-authored-by: default avatarNathan Lambert <nathan@huggingface.co>
      
      * Update src/diffusers/models/unet_1d_blocks.py
      
      * Update tests/test_models_unet.py
      
      * RL Cleanup v2 (#965)
      
      * valuefunction code
      
      * start example scripts
      
      * missing imports
      
      * bug fixes and placeholder example script
      
      * add value function scheduler
      
      * load value function from hub and get best actions in example
      
      * very close to working example
      
      * larger batch size for planning
      
      * more tests
      
      * merge unet1d changes
      
      * wandb for debugging, use newer models
      
      * success!
      
      * turns out we just need more diffusion steps
      
      * run on modal
      
      * merge and code cleanup
      
      * use same api for rl model
      
      * fix variance type
      
      * wrong normalization function
      
      * add tests
      
      * style
      
      * style and quality
      
      * edits based on comments
      
      * style and quality
      
      * remove unused var
      
      * hack unet1d into a value function
      
      * add pipeline
      
      * fix arg order
      
      * add pipeline to core library
      
      * community pipeline
      
      * fix couple shape bugs
      
      * style
      
      * Apply suggestions from code review
      
      * clean up comments
      
      * convert older script to using pipeline and add readme
      
      * rename scripts
      
      * style, update tests
      
      * delete unet rl model file
      
      * remove imports in src
      
      * add specific vf block and update tests
      
      * style
      
      * Update tests/test_models_unet.py
      Co-authored-by: default avatarNathan Lambert <nathan@huggingface.co>
      
      * fix quality in tests
      
      * fix quality style, split test file
      
      * fix checks / tests
      
      * make timesteps closer to main
      
      * unify block API
      
      * unify forward api
      
      * delete lines in examples
      
      * style
      
      * examples style
      
      * all tests pass
      
      * make style
      
      * make dance_diff test pass
      
      * Refactoring RL PR (#1200)
      
      * init file changes
      
      * add import utils
      
      * finish cleaning files, imports
      
      * remove import flags
      
      * clean examples
      
      * fix imports, tests for merge
      
      * update readmes
      
      * hotfix for tests
      
      * quality
      
      * fix some tests
      
      * change defaults
      
      * more mps test fixes
      
      * unet1d defaults
      
      * do not default import experimental
      
      * defaults for tests
      
      * fix tests
      
      * fix-copies
      
      * fix
      
      * changes per Patrik's comments (#1285)
      
      * changes per Patrik's comments
      
      * update conversion script
      
      * fix renaming
      
      * skip more mps tests
      
      * last test fix
      
      * Update examples/rl/README.md
      Co-authored-by: default avatarBen Glickenhaus <benglickenhaus@gmail.com>
      7c5fef81
  32. 28 Oct, 2022 1 commit
  33. 25 Oct, 2022 1 commit
  34. 11 Oct, 2022 1 commit
  35. 10 Oct, 2022 1 commit
  36. 04 Oct, 2022 2 commits
  37. 30 Sep, 2022 2 commits
    • Josh Achiam's avatar
      Allow resolutions that are not multiples of 64 (#505) · a784be2e
      Josh Achiam authored
      
      
      * Allow resolutions that are not multiples of 64
      
      * ran black
      
      * fix bug
      
      * add test
      
      * more explanation
      
      * more comments
      Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
      a784be2e
    • Nouamane Tazi's avatar
      Optimize Stable Diffusion (#371) · 9ebaea54
      Nouamane Tazi authored
      * initial commit
      
      * make UNet stream capturable
      
      * try to fix noise_pred value
      
      * remove cuda graph and keep NB
      
      * non blocking unet with PNDMScheduler
      
      * make timesteps np arrays for pndm scheduler
      because lists don't get formatted to tensors in `self.set_format`
      
      * make max async in pndm
      
      * use channel last format in unet
      
      * avoid moving timesteps device in each unet call
      
      * avoid memcpy op in `get_timestep_embedding`
      
      * add `channels_last` kwarg to `DiffusionPipeline.from_pretrained`
      
      * update TODO
      
      * replace `channels_last` kwarg with `memory_format` for more generality
      
      * revert the channels_last changes to leave it for another PR
      
      * remove non_blocking when moving input ids to device
      
      * remove blocking from all .to() operations at beginning of pipeline
      
      * fix merging
      
      * fix merging
      
      * model can run in other precisions without autocast
      
      * attn refactoring
      
      * Revert "attn refactoring"
      
      This reverts commit 0c70c0e189cd2c4d8768274c9fcf5b940ee310fb.
      
      * remove restriction to run conv_norm in fp32
      
      * use `baddbmm` instead of `matmul`for better in attention for better perf
      
      * removing all reshapes to test perf
      
      * Revert "removing all reshapes to test perf"
      
      This reverts commit 006ccb8a8c6bc7eb7e512392e692a29d9b1553cd.
      
      * add shapes comments
      
      * hardcore whats needed for jitting
      
      * Revert "hardcore whats needed for jitting"
      
      This reverts commit 2fa9c698eae2890ac5f8e367ca80532ecf94df9a.
      
      * Revert "remove restriction to run conv_norm in fp32"
      
      This reverts commit cec592890c32da3d1b78d38b49e4307aedf459b9.
      
      * revert using baddmm in attention's forward
      
      * cleanup comment
      
      * remove restriction to run conv_norm in fp32. no quality loss was noticed
      
      This reverts commit cc9bc1339c998ebe9e7d733f910c6d72d9792213.
      
      * add more optimizations techniques to docs
      
      * Revert "add shapes comments"
      
      This reverts commit 31c58eadb8892f95478cdf05229adf678678c5f4.
      
      * apply suggestions
      
      * make quality
      
      * apply suggestions
      
      * styling
      
      * `scheduler.timesteps` are now arrays so we dont need .to()
      
      * remove useless .type()
      
      * use mean instead of max in `test_stable_diffusion_inpaint_pipeline_k_lms`
      
      * move scheduler timestamps to correct device if tensors
      
      * add device to `set_timesteps` in LMSD scheduler
      
      * `self.scheduler.set_timesteps` now uses device arg for schedulers that accept it
      
      * quick fix
      
      * styling
      
      * remove kwargs from schedulers `set_timesteps`
      
      * revert to using max in K-LMS inpaint pipeline test
      
      * Revert "`self.scheduler.set_timesteps` now uses device arg for schedulers that accept it"
      
      This reverts commit 00d5a51e5c20d8d445c8664407ef29608106d899.
      
      * move timesteps to correct device before loop in SD pipeline
      
      * apply previous fix to other SD pipelines
      
      * UNet now accepts tensor timesteps even on wrong device, to avoid errors
      - it shouldnt affect performance if timesteps are alrdy on correct device
      - it does slow down performance if they're on the wrong device
      
      * fix pipeline when timesteps are arrays with strides
      9ebaea54