1. 17 Jan, 2023 1 commit
    • DiT Pipeline (#1806) · 37d113cc
      Kashif Rasul authored

      * added dit model
      
      * import
      
      * initial pipeline
      
      * initial convert script
      
      * initial pipeline
      
      * make style
      
      * raise ValueError
      
      * single function
      
      * rename classes
      
      * use DDIMScheduler
      
      * timesteps embedder
      
      * samples to cpu
      
      * fix var names
      
      * fix numpy type
      
      * use timesteps class for proj
      
      * fix typo
      
      * fix arg name
      
      * flip_sin_to_cos and better var names
      
      * fix C shape calculation
      
      * make style
      
      * remove unused imports
      
      * cleanup
      
      * add back patch_size
      
      * initial dit doc
      
      * typo
      
      * Update docs/source/api/pipelines/dit.mdx
      Co-authored-by: Suraj Patil <surajp815@gmail.com>
      
      * added copyright license headers
      
      * added example usage and toc
      
      * fix variable names asserts
      
      * remove comment
      
      * added docs
      
      * fix typo
      
      * upstream changes
      
      * set proper device for drop_ids
      
      * added initial dit pipeline test
      
      * update docs
      
      * fix imports
      
      * make fix-copies
      
      * isort
      
      * fix imports
      
      * get rid of more magic numbers
      
      * fix code when guidance is off
      
      * remove block_kwargs
      
      * cleanup script
      
      * removed to_2tuple
      
      * use FeedForward class instead of another MLP
      
      * style
      
      * work on merging DiTBlock with BasicTransformerBlock
      
      * added missing final_dropout and args to BasicTransformerBlock
      
      * use norm from block
      
      * fix arg
      
      * remove unused arg
      
      * fix call to class_embedder
      
      * use timesteps
      
      * make style
      
      * attn_output gets multiplied
      
      * removed commented code
      
      * use Transformer2D
      
      * use self.is_input_patches
      
      * fix flags
      
      * fixed conversion to use Transformer2DModel
      
      * fixes for pipeline
      
      * remove dit.py
      
      * fix timesteps device
      
      * use randn_tensor and fix fp16 inf.
      
      * timesteps_emb already the right dtype
      
      * fix dit test class
      
      * fix test and style
      
      * fix norm2 usage in vq-diffusion
      
      * added author names to pipeline and ImageNet labels link
      
      * fix tests
      
      * use norm_type as string
      
      * rename dit to transformer
      
      * fix name
      
      * fix test
      
      * set norm_type = "layer" by default
      
      * fix tests
      
      * do not skip common tests
      
      * Update src/diffusers/models/attention.py
      Co-authored-by: Suraj Patil <surajp815@gmail.com>
      
      * revert AdaLayerNorm API
      
      * fix norm_type name
      
      * make sure all components are in eval mode
      
      * revert norm2 API
      
      * compact
      
      * finish deprecation
      
      * add slow tests
      
      * remove @
      
      * refactor some stuff
      
      * upload
      
      * Update src/diffusers/pipelines/dit/pipeline_dit.py
      
      * finish more
      
      * finish docs
      
      * improve docs
      
      * finish docs
      Co-authored-by: Suraj Patil <surajp815@gmail.com>
      Co-authored-by: William Berman <WLBberman@gmail.com>
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      37d113cc
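
A minimal usage sketch of the pipeline this commit adds (editor's illustration; the checkpoint id "facebook/DiT-XL-2-256" and the exact call arguments are assumptions based on the docs referenced above):

```python
import torch
from diffusers import DiTPipeline

# load the class-conditional DiT pipeline (checkpoint id assumed)
pipe = DiTPipeline.from_pretrained("facebook/DiT-XL-2-256", torch_dtype=torch.float16)
pipe = pipe.to("cuda")

# map human-readable ImageNet labels to class ids
class_ids = pipe.get_label_ids(["white shark", "umbrella"])

# class-conditional sampling with classifier-free guidance over class embeddings
generator = torch.manual_seed(33)
images = pipe(class_labels=class_ids, guidance_scale=4.0,
              num_inference_steps=25, generator=generator).images
```
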
  2. 16 Jan, 2023 1 commit
  3. 01 Jan, 2023 1 commit
  4. 30 Dec, 2022 1 commit
  5. 28 Dec, 2022 1 commit
  6. 27 Dec, 2022 1 commit
  7. 20 Dec, 2022 3 commits
  8. 19 Dec, 2022 3 commits
  9. 18 Dec, 2022 1 commit
    • kakaobrain unCLIP (#1428) · 2dcf64b7
      Will Berman authored

      * [wip] attention block updates
      
      * [wip] unCLIP unet decoder and super res
      
      * [wip] unCLIP prior transformer
      
      * [wip] scheduler changes
      
      * [wip] text proj utility class
      
      * [wip] UnCLIPPipeline
      
      * [wip] kakaobrain unCLIP convert script
      
      * [unCLIP pipeline] fixes re: @patrickvonplaten
      
      remove callbacks
      
      move denoising loops into call function
      
      * UNCLIPScheduler re: @patrickvonplaten
      
      Revert changes to DDPMScheduler. Make UNCLIPScheduler a modified
      DDPM scheduler with changes to support karlo.
      
      * mask -> attention_mask re: @patrickvonplaten
      
      * [DDPMScheduler] remove leftover change
      
      * [docs] PriorTransformer
      
      * [docs] UNet2DConditionModel and UNet2DModel
      
      * [nit] UNCLIPScheduler -> UnCLIPScheduler
      
      matches existing unclip naming better
      
      * [docs] SchedulingUnCLIP
      
      * [docs] UnCLIPTextProjModel
      
      * refactor
      
      * finish licenses
      
      * rename all to attention_mask and prep in models
      
      * more renaming
      
      * don't expose unused configs
      
      * final renaming fixes
      
      * remove x attn mask when not necessary
      
      * configure kakao script to use new class embedding config
      
      * fix copies
      
      * [tests] UnCLIPScheduler
      
      * finish x attn
      
      * finish
      
      * remove more
      
      * rename condition blocks
      
      * clean more
      
      * Apply suggestions from code review
      
      * up
      
      * fix
      
      * [tests] UnCLIPPipelineFastTests
      
      * remove unused imports
      
      * [tests] UnCLIPPipelineIntegrationTests
      
      * correct
      
      * make style
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      2dcf64b7
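
A rough sketch of how the unCLIP pipeline is meant to be driven (the "kakaobrain/karlo-v1-alpha" checkpoint id is an assumption; per the commit message, the prior, decoder, and super-resolution denoising loops all run inside one call):

```python
import torch
from diffusers import UnCLIPPipeline

# load the converted karlo checkpoint (id assumed)
pipe = UnCLIPPipeline.from_pretrained("kakaobrain/karlo-v1-alpha", torch_dtype=torch.float16)
pipe = pipe.to("cuda")

# prior -> decoder -> super-resolution stages run inside one __call__
image = pipe("a high-quality photo of a red panda").images[0]
image.save("red_panda.png")
```
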
  10. 09 Dec, 2022 1 commit
  11. 07 Dec, 2022 3 commits
  12. 05 Dec, 2022 1 commit
  13. 03 Dec, 2022 1 commit
  14. 02 Dec, 2022 1 commit
  15. 01 Dec, 2022 1 commit
  16. 25 Nov, 2022 1 commit
    • [MPS] call contiguous after permute (#1411) · babfb8a0
      Kashif Rasul authored
      * call contiguous after permute
      
      Fixes for MPS device
      
      * Fix MPS UserWarning
      
      * make style
      
      * Revert "Fix MPS UserWarning"
      
      This reverts commit b46c32810ee5fdc4c16a8e9224a826490b66cf49.
      babfb8a0
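
For context, a small illustration of the fix (shapes are made up; the point is that on the mps backend a permuted view should be materialized with .contiguous() before downstream ops):

```python
import torch

device = "mps" if torch.backends.mps.is_available() else "cpu"
x = torch.randn(2, 8, 16, device=device)

# permute() returns a non-contiguous view; on mps some downstream kernels
# misbehave on such views, so the layout is materialized right away
y = x.permute(0, 2, 1).contiguous()
y = y.reshape(2, -1)  # safe now that memory is contiguous
```
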
  17. 24 Nov, 2022 1 commit
    • Adapt UNet2D for super-resolution (#1385) · cecdd8bd
      Suraj Patil authored
      * allow disabling self attention
      
      * add class_embedding
      
      * fix copies
      
      * fix condition
      
      * fix copies
      
      * do_self_attention -> only_cross_attention
      
      * fix copies
      
      * num_classes -> num_class_embeds
      
      * fix default value
      cecdd8bd
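
A hypothetical config sketch showing the two options this commit introduces on the conditional UNet (argument names follow the commit messages above; the values are made up):

```python
from diffusers import UNet2DConditionModel

# - num_class_embeds: a learned class embedding added to the time embedding
# - only_cross_attention: disable the self-attention path in attention blocks
unet = UNet2DConditionModel(
    sample_size=64,
    num_class_embeds=1000,
    only_cross_attention=True,
)
```
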
  18. 23 Nov, 2022 3 commits
    • [Transformer2DModel] don't norm twice (#1381) · 15241225
      Suraj Patil authored
      don't norm twice
      15241225
    • update unet2d (#1376) · f07a16e0
      Suraj Patil authored
      * boom boom
      
      * remove duplicate arg
      
      * add use_linear_proj arg
      
      * fix copies
      
      * style
      
      * add fast tests
      
      * use_linear_proj -> use_linear_projection
      f07a16e0
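
Sketch of the new flag (assuming it is surfaced on the UNet config as in later releases): with use_linear_projection=True the transformer blocks project in and out with nn.Linear instead of 1x1 convolutions:

```python
from diffusers import UNet2DConditionModel

# linear in/out projections in the Transformer2D blocks instead of 1x1 convs
unet = UNet2DConditionModel(use_linear_projection=True)
```
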
    • [Versatile Diffusion] Add versatile diffusion model (#1283) · 2625fb59
      Patrick von Platen authored

      * up
      
      * convert dual unet
      
      * revert dual attn
      
      * adapt for vd-official
      
      * test the full pipeline
      
      * mixed inference
      
      * mixed inference for text2img
      
      * add image prompting
      
      * fix clip norm
      
      * split text2img and img2img
      
      * fix format
      
      * refactor text2img
      
      * mega pipeline
      
      * add optimus
      
      * refactor image var
      
      * wip text_unet
      
      * text unet end to end
      
      * update tests
      
      * reshape
      
      * fix image to text
      
      * add some first docs
      
      * dual guided pipeline
      
      * fix token ratio
      
      * propose change
      
      * dual transformer as a native module
      
      * DualTransformer(nn.Module)
      
      * DualTransformer(nn.Module)
      
      * correct unconditional image
      
      * save-load with mega pipeline
      
      * remove image to text
      
      * up
      
      * uP
      
      * fix
      
      * up
      
      * final fix
      
      * remove_unused_weights
      
      * test updates
      
      * save progress
      
      * uP
      
      * fix dual prompts
      
      * some fixes
      
      * finish
      
      * style
      
      * finish renaming
      
      * up
      
      * fix
      
      * fix
      
      * fix
      
      * finish
      Co-authored-by: anton-l <anton@huggingface.co>
      2625fb59
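
A usage sketch of the resulting mega pipeline (the "shi-labs/versatile-diffusion" checkpoint id is an assumption; each task is exposed as its own method):

```python
import torch
from diffusers import VersatileDiffusionPipeline

pipe = VersatileDiffusionPipeline.from_pretrained(
    "shi-labs/versatile-diffusion", torch_dtype=torch.float16
).to("cuda")

# the mega pipeline exposes each task from the PR as a method
image = pipe.text_to_image("an astronaut riding a horse on mars").images[0]
variation = pipe.image_variation(image).images[0]
```
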
  19. 22 Nov, 2022 1 commit
  20. 21 Nov, 2022 1 commit
  21. 14 Nov, 2022 1 commit
  22. 08 Nov, 2022 1 commit
  23. 03 Nov, 2022 1 commit
    • VQ-diffusion (#658) · ef2ea33c
      Will Berman authored

      * Changes for VQ-diffusion VQVAE
      
      Allow specifying the embedding dimension in VQModel:
      `VQModel` will by default set the dimension of embeddings to the number
      of latent channels. The VQ-diffusion VQVAE has a smaller embedding
      dimension, 128, than its number of latent channels, 256.
      
      Add AttnDownEncoderBlock2D and AttnUpDecoderBlock2D to the up and down
      unet block helpers. VQ-diffusion's VQVAE uses those two block types.
      
      * Changes for VQ-diffusion transformer
      
      Modify attention.py so SpatialTransformer can be used for
      VQ-diffusion's transformer.
      
      SpatialTransformer:
      - Can now operate over discrete inputs (classes of vector embeddings) as well as continuous.
      - `in_channels` was made optional in the constructor so two locations where it was passed as a positional arg were moved to kwargs
      - modified forward pass to take optional timestep embeddings
      
      ImagePositionalEmbeddings:
      - added to provide positional embeddings to discrete inputs for latent pixels
      
      BasicTransformerBlock:
      - norm layers were made configurable so that the VQ-diffusion could use AdaLayerNorm with timestep embeddings
      - modified forward pass to take optional timestep embeddings
      
      CrossAttention:
      - now may optionally take a bias parameter for its query, key, and value linear layers
      
      FeedForward:
      - Internal layers are now configurable
      
      ApproximateGELU:
      - Activation function in VQ-diffusion's feedforward layer
      
      AdaLayerNorm:
      - Norm layer modified to incorporate timestep embeddings
      
      * Add VQ-diffusion scheduler
      
      * Add VQ-diffusion pipeline
      
      * Add VQ-diffusion convert script to diffusers
      
      * Add VQ-diffusion dummy objects
      
      * Add VQ-diffusion markdown docs
      
      * Add VQ-diffusion tests
      
      * some renaming
      
      * some fixes
      
      * more renaming
      
      * correct
      
      * fix typo
      
      * correct weights
      
      * finalize
      
      * fix tests
      
      * Apply suggestions from code review
      Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
      
      * Apply suggestions from code review
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      * finish
      
      * finish
      
      * up
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      ef2ea33c
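
A minimal sketch of the pipeline this commit adds (the "microsoft/vq-diffusion-ithq" checkpoint id is an assumption):

```python
import torch
from diffusers import VQDiffusionPipeline

# text-to-image over discrete latent codes (checkpoint id assumed)
pipe = VQDiffusionPipeline.from_pretrained(
    "microsoft/vq-diffusion-ithq", torch_dtype=torch.float16
).to("cuda")

image = pipe("teddy bear playing in the pool", num_inference_steps=100).images[0]
```
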
  24. 02 Nov, 2022 2 commits
    • Fix a small typo of a variable name (#1063) · 1216a3b1
      Omiita authored
      Fix a small typo
      
      fix a typo in `models/attention.py`.
      weight -> width
      1216a3b1
    • Up to 2x speedup on GPUs using memory efficient attention (#532) · 98c42134
      MatthieuTPHR authored

      * 2x speedup using memory efficient attention
      
      * remove einops dependency
      
      * Swap K, M in op instantiation
      
      * Simplify code, remove unnecessary maybe_init call and function, remove unused self.scale parameter
      
      * make xformers a soft dependency
      
      * remove one-liner functions
      
      * change one letter variable to appropriate names
      
      * Remove Env variable dependency, remove MemoryEfficientCrossAttention class and use enable_xformers_memory_efficient_attention method
      
      * Add memory efficient attention toggle to img2img and inpaint pipelines
      
      * Clearer management of xformers' availability
      
      * update optimizations markdown to add info about memory efficient attention
      
      * add benchmarks for TITAN RTX
      
      * More detailed explanation of how the mem eff benchmarks were run
      
      * Removing autocast from optimization markdown
      
      * import_utils: import torch only if is available
      Co-authored-by: Nouamane Tazi <nouamane98@gmail.com>
      98c42134
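
Sketch of the toggle introduced here (model id assumed; requires the optional xformers package):

```python
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16
).to("cuda")

# opt into memory-efficient attention; raises if xformers is unavailable
pipe.enable_xformers_memory_efficient_attention()
image = pipe("a photo of an astronaut riding a horse").images[0]
```
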
  25. 31 Oct, 2022 1 commit
  26. 29 Oct, 2022 1 commit
    • Experimental: allow fp16 in `mps` (#961) · 95414bd6
      Pedro Cuenca authored
      * Docs: refer to pre-RC version of PyTorch 1.13.0.
      
      * Remove temporary workaround for unavailable op.
      
      * Update comment to make it less ambiguous.
      
      * Remove use of contiguous in mps.
      
      It appears to no longer be necessary.
      
      * Special case: use einsum for much better performance in mps
      
      * Update mps docs.
      
      * MPS: make pipeline work in half precision.
      95414bd6
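
Sketch of what this enables (checkpoint id assumed): loading the pipeline in half precision and moving it to the mps device:

```python
import torch
from diffusers import StableDiffusionPipeline

# experimental after this PR: half precision on Apple Silicon
pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16
).to("mps")

image = pipe("a painting of a lighthouse at dawn").images[0]
```
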
  27. 25 Oct, 2022 1 commit
  28. 12 Oct, 2022 1 commit
  29. 30 Sep, 2022 2 commits
    • Fix slow tests (#689) · b2cfc7a0
      Nouamane Tazi authored
      * revert using baddbmm in attention
      - to fix `test_stable_diffusion_memory_chunking` test
      
      * styling
      b2cfc7a0
    • Optimize Stable Diffusion (#371) · 9ebaea54
      Nouamane Tazi authored
      * initial commit
      
      * make UNet stream capturable
      
      * try to fix noise_pred value
      
      * remove cuda graph and keep NB
      
      * non blocking unet with PNDMScheduler
      
      * make timesteps np arrays for pndm scheduler
      because lists don't get formatted to tensors in `self.set_format`
      
      * make max async in pndm
      
      * use channel last format in unet
      
      * avoid moving timesteps device in each unet call
      
      * avoid memcpy op in `get_timestep_embedding`
      
      * add `channels_last` kwarg to `DiffusionPipeline.from_pretrained`
      
      * update TODO
      
      * replace `channels_last` kwarg with `memory_format` for more generality
      
      * revert the channels_last changes to leave it for another PR
      
      * remove non_blocking when moving input ids to device
      
      * remove blocking from all .to() operations at beginning of pipeline
      
      * fix merging
      
      * fix merging
      
      * model can run in other precisions without autocast
      
      * attn refactoring
      
      * Revert "attn refactoring"
      
      This reverts commit 0c70c0e189cd2c4d8768274c9fcf5b940ee310fb.
      
      * remove restriction to run conv_norm in fp32
      
      * use `baddbmm` instead of `matmul` in attention for better perf
      
      * removing all reshapes to test perf
      
      * Revert "removing all reshapes to test perf"
      
      This reverts commit 006ccb8a8c6bc7eb7e512392e692a29d9b1553cd.
      
      * add shapes comments
      
      * hardcode what's needed for jitting
      
      * Revert "hardcore whats needed for jitting"
      
      This reverts commit 2fa9c698eae2890ac5f8e367ca80532ecf94df9a.
      
      * Revert "remove restriction to run conv_norm in fp32"
      
      This reverts commit cec592890c32da3d1b78d38b49e4307aedf459b9.
      
      * revert using baddbmm in attention's forward
      
      * cleanup comment
      
      * remove restriction to run conv_norm in fp32; no quality loss was noticed
      
      This reverts commit cc9bc1339c998ebe9e7d733f910c6d72d9792213.
      
      * add more optimizations techniques to docs
      
      * Revert "add shapes comments"
      
      This reverts commit 31c58eadb8892f95478cdf05229adf678678c5f4.
      
      * apply suggestions
      
      * make quality
      
      * apply suggestions
      
      * styling
      
      * `scheduler.timesteps` are now arrays so we don't need .to()
      
      * remove useless .type()
      
      * use mean instead of max in `test_stable_diffusion_inpaint_pipeline_k_lms`
      
      * move scheduler timesteps to correct device if tensors
      
      * add device to `set_timesteps` in LMSD scheduler
      
      * `self.scheduler.set_timesteps` now uses device arg for schedulers that accept it
      
      * quick fix
      
      * styling
      
      * remove kwargs from schedulers `set_timesteps`
      
      * revert to using max in K-LMS inpaint pipeline test
      
      * Revert "`self.scheduler.set_timesteps` now uses device arg for schedulers that accept it"
      
      This reverts commit 00d5a51e5c20d8d445c8664407ef29608106d899.
      
      * move timesteps to correct device before loop in SD pipeline
      
      * apply previous fix to other SD pipelines
      
      * UNet now accepts tensor timesteps even on wrong device, to avoid errors
      - it shouldn't affect performance if timesteps are already on the correct device
      - it does slow down performance if they're on the wrong device
      
      * fix pipeline when timesteps are arrays with strides
      9ebaea54
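
For context, a small sketch of the baddbmm change mentioned above (shapes are made up): with beta=0 the attention scale is fused into one batched kernel instead of a separate bmm plus multiply:

```python
import torch

bh, seq, dim = 16, 64, 40              # (batch*heads, tokens, head_dim)
q = torch.randn(bh, seq, dim)
k = torch.randn(bh, seq, dim)
scale = dim ** -0.5

# beta=0 ignores the (uninitialized) input tensor, alpha applies the scale
scores = torch.baddbmm(
    torch.empty(bh, seq, seq, dtype=q.dtype, device=q.device),
    q, k.transpose(1, 2),
    beta=0, alpha=scale,
)
# equivalent to, but faster on GPU than:
# scores_ref = torch.bmm(q, k.transpose(1, 2)) * scale
```
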
  30. 27 Sep, 2022 1 commit