1. 19 Dec, 2022 3 commits
  2. 18 Dec, 2022 1 commit
    • kakaobrain unCLIP (#1428) · 2dcf64b7
      Will Berman authored
      
      
      * [wip] attention block updates
      
      * [wip] unCLIP unet decoder and super res
      
      * [wip] unCLIP prior transformer
      
      * [wip] scheduler changes
      
      * [wip] text proj utility class
      
      * [wip] UnCLIPPipeline
      
      * [wip] kakaobrain unCLIP convert script
      
      * [unCLIP pipeline] fixes re: @patrickvonplaten
      
      remove callbacks
      
      move denoising loops into call function
      
      * UNCLIPScheduler re: @patrickvonplaten
      
      Revert changes to DDPMScheduler. Make UNCLIPScheduler, a modified
      DDPM scheduler with changes to support karlo
      
      * mask -> attention_mask re: @patrickvonplaten
      
      * [DDPMScheduler] remove leftover change
      
      * [docs] PriorTransformer
      
      * [docs] UNet2DConditionModel and UNet2DModel
      
      * [nit] UNCLIPScheduler -> UnCLIPScheduler
      
      matches existing unclip naming better
      
      * [docs] SchedulingUnCLIP
      
      * [docs] UnCLIPTextProjModel
      
      * refactor
      
      * finish licenses
      
      * rename all to attention_mask and prep in models
      
      * more renaming
      
      * don't expose unused configs
      
      * final renaming fixes
      
      * remove x attn mask when not necessary
      
      * configure kakao script to use new class embedding config
      
      * fix copies
      
      * [tests] UnCLIPScheduler
      
      * finish x attn
      
      * finish
      
      * remove more
      
      * rename condition blocks
      
      * clean more
      
      * Apply suggestions from code review
      
      * up
      
      * fix
      
      * [tests] UnCLIPPipelineFastTests
      
      * remove unused imports
      
      * [tests] UnCLIPPipelineIntegrationTests
      
      * correct
      
      * make style
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
  3. 09 Dec, 2022 1 commit
  4. 07 Dec, 2022 3 commits
  5. 05 Dec, 2022 1 commit
  6. 03 Dec, 2022 1 commit
  7. 02 Dec, 2022 1 commit
  8. 01 Dec, 2022 1 commit
  9. 25 Nov, 2022 1 commit
    • [MPS] call contiguous after permute (#1411) · babfb8a0
      Kashif Rasul authored
      * call contiguous after permute
      
      Fixes for MPS device
      
      * Fix MPS UserWarning
      
      * make style
      
      * Revert "Fix MPS UserWarning"
      
      This reverts commit b46c32810ee5fdc4c16a8e9224a826490b66cf49.
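To illustrate why a `contiguous()` call can matter after `permute`, the sketch below models tensor strides in plain Python. It is not the PyTorch or diffusers code: the helper names (`row_major_strides`, `permute_view`, `is_contiguous`) and the example shape are hypothetical, and the contiguity check is a simplification that ignores special cases such as size-1 dimensions.

```python
from typing import Sequence, Tuple


def row_major_strides(shape: Sequence[int]) -> Tuple[int, ...]:
    """Strides (in elements) of a densely packed, row-major tensor."""
    strides = []
    step = 1
    for size in reversed(shape):
        strides.append(step)
        step *= size
    return tuple(reversed(strides))


def permute_view(shape, strides, dims):
    """Reorder axes without moving data: only shape and strides change."""
    return tuple(shape[d] for d in dims), tuple(strides[d] for d in dims)


def is_contiguous(shape, strides) -> bool:
    """True when the strides match a dense row-major layout for this shape."""
    return tuple(strides) == row_major_strides(shape)


# Hypothetical (batch, channels, height, width) activation.
shape = (2, 4, 8, 8)
strides = row_major_strides(shape)

# Move channels last, as an attention block might do before a linear layer.
p_shape, p_strides = permute_view(shape, strides, (0, 2, 3, 1))

# The permuted view still points at the old layout, so it is not dense.
# A contiguous copy would lay the data out again with these strides:
copied_strides = row_major_strides(p_shape)
```

The permuted view keeps the original memory order, which is the situation where some backends benefit from an explicit copy before the next operation.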
  10. 24 Nov, 2022 1 commit
    • Adapt UNet2D for super-resolution (#1385) · cecdd8bd
      Suraj Patil authored
      * allow disabling self attention
      
      * add class_embedding
      
      * fix copies
      
      * fix condition
      
      * fix copies
      
      * do_self_attention -> only_cross_attention
      
      * fix copies
      
      * num_classes -> num_class_embeds
      
      * fix default value
  11. 23 Nov, 2022 3 commits
    • [Transformer2DModel] don't norm twice (#1381) · 15241225
      Suraj Patil authored
      don't norm twice
    • update unet2d (#1376) · f07a16e0
      Suraj Patil authored
      * boom boom
      
      * remove duplicate arg
      
      * add use_linear_proj arg
      
      * fix copies
      
      * style
      
      * add fast tests
      
      * use_linear_proj -> use_linear_projection
    • [Versatile Diffusion] Add versatile diffusion model (#1283) · 2625fb59
      Patrick von Platen authored
      
      
      * up
      
      * convert dual unet
      
      * revert dual attn
      
      * adapt for vd-official
      
      * test the full pipeline
      
      * mixed inference
      
      * mixed inference for text2img
      
      * add image prompting
      
      * fix clip norm
      
      * split text2img and img2img
      
      * fix format
      
      * refactor text2img
      
      * mega pipeline
      
      * add optimus
      
      * refactor image var
      
      * wip text_unet
      
      * text unet end to end
      
      * update tests
      
      * reshape
      
      * fix image to text
      
      * add some first docs
      
      * dual guided pipeline
      
      * fix token ratio
      
      * propose change
      
      * dual transformer as a native module
      
      * DualTransformer(nn.Module)
      
      * DualTransformer(nn.Module)
      
      * correct unconditional image
      
      * save-load with mega pipeline
      
      * remove image to text
      
      * up
      
      * uP
      
      * fix
      
      * up
      
      * final fix
      
      * remove_unused_weights
      
      * test updates
      
      * save progress
      
      * uP
      
      * fix dual prompts
      
      * some fixes
      
      * finish
      
      * style
      
      * finish renaming
      
      * up
      
      * fix
      
      * fix
      
      * fix
      
      * finish
      Co-authored-by: anton-l <anton@huggingface.co>
  12. 22 Nov, 2022 1 commit
  13. 21 Nov, 2022 1 commit
  14. 14 Nov, 2022 1 commit
  15. 08 Nov, 2022 1 commit
  16. 03 Nov, 2022 1 commit
    • VQ-diffusion (#658) · ef2ea33c
      Will Berman authored
      
      
      * Changes for VQ-diffusion VQVAE
      
      Add an option to specify the embedding dimension in VQModel:
      `VQModel` by default sets the embedding dimension to the number
      of latent channels. The VQ-diffusion VQVAE has a smaller
      embedding dimension, 128, than its number of latent channels, 256.
      
      Add AttnDownEncoderBlock2D and AttnUpDecoderBlock2D to the up and down
      unet block helpers. VQ-diffusion's VQVAE uses those two block types.
      
      * Changes for VQ-diffusion transformer
      
      Modify attention.py so SpatialTransformer can be used for
      VQ-diffusion's transformer.
      
      SpatialTransformer:
      - Can now operate over discrete inputs (classes of vector embeddings) as well as continuous.
      - `in_channels` was made optional in the constructor so two locations where it was passed as a positional arg were moved to kwargs
      - modified forward pass to take optional timestep embeddings
      
      ImagePositionalEmbeddings:
      - added to provide positional embeddings to discrete inputs for latent pixels
      
      BasicTransformerBlock:
      - norm layers were made configurable so that the VQ-diffusion could use AdaLayerNorm with timestep embeddings
      - modified forward pass to take optional timestep embeddings
      
      CrossAttention:
      - now may optionally take a bias parameter for its query, key, and value linear layers
      
      FeedForward:
      - Internal layers are now configurable
      
      ApproximateGELU:
      - Activation function in VQ-diffusion's feedforward layer
      
      AdaLayerNorm:
      - Norm layer modified to incorporate timestep embeddings
      
      * Add VQ-diffusion scheduler
      
      * Add VQ-diffusion pipeline
      
      * Add VQ-diffusion convert script to diffusers
      
      * Add VQ-diffusion dummy objects
      
      * Add VQ-diffusion markdown docs
      
      * Add VQ-diffusion tests
      
      * some renaming
      
      * some fixes
      
      * more renaming
      
      * correct
      
      * fix typo
      
      * correct weights
      
      * finalize
      
      * fix tests
      
      * Apply suggestions from code review
      Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
      
      * Apply suggestions from code review
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      * finish
      
      * finish
      
      * up
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
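The commit above lists ApproximateGELU as the feedforward activation without giving its formula. As a rough illustration, and assuming it refers to the common sigmoid approximation of GELU, x * sigmoid(1.702 * x), the plain-Python sketch below compares that approximation with the exact GELU. The function names are hypothetical, and the real module operates on tensors rather than floats.

```python
import math


def gelu_exact(x: float) -> float:
    """Exact GELU: x * Phi(x), with Phi the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def gelu_sigmoid_approx(x: float) -> float:
    """Sigmoid approximation of GELU: x * sigmoid(1.702 * x)."""
    return x / (1.0 + math.exp(-1.702 * x))


# Largest gap between the two curves on a grid over [-4, 4].
grid = [i / 100.0 for i in range(-400, 401)]
max_gap = max(abs(gelu_exact(x) - gelu_sigmoid_approx(x)) for x in grid)
```

On this grid the two curves stay within a few hundredths of each other, which is why the cheaper sigmoid form is a plausible stand-in for the exact function.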
  17. 02 Nov, 2022 2 commits
    • Fix a small typo of a variable name (#1063) · 1216a3b1
      Omiita authored
      Fix a small typo
      
      fix a typo in `models/attention.py`.
      weight -> width
    • Up to 2x speedup on GPUs using memory efficient attention (#532) · 98c42134
      MatthieuTPHR authored
      
      
      * 2x speedup using memory efficient attention
      
      * remove einops dependency
      
      * Swap K, M in op instantiation
      
      * Simplify code, remove unnecessary maybe_init call and function, remove unused self.scale parameter
      
      * make xformers a soft dependency
      
      * remove one-liner functions
      
      * change one letter variable to appropriate names
      
      * Remove Env variable dependency, remove MemoryEfficientCrossAttention class and use enable_xformers_memory_efficient_attention method
      
      * Add memory efficient attention toggle to img2img and inpaint pipelines
      
      * Clearer management of xformers' availability
      
      * update optimizations markdown to add info about memory efficient attention
      
      * add benchmarks for TITAN RTX
      
      * More detailed explanation of how the memory efficient attention benchmarks were run
      
      * Removing autocast from optimization markdown
      
      * import_utils: import torch only if it is available
      Co-authored-by: Nouamane Tazi <nouamane98@gmail.com>
  18. 31 Oct, 2022 1 commit
  19. 29 Oct, 2022 1 commit
    • Experimental: allow fp16 in `mps` (#961) · 95414bd6
      Pedro Cuenca authored
      * Docs: refer to pre-RC version of PyTorch 1.13.0.
      
      * Remove temporary workaround for unavailable op.
      
      * Update comment to make it less ambiguous.
      
      * Remove use of contiguous in mps.
      
      It appears to no longer be necessary.
      
      * Special case: use einsum for much better performance in mps
      
      * Update mps docs.
      
      * MPS: make pipeline work in half precision.
  20. 25 Oct, 2022 1 commit
  21. 12 Oct, 2022 1 commit
  22. 30 Sep, 2022 2 commits
    • Fix slow tests (#689) · b2cfc7a0
      Nouamane Tazi authored
      * revert using baddbmm in attention
      - to fix `test_stable_diffusion_memory_chunking` test
      
      * styling
    • Optimize Stable Diffusion (#371) · 9ebaea54
      Nouamane Tazi authored
      * initial commit
      
      * make UNet stream capturable
      
      * try to fix noise_pred value
      
      * remove cuda graph and keep NB
      
      * non blocking unet with PNDMScheduler
      
      * make timesteps np arrays for pndm scheduler
      because lists don't get formatted to tensors in `self.set_format`
      
      * make max async in pndm
      
      * use channel last format in unet
      
      * avoid moving timesteps device in each unet call
      
      * avoid memcpy op in `get_timestep_embedding`
      
      * add `channels_last` kwarg to `DiffusionPipeline.from_pretrained`
      
      * update TODO
      
      * replace `channels_last` kwarg with `memory_format` for more generality
      
      * revert the channels_last changes to leave it for another PR
      
      * remove non_blocking when moving input ids to device
      
      * remove blocking from all .to() operations at beginning of pipeline
      
      * fix merging
      
      * fix merging
      
      * model can run in other precisions without autocast
      
      * attn refactoring
      
      * Revert "attn refactoring"
      
      This reverts commit 0c70c0e189cd2c4d8768274c9fcf5b940ee310fb.
      
      * remove restriction to run conv_norm in fp32
      
      * use `baddbmm` instead of `matmul` in attention for better perf
      
      * removing all reshapes to test perf
      
      * Revert "removing all reshapes to test perf"
      
      This reverts commit 006ccb8a8c6bc7eb7e512392e692a29d9b1553cd.
      
      * add shapes comments
      
      * hardcore whats needed for jitting
      
      * Revert "hardcore whats needed for jitting"
      
      This reverts commit 2fa9c698eae2890ac5f8e367ca80532ecf94df9a.
      
      * Revert "remove restriction to run conv_norm in fp32"
      
      This reverts commit cec592890c32da3d1b78d38b49e4307aedf459b9.
      
      * revert using baddbmm in attention's forward
      
      * cleanup comment
      
      * remove restriction to run conv_norm in fp32. no quality loss was noticed
      
      This reverts commit cc9bc1339c998ebe9e7d733f910c6d72d9792213.
      
      * add more optimizations techniques to docs
      
      * Revert "add shapes comments"
      
      This reverts commit 31c58eadb8892f95478cdf05229adf678678c5f4.
      
      * apply suggestions
      
      * make quality
      
      * apply suggestions
      
      * styling
      
      * `scheduler.timesteps` are now arrays so we don't need .to()
      
      * remove useless .type()
      
      * use mean instead of max in `test_stable_diffusion_inpaint_pipeline_k_lms`
      
      * move scheduler timesteps to correct device if tensors
      
      * add device to `set_timesteps` in LMSD scheduler
      
      * `self.scheduler.set_timesteps` now uses device arg for schedulers that accept it
      
      * quick fix
      
      * styling
      
      * remove kwargs from schedulers `set_timesteps`
      
      * revert to using max in K-LMS inpaint pipeline test
      
      * Revert "`self.scheduler.set_timesteps` now uses device arg for schedulers that accept it"
      
      This reverts commit 00d5a51e5c20d8d445c8664407ef29608106d899.
      
      * move timesteps to correct device before loop in SD pipeline
      
      * apply previous fix to other SD pipelines
      
      * UNet now accepts tensor timesteps even on wrong device, to avoid errors
      - it shouldn't affect performance if timesteps are already on correct device
      - it does slow down performance if they're on the wrong device
      
      * fix pipeline when timesteps are arrays with strides
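Several steps in the commit above try `baddbmm` in place of `matmul` for the attention scores. The idea is that `baddbmm` computes `beta * input + alpha * (batch1 @ batch2)` in one call, so the attention scale can be folded into the product. The plain-Python reference below only mirrors those semantics on nested lists; it is a sketch for illustration, not the diffusers implementation, and the tiny query/key values are made up.

```python
from typing import List

Batch = List[List[List[float]]]


def baddbmm(inp: Batch, batch1: Batch, batch2: Batch,
            beta: float = 1.0, alpha: float = 1.0) -> Batch:
    """Reference semantics: beta * inp + alpha * (batch1 @ batch2), per batch item."""
    out = []
    for bias, a, b in zip(inp, batch1, batch2):
        rows = []
        for i in range(len(a)):
            row = []
            for j in range(len(b[0])):
                dot = sum(a[i][k] * b[k][j] for k in range(len(b)))
                row.append(beta * bias[i][j] + alpha * dot)
            rows.append(row)
        out.append(rows)
    return out


def transpose_last_two(batch: Batch) -> Batch:
    """Swap the last two axes of every matrix in the batch."""
    return [[list(col) for col in zip(*mat)] for mat in batch]


# One batch item, two tokens, head dimension 4 (made-up values).
query = [[[1.0, 0.0, 0.0, 0.0], [0.0, 2.0, 0.0, 0.0]]]
key = [[[1.0, 1.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]]
scale = 4 ** -0.5  # 1 / sqrt(head_dim)

# beta=0 drops the input term, so the result is scale * (Q @ K^T).
zeros = [[[0.0, 0.0], [0.0, 0.0]]]
scores = baddbmm(zeros, query, transpose_last_two(key), beta=0.0, alpha=scale)
```

Setting `beta` to zero makes the first argument a placeholder, which is what lets a single fused call replace a separate multiply by the scale.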
  23. 27 Sep, 2022 1 commit
  24. 19 Sep, 2022 3 commits
  25. 15 Sep, 2022 1 commit
  26. 14 Sep, 2022 1 commit
  27. 09 Sep, 2022 2 commits
  28. 08 Sep, 2022 2 commits
    • [Docs] Models (#416) · 5e6417e9
      Kashif Rasul authored
      
      
      * docs for attention
      
      * types for embeddings
      
      * unet2d docstrings
      
      * UNet2DConditionModel docstrings
      
      * fix typos
      
      * style and vq-vae docstrings
      
      * docstrings for VAE
      
      * Update src/diffusers/models/unet_2d.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * make style
      
      * added inherits from sentence
      
      * docstring to forward
      
      * make style
      
      * Apply suggestions from code review
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      * finish model docs
      
      * up
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
    • Inference support for `mps` device (#355) · 5dda1735
      Pedro Cuenca authored
      * Initial support for mps in Stable Diffusion pipeline.
      
      * Initial "warmup" implementation when using mps.
      
      * Make some deterministic tests pass with mps.
      
      * Disable training tests when using mps.
      
      * SD: generate latents in CPU then move to device.
      
      This is especially important when using the mps device, because
      generators are not supported there. See for example
      https://github.com/pytorch/pytorch/issues/84288.
      
      In addition, the other pipelines seem to use the same approach: generate
      the random samples then move to the appropriate device.
      
      After this change, generating an image in MPS produces the same result
      as when using the CPU, if the same seed is used.
      
      * Remove prints.
      
      * Pass AutoencoderKL test_output_pretrained with mps.
      
      Sampling from `posterior` must be done in CPU.
      
      * Style
      
      * Do not use torch.long for log op in mps device.
      
      * Perform incompatible padding ops in CPU.
      
      UNet tests now pass.
      See https://github.com/pytorch/pytorch/issues/84535
      
      
      
      * Style: fix import order.
      
      * Remove unused symbols.
      
      * Remove MPSWarmupMixin, do not apply automatically.
      
      We do apply warmup in the tests, but not during normal use.
      This adopts some PR suggestions by @patrickvonplaten.
      
      * Add comment for mps fallback to CPU step.
      
      * Add README_mps.md for mps installation and use.
      
      * Apply `black` to modified files.
      
      * Restrict README_mps to SD, show measures in table.
      
      * Make PNDM indexing compatible with mps.
      
      Addresses #239.
      
      * Do not use float64 when using LDMScheduler.
      
      Fixes #358.
      
      * Fix typo identified by @patil-suraj
      Co-authored-by: Suraj Patil <surajp815@gmail.com>
      
      * Adapt example to new output style.
      
      * Restore 1:1 results reproducibility with CompVis.
      
      However, mps latents need to be generated in CPU because generators
      don't work in the mps device.
      
      * Move PyTorch nightly to requirements.
      
      * Adapt `test_scheduler_outputs_equivalence` to MPS.
      
      * mps: skip training tests instead of ignoring silently.
      
      * Make VQModel tests pass on mps.
      
      * mps ddim tests: warmup, increase tolerance.
      
      * ScoreSdeVeScheduler indexing made mps compatible.
      
      * Make ldm pipeline tests pass using warmup.
      
      * Style
      
      * Simplify casting as suggested in PR.
      
      * Add Known Issues to readme.
      
      * `isort` import order.
      
      * Remove _mps_warmup helpers from ModelMixin.
      
      And just make changes to the tests.
      
      * Skip tests using unittest decorator for consistency.
      
      * Remove temporary var.
      
      * Remove spurious blank space.
      
      * Remove unused symbol.
      
      * Remove README_mps.
      Co-authored-by: Suraj Patil <surajp815@gmail.com>
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
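The commit above switches from silently ignoring training tests on mps to skipping them with the unittest decorator. A minimal sketch of that approach is shown below. The `torch_device` constant, the test class, and the `run_suite` helper are hypothetical stand-ins; a real suite would detect the device at runtime.

```python
import unittest

# Hypothetical stand-in for the device string a test suite would detect.
torch_device = "mps"


class TrainingTests(unittest.TestCase):
    @unittest.skipIf(torch_device == "mps", "Training is not supported in mps")
    def test_training(self):
        # Placeholder body; a real test would run a training step here.
        self.assertTrue(True)

    def test_inference(self):
        # Runs on every device.
        self.assertEqual(1 + 1, 2)


def run_suite() -> unittest.TestResult:
    """Run the tests in-process and return the result for inspection."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TrainingTests)
    result = unittest.TestResult()
    suite.run(result)
    return result


result = run_suite()
```

Unlike an early `return` inside the test body, the decorator records the test as skipped along with its reason, so the report shows which tests did not run on a given device.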