1. 27 Dec, 2022 1 commit
  2. 20 Dec, 2022 3 commits
  3. 19 Dec, 2022 3 commits
  4. 18 Dec, 2022 1 commit
    • Will Berman's avatar
      kakaobrain unCLIP (#1428) · 2dcf64b7
      Will Berman authored
      
      
      * [wip] attention block updates
      
      * [wip] unCLIP unet decoder and super res
      
      * [wip] unCLIP prior transformer
      
      * [wip] scheduler changes
      
      * [wip] text proj utility class
      
      * [wip] UnCLIPPipeline
      
      * [wip] kakaobrain unCLIP convert script
      
      * [unCLIP pipeline] fixes re: @patrickvonplaten
      
      remove callbacks
      
      move denoising loops into call function
      
      * UNCLIPScheduler re: @patrickvonplaten
      
      Revert changes to DDPMScheduler. Make UNCLIPScheduler, a modified
      DDPM scheduler with changes to support karlo
      
      * mask -> attention_mask re: @patrickvonplaten
      
      * [DDPMScheduler] remove leftover change
      
      * [docs] PriorTransformer
      
      * [docs] UNet2DConditionModel and UNet2DModel
      
      * [nit] UNCLIPScheduler -> UnCLIPScheduler
      
      matches existing unclip naming better
      
      * [docs] SchedulingUnCLIP
      
      * [docs] UnCLIPTextProjModel
      
      * refactor
      
      * finish licenses
      
      * rename all to attention_mask and prep in models
      
      * more renaming
      
      * don't expose unused configs
      
      * final renaming fixes
      
      * remove x attn mask when not necessary
      
      * configure kakao script to use new class embedding config
      
      * fix copies
      
      * [tests] UnCLIPScheduler
      
      * finish x attn
      
      * finish
      
      * remove more
      
      * rename condition blocks
      
      * clean more
      
      * Apply suggestions from code review
      
      * up
      
      * fix
      
      * [tests] UnCLIPPipelineFastTests
      
      * remove unused imports
      
      * [tests] UnCLIPPipelineIntegrationTests
      
      * correct
      
      * make style
      Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
      2dcf64b7
  5. 09 Dec, 2022 1 commit
  6. 07 Dec, 2022 3 commits
  7. 05 Dec, 2022 1 commit
  8. 03 Dec, 2022 1 commit
  9. 02 Dec, 2022 1 commit
  10. 01 Dec, 2022 1 commit
  11. 25 Nov, 2022 1 commit
    • Kashif Rasul's avatar
      [MPS] call contiguous after permute (#1411) · babfb8a0
      Kashif Rasul authored
      * call contiguous after permute
      
      Fixes for MPS device
      
      * Fix MPS UserWarning
      
      * make style
      
      * Revert "Fix MPS UserWarning"
      
      This reverts commit b46c32810ee5fdc4c16a8e9224a826490b66cf49.
      babfb8a0
  12. 24 Nov, 2022 1 commit
    • Suraj Patil's avatar
      Adapt UNet2D for supre-resolution (#1385) · cecdd8bd
      Suraj Patil authored
      * allow disabling self attention
      
      * add class_embedding
      
      * fix copies
      
      * fix condition
      
      * fix copies
      
      * do_self_attention -> only_cross_attention
      
      * fix copies
      
      * num_classes -> num_class_embeds
      
      * fix default value
      cecdd8bd
  13. 23 Nov, 2022 3 commits
    • Suraj Patil's avatar
      [Transformer2DModel] don't norm twice (#1381) · 15241225
      Suraj Patil authored
      don't norm twice
      15241225
    • Suraj Patil's avatar
      update unet2d (#1376) · f07a16e0
      Suraj Patil authored
      * boom boom
      
      * remove duplicate arg
      
      * add use_linear_proj arg
      
      * fix copies
      
      * style
      
      * add fast tests
      
      * use_linear_proj -> use_linear_projection
      f07a16e0
    • Patrick von Platen's avatar
      [Versatile Diffusion] Add versatile diffusion model (#1283) · 2625fb59
      Patrick von Platen authored
      
      
      * up
      
      * convert dual unet
      
      * revert dual attn
      
      * adapt for vd-official
      
      * test the full pipeline
      
      * mixed inference
      
      * mixed inference for text2img
      
      * add image prompting
      
      * fix clip norm
      
      * split text2img and img2img
      
      * fix format
      
      * refactor text2img
      
      * mega pipeline
      
      * add optimus
      
      * refactor image var
      
      * wip text_unet
      
      * text unet end to end
      
      * update tests
      
      * reshape
      
      * fix image to text
      
      * add some first docs
      
      * dual guided pipeline
      
      * fix token ratio
      
      * propose change
      
      * dual transformer as a native module
      
      * DualTransformer(nn.Module)
      
      * DualTransformer(nn.Module)
      
      * correct unconditional image
      
      * save-load with mega pipeline
      
      * remove image to text
      
      * up
      
      * uP
      
      * fix
      
      * up
      
      * final fix
      
      * remove_unused_weights
      
      * test updates
      
      * save progress
      
      * uP
      
      * fix dual prompts
      
      * some fixes
      
      * finish
      
      * style
      
      * finish renaming
      
      * up
      
      * fix
      
      * fix
      
      * fix
      
      * finish
      Co-authored-by: default avataranton-l <anton@huggingface.co>
      2625fb59
  14. 22 Nov, 2022 1 commit
  15. 21 Nov, 2022 1 commit
  16. 14 Nov, 2022 1 commit
  17. 08 Nov, 2022 1 commit
  18. 03 Nov, 2022 1 commit
    • Will Berman's avatar
      VQ-diffusion (#658) · ef2ea33c
      Will Berman authored
      
      
      * Changes for VQ-diffusion VQVAE
      
      Add specify dimension of embeddings to VQModel:
      `VQModel` will by default set the dimension of embeddings to the number
      of latent channels. The VQ-diffusion VQVAE has a smaller
      embedding dimension, 128, than number of latent channels, 256.
      
      Add AttnDownEncoderBlock2D and AttnUpDecoderBlock2D to the up and down
      unet block helpers. VQ-diffusion's VQVAE uses those two block types.
      
      * Changes for VQ-diffusion transformer
      
      Modify attention.py so SpatialTransformer can be used for
      VQ-diffusion's transformer.
      
      SpatialTransformer:
      - Can now operate over discrete inputs (classes of vector embeddings) as well as continuous.
      - `in_channels` was made optional in the constructor so two locations where it was passed as a positional arg were moved to kwargs
      - modified forward pass to take optional timestep embeddings
      
      ImagePositionalEmbeddings:
      - added to provide positional embeddings to discrete inputs for latent pixels
      
      BasicTransformerBlock:
      - norm layers were made configurable so that the VQ-diffusion could use AdaLayerNorm with timestep embeddings
      - modified forward pass to take optional timestep embeddings
      
      CrossAttention:
      - now may optionally take a bias parameter for its query, key, and value linear layers
      
      FeedForward:
      - Internal layers are now configurable
      
      ApproximateGELU:
      - Activation function in VQ-diffusion's feedforward layer
      
      AdaLayerNorm:
      - Norm layer modified to incorporate timestep embeddings
      
      * Add VQ-diffusion scheduler
      
      * Add VQ-diffusion pipeline
      
      * Add VQ-diffusion convert script to diffusers
      
      * Add VQ-diffusion dummy objects
      
      * Add VQ-diffusion markdown docs
      
      * Add VQ-diffusion tests
      
      * some renaming
      
      * some fixes
      
      * more renaming
      
      * correct
      
      * fix typo
      
      * correct weights
      
      * finalize
      
      * fix tests
      
      * Apply suggestions from code review
      Co-authored-by: default avatarAnton Lozhkov <aglozhkov@gmail.com>
      
      * Apply suggestions from code review
      Co-authored-by: default avatarPedro Cuenca <pedro@huggingface.co>
      
      * finish
      
      * finish
      
      * up
      Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
      Co-authored-by: default avatarAnton Lozhkov <aglozhkov@gmail.com>
      Co-authored-by: default avatarPedro Cuenca <pedro@huggingface.co>
      ef2ea33c
  19. 02 Nov, 2022 2 commits
    • Omiita's avatar
      Fix a small typo of a variable name (#1063) · 1216a3b1
      Omiita authored
      Fix a small typo
      
      fix a typo in `models/attention.py`.
      weight -> width
      1216a3b1
    • MatthieuTPHR's avatar
      Up to 2x speedup on GPUs using memory efficient attention (#532) · 98c42134
      MatthieuTPHR authored
      
      
      * 2x speedup using memory efficient attention
      
      * remove einops dependency
      
      * Swap K, M in op instantiation
      
      * Simplify code, remove unnecessary maybe_init call and function, remove unused self.scale parameter
      
      * make xformers a soft dependency
      
      * remove one-liner functions
      
      * change one letter variable to appropriate names
      
      * Remove Env variable dependency, remove MemoryEfficientCrossAttention class and use enable_xformers_memory_efficient_attention method
      
      * Add memory efficient attention toggle to img2img and inpaint pipelines
      
      * Clearer management of xformers' availability
      
      * update optimizations markdown to add info about memory efficient attention
      
      * add benchmarks for TITAN RTX
      
      * More detailed explanation of how the mem eff benchmark were ran
      
      * Removing autocast from optimization markdown
      
      * import_utils: import torch only if is available
      Co-authored-by: default avatarNouamane Tazi <nouamane98@gmail.com>
      98c42134
  20. 31 Oct, 2022 1 commit
  21. 29 Oct, 2022 1 commit
    • Pedro Cuenca's avatar
      Experimental: allow fp16 in `mps` (#961) · 95414bd6
      Pedro Cuenca authored
      * Docs: refer to pre-RC version of PyTorch 1.13.0.
      
      * Remove temporary workaround for unavailable op.
      
      * Update comment to make it less ambiguous.
      
      * Remove use of contiguous in mps.
      
      It appears to not longer be necessary.
      
      * Special case: use einsum for much better performance in mps
      
      * Update mps docs.
      
      * MPS: make pipeline work in half precision.
      95414bd6
  22. 25 Oct, 2022 1 commit
  23. 12 Oct, 2022 1 commit
  24. 30 Sep, 2022 2 commits
    • Nouamane Tazi's avatar
      Fix slow tests (#689) · b2cfc7a0
      Nouamane Tazi authored
      * revert using baddbmm in attention
      - to fix `test_stable_diffusion_memory_chunking` test
      
      * styling
      b2cfc7a0
    • Nouamane Tazi's avatar
      Optimize Stable Diffusion (#371) · 9ebaea54
      Nouamane Tazi authored
      * initial commit
      
      * make UNet stream capturable
      
      * try to fix noise_pred value
      
      * remove cuda graph and keep NB
      
      * non blocking unet with PNDMScheduler
      
      * make timesteps np arrays for pndm scheduler
      because lists don't get formatted to tensors in `self.set_format`
      
      * make max async in pndm
      
      * use channel last format in unet
      
      * avoid moving timesteps device in each unet call
      
      * avoid memcpy op in `get_timestep_embedding`
      
      * add `channels_last` kwarg to `DiffusionPipeline.from_pretrained`
      
      * update TODO
      
      * replace `channels_last` kwarg with `memory_format` for more generality
      
      * revert the channels_last changes to leave it for another PR
      
      * remove non_blocking when moving input ids to device
      
      * remove blocking from all .to() operations at beginning of pipeline
      
      * fix merging
      
      * fix merging
      
      * model can run in other precisions without autocast
      
      * attn refactoring
      
      * Revert "attn refactoring"
      
      This reverts commit 0c70c0e189cd2c4d8768274c9fcf5b940ee310fb.
      
      * remove restriction to run conv_norm in fp32
      
      * use `baddbmm` instead of `matmul`for better in attention for better perf
      
      * removing all reshapes to test perf
      
      * Revert "removing all reshapes to test perf"
      
      This reverts commit 006ccb8a8c6bc7eb7e512392e692a29d9b1553cd.
      
      * add shapes comments
      
      * hardcore whats needed for jitting
      
      * Revert "hardcore whats needed for jitting"
      
      This reverts commit 2fa9c698eae2890ac5f8e367ca80532ecf94df9a.
      
      * Revert "remove restriction to run conv_norm in fp32"
      
      This reverts commit cec592890c32da3d1b78d38b49e4307aedf459b9.
      
      * revert using baddmm in attention's forward
      
      * cleanup comment
      
      * remove restriction to run conv_norm in fp32. no quality loss was noticed
      
      This reverts commit cc9bc1339c998ebe9e7d733f910c6d72d9792213.
      
      * add more optimizations techniques to docs
      
      * Revert "add shapes comments"
      
      This reverts commit 31c58eadb8892f95478cdf05229adf678678c5f4.
      
      * apply suggestions
      
      * make quality
      
      * apply suggestions
      
      * styling
      
      * `scheduler.timesteps` are now arrays so we dont need .to()
      
      * remove useless .type()
      
      * use mean instead of max in `test_stable_diffusion_inpaint_pipeline_k_lms`
      
      * move scheduler timestamps to correct device if tensors
      
      * add device to `set_timesteps` in LMSD scheduler
      
      * `self.scheduler.set_timesteps` now uses device arg for schedulers that accept it
      
      * quick fix
      
      * styling
      
      * remove kwargs from schedulers `set_timesteps`
      
      * revert to using max in K-LMS inpaint pipeline test
      
      * Revert "`self.scheduler.set_timesteps` now uses device arg for schedulers that accept it"
      
      This reverts commit 00d5a51e5c20d8d445c8664407ef29608106d899.
      
      * move timesteps to correct device before loop in SD pipeline
      
      * apply previous fix to other SD pipelines
      
      * UNet now accepts tensor timesteps even on wrong device, to avoid errors
      - it shouldnt affect performance if timesteps are alrdy on correct device
      - it does slow down performance if they're on the wrong device
      
      * fix pipeline when timesteps are arrays with strides
      9ebaea54
  25. 27 Sep, 2022 1 commit
  26. 19 Sep, 2022 3 commits
  27. 15 Sep, 2022 1 commit
  28. 14 Sep, 2022 1 commit