1. 30 Sep, 2022 2 commits
    • Allow resolutions that are not multiples of 64 (#505) · a784be2e
      Josh Achiam authored
      
      
      * Allow resolutions that are not multiples of 64
      
      * ran black
      
      * fix bug
      
      * add test
      
      * more explanation
      
      * more comments
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
    • Optimize Stable Diffusion (#371) · 9ebaea54
      Nouamane Tazi authored
      * initial commit
      
      * make UNet stream capturable
      
      * try to fix noise_pred value
      
      * remove cuda graph and keep NB
      
      * non blocking unet with PNDMScheduler
      
      * make timesteps np arrays for pndm scheduler
      because lists don't get formatted to tensors in `self.set_format`
      
      * make max async in pndm
      
      * use channel last format in unet
      
      * avoid moving timesteps device in each unet call
      
      * avoid memcpy op in `get_timestep_embedding`
      
      * add `channels_last` kwarg to `DiffusionPipeline.from_pretrained`
      
      * update TODO
      
      * replace `channels_last` kwarg with `memory_format` for more generality
      
      * revert the channels_last changes to leave it for another PR
      
      * remove non_blocking when moving input ids to device
      
      * remove blocking from all .to() operations at beginning of pipeline
      
      * fix merging
      
      * fix merging
      
      * model can run in other precisions without autocast
      
      * attn refactoring
      
      * Revert "attn refactoring"
      
      This reverts commit 0c70c0e189cd2c4d8768274c9fcf5b940ee310fb.
      
      * remove restriction to run conv_norm in fp32
      
      * use `baddbmm` instead of `matmul` in attention for better perf
      
      * removing all reshapes to test perf
      
      * Revert "removing all reshapes to test perf"
      
      This reverts commit 006ccb8a8c6bc7eb7e512392e692a29d9b1553cd.
      
      * add shapes comments
      
      * hardcore whats needed for jitting
      
      * Revert "hardcore whats needed for jitting"
      
      This reverts commit 2fa9c698eae2890ac5f8e367ca80532ecf94df9a.
      
      * Revert "remove restriction to run conv_norm in fp32"
      
      This reverts commit cec592890c32da3d1b78d38b49e4307aedf459b9.
      
      * revert using `baddbmm` in attention's forward
      
      * cleanup comment
      
      * remove restriction to run conv_norm in fp32. no quality loss was noticed
      
      This reverts commit cc9bc1339c998ebe9e7d733f910c6d72d9792213.
      
      * add more optimizations techniques to docs
      
      * Revert "add shapes comments"
      
      This reverts commit 31c58eadb8892f95478cdf05229adf678678c5f4.
      
      * apply suggestions
      
      * make quality
      
      * apply suggestions
      
      * styling
      
      * `scheduler.timesteps` are now arrays so we don't need .to()
      
      * remove useless .type()
      
      * use mean instead of max in `test_stable_diffusion_inpaint_pipeline_k_lms`
      
      * move scheduler timesteps to correct device if tensors
      
      * add device to `set_timesteps` in LMSD scheduler
      
      * `self.scheduler.set_timesteps` now uses device arg for schedulers that accept it
      
      * quick fix
      
      * styling
      
      * remove kwargs from schedulers `set_timesteps`
      
      * revert to using max in K-LMS inpaint pipeline test
      
      * Revert "`self.scheduler.set_timesteps` now uses device arg for schedulers that accept it"
      
      This reverts commit 00d5a51e5c20d8d445c8664407ef29608106d899.
      
      * move timesteps to correct device before loop in SD pipeline
      
      * apply previous fix to other SD pipelines
      
      * UNet now accepts tensor timesteps even on wrong device, to avoid errors
      - it shouldn't affect performance if timesteps are already on the correct device
      - it does slow down performance if they're on the wrong device
      
      * fix pipeline when timesteps are arrays with strides
  2. 29 Sep, 2022 3 commits
  3. 28 Sep, 2022 1 commit
  4. 27 Sep, 2022 10 commits
    • Fix `main`: stable diffusion pipelines cannot be loaded (#655) · 235770dd
      Pedro Cuenca authored
      * Replace deprecation warning f-string with class name.
      
      When `__repr__` is invoked on the instance, serialization of
      `config_dict` fails because it contains `kwargs` of type `<class
      inspect._empty>`.
      
      * Revert "Replace deprecation warning f-string with class name."
      
      This reverts commit 1c4eb8cb104374bd84e43865fc3865862473799c.
      
      * Do not attempt to register `"kwargs"` as an attribute.
      
      Otherwise serialization could fail.
      This may happen for other attributes, so we should create a better
      solution.
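
A minimal sketch of the failure mode this commit describes, assuming a simplified stand-in for the config registration. The `Pipeline` class and the `collect_init_defaults` and `is_json_serializable` helpers are hypothetical and are not the library's code. A `**kwargs` parameter has no default, so `inspect` reports it as `inspect.Parameter.empty` (the `inspect._empty` class), which `json.dumps` cannot serialize.

```python
import inspect
import json


class Pipeline:
    """Hypothetical stand-in for a class whose __init__ accepts **kwargs."""

    def __init__(self, steps=50, **kwargs):
        self.steps = steps


def collect_init_defaults(cls, skip_var_keyword=True):
    """Map each __init__ parameter name to its default value.

    With skip_var_keyword=True the **kwargs parameter is left out,
    mirroring the idea of not registering "kwargs" as an attribute.
    """
    params = inspect.signature(cls.__init__).parameters
    defaults = {}
    for name, param in params.items():
        if name == "self":
            continue
        if skip_var_keyword and param.kind is inspect.Parameter.VAR_KEYWORD:
            continue
        defaults[name] = param.default
    return defaults


def is_json_serializable(config):
    """Return True if json.dumps accepts the config, False on TypeError."""
    try:
        json.dumps(config)
        return True
    except TypeError:
        return False


# Registering everything pulls in the inspect._empty sentinel for kwargs.
with_kwargs = collect_init_defaults(Pipeline, skip_var_keyword=False)
# Skipping the **kwargs parameter keeps the config serializable.
without_kwargs = collect_init_defaults(Pipeline)
```

Filtering on `param.kind` rather than on the literal name `"kwargs"` is a choice made for this sketch; as the commit message notes, other attributes may hit the same problem.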
    • Fix onnx tensor format (#654) · d8572f20
      Anton Lozhkov authored
      fix np onnx
    • [Pytorch] add dep. warning for pytorch schedulers (#651) · 85494e88
      Kashif Rasul authored
      * add dep. warning for schedulers
      
      * fix format
    • [DDIM, DDPM] fix add_noise (#648) · 33045382
      Suraj Patil authored
      fix add noise
    • [Pytorch] Pytorch only schedulers (#534) · bd8df2da
      Kashif Rasul authored
      
      
      * pytorch only schedulers
      
      * fix style
      
      * remove match_shape
      
      * pytorch only ddpm
      
      * remove SchedulerMixin
      
      * remove numpy from karras_ve
      
      * fix types
      
      * remove numpy from lms_discrete
      
      * remove numpy from pndm
      
      * fix typo
      
      * remove mixin and numpy from sde_vp and ve
      
      * remove remaining tensor_format
      
      * fix style
      
      * sigmas has to be torch tensor
      
      * removed set_format in readme
      
      * remove set format from docs
      
      * remove set_format from pipelines
      
      * update tests
      
      * fix typo
      
      * continue to use mixin
      
      * fix imports
      
      * removed unused imports
      
      * match shape instead of assuming image shapes
      
      * remove import typo
      
      * update call to add_noise
      
      * use math instead of numpy
      
      * fix t_index
      
      * removed commented out numpy tests
      
      * timesteps needs to be discrete
      
      * cast timesteps to int in flax scheduler too
      
      * fix device mismatch issue
      
      * small fix
      
      * Update src/diffusers/schedulers/scheduling_pndm.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
    • Fix `SpatialTransformer` (#578) · d886e497
      Yih-Dar authored
      
      
      * Fix SpatialTransformer
      
      * Fix SpatialTransformer
      Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
    • Flax pipeline pndm (#583) · ab3fd671
      Pedro Cuenca authored
      
      
      * WIP: flax FlaxDiffusionPipeline & FlaxStableDiffusionPipeline
      
      * todo comment
      
      * Fix imports
      
      * Fix imports
      
      * add dummies
      
      * Fix empty init
      
      * make pipeline work
      
      * up
      
      * Allow dtype to be overridden on model load.
      
      This may be a temporary solution until #567 is addressed.
      
      * Convert params to bfloat16 or fp16 after loading.
      
      This deals with the weights, not the model.
      
      * Use Flax schedulers (typing, docstring)
      
      * PNDM: replace control flow with jax functions.
      
      Otherwise jitting/parallelization don't work properly as they don't know
      how to deal with traced objects.
      
      I temporarily removed `step_prk`.
      
      * Pass latents shape to scheduler set_timesteps()
      
      PNDMScheduler uses it to reserve space, other schedulers will just
      ignore it.
      
      * Wrap model imports inside availability checks.
      
      * Optionally return state in from_config.
      
      Useful for Flax schedulers.
      
      * Do not convert model weights to dtype.
      
      * Re-enable PRK steps with functional implementation.
      
      Values returned still not verified for correctness.
      
      * Remove left over has_state var.
      
      * make style
      
      * Apply suggestion list -> tuple
      Co-authored-by: Suraj Patil <surajp815@gmail.com>
      
      * Apply suggestion list -> tuple
      Co-authored-by: Suraj Patil <surajp815@gmail.com>
      
      * Remove unused comments.
      
      * Use zeros instead of empty.
      Co-authored-by: Mishig Davaadorj <dmishig@gmail.com>
      Co-authored-by: Mishig Davaadorj <mishig.davaadorj@coloradocollege.edu>
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      Co-authored-by: Suraj Patil <surajp815@gmail.com>
    • c070e5f0
    • Remove deprecated `torch_device` kwarg (#623) · b671cb09
      Pedro Cuenca authored
      * Remove deprecated `torch_device` kwarg.
      
      * Remove unused imports.
    • Warning for too long prompts in DiffusionPipelines (Resolve #447) (#472) · f7ebe569
      Yuta Hayashibe authored
      * Return encoded texts by DiffusionPipelines
      
      * Updated README to show how to use encoded_text_input
      
      * Reverted examples in README.md
      
      * Reverted all
      
      * Warning for long prompts
      
      * Fix bugs
      
      * Formatted
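
A rough sketch of the kind of long-prompt warning this commit adds, under stated assumptions: whitespace splitting stands in for a real tokenizer, the `truncate_prompt` helper is hypothetical, and the 77-token default is assumed from the CLIP text encoder's usual maximum length. It is not the pipeline's actual implementation.

```python
import warnings

MAX_TOKENS = 77  # assumed limit, taken from CLIP's usual maximum sequence length


def truncate_prompt(prompt, max_tokens=MAX_TOKENS):
    """Split on whitespace (a stand-in for a real tokenizer) and truncate.

    Emits a warning naming the dropped text when the prompt is too long.
    """
    tokens = prompt.split()
    if len(tokens) <= max_tokens:
        return tokens
    removed = " ".join(tokens[max_tokens:])
    warnings.warn(
        f"Prompt truncated to {max_tokens} tokens; dropped: {removed!r}"
    )
    return tokens[:max_tokens]
```

Reporting the dropped text, rather than truncating silently, lets the caller see which part of the prompt had no effect on the output.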
  5. 24 Sep, 2022 2 commits
  6. 23 Sep, 2022 6 commits
  7. 22 Sep, 2022 4 commits
  8. 21 Sep, 2022 9 commits
  9. 20 Sep, 2022 3 commits