  1. 05 Oct, 2022 3 commits
  2. 04 Oct, 2022 1 commit
  3. 03 Oct, 2022 4 commits
  4. 02 Oct, 2022 1 commit
  5. 30 Sep, 2022 2 commits
    • refactor: update ldm-bert `config.json` url closes #675 (#680) · 877bec8a
      Ryan Russell authored
      
      
      refactor: update ldm-bert `config.json` url
      Signed-off-by: Ryan Russell <git@ryanrussell.org>
      877bec8a
    • Optimize Stable Diffusion (#371) · 9ebaea54
      Nouamane Tazi authored
      * initial commit
      
      * make UNet stream capturable
      
      * try to fix noise_pred value
      
      * remove cuda graph and keep NB
      
      * non blocking unet with PNDMScheduler
      
      * make timesteps np arrays for pndm scheduler
      because lists don't get formatted to tensors in `self.set_format`
      
      * make max async in pndm
      
      * use channel last format in unet
      
      * avoid moving timesteps device in each unet call
      
      * avoid memcpy op in `get_timestep_embedding`
      
      * add `channels_last` kwarg to `DiffusionPipeline.from_pretrained`
      
      * update TODO
      
      * replace `channels_last` kwarg with `memory_format` for more generality
      
      * revert the channels_last changes to leave it for another PR
      
      * remove non_blocking when moving input ids to device
      
      * remove blocking from all .to() operations at beginning of pipeline
      
      * fix merging
      
      * fix merging
      
      * model can run in other precisions without autocast
      
      * attn refactoring
      
      * Revert "attn refactoring"
      
      This reverts commit 0c70c0e189cd2c4d8768274c9fcf5b940ee310fb.
      
      * remove restriction to run conv_norm in fp32
      
      * use `baddbmm` instead of `matmul` in attention for better perf
      
      * removing all reshapes to test perf
      
      * Revert "removing all reshapes to test perf"
      
      This reverts commit 006ccb8a8c6bc7eb7e512392e692a29d9b1553cd.
      
      * add shapes comments
      
      * hardcode what's needed for jitting
      
      * Revert "hardcode what's needed for jitting"
      
      This reverts commit 2fa9c698eae2890ac5f8e367ca80532ecf94df9a.
      
      * Revert "remove restriction to run conv_norm in fp32"
      
      This reverts commit cec592890c32da3d1b78d38b49e4307aedf459b9.
      
      * revert using `baddbmm` in attention's forward
      
      * cleanup comment
      
      * remove restriction to run conv_norm in fp32. no quality loss was noticed
      
      This reverts commit cc9bc1339c998ebe9e7d733f910c6d72d9792213.
      
      * add more optimizations techniques to docs
      
      * Revert "add shapes comments"
      
      This reverts commit 31c58eadb8892f95478cdf05229adf678678c5f4.
      
      * apply suggestions
      
      * make quality
      
      * apply suggestions
      
      * styling
      
      * `scheduler.timesteps` are now arrays so we don't need .to()
      
      * remove useless .type()
      
      * use mean instead of max in `test_stable_diffusion_inpaint_pipeline_k_lms`
      
      * move scheduler timesteps to correct device if tensors
      
      * add device to `set_timesteps` in LMSD scheduler
      
      * `self.scheduler.set_timesteps` now uses device arg for schedulers that accept it
      
      * quick fix
      
      * styling
      
      * remove kwargs from schedulers `set_timesteps`
      
      * revert to using max in K-LMS inpaint pipeline test
      
      * Revert "`self.scheduler.set_timesteps` now uses device arg for schedulers that accept it"
      
      This reverts commit 00d5a51e5c20d8d445c8664407ef29608106d899.
      
      * move timesteps to correct device before loop in SD pipeline
      
      * apply previous fix to other SD pipelines
      
      * UNet now accepts tensor timesteps even on wrong device, to avoid errors
      - it shouldn't affect performance if timesteps are already on the correct device
      - it does slow down performance if they're on the wrong device
      
      * fix pipeline when timesteps are arrays with strides
      9ebaea54
  6. 27 Sep, 2022 5 commits
    • Fix onnx tensor format (#654) · d8572f20
      Anton Lozhkov authored
      fix np onnx
      d8572f20
    • [Pytorch] Pytorch only schedulers (#534) · bd8df2da
      Kashif Rasul authored
      
      
      * pytorch only schedulers
      
      * fix style
      
      * remove match_shape
      
      * pytorch only ddpm
      
      * remove SchedulerMixin
      
      * remove numpy from karras_ve
      
      * fix types
      
      * remove numpy from lms_discrete
      
      * remove numpy from pndm
      
      * fix typo
      
      * remove mixin and numpy from sde_vp and ve
      
      * remove remaining tensor_format
      
      * fix style
      
      * sigmas has to be torch tensor
      
      * removed set_format in readme
      
      * remove set format from docs
      
      * remove set_format from pipelines
      
      * update tests
      
      * fix typo
      
      * continue to use mixin
      
      * fix imports
      
      * removed unused imports
      
      * match shape instead of assuming image shapes
      
      * remove import typo
      
      * update call to add_noise
      
      * use math instead of numpy
      
      * fix t_index
      
      * removed commented out numpy tests
      
      * timesteps needs to be discrete
      
      * cast timesteps to int in flax scheduler too
      
      * fix device mismatch issue
      
      * small fix
      
      * Update src/diffusers/schedulers/scheduling_pndm.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      bd8df2da
    • Flax pipeline pndm (#583) · ab3fd671
      Pedro Cuenca authored
      
      
      * WIP: flax FlaxDiffusionPipeline & FlaxStableDiffusionPipeline
      
      * todo comment
      
      * Fix imports
      
      * Fix imports
      
      * add dummies
      
      * Fix empty init
      
      * make pipeline work
      
      * up
      
      * Allow dtype to be overridden on model load.
      
      This may be a temporary solution until #567 is addressed.
      
      * Convert params to bfloat16 or fp16 after loading.
      
      This deals with the weights, not the model.
      
      * Use Flax schedulers (typing, docstring)
      
      * PNDM: replace control flow with jax functions.
      
      Otherwise jitting/parallelization don't work properly as they don't know
      how to deal with traced objects.
      
      I temporarily removed `step_prk`.
      
      * Pass latents shape to scheduler set_timesteps()
      
      PNDMScheduler uses it to reserve space, other schedulers will just
      ignore it.
      
      * Wrap model imports inside availability checks.
      
      * Optionally return state in from_config.
      
      Useful for Flax schedulers.
      
      * Do not convert model weights to dtype.
      
      * Re-enable PRK steps with functional implementation.
      
      Values returned still not verified for correctness.
      
      * Remove left over has_state var.
      
      * make style
      
      * Apply suggestion list -> tuple
      Co-authored-by: Suraj Patil <surajp815@gmail.com>
      
      * Apply suggestion list -> tuple
      Co-authored-by: Suraj Patil <surajp815@gmail.com>
      
      * Remove unused comments.
      
      * Use zeros instead of empty.
      Co-authored-by: Mishig Davaadorj <dmishig@gmail.com>
      Co-authored-by: Mishig Davaadorj <mishig.davaadorj@coloradocollege.edu>
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      Co-authored-by: Suraj Patil <surajp815@gmail.com>
      ab3fd671
    • Remove deprecated `torch_device` kwarg (#623) · b671cb09
      Pedro Cuenca authored
      * Remove deprecated `torch_device` kwarg.
      
      * Remove unused imports.
      b671cb09
    • Warning for too long prompts in DiffusionPipelines (Resolve #447) (#472) · f7ebe569
      Yuta Hayashibe authored
      * Return encoded texts by DiffusionPipelines
      
      * Updated README to show how to use encoded_text_input
      
      * Reverted examples in README.md
      
      * Reverted all
      
      * Warning for long prompts
      
      * Fix bugs
      
      * Formatted
      f7ebe569
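
The prompt-length warning described in this commit can be sketched in a few lines. This is a minimal illustration, not the code from the PR: the helper `truncate_prompt`, the whitespace tokenizer, and the default limit of 77 tokens (the context length of the CLIP text encoder used by Stable Diffusion) are assumptions made for the example.

```python
from __future__ import annotations

import warnings


def truncate_prompt(prompt: str, max_length: int = 77) -> tuple[list[str], list[str]]:
    """Split a prompt into kept and dropped tokens, warning if any are dropped.

    Hypothetical helper: a real pipeline would use the model's tokenizer,
    not whitespace splitting.
    """
    tokens = prompt.split()
    kept, dropped = tokens[:max_length], tokens[max_length:]
    if dropped:
        # Tell the user exactly which part of the prompt is being ignored.
        warnings.warn(
            "The following part of the prompt was truncated because the model "
            f"handles at most {max_length} tokens: {' '.join(dropped)}"
        )
    return kept, dropped


# A deliberately small limit so the warning path is exercised.
kept, dropped = truncate_prompt("a photo of an astronaut riding a horse", max_length=5)
```

Returning the dropped tokens alongside the kept ones lets the caller decide whether to log the warning, raise an error, or show the truncated text to the user.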
  7. 24 Sep, 2022 1 commit
  8. 23 Sep, 2022 3 commits
  9. 22 Sep, 2022 1 commit
    • [UNet2DConditionModel] add gradient checkpointing (#461) · e7120bae
      Suraj Patil authored
      * add grad ckpt to downsample blocks
      
      * make it work
      
      * don't pass gradient_checkpointing to upsample block
      
      * add tests for UNet2DConditionModel
      
      * add test_gradient_checkpointing
      
      * add gradient_checkpointing for up and down blocks
      
      * add functions to enable and disable grad ckpt
      
      * remove the forward argument
      
      * better naming
      
      * make supports_gradient_checkpointing private
      e7120bae
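
The enable/disable functions and the private support flag mentioned in this commit follow a common toggle pattern, sketched below in plain Python. This is only an illustration of that pattern under stated assumptions: `ToyBlock` and `ToyModel` are made-up classes, and the sketch flips flags without performing any actual activation checkpointing.

```python
class ToyBlock:
    """Hypothetical block that can run with or without gradient checkpointing."""

    def __init__(self):
        self.gradient_checkpointing = False


class ToyModel:
    # Private class-level capability flag: subclasses opt in by setting it.
    _supports_gradient_checkpointing = True

    def __init__(self, num_blocks=2):
        self.down_blocks = [ToyBlock() for _ in range(num_blocks)]
        self.up_blocks = [ToyBlock() for _ in range(num_blocks)]

    def _set_gradient_checkpointing(self, value):
        # Propagate the setting to every block that honours it.
        for block in self.down_blocks + self.up_blocks:
            block.gradient_checkpointing = value

    def enable_gradient_checkpointing(self):
        if not self._supports_gradient_checkpointing:
            raise ValueError(f"{type(self).__name__} does not support gradient checkpointing.")
        self._set_gradient_checkpointing(True)

    def disable_gradient_checkpointing(self):
        if self._supports_gradient_checkpointing:
            self._set_gradient_checkpointing(False)


model = ToyModel()
model.enable_gradient_checkpointing()
```

Keeping the switch on the model rather than in the forward arguments means callers configure it once, which matches the "remove the forward argument" step in the list above.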
  10. 21 Sep, 2022 1 commit
    • Allow dtype to be specified in Flax pipeline (#600) · fb2fbab1
      Pedro Cuenca authored
      * Fix typo in docstring.
      
      * Allow dtype to be overridden on model load.
      
      This may be a temporary solution until #567 is addressed.
      
      * Create latents in float32
      
      The denoising loop always computes the next step in float32, so this
      would fail when using `bfloat16`.
      fb2fbab1
  11. 20 Sep, 2022 4 commits
  12. 19 Sep, 2022 2 commits
  13. 17 Sep, 2022 1 commit
  14. 16 Sep, 2022 2 commits
  15. 15 Sep, 2022 1 commit
    • Karras VE, DDIM and DDPM flax schedulers (#508) · b34be039
      Kashif Rasul authored
      * beta never changes removed from state
      
      * fix typos in docs
      
      * removed unused var
      
      * initial ddim flax scheduler
      
      * import
      
      * added dummy objects
      
      * fix style
      
      * fix typo
      
      * docs
      
      * fix typo in comment
      
      * set return type
      
      * added flax ddpm
      
      * fix style
      
      * remake
      
      * pass PRNG key as argument and split before use
      
      * fix doc string
      
      * use config
      
      * added flax Karras VE scheduler
      
      * make style
      
      * fix dummy
      
      * fix ndarray type annotation
      
      * replace returns a new state
      
      * added lms_discrete scheduler
      
      * use self.config
      
      * add_noise needs state
      
      * use config
      
      * use config
      
      * docstring
      
      * added flax score sde ve
      
      * fix imports
      
      * fix typos
      b34be039
  16. 13 Sep, 2022 1 commit
  17. 12 Sep, 2022 1 commit
    • update expected results of slow tests (#268) · f4781a0b
      Kashif Rasul authored
      
      
      * update expected results of slow tests
      
      * relax sum and mean tests
      
      * Print shapes when reporting exception
      
      * formatting
      
      * fix sentence
      
      * relax test_stable_diffusion_fast_ddim for gpu fp16
      
      * relax flaky tests on GPU
      
      * added comment on large tolerances
      
      * black
      
      * format
      
      * set scheduler seed
      
      * added generator
      
      * use np.isclose
      
      * set num_inference_steps to 50
      
      * fix dep. warning
      
      * update expected_slice
      
      * preprocess if image
      
      * updated expected results
      
      * updated expected from CI
      
      * pass generator to VAE
      
      * undo change back to orig
      
      * use original
      
      * revert back the expected on cpu
      
      * revert back values for CPU
      
      * more undo
      
      * update result after using gen
      
      * update mean
      
      * set generator for mps
      
      * update expected on CI server
      
      * undo
      
      * use new seed every time
      
      * cpu manual seed
      
      * reduce num_inference_steps
      
      * style
      
      * use generator for randn
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      f4781a0b
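
Several bullets in this commit relax exact comparisons into tolerance-based ones. As a rough illustration of what an `np.isclose`-style check does, the standard-library sketch below applies the same rule, `|a - b| <= atol + rtol * |b|`, element by element. The helper name `slices_close` and the sample values are made up for this example; the real tests compare NumPy arrays.

```python
def slices_close(actual, expected, rtol=1e-5, atol=1e-8):
    """Return True if every pair satisfies |a - e| <= atol + rtol * |e|."""
    if len(actual) != len(expected):
        return False
    return all(abs(a - e) <= atol + rtol * abs(e) for a, e in zip(actual, expected))


expected_slice = [0.8887, 0.915, 0.91]   # illustrative reference values only
gpu_slice = [0.8891, 0.9148, 0.9103]     # same values with small device-dependent drift

strict = slices_close(gpu_slice, expected_slice)              # default tolerances
relaxed = slices_close(gpu_slice, expected_slice, atol=1e-2)  # relaxed absolute tolerance
```

With the default tolerances the drifted values are rejected, while the relaxed absolute tolerance accepts them. That trade-off is why a comment explaining any large tolerance is worth keeping next to the test.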
  18. 08 Sep, 2022 6 commits
    • [Black] Update black (#433) · b2b3b1a8
      Patrick von Platen authored
      * Update black
      
      * update table
      b2b3b1a8
    • [Docs] Correct links (#432) · 44968e42
      Patrick von Platen authored
      44968e42
    • Mark inpainting experimental (#430) · 195ebe5a
      Patrick von Platen authored
      195ebe5a
    • [Outputs] Improve syntax (#423) · f6fb3282
      Patrick von Platen authored
      
      
      * [Outputs] Improve syntax
      
      * improve more
      
      * fix docstring return
      
      * correct all
      
      * uP
      Co-authored-by: Mishig Davaadorj <dmishig@gmail.com>
      f6fb3282
    • [ONNX] Stable Diffusion exporter and pipeline (#399) · 8d9c4a53
      Anton Lozhkov authored
      
      
      * initial export and design
      
      * update imports
      
      * custom provider, import fixes
      
      * Update src/diffusers/onnx_utils.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * Update src/diffusers/onnx_utils.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * remove push_to_hub
      
      * Update src/diffusers/onnx_utils.py
      Co-authored-by: Suraj Patil <surajp815@gmail.com>
      
      * remove torch_device
      
      * numpify the rest of the pipeline
      
      * torchify the safety checker
      
      * revert tensor
      
      * Code review suggestions + quality
      
      * fix tests
      
      * fix provider, add an end-to-end test
      
      * style
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      Co-authored-by: Suraj Patil <surajp815@gmail.com>
      8d9c4a53
    • Inference support for `mps` device (#355) · 5dda1735
      Pedro Cuenca authored
      * Initial support for mps in Stable Diffusion pipeline.
      
      * Initial "warmup" implementation when using mps.
      
      * Make some deterministic tests pass with mps.
      
      * Disable training tests when using mps.
      
      * SD: generate latents in CPU then move to device.
      
      This is especially important when using the mps device, because
      generators are not supported there. See for example
      https://github.com/pytorch/pytorch/issues/84288.
      
      In addition, the other pipelines seem to use the same approach: generate
      the random samples then move to the appropriate device.
      
      After this change, generating an image in MPS produces the same result
      as when using the CPU, if the same seed is used.
      
      * Remove prints.
      
      * Pass AutoencoderKL test_output_pretrained with mps.
      
      Sampling from `posterior` must be done in CPU.
      
      * Style
      
      * Do not use torch.long for log op in mps device.
      
      * Perform incompatible padding ops in CPU.
      
      UNet tests now pass.
      See https://github.com/pytorch/pytorch/issues/84535
      
      
      
      * Style: fix import order.
      
      * Remove unused symbols.
      
      * Remove MPSWarmupMixin, do not apply automatically.
      
      We do apply warmup in the tests, but not during normal use.
      This adopts some PR suggestions by @patrickvonplaten.
      
      * Add comment for mps fallback to CPU step.
      
      * Add README_mps.md for mps installation and use.
      
      * Apply `black` to modified files.
      
      * Restrict README_mps to SD, show measures in table.
      
      * Make PNDM indexing compatible with mps.
      
      Addresses #239.
      
      * Do not use float64 when using LDMScheduler.
      
      Fixes #358.
      
      * Fix typo identified by @patil-suraj
      Co-authored-by: Suraj Patil <surajp815@gmail.com>
      
      * Adapt example to new output style.
      
      * Restore 1:1 results reproducibility with CompVis.
      
      However, mps latents need to be generated in CPU because generators
      don't work in the mps device.
      
      * Move PyTorch nightly to requirements.
      
      * Adapt `test_scheduler_outputs_equivalence` to MPS.
      
      * mps: skip training tests instead of ignoring silently.
      
      * Make VQModel tests pass on mps.
      
      * mps ddim tests: warmup, increase tolerance.
      
      * ScoreSdeVeScheduler indexing made mps compatible.
      
      * Make ldm pipeline tests pass using warmup.
      
      * Style
      
      * Simplify casting as suggested in PR.
      
      * Add Known Issues to readme.
      
      * `isort` import order.
      
      * Remove _mps_warmup helpers from ModelMixin.
      
      And just make changes to the tests.
      
      * Skip tests using unittest decorator for consistency.
      
      * Remove temporary var.
      
      * Remove spurious blank space.
      
      * Remove unused symbol.
      
      * Remove README_mps.
      Co-authored-by: Suraj Patil <surajp815@gmail.com>
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      5dda1735