1. 24 Oct, 2022 1 commit
  2. 21 Oct, 2022 1 commit
    • Support LMSDiscreteScheduler in LDMPipeline (#891) · 31af4d17
      mkshing authored
      
      
      * Support LMSDiscreteScheduler in LDMPipeline
      
      This is a small change to support all schedulers such as LMSDiscreteScheduler in LDMPipeline.
      
      What's changed
      -------
      * Add a `scale_model_input` call before `step` to ensure correct denoising (L77); see the sketch after this entry
      
      * Add "scale the initial noise by the standard deviation required by the scheduler"
      
      * run `make style`
      Co-authored-by: Anton Lozhkov <anton@huggingface.co>
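      A minimal usage sketch of what this change enables, assuming a checkpoint such as `CompVis/ldm-celebahq-256` (the model id and step count are illustrative, not part of the PR):

      ```python
      from diffusers import LDMPipeline, LMSDiscreteScheduler

      # Load an unconditional latent diffusion pipeline and swap in LMSDiscreteScheduler.
      pipe = LDMPipeline.from_pretrained("CompVis/ldm-celebahq-256")
      pipe.scheduler = LMSDiscreteScheduler.from_config(pipe.scheduler.config)

      # The pipeline now scales the initial latents by `scheduler.init_noise_sigma` and
      # calls `scheduler.scale_model_input(latents, t)` before each `scheduler.step(...)`,
      # so sigma-based schedulers like LMS denoise correctly.
      image = pipe(num_inference_steps=50).images[0]
      ```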
  3. 19 Oct, 2022 5 commits
  4. 18 Oct, 2022 2 commits
  5. 14 Oct, 2022 1 commit
  6. 13 Oct, 2022 7 commits
  7. 12 Oct, 2022 1 commit
  8. 11 Oct, 2022 2 commits
  9. 10 Oct, 2022 1 commit
    • [Low CPU memory] + device map (#772) · fab17528
      Patrick von Platen authored
      
      
      * add accelerate to load models with smaller memory footprint
      
      * remove low_cpu_mem_usage as it is redundant
      
      * move accelerate init weights context to modeling utils
      
      * add test to ensure results are the same when loading with accelerate
      
      * add tests to ensure ram usage gets lower when using accelerate
      
      * move accelerate logic to a single snippet under modeling utils and remove it from configuration utils
      
      * format code to pass quality check
      
      * fix imports with isort
      
      * add accelerate to test extra deps
      
      * only import accelerate if device_map is set to auto
      
      * move accelerate availability check to diffusers import utils
      
      * format code
      
      * add device map to pipeline abstraction (usage sketched after this entry)
      
      * lint it to pass PR quality check
      
      * fix class check to use accelerate when using diffusers ModelMixin subclasses
      
      * use low_cpu_mem_usage in transformers if device_map is not available
      
      * NoModuleLayer
      
      * comment out tests
      
      * up
      
      * uP
      
      * finish
      
      * Update src/diffusers/pipelines/stable_diffusion/safety_checker.py
      
      * finish
      
      * uP
      
      * make style
      Co-authored-by: Pi Esposito <piero.skywalker@gmail.com>
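      A hedged usage sketch of the loading path this adds; the checkpoint id is illustrative and `accelerate` must be installed for `device_map="auto"`:

      ```python
      from diffusers import StableDiffusionPipeline

      # With device_map="auto", weights are initialized empty and then loaded directly
      # onto the available devices via accelerate, keeping peak CPU RAM usage low.
      # Subsequent pipeline usage is unchanged.
      pipe = StableDiffusionPipeline.from_pretrained(
          "CompVis/stable-diffusion-v1-4",
          device_map="auto",
      )
      ```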
  10. 07 Oct, 2022 1 commit
    • [img2img, inpainting] fix fp16 inference (#769) · 92d70863
      Suraj Patil authored
      * handle dtype in vae and image2image pipeline (usage sketched after this entry)
      
      * fix inpaint in fp16
      
      * dtype should be handled in add_noise
      
      * style
      
      * address review comments
      
      * add simple fast tests to check fp16
      
      * fix test name
      
      * put mask in fp16
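      A hedged sketch of the fp16 img2img path this fixes; the model id, input file, and prompt are illustrative, and newer diffusers releases name the init image argument `image` rather than `init_image`:

      ```python
      import torch
      from PIL import Image
      from diffusers import StableDiffusionImg2ImgPipeline

      pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
          "CompVis/stable-diffusion-v1-4",
          torch_dtype=torch.float16,
      ).to("cuda")

      init_image = Image.open("input.png").convert("RGB").resize((512, 512))

      # The VAE-encoded latents (and, for inpainting, the mask) are now cast to the model
      # dtype inside the pipeline, so float16 inference no longer hits dtype mismatches.
      image = pipe(prompt="a fantasy landscape", init_image=init_image, strength=0.75).images[0]
      ```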
  11. 06 Oct, 2022 3 commits
  12. 05 Oct, 2022 6 commits
  13. 04 Oct, 2022 1 commit
  14. 03 Oct, 2022 4 commits
  15. 02 Oct, 2022 1 commit
  16. 30 Sep, 2022 2 commits
    • refactor: update ldm-bert `config.json` url closes #675 (#680) · 877bec8a
      Ryan Russell authored
      
      
      refactor: update ldm-bert `config.json` url
      Signed-off-by: Ryan Russell <git@ryanrussell.org>
    • Optimize Stable Diffusion (#371) · 9ebaea54
      Nouamane Tazi authored
      * initial commit
      
      * make UNet stream capturable
      
      * try to fix noise_pred value
      
      * remove cuda graph and keep NB
      
      * non-blocking unet with PNDMScheduler
      
      * make timesteps np arrays for pndm scheduler
      because lists don't get formatted to tensors in `self.set_format`
      
      * make max async in pndm
      
      * use channel last format in unet
      
      * avoid moving timesteps to device in each unet call
      
      * avoid memcpy op in `get_timestep_embedding`
      
      * add `channels_last` kwarg to `DiffusionPipeline.from_pretrained`
      
      * update TODO
      
      * replace `channels_last` kwarg with `memory_format` for more generality
      
      * revert the channels_last changes to leave it for another PR
      
      * remove non_blocking when moving input ids to device
      
      * remove blocking from all .to() operations at beginning of pipeline
      
      * fix merging
      
      * fix merging
      
      * model can run in other precisions without autocast
      
      * attn refactoring
      
      * Revert "attn refactoring"
      
      This reverts commit 0c70c0e189cd2c4d8768274c9fcf5b940ee310fb.
      
      * remove restriction to run conv_norm in fp32
      
      * use `baddbmm` instead of `matmul` in attention for better perf (see the sketch after this entry)
      
      * removing all reshapes to test perf
      
      * Revert "removing all reshapes to test perf"
      
      This reverts commit 006ccb8a8c6bc7eb7e512392e692a29d9b1553cd.
      
      * add shapes comments
      
      * hardcode what's needed for jitting
      
      * Revert "hardcore whats needed for jitting"
      
      This reverts commit 2fa9c698eae2890ac5f8e367ca80532ecf94df9a.
      
      * Revert "remove restriction to run conv_norm in fp32"
      
      This reverts commit cec592890c32da3d1b78d38b49e4307aedf459b9.
      
      * revert using baddbmm in attention's forward
      
      * cleanup comment
      
      * remove restriction to run conv_norm in fp32. no quality loss was noticed
      
      This reverts commit cc9bc1339c998ebe9e7d733f910c6d72d9792213.
      
      * add more optimizations techniques to docs
      
      * Revert "add shapes comments"
      
      This reverts commit 31c58eadb8892f95478cdf05229adf678678c5f4.
      
      * apply suggestions
      
      * make quality
      
      * apply suggestions
      
      * styling
      
      * `scheduler.timesteps` are now arrays so we don't need .to()
      
      * remove useless .type()
      
      * use mean instead of max in `test_stable_diffusion_inpaint_pipeline_k_lms`
      
      * move scheduler timesteps to correct device if tensors
      
      * add device to `set_timesteps` in LMSD scheduler
      
      * `self.scheduler.set_timesteps` now uses device arg for schedulers that accept it
      
      * quick fix
      
      * styling
      
      * remove kwargs from schedulers `set_timesteps`
      
      * revert to using max in K-LMS inpaint pipeline test
      
      * Revert "`self.scheduler.set_timesteps` now uses device arg for schedulers that accept it"
      
      This reverts commit 00d5a51e5c20d8d445c8664407ef29608106d899.
      
      * move timesteps to correct device before loop in SD pipeline
      
      * apply previous fix to other SD pipelines
      
      * UNet now accepts tensor timesteps even on the wrong device, to avoid errors
      - it shouldn't affect performance if timesteps are already on the correct device
      - it does slow down performance if they're on the wrong device
      
      * fix pipeline when timesteps are arrays with strides
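      A hedged, self-contained sketch of two of the techniques listed above, with toy tensor shapes; the real integration lives in the diffusers UNet and attention code:

      ```python
      import torch

      # 1) channels_last memory format for the UNet's convolutions, e.g.:
      #    unet = unet.to(memory_format=torch.channels_last)

      # 2) baddbmm instead of matmul in attention: the 1/sqrt(head_dim) scaling is fused
      #    into the batched matrix multiply rather than applied in a separate kernel.
      q = torch.randn(2, 64, 40)  # (batch * heads, seq_len, head_dim), toy shapes
      k = torch.randn(2, 64, 40)
      scale = q.shape[-1] ** -0.5

      scores_matmul = torch.matmul(q, k.transpose(1, 2)) * scale
      scores_baddbmm = torch.baddbmm(
          torch.empty(q.shape[0], q.shape[1], k.shape[1], dtype=q.dtype, device=q.device),
          q,
          k.transpose(1, 2),
          beta=0,       # with beta=0 the (uninitialized) input tensor is ignored
          alpha=scale,  # fold the attention scaling into the kernel
      )
      assert torch.allclose(scores_matmul, scores_baddbmm, atol=1e-5)
      ```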
  17. 27 Sep, 2022 1 commit