1. 28 May, 2024 2 commits
  2. 27 May, 2024 1 commit
    • [Pipeline] Marigold depth and normals estimation (#7847) · b3d10d6d
      Anton Obukhov authored
      
      
      * implement marigold depth and normals pipelines in diffusers core
      
      * remove bibtex
      
      * remove deprecations
      
      * remove save_memory argument
      
      * remove validate_vae
      
      * remove config output
      
      * remove batch_size autodetection
      
      * remove presets logic
      move default denoising_steps and processing_resolution into the model config
      make default ensemble_size 1
      
      * remove no_grad
      
      * add fp16 to the example usage
      
      * implement is_matplotlib_available
      use is_matplotlib_available, is_scipy_available for conditional imports in the marigold depth pipeline
      
      * move colormap, visualize_depth, and visualize_normals into export_utils.py
      
      * make the denoising loop more lucid
      fix the outputs to always be 4d tensors or lists of pil images
      support a 4d input_image case
      attempt to support model_cpu_offload_seq
      move check_inputs into a separate function
      change default batch_size to 1, remove any logic to make it bigger implicitly
      
      * style
      
      * rename denoising_steps into num_inference_steps
      
      * rename input_image into image
      
      * rename input_latent into latents
      
      * remove decode_image
      change decode_prediction to use the AutoencoderKL.decode method
      
      * move clean_latent outside of progress_bar
      
      * refactor marigold-reusable image processing bits into MarigoldImageProcessor class
      
      * clean up the usage example docstring
      
      * make ensemble functions members of the pipelines
      
      * add early checks in check_inputs
      rename E into ensemble_size in depth ensembling
      
      * fix vae_scale_factor computation
      
      * better compatibility with torch.compile
      better variable naming
      
      * move export_depth_to_png to export_utils
      
      * remove encode_prediction
      
      * improve visualize_depth and visualize_normals to accept multi-dimensional data and lists
      remove visualization functions from the pipelines
      move exporting depth as 16-bit PNGs functionality from the depth pipeline
      update example docstrings
      
      * do not shortcut vae.config variables
      
      * change all asserts to raise ValueError
      
      * rename output_prediction_type to output_type
      
      * better variable names
      clean up variable deletion code
      
      * better variable names
      
      * pass desc and leave kwargs into the diffusers progress_bar
      implement nested progress bar for images and steps loops
      
      * implement scale_invariant and shift_invariant flags in the ensemble_depth function
      add scale_invariant and shift_invariant flags readout from the model config
      further refactor ensemble_depth
      support ensembling without alignment
      add ensemble_depth docstring
      
      * fix generator device placement checks
      
      * move encode_empty_text body into the pipeline call
      
      * minor empty text encoding simplifications
      
      * adjust pipelines' class docstrings to explain the added construction arguments
      
      * improve the scipy failure condition
      add comments
      improve docstrings
      change the default use_full_z_range to True
      
      * make input image values range check configurable in the preprocessor
      refactor load_image_canonical in preprocessor to reject unknown types and return the image as a 4D tensor on the right device
      support a list of everything as inputs to the pipeline, change type to PipelineImageInput
      implement a check that all input list elements have the same dimensions
      improve docstrings of pipeline outputs
      remove check_input pipeline argument
      
      * remove forgotten print
      
      * add prediction_type model config
      
      * add uncertainty visualization into export utils
      fix NaN values in normals uncertainties
      
      * change default of output_uncertainty to False
      better handle the case of an attempt to export or visualize none
      
      * fix `output_uncertainty=False`
      
      * remove kwargs
      fix check_inputs according to the new inputs of the pipeline
      
      * rename prepare_latent into prepare_latents as in other pipelines
      annotate prepare_latents in normals pipeline with "Copied from"
      annotate encode_image in normals pipeline with "Copied from"
      
      * move nested-capable `progress_bar` method into the pipelines
      revert the original `progress_bar` method in pipeline_utils
      
      * minor message improvement
      
      * fix cpu offloading
      
      * move colormap, visualize_depth, export_depth_to_16bit_png, visualize_normals, visualize_uncertainty to marigold_image_processing.py
      update example docstrings
      
      * fix missing comma
      
      * change torch.FloatTensor to torch.Tensor
      
      * fix importing of MarigoldImageProcessor
      
      * fix vae offloading
      fix batched image encoding
      remove separate encode_image function and use vae.encode instead
      
      * implement marigold's initial tests
      relax generator checks in line with other pipelines
      implement return_dict __call__ argument in line with other pipelines
      
      * fix num_images computation
      
      * remove MarigoldImageProcessor and outputs from import structure
      update tests
      
      * update docstrings
      
      * update init
      
      * update
      
      * style
      
      * fix
      
      * fix
      
      * up
      
      * up
      
      * up
      
      * add simple test
      
      * up
      
      * update expected np input/output to be channel last
      
      * move expand_tensor_or_array into the MarigoldImageProcessor
      
      * rewrite tests to follow conventions - hardcoded slices instead of image artifacts
      write more smoke tests
      
      * add basic docs.
      
      * add anton's contribution statement
      
      * remove todos.
      
      * fix assertion values for marigold depth slow tests
      
      * fix assertion values for depth normals.
      
      * remove print
      
      * support AutoencoderTiny in the pipelines
      
      * update documentation page
      add Available Pipelines section
      add Available Checkpoints section
      add warning about num_inference_steps
      
      * fix missing import in docstring
      fix wrong value in visualize_depth docstring
      
      * [doc] add marigold to pipelines overview
      
      * [doc] add section "usage examples"
      
      * fix an issue with latents check in the pipelines
      
      * add "Frame-by-frame Video Processing with Consistency" section
      
      * grammarly
      
      * replace tables with images with css-styled images (blindly)
      
      * style
      
      * print
      
      * fix the assertions.
      
      * take from the github runner.
      
      * take the slices from action artifacts
      
      * style.
      
      * update with the slices from the runner.
      
      * remove unnecessary code blocks.
      
      * Revert "[doc] add marigold to pipelines overview"
      
      This reverts commit a505165150afd8dab23c474d1a054ea505a56a5f.
      
      * remove invitation for new modalities
      
      * split out marigold usage examples
      
      * doc cleanup
      
      ---------
      Co-authored-by: yiyixuxu <yixu310@gmail.com>
      Co-authored-by: sayakpaul <spsayakpaul@gmail.com>
  3. 20 May, 2024 1 commit
  4. 10 May, 2024 1 commit
    • [Core] introduce videoprocessor. (#7776) · 04f4bd54
      Sayak Paul authored
      
      
      * introduce videoprocessor.
      
      * fix quality
      
      * address yiyi's feedback
      
      * fix preprocess_video call.
      
      * video_processor -> image_processor
      
      * fix
      
      * fix more.
      
      * quality
      
      * image_processor -> video_processor
      
      * support List[List[PIL.Image.Image]]
      
      * change to video_processor.
      
      * documentation
      
      * Apply suggestions from code review
      
      * changes
      
      * remove print.
      
      * refactor video processor (part # 7776) (#7861)
      
      * update
      
      * update remove deprecate
      
      * Update src/diffusers/video_processor.py
      
      * update
      
      * Apply suggestions from code review
      
      * deprecate list of 5d for video and list of 4d for image + apply other feedbacks
      
      * up
      
      ---------
      Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
      
      * add doc.
      
      * tensor2vid -> postprocess_video.
      
      * refactor preprocess with preprocess_video
      
      * set default values.
      
      * empty commit
      
      * more refactoring of prepare_latents in animatediff vid2vid
      
      * checking documentation
      
      * remove documentation for now.
      
      * fix animatediff sdxl
      
      * fix test failure [part of video processor PR] (#7905)
      
      up
      
      * remove preceed_with_frames.
      
      * doc
      
      * fix
      
      * fix
      
      * remove video input as a single-frame video.
      
      ---------
      Co-authored-by: YiYi Xu <yixu310@gmail.com>
  5. 06 May, 2024 1 commit
  6. 03 May, 2024 1 commit
  7. 30 Apr, 2024 1 commit
  8. 25 Apr, 2024 2 commits
  9. 23 Apr, 2024 1 commit
  10. 17 Apr, 2024 2 commits
  11. 16 Apr, 2024 1 commit
    • Fixing implementation of ControlNet-XS (#6772) · fda1531d
      UmerHA authored
      
      
      * CheckIn - created DownSubBlocks
      
      * Added extra channels, implemented subblock fwd
      
      * Fixed connection sizes
      
      * checkin
      
      * Removed iter, next in forward
      
      * Models for SD21 & SDXL run through
      
      * Added back pipelines, cleared up connections
      
      * Cleaned up connection creation
      
      * added debug logs
      
      * updated logs
      
      * logs: added input loading
      
      * Update umer_debug_logger.py
      
      * log: Loading hint
      
      * Update umer_debug_logger.py
      
      * added logs
      
      * Changed debug logging
      
      * debug: added more logs
      
      * Fixed num_norm_groups
      
      * Debug: Logging all of SDXL input
      
      * Update umer_debug_logger.py
      
      * debug: updated logs
      
      * checkin
      
      * Readded tests
      
      * Removed debug logs
      
      * Fixed Slow Tests
      
      * Added value checks | Updated model_cpu_offload_seq
      
      * accelerate-offloading works ; fast tests work
      
      * Made unet & addon explicit in controlnet
      
      * Updated slow tests
      
      * Added dtype/device to ControlNetXS
      
      * Filled in test model paths
      
      * Added image_encoder/feature_extractor to XL pipe
      
      * Fixed fast tests
      
      * Added comments and docstrings
      
      * Fixed copies
      
      * Added docs ; Updates slow tests
      
      * Moved changes to UNetMidBlock2DCrossAttn
      
      * tiny cleanups
      
      * Removed stray prints
      
      * Removed ip adapters + freeU
      
      - Removed ip adapters + freeU as they don't make sense for ControlNet-XS
      - Fixed imports of UNet components
      
      * Fixed test_save_load_float16
      
      * Make style, quality, fix-copies
      
      * Changed loading/saving API for ControlNetXS
      
      - Changed loading/saving API for ControlNetXS
      - other small fixes
      
      * Removed ControlNet-XS from research examples
      
      * Make style, quality, fix-copies
      
      * Small fixes
      
      - deleted ControlNetXSModel.init_original
      - added time_embedding_mix to StableDiffusionControlNetXSPipeline.from_pretrained / StableDiffusionXLControlNetXSPipeline.from_pretrained
      - fixed copy hints
      
      * checkin May 11 '23
      
      * CheckIn Mar 12 '24
      
      * Fixed tests for SD
      
      * Added tests for UNetControlNetXSModel
      
      * Fixed SDXL tests
      
      * cleanup
      
      * Delete Pipfile
      
      * CheckIn Mar 20
      
      Started replacing sub blocks by `ControlNetXSCrossAttnDownBlock2D` and `ControlNetXSCrossAttnUpBlock2D`
      
      * check-in Mar 23
      
      * checkin 24 Mar
      
      * Created init for UNetCnxs and CnxsAddon
      
      * CheckIn
      
      * Made from_modules, from_unet and no_control work
      
      * make style,quality,fix-copies & small changes
      
      * Fixed freezing
      
      * Added gradient ckpt'ing; fixed tests
      
      * Fix slow tests(+compile) ; clear naming confusion
      
      * Don't create UNet in init ; removed class_emb
      
      * Incorporated review feedback
      
      - Deleted get_base_pipeline /  get_controlnet_addon for pipes
      - Pipes inherit from StableDiffusionXLPipeline
      - Made module dicts for cnxs-addon's down/mid/up classes
      - Added support for qkv fusion and freeU
      
      * Make style, quality, fix-copies
      
      * Implemented review feedback
      
      * Removed compatibility check for vae/ctrl embedding
      
      * make style, quality, fix-copies
      
      * Delete Pipfile
      
      * Integrated review feedback
      
      - Importing ControlNetConditioningEmbedding now
      - get_down/mid/up_block_addon now outside class
      - renamed `do_control` to `apply_control`
      
      * Reduced size of test tensors
      
      For this, added `norm_num_groups` as parameter everywhere
      
      * Renamed cnxs-`Addon` to cnxs-`Adapter`
      
      - `ControlNetXSAddon` -> `ControlNetXSAdapter`
      - `ControlNetXSAddonDownBlockComponents` -> `DownBlockControlNetXSAdapter`, and similarly for mid/up
      - `get_mid_block_addon` -> `get_mid_block_adapter`, and similarly for down/up
      
      * Fixed save_pretrained/from_pretrained bug
      
      * Removed redundant code
      
      ---------
      Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
  12. 11 Apr, 2024 1 commit
  13. 10 Apr, 2024 1 commit
  14. 18 Mar, 2024 1 commit
  15. 14 Mar, 2024 1 commit
  16. 13 Mar, 2024 2 commits
    • Add Intro page of TCD (#7259) · b3005173
      Michael authored
      
      
      * add tcd intro
      
      * resolve repos
      
      * Apply suggestions from code review
      Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
      
      * revise NFEs related
      
      * change inpainting location
      
      ---------
      Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
      Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
    • [Pipeline] Add LEDITS++ pipelines (#6074) · 00eca4b8
      Manuel Brack authored
      
      
      * Setup LEdits++ file structure
      
      * Fix import
      
      * LEditsPP Stable Diffusion pipeline
      
      * Include variable image aspect ratios
      
      * Implement LEDITS++ for SDXL
      
      * clean up LEditsPPPipelineStableDiffusion
      
      * Adjust inversion output
      
      * Added docu, more cleanup for LEditsPPPipelineStableDiffusion
      
      * clean up LEditsPPPipelineStableDiffusionXL
      
      * Update documentation
      
      * Fix documentation import
      
      * Add skeleton IF implementation
      
      * Fix documentation typo
      
      * Add LEDITS docu to toctree
      
      * Add missing title
      
      * Finalize SD documentation
      
      * Finalize SD-XL documentation
      
      * Fix code style and quality
      
      * Fix typo
      
      * Fix return types
      
      * added LEditsPPPipelineIF; minor changes for LEditsPPPipelineStableDiffusion and LEditsPPPipelineStableDiffusionXL
      
      * Fix copy reference
      
      * add documentation for IF
      
      * Add first tests
      
      * Fix batching for SD-XL
      
      * Fix text encoding and perfect reconstruction for SD-XL
      
      * Add tests for SD-XL, minor changes
      
      * move user_mask to correct device, use cross_attention_kwargs also for inversion
      
      * Example docstring
      
      * Fix attention resolution for non-square images
      
      * Refactoring for PR review
      
      * Safely remove ledits_utils.py
      
      * Style fixes
      
      * Replace assertions with ValueError
      
      * Remove LEditsPPPipelineIF
      
      * Remove unnecessary input checks
      
      * Refactoring of CrossAttnProcessor
      
      * Revert unnecessary changes to scheduler
      
      * Remove first progress-bar in inversion
      
      * Refactor scheduler usage and reset
      
      * Use imageprocessor instead of custom logic
      
      * Fix scheduler init warning
      
      * Fix error when running the pipeline in fp16
      
      * Update documentation wrt perfect inversion
      
      * Update tests
      
      * Fix code quality and copy consistency
      
      * Update LEditsPP import
      
      * Remove enable/disable methods that are now in StableDiffusionMixin
      
      * Change import in docs
      
      * Revert import structure change
      
      * Fix ledits imports
      
      ---------
      Co-authored-by: Katharina Kornmeier <katharina.kornmeier@stud.tu-darmstadt.de>
  17. 07 Mar, 2024 1 commit
  18. 06 Mar, 2024 1 commit
    • [Pipeline] Wuerstchen v3 aka Stable Cascade pipeline (#6487) · 40aa47b9
      Kashif Rasul authored
      
      
      * initial diffNext v3
      
      * move to v3 folder
      
      * imports
      
      * dry up the unets
      
      * no switch_level
      
      * fix init
      
      * add switch_level to config
      
      * Fixed some things
      
      * Added pooled text embeddings
      
      * Initial work on adding image encoder
      
      * changes from @dome272
      
      * Stuff for the image encoder processing and variable naming in decoder
      
      * fix arg name
      
      * inference fixes
      
      * inference fixes
      
      * default TimestepBlock without conds
      
      * c_skip=0 by default
      
      * fix bfloat16 to cpu
      
      * use config
      
      * undo temp change
      
      * fix gen_c_embeddings args
      
      * change text encoding
      
      * text encoding
      
      * undo print
      
      * undo .gitignore change
      
      * Allow WuerstchenV3PriorPipeline to use the base DDPM & DDIM schedulers
      
      * use WuerstchenV3Unet in both pipelines
      
      * fix imports
      
      * initial failing tests
      
      * cleanup
      
      * use scheduler.timesteps
      
      * some fixes to the tests, still not fully working
      
      * fix tests
      
      * fix prior tests
      
      * add dropout to the model_kwargs
      
      * more tests passing
      
      * update expected_slice
      
      * initial rename
      
      * rename tests
      
      * rename class names
      
      * make fix-copies
      
      * initial docs
      
      * autodocs
      
      * typos
      
      * fix arg docs
      
      * add text_encoder info
      
      * combined pipeline has optional image arg
      
      * fix documentation
      
      * Update src/diffusers/pipelines/stable_cascade/modeling_stable_cascade_common.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * Update src/diffusers/pipelines/stable_cascade/modeling_stable_cascade_common.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * Update src/diffusers/pipelines/stable_cascade/modeling_stable_cascade_common.py
      Co-authored-by: YiYi Xu <yixu310@gmail.com>
      
      * Update src/diffusers/pipelines/stable_cascade/modeling_stable_cascade_common.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * Update src/diffusers/pipelines/stable_cascade/pipeline_stable_cascade.py
      Co-authored-by: YiYi Xu <yixu310@gmail.com>
      
      * Update src/diffusers/pipelines/stable_cascade/modeling_stable_cascade_common.py
      Co-authored-by: YiYi Xu <yixu310@gmail.com>
      
      * use self.config
      
      * Update src/diffusers/pipelines/stable_cascade/modeling_stable_cascade_common.py
      Co-authored-by: YiYi Xu <yixu310@gmail.com>
      
      * c_in -> in_channels
      
      * removed kwargs from unet's forward
      
      * Update src/diffusers/pipelines/stable_cascade/pipeline_stable_cascade.py
      Co-authored-by: YiYi Xu <yixu310@gmail.com>
      
      * Update src/diffusers/pipelines/stable_cascade/pipeline_stable_cascade.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * remove older callback api
      
      * removed kwargs and fixed decoder guidance > 1
      
      * decoder takes embeds
      
      * check and use image_embeds
      
      * fixed all but one decoder test
      
      * fix decoder tests
      
      * update callback api
      
      * fix some more combined tests
      
      * push combined pipeline
      
      * initial docs
      
      * fix doc_string
      
      * update combined api
      
      * no test_callback_inputs test for combined pipeline
      
      * add optional components
      
      * fix ordering of components
      
      * fix combined tests
      
      * update convert script
      
      * Update src/diffusers/pipelines/stable_cascade/pipeline_stable_cascade_prior.py
      Co-authored-by: YiYi Xu <yixu310@gmail.com>
      
      * Update src/diffusers/pipelines/stable_cascade/pipeline_stable_cascade_prior.py
      Co-authored-by: YiYi Xu <yixu310@gmail.com>
      
      * Update src/diffusers/pipelines/stable_cascade/pipeline_stable_cascade_prior.py
      Co-authored-by: YiYi Xu <yixu310@gmail.com>
      
      * fix imports
      
      * move effnet out of denoising loop
      
      * prompt_embeds_pooled only when doing guidance
      
      * Fix repeat shape
      
      * move StableCascadeUnet to models/unets/
      
      * more descriptive names
      
      * converted when numpy()
      
      * StableCascadePriorPipelineOutput docs
      
      * rename StableCascadeUNet
      
      * add slow tests
      
      * fix slow tests
      
      * update
      
      * update
      
      * updated model_path
      
      * add args for weights
      
      * set push_to_hub to false
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      ---------
      Co-authored-by: Dominic Rampas <d6582533@gmail.com>
      Co-authored-by: Pablo Pernias <pablo@pernias.com>
      Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      Co-authored-by: YiYi Xu <yixu310@gmail.com>
      Co-authored-by: 99991 <99991@users.noreply.github.com>
      Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
  19. 05 Mar, 2024 1 commit
  20. 17 Feb, 2024 1 commit
  21. 14 Feb, 2024 1 commit
    • [docs] IP-Adapter (#6897) · 9efe1e52
      Steven Liu authored
      * use cases
      
      * first draft
      
      * fix image links
      
      * lcm-lora
      
      * feedback
      
      * review
      
      * feedback
      
      * feedback
  22. 31 Jan, 2024 2 commits
  23. 25 Jan, 2024 1 commit
  24. 10 Jan, 2024 1 commit
  25. 05 Jan, 2024 2 commits
  26. 27 Dec, 2023 1 commit
  27. 26 Dec, 2023 1 commit
  28. 21 Dec, 2023 1 commit
    • open muse (#5437) · 40398152
      Will Berman authored
      
      
      amused
      
      rename
      
      Update docs/source/en/api/pipelines/amused.md
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      AdaLayerNormContinuous default values
      
      custom micro conditioning
      
      micro conditioning docs
      
      put lookup from codebook in constructor
      
      fix conversion script
      
      remove manual fused flash attn kernel
      
      add training script
      
      temp remove training script
      
      add dummy gradient checkpointing func
      
      clarify temperatures is an instance variable by setting it
      
      remove additional SkipFF block args
      
      hardcode norm args
      
      rename tests folder
      
      fix paths and samples
      
      fix tests
      
      add training script
      
      training readme
      
      lora saving and loading
      
      non-lora saving/loading
      
      some readme fixes
      
      guards
      
      Update docs/source/en/api/pipelines/amused.md
      Co-authored-by: Suraj Patil <surajp815@gmail.com>
      
      Update examples/amused/README.md
      Co-authored-by: Suraj Patil <surajp815@gmail.com>
      
      Update examples/amused/train_amused.py
      Co-authored-by: Suraj Patil <surajp815@gmail.com>
      
      vae upcasting
      
      add fp16 integration tests
      
      use tuple for micro cond
      
      copyrights
      
      remove casts
      
      delegate to torch.nn.LayerNorm
      
      move temperature to pipeline call
      
      upsampling/downsampling changes
  29. 18 Dec, 2023 1 commit
    • Deprecate Pipelines (#6169) · a0c54828
      Dhruv Nair authored
      
      
      * deprecate pipe
      
      * make style
      
      * update
      
      * add deprecation message
      
      * format
      
      * remove tests for deprecated pipelines
      
      * remove deprecation message
      
      * make style
      
      * fix copies
      
      * clean up
      
      * clean
      
      * clean
      
      * clean
      
      * clean up
      
      * clean up
      
      * clean up toctree
      
      * clean up
      
      ---------
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
  30. 14 Dec, 2023 1 commit
  31. 06 Dec, 2023 1 commit
    • Add ControlNet-XS support (#5827) · e192ae08
      UmerHA authored
      
      
      * Check in 23-10-05
      
      * check-in 23-10-06
      
      * check-in 23-10-07 2pm
      
      * check-in 23-10-08
      
      * check-in 231009T1200
      
      * check-in 230109
      
      * checkin 231010
      
      * init + forward run
      
      * checkin
      
      * checkin
      
      * ControlNetXSModel is now saveable+loadable
      
      * Forward works
      
      * checkin
      
      * Pipeline works with `no_control=True`
      
      * checkin
      
      * debug: save intermediate outputs of resnet
      
      * checkin
      
      * Understood time error + fixed connection error
      
      * checkin
      
      * checkin 231106T1600
      
      * turned off detailed debug prints
      
      * time debug logs
      
      * small fix
      
      * Separated control_scale for connections/time
      
      * simplified debug logging
      
      * Full denoising works with control scale = 0
      
      * aligned logs
      
      * Added control_attention_head_dim param
      
      * Passing n_heads instead of dim_head into ctrl unet
      
      * Fixed ctrl midblock bug
      
      * Cleanup
      
      * Fixed time dtype bug
      
      * checkin
      
      * 1. from_unet, 2. base passed, 3. all unet params
      
      * checkin
      
      * Finished docstrings
      
      * cleanup
      
      * make style
      
      * checkin
      
      * more tests pass
      
      * Fixed tests
      
      * removed debug logs
      
      * make style + quality
      
      * make fix-copies
      
      * fixed documentation
      
      * added cnxs to doc toc
      
      * added control start/end param
      
      * Update controlnetxs_sdxl.md
      
      * tried to fix copies..
      
      * Fixed norm_num_groups in from_unet
      
      * added sdxl-depth test
      
      * created SD2.1 controlnet-xs pipeline
      
      * re-added debug logs
      
      * Adjusting group norm ; readded logs
      
      * Added debug log statements
      
      * removed debug logs ; started tests for sd2.1
      
      * updated sd21 tests
      
      * fixed tests
      
      * fixed tests
      
      * slightly increased error tolerance for 1 test
      
      * make style & quality
      
      * Added docs for CNXS-SD
      
      * make fix-copies
      
      * Fixed sd compile test ; fixed gradient ckpointing
      
      * vae downs = cnxs conditioning downs; removed guess
      
      * make style & quality
      
      * Fixed tests
      
      * fixed test
      
      * Incorporated review feedback
      
      * simplified control model surgery
      
      * fixed tests & make style / quality
      
      * Updated docs; deleted pip & cursor files
      
      * Rolled back minimal change to resnet
      
      * Update resnet.py
      
      * Update resnet.py
      
      * Update src/diffusers/models/controlnetxs.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * Update src/diffusers/models/controlnetxs.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * Incorporated review feedback
      
      * Update docs/source/en/api/pipelines/controlnetxs_sdxl.md
      Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
      
      * Update docs/source/en/api/pipelines/controlnetxs.md
      Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
      
      * Update docs/source/en/api/pipelines/controlnetxs.md
      Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
      
      * Update docs/source/en/api/pipelines/controlnetxs.md
      Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
      
      * Update src/diffusers/models/controlnetxs.py
      Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
      
      * Update src/diffusers/models/controlnetxs.py
      Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
      
      * Update src/diffusers/pipelines/controlnet_xs/pipeline_controlnet_xs.py
      Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
      
      * Update docs/source/en/api/pipelines/controlnetxs.md
      Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
      
      * Update src/diffusers/pipelines/controlnet_xs/pipeline_controlnet_xs_sd_xl.py
      Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
      
      * Incorporated doc feedback
      
      ---------
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
      Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
      e192ae08
  32. 29 Nov, 2023 3 commits
    • [SDXL Turbo] Add some docs (#5982) · b34acbdc
      Patrick von Platen authored
      
      
      * add diffusers example
      
      * add diffusers example
      
      * Comment about making it faster
      
      * Apply suggestions from code review
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      ---------
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
    • Add SVD (#5895) · 63f767ef
      Suraj Patil authored
      
      
      * begin model
      
      * finish blocks
      
      * add_embedding
      
      * addition_time_embed_dim
      
      * use TimestepEmbedding
      
      * fix temporal res block
      
      * fix time_pos_embed
      
      * fix add_embedding
      
      * add conversion script
      
      * fix model
      
      * up
      
      * add new resnet blocks
      
      * make forward work
      
      * return sample in original shape
      
      * fix temb shape in TemporalResnetBlock
      
      * add spatio temporal transformers
      
      * add vae blocks
      
      * fix blocks
      
      * update
      
      * update
      
      * fix shapes in AlphaBlender and add time activation in res block
      
      * use new blocks
      
      * style
      
      * fix temb shape
      
      * fix SpatioTemporalResBlock
      
      * reuse TemporalBasicTransformerBlock
      
      * fix TemporalBasicTransformerBlock
      
      * use TransformerSpatioTemporalModel
      
      * fix TransformerSpatioTemporalModel
      
      * fix time_context dim
      
      * clean up
      
      * make temb optional
      
      * add blocks
      
      * rename model
      
      * update conversion script
      
      * remove UNetMidBlockSpatioTemporal
      
      * add in init
      
      * remove unused arg
      
      * remove unused arg
      
      * remove more unused args
      
      * up
      
      * up
      
      * check for None
      
      * update vae
      
      * update up/mid blocks for decoder
      
      * begin pipeline
      
      * adapt scheduler
      
      * add guidance scalings
      
      * fix norm eps in temporal transformers
      
      * add temporal autoencoder
      
      * make pipeline run
      
      * fix frame decoding
      
      * decode in float32
      
      * decode n frames at a time
      
      * pass decoding_t to decode_latents
      
      * fix decode_latents
      
      * vae encode/decode in fp32
      
      * fix dtype in TransformerSpatioTemporalModel
      
      * type image_latents same as image_embeddings
      
      * allow using different eps in temporal block for video decoder
      
      * fix default values in vae
      
      * pass num frames in decode
      
      * switch spatial to temporal for mixing in VAE
      
      * fix num frames during split decoding
      
      * cast alpha to sample dtype
      
      * fix attention in MidBlockTemporalDecoder
      
      * fix typo
      
      * fix guidance_scales dtype
      
      * fix missing activation in TemporalDecoder
      
      * skip_post_quant_conv
      
      * add vae conversion
      
      * style
      
      * take guidance scale as input
      
      * up
      
      * allow passing PIL to export_video
      
      * accept fps as arg
      
      * add pipeline and vae in init
      
      * remove hack
      
      * use AutoencoderKLTemporalDecoder
      
      * don't scale image latents
      
      * add unet tests
      
      * clean up unet
      
      * clean TransformerSpatioTemporalModel
      
      * add slow svd test
      
      * clean up
      
      * make temb optional in Decoder mid block
      
      * fix norm eps in TransformerSpatioTemporalModel
      
      * clean up temp decoder
      
      * clean up
      
      * clean up
      
      * use c_noise values for timesteps
      
      * use math for log
      
      * update
      
      * fix copies
      
      * doc
      
      * upcast vae
      
      * update forward pass for gradient checkpointing
      
      * make added_time_ids a tensor
      
      * up
      
      * fix upcasting
      
      * remove post quant conv
      
      * add _resize_with_antialiasing
      
      * fix _compute_padding
      
      * cleanup model
      
      * more cleanup
      
      * more cleanup
      
      * more cleanup
      
      * remove freeu
      
      * remove attn slice
      
      * small clean
      
      * up
      
      * up
      
      * remove extra step kwargs
      
      * remove eta
      
      * remove dropout
      
      * remove callback
      
      * remove merge factor args
      
      * clean
      
      * clean up
      
      * move to dedicated folder
      
      * remove attention_head_dim
      
      * docstr and small fix
      
      * update unet doc strings
      
      * rename decoding_t
      
      * correct linting
      
      * store c_skip and c_out
      
      * cleanup
      
      * clean TemporalResnetBlock
      
      * more cleanup
      
      * clean up vae
      
      * clean up
      
      * begin doc
      
      * more cleanup
      
      * up
      
      * up
      
      * doc
      
      * Improve
      
      * better naming
      
      * better naming
      
      * better naming
      
      * better naming
      
      * better naming
      
      * better naming
      
      * better naming
      
      * better naming
      
      * Apply suggestions from code review
      
      * Default chunk size to None
      
      * add example
      
      * Better
      
      * Apply suggestions from code review
      
      * update doc
      
      * Update src/diffusers/pipelines/stable_diffusion_video/pipeline_stable_diffusion_video.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * style
      
      * Get torch compile working
      
      * up
      
      * rename
      
      * fix doc
      
      * add chunking
      
      * torch compile
      
      * torch compile
      
      * add modelling outputs
      
      * torch compile
      
      * Improve chunking
      
      * Apply suggestions from code review
      
      * Update docs/source/en/using-diffusers/svd.md
      
      * Close diff tag
      
      * remove slicing
      
      * resnet docstr
      
      * add docstr in resnet
      
      * rename
      
      * Apply suggestions from code review
      
      * update tests
      
      * Fix output type latents
      
      * fix more
      
      * fix more
      
      * Update docs/source/en/using-diffusers/svd.md
      
      * fix more
      
      * add pipeline tests
      
      * remove unused arg
      
      * clean up
      
      * make sure get_scaling receives tensors
      
      * fix euler scheduler
      
      * fix get_scalings
      
      * simplify euler for now
      
      * remove old test file
      
      * use randn_tensor to create noise
      
      * fix device for rand tensor
      
      * increase expected_max_difference
      
      * fix test_inference_batch_single_identical
      
      * actually fix test_inference_batch_single_identical
      
      * disable test_save_load_float16
      
      * skip test_float16_inference
      
      * skip test_inference_batch_single_identical
      
      * fix test_xformers_attention_forwardGenerator_pass
      
      * Apply suggestions from code review
      
      * update StableVideoDiffusionPipelineSlowTests
      
      * update image
      
      * add diffusers example
      
      * fix more
      
      ---------
      Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      Co-authored-by: apolinário <joaopaulo.passos@gmail.com>
    • [docs] LCM training (#5796) · ddd8bd53
      Steven Liu authored
      * first draft
      
      * feedback