1. 06 Jun, 2024 1 commit
  2. 05 Jun, 2024 2 commits
  3. 01 Jun, 2024 1 commit
  4. 31 May, 2024 1 commit
    • Sayak Paul's avatar
      [Core] Introduce class variants for `Transformer2DModel` (#7647) · 983dec3b
      Sayak Paul authored
      * init for patches
      
      * finish patched model.
      
      * continuous transformer
      
      * vectorized transformer2d.
      
      * style.
      
      * inits.
      
      * fix-copies.
      
      * introduce DiTTransformer2DModel.
      
      * fixes
      
      * use REMAPPING as suggested by @DN6
      
      * better logging.
      
      * add pixart transformer model.
      
      * inits.
      
      * caption_channels.
      
      * attention masking.
      
      * fix use_additional_conditions.
      
      * remove print.
      
      * debug
      
      * flatten
      
      * fix: assertion for sigma
      
      * handle remapping for modeling_utils
      
      * add tests for dit transformer2d
      
      * quality
      
      * placeholder for pixart tests
      
      * pixart tests
      
      * add _no_split_modules
      
      * add docs.
      
      * check
      
      * check
      
      * check
      
      * check
      
      * fix tests
      
      * fix tests
      
      * move Transformer output to modeling_output
      
      * move errors better and bring back use_additional_conditions attribute.
      
      * add unnecessary things from DiT.
      
      * clean up pixart
      
      * fix remapping
      
      * fix device_map things in pixart2d.
      
      * replace Transformer2DModel with appropriate classes in dit, pixart tests
      
      * empty
      
      * legacy mixin classes./
      
      * use a remapping dict for fetching class names.
      
      * change to specifc model types in the pipeline implementations.
      
      * move _fetch_remapped_cls_from_config to modeling_loading_utils.py
      
      * fix dependency problems.
      
      * add deprecation note.
      983dec3b
  5. 29 May, 2024 2 commits
  6. 27 May, 2024 1 commit
    • Anton Obukhov's avatar
      [Pipeline] Marigold depth and normals estimation (#7847) · b3d10d6d
      Anton Obukhov authored
      
      
      * implement marigold depth and normals pipelines in diffusers core
      
      * remove bibtex
      
      * remove deprecations
      
      * remove save_memory argument
      
      * remove validate_vae
      
      * remove config output
      
      * remove batch_size autodetection
      
      * remove presets logic
      move default denoising_steps and processing_resolution into the model config
      make default ensemble_size 1
      
      * remove no_grad
      
      * add fp16 to the example usage
      
      * implement is_matplotlib_available
      use is_matplotlib_available, is_scipy_available for conditional imports in the marigold depth pipeline
      
      * move colormap, visualize_depth, and visualize_normals into export_utils.py
      
      * make the denoising loop more lucid
      fix the outputs to always be 4d tensors or lists of pil images
      support a 4d input_image case
      attempt to support model_cpu_offload_seq
      move check_inputs into a separate function
      change default batch_size to 1, remove any logic to make it bigger implicitly
      
      * style
      
      * rename denoising_steps into num_inference_steps
      
      * rename input_image into image
      
      * rename input_latent into latents
      
      * remove decode_image
      change decode_prediction to use the AutoencoderKL.decode method
      
      * move clean_latent outside of progress_bar
      
      * refactor marigold-reusable image processing bits into MarigoldImageProcessor class
      
      * clean up the usage example docstring
      
      * make ensemble functions members of the pipelines
      
      * add early checks in check_inputs
      rename E into ensemble_size in depth ensembling
      
      * fix vae_scale_factor computation
      
      * better compatibility with torch.compile
      better variable naming
      
      * move export_depth_to_png to export_utils
      
      * remove encode_prediction
      
      * improve visualize_depth and visualize_normals to accept multi-dimensional data and lists
      remove visualization functions from the pipelines
      move exporting depth as 16-bit PNGs functionality from the depth pipeline
      update example docstrings
      
      * do not shortcut vae.config variables
      
      * change all asserts to raise ValueError
      
      * rename output_prediction_type to output_type
      
      * better variable names
      clean up variable deletion code
      
      * better variable names
      
      * pass desc and leave kwargs into the diffusers progress_bar
      implement nested progress bar for images and steps loops
      
      * implement scale_invariant and shift_invariant flags in the ensemble_depth function
      add scale_invariant and shift_invariant flags readout from the model config
      further refactor ensemble_depth
      support ensembling without alignment
      add ensemble_depth docstring
      
      * fix generator device placement checks
      
      * move encode_empty_text body into the pipeline call
      
      * minor empty text encoding simplifications
      
      * adjust pipelines' class docstrings to explain the added construction arguments
      
      * improve the scipy failure condition
      add comments
      improve docstrings
      change the default use_full_z_range to True
      
      * make input image values range check configurable in the preprocessor
      refactor load_image_canonical in preprocessor to reject unknown types and return the image in the expected 4D format of tensor and on right device
      support a list of everything as inputs to the pipeline, change type to PipelineImageInput
      implement a check that all input list elements have the same dimensions
      improve docstrings of pipeline outputs
      remove check_input pipeline argument
      
      * remove forgotten print
      
      * add prediction_type model config
      
      * add uncertainty visualization into export utils
      fix NaN values in normals uncertainties
      
      * change default of output_uncertainty to False
      better handle the case of an attempt to export or visualize none
      
      * fix `output_uncertainty=False`
      
      * remove kwargs
      fix check_inputs according to the new inputs of the pipeline
      
      * rename prepare_latent into prepare_latents as in other pipelines
      annotate prepare_latents in normals pipeline with "Copied from"
      annotate encode_image in normals pipeline with "Copied from"
      
      * move nested-capable `progress_bar` method into the pipelines
      revert the original `progress_bar` method in pipeline_utils
      
      * minor message improvement
      
      * fix cpu offloading
      
      * move colormap, visualize_depth, export_depth_to_16bit_png, visualize_normals, visualize_uncertainty to marigold_image_processing.py
      update example docstrings
      
      * fix missing comma
      
      * change torch.FloatTensor to torch.Tensor
      
      * fix importing of MarigoldImageProcessor
      
      * fix vae offloading
      fix batched image encoding
      remove separate encode_image function and use vae.encode instead
      
      * implement marigold's intial tests
      relax generator checks in line with other pipelines
      implement return_dict __call__ argument in line with other pipelines
      
      * fix num_images computation
      
      * remove MarigoldImageProcessor and outputs from import structure
      update tests
      
      * update docstrings
      
      * update init
      
      * update
      
      * style
      
      * fix
      
      * fix
      
      * up
      
      * up
      
      * up
      
      * add simple test
      
      * up
      
      * update expected np input/output to be channel last
      
      * move expand_tensor_or_array into the MarigoldImageProcessor
      
      * rewrite tests to follow conventions - hardcoded slices instead of image artifacts
      write more smoke tests
      
      * add basic docs.
      
      * add anton's contribution statement
      
      * remove todos.
      
      * fix assertion values for marigold depth slow tests
      
      * fix assertion values for depth normals.
      
      * remove print
      
      * support AutoencoderTiny in the pipelines
      
      * update documentation page
      add Available Pipelines section
      add Available Checkpoints section
      add warning about num_inference_steps
      
      * fix missing import in docstring
      fix wrong value in visualize_depth docstring
      
      * [doc] add marigold to pipelines overview
      
      * [doc] add section "usage examples"
      
      * fix an issue with latents check in the pipelines
      
      * add "Frame-by-frame Video Processing with Consistency" section
      
      * grammarly
      
      * replace tables with images with css-styled images (blindly)
      
      * style
      
      * print
      
      * fix the assertions.
      
      * take from the github runner.
      
      * take the slices from action artifacts
      
      * style.
      
      * update with the slices from the runner.
      
      * remove unnecessary code blocks.
      
      * Revert "[doc] add marigold to pipelines overview"
      
      This reverts commit a505165150afd8dab23c474d1a054ea505a56a5f.
      
      * remove invitation for new modalities
      
      * split out marigold usage examples
      
      * doc cleanup
      
      ---------
      Co-authored-by: default avataryiyixuxu <yixu310@gmail.com>
      Co-authored-by: default avataryiyixuxu <yixu310@gmail,com>
      Co-authored-by: default avatarsayakpaul <spsayakpaul@gmail.com>
      b3d10d6d
  7. 24 May, 2024 1 commit
  8. 19 May, 2024 1 commit
  9. 16 May, 2024 1 commit
  10. 10 May, 2024 1 commit
    • Mark Van Aken's avatar
      #7535 Update FloatTensor type hints to Tensor (#7883) · be4afa0b
      Mark Van Aken authored
      * find & replace all FloatTensors to Tensor
      
      * apply formatting
      
      * Update torch.FloatTensor to torch.Tensor in the remaining files
      
      * formatting
      
      * Fix the rest of the places where FloatTensor is used as well as in documentation
      
      * formatting
      
      * Update new file from FloatTensor to Tensor
      be4afa0b
  11. 09 May, 2024 3 commits
  12. 08 May, 2024 2 commits
    • YiYi Xu's avatar
      fix offload test (#7868) · 35358a2d
      YiYi Xu authored
      
      
      fix
      Co-authored-by: default avatarDhruv Nair <dhruv.nair@gmail.com>
      35358a2d
    • Aryan's avatar
      [Pipeline] AnimateDiff SDXL (#6721) · 818f7607
      Aryan authored
      
      
      * update conversion script to handle motion adapter sdxl checkpoint
      
      * add animatediff xl
      
      * handle addition_embed_type
      
      * fix output
      
      * update
      
      * add imports
      
      * make fix-copies
      
      * add decode latents
      
      * update docstrings
      
      * add animatediff sdxl to docs
      
      * remove unnecessary lines
      
      * update example
      
      * add test
      
      * revert conv_in conv_out kernel param
      
      * remove unused param addition_embed_type_num_heads
      
      * latest IPAdapter impl
      
      * make fix-copies
      
      * fix return
      
      * add IPAdapterTesterMixin to tests
      
      * fix return
      
      * revert based on suggestion
      
      * add freeinit
      
      * fix test_to_dtype test
      
      * use StableDiffusionMixin instead of different helper methods
      
      * fix progress bar iterations
      
      * apply suggestions from review
      
      * hardcode flip_sin_to_cos and freq_shift
      
      * make fix-copies
      
      * fix ip adapter implementation
      
      * fix last failing test
      
      * make style
      
      * Update docs/source/en/api/pipelines/animatediff.md
      Co-authored-by: default avatarDhruv Nair <dhruv.nair@gmail.com>
      
      * remove todo
      
      * fix doc-builder errors
      
      ---------
      Co-authored-by: default avatarDhruv Nair <dhruv.nair@gmail.com>
      818f7607
  13. 03 May, 2024 1 commit
  14. 02 May, 2024 2 commits
  15. 01 May, 2024 1 commit
  16. 30 Apr, 2024 3 commits
  17. 24 Apr, 2024 2 commits
  18. 22 Apr, 2024 4 commits
  19. 19 Apr, 2024 2 commits
  20. 16 Apr, 2024 1 commit
    • UmerHA's avatar
      Fixing implementation of ControlNet-XS (#6772) · fda1531d
      UmerHA authored
      
      
      * CheckIn - created DownSubBlocks
      
      * Added extra channels, implemented subblock fwd
      
      * Fixed connection sizes
      
      * checkin
      
      * Removed iter, next in forward
      
      * Models for SD21 & SDXL run through
      
      * Added back pipelines, cleared up connections
      
      * Cleaned up connection creation
      
      * added debug logs
      
      * updated logs
      
      * logs: added input loading
      
      * Update umer_debug_logger.py
      
      * log: Loading hint
      
      * Update umer_debug_logger.py
      
      * added logs
      
      * Changed debug logging
      
      * debug: added more logs
      
      * Fixed num_norm_groups
      
      * Debug: Logging all of SDXL input
      
      * Update umer_debug_logger.py
      
      * debug: updated logs
      
      * checkim
      
      * Readded tests
      
      * Removed debug logs
      
      * Fixed Slow Tests
      
      * Added value ckecks | Updated model_cpu_offload_seq
      
      * accelerate-offloading works ; fast tests work
      
      * Made unet & addon explicit in controlnet
      
      * Updated slow tests
      
      * Added dtype/device to ControlNetXS
      
      * Filled in test model paths
      
      * Added image_encoder/feature_extractor to XL pipe
      
      * Fixed fast tests
      
      * Added comments and docstrings
      
      * Fixed copies
      
      * Added docs ; Updates slow tests
      
      * Moved changes to UNetMidBlock2DCrossAttn
      
      * tiny cleanups
      
      * Removed stray prints
      
      * Removed ip adapters + freeU
      
      - Removed ip adapters + freeU as they don't make sense for ControlNet-XS
      - Fixed imports of UNet components
      
      * Fixed test_save_load_float16
      
      * Make style, quality, fix-copies
      
      * Changed loading/saving API for ControlNetXS
      
      - Changed loading/saving API for ControlNetXS
      - other small fixes
      
      * Removed ControlNet-XS from research examples
      
      * Make style, quality, fix-copies
      
      * Small fixes
      
      - deleted ControlNetXSModel.init_original
      - added time_embedding_mix to StableDiffusionControlNetXSPipeline .from_pretrained / StableDiffusionXLControlNetXSPipeline.from_pretrained
      - fixed copy hints
      
      * checkin May 11 '23
      
      * CheckIn Mar 12 '24
      
      * Fixed tests for SD
      
      * Added tests for UNetControlNetXSModel
      
      * Fixed SDXL tests
      
      * cleanup
      
      * Delete Pipfile
      
      * CheckIn Mar 20
      
      Started replacing sub blocks  by `ControlNetXSCrossAttnDownBlock2D` and `ControlNetXSCrossAttnUplock2D`
      
      * check-in Mar 23
      
      * checkin 24 Mar
      
      * Created init for UNetCnxs and CnxsAddon
      
      * CheckIn
      
      * Made from_modules, from_unet and no_control work
      
      * make style,quality,fix-copies & small changes
      
      * Fixed freezing
      
      * Added gradient ckpt'ing; fixed tests
      
      * Fix slow tests(+compile) ; clear naming confusion
      
      * Don't create UNet in init ; removed class_emb
      
      * Incorporated review feedback
      
      - Deleted get_base_pipeline /  get_controlnet_addon for pipes
      - Pipes inherit from StableDiffusionXLPipeline
      - Made module dicts for cnxs-addon's down/mid/up classes
      - Added support for qkv fusion and freeU
      
      * Make style, quality, fix-copies
      
      * Implemented review feedback
      
      * Removed compatibility check for vae/ctrl embedding
      
      * make style, quality, fix-copies
      
      * Delete Pipfile
      
      * Integrated review feedback
      
      - Importing ControlNetConditioningEmbedding now
      - get_down/mid/up_block_addon now outside class
      - renamed `do_control` to `apply_control`
      
      * Reduced size of test tensors
      
      For this, added `norm_num_groups` as parameter everywhere
      
      * Renamed cnxs-`Addon` to cnxs-`Adapter`
      
      - `ControlNetXSAddon` -> `ControlNetXSAdapter`
      - `ControlNetXSAddonDownBlockComponents` -> `DownBlockControlNetXSAdapter`, and similarly for mid/up
      - `get_mid_block_addon` -> `get_mid_block_adapter`, and similarly for mid/up
      
      * Fixed save_pretrained/from_pretrained bug
      
      * Removed redundant code
      
      ---------
      Co-authored-by: default avatarDhruv Nair <dhruv.nair@gmail.com>
      fda1531d
  21. 10 Apr, 2024 2 commits
    • Sayak Paul's avatar
      [Tests] reduce the model sizes in the SD fast tests (#7580) · b2323aa2
      Sayak Paul authored
      * give it a shot.
      
      * print.
      
      * correct assertion.
      
      * gather results from the rest of the tests.
      
      * change the assertion values where needed.
      
      * remove print statements.
      b2323aa2
    • Sayak Paul's avatar
      [Core] add "balanced" `device_map` support to pipelines (#6857) · 3e4a6bd2
      Sayak Paul authored
      
      
      * get device <-> component mapping when using multiple gpus.
      
      * condition the device_map bits.
      
      * relax condition
      
      * device_map progress.
      
      * device_map enhancement
      
      * some cleaning up and debugging
      
      * Apply suggestions from code review
      Co-authored-by: default avatarMarc Sun <57196510+SunMarc@users.noreply.github.com>
      
      * incorporate suggestions from PR.
      
      * remove multi-gpu condition for now.
      
      * guard check the component -> device mapping
      
      * fix: device_memory variable
      
      * dispatching transformers model to have force_hooks=True
      
      * better guarding for transformers device_map
      
      * introduce support balanced_low_memory and balanced_ultra_low_memory.
      
      * remove device_map patch.
      
      * fix: intermediate variable scoping.
      
      * fix: condition in cpu offload.
      
      * fix: flax class restrictions.
      
      * remove modifications from cpu_offload and model_offload
      
      * incorporate changes.
      
      * add a simple forward pass test
      
      * add: torch_device in get_inputs()
      
      * add: tests
      
      * remove print
      
      * safe-guard to(), model offloading and cpu offloading when balanced is used as a device_map.
      
      * style
      
      * remove .
      
      * safeguard device_map with more checks and remove invalid device_mapping strategues.
      
      * make  a class attribute and adjust tests accordingly.
      
      * fix device_map check
      
      * fix test
      
      * adjust comment
      
      * fix: device_map attribute
      
      * fix: dispatching.
      
      * max_memory test for pipeline
      
      * version guard the tests
      
      * fix guard.
      
      * address review feedback.
      
      * reset_device_map method.
      
      * add: test for reset_hf_device_map
      
      * fix a couple things.
      
      * add reset_device_map() in the error message.
      
      * add tests for checking reset_device_map doesn't have unintended consequences.
      
      * fix reset_device_map and offloading tests.
      
      * create _get_final_device_map utility.
      
      * hf_device_map -> _hf_device_map
      
      * add documentation
      
      * add notes suggested by Marc.
      
      * styling.
      
      * Apply suggestions from code review
      Co-authored-by: default avatarSteven Liu <59462357+stevhliu@users.noreply.github.com>
      Co-authored-by: default avatarPedro Cuenca <pedro@huggingface.co>
      
      * move updates within gpu condition.
      
      * other docs related things
      
      * note on ignore a device not specified in .
      
      * provide a suggestion if device mapping errors out.
      
      * fix: typo.
      
      * _hf_device_map -> hf_device_map
      
      * Empty-Commit
      
      * add: example hf_device_map.
      
      ---------
      Co-authored-by: default avatarMarc Sun <57196510+SunMarc@users.noreply.github.com>
      Co-authored-by: default avatarSteven Liu <59462357+stevhliu@users.noreply.github.com>
      Co-authored-by: default avatarPedro Cuenca <pedro@huggingface.co>
      3e4a6bd2
  22. 09 Apr, 2024 1 commit
  23. 08 Apr, 2024 1 commit
    • Nguyễn Công Tú Anh's avatar
      Add AudioLDM2 TTS (#5381) · 56a76082
      Nguyễn Công Tú Anh authored
      
      
      * add audioldm2 tts
      
      * change gpt2 max new tokens
      
      * remove unnecessary pipeline and class
      
      * add TTS to AudioLDM2Pipeline
      
      * add TTS docs
      
      * delete unnecessary file
      
      * remove unnecessary import
      
      * add audioldm2 slow testcase
      
      * fix code quality
      
      * remove AudioLDMLearnablePositionalEmbedding
      
      * add variable check vits encoder
      
      * add use_learned_position_embedding
      
      ---------
      Co-authored-by: default avatarDhruv Nair <dhruv.nair@gmail.com>
      56a76082
  24. 04 Apr, 2024 1 commit
    • UmerHA's avatar
      Skip `test_freeu_enabled ` on MPS (#7570) · 71f49a5d
      UmerHA authored
      * Skip `test_freeu_enabled ` on MPS
      
      * Small fixes
      
      - import skip_mps correctly
      - disable all instances of test_freeu_enabled
      
      * Empty commit to trigger tests
      
      * Empty commit to trigger CI
      71f49a5d
  25. 03 Apr, 2024 1 commit
  26. 02 Apr, 2024 1 commit
    • Sayak Paul's avatar
      [Tests] Speed up fast pipelines part II (#7521) · 2b04ec2f
      Sayak Paul authored
      
      
      * start printing the tensors.
      
      * print full throttle
      
      * set static slices for 7 tests.
      
      * remove printing.
      
      * flatten
      
      * disable test for controlnet
      
      * what happens when things are seeded properly?
      
      * set the right value
      
      * style./
      
      * make pia test fail to check things
      
      * print.
      
      * fix pia.
      
      * checking for animatediff.
      
      * fix: animatediff.
      
      * video synthesis
      
      * final piece.
      
      * style.
      
      * print guess.
      
      * fix: assertion for control guess.
      
      ---------
      Co-authored-by: default avatarDhruv Nair <dhruv.nair@gmail.com>
      2b04ec2f