    [Quantization] Add quantization support for `bitsandbytes` (#9213) · b821f006
    Sayak Paul authored
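
    In short: diffusers models can now be loaded with bitsandbytes 8-bit and 4-bit quantization via a `quantization_config`. A minimal usage sketch (API as it landed around this PR; the checkpoint and dtype choices are illustrative):

    ```python
    import torch
    from diffusers import BitsAndBytesConfig, SD3Transformer2DModel

    # 4-bit NF4 quantization with bf16 compute, mirroring transformers' config.
    nf4_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.bfloat16,
    )

    model = SD3Transformer2DModel.from_pretrained(
        "stabilityai/stable-diffusion-3-medium-diffusers",
        subfolder="transformer",
        quantization_config=nf4_config,
        torch_dtype=torch.bfloat16,
    )
    ```
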
    * quantization config.
    
    * fix-copies
    
    * fix
    
    * modules_to_not_convert
    
    * add bitsandbytes utilities.
    
    * make progress.
    
    * fixes
    
    * quality
    
    * up
    
    * up
    
    * rotary embedding refactor 2: update comments, fix dtype for use_real=False (#9312)
    
    * fix notes and dtype
    
    * up
    
    * up
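
    For context, a hedged illustration of the `use_real` flag whose dtype this refactor fixes (based on `get_1d_rotary_pos_embed` in diffusers' embeddings module; exact defaults may differ by version):

    ```python
    import torch
    from diffusers.models.embeddings import get_1d_rotary_pos_embed

    # use_real=False returns a single complex tensor (freqs_cis) ...
    freqs_cis = get_1d_rotary_pos_embed(dim=64, pos=16, use_real=False)

    # ... while use_real=True returns a (cos, sin) pair of real tensors.
    cos, sin = get_1d_rotary_pos_embed(dim=64, pos=16, use_real=True)

    print(freqs_cis.dtype)       # torch.complex64
    print(cos.dtype, sin.dtype)  # torch.float32 torch.float32
    ```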
    
    * minor
    
    * up
    
    * up
    
    * fix
    
    * provide credits where due.
    
    * make configurations work.
    
    * fixes
    
    * fix
    
    * update_missing_keys
    
    * fix
    
    * fix
    
    * make it work.
    
    * fix
    
    * provide credits to transformers.
    
    * empty commit
    
    * handle to() better (guard sketched below).
    
    * tests
    
    * change to bnb from bitsandbytes
    
    * fix tests
    
    * fix slow quality tests
    
    * SD3 remark
    
    * fix
    
    * complete int4 tests
    
    * add a README to the test files.
    
    * add model CPU offload tests
    
    * warning test
    
    * better safeguard.
    
    * change merging status
    
    * courtesy to transformers.
    
    * move up.
    
    * better
    
    * make the unused kwargs warning friendlier.
    
    * harmonize changes with https://github.com/huggingface/transformers/pull/33122
    
    * style
    
    * training tests
    
    * feedback part i.
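
    The `handle to() better` item above refers to guarding dtype casts on quantized models. A minimal sketch of that kind of guard (class and message are hypothetical, not the exact diffusers code):

    ```python
    import torch
    import torch.nn as nn

    class QuantizedModelMixin(nn.Module):
        """Hypothetical mixin showing the .to() safeguard for bnb-quantized models."""

        is_quantized = True

        def to(self, *args, **kwargs):
            wants_dtype = any(isinstance(a, torch.dtype) for a in args) or "dtype" in kwargs
            if self.is_quantized and wants_dtype:
                raise ValueError(
                    "Casting a bitsandbytes-quantized model to a new dtype is not "
                    "supported; choose the compute dtype at load time instead."
                )
            # Device moves (e.g. .to("cuda")) still go through normally.
            return super().to(*args, **kwargs)
    ```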
    
    * Add Flux inpainting and Flux Img2Img (#9135)
    
    ---------
    Co-authored-by: yiyixuxu <yixu310@gmail.com>
    
    * Update `UNet2DConditionModel`'s error messages (#9230)
    
    * refactor
    
    * [CI] Update Single file Nightly Tests (#9357)
    
    * update
    
    * update
    
    * feedback.
    
    * improve README for flux dreambooth lora (#9290)
    
    * improve readme
    
    * improve readme
    
    * improve readme
    
    * improve readme
    
    * fix one uncaught deprecation warning for accessing vae_latent_channels in VaeImagePreprocessor (#9372)
    
    * deprecation warning vae_latent_channels
    
    * add mixed int8 tests and more tests to nf4.
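
    The mixed-int8 path exercised by these tests keeps weights in int8 while routing outlier activations through higher precision (LLM.int8()-style decomposition). A hedged config sketch (threshold value illustrative):

    ```python
    from diffusers import BitsAndBytesConfig

    # 8-bit weights; activation columns whose magnitude exceeds the threshold
    # are computed in fp16 instead of int8.
    int8_config = BitsAndBytesConfig(
        load_in_8bit=True,
        llm_int8_threshold=6.0,
    )
    ```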
    
    * [core] FreeNoise memory improvements (#9262)
    
    * update
    
    * implement prompt interpolation
    
    * make style
    
    * resnet memory optimizations
    
    * more memory optimizations; todo: refactor
    
    * update
    
    * update animatediff controlnet with latest changes
    
    * refactor chunked inference changes
    
    * remove print statements
    
    * update
    
    * chunk -> split
    
    * remove changes from incorrect conflict resolution
    
    * remove changes from incorrect conflict resolution
    
    * add explanation of SplitInferenceModule (sketched at the end of this block)
    
    * update docs
    
    * Revert "update docs"
    
    This reverts commit c55a50a271b2cefa8fe340a4f2a3ab9b9d374ec0.
    
    * update docstring for freenoise split inference
    
    * apply suggestions from review
    
    * add tests
    
    * apply suggestions from review
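
    A hedged sketch of the SplitInferenceModule idea referenced above (name follows the PR; implementation illustrative): wrap a module, run it on smaller splits of a batched input, and concatenate the results, trading a little speed for lower peak activation memory.

    ```python
    import torch
    import torch.nn as nn

    class SplitInferenceModule(nn.Module):
        """Runs the wrapped module on splits of the input along one dimension."""

        def __init__(self, module: nn.Module, split_size: int = 1, split_dim: int = 0):
            super().__init__()
            self.module = module
            self.split_size = split_size
            self.split_dim = split_dim

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            outputs = [
                self.module(split)
                for split in torch.split(x, self.split_size, dim=self.split_dim)
            ]
            return torch.cat(outputs, dim=self.split_dim)
    ```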
    
    * quantization docs.
    
    * docs.
    
    * Revert "Add Flux inpainting and Flux Img2Img (#9135)"
    
    This reverts commit 5799954dd4b3d753c7c1b8d722941350fe4f62ca.
    
    * tests
    
    * don
    
    * Apply suggestions from code review
    Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
    
    * contribution guide.
    
    * changes
    
    * empty
    
    * fix tests
    
    * harmonize with https://github.com/huggingface/transformers/pull/33546
    
    * numpy_cosine_distance (helper sketched at the end of this list).
    
    * config_dict modification.
    
    * remove if config comment.
    
    * note for load_state_dict changes.
    
    * float8 check.
    
    * quantizer.
    
    * raise an error for non-True low_cpu_mem_usage values when using quant (guard sketched at the end of this list).
    
    * low_cpu_mem_usage shenanigans when using fp32 modules.
    
    * don't re-assign _pre_quantization_type.
    
    * make comments clear.
    
    * remove comments.
    
    * handle mixed types better when moving to cpu.
    
    * add tests to check if we're throwing warning rightly.
    
    * better check.
    
    * fix 8bit test_quality.
    
    * handle dtype more robustly.
    
    * better message when keep_in_fp32_modules.
    
    * handle dtype casting.
    
    * fix dtype checks in pipeline.
    
    * fix warning message.
    
    * Update src/diffusers/models/modeling_utils.py
    Co-authored-by: YiYi Xu <yixu310@gmail.com>
    
    * mitigate the confusing cpu warning
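
    A hedged sketch of the `numpy_cosine_distance` test helper mentioned above (signature assumed; diffusers' testing utility may differ in name and details):

    ```python
    import numpy as np

    def numpy_cosine_distance(a: np.ndarray, b: np.ndarray) -> float:
        """1 - cosine similarity of two flattened arrays, for comparing
        quantized outputs against full-precision references in tests."""
        a = a.flatten().astype(np.float64)
        b = b.flatten().astype(np.float64)
        sim = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
        return float(1.0 - sim)
    ```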
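
    And a hedged sketch of the `low_cpu_mem_usage` check described above (helper name and message hypothetical):

    ```python
    def check_low_cpu_mem_usage(low_cpu_mem_usage: bool, quantization_config) -> None:
        # Quantized weights are materialized directly on the target device,
        # which requires the low-CPU-memory loading path.
        if quantization_config is not None and not low_cpu_mem_usage:
            raise ValueError(
                "`low_cpu_mem_usage` must be True when `quantization_config` is set."
            )
    ```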
    
    ---------
    Co-authored-by: Vishnu V Jaddipal <95531133+Gothos@users.noreply.github.com>
    Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
    Co-authored-by: YiYi Xu <yixu310@gmail.com>