Commits · aba4a5799a37103705c90f990417e6a5e70706d2 · renzhc / diffusers_dcu

24 Feb, 2025 3 commits
- Add SD3 ControlNet to AutoPipeline (#10888) · aba4a579
  hlky authored Feb 24, 2025
```
Co-authored-by: puhuk <wetr235@gmail.com>
```
  aba4a579
- [LoRA] restrict certain keys to be checked for peft config update. (#10808) · b0550a66
  Sayak Paul authored Feb 24, 2025
```
* restrict certain keys to be checked for peft config update.

* updates

* finish.

* finish 2.

* updates
```
  b0550a66
- Fix `torch_dtype` in Kolors text encoder with `transformers` v4.49 (#10816) · 6f74ef55
  hlky authored Feb 24, 2025
```
* Fix `torch_dtype` in Kolors text encoder with `transformers` v4.49

* Default torch_dtype and warning
```
  6f74ef55
22 Feb, 2025 2 commits

Comprehensive type checking for `from_pretrained` kwargs (#10758) · 9c7e2051

Daniel Regado authored Feb 22, 2025



* More robust from_pretrained init_kwargs type checking

* Corrected for Python 3.10

* Type checks subclasses and fixed type warnings

* More type corrections and skip tokenizer type checking

* make style && make quality

* Updated docs and types for Lumina pipelines

* Fixed check for empty signature

* changed location of helper functions

* make style

---------
Co-authored-by: hlky <hlky@hlky.ac>

9c7e2051
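
The kind of check this commit describes can be sketched as follows. This is a minimal illustration under stated assumptions, not the diffusers implementation: `is_valid_type`, `mismatched_kwargs`, and `DummyPipeline` are hypothetical names, and the sketch only covers plain classes, `Optional`/`Union` annotations (including the `X | Y` form from Python 3.10), and the container type of parameterized generics.

```python
import types
from typing import Optional, Union, get_args, get_origin, get_type_hints

# `X | Y` annotations (Python 3.10+) report types.UnionType as their origin.
_UNION_ORIGINS = (Union, getattr(types, "UnionType", Union))


def is_valid_type(value, expected) -> bool:
    """Return True if `value` is compatible with the annotation `expected`."""
    origin = get_origin(expected)
    if origin in _UNION_ORIGINS:
        # Optional[T] is Union[T, None]; any matching member is enough.
        return any(is_valid_type(value, arg) for arg in get_args(expected))
    if origin is not None:
        # Parameterized generics such as list[int]: only the container is checked.
        return isinstance(value, origin) if isinstance(origin, type) else True
    if isinstance(expected, type):
        # isinstance also accepts subclasses of the annotated class.
        return isinstance(value, expected)
    return True  # annotation kinds this sketch does not understand are not checked


def mismatched_kwargs(cls, init_kwargs: dict) -> list:
    """List the names in `init_kwargs` whose values do not match `cls.__init__` hints."""
    hints = get_type_hints(cls.__init__)
    return [
        name
        for name, value in init_kwargs.items()
        if name in hints and not is_valid_type(value, hints[name])
    ]


class DummyPipeline:  # hypothetical stand-in for a pipeline class
    def __init__(self, steps: int, scheduler_name: Optional[str] = None):
        self.steps = steps
        self.scheduler_name = scheduler_name


bad = mismatched_kwargs(DummyPipeline, {"steps": "10", "scheduler_name": None})
```

Here `bad` lists only `steps`, since a string was passed where an `int` is annotated, while `None` is accepted for the optional argument. A loader could warn about such names rather than raise an error.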

[docs] LoRA support (#10844) · 64dec70e

Steven Liu authored Feb 21, 2025



* lora

* update

* update

---------
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

64dec70e

21 Feb, 2025 9 commits
- remove format check for safetensors file (#10864) · ffb6777a
  Marc Sun authored Feb 21, 2025
```
remove check
```
  ffb6777a
- [Fix] Docs overview.md (#10858) · 85fcbaf3
  SahilCarterr authored Feb 21, 2025
```
Fix docs
```
  85fcbaf3
- `device_map` in `load_model_dict_into_meta` (#10851) · d75ea3c7
  hlky authored Feb 21, 2025
```
* `device_map` in `load_model_dict_into_meta`

* _LOW_CPU_MEM_USAGE_DEFAULT

* fix is_peft_version is_bitsandbytes_version
```
  d75ea3c7
- [CI] Update always test Pipelines list in Pipeline fetcher (#10856) · b27d4edb
  Dhruv Nair authored Feb 21, 2025
```
* update

* update

---------
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
```
  b27d4edb
- [CI] Fix incorrectly named test module for Hunyuan DiT (#10854) · 2b2d0429
  Dhruv Nair authored Feb 21, 2025
```
update
```
  2b2d0429
- fix remote vae template (#10852) · 6cef7d23
  Sayak Paul authored Feb 21, 2025
```
fix
```
  6cef7d23
- [chore] template for remote vae. (#10849) · 9055ccb3
  Sayak Paul authored Feb 21, 2025
```
template for remote vae.
```
  9055ccb3
- fix: run tests from a pr workflow. (#9696) · 1871a69e
  Sayak Paul authored Feb 21, 2025
```
* fix: run tests from a pr workflow.

* correct

* update

* checking.
```
  1871a69e
- SkyReels Hunyuan T2V & I2V (#10837) · e3bc4aab
  Aryan authored Feb 21, 2025
```
* update

* make fix-copies

* update

* tests

* update

* update

* add co-author
Co-Authored-By: Langdx <82783347+Langdx@users.noreply.github.com>

* add co-author
Co-Authored-By: howe <howezhang2018@gmail.com>

* update

---------
Co-authored-by: Langdx <82783347+Langdx@users.noreply.github.com>
Co-authored-by: howe <howezhang2018@gmail.com>
```
  e3bc4aab
20 Feb, 2025 13 commits

Some consistency-related fixes for HunyuanVideo (#10835) · f0707751
Aryan authored Feb 21, 2025
```
* update

* update
```
f0707751
SD3 IP-Adapter runtime checkpoint conversion (#10718) · d9ee3879
Daniel Regado authored Feb 20, 2025
```
* Added runtime checkpoint conversion

* Updated docs

* Fix for quantized model
```
d9ee3879
[CI] run fast gpu tests conditionally on pull requests. (#10310) · 454f82e6
Sayak Paul authored Feb 20, 2025
```
* run fast gpu tests conditionally on pull requests.

* revert unneeded changes.

* simplify PR.
```
454f82e6
[CI] install accelerate transformers from `main` (#10289) · 1f853504
Sayak Paul authored Feb 20, 2025
```
install accelerate transformers from `main`.
```
1f853504

Notebooks for Community Scripts-7 (#10846) · 51941387

Parag Ekbote authored Feb 20, 2025

Add 5 Notebooks, improve their example scripts and update the missing links for the example README.

51941387

fix: support transformer models' `generation_config` in pipeline (#10779) · c7a8c439
Haoyun Qin authored Feb 20, 2025

c7a8c439
store activation cls instead of function (#10832) · a4c1aac3
Marc Sun authored Feb 20, 2025
```
* store cls instead of an obj

* style
```
a4c1aac3

[tests] test `encode_prompt()` in isolation (#10438) · b2ca39c8

Sayak Paul authored Feb 20, 2025

* poc encode_prompt() tests

* fix

* updates.

* fixes

* fixes

* updates

* updates

* updates

* revert

* updates

* updates

* updates

* updates

* remove SDXLOptionalComponentsTesterMixin.

* remove tests that directly leveraged encode_prompt() in some way or the other.

* fix imports.

* remove _save_load

* fixes

* fixes

* fixes

* fixes

b2ca39c8

Add missing `isinstance` for arg checks in GGUFParameter (#10834) · 53217126
AstraliteHeart authored Feb 19, 2025

53217126

[Utils] add utilities for checking if certain utilities are properly documented (#7763) · f550745a

Sayak Paul authored Feb 20, 2025



* add: utility to check if attn_procs, norms, acts are properly documented.

* add support listing to the workflows.

* change to 2024.

* small fixes.

* does adding detailed docstrings help?

* uncomment image processor check

* quality

* fix, thanks to @mishig.

* Apply suggestions from code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* style

* JointAttnProcessor2_0

* fixes

* fixes

* fixes

* fixes

* fixes

* fixes

* Update docs/source/en/api/normalization.md
Co-authored-by: hlky <hlky@hlky.ac>

---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: hlky <hlky@hlky.ac>

f550745a

[LoRA] add LoRA support to Lumina2 and fine-tuning script (#10818) · f10d3c6d

Sayak Paul authored Feb 20, 2025

* feat: lora support for Lumina2.

* fix-copies.

* updates

* updates

* docs.

* fix

* add: training script.

* tests

* updates

* updates

* major updates.

* updates

* fixes

* docs.

* updates

* updates

f10d3c6d

[tests] use proper gemma class and config in lumina2 tests. (#10828) · 0fb70683
Sayak Paul authored Feb 20, 2025
```
use proper gemma class and config in lumina2 tests.
```
0fb70683
Remove print statements (#10836) · f8b54cf0
Aryan authored Feb 20, 2025
```
remove prints
```
f8b54cf0

19 Feb, 2025 4 commits

[misc] feat: introduce a style bot. (#10274) · 680a8ed8

Sayak Paul authored Feb 19, 2025



* feat: introduce a style bot.

* updates

* Apply suggestions from code review
Co-authored-by: Guillaume LEGENDRE <glegendre01@gmail.com>

* apply suggestion

* fixes

* updates

---------
Co-authored-by: Guillaume LEGENDRE <glegendre01@gmail.com>

680a8ed8

[FEAT] Model loading refactor (#10604) · f5929e03

Marc Sun authored Feb 19, 2025



* first draft model loading refactor

* revert name change

* fix bnb

* revert name

* fix dduf

* fix hunyuan

* style

* Update src/diffusers/models/model_loading_utils.py
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* suggestions from reviews

* Update src/diffusers/models/modeling_utils.py
Co-authored-by: YiYi Xu <yixu310@gmail.com>

* remove safetensors check

* fix default value

* more fix from suggestions

* revert logic for single file

* style

* typing + fix couple of issues

* improve speed

* Update src/diffusers/models/modeling_utils.py
Co-authored-by: Aryan <aryan@huggingface.co>

* fp8 dtype

* add tests

* rename resolved_archive_file to resolved_model_file

* format

* map_location default cpu

* add utility function

* switch to smaller model + test inference

* Apply suggestions from code review
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* rm comment

* add log

* Apply suggestions from code review
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* add decorator

* cosine sim instead

* fix use_keep_in_fp32_modules

* comm

---------
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: Aryan <aryan@huggingface.co>

f5929e03

[LoRA] make `set_adapters()` robust on silent failures. (#9618) · 6fe05b9b

Sayak Paul authored Feb 19, 2025

* make set_adapters() robust on silent failures.

* fixes to tests

* flaky decorator.

* fix

* flaky to sd3.

* remove warning.

* sort

* quality

* skip test_simple_inference_with_text_denoiser_multi_adapter_block_lora

* skip testing unsupported features.

* raise warning instead of error.

6fe05b9b

DiffusionPipeline mixin `to`+FromOriginalModelMixin/FromSingleFileMixin... · 2bc82d63

hlky authored Feb 19, 2025

DiffusionPipeline mixin `to`+FromOriginalModelMixin/FromSingleFileMixin `from_single_file` type hint (#10811)

* DiffusionPipeline mixin `to` type hint

* FromOriginalModelMixin from_single_file

* FromSingleFileMixin from_single_file

2bc82d63

18 Feb, 2025 2 commits

[docs] add missing entries to the lora docs. (#10819) · 924f880d
Sayak Paul authored Feb 18, 2025
```
add missing entries to the lora docs.
```
924f880d

Fix max_shift value in flux and related functions to 1.15 (issue #10675) (#10807) · b75b204a

puhuk authored Feb 18, 2025

This PR updates the max_shift value in flux to 1.15 for consistency across the codebase. In addition to modifying max_shift in flux, all related functions that copy and use this logic, such as calculate_shift in `src/diffusers/pipelines/stable_diffusion_3/pipeline_stable_diffusion_3_img2img.py`, have also been updated to ensure uniform behavior.

b75b204a
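
For context, the shift that `max_shift` feeds into is a linear interpolation over the image sequence length. The sketch below assumes the function shape and the other default values (`base_seq_len`, `max_seq_len`, `base_shift`); only the `calculate_shift` name and `max_shift = 1.15` come from the commit message above.

```python
def calculate_shift(
    image_seq_len: int,
    base_seq_len: int = 256,    # assumed default, for illustration
    max_seq_len: int = 4096,    # assumed default, for illustration
    base_shift: float = 0.5,    # assumed default, for illustration
    max_shift: float = 1.15,    # the value this commit standardizes on
) -> float:
    """Linearly interpolate the shift between base_shift and max_shift."""
    slope = (max_shift - base_shift) / (max_seq_len - base_seq_len)
    intercept = base_shift - slope * base_seq_len
    return image_seq_len * slope + intercept


shift_at_base = calculate_shift(256)   # equals base_shift
shift_at_max = calculate_shift(4096)   # equals max_shift up to rounding
```

Because the result at `max_seq_len` equals `max_shift`, every copy of the function needs the same default for pipelines to agree, which is the consistency this commit aims for.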

17 Feb, 2025 2 commits
- [LoRA] improve lora support for flux. (#10810) · c14057c8
  Sayak Paul authored Feb 17, 2025
```
update lora support for flux.
```
  c14057c8
- [chore] update notes generation spaces (#10592) · 3579cd2b
  Sayak Paul authored Feb 17, 2025
```
fix
```
  3579cd2b
16 Feb, 2025 2 commits

Extend Support for callback_on_step_end for AuraFlow and LuminaText2Img Pipelines (#10746) · 3e99b567

Parag Ekbote authored Feb 16, 2025



* Add support for callback_on_step_end for AuraFlowPipeline and LuminaText2ImgPipeline.

* Apply the suggestions from code review for lumina and auraflow
Co-authored-by: hlky <hlky@hlky.ac>

* Update missing inputs and imports.

* Add input field.

* Apply suggestions from code review-2
Co-authored-by: hlky <hlky@hlky.ac>

* Apply the suggestions from review for unused imports.
Co-authored-by: hlky <hlky@hlky.ac>

* make style.

* Update pipeline_aura_flow.py

* Update pipeline_lumina.py

* Update pipeline_lumina.py

* Update pipeline_aura_flow.py

* Update pipeline_lumina.py

---------
Co-authored-by: hlky <hlky@hlky.ac>

3e99b567
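
How a denoising loop typically calls `callback_on_step_end` can be sketched with a toy loop. This is an illustration only: `run_denoising_loop` and `log_step` are hypothetical, the halving step stands in for a real scheduler update, and the callback signature (pipeline, step index, timestep, dict of tensors) is assumed to follow the convention used by other diffusers pipelines.

```python
def run_denoising_loop(
    latents,
    timesteps,
    callback_on_step_end=None,
    callback_on_step_end_tensor_inputs=("latents",),
):
    """Toy loop: after every step, hand the requested values to the callback."""
    for step_index, timestep in enumerate(timesteps):
        latents = [x * 0.5 for x in latents]  # stand-in for one denoising step

        if callback_on_step_end is not None:
            available = {"latents": latents}
            callback_kwargs = {
                name: available[name] for name in callback_on_step_end_tensor_inputs
            }
            # The first argument would be the pipeline itself; None is used here.
            outputs = callback_on_step_end(None, step_index, timestep, callback_kwargs)
            # The callback may return modified values, which replace the originals.
            latents = outputs.pop("latents", latents)
    return latents


seen_steps = []


def log_step(pipe, step_index, timestep, callback_kwargs):
    """Hypothetical callback: record progress and return the inputs unchanged."""
    seen_steps.append((step_index, timestep))
    return callback_kwargs


final_latents = run_denoising_loop(
    [8.0, 4.0], timesteps=[999, 500, 1], callback_on_step_end=log_step
)
```

The callback runs once per step and can either observe the loop, as `log_step` does, or return altered values to steer the remaining steps.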

typo fix (#10802) · 952b9131
Yaniv Galron authored Feb 16, 2025

952b9131

15 Feb, 2025 2 commits

CogView4 (supports different length c and uc) (#10649) · d90cd362

Yuxuan Zhang authored Feb 16, 2025



* init

* encode with glm

* draft schedule

* feat(scheduler): Add CogView scheduler implementation

* feat(embeddings): add CogView 2D rotary positional embedding

* 1

* Update pipeline_cogview4.py

* fix the timestep init and sigma

* update latent

* draft patch (not working)

* fix

* [WIP][cogview4]: implement initial CogView4 pipeline

Implement the basic CogView4 pipeline structure with the following changes:
- Add CogView4 pipeline implementation
- Implement DDIM scheduler for CogView4
- Add CogView3Plus transformer architecture
- Update embedding models

Current limitations:
- CFG implementation uses padding for sequence length alignment
- Need to verify transformer inference alignment with Megatron

TODO:
- Consider separate forward passes for condition/uncondition
  instead of padding approach

* [WIP][cogview4][refactor]: Split condition/uncondition forward pass in CogView4 pipeline

Split the forward pass for conditional and unconditional predictions in the CogView4 pipeline to match the original implementation. The noise prediction is now done separately for each case before combining them for guidance. However, the results still need improvement.

This is a work in progress as the generated images are not yet matching expected quality.

* use with -2 hidden state

* remove text_projector

* 1

* [WIP] Add tensor-reload to align input from transformer block

* [WIP] for older glm

* use with cogview4 transformers forward twice of u and uc

* Update convert_cogview4_to_diffusers.py

* remove this

* use main example

* change back

* reset

* setback

* back

* back 4

* Fix qkv conversion logic for CogView4 to Diffusers format

* back5

* revert to sat to cogview4 version

* update a new convert from megatron

* [WIP][cogview4]: implement CogView4 attention processor

Add CogView4AttnProcessor class for implementing scaled dot-product attention
with rotary embeddings for the CogVideoX model. This processor concatenates
encoder and hidden states, applies QKV projections and RoPE, but does not
include spatial normalization.

TODO:
- Fix incorrect QKV projection weights
- Resolve ~25% error in RoPE implementation compared to Megatron

* [cogview4] implement CogView4 transformer block

Implement CogView4 transformer block following the Megatron architecture:
- Add multi-modulate and multi-gate mechanisms for adaptive layer normalization
- Implement dual-stream attention with encoder-decoder structure
- Add feed-forward network with GELU activation
- Support rotary position embeddings for image tokens

The implementation follows the original CogView4 architecture while adapting
it to work within the diffusers framework.

* with new attn

* [bugfix] fix dimension mismatch in CogView4 attention

* [cogview4][WIP]: update final normalization in CogView4 transformer

Refactored the final normalization layer in CogView4 transformer to use separate layernorm and AdaLN operations instead of combined AdaLayerNormContinuous. This matches the original implementation but needs validation.

Needs verification against reference implementation.

* 1

* put back

* Update transformer_cogview4.py

* change time_shift

* Update pipeline_cogview4.py

* change timesteps

* fix

* change text_encoder_id

* [cogview4][rope] align RoPE implementation with Megatron

- Implement apply_rope method in attention processor to match Megatron's implementation
- Update position embeddings to ensure compatibility with Megatron-style rotary embeddings
- Ensure consistent rotary position encoding across attention layers

This change improves compatibility with Megatron-based models and provides
better alignment with the original implementation's positional encoding approach.

* [cogview4][bugfix] apply silu activation to time embeddings in CogView4

Applied silu activation to time embeddings before splitting into conditional
and unconditional parts in CogView4Transformer2DModel. This matches the
original implementation and helps ensure correct time conditioning behavior.

* [cogview4][chore] clean up pipeline code

- Remove commented out code and debug statements
- Remove unused retrieve_timesteps function
- Clean up code formatting and documentation

This commit focuses on code cleanup in the CogView4 pipeline implementation, removing unnecessary commented code and improving readability without changing functionality.

* [cogview4][scheduler] Implement CogView4 scheduler and pipeline

* now it works

* add timestep

* batch

* change convert script

* refactor pt. 1; make style

* refactor pt. 2

* refactor pt. 3

* add tests

* make fix-copies

* update toctree.yml

* use flow match scheduler instead of custom

* remove scheduling_cogview.py

* add tiktoken to test dependencies

* Update src/diffusers/models/embeddings.py
Co-authored-by: YiYi Xu <yixu310@gmail.com>

* apply suggestions from review

* use diffusers apply_rotary_emb

* update flow match scheduler to accept timesteps

* fix comment

* apply review suggestions

* Update src/diffusers/schedulers/scheduling_flow_match_euler_discrete.py
Co-authored-by: YiYi Xu <yixu310@gmail.com>

---------
Co-authored-by: 三洋三洋 <1258009915@qq.com>
Co-authored-by: OleehyO <leehy0357@gmail.com>
Co-authored-by: Aryan <aryan@huggingface.co>
Co-authored-by: YiYi Xu <yixu310@gmail.com>

d90cd362
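
The commit notes above describe replacing the padding approach with separate forward passes for the conditional and unconditional prompts, which is what allows the two prompts to differ in length. A minimal sketch of that idea follows, assuming a toy denoiser: `toy_denoiser` and `guided_noise` are hypothetical names, and the guidance formula is the standard classifier-free guidance combination rather than code taken from the pipeline.

```python
def toy_denoiser(latent, prompt_tokens):
    """Stand-in for one transformer forward pass; accepts prompts of any length."""
    bias = sum(prompt_tokens) / len(prompt_tokens)
    return [x + bias for x in latent]


def guided_noise(latent, cond_tokens, uncond_tokens, guidance_scale):
    """Run two independent passes, then combine them for guidance."""
    # Each pass sees only its own prompt, so no padding to a common length is needed.
    noise_cond = toy_denoiser(latent, cond_tokens)
    noise_uncond = toy_denoiser(latent, uncond_tokens)
    return [
        u + guidance_scale * (c - u) for c, u in zip(noise_cond, noise_uncond)
    ]


# Conditional prompt has three tokens, unconditional prompt has one.
noise = guided_noise([0.0, 1.0], cond_tokens=[1, 2, 3], uncond_tokens=[0], guidance_scale=5.0)
```

The trade-off is two forward passes per step instead of one batched pass, in exchange for not having to pad or mask the shorter prompt.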

follow-up refactor on lumina2 (#10776) · 69f919d8
YiYi Xu authored Feb 14, 2025
```
* up
```
69f919d8

14 Feb, 2025 1 commit
- [FIX] check_inputs function in lumina2 (#10784) · a6b843a7
  SahilCarterr authored Feb 15, 2025
  
  a6b843a7