- 22 Sep, 2025 1 commit
-
-
Sayak Paul authored
* factor out the overlaps in save_lora_weights(). * remove comment. * remove comment. * up * fix-copies
-
- 01 Sep, 2025 1 commit
-
-
apolinário authored
* Fix lora conversion function for ai-toolkit Qwen Image LoRAs * add forgotten parenthesis * remove space new line * update pipeline * detect if arrow or letter * remove whitespaces * style * apply suggestion * apply suggestion * apply suggestion --------- Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
-
- 19 Aug, 2025 1 commit
-
-
Linoy Tsaban authored
* add alpha * load into 2nd transformer * Update src/diffusers/loaders/lora_conversion_utils.py Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
* Update src/diffusers/loaders/lora_conversion_utils.py Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
* pr comments * pr comments * pr comments * fix * fix * Apply style fixes * fix copies * fix * fix copies * Update src/diffusers/loaders/lora_pipeline.py Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
* revert change * revert change * fix copies * up * fix --------- Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: linoy <linoy@hf.co>
-
- 18 Aug, 2025 1 commit
-
-
Sayak Paul authored
* feat: support more Qwen LoRAs from the community. * revert unrelated changes. * Revert "revert unrelated changes." This reverts commit 82dea555dc9afce1fbb4dc2323be45212ded9092.
-
- 11 Aug, 2025 1 commit
-
-
Sayak Paul authored
* feat: support qwen lightning lora. * add docs. * fix
-
- 05 Aug, 2025 1 commit
-
-
Sayak Paul authored
* feat: support lora in qwen image and training script * up * up * up * up * up * up * add lora tests * fix * add tests * fix * reviewer feedback * up[ * Apply suggestions from code review Co-authored-by: Aryan <aryan@huggingface.co> --------- Co-authored-by: Aryan <aryan@huggingface.co>
-
- 16 Jul, 2025 1 commit
-
-
Tolga Cangöz authored
* style * Fix class name casing for SkyReelsV2 components in multiple files to ensure consistency and correct functionality. * cleaning * cleansing * Refactor `get_timestep_embedding` to move modifications into `SkyReelsV2TimeTextImageEmbedding`. * Remove unnecessary line break in `get_timestep_embedding` function for cleaner code. * Remove `skyreels_v2` entry from `_import_structure` and update its initialization to directly assign the list of SkyReelsV2 components. * cleansing * Refactor attention processing in `SkyReelsV2AttnProcessor2_0` to always convert query, key, and value to `torch.bfloat16`, simplifying the code and improving clarity. * Enhance example usage in `pipeline_skyreels_v2_diffusion_forcing.py` by adding VAE initialization and detailed prompt for video generation, improving clarity and usability of the documentation. * Refactor import structure in `__init__.py` for SkyReelsV2 components and improve formatting in `pipeline_skyreels_v2_diffusion_forcing.py` to enhance code readability and maintainability. * Update `guidance_scale` parameter in `SkyReelsV2DiffusionForcingPipeline` from 5.0 to 6.0 to enhance video generation quality. * Update `guidance_scale` parameter in example documentation and class definition of `SkyReelsV2DiffusionForcingPipeline` to ensure consistency and improve video generation quality. * Update `causal_block_size` parameter in `SkyReelsV2DiffusionForcingPipeline` to default to `None`. * up * Fix dtype conversion for `timestep_proj` in `SkyReelsV2Transformer3DModel` to *ensure* correct tensor operations. * Optimize causal mask generation by replacing repeated tensor with `repeat_interleave` for improved efficiency in `SkyReelsV2Transformer3DModel`. * style * Enhance example documentation in `SkyReelsV2DiffusionForcingPipeline` with guidance scale and shift parameters for T2V and I2V. Remove unused `retrieve_latents` function to streamline the code. 
* Refactor sample scheduler creation in `SkyReelsV2DiffusionForcingPipeline` to use `deepcopy` for improved state management during inference steps. * Enhance error handling and documentation in `SkyReelsV2DiffusionForcingPipeline` for `overlap_history` and `addnoise_condition` parameters to improve long video generation guidance. * Update documentation and progress bar handling in `SkyReelsV2DiffusionForcingPipeline` to clarify asynchronous inference settings and improve progress tracking during denoising steps. * Refine progress bar calculation in `SkyReelsV2DiffusionForcingPipeline` by rounding the step size to one decimal place for improved readability during denoising steps. * Update import statements in `SkyReelsV2DiffusionForcingPipeline` documentation for improved clarity and organization. * Refactor progress bar handling in `SkyReelsV2DiffusionForcingPipeline` to use total steps instead of calculated step size. * update templates for i2v, v2v * Add `retrieve_latents` function to streamline latent retrieval in `SkyReelsV2DiffusionForcingPipeline`. Update video latent processing to utilize this new function for improved clarity and maintainability. * Add `retrieve_latents` function to both i2v and v2v pipelines for consistent latent retrieval. Update video latent processing to utilize this function, enhancing clarity and maintainability across the SkyReelsV2DiffusionForcingPipeline implementations. * Remove redundant ValueError for `overlap_history` in `SkyReelsV2DiffusionForcingPipeline` to streamline error handling and improve user guidance for long video generation. * Update default video dimensions and flow matching scheduler parameter in `SkyReelsV2DiffusionForcingPipeline` to enhance video generation capabilities. * Refactor `SkyReelsV2DiffusionForcingPipeline` to support Image-to-Video (i2v) generation. Update class name, add image encoding functionality, and adjust parameters for improved video generation. 
Enhance error handling for image inputs and update documentation accordingly. * Improve organization for image-last_image condition. * Refactor `SkyReelsV2DiffusionForcingImageToVideoPipeline` to improve latent preparation and video condition handling integration. * style * style * Add example usage of PIL for image input in `SkyReelsV2DiffusionForcingImageToVideoPipeline` documentation. * Refactor `SkyReelsV2DiffusionForcingPipeline` to `SkyReelsV2DiffusionForcingVideoToVideoPipeline`, enhancing support for Video-to-Video (v2v) generation. Introduce video input handling, update latent preparation logic, and improve error handling for input parameters. * Refactor `SkyReelsV2DiffusionForcingImageToVideoPipeline` by removing the `image_encoder` and `image_processor` dependencies. Update the CPU offload sequence accordingly. * Refactor `SkyReelsV2DiffusionForcingImageToVideoPipeline` to enhance latent preparation logic and condition handling. Update image input type to `Optional`, streamline video condition processing, and improve handling of `last_image` during latent generation. * Enhance `SkyReelsV2DiffusionForcingPipeline` by refining latent preparation for long video generation. Introduce new parameters for video handling, overlap history, and causal block size. Update logic to accommodate both short and long video scenarios, ensuring compatibility and improved processing. * refactor * fix num_frames * fix prefix_video_latents * up * refactor * Fix typo in scheduler method call within `SkyReelsV2DiffusionForcingVideoToVideoPipeline` to ensure proper noise scaling during latent generation. * up * Enhance `SkyReelsV2DiffusionForcingImageToVideoPipeline` by adding support for `last_image` parameter and refining latent frame calculations. Update preprocessing logic. * add statistics * Refine latent frame handling in `SkyReelsV2DiffusionForcingImageToVideoPipeline` by correcting variable names and reintroducing latent mean and standard deviation calculations. 
Update logic for frame preparation and sampling to ensure accurate video generation. * up * refactor * up * Refactor `SkyReelsV2DiffusionForcingVideoToVideoPipeline` to improve latent handling by enforcing tensor input for video, updating frame preparation logic, and adjusting default frame count. Enhance preprocessing and postprocessing steps for better integration. * style * fix vae output indexing * upup * up * Fix tensor concatenation and repetition logic in `SkyReelsV2DiffusionForcingImageToVideoPipeline` to ensure correct dimensionality for video conditions and latent conditions. * Refactor latent retrieval logic in `SkyReelsV2DiffusionForcingVideoToVideoPipeline` to handle tensor dimensions more robustly, ensuring compatibility with both 3D and 4D video inputs. * Enhance logging in `SkyReelsV2DiffusionForcing` pipelines by adding iteration print statements for better debugging. Clean up unused code related to prefix video latents length calculation in `SkyReelsV2DiffusionForcingImageToVideoPipeline`. * Update latent handling in `SkyReelsV2DiffusionForcingImageToVideoPipeline` to conditionally set latents based on video iteration state, improving flexibility for video input processing. * Refactor `SkyReelsV2TimeTextImageEmbedding` to utilize `get_1d_sincos_pos_embed_from_grid` for timestep projection. * Enhance `get_1d_sincos_pos_embed_from_grid` function to include an optional parameter `flip_sin_to_cos` for flipping sine and cosine embeddings, improving flexibility in positional embedding generation. * Update timestep projection in `SkyReelsV2TimeTextImageEmbedding` to include `flip_sin_to_cos` parameter, enhancing the flexibility of time embedding generation. * Refactor tensor type handling in `SkyReelsV2AttnProcessor2_0` and `SkyReelsV2TransformerBlock` to ensure consistent use of `torch.float32` and `torch.bfloat16`, improving integration. 
* Update tensor type in `SkyReelsV2RotaryPosEmbed` to use `torch.float32` for frequency calculations, ensuring consistency in data types across the model. * Refactor `SkyReelsV2TimeTextImageEmbedding` to utilize automatic mixed precision for timestep projection. * down * down * style * Add debug tensor tracking to `SkyReelsV2Transformer3DModel` for enhanced debugging and output analysis; update `Transformer2DModelOutput` to include debug tensors. * up * Refactor indentation in `SkyReelsV2AttnProcessor2_0` to improve code readability and maintain consistency in style. * Convert query, key, and value tensors to bfloat16 in `SkyReelsV2AttnProcessor2_0` for improved performance. * Add debug print statements in `SkyReelsV2TransformerBlock` to track tensor shapes and values for improved debugging and analysis. * debug * debug * Remove commented-out debug tensor tracking from `SkyReelsV2TransformerBlock` * Add functionality to save processed video latents as a Safetensors file in `SkyReelsV2DiffusionForcingPipeline`. * up * Add functionality to save output latents as a Safetensors file in `SkyReelsV2DiffusionForcingPipeline`. * up * Remove additional commented-out debug tensor tracking from `SkyReelsV2TransformerBlock` and `SkyReelsV2Transformer3DModel` for cleaner code. * style * cleansing * Update example documentation and parameters in `SkyReelsV2Pipeline`. Adjusted example code for loading models, modified default values for height, width, num_frames, and guidance_scale, and improved output video quality settings. * Update shift parameter in example documentation and default values across SkyReels V2 pipelines. Adjusted shift values for I2V from 3.0 to 5.0 and updated related example code for consistency. * Update example documentation in SkyReels V2 pipelines to include available model options and update model references for loading. Adjusted model names to reflect the latest versions across I2V, V2V, and T2V pipelines. 
* Add test templates * style * Add docs template * Add SkyReels V2 Diffusion Forcing Video-to-Video Pipeline to imports * style * fix-copies * convert i2v 1.3b * Update transformer configuration to include `image_dim` for SkyReels V2 models and refactor imports to use `SkyReelsV2Transformer3DModel`. * Refactor transformer import in SkyReels V2 pipeline to use `SkyReelsV2Transformer3DModel` for consistency. * Update transformer configuration in SkyReels V2 to increase `in_channels` from 16 to 36 for i2v conf. * Update transformer configuration in SkyReels V2 to set `added_kv_proj_dim` values for different model types. * up * up * up * Add SkyReelsV2Pipeline support for T2V model type in conversion script * upp * Refactor model type checks in conversion script to use substring matching for improved flexibility * upp * Fix shard path formatting in conversion script to accommodate varying model types by dynamically adjusting zero padding. * Update sharded safetensors loading logic in conversion script to use substring matching for model directory checks * Update scheduler parameters in SkyReels V2 test files for consistency across image and video pipelines * Refactor conversion script to initialize text encoder, tokenizer, and scheduler for SkyReels pipelines, enhancing model integration * style * Update documentation for SkyReels-V2, introducing the Infinite-length Film Generative model, enhancing text-to-video generation examples, and updating model references throughout the API documentation. * Add SkyReelsV2Transformer3DModel and FlowMatchUniPCMultistepScheduler documentation, updating TOC and introducing new model and scheduler files. * style * Update documentation for SkyReelsV2DiffusionForcingPipeline to correct flow matching scheduler parameter for I2V from 3.0 to 5.0, ensuring clarity in usage examples. * Add documentation for causal_block_size parameter in SkyReelsV2DF pipelines, clarifying its role in asynchronous inference. 
* Simplify min_ar_step calculation in SkyReelsV2DiffusionForcingPipeline to improve clarity. * style and fix-copies * style * Add documentation for SkyReelsV2Transformer3DModel Introduced a new markdown file detailing the SkyReelsV2Transformer3DModel, including usage instructions and model output specifications. * Update test configurations for SkyReelsV2 pipelines - Adjusted `in_channels` from 36 to 16 in `test_skyreels_v2_df_image_to_video.py`. - Added new parameters: `overlap_history`, `num_frames`, and `base_num_frames` in `test_skyreels_v2_df_video_to_video.py`. - Updated expected output shape in video tests from (17, 3, 16, 16) to (41, 3, 16, 16). * Refines SkyReelsV2DF test parameters * Update src/diffusers/models/modeling_outputs.py Co-authored-by: Aryan <contact.aryanvs@gmail.com>
* Refactor `grid_sizes` processing by using already-calculated post-patch parameters to simplify * Update docs/source/en/api/pipelines/skyreels_v2.md Co-authored-by: Aryan <contact.aryanvs@gmail.com>
* Refactor parameter naming for diffusion forcing in SkyReelsV2 pipelines - Changed `flag_df` to `enable_diffusion_forcing` for clarity in the SkyReelsV2Transformer3DModel and associated pipelines. - Updated all relevant method calls to reflect the new parameter name. * Revert _toctree.yml to adjust section expansion states * style * Update docs/source/en/api/models/skyreels_v2_transformer_3d.md Co-authored-by: YiYi Xu <yixu310@gmail.com>
* Add copying label to SkyReelsV2ImageEmbedding from WanImageEmbedding. * Refactor transformer block processing in SkyReelsV2Transformer3DModel - Ensured proper handling of hidden states during both gradient checkpointing and standard processing. * Update SkyReels V2 documentation to remove VRAM requirement and streamline imports - Removed the mention of ~13GB VRAM requirement for the SkyReels-V2 model. - Simplified import statements by removing unused `load_image` import. * Add SkyReelsV2LoraLoaderMixin for loading and managing LoRA layers in SkyReelsV2Transformer3DModel - Introduced SkyReelsV2LoraLoaderMixin class to handle loading, saving, and fusing of LoRA weights specific to the SkyReelsV2 model. - Implemented methods for state dict management, including compatibility checks for various LoRA formats. - Enhanced functionality for loading weights with options for low CPU memory usage and hotswapping. - Added detailed docstrings for clarity on parameters and usage. * Update SkyReelsV2 documentation and loader mixin references - Corrected the documentation to reference the new `SkyReelsV2LoraLoaderMixin` for loading LoRA weights. - Updated comments in the `SkyReelsV2LoraLoaderMixin` class to reflect changes in model references from `WanTransformer3DModel` to `SkyReelsV2Transformer3DModel`. * Enhance SkyReelsV2 integration by adding SkyReelsV2LoraLoaderMixin references - Added `SkyReelsV2LoraLoaderMixin` to the documentation and loader imports for improved LoRA weight management. - Updated multiple pipeline classes to inherit from `SkyReelsV2LoraLoaderMixin` instead of `WanLoraLoaderMixin`. * Update SkyReelsV2 model references in documentation - Replaced placeholder model paths with actual paths for SkyReels-V2 models in multiple pipeline files. - Ensured consistency across the documentation for loading models in the SkyReelsV2 pipelines.
* style * fix-copies * Refactor `fps_projection` in `SkyReelsV2Transformer3DModel` - Replaced the sequential linear layers for `fps_projection` with a `FeedForward` layer using `SiLU` activation for better integration. * Update docs * Refactor video processing in SkyReelsV2DiffusionForcingPipeline - Renamed parameters for clarity: `video` to `video_latents` and `overlap_history` to `overlap_history_latent_frames`. - Updated logic for handling long video generation, including adjustments to latent frame calculations and accumulation. - Consolidated handling of latents for both long and short video generation scenarios. - Final decoding step now consistently converts latents to pixels, ensuring proper output format. * Update activation function in `fps_projection` of `SkyReelsV2Transformer3DModel` - Changed activation function from `silu` to `linear-silu` in the `fps_projection` layer for improved performance and integration. * Add fps_projection layer renaming in convert_skyreelsv2_to_diffusers.py - Updated key mappings for the `fps_projection` layer to align with new naming conventions, ensuring consistency in model integration. * Fix fps_projection assignment in SkyReelsV2Transformer3DModel - Corrected the assignment of the `fps_projection` layer to ensure it is properly cast to the appropriate data type, enhancing model functionality. * Update _keep_in_fp32_modules in SkyReelsV2Transformer3DModel - Added `fps_projection` to the list of modules that should remain in FP32 precision, ensuring proper handling of data types during model operations. * Remove integration test classes from SkyReelsV2 test files - Deleted the `SkyReelsV2DiffusionForcingPipelineIntegrationTests` and `SkyReelsV2PipelineIntegrationTests` classes along with their associated setup, teardown, and test methods, as they were not implemented and not needed for current testing. 
* style * Refactor: Remove hardcoded `torch.bfloat16` cast in attention * Refactor: Simplify data type handling in transformer model Removes unnecessary data type conversions for the FPS embedding and timestep projection. This change simplifies the forward pass by relying on the inherent data types of the tensors. * Refactor: Remove `fps_projection` from `_keep_in_fp32_modules` in `SkyReelsV2Transformer3DModel` * Update src/diffusers/models/transformers/transformer_skyreels_v2.py Co-authored-by: Aryan <contact.aryanvs@gmail.com>
* Refactor: Remove unused flags and simplify attention mask handling in SkyReelsV2AttnProcessor2_0 and SkyReelsV2Transformer3DModel Refactor: Simplify causal attention logic in SkyReelsV2 Removes the `flag_causal_attention` and `_flag_ar_attention` flags to simplify the implementation. The decision to apply a causal attention mask is now based directly on the `num_frame_per_block` configuration, eliminating redundant flags and conditional checks. This streamlines the attention mechanism and simplifies the `set_ar_attention` methods. * Refactor: Clarify variable names for latent frames Renames `base_num_frames` to `base_latent_num_frames` to make it explicit that the variable refers to the number of frames in the latent space. This change improves code readability and reduces potential confusion between latent frames and decoded video frames. The `num_frames` parameter in `generate_timestep_matrix` is also renamed to `num_latent_frames` for consistency. * Enhance documentation: Add detailed docstring for timestep matrix generation in SkyReelsV2DiffusionForcingPipeline * Docs: Clarify long video chunking in pipeline docstring Improves the explanation of long video processing within the pipeline's docstring. The update replaces the abstract description with a concrete example, illustrating how the sliding window mechanism works with overlapping chunks. This makes the roles of `base_num_frames` and `overlap_history` clearer for users.
* Docs: Move visual demonstration and processing details for SkyReelsV2DiffusionForcingPipeline to docs page from the code * Docs: Update asynchronous processing timeline and examples for long video handling in SkyReels-V2 documentation * Enhance timestep matrix generation documentation and logic for synchronous/asynchronous video processing * Update timestep matrix documentation and enhance analysis for clarity in SkyReelsV2DiffusionForcingPipeline * Docs: Update visual demonstration section and add detailed step matrix construction example for asynchronous processing in SkyReelsV2DiffusionForcingPipeline * style * fix-copies * Refactor parameter names for clarity in SkyReelsV2DiffusionForcingImageToVideoPipeline and SkyReelsV2DiffusionForcingVideoToVideoPipeline * Refactor: Avoid VAE roundtrip in long video generation Improves performance and quality for long video generation by operating entirely in latent space during the iterative generation process. Instead of decoding latents to video and then re-encoding the overlapping section for the next chunk, this change passes the generated latents directly between iterations. This avoids a computationally expensive and potentially lossy VAE decode/encode cycle within the loop. The full video is now decoded only once from the accumulated latents at the end of the process. * Refactor: Rename prefix_video_latents_length to prefix_video_latents_frames for clarity * Refactor: Rename num_latent_frames to current_num_latent_frames for clarity in SkyReelsV2DiffusionForcingImageToVideoPipeline * Refactor: Enhance long video generation logic and improve latent handling in SkyReelsV2DiffusionForcingImageToVideoPipeline Refactor: Unify video generation and pass latents directly Unifies the separate code paths for short and long video generation into a single, streamlined loop. This change eliminates the inefficient decode-encode cycle during long video generation. 
Instead of converting latents to pixel-space video between chunks, the pipeline now passes the generated latents directly to the next iteration. This improves performance, avoids potential quality loss from intermediate VAE steps, and enhances code maintainability by removing significant duplication. * style * Refactor: Remove overlap_history parameter and streamline long video generation logic in SkyReelsV2DiffusionForcingImageToVideoPipeline Refactor: Streamline long video generation logic Removes the `overlap_history` parameter and simplifies the conditioning process for long video generation. This change avoids a redundant VAE encoding step by directly using latent frames from the previous chunk for conditioning. It also moves image preprocessing outside the main generation loop to prevent repeated computations and clarifies the handling of prefix latents. * style * Refactor latent handling in i2v diffusion forcing pipeline Improves the latent conditioning and accumulation logic within the image-to-video diffusion forcing loop. - Corrects the splitting of the initial conditioning tensor to robustly handle both even and odd lengths. - Simplifies how latents are accumulated across iterations for long video generation. - Ensures the final latents are trimmed correctly before decoding only when a `last_image` is provided. * Refactor: Remove overlap_history parameter from SkyReelsV2DiffusionForcingImageToVideoPipeline * Refactor: Adjust video_latents parameter handling in prepare_latents method * style * Refactor: Update long video iteration print statements for clarity * Fix: Update transformer config with dynamic causal block size Updates the SkyReelsV2 pipelines to correctly set the `causal_block_size` in the transformer's configuration when it's provided during a pipeline call. This ensures the model configuration reflects the user's specified setting for the inference run. 
The `set_ar_attention` method is also renamed to `_set_ar_attention` to mark it as an internal helper. * style * Refactor: Adjust video input size and expected output shape in inference test * Refactor: Rename video variables for clarity in SkyReelsV2DiffusionForcingVideoToVideoPipeline * Docs: Clarify time embedding logic in SkyReelsV2 Adds comments to explain the handling of different time embedding tensor dimensions. A 2D tensor is used for standard models with a single time embedding per batch, while a 3D tensor is used for Diffusion Forcing models where each frame has its own time embedding. This clarifies the expected input for different model variations. * Docs: Update SkyReels V2 pipeline examples Updates the docstring examples for the SkyReels V2 pipelines to reflect current best practices and API changes. - Removes the `shift` parameter from pipeline call examples, as it is now configured directly on the scheduler. - Replaces the `set_ar_attention` method call with the `causal_block_size` argument in the pipeline call for diffusion forcing examples. - Adjusts recommended parameters for I2V and V2V examples, including inference steps, guidance scale, and `ar_step`. * Refactor: Remove `shift` parameter from SkyReelsV2 pipelines Removes the `shift` parameter from the call signature of all SkyReelsV2 pipelines. This parameter is a scheduler-specific configuration and should be set directly on the scheduler during its initialization, rather than being passed at runtime through the pipeline. This change simplifies the pipeline API. Usage examples are updated to reflect that the `shift` value should now be passed when creating the `FlowMatchUniPCMultistepScheduler`. * Refactors SkyReelsV2 image-to-video tests and adds last image case Simplifies the test suite by removing a duplicated test class and streamlining the dummy component and input generation. Adds a new test to verify the pipeline's behavior when a `last_image` is provided as input for conditioning. 
* test: Add image components to SkyReelsV2 pipeline test Adds the `image_encoder` and `image_processor` to the test components for the image-to-video pipeline. Also replaces a hardcoded value for the positional embedding sequence length with a more descriptive calculation, improving clarity. * test: Add callback configuration test for SkyReelsV2DiffusionForcingVideoToVideoPipeline test: Add callback test for SkyReelsV2DFV2V pipeline Adds a test to validate the callback functionality for the `SkyReelsV2DiffusionForcingVideoToVideoPipeline`. This test confirms that `callback_on_step_end` is invoked correctly and can modify the pipeline's state during inference. It uses a callback to dynamically increase the `guidance_scale` and asserts that the final value is as expected. The implementation correctly accounts for the nested denoising loops present in diffusion forcing pipelines. * style * fix: Update image_encoder type to CLIPVisionModelWithProjection in SkyReelsV2ImageToVideoPipeline * UP * Add conversion support for SkyReels-V2-FLF2V models Adds configurations for three new FLF2V model variants (1.3B-540P, 14B-540P, and 14B-720P) to the conversion script. This change also introduces specific handling to zero out the image positional embeddings for these models and updates the main script to correctly initialize the image-to-video pipeline. * Docs: Update and simplify SkyReels V2 usage examples Simplifies the text-to-video example by removing the manual group offloading configuration, making it more straightforward. Adds comments to pipeline parameters to clarify their purpose and provides guidance for different resolutions and long video generation. Introduces a new section with a code example for the video-to-video pipeline. * style * docs: Add SkyReels-V2 FLF2V 1.3B model to supported models list * docs: Update SkyReels-V2 documentation * Move the initialization of the `gradient_checkpointing` attribute to its suggested location. 
* Refactor: Use logger for long video progress messages Replaces `print()` calls with `logger.debug()` for reporting progress during long video generation in SkyReelsV2DF pipelines. This change reduces console output verbosity for standard runs while allowing developers to view progress by enabling debug-level logging. * Refactor SkyReelsV2 timestep embedding into a module Extract the sinusoidal timestep embedding logic into a new `SkyReelsV2Timesteps` `nn.Module`. This change encapsulates the embedding generation, which simplifies the `SkyReelsV2TimeTextImageEmbedding` class and improves code modularity. * Fix: Preserve original shape in timestep embeddings Reshapes the timestep embedding tensor to match the original input shape. This ensures that batched timestep inputs retain their batch dimension after embedding, preventing potential shape mismatches. * style * Refactor: Move SkyReelsV2Timesteps to model file Colocates the `SkyReelsV2Timesteps` class with the SkyReelsV2 transformer model. This change moves model-specific timestep embedding logic from the general embeddings module to the transformer's own file, improving modularity and making the model more self-contained. * Refactor parameter dtype retrieval to use utility function Replaces manual parameter iteration with the `get_parameter_dtype` helper to determine the time embedder's data type. This change improves code readability and centralizes the logic. * Add comments to track the tensor shape transformations * Add copied froms * style * fix-copies * up * Remove FlowMatchUniPCMultistepScheduler Deletes the `FlowMatchUniPCMultistepScheduler` as it is no longer being used. * Refactor: Replace FlowMatchUniPC scheduler with UniPC Removes the `FlowMatchUniPCMultistepScheduler` and integrates its functionality into the existing `UniPCMultistepScheduler`. 
This consolidation is achieved by using the `use_flow_sigmas=True` parameter in `UniPCMultistepScheduler`, simplifying the scheduler API and reducing code duplication. All usages, documentation, and tests are updated accordingly. * style * Remove text_encoder parameter from SkyReelsV2DiffusionForcingPipeline initialization * Docs: Rename `pipe` to `pipeline` in SkyReels examples Updates the variable name from `pipe` to `pipeline` across all SkyReels V2 documentation examples. This change improves clarity and consistency. * Fix: Rename shift parameter to flow_shift in SkyReels-V2 examples * Fix: Rename shift parameter to flow_shift in example documentation across SkyReels-V2 files * Fix: Rename shift parameter to flow_shift in UniPCMultistepScheduler initialization across SkyReels test files * Removes unused generator argument from scheduler step The `generator` parameter is not used by the scheduler's `step` method within the SkyReelsV2 diffusion forcing pipelines. This change removes the unnecessary argument from the method call for code clarity and consistency. * Fix: Update time_embedder_dtype assignment to use the first parameter's dtype in SkyReelsV2TimeTextImageEmbedding * style * Refactor: Use get_parameter_dtype utility function Replaces manual parameter iteration with the `get_parameter_dtype` helper. * Fix: Prevent (potential) error in parameter dtype check Adds a check to ensure the `_keep_in_fp32_modules` attribute exists on a parameter before it is accessed. This prevents a potential `AttributeError`, making the utility function more robust when used with models that do not define this attribute. --------- Co-authored-by:
YiYi Xu <yixu310@gmail.com> Co-authored-by:
Aryan <contact.aryanvs@gmail.com>
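The "Preserve original shape in timestep embeddings" item above describes flattening the timesteps, embedding them, and reshaping the result so that batched inputs keep their batch dimension. A minimal pure-Python sketch of that idea follows; `sinusoidal_embedding` and `embed_timesteps` are hypothetical names, and the cos-then-sin layout and `max_period` value are assumptions, not the actual `SkyReelsV2Timesteps` implementation.

```python
import math


def sinusoidal_embedding(t, dim, max_period=10000.0):
    """Embed one scalar timestep into `dim` values (assumed cos-then-sin layout)."""
    half = dim // 2
    freqs = [math.exp(-math.log(max_period) * i / half) for i in range(half)]
    return [math.cos(t * f) for f in freqs] + [math.sin(t * f) for f in freqs]


def embed_timesteps(timesteps, dim):
    """Embed a [batch][frames] grid of timesteps and return [batch][frames][dim]."""
    batch, frames = len(timesteps), len(timesteps[0])
    # Flatten so the embedding function only ever sees scalars.
    flat = [t for row in timesteps for t in row]
    flat_emb = [sinusoidal_embedding(t, dim) for t in flat]
    # Restore the original (batch, frames) layout instead of returning the flat list.
    return [flat_emb[b * frames:(b + 1) * frames] for b in range(batch)]


emb = embed_timesteps([[0.0, 1.0, 2.0], [3.0, 4.0, 5.0]], dim=8)
print(len(emb), len(emb[0]), len(emb[0][0]))
```

Without the final regrouping step, a caller expecting a per-sample embedding would receive a flat list of `batch * frames` vectors, which is the kind of shape mismatch the commit message mentions.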
-
- 02 Jul, 2025 1 commit
-
-
Linoy Tsaban authored
* initial commit * initial commit * initial commit * fix import * fix prefix * remove print * Apply style fixes --------- Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
-
- 19 Jun, 2025 1 commit
-
-
Aryan authored
update
-
- 16 Jun, 2025 1 commit
-
-
Sayak Paul authored
* fix flux lora loader when return_metadata is true for non-diffusers * remove annotation
-
- 13 Jun, 2025 1 commit
-
-
Sayak Paul authored
* feat: parse metadata from lora state dicts. * tests * fix tests * key renaming * fix * smol update * smol updates * load metadata. * automatically save metadata in save_lora_adapter. * propagate changes. * changes * add test to models too. * tighter tests. * updates * fixes * rename tests. * sorted. * Update src/diffusers/loaders/lora_base.py Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com> * review suggestions. * removeprefix. * propagate changes. * fix-copies * sd * docs. * fixes * get review ready. * one more test to catch error. * change to a different approach. * fix-copies. * todo * sd3 * update * revert changes in get_peft_kwargs. * update * fixes * fixes * simplify _load_sft_state_dict_metadata * update * style fix * update * update * update * empty commit * _pack_dict_with_prefix * update * TODO 1. * todo: 2. * todo: 3. * update * update * Apply suggestions from code review Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com> * reraise. * move argument. --------- Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com> Co-authored-by: Linoy Tsaban <57615435+linoytsaban@users.noreply.github.com>
-
- 11 Jun, 2025 1 commit
-
-
Sayak Paul authored
support Flux Control LoRA with bnb 8bit.
-
- 22 May, 2025 1 commit
-
-
Sayak Paul authored
* fix peft delete adapters for flux. * add test * empty commit
-
- 20 May, 2025 1 commit
-
-
Linoy Tsaban authored
* testing * testing * testing * testing * testing * i2v * i2v * device fix * testing * fix * fix * fix * fix * fix * Apply style fixes * empty commit --------- Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
-
- 19 May, 2025 1 commit
-
-
Linoy Tsaban authored
* support non diffusers loras for ltxv * Update src/diffusers/loaders/lora_conversion_utils.py Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> * Update src/diffusers/loaders/lora_pipeline.py Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> * Apply style fixes * empty commit --------- Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
-
- 09 May, 2025 1 commit
-
-
Sayak Paul authored
* support non-diffusers hidream loras * make fix-copies
-
- 06 May, 2025 1 commit
-
-
Sayak Paul authored
* use removeprefix to preserve sanity. * f-string.
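The commit above switches to `str.removeprefix` (Python 3.9+). A short sketch of why that is the safer choice when stripping a component prefix from state-dict keys: `str.replace` removes every occurrence of the substring, while `removeprefix` only touches the start of the string. The key names and the `strip_prefix` helper below are hypothetical, not taken from the diffusers source.

```python
def strip_prefix(state_dict, prefix):
    """Keep only keys that start with `prefix` and drop that leading prefix."""
    return {
        key.removeprefix(prefix): value
        for key, value in state_dict.items()
        if key.startswith(prefix)
    }


# Hypothetical key in which the prefix text also appears later in the name.
key = "transformer.blocks.0.transformer.proj.lora_A.weight"
prefix = "transformer."

print(key.replace(prefix, ""))    # removes both occurrences
print(key.removeprefix(prefix))   # removes only the leading one

state_dict = {key: 1, "text_encoder.layer.lora_A.weight": 2}
print(strip_prefix(state_dict, prefix))
```

With `replace`, the inner `transformer.` segment is silently deleted as well, which would produce a key that no longer matches any module.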
-
- 01 May, 2025 1 commit
-
-
co63oc authored
* Fix typos in docs and comments * Apply style fixes --------- Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
-
- 28 Apr, 2025 1 commit
-
-
Yao Matrix authored
* enable gguf test cases on XPU Signed-off-by: YAO Matrix <matrix.yao@intel.com> * make SD35LargeGGUFSingleFileTests::test_pipeline_inference pass Signed-off-by: root <root@a4bf01945cfe.jf.intel.com> * make FluxControlLoRAGGUFTests::test_lora_loading pass Signed-off-by: Yao Matrix <matrix.yao@intel.com> * polish code Signed-off-by: Yao Matrix <matrix.yao@intel.com> * Apply style fixes --------- Signed-off-by: YAO Matrix <matrix.yao@intel.com> Signed-off-by: root <root@a4bf01945cfe.jf.intel.com> Signed-off-by: Yao Matrix <matrix.yao@intel.com> Co-authored-by: root <root@a4bf01945cfe.jf.intel.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
-
- 22 Apr, 2025 1 commit
-
-
Linoy Tsaban authored
* initial commit * initial commit * initial commit * initial commit * initial commit * initial commit * Update examples/dreambooth/train_dreambooth_lora_hidream.py Co-authored-by:
Bagheera <59658056+bghira@users.noreply.github.com> * move prompt embeds, pooled embeds outside * Update examples/dreambooth/train_dreambooth_lora_hidream.py Co-authored-by:
hlky <hlky@hlky.ac> * Update examples/dreambooth/train_dreambooth_lora_hidream.py Co-authored-by:
hlky <hlky@hlky.ac> * fix import * fix import and tokenizer 4, text encoder 4 loading * te * prompt embeds * fix naming * shapes * initial commit to add HiDreamImageLoraLoaderMixin * fix init * add tests * loader * fix model input * add code example to readme * fix default max length of text encoders * prints * nullify training cond in unpatchify for temp fix to incompatible shaping of transformer output during training * smol fix * unpatchify * unpatchify * fix validation * flip pred and loss * fix shift!!! * revert unpatchify changes (for now) * smol fix * Apply style fixes * workaround moe training * workaround moe training * remove prints * to reduce some memory, keep vae in `weight_dtype` same as we have for flux (as it's the same vae) https://github.com/huggingface/diffusers/blob/bbd0c161b55ba2234304f1e6325832dd69c60565/examples/dreambooth/train_dreambooth_lora_flux.py#L1207 * refactor to align with HiDream refactor * refactor to align with HiDream refactor * refactor to align with HiDream refactor * add support for cpu offloading of text encoders * Apply style fixes * adjust lr and rank for train example * fix copies * Apply style fixes * update README * update README * update README * fix license * keep prompt2,3,4 as None in validation * remove reverse ode comment * Update examples/dreambooth/train_dreambooth_lora_hidream.py Co-authored-by:
Sayak Paul <spsayakpaul@gmail.com> * Update examples/dreambooth/train_dreambooth_lora_hidream.py Co-authored-by:
Sayak Paul <spsayakpaul@gmail.com> * vae offload change * fix text encoder offloading * Apply style fixes * cleaner to_kwargs * fix module name in copied from * add requirements * fix offloading * fix offloading * fix offloading * update transformers version in reqs * try AutoTokenizer * try AutoTokenizer * Apply style fixes * empty commit * Delete tests/lora/test_lora_layers_hidream.py * change tokenizer_4 to load with AutoTokenizer as well * make text_encoder_four and tokenizer_four configurable * save model card * save model card * revert T5 * fix test * remove non diffusers lumina2 conversion --------- Co-authored-by:
Bagheera <59658056+bghira@users.noreply.github.com> Co-authored-by:
hlky <hlky@hlky.ac> Co-authored-by:
Sayak Paul <spsayakpaul@gmail.com> Co-authored-by:
github-actions[bot] <github-actions[bot]@users.noreply.github.com>
-
- 17 Apr, 2025 1 commit
-
-
Sayak Paul authored
* propagate hotswap to other load_lora_weights() methods. * simplify documentations. * updates * propagate to load_lora_into_text_encoder. * empty commit
-
- 15 Apr, 2025 1 commit
-
-
Hameer Abbasi authored
* Add AuraFlowLoraLoaderMixin * Add comments, remove qkv fusion * Add Tests * Add AuraFlowLoraLoaderMixin to documentation * Add Suggested changes * Change attention_kwargs->joint_attention_kwargs * Rebasing derp. * fix * fix * Quality fixes. * make style * `make fix-copies` * `ruff check --fix` * Attempt 1 to fix tests. * Attempt 2 to fix tests. * Attempt 3 to fix tests. * Address review comments. * Rebasing derp. * Get more tests passing by copying from Flux. Address review comments. * `joint_attention_kwargs`->`attention_kwargs` * Add `lora_scale` property for te LoRAs. * Make test better. * Remove useless property. * Skip TE-only tests for AuraFlow. * Support LoRA for non-CLIP TEs. * Restore LoRA tests. * Undo adding LoRA support for non-CLIP TEs. * Undo support for TE in AuraFlow LoRA. * `make fix-copies` * Sync with upstream changes. * Remove unneeded stuff. * Mirror `Lumina2`. * Skip for MPS. * Address review comments. * Remove duplicated code. * Remove unnecessary code. * Remove repeated docs. * Propagate attention. * Fix TE target modules. * MPS fix for LoRA tests. * Unrelated TE LoRA tests fix. * Fix AuraFlow LoRA tests by applying to the right denoiser layers. Co-authored-by: AstraliteHeart <81396681+AstraliteHeart@users.noreply.github.com> * Apply style fixes * empty commit * Fix the repo consistency issues. * Remove unrelated changes. * Style. * Fix `test_lora_fuse_nan`. * fix quality issues. * `pytest.xfail` -> `ValueError`. * Add back `skip_mps`. * Apply style fixes * `make fix-copies` --------- Co-authored-by: Warlord-K <warlordk28@gmail.com> Co-authored-by: hlky <hlky@hlky.ac> Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> Co-authored-by: AstraliteHeart <81396681+AstraliteHeart@users.noreply.github.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
-
- 10 Apr, 2025 1 commit
-
-
Sayak Paul authored
* support musubi wan loras. * Update src/diffusers/loaders/lora_conversion_utils.py Co-authored-by: hlky <hlky@hlky.ac> * support i2v loras from musubi too. --------- Co-authored-by: hlky <hlky@hlky.ac>
-
- 08 Apr, 2025 2 commits
-
-
hlky authored
* Flux quantized with lora * fix * changes * Apply suggestions from code review Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> * Apply style fixes * enable model cpu offload() * Update src/diffusers/loaders/lora_pipeline.py Co-authored-by: hlky <hlky@hlky.ac> * update * Apply suggestions from code review * update * add peft as an additional dependency for gguf --------- Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
-
Benjamin Bossan authored
* [WIP][LoRA] Implement hot-swapping of LoRA This PR adds the possibility to hot-swap LoRA adapters. It is WIP. Description As of now, users can already load multiple LoRA adapters. They can offload existing adapters or they can unload them (i.e. delete them). However, they cannot "hotswap" adapters yet, i.e. substitute the weights from one LoRA adapter with the weights of another, without the need to create a separate LoRA adapter. Generally, hot-swapping may not appear super useful, but when the model is compiled, it is necessary to prevent recompilation. See #9279 for more context. Caveats To hot-swap a LoRA adapter for another, these two adapters should target exactly the same layers and the "hyper-parameters" of the two adapters should be identical. For instance, the LoRA alpha has to be the same: Given that we keep the alpha from the first adapter, the LoRA scaling would be incorrect for the second adapter otherwise. Theoretically, we could override the scaling dict with the alpha values derived from the second adapter's config, but changing the dict will trigger a guard for recompilation, defeating the main purpose of the feature. I also found that compilation flags can have an impact on whether this works or not. E.g. when passing "reduce-overhead", there will be errors of the type: > input name: arg861_1. data pointer changed from 139647332027392 to 139647331054592 I don't know enough about compilation to determine whether this is problematic or not. Current state This is obviously WIP right now to collect feedback and discuss which direction to take this. If this PR turns out to be useful, the hot-swapping functions will be added to PEFT itself and can be imported here (or there is a separate copy in diffusers to avoid the need for a min PEFT version to use this feature). 
Moreover, more tests need to be added to better cover this feature, although we don't necessarily need tests for the hot-swapping functionality itself, since those tests will be added to PEFT. Furthermore, as of now, this is only implemented for the unet. Other pipeline components have yet to implement this feature. Finally, it should be properly documented. I would like to collect feedback on the current state of the PR before putting more time into finalizing it. * Reviewer feedback * Reviewer feedback, adjust test * Fix, doc * Make fix * Fix for possible g++ error * Add test for recompilation w/o hotswapping * Make hotswap work Requires https://github.com/huggingface/peft/pull/2366 More changes to make hotswapping work. Together with the mentioned PEFT PR, the tests pass for me locally. List of changes: - docstring for hotswap - remove code copied from PEFT, import from PEFT now - adjustments to PeftAdapterMixin.load_lora_adapter (unfortunately, some state dict renaming was necessary, LMK if there is a better solution) - adjustments to UNet2DConditionLoadersMixin._process_lora: LMK if this is even necessary or not, I'm unsure what the overall relationship is between this and PeftAdapterMixin.load_lora_adapter - also in UNet2DConditionLoadersMixin._process_lora, I saw that there is no LoRA unloading when loading the adapter fails, so I added it there (in line with what happens in PeftAdapterMixin.load_lora_adapter) - rewritten tests to avoid shelling out, make the test more precise by making sure that the outputs align, parametrize it - also checked the pipeline code mentioned in this comment: https://github.com/huggingface/diffusers/pull/9453#issuecomment-2418508871; when running this inside the with torch._dynamo.config.patch(error_on_recompile=True) context, there is no error, so I think hotswapping is now working with pipelines. 
* Address reviewer feedback: - Revert deprecated method - Fix PEFT doc link to main - Don't use private function - Clarify magic numbers - Add pipeline test Moreover: - Extend docstrings - Extend existing test for outputs != 0 - Extend existing test for wrong adapter name * Change order of test decorators parameterized.expand seems to ignore skip decorators if added in last place (i.e. innermost decorator). * Split model and pipeline tests Also increase test coverage by also targeting conv2d layers (support of which was added recently on the PEFT PR). * Reviewer feedback: Move decorator to test classes ... instead of having them on each test method. * Apply suggestions from code review Co-authored-by:
hlky <hlky@hlky.ac> * Reviewer feedback: version check, TODO comment * Add enable_lora_hotswap method * Reviewer feedback: check _lora_loadable_modules * Revert changes in unet.py * Add possibility to ignore enabled at wrong time * Fix docstrings * Log possible PEFT error, test * Raise helpful error if hotswap not supported I.e. for the text encoder * Formatting * More linter * More ruff * Doc-builder complaint * Update docstring: - mention no text encoder support yet - make it clear that LoRA is meant - mention that same adapter name should be passed * Fix error in docstring * Update more methods with hotswap argument - SDXL - SD3 - Flux No changes were made to load_lora_into_transformer. * Add hotswap argument to load_lora_into_transformer For SD3 and Flux. Use shorter docstring for brevity. * Extend docstrings * Add version guards to tests * Formatting * Fix LoRA loading call to add prefix=None See: https://github.com/huggingface/diffusers/pull/10187#issuecomment-2717571064 * Run make fix-copies * Add hot swap documentation to the docs * Apply suggestions from code review Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> --------- Co-authored-by:
Sayak Paul <spsayakpaul@gmail.com> Co-authored-by:
hlky <hlky@hlky.ac> Co-authored-by:
YiYi Xu <yixu310@gmail.com> Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com>
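The caveats in the hot-swapping description above (same target layers, identical hyper-parameters such as the LoRA alpha) can be sketched as a small compatibility check. This is an illustration only: `AdapterConfig` and `hotswap_problems` are hypothetical names, not PEFT or diffusers APIs, and treating the rank as one of the hyper-parameters that must match is an assumption drawn from the "should be identical" wording of the WIP description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AdapterConfig:
    """Hypothetical stand-in for a LoRA adapter config (not a PEFT class)."""
    target_modules: frozenset
    rank: int
    alpha: float

    @property
    def scaling(self):
        # Standard LoRA scaling factor applied to the low-rank update.
        return self.alpha / self.rank


def hotswap_problems(loaded, incoming):
    """Return reasons why `incoming` cannot replace `loaded` in place."""
    problems = []
    if loaded.target_modules != incoming.target_modules:
        diff = sorted(loaded.target_modules ^ incoming.target_modules)
        problems.append(f"target modules differ: {diff}")
    if loaded.rank != incoming.rank:
        problems.append(f"rank differs: {loaded.rank} vs {incoming.rank}")
    if loaded.alpha != incoming.alpha:
        # The scaling of the first adapter is kept, so the second adapter
        # would be applied with the wrong scaling.
        problems.append(
            f"alpha differs: scaling {loaded.scaling} would be kept "
            f"instead of {incoming.scaling}"
        )
    return problems


first = AdapterConfig(frozenset({"to_q", "to_k"}), rank=8, alpha=8.0)
second = AdapterConfig(frozenset({"to_q", "to_k"}), rank=8, alpha=16.0)
print(hotswap_problems(first, second))
```

An empty list means the swap only needs to overwrite weight values, which is what lets a compiled model avoid recompilation according to the description above.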
-
- 19 Mar, 2025 1 commit
-
-
Linoy Tsaban authored
* @hlky t2v->i2v * Apply style fixes * try with ones to not nullify layers * fix method name * revert to zeros * add check to state_dict keys * add comment * copies fix * Revert "copies fix" This reverts commit 051f534d185c0ea065bf36a9926c4b48f496d429. * remove copied from * Update src/diffusers/loaders/lora_pipeline.py Co-authored-by: hlky <hlky@hlky.ac> * Update src/diffusers/loaders/lora_pipeline.py Co-authored-by: hlky <hlky@hlky.ac> * update * update * Update src/diffusers/loaders/lora_pipeline.py Co-authored-by: hlky <hlky@hlky.ac> * Apply style fixes --------- Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: Linoy <linoy@hf.co> Co-authored-by: hlky <hlky@hlky.ac>
-
- 11 Mar, 2025 2 commits
-
-
CyberVy authored
* Update lora_pipeline.py * Apply style fixes * fix-copies --------- Co-authored-by: hlky <hlky@hlky.ac> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
-
Sayak Paul authored
* support wan i2v loras from the world. * remove copied from. * updates * add lora.
-
- 10 Mar, 2025 2 commits
-
-
Aryan authored
* update * make fix-copies * update
-
Sayak Paul authored
* updates * updates * updates * updates * notebooks revert * fix-copies. * seeing * fix * revert * fixes * fixes * fixes * remove print * fix * conflicts ii. * updates * fixes * better filtering of prefix. --------- Co-authored-by: hlky <hlky@hlky.ac>
-
- 08 Mar, 2025 1 commit
-
-
Sayak Paul authored
* more sanity of mind with copied from ... * better * better
-
- 04 Mar, 2025 2 commits
-
-
Aryan authored
* update * refactor image-to-video pipeline * update * fix copied from * use FP32LayerNorm
-
Sayak Paul authored
* feat: support non-diffusers lumina2 LoRAs. * revert ipynb changes (but I don't know why this is required ☹️) * empty --------- Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com> Co-authored-by: YiYi Xu <yixu310@gmail.com>
-
- 20 Feb, 2025 1 commit
-
-
Sayak Paul authored
* feat: lora support for Lumina2. * fix-copies. * updates * updates * docs. * fix * add: training script. * tests * updates * updates * major updates. * updates * fixes * docs. * updates * updates
-
- 15 Jan, 2025 1 commit
-
-
Sayak Paul authored
* feat: support loading loras into 4bit quantized models. * updates * update * remove weight check.
-
- 09 Jan, 2025 1 commit
-
-
Sayak Paul authored
* factor out text encoder loading. * make fix-copies * remove copied from fuse_lora and unfuse_lora as needed. * remove unused imports
-
- 07 Jan, 2025 1 commit
-
-
Aryan authored
* update * fix make copies * update * add relevant markers to the integration test suite. * add copied. * fix-copies * temporarily add print. * directly place on CUDA as CPU isn't that big on the CIO. * fixes to fuse_lora, aryan was right. * fixes --------- Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
-
- 06 Jan, 2025 1 commit
-
-
Sayak Paul authored
* fix: lora unloading when using expanded Flux LoRAs. * fix argument name. Co-authored-by: a-r-r-o-w <contact.aryanvs@gmail.com> * docs. --------- Co-authored-by: a-r-r-o-w <contact.aryanvs@gmail.com>
-
- 02 Jan, 2025 1 commit
-
-
maxs-kan authored
* check for base_layer key in transformer state dict * test_lora_expansion_works_for_absent_keys * check * Update tests/lora/test_lora_layers_flux.py Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> * check * test_lora_expansion_works_for_absent_keys/test_lora_expansion_works_for_extra_keys * absent->extra --------- Co-authored-by: hlky <hlky@hlky.ac> Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
-
- 25 Dec, 2024 1 commit
-
-
Sayak Paul authored
* feat: support unload_lora_weights() for Flux Control. * tighten test * minor * updates * meta device fixes.
-