1. 01 Dec, 2025 1 commit
  2. 26 Nov, 2025 1 commit
  3. 14 Nov, 2025 1 commit
    • Improve docstrings and type hints in scheduling_euler_discrete.py (#12654) · 63dd6017
      David El Malih authored
      * refactor: enhance type hints and documentation in EulerDiscreteScheduler
      
      Updated type hints for function parameters and return types in the EulerDiscreteScheduler class to improve code clarity and maintainability. Enhanced docstrings for several methods to provide clearer descriptions of their functionality and expected arguments. This includes specifying Literal types for certain parameters and ensuring consistent return type annotations across the class.
      
      * refactor: enhance type hints and documentation across multiple schedulers
      
      Updated type hints and improved docstrings in various scheduler classes, including CMStochasticIterativeScheduler, CosineDPMSolverMultistepScheduler, and others. This includes specifying parameter types, return types, and providing clearer descriptions of method functionalities. Notable changes include the addition of default values in the begin_index argument and enhanced explanations for noise addition methods. These improvements aim to enhance code clarity and maintainability across the scheduling module.
      
      * refactor: update docstrings to clarify noise schedule construction
      
      Revised docstrings across multiple scheduler classes to enhance clarity regarding the construction of noise schedules. Updated references to relevant papers, ensuring accurate citations for the methodologies used. This includes changes in DEISMultistepScheduler, DPMSolverMultistepInverseScheduler, and others, improving documentation consistency and readability.
  4. 13 Nov, 2025 2 commits
    • Improve docstrings and type hints in scheduling_ddpm.py (#12651) · 3c1ca869
      David El Malih authored
      * Enhance type hints and docstrings in scheduling_ddpm.py
      
      - Added type hints for function parameters and return types across the DDPMScheduler class and related functions.
      - Improved docstrings for clarity, including detailed descriptions of parameters and return values.
      - Updated the alpha_transform_type and beta_schedule parameters to use Literal types for better type safety.
      - Refined the _get_variance and previous_timestep methods with comprehensive documentation.
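
As an illustration of the `Literal` change described above, here is a minimal sketch of how a string option can be narrowed with `typing.Literal`. The function `make_betas`, the alias `BetaSchedule`, and the two option names are assumptions made for this example; they are not copied from `scheduling_ddpm.py`.

```python
from typing import List, Literal, get_args

# Hypothetical alias: the set of accepted schedule names is an assumption.
BetaSchedule = Literal["linear", "scaled_linear"]


def make_betas(
    num_steps: int,
    beta_schedule: BetaSchedule = "linear",
    beta_start: float = 1e-4,
    beta_end: float = 0.02,
) -> List[float]:
    """Return `num_steps` beta values for the chosen schedule."""
    # Literal only helps static checkers, so validate at runtime as well.
    if beta_schedule not in get_args(BetaSchedule):
        raise ValueError(f"Unknown beta_schedule: {beta_schedule!r}")

    denom = max(num_steps - 1, 1)
    if beta_schedule == "linear":
        # Evenly spaced between beta_start and beta_end.
        return [beta_start + (beta_end - beta_start) * i / denom for i in range(num_steps)]

    # "scaled_linear": evenly spaced in sqrt-space, then squared.
    lo, hi = beta_start ** 0.5, beta_end ** 0.5
    return [(lo + (hi - lo) * i / denom) ** 2 for i in range(num_steps)]
```

A static type checker can flag a call such as `make_betas(10, "cosine")` before it runs, while the runtime check covers untyped callers.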
      
      * Refactor docstrings and type hints in scheduling_ddpm.py
      
      - Cleaned up whitespace in the rescale_zero_terminal_snr function.
      - Enhanced the variance_type parameter in the DDPMScheduler class with improved formatting for better readability.
      - Updated the docstring for the compute_variance method to maintain consistency and clarity in parameter descriptions and return values.
      
      * Apply `make fix-copies`
      
      * Refactor type hints across multiple scheduler files
      
      - Updated type hints to include `Literal` for improved type safety in various scheduling files.
      - Ensured consistency in type hinting for parameters and return types across the affected modules.
      - This change enhances code clarity and maintainability.
    • Improve docstrings and type hints in scheduling_ddim.py (#12622) · 6fe4a6ff
      David El Malih authored
      * Improve docstrings and type hints in scheduling_ddim.py
      
      - Add complete type hints for all function parameters
      - Enhance docstrings to follow project conventions
      - Add missing parameter descriptions
      
      Fixes #9567
      
      * Enhance docstrings and type hints in scheduling_ddim.py
      
      - Update parameter types and descriptions for clarity
      - Improve explanations in method docstrings to align with project standards
      - Add optional annotations for parameters where applicable
      
      * Refine type hints and docstrings in scheduling_ddim.py
      
      - Update parameter types to use Literal for specific string options
      - Enhance docstring descriptions for clarity and consistency
      - Ensure all parameters have appropriate type annotations and defaults
      
      * Apply review feedback on scheduling_ddim.py
      
      - Replace "prevent singularities" with "avoid numerical instability" for better clarity
      - Add backticks around `alpha_bar` variable name for consistent formatting
      - Convert Imagen Video paper URLs to Hugging Face papers references
      
      * Propagate changes using 'make fix-copies'
      
      * Add missing Literal
  5. 16 Jul, 2025 2 commits
    • Add SkyReels V2: Infinite-Length Film Generative Model (#11518) · 7298bdd8
      Tolga Cangöz authored
      
      
      * style
      
      * Fix class name casing for SkyReelsV2 components in multiple files to ensure consistency and correct functionality.
      
      * cleaning
      
      * cleansing
      
      * Refactor `get_timestep_embedding` to move modifications into `SkyReelsV2TimeTextImageEmbedding`.
      
      * Remove unnecessary line break in `get_timestep_embedding` function for cleaner code.
      
      * Remove `skyreels_v2` entry from `_import_structure` and update its initialization to directly assign the list of SkyReelsV2 components.
      
      * cleansing
      
      * Refactor attention processing in `SkyReelsV2AttnProcessor2_0` to always convert query, key, and value to `torch.bfloat16`, simplifying the code and improving clarity.
      
      * Enhance example usage in `pipeline_skyreels_v2_diffusion_forcing.py` by adding VAE initialization and detailed prompt for video generation, improving clarity and usability of the documentation.
      
      * Refactor import structure in `__init__.py` for SkyReelsV2 components and improve formatting in `pipeline_skyreels_v2_diffusion_forcing.py` to enhance code readability and maintainability.
      
      * Update `guidance_scale` parameter in `SkyReelsV2DiffusionForcingPipeline` from 5.0 to 6.0 to enhance video generation quality.
      
      * Update `guidance_scale` parameter in example documentation and class definition of `SkyReelsV2DiffusionForcingPipeline` to ensure consistency and improve video generation quality.
      
      * Update `causal_block_size` parameter in `SkyReelsV2DiffusionForcingPipeline` to default to `None`.
      
      * up
      
      * Fix dtype conversion for `timestep_proj` in `SkyReelsV2Transformer3DModel` to ensure correct tensor operations.
      
      * Optimize causal mask generation by replacing repeated tensor with `repeat_interleave` for improved efficiency in `SkyReelsV2Transformer3DModel`.
      
      * style
      
      * Enhance example documentation in `SkyReelsV2DiffusionForcingPipeline` with guidance scale and shift parameters for T2V and I2V. Remove unused `retrieve_latents` function to streamline the code.
      
      * Refactor sample scheduler creation in `SkyReelsV2DiffusionForcingPipeline` to use `deepcopy` for improved state management during inference steps.
      
      * Enhance error handling and documentation in `SkyReelsV2DiffusionForcingPipeline` for `overlap_history` and `addnoise_condition` parameters to improve long video generation guidance.
      
      * Update documentation and progress bar handling in `SkyReelsV2DiffusionForcingPipeline` to clarify asynchronous inference settings and improve progress tracking during denoising steps.
      
      * Refine progress bar calculation in `SkyReelsV2DiffusionForcingPipeline` by rounding the step size to one decimal place for improved readability during denoising steps.
      
      * Update import statements in `SkyReelsV2DiffusionForcingPipeline` documentation for improved clarity and organization.
      
      * Refactor progress bar handling in `SkyReelsV2DiffusionForcingPipeline` to use total steps instead of calculated step size.
      
      * update templates for i2v, v2v
      
      * Add `retrieve_latents` function to streamline latent retrieval in `SkyReelsV2DiffusionForcingPipeline`. Update video latent processing to utilize this new function for improved clarity and maintainability.
      
      * Add `retrieve_latents` function to both i2v and v2v pipelines for consistent latent retrieval. Update video latent processing to utilize this function, enhancing clarity and maintainability across the SkyReelsV2DiffusionForcingPipeline implementations.
      
      * Remove redundant ValueError for `overlap_history` in `SkyReelsV2DiffusionForcingPipeline` to streamline error handling and improve user guidance for long video generation.
      
      * Update default video dimensions and flow matching scheduler parameter in `SkyReelsV2DiffusionForcingPipeline` to enhance video generation capabilities.
      
      * Refactor `SkyReelsV2DiffusionForcingPipeline` to support Image-to-Video (i2v) generation. Update class name, add image encoding functionality, and adjust parameters for improved video generation. Enhance error handling for image inputs and update documentation accordingly.
      
      * Improve organization for image-last_image condition.
      
      * Refactor `SkyReelsV2DiffusionForcingImageToVideoPipeline` to improve latent preparation and video condition handling integration.
      
      * style
      
      * style
      
      * Add example usage of PIL for image input in `SkyReelsV2DiffusionForcingImageToVideoPipeline` documentation.
      
      * Refactor `SkyReelsV2DiffusionForcingPipeline` to `SkyReelsV2DiffusionForcingVideoToVideoPipeline`, enhancing support for Video-to-Video (v2v) generation. Introduce video input handling, update latent preparation logic, and improve error handling for input parameters.
      
      * Refactor `SkyReelsV2DiffusionForcingImageToVideoPipeline` by removing the `image_encoder` and `image_processor` dependencies. Update the CPU offload sequence accordingly.
      
      * Refactor `SkyReelsV2DiffusionForcingImageToVideoPipeline` to enhance latent preparation logic and condition handling. Update image input type to `Optional`, streamline video condition processing, and improve handling of `last_image` during latent generation.
      
      * Enhance `SkyReelsV2DiffusionForcingPipeline` by refining latent preparation for long video generation. Introduce new parameters for video handling, overlap history, and causal block size. Update logic to accommodate both short and long video scenarios, ensuring compatibility and improved processing.
      
      * refactor
      
      * fix num_frames
      
      * fix prefix_video_latents
      
      * up
      
      * refactor
      
      * Fix typo in scheduler method call within `SkyReelsV2DiffusionForcingVideoToVideoPipeline` to ensure proper noise scaling during latent generation.
      
      * up
      
      * Enhance `SkyReelsV2DiffusionForcingImageToVideoPipeline` by adding support for `last_image` parameter and refining latent frame calculations. Update preprocessing logic.
      
      * add statistics
      
      * Refine latent frame handling in `SkyReelsV2DiffusionForcingImageToVideoPipeline` by correcting variable names and reintroducing latent mean and standard deviation calculations. Update logic for frame preparation and sampling to ensure accurate video generation.
      
      * up
      
      * refactor
      
      * up
      
      * Refactor `SkyReelsV2DiffusionForcingVideoToVideoPipeline` to improve latent handling by enforcing tensor input for video, updating frame preparation logic, and adjusting default frame count. Enhance preprocessing and postprocessing steps for better integration.
      
      * style
      
      * fix vae output indexing
      
      * upup
      
      * up
      
      * Fix tensor concatenation and repetition logic in `SkyReelsV2DiffusionForcingImageToVideoPipeline` to ensure correct dimensionality for video conditions and latent conditions.
      
      * Refactor latent retrieval logic in `SkyReelsV2DiffusionForcingVideoToVideoPipeline` to handle tensor dimensions more robustly, ensuring compatibility with both 3D and 4D video inputs.
      
      * Enhance logging in `SkyReelsV2DiffusionForcing` pipelines by adding iteration print statements for better debugging. Clean up unused code related to prefix video latents length calculation in `SkyReelsV2DiffusionForcingImageToVideoPipeline`.
      
      * Update latent handling in `SkyReelsV2DiffusionForcingImageToVideoPipeline` to conditionally set latents based on video iteration state, improving flexibility for video input processing.
      
      * Refactor `SkyReelsV2TimeTextImageEmbedding` to utilize `get_1d_sincos_pos_embed_from_grid` for timestep projection.
      
      * Enhance `get_1d_sincos_pos_embed_from_grid` function to include an optional parameter `flip_sin_to_cos` for flipping sine and cosine embeddings, improving flexibility in positional embedding generation.
      
      * Update timestep projection in `SkyReelsV2TimeTextImageEmbedding` to include `flip_sin_to_cos` parameter, enhancing the flexibility of time embedding generation.
      
      * Refactor tensor type handling in `SkyReelsV2AttnProcessor2_0` and `SkyReelsV2TransformerBlock` to ensure consistent use of `torch.float32` and `torch.bfloat16`, improving integration.
      
      * Update tensor type in `SkyReelsV2RotaryPosEmbed` to use `torch.float32` for frequency calculations, ensuring consistency in data types across the model.
      
      * Refactor `SkyReelsV2TimeTextImageEmbedding` to utilize automatic mixed precision for timestep projection.
      
      * down
      
      * down
      
      * style
      
      * Add debug tensor tracking to `SkyReelsV2Transformer3DModel` for enhanced debugging and output analysis; update `Transformer2DModelOutput` to include debug tensors.
      
      * up
      
      * Refactor indentation in `SkyReelsV2AttnProcessor2_0` to improve code readability and maintain consistency in style.
      
      * Convert query, key, and value tensors to bfloat16 in `SkyReelsV2AttnProcessor2_0` for improved performance.
      
      * Add debug print statements in `SkyReelsV2TransformerBlock` to track tensor shapes and values for improved debugging and analysis.
      
      * debug
      
      * debug
      
      * Remove commented-out debug tensor tracking from `SkyReelsV2TransformerBlock`
      
      * Add functionality to save processed video latents as a Safetensors file in `SkyReelsV2DiffusionForcingPipeline`.
      
      * up
      
      * Add functionality to save output latents as a Safetensors file in `SkyReelsV2DiffusionForcingPipeline`.
      
      * up
      
      * Remove additional commented-out debug tensor tracking from `SkyReelsV2TransformerBlock` and `SkyReelsV2Transformer3DModel` for cleaner code.
      
      * style
      
      * cleansing
      
      * Update example documentation and parameters in `SkyReelsV2Pipeline`. Adjusted example code for loading models, modified default values for height, width, num_frames, and guidance_scale, and improved output video quality settings.
      
      * Update shift parameter in example documentation and default values across SkyReels V2 pipelines. Adjusted shift values for I2V from 3.0 to 5.0 and updated related example code for consistency.
      
      * Update example documentation in SkyReels V2 pipelines to include available model options and update model references for loading. Adjusted model names to reflect the latest versions across I2V, V2V, and T2V pipelines.
      
      * Add test templates
      
      * style
      
      * Add docs template
      
      * Add SkyReels V2 Diffusion Forcing Video-to-Video Pipeline to imports
      
      * style
      
      * fix-copies
      
      * convert i2v 1.3b
      
      * Update transformer configuration to include `image_dim` for SkyReels V2 models and refactor imports to use `SkyReelsV2Transformer3DModel`.
      
      * Refactor transformer import in SkyReels V2 pipeline to use `SkyReelsV2Transformer3DModel` for consistency.
      
      * Update transformer configuration in SkyReels V2 to increase `in_channels` from 16 to 36 for i2v conf.
      
      * Update transformer configuration in SkyReels V2 to set `added_kv_proj_dim` values for different model types.
      
      * up
      
      * up
      
      * up
      
      * Add SkyReelsV2Pipeline support for T2V model type in conversion script
      
      * upp
      
      * Refactor model type checks in conversion script to use substring matching for improved flexibility
      
      * upp
      
      * Fix shard path formatting in conversion script to accommodate varying model types by dynamically adjusting zero padding.
      
      * Update sharded safetensors loading logic in conversion script to use substring matching for model directory checks
      
      * Update scheduler parameters in SkyReels V2 test files for consistency across image and video pipelines
      
      * Refactor conversion script to initialize text encoder, tokenizer, and scheduler for SkyReels pipelines, enhancing model integration
      
      * style
      
      * Update documentation for SkyReels-V2, introducing the Infinite-length Film Generative model, enhancing text-to-video generation examples, and updating model references throughout the API documentation.
      
      * Add SkyReelsV2Transformer3DModel and FlowMatchUniPCMultistepScheduler documentation, updating TOC and introducing new model and scheduler files.
      
      * style
      
      * Update documentation for SkyReelsV2DiffusionForcingPipeline to correct flow matching scheduler parameter for I2V from 3.0 to 5.0, ensuring clarity in usage examples.
      
      * Add documentation for causal_block_size parameter in SkyReelsV2DF pipelines, clarifying its role in asynchronous inference.
      
      * Simplify min_ar_step calculation in SkyReelsV2DiffusionForcingPipeline to improve clarity.
      
      * style and fix-copies
      
      * style
      
      * Add documentation for SkyReelsV2Transformer3DModel
      
      Introduced a new markdown file detailing the SkyReelsV2Transformer3DModel, including usage instructions and model output specifications.
      
      * Update test configurations for SkyReelsV2 pipelines
      
      - Adjusted `in_channels` from 36 to 16 in `test_skyreels_v2_df_image_to_video.py`.
      - Added new parameters: `overlap_history`, `num_frames`, and `base_num_frames` in `test_skyreels_v2_df_video_to_video.py`.
      - Updated expected output shape in video tests from (17, 3, 16, 16) to (41, 3, 16, 16).
      
      * Refines SkyReelsV2DF test parameters
      
      * Update src/diffusers/models/modeling_outputs.py
      Co-authored-by: Aryan <contact.aryanvs@gmail.com>
      
      * Refactor `grid_sizes` processing by using already-calculated post-patch parameters to simplify
      
      * Update docs/source/en/api/pipelines/skyreels_v2.md
      Co-authored-by: Aryan <contact.aryanvs@gmail.com>
      
      * Refactor parameter naming for diffusion forcing in SkyReelsV2 pipelines
      
      - Changed `flag_df` to `enable_diffusion_forcing` for clarity in the SkyReelsV2Transformer3DModel and associated pipelines.
      - Updated all relevant method calls to reflect the new parameter name.
      
      * Revert _toctree.yml to adjust section expansion states
      
      * style
      
      * Update docs/source/en/api/models/skyreels_v2_transformer_3d.md
      Co-authored-by: YiYi Xu <yixu310@gmail.com>
      
      * Add copying label to SkyReelsV2ImageEmbedding from WanImageEmbedding.
      
      * Refactor transformer block processing in SkyReelsV2Transformer3DModel
      
      - Ensured proper handling of hidden states during both gradient checkpointing and standard processing.
      
      * Update SkyReels V2 documentation to remove VRAM requirement and streamline imports
      
      - Removed the mention of ~13GB VRAM requirement for the SkyReels-V2 model.
      - Simplified import statements by removing unused `load_image` import.
      
      * Add SkyReelsV2LoraLoaderMixin for loading and managing LoRA layers in SkyReelsV2Transformer3DModel
      
      - Introduced SkyReelsV2LoraLoaderMixin class to handle loading, saving, and fusing of LoRA weights specific to the SkyReelsV2 model.
      - Implemented methods for state dict management, including compatibility checks for various LoRA formats.
      - Enhanced functionality for loading weights with options for low CPU memory usage and hotswapping.
      - Added detailed docstrings for clarity on parameters and usage.
      
      * Update SkyReelsV2 documentation and loader mixin references
      
      - Corrected the documentation to reference the new `SkyReelsV2LoraLoaderMixin` for loading LoRA weights.
      - Updated comments in the `SkyReelsV2LoraLoaderMixin` class to reflect changes in model references from `WanTransformer3DModel` to `SkyReelsV2Transformer3DModel`.
      
      * Enhance SkyReelsV2 integration by adding SkyReelsV2LoraLoaderMixin references
      
      - Added `SkyReelsV2LoraLoaderMixin` to the documentation and loader imports for improved LoRA weight management.
      - Updated multiple pipeline classes to inherit from `SkyReelsV2LoraLoaderMixin` instead of `WanLoraLoaderMixin`.
      
      * Update SkyReelsV2 model references in documentation
      
      - Replaced placeholder model paths with actual paths for SkyReels-V2 models in multiple pipeline files.
      - Ensured consistency across the documentation for loading models in the SkyReelsV2 pipelines.
      
      * style
      
      * fix-copies
      
      * Refactor `fps_projection` in `SkyReelsV2Transformer3DModel`
      
      - Replaced the sequential linear layers for `fps_projection` with a `FeedForward` layer using `SiLU` activation for better integration.
      
      * Update docs
      
      * Refactor video processing in SkyReelsV2DiffusionForcingPipeline
      
      - Renamed parameters for clarity: `video` to `video_latents` and `overlap_history` to `overlap_history_latent_frames`.
      - Updated logic for handling long video generation, including adjustments to latent frame calculations and accumulation.
      - Consolidated handling of latents for both long and short video generation scenarios.
      - Final decoding step now consistently converts latents to pixels, ensuring proper output format.
      
      * Update activation function in `fps_projection` of `SkyReelsV2Transformer3DModel`
      
      - Changed activation function from `silu` to `linear-silu` in the `fps_projection` layer for improved performance and integration.
      
      * Add fps_projection layer renaming in convert_skyreelsv2_to_diffusers.py
      
      - Updated key mappings for the `fps_projection` layer to align with new naming conventions, ensuring consistency in model integration.
      
      * Fix fps_projection assignment in SkyReelsV2Transformer3DModel
      
      - Corrected the assignment of the `fps_projection` layer to ensure it is properly cast to the appropriate data type, enhancing model functionality.
      
      * Update _keep_in_fp32_modules in SkyReelsV2Transformer3DModel
      
      - Added `fps_projection` to the list of modules that should remain in FP32 precision, ensuring proper handling of data types during model operations.
      
      * Remove integration test classes from SkyReelsV2 test files
      
      - Deleted the `SkyReelsV2DiffusionForcingPipelineIntegrationTests` and `SkyReelsV2PipelineIntegrationTests` classes along with their associated setup, teardown, and test methods, as they were not implemented and not needed for current testing.
      
      * style
      
      * Refactor: Remove hardcoded `torch.bfloat16` cast in attention
      
      * Refactor: Simplify data type handling in transformer model
      
      Removes unnecessary data type conversions for the FPS embedding and timestep projection.
      
      This change simplifies the forward pass by relying on the inherent data types of the tensors.
      
      * Refactor: Remove `fps_projection` from `_keep_in_fp32_modules` in `SkyReelsV2Transformer3DModel`
      
      * Update src/diffusers/models/transformers/transformer_skyreels_v2.py
      Co-authored-by: Aryan <contact.aryanvs@gmail.com>
      
      * Refactor: Remove unused flags and simplify attention mask handling in SkyReelsV2AttnProcessor2_0 and SkyReelsV2Transformer3DModel
      
      Refactor: Simplify causal attention logic in SkyReelsV2
      
      Removes the `flag_causal_attention` and `_flag_ar_attention` flags to simplify the implementation.
      
      The decision to apply a causal attention mask is now based directly on the `num_frame_per_block` configuration, eliminating redundant flags and conditional checks. This streamlines the attention mechanism and simplifies the `set_ar_attention` methods.
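
The flag-free decision described above can be pictured with a small pure-Python sketch of a block-causal mask. This is not the model's implementation: the function name, the `tokens_per_frame` argument, and the rule that a block size of 1 means "no mask" are assumptions for illustration only.

```python
from typing import List, Optional


def block_causal_mask(
    num_frames: int,
    tokens_per_frame: int,
    num_frame_per_block: int,
) -> Optional[List[List[bool]]]:
    """Return mask[q][k] == True where query token q may attend to key token k.

    Assumption for this sketch: a block size of 1 (or less) means no causal
    mask is applied, so None is returned and attention stays bidirectional.
    """
    if num_frame_per_block <= 1:
        return None

    num_tokens = num_frames * tokens_per_frame
    # Block index of every token: frame index first, then frames grouped into blocks.
    block_of = [(t // tokens_per_frame) // num_frame_per_block for t in range(num_tokens)]
    # A token may attend to every token in its own block and in earlier blocks.
    return [[block_of[k] <= block_of[q] for k in range(num_tokens)] for q in range(num_tokens)]
```

With four frames, one token per frame, and two frames per block, the first two tokens see only each other while the last two see all four.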
      
      * Refactor: Clarify variable names for latent frames
      
      Renames `base_num_frames` to `base_latent_num_frames` to make it explicit that the variable refers to the number of frames in the latent space.
      
      This change improves code readability and reduces potential confusion between latent frames and decoded video frames.
      
      The `num_frames` parameter in `generate_timestep_matrix` is also renamed to `num_latent_frames` for consistency.
      
      * Enhance documentation: Add detailed docstring for timestep matrix generation in SkyReelsV2DiffusionForcingPipeline
      
      * Docs: Clarify long video chunking in pipeline docstring
      
      Improves the explanation of long video processing within the pipeline's docstring.
      
      The update replaces the abstract description with a concrete example, illustrating how the sliding window mechanism works with overlapping chunks. This makes the roles of `base_num_frames` and `overlap_history` clearer for users.
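
To make the sliding-window idea concrete, here is a rough sketch of how overlapping chunks could be planned. The helper `plan_chunks` is hypothetical, and the arithmetic is an assumption chosen for clarity; the pipeline itself may count latent frames and round differently.

```python
from typing import List, Tuple


def plan_chunks(total_frames: int, base_num_frames: int, overlap_history: int) -> List[Tuple[int, int]]:
    """Split `total_frames` into windows of at most `base_num_frames` frames.

    Each window after the first starts `overlap_history` frames before the end
    of the previous one, so those frames can condition the new window.
    Returns half-open (start, end) frame ranges.
    """
    if not 0 <= overlap_history < base_num_frames:
        raise ValueError("overlap_history must be in [0, base_num_frames)")

    chunks = []
    start = 0
    while True:
        end = min(start + base_num_frames, total_frames)
        chunks.append((start, end))
        if end >= total_frames:
            break
        # Step back by the overlap so the next window re-uses recent frames.
        start = end - overlap_history
    return chunks
```

For example, `plan_chunks(257, 97, 17)` yields three windows: the first contributes 97 new frames and each later window contributes 80.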
      
      * Docs: Move visual demonstration and processing details for SkyReelsV2DiffusionForcingPipeline to docs page from the code
      
      * Docs: Update asynchronous processing timeline and examples for long video handling in SkyReels-V2 documentation
      
      * Enhance timestep matrix generation documentation and logic for synchronous/asynchronous video processing
      
      * Update timestep matrix documentation and enhance analysis for clarity in SkyReelsV2DiffusionForcingPipeline
      
      * Docs: Update visual demonstration section and add detailed step matrix construction example for asynchronous processing in SkyReelsV2DiffusionForcingPipeline
      
      * style
      
      * fix-copies
      
      * Refactor parameter names for clarity in SkyReelsV2DiffusionForcingImageToVideoPipeline and SkyReelsV2DiffusionForcingVideoToVideoPipeline
      
      * Refactor: Avoid VAE roundtrip in long video generation
      
      Improves performance and quality for long video generation by operating entirely in latent space during the iterative generation process.
      
      Instead of decoding latents to video and then re-encoding the overlapping section for the next chunk, this change passes the generated latents directly between iterations.
      
      This avoids a computationally expensive and potentially lossy VAE decode/encode cycle within the loop. The full video is now decoded only once from the accumulated latents at the end of the process.
      
      * Refactor: Rename prefix_video_latents_length to prefix_video_latents_frames for clarity
      
      * Refactor: Rename num_latent_frames to current_num_latent_frames for clarity in SkyReelsV2DiffusionForcingImageToVideoPipeline
      
      * Refactor: Enhance long video generation logic and improve latent handling in SkyReelsV2DiffusionForcingImageToVideoPipeline
      
      Refactor: Unify video generation and pass latents directly
      
      Unifies the separate code paths for short and long video generation into a single, streamlined loop.
      
      This change eliminates the inefficient decode-encode cycle during long video generation. Instead of converting latents to pixel-space video between chunks, the pipeline now passes the generated latents directly to the next iteration.
      
      This improves performance, avoids potential quality loss from intermediate VAE steps, and enhances code maintainability by removing significant duplication.
      
      * style
      
      * Refactor: Remove overlap_history parameter and streamline long video generation logic in SkyReelsV2DiffusionForcingImageToVideoPipeline
      
      Refactor: Streamline long video generation logic
      
      Removes the `overlap_history` parameter and simplifies the conditioning process for long video generation.
      
      This change avoids a redundant VAE encoding step by directly using latent frames from the previous chunk for conditioning. It also moves image preprocessing outside the main generation loop to prevent repeated computations and clarifies the handling of prefix latents.
      
      * style
      
      * Refactor latent handling in i2v diffusion forcing pipeline
      
      Improves the latent conditioning and accumulation logic within the image-to-video diffusion forcing loop.
      
      - Corrects the splitting of the initial conditioning tensor to robustly handle both even and odd lengths.
      - Simplifies how latents are accumulated across iterations for long video generation.
      - Ensures the final latents are trimmed correctly before decoding only when a `last_image` is provided.
      
      * Refactor: Remove overlap_history parameter from SkyReelsV2DiffusionForcingImageToVideoPipeline
      
      * Refactor: Adjust video_latents parameter handling in prepare_latents method
      
      * style
      
      * Refactor: Update long video iteration print statements for clarity
      
      * Fix: Update transformer config with dynamic causal block size
      
      Updates the SkyReelsV2 pipelines to correctly set the `causal_block_size` in the transformer's configuration when it's provided during a pipeline call.
      
      This ensures the model configuration reflects the user's specified setting for the inference run. The `set_ar_attention` method is also renamed to `_set_ar_attention` to mark it as an internal helper.
      
      * style
      
      * Refactor: Adjust video input size and expected output shape in inference test
      
      * Refactor: Rename video variables for clarity in SkyReelsV2DiffusionForcingVideoToVideoPipeline
      
      * Docs: Clarify time embedding logic in SkyReelsV2
      
      Adds comments to explain the handling of different time embedding tensor dimensions.
      
      A 2D tensor is used for standard models with a single time embedding per batch, while a 3D tensor is used for Diffusion Forcing models where each frame has its own time embedding. This clarifies the expected input for different model variations.
      
      * Docs: Update SkyReels V2 pipeline examples
      
      Updates the docstring examples for the SkyReels V2 pipelines to reflect current best practices and API changes.
      
      - Removes the `shift` parameter from pipeline call examples, as it is now configured directly on the scheduler.
      - Replaces the `set_ar_attention` method call with the `causal_block_size` argument in the pipeline call for diffusion forcing examples.
      - Adjusts recommended parameters for I2V and V2V examples, including inference steps, guidance scale, and `ar_step`.
      
      * Refactor: Remove `shift` parameter from SkyReelsV2 pipelines
      
      Removes the `shift` parameter from the call signature of all SkyReelsV2 pipelines.
      
      This parameter is a scheduler-specific configuration and should be set directly on the scheduler during its initialization, rather than being passed at runtime through the pipeline. This change simplifies the pipeline API.
      
      Usage examples are updated to reflect that the `shift` value should now be passed when creating the `FlowMatchUniPCMultistepScheduler`.
      
      * Refactors SkyReelsV2 image-to-video tests and adds last image case
      
      Simplifies the test suite by removing a duplicated test class and streamlining the dummy component and input generation.
      
      Adds a new test to verify the pipeline's behavior when a `last_image` is provided as input for conditioning.
      
      * test: Add image components to SkyReelsV2 pipeline test
      
      Adds the `image_encoder` and `image_processor` to the test components for the image-to-video pipeline.
      
      Also replaces a hardcoded value for the positional embedding sequence length with a more descriptive calculation, improving clarity.
      
      * test: Add callback configuration test for SkyReelsV2DiffusionForcingVideoToVideoPipeline
      
      test: Add callback test for SkyReelsV2DFV2V pipeline
      
      Adds a test to validate the callback functionality for the `SkyReelsV2DiffusionForcingVideoToVideoPipeline`.
      
      This test confirms that `callback_on_step_end` is invoked correctly and can modify the pipeline's state during inference. It uses a callback to dynamically increase the `guidance_scale` and asserts that the final value is as expected.
      
      The implementation correctly accounts for the nested denoising loops present in diffusion forcing pipelines.
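A minimal sketch of the behaviour this test checks, assuming a toy pipeline with two nested loops. `ToyPipeline`, `bump_guidance`, and the two-argument callback signature are hypothetical stand-ins, not the library's API; the point is only that the callback fires once per inner denoising step, so the expected final value must count both loops.

```python
class ToyPipeline:
    """Hypothetical stand-in for a pipeline with nested denoising loops."""

    def __init__(self, guidance_scale: float):
        self._guidance_scale = guidance_scale

    def __call__(self, num_iterations: int, num_steps: int, callback_on_step_end=None) -> float:
        for _ in range(num_iterations):      # outer loop, e.g. one pass per video chunk
            for step in range(num_steps):    # inner denoising loop
                if callback_on_step_end is not None:
                    callback_on_step_end(self, step)
        return self._guidance_scale


def bump_guidance(pipeline: ToyPipeline, step: int) -> None:
    """Callback that modifies pipeline state at the end of every step."""
    pipeline._guidance_scale += 1.0


final_scale = ToyPipeline(guidance_scale=2.0)(
    num_iterations=2, num_steps=3, callback_on_step_end=bump_guidance
)
```

With two outer iterations of three steps each, the callback runs six times in total.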
      
      * style
      
      * fix: Update image_encoder type to CLIPVisionModelWithProjection in SkyReelsV2ImageToVideoPipeline
      
      * UP
      
      * Add conversion support for SkyReels-V2-FLF2V models
      
      Adds configurations for three new FLF2V model variants (1.3B-540P, 14B-540P, and 14B-720P) to the conversion script.
      
      This change also introduces specific handling to zero out the image positional embeddings for these models and updates the main script to correctly initialize the image-to-video pipeline.
      
      * Docs: Update and simplify SkyReels V2 usage examples
      
      Simplifies the text-to-video example by removing the manual group offloading configuration, making it more straightforward.
      
      Adds comments to pipeline parameters to clarify their purpose and provides guidance for different resolutions and long video generation.
      
      Introduces a new section with a code example for the video-to-video pipeline.
      
      * style
      
      * docs: Add SkyReels-V2 FLF2V 1.3B model to supported models list
      
      * docs: Update SkyReels-V2 documentation
      
      * Move the initialization of the `gradient_checkpointing` attribute to its suggested location.
      
      * Refactor: Use logger for long video progress messages
      
      Replaces `print()` calls with `logger.debug()` for reporting progress during long video generation in SkyReelsV2DF pipelines.
      
      This change reduces console output verbosity for standard runs while allowing developers to view progress by enabling debug-level logging.
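The effect of moving from `print()` to `logger.debug()` can be sketched with the standard `logging` module. The logger name and `generate_long_video` below are hypothetical, chosen only for illustration.

```python
import logging

logger = logging.getLogger("skyreels_v2_sketch")  # hypothetical logger name


def generate_long_video(num_iterations: int) -> int:
    """Stand-in for the long-video loop; returns the number of iterations run."""
    completed = 0
    for i in range(num_iterations):
        # Not emitted at the WARNING level; emitted once DEBUG is enabled.
        logger.debug("long video iteration %d/%d", i + 1, num_iterations)
        completed += 1
    return completed
```

Calling `logger.setLevel(logging.DEBUG)` (with a handler attached) makes the progress messages visible again without touching the loop.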
      
      * Refactor SkyReelsV2 timestep embedding into a module
      
      Extract the sinusoidal timestep embedding logic into a new `SkyReelsV2Timesteps` `nn.Module`.
      
      This change encapsulates the embedding generation, which simplifies the `SkyReelsV2TimeTextImageEmbedding` class and improves code modularity.
      
      * Fix: Preserve original shape in timestep embeddings
      
      Reshapes the timestep embedding tensor to match the original input shape.
      
      This ensures that batched timestep inputs retain their batch dimension after embedding, preventing potential shape mismatches.
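A rough sketch of the shape-preserving idea in plain Python, using nested lists in place of tensors. The function names and the generic sinusoidal formula are assumptions for illustration and are not taken from the model code: timesteps are flattened, embedded one by one, and then regrouped to the input's leading shape, so a per-batch (1D) input and a per-frame (2D) input both keep their layout.

```python
import math


def sinusoidal_embedding(t: float, dim: int) -> list:
    """Generic sinusoidal embedding of one timestep (cosines first, then sines)."""
    half = dim // 2
    freqs = [math.exp(-math.log(10000.0) * i / half) for i in range(half)]
    return [math.cos(t * f) for f in freqs] + [math.sin(t * f) for f in freqs]


def embed_timesteps(timesteps, dim: int):
    """Embed 1D (per batch) or 2D (per frame) timesteps, keeping the leading shape."""
    is_2d = bool(timesteps) and isinstance(timesteps[0], (list, tuple))
    rows = timesteps if is_2d else [timesteps]
    flat = [t for row in rows for t in row]                 # flatten
    flat_emb = [sinusoidal_embedding(t, dim) for t in flat]
    if not is_2d:
        return flat_emb                                     # shape: (batch, dim)
    width = len(rows[0])
    # Regroup to (batch, frames, dim) so the batch dimension is not lost.
    return [flat_emb[r * width:(r + 1) * width] for r in range(len(rows))]
```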
      
      * style
      
      * Refactor: Move SkyReelsV2Timesteps to model file
      
      Colocates the `SkyReelsV2Timesteps` class with the SkyReelsV2 transformer model.
      
      This change moves model-specific timestep embedding logic from the general embeddings module to the transformer's own file, improving modularity and making the model more self-contained.
      
      * Refactor parameter dtype retrieval to use utility function
      
      Replaces manual parameter iteration with the `get_parameter_dtype` helper to determine the time embedder's data type.
      
      This change improves code readability and centralizes the logic.
      
      * Add comments to track the tensor shape transformations
      
      * Add copied froms
      
      * style
      
      * fix-copies
      
      * up
      
      * Remove FlowMatchUniPCMultistepScheduler
      
      Deletes the `FlowMatchUniPCMultistepScheduler` as it is no longer being used.
      
      * Refactor: Replace FlowMatchUniPC scheduler with UniPC
      
      Removes the `FlowMatchUniPCMultistepScheduler` and integrates its functionality into the existing `UniPCMultistepScheduler`.
      
      This consolidation is achieved by using the `use_flow_sigmas=True` parameter in `UniPCMultistepScheduler`, simplifying the scheduler API and reducing code duplication. All usages, documentation, and tests are updated accordingly.
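As a sketch of why the shift belongs on the scheduler, the toy class below stores `flow_shift` at construction time and applies the time-shift formula commonly used for flow-matching sigmas. Both `ToyFlowScheduler` and the assumption that this exact formula is what the scheduler applies are illustrative, not confirmed by the commit message.

```python
def shift_sigma(sigma: float, flow_shift: float) -> float:
    """Common flow-matching time shift; leaves 0 and 1 fixed (assumed formula)."""
    return flow_shift * sigma / (1.0 + (flow_shift - 1.0) * sigma)


class ToyFlowScheduler:
    """Hypothetical scheduler: the shift is configuration, not a per-call argument."""

    def __init__(self, flow_shift: float = 1.0):
        self.flow_shift = flow_shift

    def sigmas(self, num_steps: int) -> list:
        base = [1.0 - i / num_steps for i in range(num_steps)]  # linear from 1 towards 0
        return [shift_sigma(s, self.flow_shift) for s in base]
```

A pipeline built around such a scheduler needs no `shift` argument in its call signature; changing the shift means constructing the scheduler with a different `flow_shift`.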
      
      * style
      
      * Remove text_encoder parameter from SkyReelsV2DiffusionForcingPipeline initialization
      
      * Docs: Rename `pipe` to `pipeline` in SkyReels examples
      
      Updates the variable name from `pipe` to `pipeline` across all SkyReels V2 documentation examples. This change improves clarity and consistency.
      
      * Fix: Rename shift parameter to flow_shift in SkyReels-V2 examples
      
      * Fix: Rename shift parameter to flow_shift in example documentation across SkyReels-V2 files
      
      * Fix: Rename shift parameter to flow_shift in UniPCMultistepScheduler initialization across SkyReels test files
      
      * Removes unused generator argument from scheduler step
      
      The `generator` parameter is not used by the scheduler's `step` method within the SkyReelsV2 diffusion forcing pipelines. This change removes the unnecessary argument from the method call for code clarity and consistency.
      
      * Fix: Update time_embedder_dtype assignment to use the first parameter's dtype in SkyReelsV2TimeTextImageEmbedding
      
      * style
      
      * Refactor: Use get_parameter_dtype utility function
      
      Replaces manual parameter iteration with the `get_parameter_dtype` helper.
      
      * Fix: Prevent (potential) error in parameter dtype check
      
      Adds a check to ensure the `_keep_in_fp32_modules` attribute exists on a parameter before it is accessed.
      
      This prevents a potential `AttributeError`, making the utility function more robust when used with models that do not define this attribute.
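The guard described above can be sketched with `getattr` and a default. The helper name and the two toy classes are hypothetical; only the attribute name `_keep_in_fp32_modules` comes from the commit message.

```python
def keep_in_fp32_names(obj) -> list:
    """Return the object's `_keep_in_fp32_modules` list, or [] if it is missing or None."""
    return list(getattr(obj, "_keep_in_fp32_modules", None) or [])


class WithAttribute:
    _keep_in_fp32_modules = ["norm"]


class WithoutAttribute:
    pass
```

Accessing `WithoutAttribute()._keep_in_fp32_modules` directly would raise `AttributeError`; the guarded helper returns an empty list instead.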
      
      ---------
      Co-authored-by: YiYi Xu <yixu310@gmail.com>
      Co-authored-by: Aryan <contact.aryanvs@gmail.com>
      7298bdd8
    • enable flux pipeline compatible with unipc and dpm-solver (#11908) · 5c520972
      G.O.D authored
      
      
      * Update pipeline_flux.py
      
      have flux pipeline work with unipc/dpm schedulers
      
      * clean code
      
      * Update scheduling_dpmsolver_multistep.py
      
      * Update scheduling_unipc_multistep.py
      
      * Update pipeline_flux.py
      
      * Update scheduling_deis_multistep.py
      
      * Update scheduling_dpmsolver_singlestep.py
      
      * Apply style fixes
      
      ---------
      Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
      Co-authored-by: YiYi Xu <yixu310@gmail.com>
      Co-authored-by: Álvaro Somoza <asomoza@users.noreply.github.com>
      5c520972
  6. 19 Jun, 2025 1 commit
  7. 19 May, 2025 1 commit
  8. 24 Apr, 2025 1 commit
  9. 18 Dec, 2024 1 commit
  10. 16 Dec, 2024 1 commit
  11. 15 Dec, 2024 1 commit
    • [Sana] Add Sana, including `SanaPipeline`, `SanaPAGPipeline`,... · 5a196e3d
      Junsong Chen authored
      
      [Sana] Add Sana, including `SanaPipeline`, `SanaPAGPipeline`, `LinearAttentionProcessor`, `Flow-based DPM-solver` and so on. (#9982)
      
      * first add a script for DC-AE;
      
      * DC-AE init
      
      * replace triton with custom implementation
      
      * 1. rename file and remove un-used codes;
      
      * no longer rely on omegaconf and dataclass
      
      * replace custom activation with diffusers activation
      
      * remove dc_ae attention in attention_processor.py
      
      * inherit from ModelMixin
      
      * inherit from ConfigMixin
      
      * dc-ae reduce to one file
      
      * update downsample and upsample
      
      * clean code
      
      * support DecoderOutput
      
      * remove get_same_padding and val2tuple
      
      * remove autocast and some assert
      
      * update ResBlock
      
      * remove contents within super().__init__
      
      * Update src/diffusers/models/autoencoders/dc_ae.py
      Co-authored-by: YiYi Xu <yixu310@gmail.com>
      
      * remove opsequential
      
      * update other blocks to support the removal of build_norm
      
      * remove build encoder/decoder project in/out
      
      * remove inheritance of RMSNorm2d from LayerNorm
      
      * remove reset_parameters for RMSNorm2d
      Co-authored-by: YiYi Xu <yixu310@gmail.com>
      
      * remove device and dtype in RMSNorm2d __init__
      Co-authored-by: YiYi Xu <yixu310@gmail.com>
      
      * Update src/diffusers/models/autoencoders/dc_ae.py
      Co-authored-by: YiYi Xu <yixu310@gmail.com>
      
      * Update src/diffusers/models/autoencoders/dc_ae.py
      Co-authored-by: YiYi Xu <yixu310@gmail.com>
      
      * Update src/diffusers/models/autoencoders/dc_ae.py
      Co-authored-by: YiYi Xu <yixu310@gmail.com>
      
      * remove op_list & build_block
      
      * remove build_stage_main
      
      * change file name to autoencoder_dc
      
      * move LiteMLA to attention.py
      
      * align with other vae decode output;
      
      * add DC-AE into init files;
      
      * update
      
      * make quality && make style;
      
      * quick push before dgx disappears again
      
      * update
      
      * make style
      
      * update
      
      * update
      
      * fix
      
      * refactor
      
      * refactor
      
      * refactor
      
      * update
      
      * possibly change to nn.Linear
      
      * refactor
      
      * make fix-copies
      
      * replace vae with ae
      
      * replace get_block_from_block_type to get_block
      
      * replace downsample_block_type from Conv to conv for consistency
      
      * add scaling factors
      
      * incorporate changes for all checkpoints
      
      * make style
      
      * move mla to attention processor file; split qkv conv to linears
      
      * refactor
      
      * add tests
      
      * from original file loader
      
      * add docs
      
      * add standard autoencoder methods
      
      * combine attention processor
      
      * fix tests
      
      * update
      
      * minor fix
      
      * minor fix
      
      * minor fix & in/out shortcut rename
      
      * minor fix
      
      * make style
      
      * fix paper link
      
      * update docs
      
      * update single file loading
      
      * make style
      
      * remove single file loading support; todo for DN6
      
      * Apply suggestions from code review
      Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
      
      * add abstract
      
      * 1. add DCAE into diffusers;
      2. make style and make quality;
      
      * add DCAE_HF into diffusers;
      
      * bug fixed;
      
      * add SanaPipeline, SanaTransformer2D into diffusers;
      
      * add sanaLinearAttnProcessor2_0;
      
      * first update for SanaTransformer;
      
      * first update for SanaPipeline;
      
      * first success run SanaPipeline;
      
      * model output finally matches the original model with the same input;
      
      * code update;
      
      * code update;
      
      * add a flow dpm-solver scripts
      
      * 🎉[important update]
      1. Integrate flow-dpm-solver into diffusers;
      2. finally run successfully on both `FlowMatchEulerDiscreteScheduler` and `FlowDPMSolverMultistepScheduler`;
      
      * 🎉🔧
      
      [important update & fix huge bugs!!]
      1. add SanaPAGPipeline & several related Sana linear attention operators;
      2. `SanaTransformer2DModel` now supports multi-resolution input;
      3. fix the multi-scale HW bugs in SanaPipeline and SanaPAGPipeline;
      4. fix the flow-dpm-solver set_timestep() init `model_output` and `lower_order_nums` bugs;
      
      * remove prints;
      
      * add convert sana official checkpoint to diffusers format Safetensor.
      
      * Update src/diffusers/models/transformers/sana_transformer_2d.py
      Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
      
      * Update src/diffusers/models/transformers/sana_transformer_2d.py
      Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
      
      * Update src/diffusers/models/transformers/sana_transformer_2d.py
      Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
      
      * Update src/diffusers/pipelines/pag/pipeline_pag_sana.py
      Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
      
      * Update src/diffusers/models/transformers/sana_transformer_2d.py
      Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
      
      * Update src/diffusers/models/transformers/sana_transformer_2d.py
      Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
      
      * Update src/diffusers/pipelines/sana/pipeline_sana.py
      Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
      
      * Update src/diffusers/pipelines/sana/pipeline_sana.py
      Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
      
      * update Sana for DC-AE's recent commit;
      
      * make style && make quality
      
      * Add StableDiffusion3PAGImg2Img Pipeline + Fix SD3 Unconditional PAG (#9932)
      
      * fix progress bar updates in SD 1.5 PAG Img2Img pipeline
      
      ---------
      Co-authored-by: Vinh H. Pham <phamvinh257@gmail.com>
      Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
      
      * make the vae can be None in `__init__` of `SanaPipeline`
      
      * Update src/diffusers/models/transformers/sana_transformer_2d.py
      Co-authored-by: hlky <hlky@hlky.ac>
      
      * change the ae related code due to the latest update of DCAE branch;
      
      * change the ae related code due to the latest update of DCAE branch;
      
      * 1. change code based on AutoencoderDC;
      2. fix the bug of new GLUMBConv;
      3. run success;
      
      * update for solving conversation.
      
      * 1. fix bugs and run convert script success;
      2. Downloading ckpt from hub automatically;
      
      * make style && make quality;
      
      * 1. remove unused parameters in init;
      2. code update;
      
      * remove test file
      
      * refactor; add docs; add tests; update conversion script
      
      * make style
      
      * make fix-copies
      
      * refactor
      
      * update pipelines
      
      * pag tests and refactor
      
      * remove sana pag conversion script
      
      * handle weight casting in conversion script
      
      * update conversion script
      
      * add a processor
      
      * 1. add bf16 pth file path;
      2. add complex human instruct in pipeline;
      
      * fix fast tests
      
      * change gemma-2-2b-it ckpt to a non-gated repo;
      
      * fix the pth path bug in conversion script;
      
      * change grad ckpt to original; make style
      
      * fix the complex_human_instruct bug and typo;
      
      * remove dpmsolver flow scheduler
      
      * apply review suggestions
      
      * change the `FlowMatchEulerDiscreteScheduler` to default `DPMSolverMultistepScheduler` with flow matching scheduler.
      
      * fix the tokenizer.padding_side='right' bug;
      
      * update docs
      
      * make fix-copies
      
      * fix imports
      
      * fix docs
      
      * add integration test
      
      * update docs
      
      * update examples
      
      * fix convert_model_output in schedulers
      
      * fix failing tests
      
      ---------
      Co-authored-by: Junyu Chen <chenjydl2003@gmail.com>
      Co-authored-by: YiYi Xu <yixu310@gmail.com>
      Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
      Co-authored-by: chenjy2003 <70215701+chenjy2003@users.noreply.github.com>
      Co-authored-by: Aryan <aryan@huggingface.co>
      Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
      Co-authored-by: hlky <hlky@hlky.ac>
      5a196e3d
  12. 20 Nov, 2024 1 commit
  13. 30 Sep, 2024 1 commit
  14. 25 Sep, 2024 1 commit
  15. 20 Jul, 2024 1 commit
  16. 24 May, 2024 1 commit
  17. 10 May, 2024 1 commit
    • #7535 Update FloatTensor type hints to Tensor (#7883) · be4afa0b
      Mark Van Aken authored
      * find & replace all FloatTensors to Tensor
      
      * apply formatting
      
      * Update torch.FloatTensor to torch.Tensor in the remaining files
      
      * formatting
      
      * Fix the rest of the places where FloatTensor is used as well as in documentation
      
      * formatting
      
      * Update new file from FloatTensor to Tensor
      be4afa0b
  18. 03 Apr, 2024 2 commits
    • UniPC Multistep add `rescale_betas_zero_snr` (#7531) · aa190259
      Beinsezii authored
      * UniPC Multistep add `rescale_betas_zero_snr`
      
      Same patch as DPM and Euler with the patched final alpha cumprod
      
      BF16 doesn't seem to break down, I think because UniPC already upcasts
      during some phases? We could still force an upcast since it only
      loses ≈ 0.005 it/s for me, but the difference in output is very small. A
      better endeavor might be upcasting in step() and removing all the other
      upcasts elsewhere?
      
      * UniPC ZSNR UT
      
      * Re-add `rescale_betas_zsnr` doc oops
      aa190259
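For readers unfamiliar with the option, here is a minimal pure-Python sketch of zero-terminal-SNR beta rescaling: the square root of the cumulative alpha product is shifted so its last value is zero and rescaled so its first value is unchanged, then converted back to betas. The function name is illustrative, and the "patched final alpha cumprod" mentioned in the commit is not reproduced here.

```python
import math


def rescale_betas_zero_terminal_snr(betas: list) -> list:
    """Rescale betas so the final cumulative alpha product is exactly zero."""
    alphas_bar, prod = [], 1.0
    for beta in betas:
        prod *= 1.0 - beta
        alphas_bar.append(prod)
    sqrt_bar = [math.sqrt(a) for a in alphas_bar]
    first, last = sqrt_bar[0], sqrt_bar[-1]
    # Shift so the last value becomes 0, then scale so the first value is preserved.
    sqrt_bar = [(s - last) * first / (first - last) for s in sqrt_bar]
    alphas_bar = [s * s for s in sqrt_bar]
    # Recover per-step alphas from ratios of consecutive cumulative products.
    alphas = [alphas_bar[0]] + [
        alphas_bar[i] / alphas_bar[i - 1] for i in range(1, len(alphas_bar))
    ]
    return [1.0 - a for a in alphas]


# Example: a short linear schedule (needs at least two steps).
example_betas = [1e-4 + i * (0.02 - 1e-4) / 9 for i in range(10)]
rescaled = rescale_betas_zero_terminal_snr(example_betas)
```

The last rescaled beta is 1, meaning the final training timestep carries no signal, which is the property the option is meant to provide.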
    • UniPC Multistep fix tensor dtype/device on order=3 (#7532) · 19ab04ff
      Beinsezii authored
      * UniPC UTs iterate solvers on FP16
      
      It wasn't catching errs on order==3. Might be excessive?
      
      * UniPC Multistep fix tensor dtype/device on order=3
      
      * UniPC UTs Add v_pred to fp16 test iter
      
      For completion's sake. Probably overkill?
      19ab04ff
  19. 02 Apr, 2024 1 commit
    • add: utility to format our docs too 📜 (#7314) · 4a343077
      Sayak Paul authored
      * add: utility to format our docs too 📜
      
      * debugging saga
      
      * fix: message
      
      * checking
      
      * should be fixed.
      
      * revert pipeline_fixture
      
      * remove empty line
      
      * make style
      
      * fix: setup.py
      
      * style.
      4a343077
  20. 30 Mar, 2024 1 commit
    • Add `final_sigma_zero` to UniPCMultistep (#7517) · f0c81562
      Beinsezii authored
      * Add `final_sigma_zero` to UniPCMultistep
      
      Effectively the same trick as DDIM's `set_alpha_to_one` and
      DPM's `final_sigma_type='zero'`.
      Currently False by default but maybe this should be True?
      
      * `final_sigma_zero: bool` -> `final_sigmas_type: str`
      
      Should 1:1 match DPM Multistep now.
      
      * Set `final_sigmas_type='sigma_min'` in UniPC UTs
      f0c81562
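A small sketch of what the two `final_sigmas_type` values mean for a sigma schedule, assuming the final sigma is simply appended to the list. `with_final_sigma` and its `sigma_min` argument are hypothetical names for illustration, not the scheduler's internals.

```python
def with_final_sigma(sigmas: list, final_sigmas_type: str, sigma_min: float) -> list:
    """Append the terminal sigma selected by `final_sigmas_type`."""
    if final_sigmas_type == "zero":
        last = 0.0            # finish on a fully denoised sample
    elif final_sigmas_type == "sigma_min":
        last = sigma_min      # stop at the smallest sigma instead
    else:
        raise ValueError(f"unsupported final_sigmas_type: {final_sigmas_type!r}")
    return list(sigmas) + [last]
```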
  21. 21 Mar, 2024 1 commit
  22. 19 Mar, 2024 1 commit
  23. 18 Mar, 2024 1 commit
    • Fix Typos (#7325) · 6a05b274
      M. Tolga Cangöz authored
      * Fix PyTorch's convention for inplace functions
      
      * Fix import structure in __init__.py and update config loading logic in test_config.py
      
      * Update configuration access
      
      * Fix typos
      
      * Trim trailing white spaces
      
      * Fix typo in logger name
      
      * Revert "Fix PyTorch's convention for inplace functions"
      
      This reverts commit f65dc4afcb57ceb43d5d06389229d47bafb10d2d.
      
      * Fix typo in step_index property description
      
      * Revert "Update configuration access"
      
      This reverts commit 8d44e870b8c1ad08802e3e904c34baeca1b598f8.
      
      * Revert "Fix import structure in __init__.py and update config loading logic in test_config.py"
      
      This reverts commit 2ad5e8bca25aede3b912da22bd57285b598fe171.
      
      * Fix typos
      
      * Fix typos
      
      * Fix typos
      
      * Fix a typo: tranform -> transform
      6a05b274
  24. 14 Mar, 2024 1 commit
  25. 08 Feb, 2024 1 commit
  26. 01 Feb, 2024 1 commit
  27. 26 Jan, 2024 1 commit
  28. 15 Dec, 2023 1 commit
  29. 07 Dec, 2023 1 commit
  30. 01 Dec, 2023 1 commit
  31. 29 Nov, 2023 1 commit
    • Add SVD (#5895) · 63f767ef
      Suraj Patil authored
      
      
      * begin model
      
      * finish blocks
      
      * add_embedding
      
      * addition_time_embed_dim
      
      * use TimestepEmbedding
      
      * fix temporal res block
      
      * fix time_pos_embed
      
      * fix add_embedding
      
      * add conversion script
      
      * fix model
      
      * up
      
      * add new resnet blocks
      
      * make forward work
      
      * return sample in original shape
      
      * fix temb shape in TemporalResnetBlock
      
      * add spatio temporal transformers
      
      * add vae blocks
      
      * fix blocks
      
      * update
      
      * update
      
      * fix shapes in Alphablender and add time activation in res block
      
      * use new blocks
      
      * style
      
      * fix temb shape
      
      * fix SpatioTemporalResBlock
      
      * reuse TemporalBasicTransformerBlock
      
      * fix TemporalBasicTransformerBlock
      
      * use TransformerSpatioTemporalModel
      
      * fix TransformerSpatioTemporalModel
      
      * fix time_context dim
      
      * clean up
      
      * make temb optional
      
      * add blocks
      
      * rename model
      
      * update conversion script
      
      * remove UNetMidBlockSpatioTemporal
      
      * add in init
      
      * remove unused arg
      
      * remove unused arg
      
      * remove more unused args
      
      * up
      
      * up
      
      * check for None
      
      * update vae
      
      * update up/mid blocks for decoder
      
      * begin pipeline
      
      * adapt scheduler
      
      * add guidance scalings
      
      * fix norm eps in temporal transformers
      
      * add temporal autoencoder
      
      * make pipeline run
      
      * fix frame decoding
      
      * decode in float32
      
      * decode n frames at a time
      
      * pass decoding_t to decode_latents
      
      * fix decode_latents
      
      * vae encode/decode in fp32
      
      * fix dtype in TransformerSpatioTemporalModel
      
      * type image_latents same as image_embeddings
      
      * allow using different eps in temporal block for video decoder
      
      * fix default values in vae
      
      * pass num frames in decode
      
      * switch spatial to temporal for mixing in VAE
      
      * fix num frames during split decoding
      
      * cast alpha to sample dtype
      
      * fix attention in MidBlockTemporalDecoder
      
      * fix typo
      
      * fix guidance_scales dtype
      
      * fix missing activation in TemporalDecoder
      
      * skip_post_quant_conv
      
      * add vae conversion
      
      * style
      
      * take guidance scale as input
      
      * up
      
      * allow passing PIL to export_video
      
      * accept fps as arg
      
      * add pipeline and vae in init
      
      * remove hack
      
      * use AutoencoderKLTemporalDecoder
      
      * don't scale image latents
      
      * add unet tests
      
      * clean up unet
      
      * clean TransformerSpatioTemporalModel
      
      * add slow svd test
      
      * clean up
      
      * make temb optional in Decoder mid block
      
      * fix norm eps in TransformerSpatioTemporalModel
      
      * clean up temp decoder
      
      * clean up
      
      * clean up
      
      * use c_noise values for timesteps
      
      * use math for log
      
      * update
      
      * fix copies
      
      * doc
      
      * upcast vae
      
      * update forward pass for gradient checkpointing
      
      * make added_time_ids a tensor
      
      * up
      
      * fix upcasting
      
      * remove post quant conv
      
      * add _resize_with_antialiasing
      
      * fix _compute_padding
      
      * cleanup model
      
      * more cleanup
      
      * more cleanup
      
      * more cleanup
      
      * remove freeu
      
      * remove attn slice
      
      * small clean
      
      * up
      
      * up
      
      * remove extra step kwargs
      
      * remove eta
      
      * remove dropout
      
      * remove callback
      
      * remove merge factor args
      
      * clean
      
      * clean up
      
      * move to dedicated folder
      
      * remove attention_head_dim
      
      * docstr and small fix
      
      * update unet doc strings
      
      * rename decoding_t
      
      * correct linting
      
      * store c_skip and c_out
      
      * cleanup
      
      * clean TemporalResnetBlock
      
      * more cleanup
      
      * clean up vae
      
      * clean up
      
      * begin doc
      
      * more cleanup
      
      * up
      
      * up
      
      * doc
      
      * Improve
      
      * better naming
      
      * better naming
      
      * better naming
      
      * better naming
      
      * better naming
      
      * better naming
      
      * better naming
      
      * better naming
      
      * Apply suggestions from code review
      
      * Default chunk size to None
      
      * add example
      
      * Better
      
      * Apply suggestions from code review
      
      * update doc
      
      * Update src/diffusers/pipelines/stable_diffusion_video/pipeline_stable_diffusion_video.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * style
      
      * Get torch compile working
      
      * up
      
      * rename
      
      * fix doc
      
      * add chunking
      
      * torch compile
      
      * torch compile
      
      * add modelling outputs
      
      * torch compile
      
      * Improve chunking
      
      * Apply suggestions from code review
      
      * Update docs/source/en/using-diffusers/svd.md
      
      * Close diff tag
      
      * remove slicing
      
      * resnet docstr
      
      * add docstr in resnet
      
      * rename
      
      * Apply suggestions from code review
      
      * update tests
      
      * Fix output type latents
      
      * fix more
      
      * fix more
      
      * Update docs/source/en/using-diffusers/svd.md
      
      * fix more
      
      * add pipeline tests
      
      * remove unused arg
      
      * clean  up
      
      * make sure get_scaling receives tensors
      
      * fix euler scheduler
      
      * fix get_scalings
      
      * simply euler for now
      
      * remove old test file
      
      * use randn_tensor to create noise
      
      * fix device for rand tensor
      
      * increase expected_max_difference
      
      * fix test_inference_batch_single_identical
      
      * actually fix test_inference_batch_single_identical
      
      * disable test_save_load_float16
      
      * skip test_float16_inference
      
      * skip test_inference_batch_single_identical
      
      * fix test_xformers_attention_forwardGenerator_pass
      
      * Apply suggestions from code review
      
      * update StableVideoDiffusionPipelineSlowTests
      
      * update image
      
      * add diffusers example
      
      * fix more
      
      ---------
      Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      Co-authored-by: apolinário <joaopaulo.passos@gmail.com>
      63f767ef
  32. 20 Nov, 2023 1 commit
  33. 31 Oct, 2023 1 commit
  34. 03 Oct, 2023 1 commit
  35. 02 Oct, 2023 2 commits
  36. 23 Sep, 2023 1 commit