    Add SVD (#5895) · 63f767ef
    Suraj Patil authored
    
    
    * begin model
    
    * finish blocks
    
    * add_embedding
    
    * addition_time_embed_dim
    
    * use TimestepEmbedding
    
    * fix temporal res block
    
    * fix time_pos_embed
    
    * fix add_embedding
    
    * add conversion script
    
    * fix model
    
    * up
    
    * add new resnet blocks
    
    * make forward work
    
    * return sample in original shape
    
    * fix temb shape in TemporalResnetBlock
    
    * add spatio temporal transformers
    
    * add vae blocks
    
    * fix blocks
    
    * update
    
    * update
    
    * fix shapes in AlphaBlender and add time activation in res block
    
    * use new blocks
    
    * style
    
    * fix temb shape
    
    * fix SpatioTemporalResBlock
    
    * reuse TemporalBasicTransformerBlock
    
    * fix TemporalBasicTransformerBlock
    
    * use TransformerSpatioTemporalModel
    
    * fix TransformerSpatioTemporalModel
    
    * fix time_context dim
    
    * clean up
    
    * make temb optional
    
    * add blocks
    
    * rename model
    
    * update conversion script
    
    * remove UNetMidBlockSpatioTemporal
    
    * add in init
    
    * remove unused arg
    
    * remove unused arg
    
    * remove more unused args
    
    * up
    
    * up
    
    * check for None
    
    * update vae
    
    * update up/mid blocks for decoder
    
    * begin pipeline
    
    * adapt scheduler
    
    * add guidance scalings
    
    * fix norm eps in temporal transformers
    
    * add temporal autoencoder
    
    * make pipeline run
    
    * fix frame decoding
    
    * decode in float32
    
    * decode n frames at a time
    
    * pass decoding_t to decode_latents
    
    * fix decode_latents
    
    * vae encode/decode in fp32
    
    * fix dtype in TransformerSpatioTemporalModel
    
    * type image_latents same as image_embeddings
    
    * allow using different eps in temporal block for video decoder
    
    * fix default values in vae
    
    * pass num frames in decode
    
    * switch spatial to temporal for mixing in VAE
    
    * fix num frames during split decoding
    
    * cast alpha to sample dtype
    
    * fix attention in MidBlockTemporalDecoder
    
    * fix typo
    
    * fix guidance_scales dtype
    
    * fix missing activation in TemporalDecoder
    
    * skip_post_quant_conv
    
    * add vae conversion
    
    * style
    
    * take guidance scale as input
    
    * up
    
    * allow passing PIL to export_video
    
    * accept fps as arg
    
    * add pipeline and vae in init
    
    * remove hack
    
    * use AutoencoderKLTemporalDecoder
    
    * don't scale image latents
    
    * add unet tests
    
    * clean up unet
    
    * clean TransformerSpatioTemporalModel
    
    * add slow svd test
    
    * clean up
    
    * make temb optional in Decoder mid block
    
    * fix norm eps in TransformerSpatioTemporalModel
    
    * clean up temp decoder
    
    * clean up
    
    * clean up
    
    * use c_noise values for timesteps
    
    * use math for log
    
    * update
    
    * fix copies
    
    * doc
    
    * upcast vae
    
    * update forward pass for gradient checkpointing
    
    * make added_time_ids a tensor
    
    * up
    
    * fix upcasting
    
    * remove post quant conv
    
    * add _resize_with_antialiasing
    
    * fix _compute_padding
    
    * cleanup model
    
    * more cleanup
    
    * more cleanup
    
    * more cleanup
    
    * remove freeu
    
    * remove attn slice
    
    * small clean
    
    * up
    
    * up
    
    * remove extra step kwargs
    
    * remove eta
    
    * remove dropout
    
    * remove callback
    
    * remove merge factor args
    
    * clean
    
    * clean up
    
    * move to dedicated folder
    
    * remove attention_head_dim
    
    * docstr and small fix
    
    * update unet doc strings
    
    * rename decoding_t
    
    * correct linting
    
    * store c_skip and c_out
    
    * cleanup
    
    * clean TemporalResnetBlock
    
    * more cleanup
    
    * clean up vae
    
    * clean up
    
    * begin doc
    
    * more cleanup
    
    * up
    
    * up
    
    * doc
    
    * Improve
    
    * better naming
    
    * better naming
    
    * better naming
    
    * better naming
    
    * better naming
    
    * better naming
    
    * better naming
    
    * better naming
    
    * Apply suggestions from code review
    
    * Default chunk size to None
    
    * add example
    
    * Better
    
    * Apply suggestions from code review
    
    * update doc
    
    * Update src/diffusers/pipelines/stable_diffusion_video/pipeline_stable_diffusion_video.py
    Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
    
    * style
    
    * Get torch compile working
    
    * up
    
    * rename
    
    * fix doc
    
    * add chunking
    
    * torch compile
    
    * torch compile
    
    * add modelling outputs
    
    * torch compile
    
    * Improve chunking
    
    * Apply suggestions from code review
    
    * Update docs/source/en/using-diffusers/svd.md
    
    * Close diff tag
    
    * remove slicing
    
    * resnet docstr
    
    * add docstr in resnet
    
    * rename
    
    * Apply suggestions from code review
    
    * update tests
    
    * Fix output type latents
    
    * fix more
    
    * fix more
    
    * Update docs/source/en/using-diffusers/svd.md
    
    * fix more
    
    * add pipeline tests
    
    * remove unused arg
    
    * clean up
    
    * make sure get_scaling receives tensors
    
    * fix euler scheduler
    
    * fix get_scalings
    
    * simplify euler for now
    
    * remove old test file
    
    * use randn_tensor to create noise
    
    * fix device for rand tensor
    
    * increase expected_max_difference
    
    * fix test_inference_batch_single_identical
    
    * actually fix test_inference_batch_single_identical
    
    * disable test_save_load_float16
    
    * skip test_float16_inference
    
    * skip test_inference_batch_single_identical
    
    * fix test_xformers_attention_forwardGenerator_pass
    
    * Apply suggestions from code review
    
    * update StableVideoDiffusionPipelineSlowTests
    
    * update image
    
    * add diffusers example
    
    * fix more
    
    ---------
    Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
    Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
    Co-authored-by: apolinário <joaopaulo.passos@gmail.com>
resnet.py 51 KB