  1. 23 Mar, 2023 1 commit
  2. 22 Mar, 2023 1 commit
    • [MS Text To Video] Add first text to video (#2738) · ca1a2229
      Patrick von Platen authored
      
      
      * [MS Text To Video] Add first text to video
      
      * upload
      
      * make first model example
      
      * match unet3d params
      
      * make sure weights are correctly converted
      
      * improve
      
      * forward pass works, but diff result
      
      * make forward work
      
      * fix more
      
      * finish
      
      * refactor video output class.
      
      * feat: add support for a video export utility.
      
      * fix: opencv availability check.
      
      * run make fix-copies.
      
      * add: docs for the model components.
      
      * add: standalone pipeline doc.
      
      * edit docstring of the pipeline.
      
      * add: right path to TransformerTempModel
      
      * add: first set of tests.
      
      * complete fast tests for text to video.
      
      * fix bug
      
      * up
      
      * three fast tests failing.
      
      * add: note on slow tests
      
      * make work with all schedulers
      
      * apply styling.
      
      * add slow tests
      
      * change file name
      
      * update
      
      * more correction
      
      * more fixes
      
      * finish
      
      * up
      
      * Apply suggestions from code review
      
      * up
      
      * finish
      
      * make copies
      
      * fix pipeline tests
      
      * fix more tests
      
      * Apply suggestions from code review
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      * apply suggestions
      
      * up
      
      * revert
      
      ---------
      Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
  3. 02 Mar, 2023 1 commit
    • Add a ControlNet model & pipeline (#2407) · 8dfff7c0
      Takuma Mori authored
      
      
      * add scaffold
      - copied convert_controlnet_to_diffusers.py from
      convert_original_stable_diffusion_to_diffusers.py
      
      * Add support to load ControlNet (WIP)
      - this causes a Missing Key error on ControlNetModel
      
      * Update to convert ControlNet without error msg
      - init impl for StableDiffusionControlNetPipeline
      - init impl for ControlNetModel
      
      * cleanup of commented out
      
      * split create_controlnet_diffusers_config()
      from create_unet_diffusers_config()
      
      - add config: hint_channels
      
      * Add input_hint_block, input_zero_conv and
      middle_block_out
      - this causes a missing key error when loading the model
      
      * add unet_2d_blocks_controlnet.py
      - copied from unet_2d_blocks.py as impl CrossAttnDownBlock2D,DownBlock2D
      - this causes a missing key error when loading the model
      
      * Add loading for input_hint_block, zero_convs
      and middle_block_out
      
      - the model now loads without an error message
      
      * Copy from UNet2DConditionModel except __init__
      
      * Add ultra primitive test for ControlNetModel
      inference
      
      * Support ControlNetModel inference
      - without exceptions
      
      * copy forward() from UNet2DConditionModel
      
      * Impl ControlledUNet2DConditionModel inference
      - test_controlled_unet_inference passed
      
      * Frozen weight & biases for training
      
      * Minimized version of ControlNet/ControlledUnet
      - test_modules_controllnet.py passed
      
      * make style
      
      * Add support model loading for minimized ver
      
      * Remove all previous version files
      
      * from_pretrained and inference test passed
      
      * copied from pipeline_stable_diffusion.py
      except `__init__()`
      
      * Impl pipeline, pixel match test (almost) passed.
      
      * make style
      
      * make fix-copies
      
      * Fix to add import ControlNet blocks
      for `make fix-copies`
      
      * Remove einops dependency
      
      * Support np.ndarray, PIL.Image for controlnet_hint
      
      * set default config file as lllyasviel's
      
      * Add support grayscale (hw) numpy array
      
      * Add and update docstrings
      
      * add control_net.mdx
      
      * add control_net.mdx to toctree
      
      * Update copyright year
      
      * Fix to add PIL.Image RGB->BGR conversion
      - thanks @Mystfit
      
      * make fix-copies
      
      * add basic fast test for controlnet
      
      * add slow test for controlnet/unet
      
      * Ignore down/up_block len check on ControlNet
      
      * add a copy from test_stable_diffusion.py
      
      * Accept controlnet_hint is None
      
      * merge pipeline_stable_diffusion.py diff
      
      * Update class name to SDControlNetPipeline
      
      * make style
      
      * Baseline fast test almost passed (w long desc)
      
      * still needs investigation.

      The following did not pass, as described in the TODO comment:
      - test_stable_diffusion_long_prompt
      - test_stable_diffusion_no_safety_checker

      The following did not pass, same as in stable_diffusion_pipeline:
      - test_attention_slicing_forward_pass
      - test_inference_batch_single_identical
      - test_xformers_attention_forwardGenerator_pass
      These seem to come from calculation accuracy.
      
      * Add note comment related vae_scale_factor
      
      * add test_stable_diffusion_controlnet_ddim
      
      * add assertion for vae_scale_factor != 8
      
      * slow test of pipeline almost passed
      Failed: test_stable_diffusion_pipeline_with_model_offloading
      - ImportError: `enable_model_offload` requires `accelerate v0.17.0` or higher
      
      but currently latest version == 0.16.0
      
      * test_stable_diffusion_long_prompt passed
      
      * test_stable_diffusion_no_safety_checker passed
      
      - due to its model size, move to slow test
      
      * remove PoC test files
      
      * fix num_of_image, prompt length issue and add test
      
      * add support List[PIL.Image] for controlnet_hint
      
      * wip
      
      * all slow test passed
      
      * make style
      
      * update for slow test
      
      * RGB(PIL)->BGR(ctrlnet) conversion
      
      * fixes
      
      * remove manual num_images_per_prompt test
      
      * add document
      
      * add `image` argument docstring
      
      * make style
      
      * Add line to correct conversion
      
      * add controlnet_conditioning_scale (aka control_scales
      strength)
      
      * rgb channel ordering by default
      
      * image batching logic
      
      * Add control image descriptions for each checkpoint
      
      * Only save controlnet model in conversion script
      
      * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_controlnet.py
      
      typo
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

      * Update docs/source/en/api/pipelines/stable_diffusion/control_net.mdx
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

      * Update docs/source/en/api/pipelines/stable_diffusion/control_net.mdx
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

      * Update docs/source/en/api/pipelines/stable_diffusion/control_net.mdx
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

      * Update docs/source/en/api/pipelines/stable_diffusion/control_net.mdx
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

      * Update docs/source/en/api/pipelines/stable_diffusion/control_net.mdx
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

      * Update docs/source/en/api/pipelines/stable_diffusion/control_net.mdx
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

      * Update docs/source/en/api/pipelines/stable_diffusion/control_net.mdx
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

      * Update docs/source/en/api/pipelines/stable_diffusion/control_net.mdx
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

      * Update docs/source/en/api/pipelines/stable_diffusion/control_net.mdx
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      * add generated image example
      
      * a depth mask -> a depth map
      
      * rename control_net.mdx to controlnet.mdx
      
      * fix toc title
      
      * add ControlNet abstract and link
      
      * Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_controlnet.py
      Co-authored-by: dqueue <dbyqin@gmail.com>
      
      * remove controlnet constructor arguments re: @patrickvonplaten
      
      * [integration tests] test canny
      
      * test_canny fixes
      
      * [integration tests] test_depth
      
      * [integration tests] test_hed
      
      * [integration tests] test_mlsd
      
      * add channel order config to controlnet
      
      * [integration tests] test normal
      
      * [integration tests] test_openpose test_scribble
      
      * change height and width to default to conditioning image
      
      * [integration tests] test seg
      
      * style
      
      * test_depth fix
      
      * [integration tests] size fixes
      
      * [integration tests] cpu offloading
      
      * style
      
      * generalize controlnet embedding
      
      * fix conversion script
      
      * Update docs/source/en/api/pipelines/stable_diffusion/controlnet.mdx
      Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

      * Update docs/source/en/api/pipelines/stable_diffusion/controlnet.mdx
      Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

      * Update docs/source/en/api/pipelines/stable_diffusion/controlnet.mdx
      Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

      * Update docs/source/en/api/pipelines/stable_diffusion/controlnet.mdx
      Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
      
      * Style adapted to the documentation of pix2pix
      
      * merge main by hand
      
      * style
      
      * [docs] controlling generation doc nits
      
      * correct some things
      
      * add: controlnetmodel to autodoc.
      
      * finish docs
      
      * finish
      
      * finish 2
      
      * correct images
      
      * finish controlnet
      
      * Apply suggestions from code review
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      * uP
      
      * upload model
      
      * up
      
      * up
      
      ---------
      Co-authored-by: William Berman <WLBberman@gmail.com>
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      Co-authored-by: dqueue <dbyqin@gmail.com>
      Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
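The channel-order handling mentioned in the ControlNet commit above (RGB to BGR conversion, and accepting grayscale (h, w) arrays) can be illustrated with a minimal sketch. The helper names `to_hwc` and `rgb_to_bgr` are hypothetical, and repeating the gray value across three channels is an assumption. The sketch uses plain nested lists and is not the pipeline's actual code, which works on PIL images and NumPy arrays.

```python
# Illustrative only: hypothetical helpers, not the diffusers implementation.


def to_hwc(image):
    """Promote a grayscale (h, w) image to (h, w, 3) by repeating each value.

    Assumption: a grayscale hint is expanded to three identical channels.
    """
    first = image[0][0]
    if isinstance(first, (int, float)):
        return [[[value, value, value] for value in row] for row in image]
    return image  # already has a channel axis


def rgb_to_bgr(image):
    """Reverse the channel axis of an (h, w, 3) image."""
    return [[pixel[::-1] for pixel in row] for row in image]


rgb = [[[255, 0, 0], [0, 255, 0]]]  # one row: a red pixel, then a green pixel
bgr = rgb_to_bgr(rgb)               # red becomes [0, 0, 255]
gray = to_hwc([[7, 9]])             # (1, 2) grayscale -> (1, 2, 3)
```

Applying `rgb_to_bgr` twice returns the original image, which is why a single channel-order setting is enough to switch between the two conventions.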
  4. 01 Mar, 2023 1 commit
  5. 04 Jan, 2023 1 commit
    • Init for korean docs (#1910) · 75d53cc8
      Chanran Kim authored
      * init for korean docs
      
      * edit build yml file for multi language docs
      
      * edit one more build yml file for multi language docs
      
      * add title for get_frontmatter error
  6. 01 Jan, 2023 1 commit
  7. 30 Dec, 2022 1 commit
  8. 19 Dec, 2022 1 commit
  9. 14 Nov, 2022 1 commit
    • Add UNet 1d for RL model for planning + colab (#105) · 7c5fef81
      Nathan Lambert authored
      
      
      * re-add RL model code
      
      * match model forward api
      
      * add register_to_config, pass training tests
      
      * fix tests, update forward outputs
      
      * remove unused code, some comments
      
      * add to docs
      
      * remove extra embedding code
      
      * unify time embedding
      
      * remove conv1d output sequential
      
      * remove sequential from conv1dblock
      
      * style and deleting duplicated code
      
      * clean files
      
      * remove unused variables
      
      * clean variables
      
      * add 1d resnet block structure for downsample
      
      * rename as unet1d
      
      * fix renaming
      
      * rename files
      
      * add get_block(...) api
      
      * unify args for model1d like model2d
      
      * minor cleaning
      
      * fix docs
      
      * improve 1d resnet blocks
      
      * fix tests, remove permutes
      
      * fix style
      
      * add output activation
      
      * rename flax blocks file
      
      * Add Value Function and corresponding example script to Diffuser implementation (#884)
      
      * valuefunction code
      
      * start example scripts
      
      * missing imports
      
      * bug fixes and placeholder example script
      
      * add value function scheduler
      
      * load value function from hub and get best actions in example
      
      * very close to working example
      
      * larger batch size for planning
      
      * more tests
      
      * merge unet1d changes
      
      * wandb for debugging, use newer models
      
      * success!
      
      * turns out we just need more diffusion steps
      
      * run on modal
      
      * merge and code cleanup
      
      * use same api for rl model
      
      * fix variance type
      
      * wrong normalization function
      
      * add tests
      
      * style
      
      * style and quality
      
      * edits based on comments
      
      * style and quality
      
      * remove unused var
      
      * hack unet1d into a value function
      
      * add pipeline
      
      * fix arg order
      
      * add pipeline to core library
      
      * community pipeline
      
      * fix couple shape bugs
      
      * style
      
      * Apply suggestions from code review
      Co-authored-by: Nathan Lambert <nathan@huggingface.co>
      
      * update post merge of scripts
      
      * add midblock / outblock architecture
      
      * Pipeline cleanup (#947)
      
      * valuefunction code
      
      * start example scripts
      
      * missing imports
      
      * bug fixes and placeholder example script
      
      * add value function scheduler
      
      * load value function from hub and get best actions in example
      
      * very close to working example
      
      * larger batch size for planning
      
      * more tests
      
      * merge unet1d changes
      
      * wandb for debugging, use newer models
      
      * success!
      
      * turns out we just need more diffusion steps
      
      * run on modal
      
      * merge and code cleanup
      
      * use same api for rl model
      
      * fix variance type
      
      * wrong normalization function
      
      * add tests
      
      * style
      
      * style and quality
      
      * edits based on comments
      
      * style and quality
      
      * remove unused var
      
      * hack unet1d into a value function
      
      * add pipeline
      
      * fix arg order
      
      * add pipeline to core library
      
      * community pipeline
      
      * fix couple shape bugs
      
      * style
      
      * Apply suggestions from code review
      
      * clean up comments
      
      * convert older script to using pipeline and add readme
      
      * rename scripts
      
      * style, update tests
      
      * delete unet rl model file
      
      * remove imports in src
      Co-authored-by: Nathan Lambert <nathan@huggingface.co>
      
      * Update src/diffusers/models/unet_1d_blocks.py
      
      * Update tests/test_models_unet.py
      
      * RL Cleanup v2 (#965)
      
      * valuefunction code
      
      * start example scripts
      
      * missing imports
      
      * bug fixes and placeholder example script
      
      * add value function scheduler
      
      * load value function from hub and get best actions in example
      
      * very close to working example
      
      * larger batch size for planning
      
      * more tests
      
      * merge unet1d changes
      
      * wandb for debugging, use newer models
      
      * success!
      
      * turns out we just need more diffusion steps
      
      * run on modal
      
      * merge and code cleanup
      
      * use same api for rl model
      
      * fix variance type
      
      * wrong normalization function
      
      * add tests
      
      * style
      
      * style and quality
      
      * edits based on comments
      
      * style and quality
      
      * remove unused var
      
      * hack unet1d into a value function
      
      * add pipeline
      
      * fix arg order
      
      * add pipeline to core library
      
      * community pipeline
      
      * fix couple shape bugs
      
      * style
      
      * Apply suggestions from code review
      
      * clean up comments
      
      * convert older script to using pipeline and add readme
      
      * rename scripts
      
      * style, update tests
      
      * delete unet rl model file
      
      * remove imports in src
      
      * add specific vf block and update tests
      
      * style
      
      * Update tests/test_models_unet.py
      Co-authored-by: Nathan Lambert <nathan@huggingface.co>
      
      * fix quality in tests
      
      * fix quality style, split test file
      
      * fix checks / tests
      
      * make timesteps closer to main
      
      * unify block API
      
      * unify forward api
      
      * delete lines in examples
      
      * style
      
      * examples style
      
      * all tests pass
      
      * make style
      
      * make dance_diff test pass
      
      * Refactoring RL PR (#1200)
      
      * init file changes
      
      * add import utils
      
      * finish cleaning files, imports
      
      * remove import flags
      
      * clean examples
      
      * fix imports, tests for merge
      
      * update readmes
      
      * hotfix for tests
      
      * quality
      
      * fix some tests
      
      * change defaults
      
      * more mps test fixes
      
      * unet1d defaults
      
      * do not default import experimental
      
      * defaults for tests
      
      * fix tests
      
      * fix-copies
      
      * fix
      
      * changes per Patrick's comments (#1285)

      * changes per Patrick's comments
      
      * update conversion script
      
      * fix renaming
      
      * skip more mps tests
      
      * last test fix
      
      * Update examples/rl/README.md
      Co-authored-by: Ben Glickenhaus <benglickenhaus@gmail.com>
  10. 03 Nov, 2022 1 commit
    • VQ-diffusion (#658) · ef2ea33c
      Will Berman authored
      
      
      * Changes for VQ-diffusion VQVAE
      
      Add an option to specify the dimension of embeddings in VQModel:
      `VQModel` will by default set the dimension of embeddings to the number
      of latent channels. The VQ-diffusion VQVAE has a smaller
      embedding dimension, 128, than its number of latent channels, 256.
      
      Add AttnDownEncoderBlock2D and AttnUpDecoderBlock2D to the up and down
      unet block helpers. VQ-diffusion's VQVAE uses those two block types.
      
      * Changes for VQ-diffusion transformer
      
      Modify attention.py so SpatialTransformer can be used for
      VQ-diffusion's transformer.
      
      SpatialTransformer:
      - Can now operate over discrete inputs (classes of vector embeddings) as well as continuous.
      - `in_channels` was made optional in the constructor so two locations where it was passed as a positional arg were moved to kwargs
      - modified forward pass to take optional timestep embeddings
      
      ImagePositionalEmbeddings:
      - added to provide positional embeddings to discrete inputs for latent pixels
      
      BasicTransformerBlock:
      - norm layers were made configurable so that the VQ-diffusion could use AdaLayerNorm with timestep embeddings
      - modified forward pass to take optional timestep embeddings
      
      CrossAttention:
      - now may optionally take a bias parameter for its query, key, and value linear layers
      
      FeedForward:
      - Internal layers are now configurable
      
      ApproximateGELU:
      - Activation function in VQ-diffusion's feedforward layer
      
      AdaLayerNorm:
      - Norm layer modified to incorporate timestep embeddings
      
      * Add VQ-diffusion scheduler
      
      * Add VQ-diffusion pipeline
      
      * Add VQ-diffusion convert script to diffusers
      
      * Add VQ-diffusion dummy objects
      
      * Add VQ-diffusion markdown docs
      
      * Add VQ-diffusion tests
      
      * some renaming
      
      * some fixes
      
      * more renaming
      
      * correct
      
      * fix typo
      
      * correct weights
      
      * finalize
      
      * fix tests
      
      * Apply suggestions from code review
      Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
      
      * Apply suggestions from code review
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      * finish
      
      * finish
      
      * up
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
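Two of the layers named in the VQ-diffusion commit above, ApproximateGELU and AdaLayerNorm, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the library's code: it assumes the sigmoid-based GELU approximation `x * sigmoid(1.702 * x)`, and it assumes the timestep-dependent `scale` and `shift` are already computed and passed in as plain floats (in a real layer they would be predicted from the timestep embedding). All function names here are hypothetical.

```python
import math


def approximate_gelu(x: float) -> float:
    """Sigmoid-based GELU approximation: x * sigmoid(1.702 * x)."""
    return x / (1.0 + math.exp(-1.702 * x))


def layer_norm(values, eps=1e-5):
    """Normalize a vector to zero mean and (approximately) unit variance."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [(v - mean) / math.sqrt(var + eps) for v in values]


def ada_layer_norm(values, scale, shift):
    """Layer norm whose output is modulated by a timestep-derived scale and shift."""
    return [n * (1.0 + scale) + shift for n in layer_norm(values)]


activations = [approximate_gelu(x) for x in (-3.0, 0.0, 3.0)]
modulated = ada_layer_norm([1.0, 2.0, 3.0], scale=0.5, shift=0.1)
```

With `scale=0` and `shift=0` the adaptive norm reduces to an ordinary layer norm, so the timestep conditioning only changes the output when the modulation terms are non-zero.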
  11. 25 Oct, 2022 1 commit
  12. 06 Oct, 2022 2 commits
  13. 23 Sep, 2022 1 commit
    • Flax documentation (#589) · 8b0be935
      Younes Belkada authored
      
      
      * documenting `attention_flax.py` file
      
      * documenting `embeddings_flax.py`
      
      * documenting `unet_blocks_flax.py`
      
      * Add new objs to doc page
      
      * document `vae_flax.py`
      
      * Apply suggestions from code review
      
      * modify `unet_2d_condition_flax.py`
      
      * make style
      
      * Apply suggestions from code review
      
      * make style
      
      * Apply suggestions from code review
      
      * fix indent
      
      * fix typo
      
      * fix indent unet
      
      * Update src/diffusers/models/vae_flax.py
      
      * Apply suggestions from code review
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      Co-authored-by: Mishig Davaadorj <dmishig@gmail.com>
  14. 08 Sep, 2022 1 commit
  15. 07 Sep, 2022 1 commit
  16. 13 Jul, 2022 1 commit
    • Docs (#45) · c3d78cd3
      Nathan Lambert authored
      * first pass at docs structure
      
      * minor reformatting, add github actions for docs
      
      * populate docs (primarily from README, some writing)