1. 18 Dec, 2022 1 commit
    • Will Berman's avatar
      kakaobrain unCLIP (#1428) · 2dcf64b7
      Will Berman authored
      
      
      * [wip] attention block updates
      
      * [wip] unCLIP unet decoder and super res
      
      * [wip] unCLIP prior transformer
      
      * [wip] scheduler changes
      
      * [wip] text proj utility class
      
      * [wip] UnCLIPPipeline
      
      * [wip] kakaobrain unCLIP convert script
      
      * [unCLIP pipeline] fixes re: @patrickvonplaten
      
      remove callbacks
      
      move denoising loops into call function
      
      * UNCLIPScheduler re: @patrickvonplaten
      
      Revert changes to DDPMScheduler. Make UNCLIPScheduler, a modified
      DDPM scheduler with changes to support karlo
      
      * mask -> attention_mask re: @patrickvonplaten
      
      * [DDPMScheduler] remove leftover change
      
      * [docs] PriorTransformer
      
      * [docs] UNet2DConditionModel and UNet2DModel
      
      * [nit] UNCLIPScheduler -> UnCLIPScheduler
      
      matches existing unclip naming better
      
      * [docs] SchedulingUnCLIP
      
      * [docs] UnCLIPTextProjModel
      
      * refactor
      
      * finish licenses
      
      * rename all to attention_mask and prep in models
      
      * more renaming
      
      * don't expose unused configs
      
      * final renaming fixes
      
      * remove x attn mask when not necessary
      
      * configure kakao script to use new class embedding config
      
      * fix copies
      
      * [tests] UnCLIPScheduler
      
      * finish x attn
      
      * finish
      
      * remove more
      
      * rename condition blocks
      
      * clean more
      
      * Apply suggestions from code review
      
      * up
      
      * fix
      
      * [tests] UnCLIPPipelineFastTests
      
      * remove unused imports
      
      * [tests] UnCLIPPipelineIntegrationTests
      
      * correct
      
      * make style
      Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
      2dcf64b7
  2. 16 Nov, 2022 1 commit
  3. 03 Nov, 2022 1 commit
    • Will Berman's avatar
      VQ-diffusion (#658) · ef2ea33c
      Will Berman authored
      
      
      * Changes for VQ-diffusion VQVAE
      
      Add specify dimension of embeddings to VQModel:
      `VQModel` will by default set the dimension of embeddings to the number
      of latent channels. The VQ-diffusion VQVAE has a smaller
      embedding dimension, 128, than number of latent channels, 256.
      
      Add AttnDownEncoderBlock2D and AttnUpDecoderBlock2D to the up and down
      unet block helpers. VQ-diffusion's VQVAE uses those two block types.
      
      * Changes for VQ-diffusion transformer
      
      Modify attention.py so SpatialTransformer can be used for
      VQ-diffusion's transformer.
      
      SpatialTransformer:
      - Can now operate over discrete inputs (classes of vector embeddings) as well as continuous.
      - `in_channels` was made optional in the constructor so two locations where it was passed as a positional arg were moved to kwargs
      - modified forward pass to take optional timestep embeddings
      
      ImagePositionalEmbeddings:
      - added to provide positional embeddings to discrete inputs for latent pixels
      
      BasicTransformerBlock:
      - norm layers were made configurable so that the VQ-diffusion could use AdaLayerNorm with timestep embeddings
      - modified forward pass to take optional timestep embeddings
      
      CrossAttention:
      - now may optionally take a bias parameter for its query, key, and value linear layers
      
      FeedForward:
      - Internal layers are now configurable
      
      ApproximateGELU:
      - Activation function in VQ-diffusion's feedforward layer
      
      AdaLayerNorm:
      - Norm layer modified to incorporate timestep embeddings
      
      * Add VQ-diffusion scheduler
      
      * Add VQ-diffusion pipeline
      
      * Add VQ-diffusion convert script to diffusers
      
      * Add VQ-diffusion dummy objects
      
      * Add VQ-diffusion markdown docs
      
      * Add VQ-diffusion tests
      
      * some renaming
      
      * some fixes
      
      * more renaming
      
      * correct
      
      * fix typo
      
      * correct weights
      
      * finalize
      
      * fix tests
      
      * Apply suggestions from code review
      Co-authored-by: default avatarAnton Lozhkov <aglozhkov@gmail.com>
      
      * Apply suggestions from code review
      Co-authored-by: default avatarPedro Cuenca <pedro@huggingface.co>
      
      * finish
      
      * finish
      
      * up
      Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
      Co-authored-by: default avatarAnton Lozhkov <aglozhkov@gmail.com>
      Co-authored-by: default avatarPedro Cuenca <pedro@huggingface.co>
      ef2ea33c