"docs/vscode:/vscode.git/clone" did not exist on "56bea6b4a110e7e976b8881eb39a21158c2fa705"
    VQ-diffusion (#658) · ef2ea33c
    Will Berman authored
    
    
    * Changes for VQ-diffusion VQVAE
    
    Allow specifying the embedding dimension in VQModel:
    `VQModel` by default sets the embedding dimension to the number
    of latent channels. The VQ-diffusion VQVAE uses a smaller
    embedding dimension (128) than its number of latent channels (256).
    
    Add AttnDownEncoderBlock2D and AttnUpDecoderBlock2D to the up and down
    unet block helpers. VQ-diffusion's VQVAE uses those two block types.
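
To illustrate why the embedding dimension needs to be configurable separately from the latent channels, here is a minimal pure-Python sketch, not the `VQModel` code itself. It assumes the latent vector is projected down to the embedding dimension, snapped to its nearest codebook entry, and projected back up. The names `project`, `nearest_code`, and `quantize`, and the toy sizes (4 latent channels, 2 embedding dimensions in place of 256 and 128), are hypothetical.

```python
import random


def project(vec, weight):
    """Apply a linear map; weight is a list of rows with shape (out_dim, in_dim)."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weight]


def nearest_code(vec, codebook):
    """Return the index of the codebook entry closest to vec (squared L2 distance)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(range(len(codebook)), key=lambda i: sq_dist(vec, codebook[i]))


def quantize(latent, down, codebook, up):
    """Project to the embedding dimension, quantize, then project back."""
    z = project(latent, down)             # latent_channels -> embed_dim
    idx = nearest_code(z, codebook)       # discrete code index
    return idx, project(codebook[idx], up)  # embed_dim -> latent_channels


# Toy sizes standing in for 256 latent channels and a 128-dim embedding.
latent_channels, embed_dim, num_codes = 4, 2, 8
rng = random.Random(0)
down = [[rng.uniform(-1, 1) for _ in range(latent_channels)] for _ in range(embed_dim)]
up = [[rng.uniform(-1, 1) for _ in range(embed_dim)] for _ in range(latent_channels)]
codebook = [[rng.uniform(-1, 1) for _ in range(embed_dim)] for _ in range(num_codes)]

latent = [0.5, -0.2, 0.1, 0.9]
idx, restored = quantize(latent, down, codebook, up)
print(idx, len(restored))
```

The codebook vectors only ever live in the smaller space, so the two sizes can differ as long as the projections on either side agree with them.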
    
    * Changes for VQ-diffusion transformer
    
    Modify attention.py so SpatialTransformer can be used for
    VQ-diffusion's transformer.
    
    SpatialTransformer:
    - Can now operate over discrete inputs (classes of vector embeddings) as well as continuous.
    - `in_channels` is now optional in the constructor, so the two call sites that passed it positionally now pass it as a keyword argument
    - modified forward pass to take optional timestep embeddings
    
    ImagePositionalEmbeddings:
    - added to provide positional embeddings to discrete inputs for latent pixels
    
    BasicTransformerBlock:
    - norm layers are now configurable so that VQ-diffusion can use AdaLayerNorm with timestep embeddings
    - modified forward pass to take optional timestep embeddings
    
    CrossAttention:
    - now may optionally take a bias parameter for its query, key, and value linear layers
    
    FeedForward:
    - Internal layers are now configurable
    
    ApproximateGELU:
    - Activation function in VQ-diffusion's feedforward layer
    
    AdaLayerNorm:
    - Norm layer modified to incorporate timestep embeddings
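
As a rough illustration of the last two components, the sketch below shows the commonly used sigmoid approximation of GELU and one plausible form of a timestep-conditioned layer norm. This is an assumption-laden sketch in pure Python, not the code added in this commit: the constant 1.702 comes from the GELU paper's sigmoid approximation, the `normed * (1 + scale) + shift` form is assumed, and the per-timestep lookup table `timestep_params` is a hypothetical stand-in for a learned timestep embedding.

```python
import math


def gelu_exact(x):
    """Exact GELU: x * Phi(x), with Phi the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def gelu_sigmoid_approx(x):
    """Sigmoid approximation of GELU: x * sigmoid(1.702 * x)."""
    return x / (1.0 + math.exp(-1.702 * x))


def layer_norm(x, eps=1e-5):
    """Normalize a vector to zero mean and unit variance (no learned affine)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def ada_layer_norm(x, scale, shift):
    """Layer norm whose scale and shift are supplied per timestep (assumed form)."""
    return [n * (1.0 + s) + b for n, s, b in zip(layer_norm(x), scale, shift)]


# Hypothetical per-timestep (scale, shift) pairs; a real model would compute
# these from a learned timestep embedding instead of a fixed table.
timestep_params = {
    0: ([0.0] * 4, [0.0] * 4),  # reduces to plain layer norm
    1: ([0.5] * 4, [1.0] * 4),
}

hidden = [1.0, 2.0, 3.0, 4.0]
scale, shift = timestep_params[1]
conditioned = ada_layer_norm(hidden, scale, shift)
activated = [gelu_sigmoid_approx(v) for v in conditioned]
print(activated)
```

With zero scale and shift the adaptive norm falls back to an ordinary layer norm, which makes the timestep conditioning easy to check in isolation.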
    
    * Add VQ-diffusion scheduler
    
    * Add VQ-diffusion pipeline
    
    * Add VQ-diffusion convert script to diffusers
    
    * Add VQ-diffusion dummy objects
    
    * Add VQ-diffusion markdown docs
    
    * Add VQ-diffusion tests
    
    * some renaming
    
    * some fixes
    
    * more renaming
    
    * correct
    
    * fix typo
    
    * correct weights
    
    * finalize
    
    * fix tests
    
    * Apply suggestions from code review
    Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
    
    * Apply suggestions from code review
    Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
    
    * finish
    
    * finish
    
    * up
    Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
    Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
    Co-authored-by: Pedro Cuenca <pedro@huggingface.co>