    Music Spectrogram diffusion pipeline (#1044) · 2ef9bdd7
    Kashif Rasul authored
    
    
    * initial TokenEncoder and ContinuousEncoder
    
    * initial modules
    
    * added ContinuousContextTransformer
    
    * fix copy paste error
    
    * use numpy for get_sequence_length
    
    * initial terminal relative positional encodings
    
    * fix weights keys
    
    * fix assert
    
    * cross attend style: concat encodings
    
    * make style
    
    * concat once
    
    * fix formatting
    
    * Initial SpectrogramPipeline
    
    * fix input_tokens
    
    * make style
    
    * added mel output
    
    * ignore weights for config
    
    * move mel to numpy
    
    * import pipeline
    
    * fix class names and import
    
    * moved models to models folder
    
    * import ContinuousContextTransformer and SpectrogramDiffusionPipeline
    
    * initial spec diffusion conversion script
    
    * renamed config to t5config
    
    * added weight loading
    
    * use arguments instead of t5config
    
    * broadcast noise time to batch dim
    
    * fix call
    
    * added scale_to_features
    
    * fix weights
    
    * transpose layernorm weight
    
    * scale is a vector
    
    * scale the query outputs
    
    * added comment
    
    * undo scaling
    
    * undo depth_scaling
    
    * initial get_extended_attention_mask
    
    * attention_mask is none in self-attention
    
    * cleanup
    
    * manually invert attention
    
    * nn.Linear needs bias=False
    
    * added T5LayerFFCond
    
    * remove to fix conflict
    
    * make style and dummy
    
    * remove unused variables
    
    * remove predict_epsilon
    
    * Move accelerate to a soft-dependency (#1134)
    
    * finish
    
    * finish
    
    * Update src/diffusers/modeling_utils.py
    
    * Update src/diffusers/pipeline_utils.py
    Co-authored-by: Anton Lozhkov <anton@huggingface.co>
    
    * more fixes
    
    * fix
    Co-authored-by: Anton Lozhkov <anton@huggingface.co>
    
    * fix order
    
    * added initial midi to note token data pipeline
    
    * added int to int tokenizer
    
    * remove duplicate
    
    * added logic for segments
    
    * add melgan to pipeline
    
    * move autoregressive gen into pipeline
    
    * added note_representation_processor_chain
    
    * fix dtypes
    
    * remove immutabledict req
    
    * initial doc
    
    * use np.where
    
    * require note_seq
    
    * fix typo
    
    * update dependency
    
    * added note-seq to test
    
    * added is_note_seq_available
    
    * fix import
    
    * added toc
    
    * added example usage
    
    * undo for now
    
    * moved docs
    
    * fix merge
    
    * fix imports
    
    * predict first segment
    
    * avoid un-needed copy to and from cpu
    
    * make style
    
    * Copyright
    
    * fix style
    
    * add test and fix inference steps
    
    * remove bogus files
    
    * reorder models
    
    * up
    
    * remove transformers dependency
    
    * make work with diffusers cross attention
    
    * clean more
    
    * remove @
    
    * improve further
    
    * up
    
    * uP
    
    * Apply suggestions from code review
    
    * Update tests/pipelines/spectrogram_diffusion/test_spectrogram_diffusion.py
    
    * loop over all tokens
    
    * make style
    
    * Added a section on the model
    
    * fix formatting
    
    * grammar
    
    * formatting
    
    * make fix-copies
    
    * Update src/diffusers/pipelines/__init__.py
    Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
    
    * Update src/diffusers/pipelines/spectrogram_diffusion/pipeline_spectrogram_diffusion.py
    Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
    
    * added callback and optional onnx
    
    * do not squeeze batch dim
    
    * clean up more
    
    * upload
    
    * convert jax to numpy
    
    * make style
    
    * fix warning
    
    * make fix-copies
    
    * fix warning
    
    * add initial fast tests
    
    * add initial pipeline_params
    
    * eval mode due to dropout
    
    * skip batch tests as pipeline runs on a single file
    
    * make style
    
    * fix relative path
    
    * fix doc tests
    
    * Update src/diffusers/models/t5_film_transformer.py
    Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
    
    * Update src/diffusers/models/t5_film_transformer.py
    Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
    
    * Update docs/source/en/api/pipelines/spectrogram_diffusion.mdx
    Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
    
    * Update tests/pipelines/spectrogram_diffusion/test_spectrogram_diffusion.py
    Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
    
    * Update tests/pipelines/spectrogram_diffusion/test_spectrogram_diffusion.py
    Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
    
    * Update tests/pipelines/spectrogram_diffusion/test_spectrogram_diffusion.py
    Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
    
    * Update tests/pipelines/spectrogram_diffusion/test_spectrogram_diffusion.py
    Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
    
    * add MidiProcessor
    
    * format
    
    * fix org
    
    * Apply suggestions from code review
    
    * Update tests/pipelines/spectrogram_diffusion/test_spectrogram_diffusion.py
    
    * make style
    
    * pin protobuf to <4
    
    * fix formatting
    
    * white space
    
    * tensorboard needs protobuf
    
    ---------
    Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
    Co-authored-by: Anton Lozhkov <anton@huggingface.co>
notes_encoder.py 2.85 KB