• Add Musicgen (#24109) · 1c1c9075
    Sanchit Gandhi authored
    
    
    * Add Audiocraft
    
    * add cross attention
    
    * style
    
    * add for lm
    
    * convert and verify
    
    * introduce t5
    
    * split configs
    
    * load t5 + lm
    
    * clean conversion
    
    * copy from t5
    
    * style
    
    * start pattern provider
    
    * make generation work
    
    * style
    
    * fix pos embs
    
    * propagate shape changes
    
    * propagate shape changes
    
    * style
    
    * delay pattern: pad tokens at end
    
    * audiocraft -> musicgen
    
    * fix inits
    
    * add mdx
    
    * style
    
    * fix pad token in processor
    
    * override generate and add todos
    
    * add init to test
    
    * undo pattern delay mask after gen
    
    * remove cfg logits processor
    
    * remove cfg logits processor
    
    * remove logits processor in favour of mask
    
    * clean pos embs
    
    * make fix copies
    
    * update readmes
    
    * clean pos emb
    
    * refactor encoder/decoder
    
    * make fix copies
    
    * update conversion
    
    * fix config imports
    
    * update config docs
    
    * make style
    
    * send pattern mask to device
    
    * pattern mask with delay
    
    * recover prompted audio tokens
    
    * fix docstrings
    
    * laydown test file
    
    * pattern edge case
    
    * remove t5 ref
    
    * add processing class
    
    * config refactor
    
    * better pattern comment
    
    * check if mask is not present
    
    * check if mask is not present
    
    * refactor to auto class
    
    * remove encoder configs
    
    * fix processor
    
    * processor import
    
    * start updating conversion
    
    * start updating tests
    
    * make style
    
    * convert t5, encodec, lm
    
    * convert as composite
    
    * also convert processor
    
    * run generate
    
    * classifier free gen
    
    * comments and clean up
    
    * make style
    
    * docs for logit proc
    
    * docstring for uncond gen
    
    * start lm tests
    
    * work tests
    
    * let the lm generate
    
    * refactor: reshape inside forward
    
    * undo greedy loop changes
    
    * from_enc_dec -> from_sub_model
    
    * fix input id shapes in docstrings
    
    * Apply suggestions from code review
    Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
    
    * undo generate changes
    
    * from sub model config
    
    * Update src/transformers/models/musicgen/modeling_musicgen.py
    Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
    
    * make generate work again
    
    * generate uncond -> get uncond inputs
    
    * remove prefix allowed tokens fn
    
    * better error message
    
    * logit proc checks
    
    * Apply suggestions from code review
    Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
    
    * make decoder only tests work
    
    * composite fast tests
    
    * make style
    
    * uncond generation
    
    * feat extr padding
    
    * make audio prompt work
    
    * fix inputs docstrings
    
    * unconditional inputs: dict -> model output
    
    * clean up tests
    
    * more clean up tests
    
    * make style
    
    * t5 encoder -> auto text encoder
    
    * remove comments
    
    * deal with frames
    
    * fix auto text
    
    * slow tests
    
    * nice mdx
    
    * remove can generate
    
    * todo - hub id
    
    * convert m/l
    
    * make fix copies
    
    * only import generation with torch
    
    * ignore decoder from tests
    
    * don't wrap uncond inputs
    
    * make style
    
    * cleaner uncond inputs
    
    * add example to musicgen forward
    
    * fix docs
    
    * ignore MusicGen Model/ForConditionalGeneration in auto mapping
    
    * add doc section to toctree
    
    * add to doc tests
    
    * add processor tests
    
    * fix push to hub in conversion
    
    * tips for decoder only loading
    
    * Apply suggestions from code review
    Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
    
    * fix conversion for s / m / l checkpoints
    
    * import stopping criteria from module
    
    * remove from pipeline tests
    
    * fix uncond docstring
    
    * decode audio method
    
    * fix docs
    
    * org: sanchit-gandhi -> facebook
    
    * fix max pos embeddings
    
    * remove auto doc (not compatible with shapes)
    
    * bump max pos emb
    
    * make style
    
    * fix doc
    
    * fix config doc
    
    * fix config doc
    
    * ignore musicgen config from docstring
    
    * make style
    
    * fix config
    
    * fix config for doctest
    
    * consistent from_sub_models
    
    * don't automap decoder
    
    * fix mdx save audio file
    
    * fix mdx save audio file
    
    * processor batch decode for audio
    
    * remove keys to ignore
    
    * update doc md
    
    * update generation config
    
    * allow changes for default generation config
    
    * update tests
    
    * make style
    
    * fix docstring for uncond
    
    * fix processor test
    
    * fix processor test
    
    ---------
    Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
    Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
    Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>