• Add bark (#24086) · f42a35e6
    Yoach Lacombe authored
    
    
    * first raw version of the bark integration
    
    * working code on small models with single run
    
    * add converting script from suno weights 2 hf
    
    * many changes
    
    * correct past_kv output
    
    * working implementation for inference
    
    * update the converting script according to the architecture changes
    
    * add a working end-to-end inference code
    
    * remove some comments and make small changes
    
    * remove unnecessary comment
    
    * add docstrings and ensure no unnecessary intermediary output during audio generation
    
    * remove done TODOs
    
    * make style + add config docstrings
    
    * modification for batch inference support on the whole model
    
    * add details to .generation_audio method
    
    * add copyright
    
    * convert EncodecModel from original library to transformers implementation
    
    * add two classes in order to facilitate model and sub-models loading from the hub
    
    * add support of loading the whole model
    
    * add BarkProcessor
    
    * correct modeling according to processor output
    
    * Add proper __init__ and auto support
    
    * Add up-to-date copyright/license message
    
    * add relative import instead of absolute
    
    * cleaner head_dim computation
    
    * small comment removal or changes
    
    * more verbose LayerNorm init method
    
    * specify eps for clearer comprehension
    
    * more verbose variable naming in the MLP module
    
    * remove unnecessary BarkBlock parameter
    
    * clearer code in the forward pass of the BarkBlock
    
    * remove _initialize_modules method for cleaner code
    
    * Remove unnecessary methods from sub-models
    
    * move code to remove unnecessary function
    
    * rename a variable for clarity and change an assert
    
    * move code and change variable name for clarity
    
    * remove unnecessary asserts
    
    * correct small bug
    
    * correct a comment
    
    * change variable names for clarity
    
    * remove asserts
    
    * change import from absolute to relative
    
    * correct small error due to comma missing + correct import
    
    * Add attribute Bark config
    
    * add first version of tests
    
    * update attention_map
    
    * add tie_weights and resize_token_embeddings for fineModel
    
    * correct getting attention_mask in generate_text_semantic
    
    * remove Bark inference trick
    
    * leave more choices in barkProcessor
    
    * remove _no_split_modules
    
    * fix error in forward of block and introduce clearer notations
    
    * correct converting script with last changes
    
    * make style + add draft bark.mdx
    
    * correct BarkModelTest::test_generate_text_semantic
    
    * add Bark in main README
    
    * add dummy_pt_objects for Bark
    
    * add missing models in the main init
    
    * correct test_decoder_model_past_with_large_inputs
    
    * disable torchscript test
    
    * change docstring of BarkProcessor
    
    * Add test_processor_bark
    
    * make style
    
    * correct copyrights
    
    * add bark.mdx + make style, quality and consistency
    
    * Apply suggestions from code review
    Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
    
    * Remove unnecessary test method
    
    * simplify logic of a test
    
    * Only check first ids for slow audio generation
    
    * split full end-to-end generation tests
    
    * remove unnecessary comment
    
    * change submodel names for clearer naming
    
    * remove ModuleDict from modeling_bark
    
    * combine two if statements
    
    * ensure that an edge-case misuse won't happen
    
    * modify variable name
    
    * move code snippet to the right place (coarse instead of semantic)
    
    * change BarkSemanticModule -> BarkSemanticModel
    
    * align BarkProcessor with transformers paradigm
    
    * correct BarkProcessor tests with last commit changes
    
    * change _validate_voice_preset to an instance method instead of a class method
    
    * tie_weights already called with post_init
    
    * add codec_model config to configuration
    
    * update bark modeling tests with recent BarkProcessor changes
    
    * remove SubModelPretrainedModel + change speakers embeddings prompt type in BarkModel
    
    * change absolute imports to relative
    
    * remove TODO
    
    * change docstrings
    
    * add examples to docs and docstrings
    
    * make style
    
    * use BatchFeature in BarkProcessor instead of dict
    
    * continue improving docstrings and docs + make style
    
    * correct docstrings examples
    
    * more comprehensible speaker_embeddings load/save
    
    * rename speaker_embeddings_dict -> speaker_embeddings
    
    * correct bark.mdx + add bark to documentation_tests
    
    * correct docstrings configuration_bark
    
    * integrate last nit suggestions
    
    * integrate BarkGeneration configs
    
    * make style
    
    * remove bark tests from documentation_tests.txt because timeout - tested manually
    
    * add proper generation config initialization
    
    * small bark.mdx documentation changes
    
    * rename bark.mdx -> bark.md
    
    * add torch.no_grad behind BarkModel.generate_audio()
    
    * replace assert by ValueError in convert_suno_to_hf.py
    
    * integrate a series of short comments from reviewer
    
    * move SemanticLogitsProcessors and remove .detach() from Bark docs and docstrings
    
    * actually remove SemanticLogitsProcessor from modeling_bark.py
    
    * BarkProcessor returns a single output instead of tuple + correct docstrings
    
    * make style + correct bug
    
    * add initializer_range to BarkConfig + correct slow modeling tests
    
    * add .clone() to history_prompt.coarse_prompt to avoid modifying input array
    
    * Making sure no extra "`" are present
    
    * remove extra characters in modeling_bark.py
    
    * Correct output if history_prompt is None
    
    * remove TODOs
    
    * remove ravel comment
    
    * completing generation_configuration_bark.py docstrings
    
    * change docstrings - number of audio codebooks instead of Encodec codebooks
    
    * change 'bias' docstrings in configuration_bark.py
    
    * format code
    
    * rename BarkModel.generate_audio -> BarkModel.generate_speech
    
    * modify AutoConfig instead of EncodecConfig in BarkConfig
    
    * correct AutoConfig wrong init
    
    * refactor BarkModel and sub-models generate_coarse, generate_fine, generate_text_semantic
    
    * remove SemanticLogitsProcessor and replace it with SuppressTokensLogitsProcessor
    
    * move nb_codebook related config arguments to BarkFineConfig
    
    * rename bark.mdx -> bark.md
    
    * correcting BarkModelConfig from_pretrained + remove keys_to_ignore
    
    * correct bark.md with correct hub path
    
    * correct code bug in bark.md
    
    * correct list tokens_to_suppress
    
    * modify Processor to load nested speaker embeddings in a safer way
    
    * correct batch sampling in BarkFineModel.generate_fine
    
    * Apply suggestions from code review
    
    Small docstrings correction and code improvements
    Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
    
    * give more details about num_layers in docstrings
    
    * correct indentation mistake
    
    * correct submodelconfig order of docstring variables
    
    * put audio models in alphabetical order in utils/check_repo.py
    
    * remove useless line from test_modeling_bark.py
    
    * make BarkCoarseModelTest inherit from (ModelTesterMixin, GenerationTesterMixin, unittest.TestCase) instead of BarkSemanticModelTest
    
    * make a Tester class for each sub-model instead of inheriting
    
    * add test_resize_embeddings=True for Bark sub-models
    
    * add Copied from transformers.models.gpt_neo.modeling_gpt_neo.GPTNeoSelfAttention._split_heads
    
    * remove 'Copied from Bark' comment
    
    * remove unnecessary comment
    
    * change np.min -> min in modeling_bark.py
    
    * refactored all custom layers to have Bark prefix
    
    * add attention_mask as an argument of generate_text_semantic
    
    * refactor sub-models start docstrings to have more precise config class definition
    
    * move _tied_weights_keys overriding
    
    * add docstrings to generate_xxx in modeling_bark.py
    
    * add loading whole BarkModel to convert_suno_to_hf
    
    * refactor attribute and variable names
    
    * make style convert_suno
    
    * update bark checkpoints
    
    * remove never entered if statement
    
    * move bark_modeling docstrings after BarkPretrainedModel class definition
    
    * refactor modeling_bark.py: kv -> key_values
    
    * small nits - code refactoring and removing unnecessary lines from _init_weights
    
    * nits - replace inplace method by variable assigning
    
    * remove *optional* when necessary
    
    * remove some lines in generate_speech
    
    * add default value for optional parameter
    
    * Refactor preprocess_histories_before_coarse -> preprocess_histories
    Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
    
    * correct usage after refactoring
    
    * refactor Bark's generate_xxx -> generate and modify docstrings and tests accordingly
    
    * update docstrings python in configuration_bark.py
    
    * add bark files in utils/documentation_tests.txt
    
    * correct docstrings python snippet
    
    * add the ability to use parameters in the form of e.g. coarse_temperature
    
    * add semantic_max_new_tokens in python snippet in docstrings for quicker generation
    
    * Reformat sub-models kwargs in BarkModel.generate
    Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
    
    * correct kwargs in BarkModel.generate
    
    * correct attention_mask kwarg in BarkModel.generate
    
    * add tests for sub-models args in BarkModel.generate and correct BarkFineModel.test_generate_fp16
    
    * enrich BarkModel.generate docstrings with a description of how to use the kwargs
    
    ---------
    Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
    Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
    Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>