- 18 Jul, 2023 1 commit
-
-
NielsRogge authored
* First draft * More improvements * Convert patch embedding layer * Convert all weights * Make conversion work * Improve conversion script * Fix style * Make all tests pass * Add image processor to auto mapping * Add swiglu ffn * Add image processor to conversion script * Fix conversion of giant model * Fix documentation * Fix style * Fix tests * Address comments * Address more comments * Remove unused arguments * Remove more arguments * Rename parameters * Include mask token * Address comments * Add docstring * Transfer checkpoints * Empty commit
-
- 17 Jul, 2023 2 commits
-
-
Sylvain Gugger authored
-
Yoach Lacombe authored
* first raw version of the bark integration * working code on small models with single run * add converting script from suno weights 2 hf * many changes * correct past_kv output * working implementation for inference * update the converting script according to the architecture changes * add a working end-to-end inference code * remove some comments and make small changes * remove unecessary comment * add docstrings and ensure no unecessary intermediary output during audio generation * remove done TODOs * make style + add config docstrings * modification for batch inference support on the whole model * add details to .generation_audio method * add copyright * convert EncodecModel from original library to transformers implementation * add two class in order to facilitate model and sub-models loading from the hub * add support of loading the whole model * add BarkProcessor * correct modeling according to processor output * Add proper __init__ and auto support * Add up-to-date copyright/license message * add relative import instead of absolute * cleaner head_dim computation * small comment removal or changes * more verbose LayerNorm init method * specify eps for clearer comprehension * more verbose variable naming in the MLP module * remove unecessary BarkBlock parameter * clearer code in the forward pass of the BarkBlock * remove _initialize_modules method for cleaner code * Remove unnecessary methods from sub-models * move code to remove unnecessary function * rename a variable for clarity and change an assert * move code and change variable name for clarity * remove unnecessary asserts * correct small bug * correct a comment * change variable names for clarity * remove asserts * change import from absolute to relative * correct small error due to comma missing + correct import * Add attribute Bark config * add first version of tests * update attention_map * add tie_weights and resize_token_embeddings for fineModel * correct getting attention_mask in 
generate_text_semantic * remove Bark inference trick * leave more choices in barkProcessor * remove _no_split_modules * fixe error in forward of block and introduce clearer notations * correct converting script with last changes * make style + add draft bark.mdx * correct BarkModelTest::test_generate_text_semantic * add Bark in main README * add dummy_pt_objects for Bark * add missing models in the main init * correct test_decoder_model_past_with_large_inputs * disable torchscript test * change docstring of BarkProcessor * Add test_processor_bark * make style * correct copyrights * add bark.mdx + make style, quality and consistency * Apply suggestions from code review Co-authored-by:
Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * Remove unnecessary test method * simply logic of a test * Only check first ids for slow audio generation * split full end-to-end generation tests * remove unneccessary comment * change submodel names for clearer naming * remove ModuleDict from modeling_bark * combine two if statements * ensure that an edge misued won't happen * modify variable name * move code snippet to the right place (coarse instead of semantic) * change BarkSemanticModule -> BarkSemanticModel * align BarkProcessor with transformers paradigm * correct BarkProcessor tests with last commit changes * change _validate_voice_preset to an instance method instead of a class method * tie_weights already called with post_init * add codec_model config to configuration * update bark modeling tests with recent BarkProcessor changes * remove SubModelPretrainedModel + change speakers embeddings prompt type in BarkModel * change absolute imports to relative * remove TODO * change docstrings * add examples to docs and docstrings * make style * uses BatchFeature in BarkProcessor insteads of dict * continue improving docstrings and docs + make style * correct docstrings examples * more comprehensible speaker_embeddings load/Save * rename speaker_embeddings_dict -> speaker_embeddings * correct bark.mdx + add bark to documentation_tests * correct docstrings configuration_bark * integrate last nit suggestions * integrate BarkGeneration configs * make style * remove bark tests from documentation_tests.txt because timeout - tested manually * add proper generation config initialization * small bark.mdx documentation changes * rename bark.mdx -> bark.md * add torch.no_grad behind BarkModel.generate_audio() * replace assert by ValueError in convert_suno_to_hf.py * integrate a series of short comments from reviewer * move SemanticLogitsProcessors and remove .detach() from Bark docs and docstrings * actually remove SemanticLogitsProcessor from 
modeling_bark.oy * BarkProcessor returns a single output instead of tuple + correct docstrings * make style + correct bug * add initializer_range to BarkConfig + correct slow modeling tests * add .clone() to history_prompt.coarse_prompt to avoid modifying input array * Making sure no extra "`" are present * remove extra characters in modeling_bark.py * Correct output if history_prompt is None * remove TODOs * remove ravel comment * completing generation_configuration_bark.py docstrings * change docstrings - number of audio codebooks instead of Encodec codebooks * change 'bias' docstrings in configuration_bark.py * format code * rename BarkModel.generate_audio -> BarkModel.generate_speech * modify AutoConfig instead of EncodecConfig in BarkConfig * correct AutoConfig wrong init * refactor BarkModel and sub-models generate_coarse, generate_fine, generate_text_semantic * remove SemanticLogitsProcessor and replace it with SuppressTokensLogitsProcessor * move nb_codebook related config arguments to BarkFineConfig * rename bark.mdx -> bark.md * correcting BarkModelConfig from_pretrained + remove keys_to_ignore * correct bark.md with correct hub path * correct code bug in bark.md * correct list tokens_to_suppress * modify Processor to load nested speaker embeddings in a safer way * correct batch sampling in BarkFineModel.generate_fine * Apply suggestions from code review Small docstrings correction and code improvements Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * give more details about num_layers in docstrings * correct indentation mistake * correct submodelconfig order of docstring variables * put audio models in alphabetical order in utils/check_repo.my * remove useless line from test_modeling_bark.py * makes BarkCoarseModelTest inherits from (ModelTesterMixin, GenerationTesterMixin, unittest.TestCase) instead of BarkSemanticModelTest * make a Tester class for each sub-model instead of inheriting * add test_resize_embeddings=True for Bark sub-models * add Copied from transformers.models.gpt_neo.modeling_gpt_neo.GPTNeoSelfAttention._split_heads * remove 'Copied fom Bark' comment * remove unneccessary comment * change np.min -> min in modeling_bark.py * refactored all custom layers to have Bark prefix * add attention_mask as an argument of generate_text_semantic * refactor sub-models start docstrings to have more precise config class definition * move _tied_weights_keys overriding * add docstrings to generate_xxx in modeling_bark.py * add loading whole BarkModel to convert_suno_to_hf * refactor attribute and variable names * make style convert_suno * update bark checkpoints * remove never entered if statement * move bark_modeling docstrings after BarkPretrainedModel class definition * refactor modeling_bark.py: kv -> key_values * small nits - code refactoring and removing unecessary lines from _init_weights * nits - replace inplace method by variable assigning * remove *optional* when necessary * remove some lines in generate_speech * add default value for optional parameter * Refactor preprocess_histories_before_coarse -> preprocess_histories Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * correct usage after refactoring * refactor Bark's generate_xxx -> generate and modify docstrings and tests accordingly * update docstrings python in configuration_bark.py * add bark files in utils/documentation_test.txt * correct docstrings python snippet * add the ability to use parameters in the form of e.g coarse_temperature * add semantic_max_new_tokens in python snippet in docstrings for quicker generation * Reformate sub-models kwargs in BakModel.generate Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * correct kwargs in BarkModel.generate * correct attention_mask kwarg in BarkModel.generate * add tests for sub-models args in BarkModel.generate and correct BarkFineModel.test_generate_fp16 * enrich BarkModel.generate docstrings with a description of how to use the kwargs --------- Co-authored-by:
Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
- 14 Jul, 2023 1 commit
-
-
Kadir Nar authored
* [🔗 Docs] Fixed Incorrect Migration Link * Update README.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
- 12 Jul, 2023 1 commit
-
-
Lysandre Debut authored
-
- 11 Jul, 2023 2 commits
-
-
Jegor Kitškerkin authored
* Add model * Add ability to get classification head weights * Add docs * Add imports to __init__.py * Run style * Fix imports and add mdx doc * Run style * Fix copyright * Fix config docstring * Remove imports of ViViTLayer and load_tf_weights_in_vivit * Remove FeatureExtractor and replace with ImageProcessor everywhere * Remove ViViTForPreTraining from vivit.mdx * Change ViViT -> Vivit everywhere * Add model_doc to _toctree.yml * Replace tuples with lists in arguments of VivitConfig * Rename patch_size to tubelet_size in TubeletEmbeddings * Fix checkpoint names * Add tests * Remove unused num_frames * Fix imports for VivitImageProcessor * Minor fixes * Decrease number of frames in VivitModelTester from 32 to 16 * Decrease number of frames in VivitModelTester from 16 to 8 * Add initialization for pos embeddings * Rename Vivit -> ViViT in some places * Fix docstring and formatting * Rename TubeletEmbeddings -> VivitTubeletEmbeddings * Remove load_tf_weights_in_vivit * Change checkpoint name * Remove Vivit _TOKENIZER_FOR_DOC * Fix * Fix VivitTubeletEmbeddings and pass config object as parameter * Use image_size and num_frames instead of video_size * Change conversion script and fix differences with the orig implementation * Fix docstrings * Add attention head pruning * Run style and fixup * Fix tests * Add ViViT to video_classification.mdx * Save processor in conversion script * Fix * Add image processor test * Run fixup and style * Run fix-copies * Update tests/models/vivit/test_modeling_vivit.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/vivit/test_modeling_vivit.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/vivit/modeling_vivit.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Use PyAV instead of decord * Add unittest.skip * Run style * Remove unneeded test * Update docs/source/en/model_doc/vivit.mdx Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/vivit/configuration_vivit.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/vivit/modeling_vivit.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/vivit/image_processing_vivit.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/vivit/modeling_vivit.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/vivit/modeling_vivit.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/vivit/image_processing_vivit.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/vivit/modeling_vivit.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Add model * Add docs * Run style * Fix imports and add mdx doc * Remove FeatureExtractor and replace with ImageProcessor everywhere * Change ViViT -> Vivit everywhere * Rename Vivit -> ViViT in some places * Update src/transformers/models/vivit/image_processing_vivit.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Run make style * Remove inputs save * Fix image processor * Fix * Run `make style` * Decrease parameters of VivitModelTester * Decrease tubelet size * Rename vivit.mdx * Update src/transformers/models/vivit/image_processing_vivit.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/vivit/image_processing_vivit.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/vivit/image_processing_vivit.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Fix default values in image_processing_vivit.py --------- Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Matt authored
* Initial commit * Update src/transformers/models/falcon/configuration_falcon.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/falcon/configuration_falcon.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Cleanup config docstring * Update src/transformers/models/falcon/configuration_falcon.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Convert to relative imports * Remove torch < 1.8 warning * Restructure cos_sin header * qkv -> query, key, value * Refactor attention calculation * Add a couple of config variables to account for the different checkpoints * Successful merging of the code paths! * Fix misplaced line in the non-parallel attention path * Update config and tests * Add a pad_token_id when testing * Support output_attentions when alibi is None * make fixup * Skip KV cache shape test * No more _keys_to_ignore_on_load_missing * Simplify self attention a bit * Simplify self attention a bit * make fixup * stash commit * Some more attention mask updates * Should pass all tests except assisted generation! * Add big model generation test * make fixup * Add temporary workaround for test * Test overrides for assisted generation * Update src/transformers/models/falcon/modeling_falcon.py Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/falcon/modeling_falcon.py Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/falcon/modeling_falcon.py Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update tests/models/falcon/test_modeling_falcon.py Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * Test overrides for assisted generation * Add generation demo * Update copyright * Make the docstring model actually small * Add module-level docstring * Remove all assertions * Add copied from bloom * Reformat the QKV layer * Add copied from bloom * Update src/transformers/models/falcon/modeling_falcon.py Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * Remove unused line and reformat * No single letter variables * Cleanup return names * Add copied from line * Remove the deprecated arguments blocks * Change the embeddings test to an alibi on/off test * Remove position_ids from FalconForQA * Remove old check for token type IDs * Fix the alibi path when multi_query is False * Update src/transformers/models/falcon/modeling_falcon.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/falcon/modeling_falcon.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/falcon/test_modeling_falcon.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update config naming * Fix typo for new_decoder_architecture * Add some comments * Fix docstring * Fix docstring * Create range in the right dtype from the start * Review comment cleanup * n_head_kv -> num_kv_heads * self.alibi -> self.use_alibi * self.num_kv -> self.num_kv_heads * Reorder config args * Made alibi arguments Optional * Add all model docstrings * Add extra checkpoints * Add author info for Falcon * Stop removing token_type_ids because our checkpoints shouldn't return it anymore * Add one hopeful comment for the future * Fix typo * Update tests, fix cache issue for generation * Use -1e9 instead of -inf to avoid float overflow * Recompute the rotary embeddings much less often * Re-enable disabled tests * One final fix to attention mask calculation, and update tests * Cleanup targeting falcon-40b equivalency * Post-rebase docs update * Update docstrings, especially in the config * More descriptive variable names, and comments where we can't rename them --------- Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
- 10 Jul, 2023 1 commit
-
-
novice authored
* Add all files * Update masked_language_modeling.md * fix mlm models * fix conflicts * fix conflicts * fix copies * Apply suggestions from code review Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Reduce seq_len and hidden_size in ModelTester * remove output_attentions * fix conflicts * remove copied from statements * Apply suggestions from code review Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
- 03 Jul, 2023 1 commit
-
-
Arthur authored
* add tokenization template * update conversion script * update modeling code * update * update convert checkpoint * update modeling * revert changes on convert script * new conversion script for new format * correct position bias * cleaning a bit * Credit co authors Co-authored-by:
agemagician <ahmed.elnaggar@tum.de> Co-authored-by: stefan-it <> * styling * Add docq * fix copies * add co author * Other Author * Merge branch 'main' of https://github.com/huggingface/transformers into add-umt5 * add testing * nit * Update docs/source/en/model_doc/umt5.mdx Co-authored-by:
Stefan Schweter <stefan@schweter.it> * fix t5 * actual fix? * revert wrong changes * remove * update test * more fixes * revert some changes * add SPIECE_UNDERLINE * add a commone xample * upfate * fix copies * revert changes on t5 conversion script * revert bytefallback changes since there was no addition yet * fixup * fixup * ingore umt5 cutom testing folder * fix readmes * revertT5 changes * same outputs * fixup * update example * Apply suggestions from code review * style * draft addition of all new files * current update * fix attention and stuff * finish refactoring * auto config * fixup * more nits * add umt5 to init * use md format * Update README.md Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * revert changes on mt5 * revert mt4 changes * update test * more fixes * add to mapping * fix-copies * fix copies * foix retain grad * fix some tests * nits * done * Update src/transformers/models/umt5/modeling_umt5.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/en/model_doc/umt5.md * Update src/transformers/models/umt5/__init__.py * Update docs/source/en/model_doc/umt5.md Co-authored-by:
Stefan Schweter <stefan@schweter.it> * Update src/transformers/models/umt5/modeling_umt5.py * update conversion script + use google checkpoints * nits * update test and modelling * stash slow convert * update fixupd * don't change slow --------- Co-authored-by: stefan-it <> Co-authored-by:
Stefan Schweter <stefan@schweter.it> Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
- 29 Jun, 2023 1 commit
-
-
Sanchit Gandhi authored
* Add Audiocraft * add cross attention * style * add for lm * convert and verify * introduce t5 * split configs * load t5 + lm * clean conversion * copy from t5 * style * start pattern provider * make generation work * style * fix pos embs * propagate shape changes * propagate shape changes * style * delay pattern: pad tokens at end * audiocraft -> musicgen * fix inits * add mdx * style * fix pad token in processor * override generate and add todos * add init to test * undo pattern delay mask after gen * remove cfg logits processor * remove cfg logits processor * remove logits processor in favour of mask * clean pos embs * make fix copies * update readmes * clean pos emb * refactor encoder/decoder * make fix copies * update conversion * fix config imports * update config docs * make style * send pattern mask to device * pattern mask with delay * recover prompted audio tokens * fix docstrings * laydown test file * pattern edge case * remove t5 ref * add processing class * config refactor * better pattern comment * check if mask is not present * check if mask is not present * refactor to auto class * remove encoder configs * fix processor * processor import * start updating conversion * start updating tests * make style * convert t5, encodec, lm * convert as composite * also convert processor * run generate * classifier free gen * comments and clean up * make style * docs for logit proc * docstring for uncond gen * start lm tests * work tests * let the lm generate * refactor: reshape inside forward * undo greedy loop changes * from_enc_dec -> from_sub_model * fix input id shapes in docstrings * Apply suggestions from code review Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com> * undo generate changes * from sub model config * Update src/transformers/models/musicgen/modeling_musicgen.py Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com> * make generate work again * generate uncond -> get uncond inputs * remove prefix allowed tokens fn * better error message * logit proc checks * Apply suggestions from code review Co-authored-by:
Joao Gante <joaofranciscocardosogante@gmail.com> * make decoder only tests work * composite fast tests * make style * uncond generation * feat extr padding * make audio prompt work * fix inputs docstrings * unconditional inputs: dict -> model output * clean up tests * more clean up tests * make style * t5 encoder -> auto text encoder * remove comments * deal with frames * fix auto text * slow tests * nice mdx * remove can generate * todo - hub id * convert m/l * make fix copies * only import generation with torch * ignore decoder from tests * don't wrap uncond inputs * make style * cleaner uncond inputs * add example to musicgen forward * fix docs * ignore MusicGen Model/ForConditionalGeneration in auto mapping * add doc section to toctree * add to doc tests * add processor tests * fix push to hub in conversion * tips for decoder only loading * Apply suggestions from code review Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * fix conversion for s / m / l checkpoints * import stopping criteria from module * remove from pipeline tests * fix uncond docstring * decode audio method * fix docs * org: sanchit-gandhi -> facebook * fix max pos embeddings * remove auto doc (not compatible with shapes) * bump max pos emb * make style * fix doc * fix config doc * fix config doc * ignore musicgen config from docstring * make style * fix config * fix config for doctest * consistent from_sub_models * don't automap decoder * fix mdx save audio file * fix mdx save audio file * processor batch decode for audio * remove keys to ignore * update doc md * update generation config * allow changes for default generation config * update tests * make style * fix docstring for uncond * fix processor test * fix processor test --------- Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by:
Joao Gante <joaofranciscocardosogante@gmail.com> Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
- 26 Jun, 2023 1 commit
-
-
NielsRogge authored
* Squash 88 commits * Use markdown * Remove mdx files due to bad rebase * Fix modeling files due to bad rebase * Fix style * Update comment * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
- 15 Jun, 2023 1 commit
-
-
Sayed Qaiser Ali authored
Update README.md Updated the tested versions
-
- 14 Jun, 2023 1 commit
-
-
Matthijs Hollemans authored
* boilerplate stuff * messing around with the feature extractor * fix feature extractor * unit tests for feature extractor * rename speech to audio * quick-and-dirty import of Meta's code * import weights (sort of) * cleaning up * more cleaning up * move encoder/decoder args into config * cleanup model * rename EnCodec -> Encodec * RVQ parameters in config * add slow test * add lstm init and test_init * Add save & load * finish EncodecModel * remove decoder_input_values as they are ont used anywhere (not removed from doc yet) * fix test feature extraction model name * Add better slow test * Fix tests * some fixup and cleaning * Improve further * cleaning up quantizer * fix up conversion script * test don't pass, _encode_fram does not work * update tests with output per encode and decode * more cleanup * rename _codebook * remove old config cruft * ratios & hop_length * use ModuleList instead of Sequential * clean up resnet block * update types * update tests * fixup * quick cleanup * fix padding * more styl,ing * add patrick feedback * fix copies * fixup * fix lstm * fix shape issues * fixup * rename conv layers * fixup * fix decoding * small conv refactoring * remove norm_params * simplify conv layers * rename conv layers * stuff * Clean up * Add padding logic use padding mask small conv refactoring remove norm_params simplify conv layers rename conv layers stuff add batched test update Clean up merge and update for padding fix padding fixup * clean up more * clean up more * More clean ups * cleanup convolutions * typo * fix typos * fixup * build PR doc? 
* start refactoring docstring * fix don't pad when no strid and chunk * update docstring * update docstring * nits * update going to lunch * update config and model * fix broken testse (becaue of the config changes) * fix scale computation * fixu[ * only return dict if speciefied or if config returns it * remove todos * update defaults in config * update conversion script * fix doctest * more docstring + fixup * nits on batched_tests * more nits * Apply suggestions from code review Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com> * update basxed on review * fix update * updaet tests * Apply suggestions from code review Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * fixup * add overlap and chunl_length_s * cleanup feature extraction * teste edge cases truncation and padding * correct processor values * update config encodec, nits * fix tests * fixup * fix 24Hz test * elle tests are green * fix fixup * Apply suggestions from code review * revert readme changes * fixup * add example * use facebook checkpoints * fix typo * no pipeline tests * use slef.pad everywhere we can * Apply suggestions from code review Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * update based on review * update * update mdx * fix bug and tests * fixup * fix doctest * remove comment * more nits * add more coverage for `test_truncation_and_padding` * fixup * add last test * fix text * nits * Update tests/models/encodec/test_modeling_encodec.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * take care of the last comments * typo * fix test * nits * fixup * Update src/transformers/models/encodec/feature_extraction_encodec.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by:
arthur.zucker@gmail.com <arthur.zucker@gmail.com> Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
- 07 Jun, 2023 1 commit
-
-
Sylvain Gugger authored
-
- 02 Jun, 2023 2 commits
-
-
Shehan Munasinghe authored
* generated code from add-new-model-like * Add code for modeling, config, and weight conversion * add tests for image-classification, update modeling and config * add code, tests for semantic-segmentation * make style, make quality, make fix-copies * make fix-copies * Update modeling_mobilevitv2.py fix bugs * Update _toctree.yml * update modeling, config fix bugs * Edit docs - fix bug MobileViTv2v2 -> MobileViTv2 * Update mobilevitv2.mdx * update docstrings * Update configuration_mobilevitv2.py make style * Update convert_mlcvnets_to_pytorch.py remove unused options * Update convert_mlcvnets_to_pytorch.py make style * Add suggestions from code review Co-Authored-By:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * make style, make quality * Add suggestions from code review Co-Authored-By:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Add suggestions from code review Remove MobileViTv2ImageProcessor Co-Authored-By:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * make style * Add suggestions from code review Rename MobileViTv2 -> MobileViTV2 Co-Authored-By:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Add suggestions from code review Co-Authored-By:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update modeling_mobilevitv2.py make style * Update serialization.mdx * Update modeling_mobilevitv2.py --------- Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Patrick von Platen authored
* add fine-tuned with adapter layer * Add set_target_lang to tokenizer * Implement load adapter * add tests * make style * Apply suggestions from code review * Update src/transformers/models/wav2vec2/tokenization_wav2vec2.py * make fix-copies * Apply suggestions from code review * make fix-copies * make style again * mkae style again * fix doc string * Update tests/models/wav2vec2/test_tokenization_wav2vec2.py * Apply suggestions from code review * fix * Correct wav2vec2 adapter * mkae style * Update src/transformers/models/wav2vec2/modeling_wav2vec2.py Co-authored-by:
Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * add more nice docs * finish * finish * Apply suggestions from code review Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Apply suggestions from code review Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Apply suggestions from code review * all finish --------- Co-authored-by:
Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
- 30 May, 2023 1 commit
-
-
Eli Simhayev authored
* ran `transformers-cli add-new-model-like` * added `AutoformerLayernorm` and `AutoformerSeriesDecomposition` * added `decomposition_layer` in `init` and `moving_avg` to config * added `AutoformerAutoCorrelation` to encoder & decoder * removed caninical self attention `AutoformerAttention` * added arguments in config and model tester. Init works!
😁 * WIP autoformer attention with autocorrlation * fixed `attn_weights` size * wip time_delay_agg_training * fixing sizes and debug time_delay_agg_training * aggregation in training works!😁 * `top_k_delays` -> `top_k_delays_index` and added `contiguous()` * wip time_delay_agg_inference * finish time_delay_agg_inference😎 * added resize to autocorrelation * bug fix: added the length of the output signal to `irfft` * `attention_mask = None` in the decoder * fixed test: changed attention expected size, `test_attention_outputs` works! * removed unnecessary code * apply AutoformerLayernorm in final norm in enc & dec * added series decomposition to the encoder * added series decomp to decoder, with inputs * added trend todos * added autoformer to README * added to index * added autoformer.mdx * remove scaling and init attention_mask in the decoder * make style * fix copies * make fix-copies * inital fix-copies * fix from https://github.com/huggingface/transformers/pull/22076 * make style * fix class names * added trend * added d_model and projection layers * added `trend_projection` source, and decomp layer init * added trend & seasonal init for decoder input * AutoformerModel cannot be copied as it has the decomp layer too * encoder can be copied from time series transformer * fixed generation and made distrb. 
out more robust * use context window to calculate decomposition * use the context_window for decomposition * use output_params helper * clean up AutoformerAttention * subsequences_length off by 1 * make fix copies * fix test * added init for nn.Conv1d * fix IGNORE_NON_TESTED * added model_doc * fix ruff * ignore tests * remove dup * fix SPECIAL_CASES_TO_ALLOW * do not copy due to conv1d weight init * remove unused imports * added short summary * added label_length and made the model non-autoregressive * added params docs * better doc for `factor` * fix tests * renamed `moving_avg` to `moving_average` * renamed `factor` to `autocorrelation_factor` * make style * Update src/transformers/models/autoformer/configuration_autoformer.py Co-authored-by:NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/autoformer/configuration_autoformer.py Co-authored-by:
NielsRogge <48327001+NielsRogge@users.noreply.github.com> * fix configurations * fix integration tests * Update src/transformers/models/autoformer/configuration_autoformer.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * fixing `lags_sequence` doc * Revert "fixing `lags_sequence` doc" This reverts commit 21e34911e36a6f8f45f25cbf43584a49e5316c55. * Update src/transformers/models/autoformer/modeling_autoformer.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/autoformer/modeling_autoformer.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/autoformer/modeling_autoformer.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Apply suggestions from code review Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/autoformer/configuration_autoformer.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * model layers now take the config * added `layer_norm_eps` to the config * Update src/transformers/models/autoformer/modeling_autoformer.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * added `config.layer_norm_eps` to AutoformerLayernorm * added `config.layer_norm_eps` to all layernorm layers * Update src/transformers/models/autoformer/configuration_autoformer.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/autoformer/configuration_autoformer.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/autoformer/configuration_autoformer.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/autoformer/configuration_autoformer.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * fix variable names * added inital pretrained model * added use_cache docstring * doc strings for trend and use_cache * fix order of args * imports on one line * fixed get_lagged_subsequences docs * add docstring for create_network_inputs * get rid of layer_norm_eps config * add back layernorm * update fixture location * fix signature * use AutoformerModelOutput dataclass * fix pretrain config * no need as default exists * subclass ModelOutput * remove layer_norm_eps config * fix test_model_outputs_equivalence test * test hidden_states_output * make fix-copies * Update src/transformers/models/autoformer/configuration_autoformer.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * removed unused attr * Update tests/models/autoformer/test_modeling_autoformer.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/autoformer/modeling_autoformer.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/autoformer/modeling_autoformer.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/autoformer/modeling_autoformer.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/autoformer/modeling_autoformer.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/autoformer/modeling_autoformer.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/autoformer/modeling_autoformer.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * use AutoFormerDecoderOutput * fix formatting * fix formatting --------- Co-authored-by:
Kashif Rasul <kashif.rasul@gmail.com> Co-authored-by:
NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
- 19 May, 2023 1 commit
-
-
Julien Chaumond authored
* README: Fix affiliation for MEGA * Fix quality --------- Co-authored-by:Lysandre <lysandre@huggingface.co>
-
- 17 May, 2023 1 commit
-
-
Lysandre Debut authored
Fix + link
-
- 12 May, 2023 1 commit
-
-
Shehan Munasinghe authored
* Commit the automatically generated code using add-new-model-like * Update description at swiftformer.mdx file * remove autogenerated code for MaskedImageModeling * update weight conversion scripts * Update modeling_swiftformer.py * update configuration_swiftformer.py * Update test_modeling_swiftformer.py * update modeling code - remove einops dependency * Update _toctree.yml * update modeling code - remove copied from comments * update docs * Revert "update docs" This reverts commit c2e05e2998fe2cd6eaee8b8cc31aca5222bac9fb. * update docs * remove unused reference SwiftFormerImageProcessor * update dependency_versions_table.py * update swiftformer.mdx * update swiftformer.mdx * change model output type - no attentions * update model org name * Fix typo * fix copies * Update tests/models/swiftformer/test_modeling_swiftformer.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/auto/image_processing_auto.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/auto/feature_extraction_auto.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update docs/source/en/model_doc/swiftformer.mdx Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/swiftformer/configuration_swiftformer.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Apply suggestions from code review Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Apply suggestions from code review Co-Authored-By:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Apply suggestions from code review Co-Authored-By:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Apply suggestions from code review Co-Authored-By:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update modeling_swiftformer.py fix-copies * make style, make quality, fix-copies * Apply suggestions from code review Co-Authored-By:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Apply suggestions from code review Co-Authored-By:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * make style Co-Authored-By:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Add suggestions from code review Co-Authored-By:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Add suggestions from code review Co-Authored-By:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * make fix-copies * Update modeling_swiftformer.py * Update modeling_swiftformer.py * Add suggestions from code review Co-Authored-By:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
- 09 May, 2023 2 commits
-
-
Sylvain Gugger authored
-
Sylvain Gugger authored
* First draft of RWKV-4 * Add support for generate * Style post-rebase * Properly use state * Write doc * Fix doc * More math * Add model to README, dummies and clean config * Fix init * multiple fixes: - fix common tests - fix configuraion default values - add CI test for checking state computation - fix some CI tests * correct tokenizer * some tweaks - fix config docstring - fix failing tests * fix CI tests - add output_attention / output_hidden_states - override test_initialization - fix failing CIs * fix conversion script - fix sharded case - add new arguments * add slow tests + more fixes on conversion script * add another test * final fixes * change single name variable * add mock attention mask for pipeline to work * correct eos token id * fix nits * add checkpoints * Apply suggestions from code review Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * add `tie_word_embeddings` in docstring * change tensor name * fix final nits * Trigger CI --------- Co-authored-by:
younesbelkada <younesbelkada@gmail.com> Co-authored-by:
Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
- 28 Apr, 2023 1 commit
-
-
s-JoL authored
* update Open-Llama model * update * update format * update doc * update * update stable embedding test * update test case * update format * update readme * fix typo * update name * remove tokenizer and update format * remove convert_open_llama_weights_to_hf * update warning and doc_string --------- Co-authored-by:songliang.bayesian <songliang.bayesian@bytedance.com>
-
- 27 Apr, 2023 1 commit
-
-
Ehsan M. Kermani authored
* Fix CLAP link across all READMEs * Fix copy only for en
-
- 23 Apr, 2023 1 commit
-
-
NielsRogge authored
Adds FocalNet by Microsoft to transformers --------- Co-authored-by:
Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local> Co-authored-by:
alaradirik <alaradirik@gmail.com>
-
- 20 Apr, 2023 1 commit
-
-
Younes Belkada authored
put correct link
-
- 19 Apr, 2023 2 commits
-
-
Arthur authored
* initial commit * keys match * update, fix conversion * fixes, inference working * fix * more fixes * more fixes * clean up * more clean up * fix copies and add convext copied layer norm * stash * pretty big upfate * cleaning * more cleaning * fixup stuffs * fix copies * fix iinit * update test removing tokenizer * nits * add pretrained * more nits * remove tracking of pipeline * few fixes * update san and conversion script * fix mask decoder and prompt encoder conversion * fixes * small update * fix order * fix * fix image embeddings * nites * few fixes * fix logits * clean up * fixes boxes inference * v1 AMG * clean up * some clean up * multi points support * amg working * fixup * clean up * readme * update toctree * fix type hint * multiple fixes * fixup * fixes * updates * updates * more tests * few fixes * change to `SamForMaskGeneration` * doc * fixup * fix more tests * multiple fixes * fix CI tests * refactor processor * renamings * draft the pipeline * refactor * fix tests * fix test * few cleanings * fix test * edit pipelien support chunking * udate * add slow tests * fix nit * fixup * fix nit * current chunk pipleine * cast boxes in fp32 * nit * current updates * piepleine works * fixup * clean up config * fix slow tests * fix slow tests * clean up * update doc and pipeline * adds more slow tests * fix slow tests * cleaning * tests pass * add docstring * fix copies * clean up * support batch of images * style * dummy is needed, add tests * fix slow tests * fix CI * update * adds more tests * fixes * fixes * fixup * fixes * few fixes * filter * few fixes * some refactor * touches finales * fix * style * remove pipeline files * fixes nits * revert pipeline changes * fix test * fixup * remove automodel for automatic mask generation * fix failing torch tests * update mdx * revert removal of `MODEL_FOR_AUTOMATIC_MASK_GENERATION_MAPPING` * update sam config based on review Co-authored-by:
amyeroberts <aeroberts4444@gmail.com> Co-authored-by:
sgugger <sylvain.gugger@gmail.com> * update low_resolution_masks -> pred_masks inti ln with layer_norm_eps add_decomposed_rel_pos doc forward doc of SamForMaskGeneration * update processor docstring * remove image processor import empty * update for testing * output vision hidden states + clean recomm also test all iou values * fixup * fixup * remove unused * Update src/transformers/models/sam/modeling_sam.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/sam/image_processing_sam.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * nits * fix * fix CI tests and slow tests * replace with Amy's processor * clearer docstring * add `SamVisionNeck` * refactor - all CI tests should pass * fix broken import on Gcolab * few fixes here and there * fix another bug * fix more bugs * update and merge * correct ckpt * address comments * add tips * revert * fix docstring * replace with `SamModel` * make fixup * add support for bathed images and batch ed points * make fixup this time, really * make fixup again and again * few fixes here and there, this should be the touche finale * Update docs/source/en/model_doc/sam.mdx * fixup * correct checkpoints * correct name * rm unneeded file * add notebook --------- Co-authored-by:
younesbelkada <younesbelkada@gmail.com> Co-authored-by:
amyeroberts <aeroberts4444@gmail.com> Co-authored-by:
sgugger <sylvain.gugger@gmail.com> Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by:
Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
-
amyeroberts authored
-
- 13 Apr, 2023 1 commit
-
-
Sylvain Gugger authored
-
- 12 Apr, 2023 1 commit
-
-
pioliverse authored
* resolve conflicts * rebase and make style * test * test * test * rebase and make style * rebase and make style * tests * tests * rewrite some functions * rebase and make style * fix load_tf_weights_in_cpmant * reformat some unrelated files * upgrade quality * fix some bugs & docstring * add models and tests * solve conflicts * resolve conflicts * resolve conflicts * resolve conflicts * resolve conflicts * tests * resolve conflicts * resolve conflicts * fix load_tf_weights_in_cpmant * reformat some unrelated files * upgrade quality * fix some bugs & docstring * save resolution * make style * delete redefinition code * reformat function * reformat * resolve conflicts * resolve conflicts * resolve conflicts * resolve conflicts * resolve conflicts * tests * resolve conflicts * resolve conflicts * fix load_tf_weights_in_cpmant * reformat some unrelated files * upgrade quality * resolve conflicts * resolve conflicts * resolve conflicts * resolve conflicts * resolve conflicts * fix load_tf_weights_in_cpmant * reformat some unrelated files * upgrade quality * resolve conflicts * make style * fix bugs and refactor * modify docstrings and make style * unify import format in __init__.py * fix import-altclp bug * fix copies to update index.md * fix unused config parameters * fix unused config parameters * fix unused config parameters * update README_ja.md * dummy commit for unit test * fix attention mask * add CPMAntTokenizer&-Fast to auto-mapping * drop redundant changes in README_ko * fix defaults in docstring * fix use_cache and some docstring * add missing args in tokenizer * modify tester inheritance * add is_jieba_available * fix some bugs * make style and fix-copies * add doctests * skip integration tests * add is_jieba_available * fix bugs in common tests * adjust docstrings and make style * add argument docstring * adjust code to some specifications * make style and fix-copies * add fast tokenization test * dummy commit for unit test * dummy commit for unit test * 
dummy commit for unit test * normalize some comments and names * Bert->CPMAnt * camel names and drop redundant codes * make style and fix-coies * add CpmTokenizerFast _import_structure * drop cpmanttokenizerfast in model_doc * fix some problems * fix CPMAnt tokenization for common test * make style and fixup * fix copies and fixup * fix bugs in tokenization test * dummy commit for connection failure in unittest * fix copies * drop trailing comma * fix decorator in tests * dummy commit for connection failure in unittest --------- Co-authored-by:Gong Baitao <gongbaitao11@gmail.com>
-
- 10 Apr, 2023 1 commit
-
-
Joel Lamy-Poirier authored
* Add model with cli tool * Remove unwanted stuff * Add new code * Remove inference runner * Style * Fix checks * Test updates * make fixup * fix docs * fix doc * fix test * hopefully fix pipeline tests * refactor * fix CIs * add comment * rename to `GPTBigCodeForCausalLM` * correct readme * make fixup + docs * make fixup * fixes * fixes * Remove pruning * Remove import * Doc updates * More pruning removal * Combine copies * Single MQA implementation, remove kv cache pre-allocation and padding * Update doc * Revert refactor to match gpt2 style * Merge back key and value caches, fix some type hints * Update doc * Fix position ids pith padding (PR 21080) * Add conversion script temporarily * Update conversion script * Remove checkpoint conversion * New model * Fix MQA test * Fix copies * try fix tests * FIX TEST!! * remove `DoubleHeadsModel` * add MQA tests * add slow tests * clean up * add CPU checker * final fixes * fixes - fix GPU issue - fixed slow tests - skip disk offload * fix final issue * Simplify and comment baddbmm fix * Remove unnecessary code * Transpose tweaks * Use beta=1 on cpu, improve tests --------- Co-authored-by:younesbelkada <younesbelkada@gmail.com>
-
- 05 Apr, 2023 1 commit
-
-
Younes Belkada authored
* add deplot + matcha on `transformers` * more docs * correct path * Update docs/source/en/model_doc/deplot.mdx Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * fix * use auto processor * Update docs/source/en/model_doc/matcha.mdx Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * make fixup * Update docs/source/en/model_doc/deplot.mdx Co-authored-by:
NielsRogge <48327001+NielsRogge@users.noreply.github.com> * add correct names --------- Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by:
NielsRogge <48327001+NielsRogge@users.noreply.github.com>
-
- 27 Mar, 2023 1 commit
-
-
Arthur authored
* Initial commit * update modeling code * update doc * add functions necessary * fix impotrs * revert changes * fixup * more styling to get going * remove standalone encoder * update code * styling * fix config and model * update code and some refactoring * make more tests pass * Adding NLLB-200 - MoE - 54.5B for no language left behind Fixes #21300 * fix mor common tests * styke * update testing file * update * update * Router2 doc * update check config with sparse layer * add dummy router * update current conversion script * create on the fly conversion script * Fixup * style * style 2 * fix empty return * fix return * Update default config sparse layers * easier to create sparse layers * update * update conversion script * update modeling * add to toctree * styling * make ruff happy * update docstring * update conversion script * update, will break tests but impelemting top2 * update *
❗ local groups are supported here * ⚠️ Support for local groups is now removed ⚠️ This is because it has to work with model parallelism that we do not support * finish simplificaiton * Fix forward * style * fixup * Update modelling and test, refactoring * update tests * remove final layer)norm as it is done in the FF * routing works! Logits test added * nit in test * remove top1router * style * make sure sparse are tested. Had to change route_tokens a liottle bit * add support for unslip models when converting * fixup * style * update test s * update test * REFACTOR * encoder outputs match! * style * update testing * 🎉 encoder and decoder logits match 🎉 * styleing * update tests * cleanup tests * fix router test and CIs * cleanup * cleanup test styling * fix tests * Finally the generation tests match! * cleanup * update test * style testing file * remove script * cleanup * more cleanup * nits * update * NLLB tokenizer is wrong and will be fixed soon * use LongTensors * update tests * revert some small changes * fix second expert sampling and batch prioritized routing * update tests * finish last tests * make ruff happy * update * ruff again * style * Update docs/source/en/model_doc/nllb-moe.mdx Co-authored-by:Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Updates based on review * style and fix import issue * nit * more nits * cleanup * styling * update test_seconde_expert_policy * fix name * last nit on the markdown examples --------- Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
- 24 Mar, 2023 1 commit
-
-
Mitch Naylor authored
* add mega file structure and plain pytorch version of mega source code * added config class with old naming conventions * filled in mega documentation * added config class and embeddings with optional token types * updated notes * starting the conversion process, deleted intermediate and added use_cache back to config * renamed config attributes in modeling_mega.py * checkpointing before refactoring incremental decoding functions * removed stateful incremental key/values for EMA and self-attention * refactored MovingAverageGatedAttention to remove stateful k/v history and use unified attention mask * MovingAverageGatedAttention works with incremental decoding + past values, added sequence length enforcement * more comments in MovingAverageGatedAttention + checkpointing before GatedCrossAttention * bug fix in attention mask handling in MovingAverageGatedAttention * removed incremental state from GatedCrossAttention and removed IncrementalState class * finished gated cross attention and got MegaLayer working * fixed causal masking in mega decoder * fixed how padding and causal masks are passed through MegaLayer with and without k/v caching * finished MegaModel; tested with encoder, decoder-only, and cross-attention type inputs; started work on downstream classes; removed mentions of position_ids * added optional dense hidden layer for masked and causal LM classes * docstring updates in MultiHeadEMA and GatedCrossAttention, removed unnecessary inputs in cross-attention * removed before_attn_fn in Mega class and updated docstrings and comments up to there * bug fix in MovingAverageGatedAttention masking * working conversion of MLM checkpoint in scratchpad script -- perfect matches * moved arg for hidden dense layer in LM head to config; discovered issue where from_pretrained is renaming gamma and beta parameters * renamed gamma and beta parameters to avoid HF renaming when loading from checkpoint * finished checkpoint conversion script * cleanup old class in mega 
config script * removed 'copied from' statements and passing integration tests * added num_attention_heads=1 to config for integration compatibility, decoder tests working, generation tests failing * fixed tuple output of megamodel * all common tests passing after fixing issues in decoder, gradient retention, and initialization * added mega-specific tests, ready for more documentation and style checks * updated docstrings; checkpoint before style fixes * style and quality checks, fixed initialization problem in float_tensor, ready for PR * added mega to toctree * removed unnecessary arg in megaconfig * removed unused arg and fixed code samples with leftover roberta models * Apply suggestions from code review Applied all suggestions except the one renaming a class, as I'll need to update that througout Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * fixed issue where .view breaks batch dimension, conversion script fixed with absolute imports, updated readme with Mega->MEGA * removed asserts in Mega code, renamed sequencenorm, gatedcrossattention, and NFFN, replaced get_activation_fn with ACTFN, and added sequencenorm to layer norms * reformatted .forward() docstrings to match style and removed unused mask input in cross-attention * removed all reset_parameters() methods and rolled into MegaPreTrainedModel._init_weights() * renamed all single-letter variables and improved readability in tensor size comments, Mega->MEGA in 2 documentation files * variable names in NFFN * manual Mega->MEGA changes in docs * Mega->MEGA in config auto * style and quality fixes * Apply suggestions from code review Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * renamed parameters and variables with confusing names, added copied from statements, moved fft conv to its own method, other cleanup from PR comments * commit before dealing with merge conflicts * made new attention activation functions available in ACT2FN and added generation test from OPT * style and quality in activations and tests * documentation fixes, renaming variables in dropout and rotary positions, used built-in causal masking, encoders->layers in MegaModel, moved comments into docstrings * style and quality fixes after latest updates, before rotary position ids * causal mask in MegaBlock docstring + added missing device passing * Apply suggestions from code review Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update README.md Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * added Mega prefixes where missing, reverted MegaSequenceNorm to if-else, other module renaming requested in PR * style and quality fixes + readme updates pointing to main --------- Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
- 22 Mar, 2023 1 commit
-
-
Younes Belkada authored
* v1 all keys match * clean up * forward pass ok * add correct image transform * generate works, logits matching * clean up * more refactor * revert * revert * clean up * clean ups * clean up * refactor * refactor * fix doc * fix tokenizer test * fix toctree * revert toctree * oops * few fixes * replace to `pixel_embeds` * make fixup * test processing & feat extractor * fix some tests * more fixes * make fixup * clean up * more clean up * add a single slow test * fix test * make fixup * fix * fix authors * fix toctree * update docs * add docstring * revert change * Update src/transformers/models/pix2struct/__init__.py Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * fix tokenizer * fix processor test * fix test * make fixup * refactor * fix config * Update src/transformers/models/pix2struct/image_processing_pix2struct.py Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * format * fix * Update src/transformers/models/pix2struct/image_processing_pix2struct.py Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * make fixup * add docstring * fix issues * fix * fix * fix * add slow test * fix * fix * fix batched issue * fix training issues * fix ci test * fix slow test * fix conversion script * remove unneeded classes * fix slow test * fix require backends * fix masked fill * revert * fix softmax * add large models support * fix conditional generation * few fixes * add instructions * rm unneeded file * Update src/transformers/models/pix2struct/convert_pix2struct_original_pytorch_to_hf.py * fix ci test * fix ci test really * Apply suggestions from code review Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * fix nit * fix nits * fix image processors nits * docstring * clean up * fix nit * fix tests * docstring nit * fix reshape * Update src/transformers/models/pix2struct/image_processing_pix2struct.py Co-authored-by:
NielsRogge <48327001+NielsRogge@users.noreply.github.com> * fix nit * fix repetition * refactor processor * make patch size consistent * refactor forward * fix docstring * fix max_patches issue * update docstirng * update docstring * fix coped from * add skip reasons * few fixes * Update src/transformers/models/pix2struct/image_processing_pix2struct.py Co-authored-by:
NielsRogge <48327001+NielsRogge@users.noreply.github.com> * format * fix doctests * refactor and fix * fix doc build issue * fix processor test * small fix conversion script * replace correct weights * make fixup * fix some issues * Apply suggestions from code review Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * revert config and fixes * Update src/transformers/models/pix2struct/image_processing_pix2struct.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * more details * fixes * fix processor * fix processor test * fix * Apply suggestions from code review Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * make fixup * fix processor * Update src/transformers/models/pix2struct/modeling_pix2struct.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * add copied * make fixup * fix copies * update docstring * refactor * fix docstring * fix conversion script * fix vqa issue * replace to `flattened_patches` * nit * fix numpy issue * fix image processors * add batched vqa support * fix vqa conversion * make fixup * fix conversion script * Apply suggestions from code review Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * make fixup * add correct docstring * update docstring * fix module level + channel dim * use `make_list_of_images` * refactor * correct docstring * fix authors * remove `data_format` * add header text test * Apply suggestions from code review Co-authored-by:
NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * make fixup * add checkpoints --------- Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by:
NielsRogge <48327001+NielsRogge@users.noreply.github.com>
-
- 16 Mar, 2023 1 commit
-
-
Jason Phang authored
* LLaMA * sharding and docs * tweak * black * inits * ruff * LLAMA_PRETRAINED_CONFIG_ARCHIVE_MAP * init * no checkpoint * docs * ruff * type_vocab_size * tokenizer fixes * tokenizer fixes * Update tokenization_llama.py * Update tokenization_llama.py * Update configuration_llama.py * Update modeling_llama.py * tokenizer add_bos by default * licenses * remove decoder * norms and mlp * rope overhaul * tweaks * black * mention OPT implementation * off-by-one naming * typo * fix * tokenization fix and slicing bug * padding config * cleanup * black * update tests * undo typo * fix vocab caching logic * ruff * docbuilder * attn fix from BlackSamorez * initial feedback * typo * docs * llama case * llama case * load checkpoint docs * comment about tokenizer * tokenizer defaults * clear past_key_values if use_cache=False * last tweaks * last tweaks * last tweaks * last tweaks --------- Co-authored-by:Stella Biderman <stellabiderman@gmail.com>
-
- 14 Mar, 2023 2 commits
-
-
Sylvain Gugger authored
-
Alara Dirik authored
* Add ConvNeXt V2 to transformers * TF model is separated from the PR to fix issues
-
- 13 Mar, 2023 2 commits
-
-
Sylvain Gugger authored
-
wangpeng authored
* add new model of MGP-STR * fix the check failings * remove torch and numpy from mgp_tokenization * remove unused import from modeling_mgp_str * add test_processing_mgp_str * rm test_processing_mgp_str.py * add test_processing_mgp_str * add test_processing_mgp_str * add test_processing_mgp_str * rm test_processing_mgp_str and add softmax outs to model * rm test_processing_mgp_str and add softmax outs to model * rewrite the code of mgp-str according to PR suggestions * rewrite the code of mgp-str according to PR suggestions * add new model of MGP-STR * fix the check failings * remove torch and numpy from mgp_tokenization * remove unused import from modeling_mgp_str * add test_processing_mgp_str * rm test_processing_mgp_str.py * add test_processing_mgp_str * add test_processing_mgp_str * add test_processing_mgp_str * rm test_processing_mgp_str and add softmax outs to model * rewrite the code of mgp-str according to PR suggestions * rewrite the code of mgp-str according to PR suggestions * remove representation_size from MGPSTRConfig * reformat configuration_mgp_str.py * format test_processor_mgp_str.py * add test for tokenizer and complete model/processor test and model file * rm unnecessary tuple in modeling_mgp_str * reduce hidden_size/layers/label_size in test_model * add integration tests and change MGPSTR to Mgpstr * add test for logit values * reformat test model file --------- Co-authored-by:yue kun <yuekun.wp@alibaba-inc.com>
-