    Port IDEFICS to tensorflow (#26870)
    Alazar authored
    
    
    * Initial commit
    
    * Just a copy of modeling_idefics.py that will be ported to TF
    
    * - Prepend TF to the name of all classes
    - Convert pytorch ops to TF (not all operations are converted yet)
    
    * Add TF imports
    
    * Add autotranslated files
    
    * Add TF classes to model_tf_auto.py
    
    * Add the TF classes in model_doc
    
    * include auto-translated code
    
    * Adopted from auto-translated version
    
    * Add a forgotten super().build
    
    * Add test code for TF version.
    
    * Fix indentation and load pytorch weights for now
    
    * Some fixes. Many tests are still failing but some are passing now.
    
    - I have added TODOs for some of the hacks I made to unblock myself,
      and I will address them soon
    - I have a locally hacked processing_idefics.py to support TF temporarily
    
    * Add ALL_LAYERNORM_LAYERS to match pytorch
    
    * Revert "Add ALL_LAYERNORM_LAYERS to match pytorch"
    
    This reverts commit 7e0a35119b4d7a6284d04d8c543fba1b29e573c9 as it
    is not needed in the tf implementation.
    
    * Fix freeze_relevant_params()
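
    For context, a hypothetical sketch of how freezing is usually done on the TF
    side (this is not the actual freeze_relevant_params() implementation; the
    helper name and the exclude_name_patterns argument are made up for illustration):

    ```python
    import tensorflow as tf

    def freeze_layers(model: tf.keras.Model, exclude_name_patterns=()):
        # Walk the sublayers and mark them non-trainable unless their name matches
        # one of the excluded patterns (e.g. blocks that must stay trainable).
        for sub in model.submodules:
            if isinstance(sub, tf.keras.layers.Layer) and not any(
                pattern in sub.name for pattern in exclude_name_patterns
            ):
                sub.trainable = False
    ```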
    
    * Some more fixes
    
    * Fix test_attention_outputs
    
    * Add tf stuff to processing_idefics.py
    
    processing_idefics.py supports both pytorch and tf now.
    
    test_processor_idefics.py for pytorch is passing, so I didn't break anything,
    but there are still some issues with tf. I also need to add tf tests in
    test_processor_idefics.py.
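
    A rough sketch of the dispatch idea (not the actual processing_idefics.py code;
    the helper name is made up): the processing code builds framework-agnostic
    arrays and only converts at the very end, keyed on return_tensors.

    ```python
    import numpy as np

    def stack_outputs(arrays, return_tensors="pt"):
        # convert the accumulated numpy arrays to the requested framework's tensors
        if return_tensors == "pt":
            import torch
            return torch.tensor(np.stack(arrays))
        if return_tensors == "tf":
            import tensorflow as tf
            return tf.convert_to_tensor(np.stack(arrays))
        raise ValueError(f"Unsupported return_tensors: {return_tensors}")
    ```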
    
    * Pass return_tensors to image processing code and fix test
    
    * Pass return_tensors to the image processor __init__
    
    * Fix several test cases
    
    - Make the inputs to some of the forward passes of type `TFModelInputType`
    - Decorate the main layer forward pass with `@unpack_inputs`
    - Decorate the main layer with `@keras_serializable` (see the sketch after this list)
    - Pass `inputs` to TFIdeficsModel
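
    Sketch of the decorator pattern in question (the toy layer and config below are
    made up for illustration; the decorators are the real ones from
    transformers.modeling_tf_utils):

    ```python
    import tensorflow as tf
    from transformers import PretrainedConfig
    from transformers.modeling_tf_utils import keras_serializable, unpack_inputs

    class ToyConfig(PretrainedConfig):
        model_type = "toy"

    @keras_serializable
    class TFToyMainLayer(tf.keras.layers.Layer):
        config_class = ToyConfig  # required by @keras_serializable

        def __init__(self, config: ToyConfig, **kwargs):
            super().__init__(**kwargs)
            self.config = config
            self.dense = tf.keras.layers.Dense(4, name="dense")

        @unpack_inputs
        def call(self, input_ids=None, attention_mask=None, training=False):
            # @unpack_inputs normalizes dict/keyword/positional inputs before they
            # reach this body, so the layer can be called like the PyTorch model
            return self.dense(tf.cast(input_ids, tf.float32))  # toy computation
    ```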
    
    * Some more fixes forgotten in last commit
    
    * Fix processing code and vision_tf.py
    
    * Fix perceiver bug
    
    * Import from
    
    * Auto-add build() methods + style pass
    
    * Fix build() errors due to `None` being passed as shape to some layers
    
    * Change the `name` in TFIdeficsForVisionText2Text to match the attribute name in IdeficsForVisionText2Text
    
    * Fix pytorch weights load for tf2
    
    There were a lot of `name=` arguments missing in the weight initialization code.
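
    Illustrative example of the fix (toy layer, not IDEFICS code): cross-loading
    PyTorch weights into TF matches variables by name, so every sublayer needs an
    explicit name= that mirrors the attribute name on the PyTorch side.

    ```python
    import tensorflow as tf

    class ToyAttention(tf.keras.layers.Layer):
        def __init__(self, hidden_size, **kwargs):
            super().__init__(**kwargs)
            # without name=, Keras autogenerates names like "dense_3" and the
            # PT->TF converter cannot line the weights up with "q_proj"/"k_proj"
            self.q_proj = tf.keras.layers.Dense(hidden_size, use_bias=False, name="q_proj")
            self.k_proj = tf.keras.layers.Dense(hidden_size, use_bias=False, name="k_proj")
    ```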
    
    * Attempt to fix CI
    
    * Add back accidentally removed line
    
    * Remove torch-specific stuff from the TF test file
    
    * make fix-copies, make style, remove autotranslated files
    
    * Fixes to imports/docstrings
    
    * Let's try the `from __future__` import in desperation
    
    * Fix the core random_attention_mask fn to match the torch/flax behaviour
    
    * Clean random_attention_mask up correctly
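
    For reference, the torch/flax test helper builds a random 0/1 mask but forces
    the last position of every row to be attended to; a TF sketch of that behaviour
    (not necessarily the exact code that landed):

    ```python
    import tensorflow as tf

    def random_attention_mask(shape):
        # shape is (batch_size, seq_len); values are random 0/1 ...
        mask = tf.random.uniform(shape, minval=0, maxval=2, dtype=tf.int32)
        # ... but the last token of every row is forced to 1 so no example
        # ends up fully masked (matches the torch/flax helper)
        return tf.concat([mask[:, :-1], tf.ones_like(mask[:, -1:])], axis=-1)
    ```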
    
    * Remove torch-only test
    
    * Fix loss shape, couple of nits
    
    * make style
    
    * Don't test for OOB embeddings because IDEFICS uses those deliberately
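
    Context for this skip: IDEFICS deliberately routes ids beyond the base
    vocabulary to an additional embedding table via its decoupled embedding, so
    "out of bounds" ids are intentional. A rough sketch of the idea (toy class,
    not the real implementation):

    ```python
    import tensorflow as tf

    class ToyDecoupledEmbedding(tf.keras.layers.Layer):
        def __init__(self, num_embeddings, num_additional_embeddings, dim, **kwargs):
            super().__init__(**kwargs)
            self.num_embeddings = num_embeddings
            self.base = tf.keras.layers.Embedding(num_embeddings, dim, name="base")
            self.additional = tf.keras.layers.Embedding(num_additional_embeddings, dim, name="additional")

        def call(self, input_ids):
            # ids >= num_embeddings are looked up in the additional table
            is_extra = input_ids >= self.num_embeddings
            base_ids = tf.where(is_extra, tf.zeros_like(input_ids), input_ids)
            extra_ids = tf.where(is_extra, input_ids - self.num_embeddings, tf.zeros_like(input_ids))
            base_out = self.base(base_ids)
            extra_out = self.additional(extra_ids)
            return tf.where(is_extra[..., tf.newaxis], extra_out, base_out)
    ```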
    
    * Fix loss computation to handle masking
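
    A hedged sketch of the masked-loss idea (not the exact code that landed):
    positions labelled -100 are excluded from the cross-entropy average.

    ```python
    import tensorflow as tf

    def compute_masked_lm_loss(labels, logits):
        # per-token cross-entropy, then mask out -100 labels so padding /
        # image tokens do not contribute to the loss
        loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(
            from_logits=True, reduction=tf.keras.losses.Reduction.NONE
        )
        active = tf.cast(tf.not_equal(labels, -100), logits.dtype)
        safe_labels = tf.where(labels == -100, tf.zeros_like(labels), labels)
        per_token = loss_fn(safe_labels, logits)
        return tf.reduce_sum(per_token * active) / tf.maximum(tf.reduce_sum(active), 1.0)
    ```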
    
    * Fix test failures when flattening
    
    * Fix some test failures
    
    - Add the cross-attention gate, which was missing and wasn't being passed
      around (see the sketch below)
    - Fix overwriting of image_attention_mask due to a hack I had for dummy inputs
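
    A sketch of the gating idea only (not the actual model code): text positions
    whose image_attention_mask row is all zeros get their cross-attention output
    zeroed out instead of injecting noise into the residual stream.

    ```python
    import tensorflow as tf

    def apply_cross_attention_gate(cross_attn_output, image_attention_mask):
        # has_image is True where a text position can see at least one image token
        has_image = tf.reduce_any(tf.cast(image_attention_mask, tf.bool), axis=-1)  # (batch, seq_len)
        gate = tf.cast(has_image, cross_attn_output.dtype)[..., tf.newaxis]         # (batch, seq_len, 1)
        return cross_attn_output * gate
    ```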
    
    * Add a proper stateless scaled_dot_product_attention
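
    "Stateless" here means plain tensor ops with no layers or variables, roughly
    mirroring torch.nn.functional.scaled_dot_product_attention; a minimal sketch,
    not the exact function that was added:

    ```python
    import tensorflow as tf

    def scaled_dot_product_attention(query, key, value, attn_mask=None):
        # query/key/value: (..., seq_len, head_dim); attn_mask is additive (0 / -inf)
        depth = tf.cast(tf.shape(query)[-1], query.dtype)
        scores = tf.matmul(query, key, transpose_b=True) / tf.math.sqrt(depth)
        if attn_mask is not None:
            scores += attn_mask
        weights = tf.nn.softmax(scores, axis=-1)
        return tf.matmul(weights, value)
    ```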
    
    * make style
    
    * Adding missing attribute from the PyTorch version
    
    * Small cleanups to the DecoupledLinear layer in case that helps
    
    * Pass epsilon to LayerNormalization
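
    For context: tf.keras.layers.LayerNormalization defaults to epsilon=1e-3 while
    torch.nn.LayerNorm defaults to 1e-5, so the value has to be passed explicitly
    to keep the two implementations numerically aligned (illustrative line; the
    layer name and value come from the config in the real code):

    ```python
    import tensorflow as tf

    # epsilon must match the PyTorch value; 1e-5 here is illustrative
    layernorm = tf.keras.layers.LayerNormalization(epsilon=1e-5, name="layernorm")
    ```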
    
    * Attempt to fix pytorch weight cross-loading for TFIdeficsEmbedding
    
    * Fix a bug in TFIdeficsGatedCrossAttentionLayer
    
    * Patching up build() methods
    
    * Constant self.inv_freq
    
    * Constant self.inv_freq
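
    For context, the rotary-embedding inverse frequencies are a deterministic,
    config-derived constant, so nothing here needs to be a tf.Variable; a standard
    sketch of how they are computed:

    ```python
    import tensorflow as tf

    def make_inv_freq(dim: int, base: float = 10000.0) -> tf.Tensor:
        # constant values derived from the head dimension; no trainable state
        return 1.0 / (base ** (tf.range(0, dim, 2, dtype=tf.float32) / dim))
    ```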
    
    * First working version
    
    The TF implementation works now. There was a bug in TFIdeficsDecoupledLinear
    where the weights were mis-initialized as (in_features, out_features)
    when they should have been (out_features, in_features).

    I have tested this so far with tiny-random and idefics-9b-instruct
    and it gives correct output.
    
    I also dumped the final outputs for both pytorch and TF
    and they are identical.
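
    A minimal sketch of the shape issue only (toy layer, not the real
    TFIdeficsDecoupledLinear): declaring the kernel as (out_features, in_features)
    matches PyTorch's nn.Linear layout, and the matmul is done with a transpose.

    ```python
    import tensorflow as tf

    class ToyDecoupledLinear(tf.keras.layers.Layer):
        def __init__(self, in_features, out_features, **kwargs):
            super().__init__(**kwargs)
            self.in_features = in_features
            self.out_features = out_features

        def build(self, input_shape):
            # (out_features, in_features) to line up with PyTorch's nn.Linear weight
            self.weight = self.add_weight(
                name="weight", shape=(self.out_features, self.in_features)
            )
            super().build(input_shape)

        def call(self, inputs):
            # inputs: (..., in_features) -> (..., out_features)
            return tf.matmul(inputs, self.weight, transpose_b=True)
    ```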
    
    * Fix some test failures
    
    * remove print statement
    
    * Fix return_tensors
    
    * Fix CI test failure check_code_quality
    
    * Attempt to fix CI failures by running `make fixup`
    
    The hardcoded IDs in test_modeling_tf_idefics.py are for the integration
    test; they make that file unreadable and should probably be moved to a separate file.
    
    * Attempt to fix tests_pr_documentation_tests
    
    * Fix a test failure in test_image_processing_idefics.py
    
    * Fix test test_pt_tf_model_equivalence
    
    * Fix a few failures
    
    * Tiny fix
    
    * Some minor fixes
    
    * Remove a duplicate test
    
    * Override a few test failures for IDEFICS
    
    - `test_keras_save_load` is passing now
    - `test_compile_tf_model` is still failing
    
    * Fix processing_idefics.py after rebase
    
    * Guard import keras with is_tf_available
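
    The guard is the usual pattern (sketch): only touch keras/tensorflow when the
    TF backend is installed, so torch-only environments still import cleanly.

    ```python
    from transformers.utils import is_tf_available

    if is_tf_available():
        import tensorflow as tf
    ```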
    
    * fix check code quality
    
    * fix check code quality
    
    * Minor fixes
    
    * Skip test_save_load temporarily
    
    This test passes on my local box but fails on the CI; skipping it
    for now to see if there are other remaining failures on the CI.
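
    The skip itself is just the standard unittest override in the TF test class
    (illustrative stand-in class, not the real test file):

    ```python
    import unittest

    class TFIdeficsModelTest(unittest.TestCase):
        @unittest.skip(reason="Passes locally but fails on CI; skipping while other failures are investigated")
        def test_save_load(self):
            pass
    ```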
    
    * Run `ruff format tests src utils`
    
    * Fix last failing test, `test_compile_tf_model`
    
    * Add fixes for vision_tf.py
    
    I forgot to add this file in last commit.
    
    * Minor fixes
    
    * Replace ">>>" with ">>" for doc tests

    IDEFICS-9B is too big for the doctest runner, so don't run it there
    
    * Make code more readable
    
    * Fix bug after code review
    
    I added a layer_norm_eps to IdeficsConfig but I don't even need it
    since the vision config has a layer_norm_eps.
    
    * Fix after code review
    
    Use original code tokenizer.convert_tokens_to_ids
    
    * Keep PyTorch as the default return_tensors
    
    * Fixes to modeling_tf after code review
    
    * Fixes from code review
    
    - Remove all references to `TF_IDEFICS_PRETRAINED_MODEL_ARCHIVE_LIST`
    - Pass 1e-5 to LayerNormalization in perceiver
    
    * Run ruff
    
    * Undo a change
    
    * Refactor processing code after Matt's suggestion
    
    * Remove TODO's that aren't needed anymore
    
    * For pytorch, use the original pytorch processing code from main

    Since this PR is a TF port it shouldn't make any modifications
    to the pytorch IDEFICS code. This change undoes the pytorch processing
    modifications I made and uses the original code from main.
    
    * Update tests/models/idefics/test_modeling_idefics.py
    
    * Update tests/models/idefics/test_modeling_tf_idefics.py
    
    * Add missing imports for is_pt_tf_cross_test
    
    * [DO NOT MERGE]: This is a commit for debugging and will be reverted
    
    The cross test `test_pt_tf_model_equivalence` passes locally but
    fails when running on the CI. This commit is to help debug that
    and will be reverted.
    
    * Revert "[DO NOT MERGE]: This is a commit for debugging and will be reverted"
    
    This reverts commit 8f0d709ec5bd46685fb0b4259d914ffee794875b.
    
    * [DO NOT MERGE]: This commit is for debugging a CI failure and will be reverted
    
    * [DO NOT MERGE]: This commit is for debugging a CI failure and will be reverted
    
    * Revert "[DO NOT MERGE]: This commit is for debugging a CI failure and will be reverted"
    
    This reverts commit 998cc38b8c3d313bf5e5eb55a7f5b7b881897b89.
    
    * Revert "[DO NOT MERGE]: This commit is for debugging a CI failure and will be reverted"
    
    This reverts commit 1c695ac4219c4ae4d39b330b01744dc27deb7dd4.
    
    * Don't skip test_save_load
    
    IIRC test_save_load was also failing on the CI but not on my local
    box; it might be easier to debug that on the CI first than the cross tests
    
    * Debugging commit, will be reverted
    
    * Revert "Debugging commit, will be reverted"
    
    This reverts commit 8eafc8e41e20c4e95a3a90834f06a6e9f445e2d5.
    
    * Override `test_save_load` and push model to save
    
    Maybe this will help me repro this weird bug
    
    * pass my repo_id
    
    * add endpoint
    
    * Pass a temp (write) token just for this CI
    
    * Undo last few commits, still pushing to hub for model debugging
    
    The issue seems to be with save_pretrained(): when I looked at the model saved
    from the CI test failure it was basically empty and had no weights.
    `self.save_weights(..)` seems to be failing in save_pretrained but needs
    more debugging
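
    A quick, assumed way to check whether a saved tf_model.h5 actually contains
    weights (using h5py; the path is illustrative):

    ```python
    import h5py

    # an "empty" checkpoint will list groups but no (or tiny) datasets
    with h5py.File("tf_model.h5", "r") as f:
        f.visititems(lambda name, obj: print(name, getattr(obj, "shape", "")))
    ```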
    
    * Add logging to modeling tf utils, will be reverted just for debugging
    
    * Debugging, will revert
    
    * Revert "Debugging, will revert"
    
    This reverts commit 9d0d3075fb7c82d8cde3a5c76bc8f3876c5c55d3.
    
    * Revert "Add logging to modeling tf utils, will be reverted just for debugging"
    
    This reverts commit 774b6b7b1c17b3ce5d7634ade768f2f686cee617.
    
    * Remove `test_save_load`
    
    The CI failures are gone after my latest rebase, no idea why. I was still
    saving the model to my hub on HF, and the tf_model.h5 file now has
    everything.
    
    * Run make fix-copies
    
    * Run ruff format tests src utils
    
    * Debugging commit, will be reverted
    
    * Run ruff, also trigger CI run
    
    * Run ruff again
    
    * Undo debugging commit
    
    ---------
    Co-authored-by: Matt <rocketknight1@gmail.com>
    Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>