1. 21 Nov, 2023 10 commits
    • TVP model (#25856) · c770600f
      jiqing-feng authored
      * tvp model for video grounding
      
      add tokenizer auto
      
      fix param in TVPProcessor
      
      add docs
      
      clear comments and enable different torch dtype
      
      add image processor test and model test and fix code style
      
      * fix conflict
      
      * fix model doc
      
      * fix image processing tests
      
      * fix tvp tests
      
      * remove torch in processor
      
      * fix grammar error
      
      * add more details on tvp.md
      
      * fix model arch for loss, grammar, and processor
      
      * add docstring and do not regard TvpTransformer, TvpVisionModel as individual model
      
      * use pad_image
      
      * update copyright
      
      * control first downsample stride
      
      * reduce first only works for ResNetBottleNeckLayer
      
      * fix param name
      
      * fix style
      
      * add testing
      
      * fix style
      
      * rm init_weight
      
      * fix style
      
      * add post init
      
      * fix comments
      
      * do not test TvpTransformer
      
      * fix warning
      
      * fix style
      
      * fix example
      
      * fix config map
      
      * add link in config
      
      * fix comments
      
      * fix style
      
      * rm useless param
      
      * change attention
      
      * change test
      
      * add notes
      
      * fix comments
      
      * fix tvp
      
      * import checkpointing
      
      * fix gradient checkpointing
      
      * Use a more accurate example in readme
      
      * update
      
      * fix copy
      
      * fix style
      
      * update readme
      
      * delete print
      
      * remove tvp test_forward_signature
      
      * remove TvpTransformer
      
      * fix test init model
      
      * merge main and make style
      
      * fix tests and others
      
      * fix image processor
      
      * fix style and model_input_names
      
      * fix tests
    • remove the deprecated method `init_git_repo` (#27617) · f5c9738f
      Hz, Ji authored
      * remove deprecated method `init_git_repo`
      
      * make style
    • Fix tracing dinov2 (#27561) · 0145c682
      amyeroberts authored
      * Enable tracing with DINOv2 model
      
      * ABC
      
      * Add note to model doc
    • Fix flash attention bugs with Mistral and Falcon (#27625) · 82cc0a79
      fxmarty authored
      * fix various bugs with flash attention
      
      * bump
      
      * fix test
      
      * fix mistral
      
      * use skiptest instead of return that may be misleading
      
      * fix on review
    • Add RoCm scheduled CI & upgrade RoCm CI to PyTorch 2.1 (#26940) · f93c1e9e
      fxmarty authored
      
      
      * add scheduled ci on amdgpu
      
      * fix likely typo
      
      * more tests, avoid parallelism
      
      * precise comment
      
      * fix report channel
      
      * trigger docker build on this branch
      
      * fix
      
      * fix
      
      * run rocm scheduled ci
      
      * fix
      
      * fix
      
      * fix
      
      * fix
      
      * fix
      
      * fix
      
      * fix
      
      * fix
      
      * fix
      
      * fix
      
      * fix
      
      * fix
      
      * fix
      
      * fix
      
      * fix
      
      ---------
      Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
    • Idefics: Fix information leak with cross attention gate in modeling (#26839) · 851a4f70
      Leo Tronchon authored
      
      
      * fix image_attention gate in idefics modeling
      
      * update comment
      
      * cleaner gating
      
      * fix gate condition
      
      * create attention gate once
      
      * update comment
      
      * update doc of cross-attention forward
      
      * improve comment
      
      * bring back no_images
      
      * pass cross_attention_gate similarly  to no_images gate
      
      * add information on gate shape
      
      * fix no_images placement
      
      * make tests for gate
      
      * take off no_images logic
      
      * update test based on comments
      
      * raise value error if cross_attention_gate is None
      
      * send cross_attention_gate to device
      
      * Revert "send cross_attention_gate to device"
      
      This reverts commit 054f84228405bfa2e75fecc502f6a96dc83cdc0b.
      
      * send cross_attention_gate to device
      
      * fix device in test + nit
      
      * fill hidden_states with zeros instead of multiplying with the gate
      
      * style
      
      * Update src/transformers/models/idefics/modeling_idefics.py
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * Update src/transformers/models/idefics/modeling_idefics.py
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      
      ---------
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
    • Joao Gante's avatar
    • [ConvNext] Improve backbone (#27621) · ade7af93
      NielsRogge authored
      * Improve convnext backbone
      
      * Fix convnext2
    • [`core` / `gradient_checkpointing`] add support for old GC method (#27610) · 0e6794ff
      Younes Belkada authored
      * add support for old GC method
      
      * add also disable
      
      * up
      
      * oops
    • dvclive callback: warn instead of fail when logging non-scalars (#27608) · 8eb9e29d
      Dave Berenbaum authored
      * dvclive callback: warn instead of fail when logging non-scalars
      
      * tests: log lr as scalar
  2. 20 Nov, 2023 9 commits
  3. 19 Nov, 2023 1 commit
  4. 18 Nov, 2023 1 commit
  5. 17 Nov, 2023 7 commits
  6. 16 Nov, 2023 12 commits