1. 17 Jul, 2024 7 commits
  2. 16 Jul, 2024 13 commits
  3. 15 Jul, 2024 7 commits
  4. 14 Jul, 2024 4 commits
  5. 12 Jul, 2024 2 commits
  6. 11 Jul, 2024 7 commits
    • [Bug Fix] fix qa pipeline tensor to numpy (#31585) · aec1ca3a
      jiqing-feng authored
      * fix qa pipeline
      
      * fix tensor to numpy
    • Adding hiera (#30356) · c1e139c2
      Naman Garg authored
      
      
      * initialized Structure
      
      * Updated variable names
      
      * Added Config class, basic HF setup, convert_to_hf
      
      * Fixed Convert function, added hiera to HF files, Initialized test files
      
      * better naming for x in forward pass
      
      * Moved utils to hiera
      
      * Change hiera -> hiera_model
      
      * Fixed integration into transformers
      
      * Fix: Convert Checkpoint
      
      * added documentation for hiera
      
      * added documentation for hiera
      
      * added Docstrings to models, Transformers based changes
      
      * make style and quality
      
      * make style and quality
      
      * Integration & Block tests running
      
      * Fixed bugs
      
      * Removed tim dependency
      
      * added HieraBlock
      
      * fixed: Model name
      
      * added tests for HieraModel, HieraBlock
      
      * fixed imports
      
      * fixed quality & copies
      
      * Fixes
      
      * Update docs/source/en/model_doc/hiera.md
      
      Fix name
      Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
      
      * Update docs/source/en/model_doc/hiera.md
      Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
      
      * Update docs/source/en/model_doc/hiera.md
      Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
      
      * Update src/transformers/models/hiera/configuration_hiera.py
      Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
      
      * Update src/transformers/models/hiera/configuration_hiera.py
      Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
      
      * Update src/transformers/models/hiera/modeling_hiera.py
      Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
      
      * Update src/transformers/models/hiera/modeling_hiera.py
      Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
      
      * Fixed formatting
      
      * Code quality & Import differences
      
      * quality and repo-consistency fix
      
      * fixed no torch error
      
      * Docstring fix
      
      * Docstring fix
      
      * doc string fix
      
      * fixed example usage
      
      * Resolved issues in modeling_hiera
      
      * Removed Hiera MAE
      
      * Added test and resolved bug
      
      * fixed doc string
      
      * First commit
      
      * Finished conversion script and model forward working
      
      * Resolved all issues
      
      * nits
      
      * Improving tests
      
      * Nits
      
      * More nits
      
      * Improving HieraForMaskedImageModeling
      
      * More improvements and nits
      
      * Fixed docstrings of outputs
      
      * More fixes
      
      * More improvements
      
      * Updated conversion script
      
      * Fixed docstrings
      
      * Improved tests
      
      * Fixed attention outputs test
      
      * All tests green
      
      * Removed unnecessary file
      
      * contribution attribution
      
      * Resolved a few issues
      
      * Resolved Comments
      
      * Updated model repo id and fixed bugs
      
      * Removed loss print
      
      * Make tests green
      
      * Updated docstrings
      
      * Fix style
      
      * Fixed num_heads in config
      
      * Removed unnecessary video checkpoint related code in the conversion script
      
      * Fix style
      
      * Changed atol in conversion script
      
      * HieraConfig
      
      * Fix copies
      
      * Fixed typo
      
      * Resolved few issues
      
      * make
      
      * converted conv_nd -> nn.Module
      
      * Removed video complexities
      
      * Removed video complexities
      
      * fix style
      
      * Addressing comments
      
      * Update src/transformers/models/hiera/modeling_hiera.py
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * Update src/transformers/models/hiera/modeling_hiera.py
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * Update src/transformers/models/hiera/modeling_hiera.py
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * Fix style
      
      * Fixed tests
      
      * Fixed typo
      
      * Fixed interpolate test
      
      * Made torch fx compatible
      
      * Made sure image processor is correct
      
      * Addressed comments
      
      * Noise directly as torch
      
      * Remove unnecessary attr
      
      * Added return_dict
      
      * Update src/transformers/models/hiera/__init__.py
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * Updated checkpoints
      
      * [run_slow] hiera
      
      * Fixed device mismatch
      
      * [run_slow] hiera
      
      * Fixed GPU tests
      
      * [run_slow] hiera
      
      ---------
      Co-authored-by: Ubuntu <ubuntu@ip-172-31-29-50.us-east-2.compute.internal>
      Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
      Co-authored-by: Eduardo Pacheco <eduardo.pach@hotmail.com>
      Co-authored-by: Eduardo Pacheco <69953243+EduardoPach@users.noreply.github.com>
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
    • Allow `Trainer.get_optimizer_cls_and_kwargs` to be overridden (#31875) · 574e68d5
      Apoorv Khandelwal authored
      * Change `Trainer.get_optimizer_cls_and_kwargs` to `self.`
      
      * Make `get_optimizer_cls_and_kwargs` an instance method
      
      * Fixing typo
      
      * Revert `get_optimizer_cls_and_kwargs` to staticmethod
      
      * restore newline to trainer.py eof
    • 🚨 fix(SigLip): remove spurious exclusion of first vision output token (#30952) · 52585019
      t11s authored
      fix(SigLip): remove spurious exclusion of first vision output token in classifier
    • Generate: fix `SlidingWindowCache.reset()` (#31917) · 6a05f68f
      Joao Gante authored
      fix sliding cache
    • Refactor flash attention implementation in transformers (#31446) · e3143952
      Arthur authored
      
      
      * dumb commit
      
      * nit
      
      * update
      
      * something like this
      
      * unpack in modeling utils
      
      * safe import
      
      * oups
      
      * update
      
      * nits
      
      * diff convert gemma
      
      * update
      
      * start propagating
      
      * update other modeling code as well
      
      * update for sliding window models
      
      * nits
      
      * more init cleanups
      
      * styling
      
      * fixup
      
      * noice
      
      * pass fixup
      
      * typo typing_extension -> typing_extensions
      
      * torch.nn.functionnal -> torch.nn.functional
      
      * add to import structure
      
      * unpack
      
      * simplify a bit more for this first version
      
      * nut
      
      * update
      
      * update
      
      * nit
      
      * ease the import of `Unpack`
      
      * remove useless `use_sliding_window`
      
      * no qua please
      
      * protect import?
      
      * style
      
      * [run-slow]
      
      * [run slow] llama,gemma,mistral,mixtral
      
      * remove extra kwargs
      
      * fix llama
      
      * address review comments
      
      * apply diff_model_converter to modeling_gemma.py
      
      * remove cache_position 1
      
      * remove cache_position 2
      
      * some cleaning
      
      * refactor gemma2 as well
      
      * apply review comments
      
      * rename file to modeling_flash_attention_utils.py
      
      * siglip refactor
      
      * remove dead code
      
      * is the hub down?
      
      * still down?
      
      * fix siglip
      
      * fix gemma2
      
      * fatal: Could not read from remote repository.
      
      * fix typo in softcap implem
      
      * flaky
      
      * Failed: Timeout >120.0s
      
      ---------
      Co-authored-by: fxmarty <9808326+fxmarty@users.noreply.github.com>
    • Fix fx tests with inputs_embeds (#31862) · ad4ef3a2
      fxmarty authored
      * fix tests
      
      * [test_all] check
      
      * address review comments
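
Note on commit 574e68d5 (#31875): per its message, the call to `get_optimizer_cls_and_kwargs` was changed from `Trainer.` to `self.` while the method itself stayed a staticmethod. A minimal sketch of why that is enough to make the method overridable, using hypothetical stand-in classes (`BaseTrainer`, `CustomTrainer`, the `create_optimizer*` helpers, and the returned values are assumptions for illustration, not the actual transformers code):

```python
class BaseTrainer:
    """Hypothetical stand-in for a trainer class; not transformers.Trainer."""

    @staticmethod
    def get_optimizer_cls_and_kwargs(args):
        # Default choice; strings stand in for real optimizer classes.
        return "AdamW", {"lr": args["lr"]}

    def create_optimizer_hardcoded(self, args):
        # Lookup through the base class name: subclass overrides are ignored.
        return BaseTrainer.get_optimizer_cls_and_kwargs(args)

    def create_optimizer(self, args):
        # Lookup through self: resolved on the instance's class, so a
        # subclass override is picked up even though it is a staticmethod.
        return self.get_optimizer_cls_and_kwargs(args)


class CustomTrainer(BaseTrainer):
    @staticmethod
    def get_optimizer_cls_and_kwargs(args):
        # Hypothetical override supplying a different optimizer.
        return "SGD", {"lr": args["lr"], "momentum": 0.9}


trainer = CustomTrainer()
hardcoded = trainer.create_optimizer_hardcoded({"lr": 0.1})
dispatched = trainer.create_optimizer({"lr": 0.1})
print(hardcoded)
print(dispatched)
```

In this sketch only the `self.` call site sees the subclass's version; the call through the class name always returns the base default.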