1. 10 Jun, 2023 1 commit
  2. 09 Jun, 2023 7 commits
  3. 08 Jun, 2023 2 commits
  4. 07 Jun, 2023 4 commits
  5. 06 Jun, 2023 4 commits
    • Remote code improvements (#23959) · f1660d7e
      Sylvain Gugger authored
      
      
      * Fix model load when it has both code on the Hub and locally
      
      * Add input check with timeout
      
      * Add tests
      
      * Apply suggestions from code review
      Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>
      
      * Some non-saved stuff
      
      * Add feature extractors
      
      * Add image processor
      
      * Add model
      
      * Add processor and tokenizer
      
      * Reduce timeout
      
      ---------
      Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>
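The "input check with timeout" above can be pictured as follows: prompt the user to confirm trusting Hub code, and fail closed if no answer arrives in time. A minimal sketch; the helper name, wording, and timeout value are illustrative, not the exact transformers implementation.

```python
from threading import Thread

def ask_to_trust_remote_code(repo_id: str, timeout: float = 15.0) -> bool:
    """Hypothetical helper: ask the user, abort if they don't answer."""
    answer = []

    def _prompt():
        reply = input(
            f"Loading {repo_id} requires executing code from the Hub. "
            "Do you trust this code? [y/N] "
        )
        answer.append(reply.strip().lower() in ("y", "yes"))

    thread = Thread(target=_prompt, daemon=True)
    thread.start()
    thread.join(timeout)
    if thread.is_alive():
        # No answer within the timeout: err on the side of caution.
        raise TimeoutError("No response received; not trusting remote code.")
    return answer[0]
```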
    • Move TF building to an actual build() method (#23760) · 4a55e478
      Matt authored
      * A fun new PR where I break the entire codebase again
      
      * A fun new PR where I break the entire codebase again
      
      * Handle cross-attention
      
      * Move calls to model(model.dummy_inputs) to the new build() method
      
      * Seeing what fails with the build context thing
      
      * make fix-copies
      
      * Let's see what fails with new build methods
      
      * Fix the pytorch crossload build calls
      
      * Fix the overridden build methods in vision_text_dual_encoder
      
      * Make sure all our build methods set self.built or call super().build(), which also sets it
      
      * make fix-copies
      
      * Remove finished TODO
      
      * Tentatively remove unneeded (?) line
      
      * Transpose b in deberta correctly and remove unused threading local
      
      * Get rid of build_with_dummies and all it stands for
      
      * Rollback some changes to TF-PT crossloading
      
      * Correctly call super().build()
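The pattern this PR migrates to, sketched on a toy Keras layer (illustrative, not a transformers class): weights are created in an explicit build() that checks self.built and calls super().build(), replacing forward passes on dummy inputs.

```python
import tensorflow as tf

class ToyDense(tf.keras.layers.Layer):
    def __init__(self, units: int, **kwargs):
        super().__init__(**kwargs)
        self.units = units

    def build(self, input_shape):
        if self.built:
            return
        # Create weights lazily, once the input shape is known.
        self.kernel = self.add_weight("kernel", shape=(input_shape[-1], self.units))
        # super().build() sets self.built = True, as the PR requires.
        super().build(input_shape)

    def call(self, inputs):
        return tf.matmul(inputs, self.kernel)

layer = ToyDense(4)
layer.build((None, 8))  # builds weights without a dummy forward pass
```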
    • Add TimmBackbone model (#22619) · a717e031
      amyeroberts authored
      
      
      * Add test_backbone for convnext
      
      * Add TimmBackbone model
      
      * Add check for backbone type
      
      * Tidying up - config checks
      
      * Update convnextv2
      
      * Tidy up
      
      * Fix indices & clearer comment
      
      * Exceptions for config checks
      
* Correctly update config for tests
      
      * Safer imports
      
      * Safer safer imports
      
      * Fix where decorators go
      
      * Update import logic and backbone tests
      
      * More import fixes
      
      * Fixup
      
      * Only import all_models if torch available
      
      * Fix kwarg updates in from_pretrained & main rebase
      
      * Tidy up
      
      * Add tests for AutoBackbone
      
      * Tidy up
      
      * Fix import error
      
      * Fix up
      
* Install natten in doc_test_job
      
      * Revert back to setting self._out_xxx directly
      
      * Bug fix - out_indices mapping from out_features
      
      * Fix tests
      
* Don't accept output_loading_info for Timm models
      
      * Set out_xxx and don't remap
      
      * Use smaller checkpoint for test
      
      * Don't remap timm indices - check out_indices based on stage names
      
      * Skip test as it's n/a
      
      * Apply suggestions from code review
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Cleaner imports / spelling is hard
      
      ---------
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
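A hedged usage sketch for the new backbone through the AutoBackbone API; the timm model name and out_indices values are illustrative.

```python
import torch
from transformers import AutoBackbone

# use_timm_backbone routes loading through TimmBackbone rather than a
# native transformers implementation.
backbone = AutoBackbone.from_pretrained(
    "resnet50", use_timm_backbone=True, out_indices=(1, 2, 3)
)
outputs = backbone(torch.rand(1, 3, 224, 224))
for fmap in outputs.feature_maps:
    print(fmap.shape)  # one feature map per requested stage
```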
  6. 05 Jun, 2023 3 commits
  7. 02 Jun, 2023 3 commits
  8. 01 Jun, 2023 1 commit
  9. 31 May, 2023 5 commits
    • Bug fix - flip_channel_order for channels first images (#23701) · c608b8fc
      amyeroberts authored
      Bug fix - flip_channel_order for channels_first
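The bug in miniature: flipping RGB <-> BGR must index the channel axis, which comes first for channels_first images and last otherwise. A minimal sketch with an illustrative signature, not the exact transformers function:

```python
import numpy as np

def flip_channel_order(image: np.ndarray, channels_first: bool) -> np.ndarray:
    if channels_first:           # shape (C, H, W): flip axis 0
        return image[::-1, ...]
    return image[..., ::-1]      # shape (H, W, C): flip the last axis

chw = np.zeros((3, 2, 2))
chw[0] = 1.0                                           # "red" plane first
flipped = flip_channel_order(chw, channels_first=True)
assert flipped[2].sum() == 4.0                         # red is now last
```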
    • fix: Replace `add_prefix_space` in `get_prompt_ids` with manual space for FastTokenizer compatibility (#23796) · 7adce8b5
      Connor Henderson authored
      
      * add ' ' replacement for add_prefix_space
      
      * add fast tokenizer test
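The fix in brief: instead of passing add_prefix_space=True, which fast tokenizers don't accept as a call-time argument, the prompt text itself gets a leading space so slow and fast tokenizers produce the same ids. A sketch with a hypothetical helper name:

```python
def build_prompt_text(prompt: str) -> str:
    # Manual replacement for add_prefix_space: strip, then lead with " ".
    return " " + prompt.strip()

assert build_prompt_text("Mr. Quilter") == " Mr. Quilter"
```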
    • Unpin numba (#23162) · 8f915c45
      Sanchit Gandhi authored
      * fix for ragged list
      
      * unpin numba
      
      * make style
      
      * np.object -> object
      
      * propagate changes to tokenizer as well
      
      * np.long -> "long"
      
      * revert tokenization changes
      
      * check with tokenization changes
      
      * list/tuple logic
      
      * catch numpy
      
      * catch else case
      
      * clean up
      
      * up
      
      * better check
      
      * trigger ci
      
      * Empty commit to trigger CI
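The NumPy changes behind the unpin, in brief: np.object was removed in NumPy 1.24, and ragged lists must now opt in to dtype=object explicitly.

```python
import numpy as np

ragged = [[1, 2, 3], [4, 5]]          # rows of unequal length
arr = np.array(ragged, dtype=object)  # was: np.array(ragged) with np.object
print(arr.dtype)                      # object
```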
    • accelerate deepspeed and gradient accumulation integrate (#23236) · a73b1d59
      Sourab Mangrulkar authored
      * mixed precision support via accelerate
      
      * fix issues
      
      * fix for the sharded ddp case
      
      * fix flax and tf failing tests
      
* refactor the place to create `Accelerator` object
      
      * move ddp prep to accelerate
      
      * fix 😅
      
      * resolving comments
      
      * move fsdp handling to accelerate
      
* fixes
      
      * fix saving
      
      * shift torch dynamo handling to accelerate
      
      * shift deepspeed integration and save & load utils to accelerate
      
      * fix accelerate launcher support
      
      * oops
      
      * fix 🐛
      
      * save ckpt fix
      
      * Trigger CI
      
      * nasty 🐛 😅
      
      * as deepspeed needs grad_acc fixes, transfer grad_acc to accelerate
      
      * make tests happy
      
      * quality 
      
      * loss tracked needs to account for grad_acc
      
      * fixing the deepspeed tests
      
      * quality 
      
      * 😅😅😅
      
      * tests 😡
      
      * quality 
      
      
      
      * Trigger CI
      
      * resolve comments and fix the issue with the previous merge from branch
      
      * Trigger CI
      
      * accelerate took over deepspeed integration
      
      ---------
      Co-authored-by: Stas Bekman <stas@stason.org>
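From user code, the gradient-accumulation handling the Trainer now delegates to Accelerate looks roughly like this; a self-contained toy sketch, not the Trainer internals:

```python
import torch
from accelerate import Accelerator

accelerator = Accelerator(gradient_accumulation_steps=4)
model = torch.nn.Linear(8, 1)
optimizer = torch.optim.AdamW(model.parameters())
model, optimizer = accelerator.prepare(model, optimizer)

for x, y in [(torch.rand(2, 8), torch.rand(2, 1)) for _ in range(8)]:
    # accumulate() defers the gradient sync/step on non-boundary steps
    with accelerator.accumulate(model):
        loss = torch.nn.functional.mse_loss(model(x), y)
        accelerator.backward(loss)
        optimizer.step()
        optimizer.zero_grad()
```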
    • Add TensorFlow implementation of EfficientFormer (#22620) · 88f50a1e
      Denisa Roberts authored
      * Add tf code for efficientformer
      
      * Fix return dict bug - return last hidden state after last stage
      
      * Fix corresponding return dict bug
      
      * Override test tol
      
      * Change default values of training to False
      
      * Set training to default False X3
      
      * Rm axis from ln
      
      * Set init in dense projection
      
      * Rm debug stuff
      
      * Make style; all tests pass.
      
      * Modify year to 2023
      
      * Fix attention biases codes
      
      * Update the shape list logic
      
      * Add a batch norm eps config
      
* Remove extra comments in test files
      
      * Add conditional attn and hidden states return for serving output
      
      * Change channel dim checking logic
      
* Add exception for WithTeacher model in training mode
      
      * Revert layer count for now
      
      * Add layer count for conditional layer naming
      
      * Transpose for conv happens only in main layer
      
      * Make tests smaller
      
      * Make style
      
      * Update doc
      
      * Rm from_pt
      
* Change to actual expected image class label
      
      * Remove stray print in tests
      
      * Update image processor test
      
      * Remove the old serving output logic
      
      * Make style
      
      * Make style
      
      * Complete test
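A hedged usage sketch for the TF port; the checkpoint id follows the existing PyTorch EfficientFormer releases, and from_pt=True may be needed if TF weights were not uploaded for a given checkpoint.

```python
import tensorflow as tf
from transformers import TFEfficientFormerForImageClassification

model = TFEfficientFormerForImageClassification.from_pretrained(
    "snap-research/efficientformer-l1-300"
)
# pixel_values are channels-first, as with the other TF vision models
logits = model(tf.random.uniform((1, 3, 224, 224))).logits
print(logits.shape)  # (1, num_labels)
```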
  10. 30 May, 2023 3 commits
  11. 25 May, 2023 1 commit
  12. 24 May, 2023 6 commits
    • Fix the regex in `get_imports` to support multiline try blocks and excepts with specific exception types (#23725) · 89159651
      Daniel King authored
      
      * fix and test get_imports for multiline try blocks, and excepts with specific errors
      
      * fixup
      
      * add some more tests
      
      * add license
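An illustrative take on the fix: strip try/except import-fallback blocks, tolerating multiline try bodies and typed excepts, before collecting imports. This mirrors the idea, not the exact pattern shipped in transformers:

```python
import re

SOURCE = """
try:
    import flash_attn
    import flash_attn.layers
except ImportError as e:
    flash_attn = None
import torch
"""

# DOTALL lets .*? span multiple lines; "except\\s.*?:" also matches
# typed excepts such as "except ImportError as e:".
stripped = re.sub(
    r"^\s*try\s*:.*?except\s.*?:", "", SOURCE, flags=re.MULTILINE | re.DOTALL
)
imports = re.findall(r"^\s*import\s+(\w+)", stripped, flags=re.MULTILINE)
print(imports)  # ['torch'] -- the guarded imports are ignored
```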
    • d8222be5
      Sanchit Gandhi authored
    • Overhaul TF serving signatures + dummy inputs (#23234) · 814de8fa
      Matt authored
      * Let's try autodetecting serving sigs
      
      * Don't clobber existing sigs
      
      * Change shapes for multiplechoice models
      
      * Make default dummy inputs smarter too
      
      * Fix missing f-string
      
      * Let's YOLO a serving output too
      
      * Read __class__.__name__ properly
      
      * Don't just pass naked lists in there and expect it to be okay
      
      * Code cleanup
      
      * Update default serving sig
      
      * Clearer error messages
      
      * Further updates to the default serving output
      
      * make fixup
      
      * Update the serving output a bit more
      
      * Cleanups and renames, raise errors appropriately when we can't infer inputs
      
      * More renames
      
      * we're building in a functional context again, yolo
      
      * import DUMMY_INPUTS from the right place
      
      * import DUMMY_INPUTS from the right place
      
      * Support cross-attention in the dummies
      
      * Support cross-attention in the dummies
      
      * Complete removal of dummy/serving overrides in BERT
      
      * Complete removal of dummy/serving overrides in RoBERTa
      
      * Obliterate lots and lots of serving sig and dummy overrides
      
      * merge type hint changes
      
      * Fix for token_type_ids with vocab_size 1
      
      * Add missing property decorator
      
      * Fix T5 and hopefully some models that take conv inputs
      
      * More signature pruning
      
      * Fix T5's signature
      
      * Fix Wav2Vec2 signature
      
      * Fix LongformerForMultipleChoice input signature
      
      * Fix BLIP and LED
      
      * Better default serving output error handling
      
      * Fix BART dummies
      
      * Fix dummies for cross-attention, esp encoder-decoder models
      
      * Fix visionencoderdecoder signature
      
      * Fix BLIP serving output
      
      * Small tweak to BART dummies
      
      * Cleanup the ugly parameter inspection line that I used in a few places
      
      * committed a breakpoint again
      
      * Move the text_dims check
      
      * Remove blip_text serving_output
      
      * Add decoder_input_ids to the default input sig
      
      * Remove all the manual overrides for encoder-decoder model signatures
      
      * Tweak longformer/led input sigs
      
      * Tweak default serving output
      
      * output.keys() -> output
      
      * make fixup
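The autodetection idea in miniature: derive a serving signature from the parameter names of a model's call() instead of hand-writing one per model. Names, shapes, and dtypes below are illustrative:

```python
import inspect
import tensorflow as tf

TEXT_INPUTS = ("input_ids", "attention_mask", "token_type_ids", "decoder_input_ids")

def infer_input_signature(call_fn, seq_len=None):
    # Keep only recognized text inputs that actually appear in call().
    return {
        name: tf.TensorSpec((None, seq_len), tf.int32, name=name)
        for name in inspect.signature(call_fn).parameters
        if name in TEXT_INPUTS
    }

def toy_call(self, input_ids, attention_mask=None, pixel_values=None):
    ...

print(infer_input_signature(toy_call).keys())
# dict_keys(['input_ids', 'attention_mask'])
```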
    • Better TF docstring types (#23477) · f8b25744
      Matt authored
      * Rework TF type hints to use | None instead of Optional[] for tf.Tensor
      
      * Rework TF type hints to use | None instead of Optional[] for tf.Tensor
      
      * Don't forget the imports
      
      * Add the imports to tests too
      
      * make fixup
      
      * Refactor tests that depended on get_type_hints
      
      * Better test refactor
      
      * Fix an old hidden bug in the test_keras_fit input creation code
      
      * Fix for the Deit tests
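The annotation style this PR standardizes on; the __future__ import makes "X | None" legal on Pythons older than 3.10:

```python
from __future__ import annotations  # postpone evaluation of annotations

import numpy as np
import tensorflow as tf

def call(
    input_ids: tf.Tensor | None = None,  # was: Optional[tf.Tensor]
    attention_mask: np.ndarray | tf.Tensor | None = None,
    training: bool = False,
) -> tf.Tensor | None:
    return input_ids
```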
    • Paged Optimizer + Lion Optimizer for Trainer (#23217) · 796162c5
      Tim Dettmers authored
      
      
      * Added lion and paged optimizers and made original tests pass.
      
      * Added tests for paged and lion optimizers.
      
      * Added and fixed optimizer tests.
      
      * Style and quality checks.
      
      ---------
      Co-authored-by: younesbelkada <younesbelkada@gmail.com>
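From the user side, the new optimizers are reached through the optim flag of TrainingArguments; a hedged sketch assuming bitsandbytes is installed, with the string values being the variants added around this PR:

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="out",
    optim="paged_adamw_8bit",  # or e.g. "paged_adamw_32bit", "lion_8bit"
)
```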
    • 4-bit QLoRA via bitsandbytes (4-bit base model + LoRA) (#23479) · 9d73b922
      Tim Dettmers authored
      
      
      * Added lion and paged optimizers and made original tests pass.
      
      * Added tests for paged and lion optimizers.
      
      * Added and fixed optimizer tests.
      
      * Style and quality checks.
      
      * Initial draft. Some tests fail.
      
      * Fixed dtype bug.
      
      * Fixed bug caused by torch_dtype='auto'.
      
      * All test green for 8-bit and 4-bit layers.
      
      * Added fix for fp32 layer norms and bf16 compute in LLaMA.
      
      * Initial draft. Some tests fail.
      
      * Fixed dtype bug.
      
      * Fixed bug caused by torch_dtype='auto'.
      
      * All test green for 8-bit and 4-bit layers.
      
      * Added lion and paged optimizers and made original tests pass.
      
      * Added tests for paged and lion optimizers.
      
      * Added and fixed optimizer tests.
      
      * Style and quality checks.
      
      * Fixing issues for PR #23479.
      
      * Added fix for fp32 layer norms and bf16 compute in LLaMA.
      
      * Reverted variable name change.
      
      * Initial draft. Some tests fail.
      
      * Fixed dtype bug.
      
      * Fixed bug caused by torch_dtype='auto'.
      
      * All test green for 8-bit and 4-bit layers.
      
      * Added lion and paged optimizers and made original tests pass.
      
      * Added tests for paged and lion optimizers.
      
      * Added and fixed optimizer tests.
      
      * Style and quality checks.
      
      * Added missing tests.
      
      * Fixup changes.
      
      * Added fixup changes.
      
      * Missed some variables to rename.
      
      * revert trainer tests
      
      * revert test trainer
      
      * another revert
      
      * fix tests and safety checkers
      
      * protect import
      
      * simplify a bit
      
      * Update src/transformers/trainer.py
      
      * few fixes
      
      * add warning
      
      * replace with `load_in_kbit = load_in_4bit or load_in_8bit`
      
      * fix test
      
      * fix tests
      
      * this time fix tests
      
      * safety checker
      
      * add docs
      
      * revert torch_dtype
      
      * Apply suggestions from code review
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * multiple fixes
      
      * update docs
      
      * version checks and multiple fixes
      
      * replace `is_loaded_in_kbit`
      
      * replace `load_in_kbit`
      
      * change methods names
      
      * better checks
      
      * oops
      
      * oops
      
      * address final comments
      
      ---------
      Co-authored-by: younesbelkada <younesbelkada@gmail.com>
      Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
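A hedged sketch of 4-bit loading as introduced here; the model id is illustrative, and bitsandbytes >= 0.39 plus a CUDA GPU are assumed:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",              # NormalFloat4 quantization
    bnb_4bit_compute_dtype=torch.bfloat16,  # bf16 compute; norms stay fp32
)
model = AutoModelForCausalLM.from_pretrained(
    "huggyllama/llama-7b", quantization_config=bnb_config, device_map="auto"
)
```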