1. 03 Jun, 2024 17 commits
    • Fix GPU OOM for `mistral.py::Mask4DTestHard` (#31212) · 8a1a23ae
      Yih-Dar authored
      
      
      * build
      
      * build
      
      * build
      
      * build
      
      ---------
      Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
    • Set greater_is_better to False if metric_for_best_model ends with "loss" (#31142) · df5abae8
      miivanov90 authored
      * update to not(endswith(loss))
      
      * ruff formatting
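The rule named in the commit title can be sketched as follows. This is a minimal illustration, not the library's code: the helper name `resolve_greater_is_better` is hypothetical, and it assumes the default applies only when the user has not set `greater_is_better` explicitly.

```python
from typing import Optional


def resolve_greater_is_better(
    metric_for_best_model: Optional[str],
    greater_is_better: Optional[bool] = None,
) -> Optional[bool]:
    """Hypothetical helper: choose a default only when none was given."""
    if greater_is_better is None and metric_for_best_model is not None:
        # A loss improves as it shrinks, so "greater is better" is False
        # for any metric whose name ends with "loss".
        return not metric_for_best_model.endswith("loss")
    return greater_is_better


print(resolve_greater_is_better("eval_loss"))      # False
print(resolve_greater_is_better("eval_accuracy"))  # True
```

An explicit value passed by the caller is returned unchanged in this sketch.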
    • Cohere: Fix copied from (#31213) · 924c46d4
      Younes Belkada authored
      Update modeling_cohere.py
    • Wrong translation FR : Contents = Contenu (#31186) · 98dd8423
      Jade Choghari authored
      Update index.md - Contents = Contenu
      
      French typo -
      Contents = Contenu
    • Rename sanity_evaluation to eval_on_start (#31192) · c6c78733
      Qubitium authored
      * Rename sanity_evaluation to eval_on_start
      
      * move arg back to last
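The renamed option runs one evaluation pass before any training step, as a sanity check that evaluation works at all. The toy loop below sketches that idea only. The `train` function and its parameters are hypothetical stand-ins and do not reproduce the Trainer implementation.

```python
from typing import Callable, List


def train(
    num_steps: int,
    eval_every: int,
    evaluate: Callable[[int], None],
    eval_on_start: bool = False,
) -> List[int]:
    """Toy loop. Returns the steps at which evaluation ran (0 = before training)."""
    evaluated_at: List[int] = []
    if eval_on_start:
        # Sanity check: evaluate once before the first optimization step.
        evaluate(0)
        evaluated_at.append(0)
    for step in range(1, num_steps + 1):
        # A real loop would run forward, backward and optimizer steps here.
        if step % eval_every == 0:
            evaluate(step)
            evaluated_at.append(step)
    return evaluated_at


print(train(4, 2, lambda step: None, eval_on_start=True))  # [0, 2, 4]
```

With the flag enabled, a broken evaluation setup fails at step 0 instead of only after the first evaluation interval.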
    • Fix typo in utils (#31169) · c230504b
      Bojun Feng authored
      fix typo
    • fix the get_size_with_aspect_ratio in max_size situation (#30902) · 874ac129
      Sangbum Daniel Choi authored
      
      
      * fix the get_size_with_aspect_ratio in max_size situation
      
      * make fix-up
      
      * add more general solution
      
      * consider when max_size is not defined
      
      * fix typo
      
      * fix typo
      
      * simple fix
      
      * fix error
      
      * fix if else error
      
      * fix error of size overwrite
      
      * fix yolos image processing
      
      * fix detr image processing
      
      * make
      
      * add longest related test script
      
      * Update src/transformers/models/yolos/image_processing_yolos.py
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * add more test
      
      * add test script about longest size
      
      * remove deprecated
      
      ---------
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
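For context, the kind of computation this commit touches can be sketched as below: scale the shorter side to `size`, then fall back to capping the longer side when it would exceed `max_size`. This is a simplified illustration under those assumptions. The function name `resize_keep_aspect` is hypothetical, and the library's rounding and edge-case handling may differ.

```python
from typing import Optional, Tuple


def resize_keep_aspect(
    image_size: Tuple[int, int],
    size: int,
    max_size: Optional[int] = None,
) -> Tuple[int, int]:
    """Return (height, width) with the shorter side scaled to `size`,
    unless that would push the longer side past `max_size`."""
    height, width = image_size
    short_side, long_side = min(height, width), max(height, width)
    scale = size / short_side
    if max_size is not None and long_side * scale > max_size:
        # The longer side would overflow: scale by the longer side instead.
        scale = max_size / long_side
    return round(height * scale), round(width * scale)


print(resize_keep_aspect((480, 640), size=800, max_size=1333))   # (800, 1067)
print(resize_keep_aspect((400, 2000), size=800, max_size=1333))  # (267, 1333)
```

When `max_size` is not defined, the sketch only scales the shorter side, which matches the "consider when max_size is not defined" step in the commit list.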
    • Add Qwen2 GGUF loading support (#31175) · e4628434
      Isotr0py authored
      * add qwen2 gguf support
      
      * Update docs
      
      * fix qwen2 tokenizer
      
      * add qwen2 gguf test
      
      * fix typo in qwen2 gguf test
      
      * format code
      
      * Remove mistral, clarify the error message
      
      * format code
      
      * add typing and update docstring
    • Fix `test_compile_static_cache` (#30991) · df848acc
      Yih-Dar authored
      
      
      * fix
      
      * fix
      
      * fix
      
      * fix
      
      ---------
      Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
    • 🚨 [Mistral and friends] Update MLP (#31057) · 70c87138
      NielsRogge authored
      Update MLP
    • SlidingWindowCache: reduce differences to other Cache classes (#30970) · d475f767
      Joao Gante authored
      * tmp commit
      
      * sliding window with fewer differences
      
      * make fixup + rebase
      
      * missing overwrite
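As background, a sliding-window cache keeps only the most recent `window` key/value entries and discards older ones. The class below is a toy sketch of that behaviour using plain Python containers. `ToySlidingWindowCache` is a hypothetical name, and the real class works on tensors with a different internal layout.

```python
from collections import deque
from typing import Deque, List, Tuple


class ToySlidingWindowCache:
    """Keeps at most `window` of the most recent key/value entries."""

    def __init__(self, window: int) -> None:
        # deque(maxlen=...) silently drops the oldest item once it is full.
        self.keys: Deque[int] = deque(maxlen=window)
        self.values: Deque[int] = deque(maxlen=window)

    def update(self, key: int, value: int) -> Tuple[List[int], List[int]]:
        """Append one position and return the keys/values visible to attention."""
        self.keys.append(key)
        self.values.append(value)
        return list(self.keys), list(self.values)


cache = ToySlidingWindowCache(window=3)
for t in range(5):
    keys, values = cache.update(t, t * 10)
print(keys, values)  # [2, 3, 4] [20, 30, 40]
```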
    • Ignore non-causal mask in more cases with SDPA (#30138) · 221aaec6
      fxmarty authored
      * update non-causal mask for sdpa
      
      * add test
      
      * update docstrings
      
      * add one more test
      
      * fix cross attention bug
      
      * gentler atol/rtol
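One reading of the title, stated here as an assumption rather than taken from the diff: when a non-causal attention mask masks nothing (every position is attended), it can be replaced by `None`, which leaves the attention kernel free to take a faster path. A minimal sketch of that check on plain nested lists, with the hypothetical helper `drop_trivial_mask`:

```python
from typing import List, Optional


def drop_trivial_mask(
    attention_mask: Optional[List[List[int]]],
) -> Optional[List[List[int]]]:
    """Return None when the mask keeps every position, else the mask itself."""
    if attention_mask is None:
        return None
    if all(all(v == 1 for v in row) for row in attention_mask):
        # Nothing is masked out, so the mask carries no information.
        return None
    return attention_mask


print(drop_trivial_mask([[1, 1, 1], [1, 1, 1]]))  # None
print(drop_trivial_mask([[1, 1, 0], [1, 1, 1]]))  # [[1, 1, 0], [1, 1, 1]]
```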
    • Fix Cannot convert [array()] to EagerTensor of dtype int64 (#31109) · f4f69625
      Pavithra Devi M authored
      Running the model.prepare_tf_dataset() method raises the error below:
      ```
      TypeError: Cannot convert [array([322.,   1.])] to EagerTensor of dtype int64
      ```
      
      This happens in "DataCollatorForSeq2Seq" when we try to convert the
      labels to tensors. The labels can be either a list of lists or a list of
      ndarrays. Converting a list of lists works without problems. The problem
      arises when the list holds ndarrays of float values (like below).
      
      ```
      [array([322.,   1.])]
      ```
      
      The exception is raised while trying to convert this label to a tensor
      using the code below.
      
      ```
      batch["labels"] = tf.constant(batch["labels"], dtype=tf.int64)
      ```
      
      The labels are always integer values, but they get converted to float
      values in the label padding operation below.
      ```
      batch["labels"] = [
                          call(label)
                          if padding_side == "right"
                          else np.concatenate([[self.label_pad_token_id] * (max_label_length - len(label)), label])
                          for label in labels
                          ]
      ```
      Here we have 2 cases:
      1 - Concatenating an array having integer padding token value with labels.
      2 - Concatenating an empty array with labels.
      
      ----------------------------------------------------------------------------------------
      Case 1: Concatenating an array having integer padding token value with labels.
      WORKS AS EXPECTED:
      ----------------------------------------------------------------------------------------
      ```
      label = np.array([233, 1])
      max_label_length = 4
      label_pad_token_id = -100
      np.concatenate([[label_pad_token_id] * (max_label_length - len(label)), label])
      o/p:
      array([-100, -100,  233,    1])
      ```
      
      ----------------------------------------------------------------------------------------
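      Case 2: Concatenating an empty array with labels.
      GIVES THE ISSUE:
      This scenario can happen when the label already has the maximum label length, so no padding is needed.
      NumPy treats the empty Python list as a float64 array, which promotes the concatenated result to float.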
      ----------------------------------------------------------------------------------------
      ```
      label = np.array([233, 1])
      max_label_length = 2
      label_pad_token_id = -100
      np.concatenate([[label_pad_token_id] * (max_label_length - len(label)), label])
      o/p:
      array([233.,   1.])
      ```
      
      ----------------------------------------------------------------------------------------
      Solution:
      ----------------------------------------------------------------------------------------
      We need to concatenate an ndarray of dtype int with the labels.
      
      AFTER FIX:
      ----------
      case 1:
      ```
      
      label = np.array([233, 1])
      max_label_length = 4
      label_pad_token_id = -100
      np.concatenate([np.array([label_pad_token_id] * (max_label_length - len(label)), dtype=np.int64),label])
      
      o/p:
      array([-100, -100,  233,    1])
      ```
      
      case 2:
      ```
      
      label = np.array([233, 1])
      max_label_length = 2
      label_pad_token_id = -100
      np.concatenate([np.array([label_pad_token_id] * (max_label_length - len(label)), dtype=np.int64),label])
      
      o/p:
      array([233,   1])
      ```
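The two cases above can be checked side by side in one self-contained script. This is a sketch built from the values in the commit message. The variable names `pad_len`, `before` and `after` are introduced here for illustration.

```python
import numpy as np

label = np.array([233, 1])
label_pad_token_id = -100
max_label_length = 2  # equals len(label), so no padding is required

pad_len = max_label_length - len(label)  # 0, so the padding list is empty

# Before the fix: an empty Python list is converted to a float64 array,
# which promotes the concatenated result to float.
before = np.concatenate([[label_pad_token_id] * pad_len, label])

# After the fix: the padding is an explicit int64 array, even when empty.
after = np.concatenate(
    [np.array([label_pad_token_id] * pad_len, dtype=np.int64), label]
)

print(before.dtype, after.dtype)  # float64 int64
```

Only the integer-typed result can be passed to `tf.constant(..., dtype=tf.int64)` without triggering the `TypeError` quoted at the top of the message.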
    • [`GemmaModel`] fix small typo (#31202) · 1749841a
      Arthur authored
      * fixes

      * fix-copies
    • Token healing (#30081) · 39b2ff69
      Ahmed Moubtahij authored
      
      
      * token healing impl + trie with extensions
      
      * make fixup
      
      * prefix-robust space tokenization
      
      * examples readme and requirements
      
      * make fixup
      
      * allow input prompt and model
      
      * redundant defaults
      
      * Specialized Trie
      
      * make fixup
      
      * updated tests with new inherited Tree
      
      * input ids to auto device_map
      
      * rm unused import
      
      * Update src/transformers/generation/utils.py
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * naming convention
      
      * Revert "naming convention"
      
      This reverts commit dd39d9c5b7a969e2d8a8d2a8e54f121b82dc44f0.
      
      * naming convention
      
      * last -hopefully- changes
      
      ---------
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
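Token healing, in general terms, backs up over the last prompt token and restricts the next generated token to vocabulary entries that extend it, so that a prompt ending in `:` can still continue with a single `://` token. The commit list mentions a trie with extensions. The sketch below shows one way such a lookup could work on a toy vocabulary. The `Trie` class and the vocabulary are hypothetical and are not the code added by this commit.

```python
from typing import List


class Trie:
    """Character trie over token strings; "" marks the end of a token."""

    def __init__(self, tokens: List[str]) -> None:
        self.root: dict = {}
        for token in tokens:
            node = self.root
            for ch in token:
                node = node.setdefault(ch, {})
            node[""] = True

    def extensions(self, prefix: str) -> List[str]:
        """Return every stored token that starts with `prefix`, sorted."""
        node = self.root
        for ch in prefix:
            if ch not in node:
                return []
            node = node[ch]
        found: List[str] = []
        stack = [(node, prefix)]
        while stack:
            current, text = stack.pop()
            for key, child in current.items():
                if key == "":
                    found.append(text)
                else:
                    stack.append((child, text + key))
        return sorted(found)


vocab = ["http", "https", "://", ":", "//", "www"]
print(Trie(vocab).extensions(":"))  # [':', '://']
```

A generation loop could then allow only the returned candidates for the first new token. How the candidates are turned into a constraint is outside this sketch.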
    • Remove copied froms for deprecated models (#31153) · 5b5b48b1
      amyeroberts authored
      * Remove copied froms for deprecated models

      * Remove automatically in script
    • Fix typo: use_safetenstors to use_safetensors (#31184) · 97e5a707
      CharlesCNorton authored
      Corrected a typo in security.md. Changed `use_safetenstors` to `use_safetensors` in the section discussing the usage of safe formats for loading models to prevent arbitrary code execution.
  2. 31 May, 2024 10 commits
    • Diff converter v2 (#30868) · 96eb0628
      Arthur authored
      * current working example!
      
      * commit regex and result file
      
      * update
      
      * nit
      
      * push the conversion file
      
      * oups
      
      * roadmap and nits
      
      * attempt diffs for 3 files
      
      * persimmon
      
      * nit
      
      * add diff file that is the same as the modeling_llama.py
      
      * fix rope nits
      
      * updates
      
      * updates with converted versions
      
      * give some breathing space to the code
      
      * delete
      
      * update
      
      * update
      
      * push the actual result
      
      * update regex patterns
      
      * update regex patterns
      
      * fix some issues
      
      * fix some issues
      
      * fix some issues
      
      * updates
      
      * updates
      
      * updates
      
      * updates
      
      * updates
      
      * revert changes done to llama
      
      * updates
      
      * update gemma
      
      * updates
      
      * oups
      
      * current state
      
      * current state
      
      * update
      
      * ouiiii
      
      * nit
      
      * clear diffs
      
      * nit
      
      * fixup
      
      * update
      
      * doc 🚀

      * 🔥
      
      * for now use gemma
      
      * deal with comments
      
      * style
      
      * handle functions
      
      * deal with assigns
      
      * todos
      
      * process inheritage
      
      * keep decorators?
      
      * 馃
      
      * deal with duplicates
      
      * fixup
      
      * correctly remove duplicate code
      
      * run ruff post script
      
      * ruff deals pretty well with imports, let's leave it to him
      
      * ah maybe not lol
      
      * for now remove all imports from child.
      
      * nit
      
      * conversion of llama
      
      * okay
      
      * convert starcoder2
      
      * synch with main
      
      * update llama diff
      
      * updates
      
      * https://docs.astral.sh/ruff/rules/redefined-while-unused/
      
       fixes the imports, but needs a later version of ruff
      
      * updates
      
      * okay actual state
      
      * non zero exit
      
      * update!
      
      * revert unrelated
      
      * remove other diff files
      
      * updates
      
      * cleanup
      
      * update
      
      * less diff!
      
      * stash
      
      * current updates
      
      * updates
      
      * No need for call
      
      * finished finding deps
      
      * update
      
      * current changes
      
      * current state
      
      * current state
      
      * new status
      
      * nit
      
      * finally
      
      * fixes
      
      * nits
      
      * order is now expected
      
      * use logger info instead of prints
      
      * fixup
      
      * up
      
      * nit
      
      * update
      
      * nits
      
      * update
      
      * correct merge
      
      * update
      
      * update
      
      * update
      
      * add warning
      
      * update caution message
      
      * update
      
      * better merging strategy
      
      * copy class statements :wink
      
      * fixups
      
      * nits
      
      * update
      
      * Apply suggestions from code review
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * nits
      
      * smaller header
      
      * do cleanup some stuff
      
      * even simpler header?
      
      * fixup
      
      * updates
      
      * ruff
      
      * update examples
      
      * nit
      
      * TODO
      
      * state
      
      * OUUUUUUF
      
      * current state
      
      * nits
      
      * final state
      
      * add a readme
      
      * fixup
      
      * remove diff llama
      
      * fix
      
      * nit
      
      * dummy noy funny
      
      * ruff format tests src utils --check
      
      * everless diffs
      
      * less diffs and fix test
      
      * fixes
      
      * naming nit?
      
      * update converter and add supper example
      
      * nits
      
      * updated for function signatures
      
      * update
      
      * update
      
      * add converted dummies
      
      * autoformat
      
      * single target assign fix
      
      * fixup
      
      * fix some imports
      
      * fixes
      
      * don't push them
      
      * `# noqa: F841`
      
      ---------
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
    • Added description of quantization_config (#31133) · 372baec2
      Vallepu Vamsi Krishna authored
      * Description of quantization_config
      
      Added missing description about quantization_config in replace_with_bnb_linear for better readability.
      
      * Removed trailing spaces
    • Instance segmentation examples (#31084) · cdc81311
      Pavel Iakubovskii authored
      
      
      * Initial setup
      
      * Metrics
      
      * Overfit on two batches
      
      * Train 40 epochs
      
      * Memory leak debugging
      
      * Trainer fine-tuning
      
      * Draft
      
      * Fixup
      
      * Trained end-to-end
      
      * Add requirements
      
      * Rewrite evaluator
      
      * nits
      
      * Add readme
      
      * Add instance-segmentation to the table
      
      * Support void masks
      
      * Remove sh
      
      * Update docs
      
      * Add pytorch test
      
      * Add accelerate test
      
      * Update examples/pytorch/instance-segmentation/README.md
      
      * Update examples/pytorch/instance-segmentation/run_instance_segmentation.py
      
      * Update examples/pytorch/instance-segmentation/run_instance_segmentation_no_trainer.py
      
      * Update examples/pytorch/instance-segmentation/run_instance_segmentation_no_trainer.py
      
      * Update examples/pytorch/instance-segmentation/run_instance_segmentation.py
      
      * Fix consistency oneformer
      
      * Fix imports
      
      * Fix imports sort
      
      * Apply suggestions from code review
      Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
      
      * Update examples/pytorch/instance-segmentation/run_instance_segmentation.py
      Co-authored-by: Sangbum Daniel Choi <34004152+SangbumChoi@users.noreply.github.com>
      
      * Add resources to docs
      
      * Update examples/pytorch/instance-segmentation/README.md
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * Update examples/pytorch/instance-segmentation/README.md
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * Remove explicit model_type argument
      
      * Fix tests
      
      * Update readme
      
      * Note about other models
      
      ---------
      Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
      Co-authored-by: Sangbum Daniel Choi <34004152+SangbumChoi@users.noreply.github.com>
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
    • Add streaming, various fixes (#30838) · 9837a254
      Aymeric Roucher authored
      * Implement streaming run in ReAct agents
      * Allow additional imports in code agents
      * Python interpreter: support classes and exceptions, fixes
    • [trainer] add sanity evaluation option (#31146) · f8e6ba45
      Marc Sun authored
      
      
      * add sanity evaluation
      
      * fix
      
      * Apply suggestions from code review
      Co-authored-by: Zach Mueller <muellerzr@gmail.com>
      
      * fix
      
      ---------
      Co-authored-by: Zach Mueller <muellerzr@gmail.com>
    • Quantization: Enhance bnb error message (#31160) · fc5d3e11
      Younes Belkada authored
      enhance error message
    • Update sam.md (#31130) · bd9d1ddf
      Asif Ajrof authored
      The `mask` variable is not defined, probably a writing mistake; it should be `segmentation_map`. `segmentation_map` should be a `1` channel image rather than `RGB`.
      On a different note, the `mask_url` is the same as `raw_image`; a better example could be provided.
    • Fix quantized cache output (#31143) · 48cada87
      Marc Sun authored
    • pytest -rsfE (#31140) · d19566e8
      Yih-Dar authored

      Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
    • helper (#31152) · f3f640dc
      Arthur authored
      
      
      * helper
      
      * Apply suggestions from code review
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * updates
      
      * more doc
      
      ---------
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
  3. 30 May, 2024 4 commits
  4. 29 May, 2024 9 commits