1. 03 Jun, 2024 7 commits
    • SlidingWindowCache: reduce differences to other Cache classes (#30970) · d475f767
      Joao Gante authored
      * tmp commit
      
      * sliding window with fewer differences
      
      * make fixup + rebase
      
      * missing overwrite
    • Ignore non-causal mask in more cases with SDPA (#30138) · 221aaec6
      fxmarty authored
      * update non-causal mask for sdpa
      
      * add test
      
      * update docstrings
      
      * add one more test
      
      * fix cross attention bug
      
      * gentler atol/rtol
    • Fix Cannot convert [array()] to EagerTensor of dtype int64 (#31109) · f4f69625
      Pavithra Devi M authored
      Calling model.prepare_tf_dataset() raises the error below:
      ```
      TypeError: Cannot convert [array([322.,   1.])] to EagerTensor of dtype int64
      ```
      
      This happens in "DataCollatorForSeq2Seq" when it tries to convert the
      labels to tensors. At that point the labels can be either a list of lists
      or a list of ndarrays. Converting a list of lists works without problems.
      The conversion fails when the list holds ndarrays of float values (like below).
      
      ```
      [array([322.,   1.])]
      ```
      
      The exception is raised when these labels are converted to a tensor with
      the code below.
      
      ```
      batch["labels"] = tf.constant(batch["labels"], dtype=tf.int64)
      ```
      
      The labels are always integer values, so they must have been converted to
      float values by the label padding operation below.
      ```
      batch["labels"] = [
          call(label)
          if padding_side == "right"
          else np.concatenate([[self.label_pad_token_id] * (max_label_length - len(label)), label])
          for label in labels
      ]
      ```
      Here we have 2 cases:
      1 - Concatenating an array having integer padding token value with labels.
      2 - Concatenating an empty array with labels.
      
      ----------------------------------------------------------------------------------------
      Case 1: Concatenating an array having integer padding token value with labels.
      WORKS AS EXPECTED:
      ----------------------------------------------------------------------------------------
      ```
      import numpy as np

      label = np.array([233, 1])
      max_label_length = 4
      label_pad_token_id = -100
      np.concatenate([[label_pad_token_id] * (max_label_length - len(label)), label])
      # Output: array([-100, -100,  233,    1])
      ```
      
      ----------------------------------------------------------------------------------------
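      Case 2: Concatenating an empty array with labels.
      GIVES THE ISSUE:
      This scenario occurs when the label already has the maximum label length, so no padding is needed.
      NumPy treats the empty padding list as a float64 array, which promotes the result to float.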
      ----------------------------------------------------------------------------------------
      ```
      import numpy as np

      label = np.array([233, 1])
      max_label_length = 2
      label_pad_token_id = -100
      # The padding list is empty here, so NumPy infers float64 for it.
      np.concatenate([[label_pad_token_id] * (max_label_length - len(label)), label])
      # Output: array([233.,   1.])
      ```
      
      ----------------------------------------------------------------------------------------
      Solution:
      ----------------------------------------------------------------------------------------
      Concatenate the labels with an ndarray of explicit integer dtype instead of a plain list.
      
      AFTER FIX:
      ----------
      Case 1:
      ```
      import numpy as np

      label = np.array([233, 1])
      max_label_length = 4
      label_pad_token_id = -100
      np.concatenate([np.array([label_pad_token_id] * (max_label_length - len(label)), dtype=np.int64), label])
      # Output: array([-100, -100,  233,    1])
      ```
      
      Case 2:
      ```
      import numpy as np

      label = np.array([233, 1])
      max_label_length = 2
      label_pad_token_id = -100
      # The empty padding array is int64, so the result stays integer.
      np.concatenate([np.array([label_pad_token_id] * (max_label_length - len(label)), dtype=np.int64), label])
      # Output: array([233,   1])
      ```
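
      The two cases above can be reproduced in one self-contained sketch. The
      helper name `pad_label_left` is hypothetical and used only for
      illustration; it is not the function from the actual patch. The sketch
      assumes left padding and int64 labels.

```python
import numpy as np


def pad_label_left(label, max_label_length, label_pad_token_id=-100):
    """Hypothetical helper: left-pad one label array and keep an integer dtype."""
    pad_len = max_label_length - len(label)
    # An explicitly typed int64 array stays int64 even when pad_len == 0.
    # A bare empty Python list would be inferred as float64 by NumPy.
    padding = np.full(pad_len, label_pad_token_id, dtype=np.int64)
    return np.concatenate([padding, label])


label = np.array([233, 1], dtype=np.int64)

# Before the fix: an empty padding list promotes the result to float64.
buggy = np.concatenate([[], label])

# After the fix: both the padded and the unpadded case stay int64.
padded = pad_label_left(label, max_label_length=4)
unpadded = pad_label_left(label, max_label_length=2)

print(buggy.dtype, padded.dtype, unpadded.dtype)
```

      Keeping the padding array typed means the later tf.constant(..., dtype=tf.int64)
      call receives integer arrays whether or not any padding was added.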
    • [`GemmaModel`] fix small typo (#31202) · 1749841a
      Arthur authored
      * fixes
      
      * fix-copies
    • Token healing (#30081) · 39b2ff69
      Ahmed Moubtahij authored
      
      
      * token healing impl + trie with extensions
      
      * make fixup
      
      * prefix-robust space tokenization
      
      * examples readme and requirements
      
      * make fixup
      
      * allow input prompt and model
      
      * redundant defaults
      
      * Specialized Trie
      
      * make fixup
      
      * updated tests with new inherited Tree
      
      * input ids to auto device_map
      
      * rm unused import
      
      * Update src/transformers/generation/utils.py
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * naming convention
      
      * Revert "naming convention"
      
      This reverts commit dd39d9c5b7a969e2d8a8d2a8e54f121b82dc44f0.
      
      * naming convention
      
      * last -hopefully- changes
      
      ---------
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
    • Remove copied froms for deprecated models (#31153) · 5b5b48b1
      amyeroberts authored
      * Remove copied froms for deprecated models
      
      * Remove automatically in script
    • Fix typo: use_safetenstors to use_safetensors (#31184) · 97e5a707
      CharlesCNorton authored
      Corrected a typo in security.md. Changed `use_safetenstors` to `use_safetensors` in the section discussing the usage of safe formats for loading models to prevent arbitrary code execution.
  2. 31 May, 2024 10 commits
  3. 30 May, 2024 4 commits
  4. 29 May, 2024 12 commits
  5. 28 May, 2024 7 commits