Ramiro Leal-Cavazos authored
* Remove unnecessary `view` of `position_ids` in `modeling_llama`

  When `position_ids` is `None`, its value is generated using `torch.arange`, which creates a tensor of size `(seq_length + past_key_values_length) - past_key_values_length = seq_length`. The tensor is then unsqueezed, resulting in a tensor of shape `(1, seq_length)`. This means that the last `view` to a tensor of shape `(-1, seq_length)` is a no-op.

  This commit removes the unnecessary view.

* Remove no-op `view` of `position_ids` in rest of transformer models
8878eb1b
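
The shape argument in the commit message can be checked with a small sketch. It is not the code from `modeling_llama`: the helper name `make_position_ids` and the example sizes are assumptions made for illustration, and only the `torch.arange`, `unsqueeze`, and `view` steps come from the description above.

```python
import torch


def make_position_ids(seq_length: int, past_key_values_length: int) -> torch.Tensor:
    """Hypothetical helper mirroring the default path described in the commit message."""
    # arange over [past, past + seq_length) yields exactly seq_length elements.
    position_ids = torch.arange(
        past_key_values_length,
        seq_length + past_key_values_length,
        dtype=torch.long,
    )
    # unsqueeze(0) adds a leading batch dimension: shape (1, seq_length).
    return position_ids.unsqueeze(0)


# Example sizes chosen arbitrarily for illustration.
seq_length, past_key_values_length = 5, 3
position_ids = make_position_ids(seq_length, past_key_values_length)

# The view that the commit removes: reshaping (1, seq_length) to (-1, seq_length).
viewed = position_ids.view(-1, seq_length)

same_shape = viewed.shape == position_ids.shape                # shape is unchanged
same_values = torch.equal(viewed, position_ids)                # values are unchanged
same_storage = viewed.data_ptr() == position_ids.data_ptr()    # no copy is made
```

Under these assumptions all three checks hold, which is why dropping the `view` on this path should not change behavior. The sketch covers only the case where `position_ids` is `None` and gets generated; it says nothing about tensors passed in by the caller.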