Commit a074a5d3 authored by Joao Gante, committed by GitHub

Docs: change some `input_ids` doc reference from `BertTokenizer` to `AutoTokenizer` (#24730)

parent 25411085
@@ -32,7 +32,7 @@ LOGITS_PROCESSOR_INPUTS_DOCSTRING = r"""
  input_ids (`torch.LongTensor` of shape `(batch_size, sequence_length)`):
  Indices of input sequence tokens in the vocabulary.
- Indices can be obtained using [`BertTokenizer`]. See [`PreTrainedTokenizer.encode`] and
+ Indices can be obtained using [`AutoTokenizer`]. See [`PreTrainedTokenizer.encode`] and
  [`PreTrainedTokenizer.__call__`] for details.
  [What are input IDs?](../glossary#input-ids)
......
@@ -17,7 +17,7 @@ STOPPING_CRITERIA_INPUTS_DOCSTRING = r"""
  input_ids (`torch.LongTensor` of shape `(batch_size, sequence_length)`):
  Indices of input sequence tokens in the vocabulary.
- Indices can be obtained using [`BertTokenizer`]. See [`PreTrainedTokenizer.encode`] and
+ Indices can be obtained using [`AutoTokenizer`]. See [`PreTrainedTokenizer.encode`] and
  [`PreTrainedTokenizer.__call__`] for details.
  [What are input IDs?](../glossary#input-ids)
......
@@ -576,7 +576,7 @@ BART_INPUTS_DOCSTRING = r"""
  input_ids (`tf.Tensor` of shape `({0})`):
  Indices of input sequence tokens in the vocabulary.
- Indices can be obtained using [`BertTokenizer`]. See [`PreTrainedTokenizer.encode`] and
+ Indices can be obtained using [`AutoTokenizer`]. See [`PreTrainedTokenizer.encode`] and
  [`PreTrainedTokenizer.__call__`] for details.
  [What are input IDs?](../glossary#input-ids)
......
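Several docstrings in this diff give the shape of `input_ids` as `({0})`. That reads as a positional `str.format` placeholder filled in when the docstring is attached to a concrete model method. The following is a minimal sketch of that substitution using only the standard library. The template text, the `render_inputs_docstring` helper, and the shape string are illustrative assumptions, not code from this commit.

```python
# Hypothetical template modelled on the docstrings in this diff; the real
# constants are longer and live in the individual modeling files.
INPUTS_DOCSTRING = r"""
    Args:
        input_ids (`torch.LongTensor` of shape `({0})`):
            Indices of input sequence tokens in the vocabulary.

            Indices can be obtained using [`AutoTokenizer`].
"""


def render_inputs_docstring(template: str, shape: str) -> str:
    """Fill the positional `{0}` placeholder with a concrete shape string."""
    return template.format(shape)


# Assumed shape string for a plain sequence model.
rendered = render_inputs_docstring(INPUTS_DOCSTRING, "batch_size, sequence_length")
print(rendered)
```

Because the substitution uses `str.format`, any other literal braces in such a template would have to be doubled to survive formatting.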
@@ -65,7 +65,7 @@ BRIDGETOWER_START_DOCSTRING = r"""
  BRIDGETOWER_INPUTS_DOCSTRING = r"""
  Args:
  input_ids (`torch.LongTensor` of shape `({0})`):
- Indices of input sequence tokens in the vocabulary. Indices can be obtained using [`BertTokenizer`]. See
+ Indices of input sequence tokens in the vocabulary. Indices can be obtained using [`AutoTokenizer`]. See
  [`PreTrainedTokenizer.encode`] and [`PreTrainedTokenizer.__call__`] for details. [What are input
  IDs?](../glossary#input-ids)
......
@@ -943,7 +943,7 @@ CLIP_TEXT_INPUTS_DOCSTRING = r"""
  input_ids (`np.ndarray`, `tf.Tensor`, `List[tf.Tensor]` ``Dict[str, tf.Tensor]` or `Dict[str, np.ndarray]` and each example must have the shape `({0})`):
  Indices of input sequence tokens in the vocabulary.
- Indices can be obtained using [`BertTokenizer`]. See [`PreTrainedTokenizer.__call__`] and
+ Indices can be obtained using [`AutoTokenizer`]. See [`PreTrainedTokenizer.__call__`] and
  [`PreTrainedTokenizer.encode`] for details.
  [What are input IDs?](../glossary#input-ids)
@@ -1000,7 +1000,7 @@ CLIP_INPUTS_DOCSTRING = r"""
  input_ids (`np.ndarray`, `tf.Tensor`, `List[tf.Tensor]` ``Dict[str, tf.Tensor]` or `Dict[str, np.ndarray]` and each example must have the shape `({0})`):
  Indices of input sequence tokens in the vocabulary.
- Indices can be obtained using [`BertTokenizer`]. See [`PreTrainedTokenizer.__call__`] and
+ Indices can be obtained using [`AutoTokenizer`]. See [`PreTrainedTokenizer.__call__`] and
  [`PreTrainedTokenizer.encode`] for details.
  [What are input IDs?](../glossary#input-ids)
......
@@ -882,7 +882,7 @@ FUNNEL_INPUTS_DOCSTRING = r"""
  input_ids (`torch.LongTensor` of shape `({0})`):
  Indices of input sequence tokens in the vocabulary.
- Indices can be obtained using [`BertTokenizer`]. See [`PreTrainedTokenizer.encode`] and
+ Indices can be obtained using [`AutoTokenizer`]. See [`PreTrainedTokenizer.encode`] and
  [`PreTrainedTokenizer.__call__`] for details.
  [What are input IDs?](../glossary#input-ids)
......
@@ -1502,7 +1502,7 @@ GROUPVIT_TEXT_INPUTS_DOCSTRING = r"""
  input_ids (`np.ndarray`, `tf.Tensor`, `List[tf.Tensor]` ``Dict[str, tf.Tensor]` or `Dict[str, np.ndarray]` and each example must have the shape `({0})`):
  Indices of input sequence tokens in the vocabulary.
- Indices can be obtained using [`BertTokenizer`]. See [`PreTrainedTokenizer.__call__`] and
+ Indices can be obtained using [`AutoTokenizer`]. See [`PreTrainedTokenizer.__call__`] and
  [`PreTrainedTokenizer.encode`] for details.
  [What are input IDs?](../glossary#input-ids)
@@ -1560,7 +1560,7 @@ GROUPVIT_INPUTS_DOCSTRING = r"""
  input_ids (`np.ndarray`, `tf.Tensor`, `List[tf.Tensor]` ``Dict[str, tf.Tensor]` or `Dict[str, np.ndarray]` and each example must have the shape `({0})`):
  Indices of input sequence tokens in the vocabulary.
- Indices can be obtained using [`BertTokenizer`]. See [`PreTrainedTokenizer.__call__`] and
+ Indices can be obtained using [`AutoTokenizer`]. See [`PreTrainedTokenizer.__call__`] and
  [`PreTrainedTokenizer.encode`] for details.
  [What are input IDs?](../glossary#input-ids)
......
@@ -1560,7 +1560,7 @@ LED_INPUTS_DOCSTRING = r"""
  input_ids (`tf.Tensor` of shape `({0})`):
  Indices of input sequence tokens in the vocabulary.
- Indices can be obtained using [`BertTokenizer`]. See [`PreTrainedTokenizer.encode`] and
+ Indices can be obtained using [`AutoTokenizer`]. See [`PreTrainedTokenizer.encode`] and
  [`PreTrainedTokenizer.__call__`] for details.
  [What are input IDs?](../glossary#input-ids)
......
@@ -106,7 +106,7 @@ MMBT_INPUTS_DOCSTRING = r"""
  Encoder, the shape would be (batch_size, channels, height, width)
  input_ids (`torch.LongTensor` of shape `(batch_size, sequence_length)`):
  Indices of input sequence tokens in the vocabulary. It does not expect [CLS] token to be added as it's
- appended to the end of other modality embeddings. Indices can be obtained using [`BertTokenizer`]. See
+ appended to the end of other modality embeddings. Indices can be obtained using [`AutoTokenizer`]. See
  [`PreTrainedTokenizer.encode`] and [`PreTrainedTokenizer.__call__`] for details.
  [What are input IDs?](../glossary#input-ids)
......
@@ -761,7 +761,7 @@ MOBILEBERT_INPUTS_DOCSTRING = r"""
  input_ids (`torch.LongTensor` of shape `({0})`):
  Indices of input sequence tokens in the vocabulary.
- Indices can be obtained using [`BertTokenizer`]. See [`PreTrainedTokenizer.encode`] and
+ Indices can be obtained using [`AutoTokenizer`]. See [`PreTrainedTokenizer.encode`] and
  [`PreTrainedTokenizer.__call__`] for details.
  [What are input IDs?](../glossary#input-ids)
......
@@ -960,7 +960,7 @@ T5_INPUTS_DOCSTRING = r"""
  Indices of input sequence tokens in the vocabulary. T5 is a model with relative position embeddings so you
  should be able to pad the inputs on the right or the left.
- Indices can be obtained using [`BertTokenizer`]. See [`PreTrainedTokenizer.__call__`] and
+ Indices can be obtained using [`AutoTokenizer`]. See [`PreTrainedTokenizer.__call__`] and
  [`PreTrainedTokenizer.encode`] for details.
  [What are input IDs?](../glossary#input-ids)
......
@@ -814,7 +814,7 @@ TRANSFO_XL_INPUTS_DOCSTRING = r"""
  input_ids (`tf.Tensor` or `Numpy array` of shape `(batch_size, sequence_length)`):
  Indices of input sequence tokens in the vocabulary.
- Indices can be obtained using [`BertTokenizer`]. See [`PreTrainedTokenizer.__call__`] and
+ Indices can be obtained using [`AutoTokenizer`]. See [`PreTrainedTokenizer.__call__`] and
  [`PreTrainedTokenizer.encode`] for details.
  [What are input IDs?](../glossary#input-ids)
......
@@ -610,7 +610,7 @@ VILT_START_DOCSTRING = r"""
  VILT_INPUTS_DOCSTRING = r"""
  Args:
  input_ids (`torch.LongTensor` of shape `({0})`):
- Indices of input sequence tokens in the vocabulary. Indices can be obtained using [`BertTokenizer`]. See
+ Indices of input sequence tokens in the vocabulary. Indices can be obtained using [`AutoTokenizer`]. See
  [`PreTrainedTokenizer.encode`] and [`PreTrainedTokenizer.__call__`] for details. [What are input
  IDs?](../glossary#input-ids)
@@ -665,7 +665,7 @@ VILT_INPUTS_DOCSTRING = r"""
  VILT_IMAGES_AND_TEXT_CLASSIFICATION_INPUTS_DOCSTRING = r"""
  Args:
  input_ids (`torch.LongTensor` of shape `({0})`):
- Indices of input sequence tokens in the vocabulary. Indices can be obtained using [`BertTokenizer`]. See
+ Indices of input sequence tokens in the vocabulary. Indices can be obtained using [`AutoTokenizer`]. See
  [`PreTrainedTokenizer.encode`] and [`PreTrainedTokenizer.__call__`] for details. [What are input
  IDs?](../glossary#input-ids)
......
@@ -851,7 +851,7 @@ class TF{{cookiecutter.camelcase_modelname}}PreTrainedModel(TFPreTrainedModel):
  input_ids (`np.ndarray`, `tf.Tensor`, `List[tf.Tensor]`, `Dict[str, tf.Tensor]` or `Dict[str, np.ndarray]` and each example must have the shape `({0})`):
  Indices of input sequence tokens in the vocabulary.
- Indices can be obtained using [`BertTokenizer`]. See
+ Indices can be obtained using [`AutoTokenizer`]. See
  [`PreTrainedTokenizer.__call__`] and [`PreTrainedTokenizer.encode`] for
  details.
......
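Every hunk in this commit makes the same one-word substitution inside the "Indices can be obtained using ..." sentence. As a rough illustration of how such a change could be scripted, here is a standard-library sketch. The `retarget_tokenizer_reference` helper and the sample text are hypothetical and are not the tooling used for this commit.

```python
import re
from typing import Tuple

# Only rewrite the reference inside the "Indices can be obtained using ..."
# sentence, so unrelated mentions of BertTokenizer are left alone.
PATTERN = re.compile(r"(Indices can be obtained using \[`)BertTokenizer(`\])")


def retarget_tokenizer_reference(docstring: str) -> Tuple[str, int]:
    """Return the rewritten docstring and the number of substitutions made."""
    return PATTERN.subn(r"\1AutoTokenizer\2", docstring)


# Sample text modelled on the docstrings in this diff.
sample = (
    "input_ids (`torch.LongTensor` of shape `({0})`):\n"
    "    Indices of input sequence tokens in the vocabulary.\n"
    "    Indices can be obtained using [`BertTokenizer`]. See [`PreTrainedTokenizer.encode`] and\n"
    "    [`PreTrainedTokenizer.__call__`] for details.\n"
)

updated, count = retarget_tokenizer_reference(sample)
print(count)
print(updated)
```

Anchoring the pattern on the surrounding sentence keeps the rewrite narrow, which matches the commit title's wording that only some `input_ids` references were changed.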