"docs/source/en/serialization.mdx" did not exist on "4f4e5ddbcbdcd9d6353fc27d0137ac887a7f2f25"
Unverified Commit 74814574 authored by wiio12's avatar wiio12 Committed by GitHub
Browse files

Add doc about `attention_mask` on gpt2 (#16829)

* Add doc about `attention_mask` on gpt2

Add a simple sentence describing how `attention_mask` needs to be constructed when ``past_key_values` is used.

* Add doc about attention_mask on gpt2_tf

* clean up style

* remove empty line white spaces

* remove whitespace in empty line
parent b96e82c8
...@@ -565,6 +565,10 @@ GPT2_INPUTS_DOCSTRING = r""" ...@@ -565,6 +565,10 @@ GPT2_INPUTS_DOCSTRING = r"""
- 1 for tokens that are **not masked**, - 1 for tokens that are **not masked**,
- 0 for tokens that are **masked**. - 0 for tokens that are **masked**.
If `past_key_values` is used, `attention_mask` needs to contain the masking strategy that was used for
`past_key_values`. In other words, the `attention_mask` always has to have the length:
`len(past_key_values) + len(input_ids)`
[What are attention masks?](../glossary#attention-mask) [What are attention masks?](../glossary#attention-mask)
token_type_ids (`torch.LongTensor` of shape `(batch_size, input_ids_length)`, *optional*): token_type_ids (`torch.LongTensor` of shape `(batch_size, input_ids_length)`, *optional*):
Segment token indices to indicate first and second portions of the inputs. Indices are selected in `[0, Segment token indices to indicate first and second portions of the inputs. Indices are selected in `[0,
......
...@@ -655,6 +655,10 @@ GPT2_INPUTS_DOCSTRING = r""" ...@@ -655,6 +655,10 @@ GPT2_INPUTS_DOCSTRING = r"""
- 1 for tokens that are **not masked**, - 1 for tokens that are **not masked**,
- 0 for tokens that are **masked**. - 0 for tokens that are **masked**.
If `past_key_values` is used, `attention_mask` needs to contain the masking strategy that was used for
`past_key_values`. In other words, the `attention_mask` always has to have the length:
`len(past_key_values) + len(input_ids)`
[What are attention masks?](../glossary#attention-mask) [What are attention masks?](../glossary#attention-mask)
token_type_ids (`tf.Tensor` or `Numpy array` of shape `(batch_size, sequence_length)`, *optional*): token_type_ids (`tf.Tensor` or `Numpy array` of shape `(batch_size, sequence_length)`, *optional*):
Segment token indices to indicate first and second portions of the inputs. Indices are selected in `[0, Segment token indices to indicate first and second portions of the inputs. Indices are selected in `[0,
......
Markdown is supported
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment