Commit 7bc4a01c, authored by Ngo Quang Huy, committed by GitHub

Update bad_words_ids usage (#15641)

* Improve the usage of the `bad_words_ids` parameter

* Update the bad_words_ids strategy
parent 67047b86
@@ -380,8 +380,9 @@ class NoBadWordsLogitsProcessor(LogitsProcessor):
 Args:
 bad_words_ids (`List[List[int]]`):
-List of list of token ids that are not allowed to be generated. In order to get the tokens of the words
-that should not appear in the generated text, use `tokenizer(bad_word, add_prefix_space=True).input_ids`.
+List of list of token ids that are not allowed to be generated. In order to get the token ids of the words
+that should not appear in the generated text, use `tokenizer(bad_words, add_prefix_space=True,
+add_special_tokens=False).input_ids`.
 eos_token_id (`int`):
 The id of the *end-of-sequence* token.
 """
......
@@ -901,8 +901,8 @@ class GenerationMixin:
 If set to int > 0, all ngrams of that size that occur in the `encoder_input_ids` cannot occur in the
 `decoder_input_ids`.
 bad_words_ids(`List[List[int]]`, *optional*):
-List of token ids that are not allowed to be generated. In order to get the tokens of the words that
-should not appear in the generated text, use `tokenizer(bad_word, add_prefix_space=True,
+List of token ids that are not allowed to be generated. In order to get the token ids of the words that
+should not appear in the generated text, use `tokenizer(bad_words, add_prefix_space=True,
 add_special_tokens=False).input_ids`.
 num_return_sequences(`int`, *optional*, defaults to 1):
 The number of independently computed returned sequences for each element in the batch.
......
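Reviewer note: the `add_special_tokens=False` addition matters because each entry of `bad_words_ids` is matched as a literal token sequence. The sketch below is a simplified, pure-Python model of that matching, not the library's actual implementation. The function name `banned_next_tokens` and all token ids (including the BOS/EOS values) are hypothetical and chosen only for illustration.

```python
from typing import List, Set


def banned_next_tokens(generated: List[int], bad_words_ids: List[List[int]]) -> Set[int]:
    """Return the token ids that must not be produced at the next step.

    Simplified model: the last token of an entry is banned whenever the
    text generated so far ends with the rest of that entry.
    """
    banned: Set[int] = set()
    for seq in bad_words_ids:
        if not seq:
            continue  # ignore empty entries
        prefix, last = seq[:-1], seq[-1]
        # A single-token entry has an empty prefix, so it is always banned.
        # A multi-token entry only triggers once its prefix has been generated.
        if len(prefix) <= len(generated) and generated[len(generated) - len(prefix):] == prefix:
            banned.add(last)
    return banned


# Hypothetical ids: 0 = BOS, 2 = EOS, and the bad word tokenizes to [5, 6].
clean_ids = [[5, 6]]               # what add_special_tokens=False would yield
wrapped_ids = [[0, 5, 6, 2]]       # same word with BOS/EOS wrapped around it

generated = [0, 11, 5]             # BOS, an arbitrary token, then the word's first token

print(banned_next_tokens(generated, clean_ids))    # {6}
print(banned_next_tokens(generated, wrapped_ids))  # set()
```

With the special tokens left in, the entry only matches when BOS immediately precedes the word and the banned token becomes EOS, so the bad word itself is never blocked mid-sentence.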