Unverified commit 58eee5f2, authored by Vadim Gimpelson, committed by GitHub

[PERF] Use faster way of decode in tokenizer: avoid useless list-to-list conversion (#20000)


Signed-off-by: Vadim Gimpelson <vadim.gimpelson@centml.ai>
parent 067c34a1
@@ -50,11 +50,12 @@ def decode_tokens(
     `skip_special_tokens=None` means to use the backend's default
     settings.
     """
+    decode_method = getattr(tokenizer, "_decode", tokenizer.decode)
     if skip_special_tokens is not None:
-        return tokenizer.decode(token_ids,
-                                skip_special_tokens=skip_special_tokens)
+        return decode_method(token_ids,
+                             skip_special_tokens=skip_special_tokens)
-    return tokenizer.decode(token_ids)
+    return decode_method(token_ids)
 def encode_tokens(
......
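
The change can be illustrated outside the codebase. The following is a minimal sketch, not the project's code: `PlainTokenizer` and `FastPathTokenizer` are hypothetical stand-ins, and the list copy inside `FastPathTokenizer.decode` only imitates the "list-to-list conversion" the commit title mentions. It shows how `getattr` with a default prefers a private `_decode` when one exists and falls back to the public `decode` otherwise.

```python
from typing import List, Optional


class PlainTokenizer:
    """Hypothetical tokenizer that exposes only the public `decode`."""

    def decode(self, token_ids: List[int], **kwargs) -> str:
        return " ".join(str(t) for t in token_ids)


class FastPathTokenizer:
    """Hypothetical tokenizer that also exposes a private `_decode`."""

    def __init__(self) -> None:
        self.public_calls = 0
        self.private_calls = 0

    def decode(self, token_ids: List[int], **kwargs) -> str:
        # Stand-in for the extra conversion work done by the public method.
        self.public_calls += 1
        return self._decode(list(token_ids), **kwargs)

    def _decode(self, token_ids: List[int], **kwargs) -> str:
        self.private_calls += 1
        return " ".join(str(t) for t in token_ids)


def decode_tokens(tokenizer, token_ids: List[int], *,
                  skip_special_tokens: Optional[bool] = None) -> str:
    # Prefer `_decode` if the backend has it; otherwise use `decode`.
    decode_method = getattr(tokenizer, "_decode", tokenizer.decode)
    if skip_special_tokens is not None:
        return decode_method(token_ids,
                             skip_special_tokens=skip_special_tokens)
    return decode_method(token_ids)
```

With `FastPathTokenizer`, the public `decode` is never invoked, so its extra copy is skipped. With `PlainTokenizer`, behavior is unchanged. One trade-off of this pattern is that it relies on a private method name, which a backend may rename or change without notice.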