Allow tokenization of sequences > 512 for caching
For many applications requiring randomized data access, it's easier to cache the tokenized representations than the words. So why not turn this into a warning?
Showing
Please register or sign in to comment