    Update 01-training-tokenizers.ipynb (typo issue) (#3343) · 8eeefcb5
    Kyeongpil Kang authored
    I found two grammatical errors (typos) in the explanation of the encoding properties.
    
    The original sentences:
    If your was made of multiple \"parts\" such as (question, context), then this would be a vector with for each token the segment it belongs to
    If your has been truncated into multiple subparts because of a length limit (for BERT for example the sequence length is limited to 512), this will contain all the remaining overflowing parts.
    
    I think "input" should be inserted after "If your" in both sentences.
01-training-tokenizers.ipynb 14.1 KB